ENHANCING BREAST CANCER PREDICTION WITH AUTOML TPOT CONFIGURATIONS FOR MAMMOGRAPHIC BREAST CANCER CLASSIFICATION

Authors

  • Faikah Awang@ Ismail, School of Biology, Faculty of Applied Sciences, Universiti Teknologi MARA, Cawangan Negeri Sembilan, Kampus Kuala Pilah
  • Muhammad Khalis Abdul Karim, Department of Physics, Faculty of Sciences, Universiti Putra Malaysia
  • Mohd Mustafa Awang Kechik, Department of Physics, Faculty of Sciences, Universiti Putra Malaysia
  • Izdihar Kamal, Department of Physics, Faculty of Sciences, Universiti Putra Malaysia
  • Zahurin Ismail, Department of Radiology, Hospital Putrajaya, Ministry of Health Malaysia
  • Nurul Syazatul Filzah Mohamad Suchah, Department of Physics, Faculty of Sciences, Universiti Putra Malaysia

DOI:

https://doi.org/10.35631/IJIREV.721001

Keywords:

Breast Cancer, Automated Machine Learning, Machine Learning, Mammography, Radiomic, TPOT

Abstract

Breast cancer remains a leading cause of death among women worldwide. Early detection using mammographic imaging improves patient outcomes; however, its reliability depends heavily on the radiologist’s expertise, which often leads to variability and misdiagnosis. This study explores the potential of Automated Machine Learning (AutoML) for enhancing breast cancer prediction by comparing three configurations of the Tree-Based Pipeline Optimization Tool (TPOT): Default, Light and Sparse. A total of 244 mammography images were obtained from two databases accessed through The Cancer Imaging Archive (TCIA). Image pre-processing was performed in MATLAB R2022a (MathWorks, USA), employing Contrast Limited Adaptive Histogram Equalization (CLAHE) for image enhancement and the Active Contour Method (ACM) for segmentation of the region of interest (ROI), which subsequently enabled the extraction of radiomic features. These extracted features were then used to train and test the three TPOT configurations (TPOT Default, TPOT Light and TPOT Sparse) in Python version 3.9. The classification models were then evaluated for accuracy, sensitivity and specificity to ascertain each model’s efficiency in distinguishing between benign and malignant cases. A total of 37 radiomic features were extracted from the sample images: six First-Order Statistical features, 21 Gray-Level Co-occurrence Matrix (GLCM) Texture features and ten Shape-Based features. The TPOT Default configuration achieved the highest accuracy of 0.735 (95% CI: 0.611-0.859), with a sensitivity of 0.760 (95% CI: 0.642-0.878) and a precision of 0.731 (95% CI: 0.607-0.855), outperforming both TPOT Light (accuracy: 0.633, 95% CI: 0.498-0.768; sensitivity: 0.667, 95% CI: 0.537-0.797; precision: 0.615, 95% CI: 0.485-0.745) and TPOT Sparse (accuracy: 0.673, 95% CI: 0.543-0.803; sensitivity: 0.653, 95% CI: 0.521-0.785; precision: 0.708, 95% CI: 0.587-0.829).
These results demonstrate that the TPOT Default configuration delivers the most reliable classification, highlighting AutoML’s potential as a clinical decision support tool. By reducing manual feature engineering and improving diagnostic accuracy, AutoML could significantly streamline breast cancer detection and improve outcomes in radiological practice.
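
For readers who want to see how metrics of the kind reported above are typically derived, the following is a minimal sketch in Python. It is not the authors' code. The confusion-matrix counts (19 true positives, 7 false positives, 17 true negatives, 6 false negatives) and the test-set size of 49 are assumptions chosen only because they reproduce the reported TPOT Default point estimates; the abstract does not state them. The normal-approximation (Wald) interval is likewise an assumed method, and the function names are hypothetical.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }


def wald_ci(successes, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    z=1.96 corresponds to a two-sided 95% interval.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)


# ASSUMPTION: illustrative counts, not taken from the paper.
metrics = classification_metrics(tp=19, fp=7, tn=17, fn=6)

# ASSUMPTION: 36 correct predictions out of 49 test images.
low, high = wald_ci(successes=36, n=49)

print({name: round(value, 3) for name, value in metrics.items()})
print(f"accuracy 95% CI: {low:.3f}-{high:.3f}")
```

Under these assumed counts the sketch gives an accuracy of about 0.735 with an interval of roughly 0.61 to 0.86, close to the values reported for TPOT Default. With a test set this small the Wald interval is wide, which is why the intervals of the three configurations overlap substantially.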

Published

2025-06-05

How to Cite

Faikah Awang@ Ismail, Muhammad Khalis Abdul Karim, Mohd Mustafa Awang Kechik, Izdihar Kamal, Zahurin Ismail, & Nurul Syazatul Filzah Mohamad Suchah. (2025). ENHANCING BREAST CANCER PREDICTION WITH AUTOML TPOT CONFIGURATIONS FOR MAMMOGRAPHIC BREAST CANCER CLASSIFICATION. INTERNATIONAL JOURNAL OF INNOVATION AND INDUSTRIAL REVOLUTION (IJIREV), 7(21). https://doi.org/10.35631/IJIREV.721001