ENHANCING BREAST CANCER PREDICTION WITH AUTOML TPOT CONFIGURATIONS FOR MAMMOGRAPHIC BREAST CANCER CLASSIFICATION
DOI:
https://doi.org/10.35631/IJIREV.721001
Keywords:
Breast Cancer, Automated Machine Learning, Machine Learning, Mammography, Radiomic, TPOT
Abstract
Breast cancer remains a leading cause of death among women worldwide. Early detection using mammographic imaging improves patient outcomes; however, its reliability depends heavily on the radiologist’s expertise, often leading to variability and misdiagnosis. This study explores the potential of Automated Machine Learning (AutoML) to enhance breast cancer prediction by comparing three configurations of the Tree-Based Pipeline Optimization Tool (TPOT): Default, Light and Sparse. A total of 244 mammography images were obtained from two databases accessed through The Cancer Imaging Archive (TCIA). Image pre-processing was performed in MATLAB R2022a (USA), employing Contrast Limited Adaptive Histogram Equalization (CLAHE) for image enhancement and the Active Contour Method (ACM) for segmentation of the region of interest (ROI), which subsequently enabled the extraction of radiomic features. These extracted features were then used to train and test the three TPOT configurations (TPOT Default, TPOT Light and TPOT Sparse) in Python version 3.9. The classification models were then evaluated for accuracy, sensitivity and specificity to ascertain each model’s efficiency in distinguishing between benign and malignant breast cancer. A total of 37 radiomic features were extracted from the sample images: six First-Order Statistical features, 21 Gray-Level Co-occurrence Matrix (GLCM) Texture features and ten Shape-Based features. The TPOT Default configuration achieved the highest accuracy of 0.735 (95% CI: 0.611-0.859), with a sensitivity of 0.760 (95% CI: 0.642-0.878) and a precision of 0.731 (95% CI: 0.607-0.855), outperforming both TPOT Light (accuracy: 0.633, 95% CI: 0.498-0.768; sensitivity: 0.667, 95% CI: 0.537-0.797; precision: 0.615, 95% CI: 0.485-0.745) and TPOT Sparse (accuracy: 0.673, 95% CI: 0.543-0.803; sensitivity: 0.653, 95% CI: 0.521-0.785; precision: 0.708, 95% CI: 0.587-0.829).
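The abstract does not say how the 95% confidence intervals were computed. As an illustration only, the sketch below applies the normal-approximation (Wald) interval for a proportion to the reported accuracies. The test-set size of 49 images (roughly 20% of the 244 images) is an assumption, not a figure given in the abstract, and the function name `wald_ci` is likewise illustrative. Under that assumption the intervals come out close to the reported ones, but this does not establish that the authors used this method.

```python
import math


def wald_ci(p_hat, n, z=1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    p_hat: observed proportion (e.g. accuracy), n: number of samples,
    z: standard-normal quantile (1.96 for a two-sided 95% interval).
    """
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    # Clip to [0, 1] because a proportion cannot leave that range.
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# ASSUMPTION: 49 test images (about 20% of 244); the abstract does not state the split.
N_TEST = 49

# Accuracies as reported in the abstract.
reported_accuracy = {
    "TPOT Default": 0.735,
    "TPOT Light": 0.633,
    "TPOT Sparse": 0.673,
}

intervals = {name: wald_ci(acc, N_TEST) for name, acc in reported_accuracy.items()}

for name, (lower, upper) in intervals.items():
    print(f"{name}: {reported_accuracy[name]:.3f} (95% CI: {lower:.3f}-{upper:.3f})")
```

The intervals are wide because the interval half-width shrinks only with the square root of the sample size, which is why the three configurations' intervals overlap substantially.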
These results demonstrate that the TPOT Default configuration delivers the most reliable classification, highlighting AutoML’s potential as a clinical decision support tool. By reducing manual feature engineering and improving diagnostic accuracy, AutoML could significantly streamline breast cancer detection and improve outcomes in radiological practice.
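For readers who want to try a similar comparison, the following is a minimal sketch of how the three configurations might be selected, assuming the classic TPOT API (versions before 1.0), where `TPOTClassifier` accepts a `config_dict` argument with the built-in values `'TPOT light'` and `'TPOT sparse'`. The search budget (`generations`, `population_size`), the cross-validation setting, the random seed, and the helper names `build_tpot` and `evaluate` are illustrative assumptions; the abstract does not report the authors' settings.

```python
# Maps the configuration names used in the abstract to classic TPOT's
# built-in config_dict values (None selects the default configuration).
TPOT_CONFIGS = {
    "TPOT Default": None,
    "TPOT Light": "TPOT light",
    "TPOT Sparse": "TPOT sparse",
}


def build_tpot(config_name, generations=5, population_size=20, random_state=42):
    """Create a TPOTClassifier for one configuration.

    ASSUMPTION: generations, population_size, cv and random_state are
    placeholder values, not the settings used in the study.
    """
    from tpot import TPOTClassifier  # classic TPOT (pre-1.0) API

    return TPOTClassifier(
        generations=generations,
        population_size=population_size,
        config_dict=TPOT_CONFIGS[config_name],
        scoring="accuracy",
        cv=5,
        random_state=random_state,
        verbosity=0,
    )


def evaluate(config_name, X_train, y_train, X_test, y_test):
    """Fit one configuration and report the metrics named in the abstract.

    X_* are arrays of radiomic features; y_* are binary labels
    (assumed here: 1 = malignant, 0 = benign).
    """
    from sklearn.metrics import accuracy_score, precision_score, recall_score

    model = build_tpot(config_name)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "sensitivity": recall_score(y_test, y_pred),  # recall of the positive class
        "precision": precision_score(y_test, y_pred),
    }
```

Calling `evaluate(name, X_train, y_train, X_test, y_test)` for each key of `TPOT_CONFIGS` would give one metrics dictionary per configuration. Imports are kept inside the functions so the configuration mapping can be inspected without TPOT installed.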