Please use this identifier to cite or link to this item: http://repository.aaup.edu/jspui/handle/123456789/3666
Title: Machine learning in colorectal cancer prediction and diagnosis: a systematic review of models’ performance
Authors: Abuzuhri, Mohammad-Ali$AAUP$Palestinian
Najjar, Shahenaz$AAUP$Palestinian
Awad, Mohammed$AAUP$Palestinian
Cruz-Correia, Ricardo$Other$Other
Falna, Hiba$Other$Palestinian
Aktas, Emine$Other$Other
Abu Al Rob, Basem Mohammed$Other$Palestinian
Oliveira, Miguel$Other$Other
Mughnamin, Ibrahim$Other$Palestinian
Novo Esteves, Sara$Other$Other
Awlad Mohammad, Yousef$AAUP$Palestinian
Keywords: Colorectal cancer
Machine learning
Prediction
Diagnosis
Performance
Feature selection
Data preprocessing
Ensemble learning
Issue Date: 28-Oct-2025
Publisher: International Journal of Medical Informatics (IJMI)
Citation: Mohammad-Ali Abuzuhri, Shahenaz Najjar, Mohammed Awad, Ricardo Cruz-Correia, Hiba Falna, Emine Aktas, Basem Mohammed Abu Al Rob, Miguel Oliveira, Ibrahim Mughnamin, Sara Novo Esteves, Yousef Awlad Mohammad, Pedro Vieira-Marques, Machine learning in colorectal cancer prediction and diagnosis: a systematic review of models' performance, International Journal of Medical Informatics (IJMI), Volume 206, 2026, 106170, ISSN 1386-5056, https://doi.org/10.1016/j.ijmedinf.2025.106170.
Series/Report no.: 206
Abstract:
Introduction: Colorectal cancer (CRC) poses a significant global health burden, demanding early and accurate detection strategies. Machine Learning (ML) models are increasingly being applied for CRC prediction, yet their performance requires systematic evaluation to guide adoption.
Purpose: This review evaluates the performance of ML models in predicting and diagnosing CRC, focusing on studies published between 2019 and 2024. It aims to identify the most frequently used ML models, determine their performance, and analyze the impact of different model settings on that performance.
Methods: A comprehensive search was conducted using the SCOPUS, PubMed, and Web of Science databases. Following PRISMA guidelines, studies evaluating ML models for CRC prediction were selected and reviewed. Study selection, data extraction, and risk-of-bias assessment were performed independently by multiple reviewers. Extracted data included study characteristics, model specifications, validation methods, and performance metrics.
Results: Thirty studies met the inclusion criteria. Random Forest (RF) was the most frequently evaluated model, while Ensemble Learning (EML), Neural Networks (ANN/DNN), and Support Vector Machines (SVM) consistently demonstrated the highest performance across multiple metrics. Most studies employed molecular datasets, and feature selection methods varied widely, significantly influencing model performance.
Conclusions: ML models, particularly EML, ANN, DNN, and SVM, demonstrate high diagnostic performance in CRC prediction and diagnosis, suggesting substantial diagnostic potential; any effect on decision-making and outcomes through improved accuracy and early detection requires external and prospective validation. However, variability in datasets, methodologies, and reporting quality highlights significant research gaps, including the lack of standardized validation procedures and consistent performance reporting, which are crucial for facilitating clinical adoption and informing healthcare policy decisions.
URI: http://repository.aaup.edu/jspui/handle/123456789/3666
ISSN: 1386-5056
DOI: https://doi.org/10.1016/j.ijmedinf.2025.106170
Appears in Collections: Faculty & Staff Scientific Research publications

Files in This Item:
File: 1-s2.0-S1386505625003879-main (3).pdf
Description: Non-open access
Size: 2.7 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
