Explainable machine learning models for early Alzheimer's disease detection using multimodal clinical data.

Soladoye, Afeez Adekunle, Aderinto, Nicholas, Osho, Damilola and Olawade, David (ORCID: https://orcid.org/0000-0003-0188-9836) (2025) Explainable machine learning models for early Alzheimer's disease detection using multimodal clinical data. International Journal of Medical Informatics, 204. p. 106093.

1-s2.0-S1386505625003107-main.pdf - Published Version
Available under License Creative Commons Attribution.


Abstract

Alzheimer's disease (AD) represents a significant global health challenge requiring early and accurate prediction for effective intervention. While machine learning models demonstrate promising capabilities in AD prediction, their black-box nature limits clinical adoption due to a lack of interpretability and transparency. This study aims to develop and evaluate explainable artificial intelligence (XAI) frameworks for AD prediction using comprehensive multimodal patient data, with a focus on enhancing model interpretability through SHAP and LIME techniques. A comprehensive dataset of 2,149 patients aged 60-90 years was obtained from Kaggle, encompassing demographic, medical history, lifestyle, clinical measurement, cognitive assessment, and symptom data. Rigorous preprocessing included MinMax normalisation, the Synthetic Minority Over-sampling Technique (SMOTE) for class imbalance, and backward elimination feature selection, which reduced 32 features to 26 optimal predictors. Six machine learning models were evaluated: K-Nearest Neighbours (KNN), Support Vector Machine (SVM), Logistic Regression (LR), XGBoost, Stacked Ensemble, and Random Forest (RF). RF's optimal hyperparameters were obtained using Ant Colony Optimisation. Model interpretability was enhanced using the SHAP and LIME frameworks for both global and local explanations. The optimised Random Forest with backward elimination feature selection and Ant Colony Optimisation achieved superior performance, with 95% accuracy, 95% precision, 94% recall, 94% F1-score, and 98% AUC. SHAP analysis identified functional assessment, activities of daily living (ADL), memory complaints, and Mini-Mental State Examination (MMSE) score as the most influential predictors. LIME provided complementary local explanations, validating the clinical relevance of the identified features.
The integration of explainable AI techniques with machine learning models provides clinically meaningful insights for AD prediction, enhancing transparency and fostering trust in AI-driven diagnostic tools whilst maintaining high predictive accuracy. Future work should focus on external validation, clinical workflow integration, and addressing computational requirements for real-world deployment. [Abstract copyright: Copyright © 2025 The Author(s). Published by Elsevier B.V. All rights reserved.]
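The pipeline described in the abstract (MinMax normalisation, backward feature elimination from 32 to 26 predictors, a Random Forest classifier, and global feature-importance explanation) can be sketched as follows. This is a minimal illustration, not the authors' code: synthetic data stands in for the Kaggle dataset, scikit-learn's recursive feature elimination approximates the backward elimination step, permutation importance stands in for SHAP global importance, and SMOTE and Ant Colony Optimisation are omitted for brevity.

```python
# Hedged sketch of the described pipeline using scikit-learn only.
# Assumptions: make_classification replaces the 2,149-patient dataset;
# RFE approximates backward elimination; permutation importance
# approximates SHAP summary values; SMOTE/ACO tuning are omitted.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the 32-feature clinical dataset.
X, y = make_classification(n_samples=500, n_features=32,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

pipe = Pipeline([
    ("scale", MinMaxScaler()),                       # MinMax normalisation
    ("select", RFE(RandomForestClassifier(random_state=0),
                   n_features_to_select=26)),        # 32 -> 26 predictors
    ("clf", RandomForestClassifier(random_state=0)), # final classifier
])
pipe.fit(X_tr, y_tr)
print(f"held-out accuracy: {pipe.score(X_te, y_te):.2f}")

# Global importance ranking as a stand-in for a SHAP summary plot.
imp = permutation_importance(pipe, X_te, y_te, n_repeats=5, random_state=0)
top = imp.importances_mean.argsort()[::-1][:4]
print("top feature indices:", top.tolist())
```

In the paper's setting, the permutation-importance step would be replaced by SHAP (global) and LIME (local) explanations over the fitted Random Forest, and SMOTE would be applied to the training fold only, before fitting.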

Item Type: Article
Status: Published
DOI: 10.1016/j.ijmedinf.2025.106093
School/Department: London Campus
URI: https://ray.yorksj.ac.uk/id/eprint/12748
