Evaluating machine learning models for cardiovascular risk prediction: A Shapley Additive Explanations-based approach with statistical testing

Kunwar, Shiv; Ganesan, Swathi; Pokhrel, Sangita

Kunwar, Shiv, Ganesan, Swathi ORCID: https://orcid.org/0000-0002-6278-2090 and Pokhrel, Sangita ORCID: https://orcid.org/0009-0008-2092-7029 (2026) Evaluating machine learning models for cardiovascular risk prediction: A Shapley Additive Explanations-based approach with statistical testing. Brain & Heart. 025260032.

Text
manuscript_bh06302.pdf - Published Version
Available under License Creative Commons Attribution.

Official URL: https://doi.org/10.36922/BH025260032

Abstract

Cardiovascular disease (CVD) remains the leading global cause of mortality, underscoring the need for accurate and interpretable prediction models to facilitate early diagnosis. Existing machine learning (ML) approaches often struggle to balance predictive performance with clinical interpretability, which limits their adoption. This study introduces a structured evaluation framework combining A/B testing with statistical hypothesis validation to rigorously compare ML models for CVD risk prediction. Using a dataset of 1,001 patient records, four models were trained and evaluated: logistic regression, random forest (RF), artificial neural networks, and extreme gradient boosting (XGBoost). The Synthetic Minority Oversampling Technique (SMOTE) was applied to address class imbalance, while Shapley Additive Explanations (SHAP) provided insights into feature contributions and guided the development of reduced-feature models. Results indicate that RF achieved the highest accuracy (98.5%) and area under the receiver operating characteristic curve (0.9991), whereas XGBoost coupled with SHAP enabled effective feature selection with minimal loss in predictive power. A/B testing demonstrated the trade-offs between model complexity and interpretability, while statistical testing confirmed the significance of performance differences. These findings suggest that interpretable, reduced-feature models may be viable for deployment in resource-limited clinical settings, advancing the integration of artificial intelligence in cardiovascular healthcare.
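
As a rough illustration of the kind of paired comparison the abstract describes, the sketch below applies an exact McNemar test to the predictions of two classifiers on the same test records. This record does not state which hypothesis test the authors used, so the choice of McNemar's test is an assumption, and the function name `mcnemar_exact` and the toy labels are hypothetical.

```python
from math import comb


def mcnemar_exact(y_true, pred_a, pred_b):
    """Two-sided exact McNemar test on paired predictions.

    Hypothetical helper for illustration only; not the authors' code.
    Returns (b, c, p_value), where b counts records model A classifies
    correctly and model B does not, and c counts the reverse.
    """
    rows = list(zip(y_true, pred_a, pred_b))
    b = sum(1 for t, a, p in rows if a == t and p != t)
    c = sum(1 for t, a, p in rows if a != t and p == t)
    n = b + c
    if n == 0:
        # No discordant pairs: the data give no evidence of a difference.
        return b, c, 1.0
    k = min(b, c)
    # Under H0 the discordant pairs split 50/50, so the smaller count
    # follows Binomial(n, 0.5); double the lower tail for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy data (made up): ten positive records, model A (full feature set)
# gets all of them right, model B (reduced feature set) misses eight.
y_true = [1] * 10
pred_full = [1] * 10
pred_reduced = [0] * 8 + [1] * 2

b, c, p_value = mcnemar_exact(y_true, pred_full, pred_reduced)
print(f"A-only correct: {b}, B-only correct: {c}, p = {p_value:.4f}")
```

A small p-value would suggest the two models disagree in a systematically one-sided way; with real data the inputs would be held-out labels and each model's predictions on the same records.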

Item Type:	Article
Status:	Published
DOI:	10.36922/BH025260032
Subjects:	Q Science > Q Science (General) > Q325 Machine learning
School/Department:	London Campus
URI:	https://ray.yorksj.ac.uk/id/eprint/14197

Deposit and Record Details

ID Code:	14197
Depositing User:	Ganesan, Swathi
Deposited On:	11 Mar 2026 08:44
Last Modified:	11 Mar 2026 08:45