Machine learning techniques for stroke prediction: A systematic review of algorithms, datasets, and regional gaps.

Soladoye, Afeez Adekunle; Aderinto, Nicholas; Popoola, Mayowa Racheal; Adeyanju, Ibrahim A; Osonuga, Ayokunle; Olawade, David

Soladoye, Afeez Adekunle, Aderinto, Nicholas, Popoola, Mayowa Racheal, Adeyanju, Ibrahim A, Osonuga, Ayokunle and Olawade, David ORCID: https://orcid.org/0000-0003-0188-9836 (2025) Machine learning techniques for stroke prediction: A systematic review of algorithms, datasets, and regional gaps. International journal of medical informatics, 203. p. 106041.

1-s2.0-S1386505625002588-main.pdf - Published Version
Available under License Creative Commons Attribution.

Official URL: https://doi.org/10.1016/j.ijmedinf.2025.106041

Abstract

Stroke is a leading cause of mortality and disability worldwide, with approximately 15 million people suffering strokes annually. Machine learning (ML) techniques have emerged as powerful tools for stroke prediction, enabling early identification of risk factors through data-driven approaches. However, the clinical utility and performance characteristics of these approaches require systematic evaluation.

This review aimed to systematically analyze ML techniques used for stroke prediction, synthesize performance metrics across different prediction targets and data sources, evaluate their clinical applicability, and identify research trends focusing on patient population characteristics and stroke prevalence patterns.

A systematic review was conducted following PRISMA guidelines. Five databases (Google Scholar, Lens, PubMed, ResearchGate, and Semantic Scholar) were searched for open-access publications on ML-based stroke prediction published between January 2013 and December 2024. Data were extracted on publication characteristics, datasets, ML methodologies, evaluation metrics, prediction targets (stroke occurrence vs. outcomes), data sources (EHR, imaging, biosignals), patient demographics, and stroke prevalence. Descriptive synthesis was performed because substantial heterogeneity precluded quantitative meta-analysis.

Fifty-eight studies were included, with peak publication output in 2021 (21 articles). Studies targeted three main prediction objectives: stroke occurrence prediction (n = 52, 62.7 %), stroke outcome prediction (n = 19, 22.9 %), and stroke type classification (n = 12, 14.4 %). Data sources included electronic health records (n = 48, 57.8 %), medical imaging (n = 21, 25.3 %), and biosignals (n = 14, 16.9 %). Systematic analysis revealed that ensemble methods consistently achieved the highest accuracies for stroke occurrence prediction (range: 90.4-97.8 %), while deep learning excelled in imaging-based applications. African populations, despite having the highest stroke mortality rates globally, were represented in fewer than four studies.

ML techniques show promising results for stroke prediction. However, significant gaps exist in the representation of high-risk populations and in real-world clinical validation. Future research should prioritize population-specific model development and clinical implementation frameworks.

[Abstract copyright: Copyright © 2025 The Authors. Published by Elsevier B.V. All rights reserved.]

Item Type:	Article
Additional Information:	From PubMed via Jisc Publications Router. History: received 12-06-2025; revised 03-07-2025; accepted 06-07-2025.
Status:	Published
DOI:	10.1016/j.ijmedinf.2025.106041
School/Department:	London Campus
URI:	https://ray.yorksj.ac.uk/id/eprint/12391


Deposit and Record Details

ID Code:	12391
Depositing User:	Olawade, David
Deposited On:	28 Jul 2025 09:12
Last Modified:	12 Oct 2025 14:45