
ESUS-AI: a machine learning framework to estimate the most likely embolic source in embolic stroke of undetermined source

Bonura, A.; Juega, J.; Meza, C.; Kühne Escola, J.; Muchada, M.; Rubiera, M.; Olive Gadea, M.; Requena, M.; Rodrigo-Gisbert, M.; Rodriguez-Villatoro, N.; Rodriguez-Luna, D.; Rizzo, F.; Fiore, G. M.; Simonetti, R.; Brunelli, N.; Fernandez-Galera, R.; Francisco Pascal, J.; Colangelo, G.; Ribo, M.; Molina, C. A.; Pagola, J.

2026-02-06 neurology
10.64898/2026.02.04.26345615 medRxiv

Background and Purpose: Embolic stroke of undetermined source (ESUS) remains a major diagnostic challenge in vascular neurology, as a substantial proportion of patients lack an identifiable embolic source despite a standardized diagnostic workup. The failure of empiric anticoagulation strategies highlights the need for individualized, mechanism-oriented risk stratification. We aimed to develop a machine learning-based framework to estimate the most likely embolic source in ESUS using routinely available clinical data.

Methods: We retrospectively analyzed consecutive ESUS patients admitted to the Stroke Unit of Vall d'Hebron Hospital between 2020 and 2024. Three supervised machine learning models (XGBoost, Random Forest, and regularized logistic regression) were trained to independently predict the presence of left atrial enlargement (LAE), left ventricular dysfunction or akinesia (LVD), and complex aortic plaques (AP), based on demographic, clinical, laboratory, and imaging variables available at diagnosis. Model interpretability was assessed using permutation importance and SHAP analyses.

Results: Among 1,741 ESUS patients (mean age 71.5 ± 14.6 years; 48.3% women), LAE was present in 40.5%, AP in 11.0%, and LVD in 6.5%. XGBoost achieved the best overall performance across targets (PR-AUC: 0.71 for LAE, 0.29 for AP, 0.44 for LVD). Distinct and biologically coherent risk profiles emerged. LAE was driven by older age, elevated NT-proBNP, higher stroke severity, and a non-linear association with cholesterol. AP was associated with advanced age and traditional vascular risk factors. LVD showed a cardiomyopathic pattern characterized by elevated NT-proBNP, younger age, male sex, and severe strokes.

Conclusions: A machine learning-based approach can provide probabilistic, mechanism-oriented stratification in ESUS, capturing non-linear interactions among routinely available variables. This framework may support clinicians in prioritizing targeted diagnostic pathways and tailoring secondary prevention strategies, pending external validation.
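The PR-AUC values in the abstract are easier to read next to the prevalence of each target, because a classifier with no skill has an expected PR-AUC roughly equal to the positive-class prevalence. The Python sketch below is illustrative only. It assumes PR-AUC is estimated as average precision, which the abstract does not state, and the function names and toy labels are hypothetical rather than taken from the study.

```python
# Illustrative sketch only: the toy labels and scores are made up, not study data.
# Assumption: PR-AUC is estimated as average precision (not stated in the abstract).

def average_precision(y_true, scores):
    """Mean of precision@k over the ranks k at which a positive case appears."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / n_pos

# Prevalence and XGBoost PR-AUC per target, as reported in the abstract.
reported = {
    "LAE": {"prevalence": 0.405, "pr_auc": 0.71},
    "AP": {"prevalence": 0.110, "pr_auc": 0.29},
    "LVD": {"prevalence": 0.065, "pr_auc": 0.44},
}

def lift_over_chance(prevalence, pr_auc):
    """Ratio of PR-AUC to the no-skill baseline (approximately the prevalence)."""
    return pr_auc / prevalence

lifts = {name: round(lift_over_chance(**vals), 2) for name, vals in reported.items()}

# Toy ranking: positives at ranks 1 and 3 out of 5 cases.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.2, 0.1]
toy_ap = average_precision(toy_labels, toy_scores)

if __name__ == "__main__":
    print(lifts)
    print(round(toy_ap, 3))
```

Read this way, the lowest absolute PR-AUC (AP, 0.29) is still about 2.6 times its chance level, and LVD (0.44 at 6.5% prevalence) shows the largest relative gain, at roughly 6.8 times chance.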

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Stroke | 35 | Top 0.2% | 10.1%
2 | Frontiers in Neurology | 91 | Top 0.6% | 9.1%
3 | Stroke: Vascular and Interventional Neurology | 13 | Top 0.1% | 8.2%
4 | Journal of the American Heart Association | 119 | Top 1% | 6.4%
5 | Journal of Thrombosis and Haemostasis | 28 | Top 0.1% | 4.8%
6 | Brain | 154 | Top 1% | 4.0%
7 | Journal of Stroke and Cerebrovascular Diseases | 12 | Top 0.1% | 4.0%
8 | Neurology | 44 | Top 0.4% | 3.7%
50% of probability mass above
9 | NeuroImage: Clinical | 132 | Top 1% | 3.6%
10 | Brain Communications | 147 | Top 0.7% | 3.6%
11 | Scientific Reports | 3102 | Top 40% | 3.2%
12 | PLOS ONE | 4510 | Top 45% | 2.6%
13 | Journal of Neurology | 26 | Top 0.4% | 2.4%
14 | Annals of Neurology | 57 | Top 0.8% | 2.4%
15 | Frontiers in Neuroscience | 223 | Top 3% | 2.1%
16 | Journal of Neurology, Neurosurgery & Psychiatry | 29 | Top 0.6% | 1.9%
17 | European Journal of Neurology | 20 | Top 0.2% | 1.8%
18 | Journal of the Neurological Sciences | 17 | Top 0.3% | 1.5%
19 | Critical Care Explorations | 15 | Top 0.3% | 1.3%
20 | EClinicalMedicine | 21 | Top 0.5% | 1.2%
21 | Neurocritical Care | 11 | Top 0.3% | 1.1%
22 | Journal of Cerebral Blood Flow & Metabolism | 43 | Top 0.5% | 0.9%
23 | npj Digital Medicine | 97 | Top 3% | 0.9%
24 | Circulation | 66 | Top 2% | 0.8%
25 | BMC Neurology | 12 | Top 0.8% | 0.8%
26 | BMC Medicine | 163 | Top 7% | 0.7%
27 | Journal of Clinical Medicine | 91 | Top 6% | 0.7%
28 | Diagnostics | 48 | Top 2% | 0.7%
29 | Atherosclerosis | 29 | Top 1% | 0.7%
30 | Annals of Clinical and Translational Neurology | 29 | Top 0.5% | 0.6%
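
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked against the listed percentages. The Python sketch below is only a rough check: it uses the displayed values, which are rounded to one decimal place, and the helper name `journals_to_reach` is hypothetical.

```python
# Predicted probabilities (%) for the 30 listed journals, in rank order.
# Values are copied from the list above and are rounded to one decimal place.
probs = [
    10.1, 9.1, 8.2, 6.4, 4.8, 4.0, 4.0, 3.7, 3.6, 3.6,
    3.2, 2.6, 2.4, 2.4, 2.1, 1.9, 1.8, 1.5, 1.3, 1.2,
    1.1, 0.9, 0.9, 0.8, 0.8, 0.7, 0.7, 0.7, 0.7, 0.6,
]

def journals_to_reach(probabilities, threshold):
    """Return (count, cumulative mass) for the shortest prefix reaching threshold."""
    running = 0.0
    for count, p in enumerate(probabilities, start=1):
        running += p
        if running >= threshold:
            return count, running
    return None  # threshold not reached by the listed journals

count, mass = journals_to_reach(probs, 50.0)
listed_total = sum(probs)

if __name__ == "__main__":
    print(count, round(mass, 1), round(listed_total, 1))
```

With the rounded values, the first seven journals sum to 46.6% and the eighth brings the total to 50.3%, consistent with the stated cutoff. The 30 listed journals together cover about 85.8%, so the remainder is spread over journals not shown.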