Prognosis of stroke subtypes in whole population health systems data: a matched cohort study

Hosking, A.; Iveson, M. H.; Sherlock, L.; Mukherjee, M.; Grover, C.; Alex, B.; Parepalli, S.; Mair, G.; Doubal, F.; Whalley, H. C.; Tobin, R.; Wardlaw, J. M.; Al-Shahi Salman, R.; Whiteley, W. N.

2026-04-20 neurology
10.64898/2026.04.17.26351150 medRxiv

Background: Outcome after stroke varies by location-based stroke subtype, but studies using healthcare systems data do not include subtyping information. We linked natural language processing (NLP) of brain imaging reports to routinely collected data to estimate the risk of death and other outcomes after stroke subtypes in a nationwide dataset.

Methods: We applied a previously validated NLP algorithm to all CT and MRI head scan reports in Scotland between 2010 and 2018. We linked the reports to hospital readmission, prescription and death data to identify and characterize people with stroke, and to categorize them into deep and cortical ischemic stroke, deep and lobar intracerebral hemorrhage (ICH), subarachnoid hemorrhage, and subdural hemorrhage. We used a matched cohort design, matching each case by age and sex to four controls who had never had a stroke. By subtype, we estimated risks of rehospitalization with stroke, myocardial infarction (MI), cancer, dementia, epilepsy and death, accounting for confounders and the competing risk of death.

Results: From 785,331 people with a head scan, we identified 64,219 with clinical stroke phenotypes (mean age 73.4 years, 49.5% male), and subtyped 12,616 with deep ischemic stroke; 14,103 with cortical ischemic stroke; 1,814 with deep ICH; and 1,456 with lobar ICH. There was a higher absolute rate of 1-year hospital readmission for lobar than for deep ICH (4.9% [95% CI 3.9%-6.1%] vs 3.4% [2.6%-4.3%]); a higher risk of dementia beyond 6 months after lobar ICH, compared with controls, than for other stroke subtypes (adjusted hazard ratio [aHR] 3.5 [2.3-5.3]); and a higher risk of MI within 6 months of cortical ischemic stroke than for other stroke subtypes (aHR 4.6 [3.4-6.3]).

Conclusions: NLP of free-text reports linked to coded data successfully subtyped stroke at scale, and we estimated the risk of clinically relevant outcomes. Future work should use free text to enable large-scale audit and epidemiology of people with stroke.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1 | Stroke | 35 papers in training set | Top 0.1% | 17.0%
2 | Alzheimer's & Dementia | 143 papers in training set | Top 0.9% | 8.2%
3 | BMC Medicine | 163 papers in training set | Top 0.4% | 7.0%
4 | The Lancet Digital Health | 25 papers in training set | Top 0.1% | 6.1%
5 | Frontiers in Neurology | 91 papers in training set | Top 1% | 4.7%
6 | Journal of the American Heart Association | 119 papers in training set | Top 2% | 3.6%
7 | Neurology | 44 papers in training set | Top 0.4% | 3.5%
8 | Circulation | 66 papers in training set | Top 1% | 3.5%
50% of probability mass above
9 | PLOS Medicine | 98 papers in training set | Top 1% | 3.5%
10 | Annals of Neurology | 57 papers in training set | Top 0.7% | 3.0%
11 | Nature Communications | 4913 papers in training set | Top 43% | 2.8%
12 | npj Digital Medicine | 97 papers in training set | Top 2% | 2.6%
13 | Journal of Neurology, Neurosurgery & Psychiatry | 29 papers in training set | Top 0.5% | 2.4%
14 | PLOS ONE | 4510 papers in training set | Top 47% | 2.3%
15 | EClinicalMedicine | 21 papers in training set | Top 0.1% | 2.0%
16 | JAMA Network Open | 127 papers in training set | Top 2% | 2.0%
17 | Med | 38 papers in training set | Top 0.3% | 1.6%
18 | BMJ Open | 554 papers in training set | Top 9% | 1.6%
19 | Stroke: Vascular and Interventional Neurology | 13 papers in training set | Top 0.3% | 1.6%
20 | Nature Medicine | 117 papers in training set | Top 2% | 1.6%
21 | Brain | 154 papers in training set | Top 3% | 1.4%
22 | Journal of the American Medical Informatics Association | 61 papers in training set | Top 2% | 1.2%
23 | Journal of Neurology | 26 papers in training set | Top 1% | 0.9%
24 | Scientific Reports | 3102 papers in training set | Top 76% | 0.7%
25 | Journal of the Neurological Sciences | 17 papers in training set | Top 0.9% | 0.6%
26 | Neurocritical Care | 11 papers in training set | Top 0.5% | 0.6%