Multicohort development and validation of a machine learning model to predict six-month functional traumatic brain injury outcomes in a large national registry

Vattipally, V. N.; Jillala, R. R.; Kramer, P.; Elshareif, M.; Singh, S.; Jo, J.; Suarez, J. I.; Sakran, J. V.; Haut, E. R.; Huang, J.; Bettegowda, C.; Azad, T. D.

2026-04-27 intensive care and critical care medicine
10.64898/2026.04.23.26351622 medRxiv

Background: Prognostication after moderate-to-severe traumatic brain injury (TBI) rarely captures long-term functional recovery, despite its importance to patients, families, and clinicians. Large trauma registries such as the Trauma Quality Improvement Program (TQIP) dataset contain detailed clinical data but lack systematic follow-up, limiting their usefulness for studying longer-term functional outcomes.

Methods: We developed and externally validated a machine learning model to predict favorable six-month functional outcome (GOS MD/GR or GOSE ≥5) using harmonized data from two randomized clinical trials: CRASH (training) and ROC-TBI (validation). Five candidate classifiers (random forest [RF], linear discriminant analysis, k-nearest neighbors, naive Bayes, and support vector machine) were trained using seven shared clinical predictors. Models were evaluated using ROC-AUC, calibration metrics, and performance at the Youden optimal threshold and a high-sensitivity secondary threshold. The final model was applied to patients with moderate-to-severe TBI in the national TQIP registry (2017-2022) to estimate population-level recovery patterns.

Results: The RF model demonstrated the highest overall performance after recalibration, achieving strong discrimination (AUC 0.887 internal, 0.784 external), good calibration, and high sensitivity (0.890) and negative predictive value (0.909). Applied to 63,289 patients from TQIP, the model estimated that 45% would achieve favorable six-month outcomes at the Youden optimal threshold and 57% at the high-sensitivity threshold, with predicted recovery aligning with established clinical correlates such as younger age, higher admission GCS, and lower rates of penetrating or brainstem injuries.

Conclusion: A machine learning model trained on high-quality trial data can generate clinically plausible estimates of long-term functional recovery when applied at scale to national trauma registries that lack systematic follow-up. This approach enables imputation of functional outcomes in datasets lacking follow-up, supports benchmarking and quality improvement across trauma systems, and provides a foundation for future models incorporating physiologic time-series, imaging, and biomarker data.
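
The abstract reports performance at the Youden optimal threshold, which is the probability cutoff that maximizes J = sensitivity + specificity - 1. The following is a minimal sketch of that calculation in plain Python, together with sensitivity and negative predictive value at a chosen cutoff. The function names and the toy labels and probabilities are hypothetical and purely illustrative; they are not the authors' code or data, and the abstract does not say how ties or recalibration were handled.

```python
from typing import List, Tuple


def youden_threshold(y_true: List[int], y_prob: List[float]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A case is predicted favorable when its probability is >= threshold.
    Ties in J keep the lowest threshold (an arbitrary choice for this sketch).
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    best_t, best_j = 0.0, -1.0
    for t in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= t)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < t)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def sensitivity_npv(y_true: List[int], y_prob: List[float], threshold: float) -> Tuple[float, float]:
    """Return (sensitivity, negative predictive value) at the given threshold."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    sensitivity = tp / (tp + fn)
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return sensitivity, npv


# Toy data (invented for illustration): 1 = favorable outcome.
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
toy_probs = [0.1, 0.2, 0.3, 0.35, 0.6, 0.7, 0.8, 0.9]

threshold, j_stat = youden_threshold(toy_labels, toy_probs)
print(threshold, j_stat)                                   # 0.35 0.75
print(sensitivity_npv(toy_labels, toy_probs, threshold))   # (1.0, 1.0)
```

A high-sensitivity secondary threshold, as described in the Methods, would typically be a lower cutoff chosen to reach a target sensitivity rather than to maximize J; the abstract does not state the target used.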

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

1. Journal of Neurotrauma (27 papers in training set, Top 0.1%): 40.4%
2. Frontiers in Neurology (91 papers in training set, Top 0.5%): 10.3%
50% of probability mass above
3. PLOS ONE (4510 papers in training set, Top 27%): 6.5%
4. Scientific Reports (3102 papers in training set, Top 22%): 5.0%
5. JAMA Network Open (127 papers in training set, Top 0.9%): 3.7%
6. Critical Care Explorations (15 papers in training set, Top 0.2%): 2.1%
7. Annals of Clinical and Translational Neurology (29 papers in training set, Top 0.4%): 2.1%
8. Neurocritical Care (11 papers in training set, Top 0.1%): 1.9%
9. Biology Methods and Protocols (53 papers in training set, Top 0.7%): 1.9%
10. Journal of Clinical Medicine (91 papers in training set, Top 4%): 1.5%
11. Journal of the American Medical Informatics Association (61 papers in training set, Top 2%): 1.1%
12. Annals of Neurology (57 papers in training set, Top 2%): 1.0%
13. EClinicalMedicine (21 papers in training set, Top 0.6%): 1.0%
14. BMJ Open (554 papers in training set, Top 11%): 0.9%
15. Journal of Stroke and Cerebrovascular Diseases (12 papers in training set, Top 0.4%): 0.9%
16. Journal of Neurology (26 papers in training set, Top 1%): 0.9%
17. BMC Medicine (163 papers in training set, Top 6%): 0.8%
18. PLOS Digital Health (91 papers in training set, Top 3%): 0.8%
19. Experimental Neurology (57 papers in training set, Top 1%): 0.8%
20. Biomedicines (66 papers in training set, Top 3%): 0.8%
21. Journal of Clinical Epidemiology (28 papers in training set, Top 0.6%): 0.8%
22. Journal of the American Heart Association (119 papers in training set, Top 4%): 0.8%
23. The Lancet (16 papers in training set, Top 0.7%): 0.8%
24. Brain Communications (147 papers in training set, Top 3%): 0.7%
25. BMC Neurology (12 papers in training set, Top 1.0%): 0.7%
26. Stroke: Vascular and Interventional Neurology (13 papers in training set, Top 0.4%): 0.7%
27. Psychiatry and Clinical Neurosciences (11 papers in training set, Top 0.4%): 0.7%
28. Science Translational Medicine (111 papers in training set, Top 8%): 0.5%
29. British Journal of Anaesthesia (14 papers in training set, Top 1.0%): 0.5%
30. Frontiers in Medicine (113 papers in training set, Top 8%): 0.5%
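
The statement that the top 2 journals account for 50% of the predicted probability mass can be checked by summing the listed probabilities in rank order until the running total reaches 0.5. The sketch below does this for the first five values in the list; the function name `journals_for_mass` is hypothetical, and the assumption that the cutoff is a simple cumulative sum over ranked probabilities is not confirmed by the page itself.

```python
from typing import List, Tuple


def journals_for_mass(probs: List[float], target: float = 0.5) -> Tuple[int, float]:
    """Return (k, cumulative mass) for the smallest k whose top-k sum reaches target."""
    total = 0.0
    for k, p in enumerate(sorted(probs, reverse=True), start=1):
        total += p
        if total >= target:
            return k, total
    # Target never reached: report everything that was summed.
    return len(probs), total


# Probabilities of the five highest-ranked journals in the list, as fractions.
top_probs = [0.404, 0.103, 0.065, 0.050, 0.037]

k, mass = journals_for_mass(top_probs)
print(k, round(mass, 3))  # 2 0.507
```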