Multicohort development and validation of a machine learning model to predict six-month functional traumatic brain injury outcomes in a large national registry
Vattipally, V. N.; Jillala, R. R.; Kramer, P.; Elshareif, M.; Singh, S.; Jo, J.; Suarez, J. I.; Sakran, J. V.; Haut, E. R.; Huang, J.; Bettegowda, C.; Azad, T. D.
Background: Prognostication after moderate-to-severe traumatic brain injury (TBI) rarely captures long-term functional recovery, despite its importance to patients, families, and clinicians. Large trauma registries such as the Trauma Quality Improvement Program (TQIP) dataset contain detailed clinical data but lack systematic follow-up, which limits their usefulness for studying longer-term functional outcomes.
Methods: We developed and externally validated a machine learning model to predict favorable six-month functional outcome, defined as a Glasgow Outcome Scale (GOS) of moderate disability or good recovery (MD/GR) or a GOS-Extended (GOSE) score ≥5, using harmonized data from two randomized clinical trials: CRASH (training) and ROC-TBI (validation). Five candidate classifiers (random forest [RF], linear discriminant analysis, k-nearest neighbors, naive Bayes, and support vector machine) were trained using seven shared clinical predictors. Models were evaluated using the area under the receiver operating characteristic curve (AUC), calibration metrics, and performance at two thresholds: the Youden optimal threshold and a secondary high-sensitivity threshold. The final model was applied to patients with moderate-to-severe TBI in the national TQIP registry (2017-2022) to estimate population-level recovery patterns.
Results: The RF model demonstrated the highest overall performance after recalibration, achieving strong discrimination (AUC 0.887 internally and 0.784 externally), good calibration, and high sensitivity (0.890) and negative predictive value (0.909). Applied to 63,289 patients from TQIP, the model estimated that 45% would achieve favorable six-month outcomes at the Youden optimal threshold and 57% at the high-sensitivity threshold. Predicted recovery aligned with established clinical correlates such as younger age, higher admission Glasgow Coma Scale (GCS) score, and lower rates of penetrating or brainstem injuries.
Conclusion: A machine learning model trained on high-quality trial data can generate clinically plausible estimates of long-term functional recovery when applied at scale to national trauma registries that lack systematic follow-up. This approach enables imputation of functional outcomes in datasets lacking follow-up, supports benchmarking and quality improvement across trauma systems, and provides a foundation for future models incorporating physiologic time-series, imaging, and biomarker data.
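The two operating points named in the Methods can be illustrated with a minimal sketch. It is not the authors' code: the function names, the 0.90 sensitivity target, and the toy labels and probabilities are all assumptions made for illustration, and the abstract does not state how the high-sensitivity threshold was actually chosen. The Youden threshold is the cutoff that maximizes J = sensitivity + specificity - 1.

```python
# Illustrative sketch only: hypothetical helper names and made-up toy data,
# not the study's implementation or its patient data.

def sens_spec(y_true, y_prob, threshold):
    """Sensitivity and specificity when predicting 'favorable' for prob >= threshold.

    Assumes y_true contains at least one positive (1) and one negative (0).
    """
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if y == 1 and p >= threshold)
    fn = sum(1 for y, p in pairs if y == 1 and p < threshold)
    tn = sum(1 for y, p in pairs if y == 0 and p < threshold)
    fp = sum(1 for y, p in pairs if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(y_true, y_prob):
    """Candidate threshold that maximizes Youden's J = sensitivity + specificity - 1."""
    candidates = sorted(set(y_prob))
    return max(candidates, key=lambda t: sum(sens_spec(y_true, y_prob, t)) - 1)


def high_sensitivity_threshold(y_true, y_prob, target=0.90):
    """Highest candidate threshold whose sensitivity is at least `target`.

    The 0.90 default is an assumed target, not a value taken from the paper.
    """
    candidates = sorted(set(y_prob), reverse=True)
    for t in candidates:
        if sens_spec(y_true, y_prob, t)[0] >= target:
            return t
    return candidates[-1]


# Toy example: 1 = favorable six-month outcome, 0 = unfavorable (made-up values).
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.65, 0.7, 0.8, 0.9]

t_youden = youden_threshold(y_true, y_prob)
t_high_sens = high_sensitivity_threshold(y_true, y_prob)
print("Youden threshold:", t_youden)
print("High-sensitivity threshold:", t_high_sens)
```

On this toy data the high-sensitivity cutoff falls below the Youden cutoff, so more patients are classified as likely to recover. That is the same direction as the reported difference between the two TQIP estimates (57% versus 45%).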