Classification of Recurrence Status After Surgical Treatment of Chronic Subdural Hemorrhage - A Machine Learning Approach
Hamou, H.; Kernbach, J.; Ridwan, H.; Fay-Rodrian, K.; Clusmann, H.; Hoellig, A.; Veldeman, M.
Background Chronic subdural hematoma (cSDH) recurrence requiring reoperation occurs in 5-33% of cases, representing a substantial clinical and economic burden. The ability to predict recurrence could enable risk-stratified surveillance protocols, potentially reducing imaging burden in low-risk patients while maintaining close monitoring for high-risk individuals. We evaluated whether machine learning algorithms could achieve clinically actionable recurrence prediction using routinely available clinical and radiographic variables.
Methods This retrospective single-center study included 564 consecutive patients who underwent surgical evacuation of cSDH between 2015 and 2023. Data were randomly divided into training (75%, n=422) and test (25%, n=142) sets. We developed and compared three machine learning models (regularized logistic regression, Random Forest, and XGBoost) using 31 predictor variables including demographics, comorbidities, medications, laboratory values, hematoma characteristics, and postoperative features. Model development and hyperparameter tuning were performed exclusively on the training set using 10-fold cross-validation. The best-performing model was selected and evaluated on the held-out test set. The primary outcome was postoperative recurrence requiring reoperation.
Results Postoperative recurrence occurred in 170 patients (30.1%). Within the training set, XGBoost achieved the highest cross-validated ROC AUC of 0.713 (SE=0.024), outperforming regularized logistic regression (0.686) and matching Random Forest (0.713). Variable importance analysis identified hematoma volume, coagulation parameters (INR, platelets, aPTT), and disease severity markers (ICU admission, GCS) as the most influential predictors, though absolute effect sizes remained modest. On the held-out test set, the final XGBoost model achieved ROC AUC 0.688 (95% CI: 0.590-0.772) with excellent calibration. However, at the clinically relevant 90% sensitivity threshold, test set specificity was only 30.3%, allowing potential imaging reduction in approximately one-third of non-recurrence patients. The consistency between training and test performance confirmed that limitations stem from inherent predictor information content rather than overfitting.
Conclusions Machine learning models using routinely available clinical and radiographic variables cannot achieve clinically actionable risk stratification for cSDH recurrence. Despite rigorous methodology and internal validation, discriminative capacity remained insufficient to identify a low-risk patient subgroup suitable for de-escalated surveillance. These findings suggest that recurrence is driven by factors not captured in standard clinical assessment, and support either uniform surveillance protocols or symptom-driven imaging strategies rather than risk-stratified approaches.
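The "specificity at 90% sensitivity" figure reported in the Results can be illustrated with a minimal sketch. The abstract does not state which software the authors used, so this is an assumption-laden illustration, not their implementation: the function name `specificity_at_sensitivity` is hypothetical, and the labels and scores are made-up toy values rather than study data. The idea is to lower the decision threshold until at least 90% of recurrences are flagged, then report the fraction of non-recurrence patients still classified as low risk.

```python
def specificity_at_sensitivity(y_true, scores, target_sensitivity=0.90):
    """Return (threshold, sensitivity, specificity) at the highest score
    threshold whose sensitivity reaches the target.

    y_true: 1 = recurrence, 0 = no recurrence.
    scores: predicted recurrence probabilities, same order as y_true.
    A patient is flagged as high risk when score >= threshold.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes must be present")

    # Walk thresholds from strictest to most lenient; sensitivity can only
    # rise and specificity can only fall, so the first hit is the best one.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
        sensitivity = tp / positives
        if sensitivity >= target_sensitivity:
            specificity = (negatives - fp) / negatives
            return threshold, sensitivity, specificity

    # Not reachable for targets <= 1.0: the lowest threshold flags everyone.
    raise ValueError("target sensitivity cannot be reached")


# Toy example (hypothetical values, NOT data from the study).
toy_labels = [1] * 10 + [0] * 10
toy_scores = [0.90, 0.80, 0.70, 0.60, 0.55, 0.50, 0.45, 0.40, 0.35, 0.10,
              0.65, 0.42, 0.38, 0.30, 0.25, 0.20, 0.15, 0.12, 0.08, 0.05]

threshold, sensitivity, specificity = specificity_at_sensitivity(
    toy_labels, toy_scores, target_sensitivity=0.90
)
print(f"threshold={threshold}, sensitivity={sensitivity}, specificity={specificity}")
```

Read this way, the reported test set specificity of 30.3% means that, at a threshold catching 90% of recurrences, roughly three in ten non-recurrence patients would fall below it and could be considered for reduced imaging.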
Matching journals
The top 11 journals account for 50% of the predicted probability mass.