Outcome Risk Modeling for Disability-Free Longevity: Comparison of Random Forest and Random Survival Forest Methods
Vanghelof, J. C.; Tzimas, G.; Du, L.; Tchoua, R.; Shah, R. C.
Background: When creating risk prediction models for time-to-event data, methods that incorporate time are typically used. Random survival forests (RSF), an extension of random forests (RF), are one such class of models. We compared RSF to RF in the context of time-to-event outcomes in the ASPirin in Reducing Events in the Elderly (ASPREE) randomized controlled trial. We hypothesized that RSF would have superior discrimination and calibration versus RF.
Methods: Participants from ASPREE residing outside the US or with missing data were excluded. A total of 2,291 participants were assigned 1:1 to training and test sets. RF and RSF models were trained using a total of 115 measures as candidate predictors. The outcome of interest was the earliest of incident dementia, physical disability, or death.
Results: The primary endpoint occurred in 10.5% of participants. Discrimination was similar between the models: sensitivity (~0.75), specificity (~0.57), positive predictive value (~0.17), time-dependent AUC (~0.71), and Harrell's concordance (~0.73). Calibration was likewise similar (Brier score ~0.09).
Discussion: The RF and RSF models exhibited comparable discrimination and calibration. We conclude that RSF may not always lead to more accurate predictions of outcomes compared to RF. Further examination in different clinical trial cohorts is needed to better understand the context in which adding time into outcomes risk modeling adds value.
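For readers unfamiliar with the reported metrics, the following is a minimal pure-Python sketch of how Harrell's concordance, sensitivity, specificity, positive predictive value, and a Brier score can be computed. It is not the authors' code. The function names and the toy data are hypothetical, tied event times are skipped for simplicity, and the Brier score shown is the plain binary version at a fixed horizon without any censoring adjustment, which may differ from the variant used in the study.

```python
from itertools import combinations


def harrell_c(times, events, risks):
    """Harrell's concordance for right-censored data (illustrative sketch).

    A pair is comparable when the subject with the shorter follow-up time
    had the event. It is concordant when that subject also has the higher
    predicted risk; ties in predicted risk count as half.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # earlier subject was censored, so the order is unknown
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5
    return concordant / comparable


def classification_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, positive predictive value)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return tp / (tp + fn), tn / (tn + fp), tp / (tp + fp)


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


# Hypothetical toy data: follow-up time, event indicator (1 = event,
# 0 = censored), and a model's predicted risk for five participants.
times = [2, 4, 6, 8, 10]
events = [1, 0, 1, 1, 0]
risks = [0.9, 0.3, 0.5, 0.6, 0.1]

c_index = harrell_c(times, events, risks)
sens, spec, ppv = classification_metrics([1, 1, 0, 0, 0, 1], [1, 0, 1, 0, 0, 1])
brier = brier_score([1, 0], [0.8, 0.1])
```

Concordance uses the ordering of event times, whereas sensitivity, specificity, and the binary Brier score only need an event status at a chosen horizon. That distinction is why a model that ignores time (RF) can still be scored on most of the same metrics as one that models it (RSF).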
Matching journals
The top 4 journals account for 50% of the predicted probability mass.