When clinical prediction models do not generalize: a simulation study in liver transplantation
Brulhart, D.; Magini, G.; Schafer, A.; Schwab, S.; Held, U.
Objectives: Clinical prediction models estimate a patient's risk of a future outcome. Such models are often externally validated on independent datasets; however, even when a model has been rigorously validated in one new setting and patient population, its performance in other clinical settings remains unclear. We therefore systematically evaluated model performance and clinical utility across diverse patient populations to quantify the limits of transportability. Methods: Taking liver transplantation as an example, we used the UK donation-after-circulatory-death (DCD) Risk Score and descriptive statistics from Swiss DCD liver transplant populations to simulate realistic target populations with varying donor and recipient characteristics. The risk score's ability to predict one-year graft failure was evaluated using the calibration intercept, the calibration slope, the area under the receiver operating characteristic (ROC) curve, and net benefit. Results: The performance of the UK DCD Risk Score depended heavily on the characteristics of the simulated population. The score performed adequately in settings similar to those in which it was derived, but its performance was unsatisfactory in others. Discussion: Using a risk score in liver transplantation as an example, the study showed that the usefulness of a prediction model can be limited in external populations that differ from the development population, and that its transportability to new settings is not guaranteed. Conclusion: This study highlights the importance of external validation of clinical prediction models for determining their transportability to different target populations. Applying such models requires careful consideration and, potentially, model re-estimation.
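The four performance measures named in the Methods can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the normally distributed linear predictor, the shift and slope parameters, and the 20% decision threshold are hypothetical. It does not use the coefficients of the UK DCD Risk Score or any Swiss data, and it is not the authors' simulation code. It simulates one population in which a generic risk model is well calibrated and one in which the true risks differ, then computes the calibration intercept (slope fixed at 1), the calibration slope, the area under the ROC curve, and net benefit.

```python
import math
import random


def expit(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fit_recalibration(y, lp, fix_slope=False, iters=50):
    """Fit logit P(y=1) = a + b * lp by Newton-Raphson and return (a, b).

    With fix_slope=True, b is held at 1 (lp acts as an offset) and only a is estimated.
    """
    a, b = 0.0, 1.0
    for _ in range(iters):
        g_a = g_b = h_aa = h_ab = h_bb = 0.0
        for yi, xi in zip(y, lp):
            p = expit(a + b * xi)
            w = p * (1.0 - p)
            g_a += yi - p
            g_b += (yi - p) * xi
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        if fix_slope:
            step_a, step_b = g_a / h_aa, 0.0
        else:
            det = h_aa * h_bb - h_ab * h_ab
            step_a = (h_bb * g_a - h_ab * g_b) / det
            step_b = (h_aa * g_b - h_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < 1e-10:
            break
    return a, b


def auc(y, p):
    """Probability that a random event has a higher predicted risk than a random non-event."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def net_benefit(y, p, threshold):
    """Net benefit of treating everyone whose predicted risk is at or above the threshold."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def simulate(n, true_shift, true_slope, seed):
    """Hypothetical target population: true logit risk = true_shift + true_slope * lp."""
    rng = random.Random(seed)
    lp = [rng.gauss(-1.5, 1.0) for _ in range(n)]  # model's linear predictor (assumed)
    y = [1 if rng.random() < expit(true_shift + true_slope * x) else 0 for x in lp]
    return y, lp


def evaluate(y, lp, threshold=0.2):
    """Compute the four measures for predictions expit(lp) against outcomes y."""
    p = [expit(x) for x in lp]
    intercept, _ = fit_recalibration(y, lp, fix_slope=True)
    _, slope = fit_recalibration(y, lp)
    return {
        "calibration_intercept": intercept,
        "calibration_slope": slope,
        "auc": auc(y, p),
        "net_benefit": net_benefit(y, p, threshold),
    }


# Population resembling the development setting: the model's risks are the true risks.
similar = evaluate(*simulate(2000, true_shift=0.0, true_slope=1.0, seed=1))
# Population that differs: higher baseline risk and weaker predictor effects.
different = evaluate(*simulate(2000, true_shift=1.0, true_slope=0.5, seed=2))

for name, result in (("similar", similar), ("different", different)):
    print(name, {k: round(v, 3) for k, v in result.items()})
```

In a sketch like this, a calibration intercept near 0 and a slope near 1 indicate agreement between predicted and observed risks, a positive intercept indicates that risk is underestimated on average, and a slope below 1 indicates predictions that are too extreme for the target population.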
Matching journals
The top 9 journals account for 50% of the predicted probability mass.