Interpretable Machine Learning for Population-Level Severe Tooth Loss Prediction: A Two-Axis External Validation
LAM, Q. T.; Fan, F.-Y.; Wang, Y.-L.; Wu, C.-Y.; Sun, Y.-S.; Vo, T. T. T.; Kuo, H.; Kha, Q. H.; Le, M. H. N.; Vu, G.; Le, N. Q. K.; Lee, I.-T.
Objectives: Machine learning can predict severe tooth loss (STL, six or more missing teeth), but opaque black-box models that ignore complex survey designs limit clinical adoption. This study developed and externally validated an intrinsically interpretable, survey-weighted framework for population-level STL prediction that captures socio-behavioral and systemic health determinants. Methods: We analyzed nationally representative data from BRFSS 2022 (derivation, N=433,772), BRFSS 2024 (temporal validation, N=448,213), and the clinically examined NHANES 2015-2018 (cross-domain validation, N=10,775). Missing data were imputed using an anti-leakage MICE pipeline based on HistGradientBoosting, intended to preserve multivariate epidemiological variance. An Explainable Boosting Machine (EBM, GA2M) was trained with the complex survey weights incorporated directly. For external clinical validation, structural domain shift was addressed through non-parametric isotonic regression recalibration. Results: The EBM showed strong temporal stability on BRFSS 2024 (AUC: 0.8627; Brier score: 0.0845). On cross-domain validation against NHANES 2015-2018, the recalibrated model demonstrated robust transportability (AUC: 0.7504; Brier score: 0.1358). Notably, the zero-shot EBM (AUC: 0.7591) closely matched the predictive ceiling of a black-box stacked meta-ensemble (AUC: 0.7706), eliminating the need for unstable post-hoc approximations. Fully auditable shape functions revealed non-linear risk thresholds and synergistic pairwise interactions for key predictors, including age, smoking, income, and diabetes. Decision curve analysis confirmed substantial positive net clinical benefit across risk thresholds from 5% to 50%. Conclusions: The MICE-EBM framework predicts STL with complete intrinsic transparency and robust probabilistic reliability. Having generalized to unseen temporal and clinical cohorts, this TRIPOD+AI-compliant framework provides a clinically deployable tool for targeting dental public health interventions.
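The two summary metrics reported in the abstract, the Brier score and the net benefit used in decision curve analysis, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the optional survey-weight argument, and the toy data are assumptions made for illustration. The formulas are the standard definitions: the Brier score is the (weighted) mean squared difference between predicted probability and outcome, and net benefit at threshold pt is TP/N - (FP/N) * pt / (1 - pt).

```python
from typing import Optional, Sequence


def _weights(n: int, w: Optional[Sequence[float]]) -> list:
    # Fall back to equal weights when no survey weights are supplied.
    return [1.0] * n if w is None else list(w)


def weighted_brier(y: Sequence[int], p: Sequence[float],
                   w: Optional[Sequence[float]] = None) -> float:
    """Weighted mean squared difference between outcomes and predicted risks."""
    ws = _weights(len(y), w)
    return sum(wi * (pi - yi) ** 2 for yi, pi, wi in zip(y, p, ws)) / sum(ws)


def net_benefit(y: Sequence[int], p: Sequence[float], threshold: float,
                w: Optional[Sequence[float]] = None) -> float:
    """Net benefit of intervening on everyone with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    ws = _weights(len(y), w)
    total = sum(ws)
    flagged = [(yi, wi) for yi, pi, wi in zip(y, p, ws) if pi >= threshold]
    tp = sum(wi for yi, wi in flagged if yi == 1)  # weighted true positives
    fp = sum(wi for yi, wi in flagged if yi == 0)  # weighted false positives
    return tp / total - (fp / total) * threshold / (1.0 - threshold)


# Hypothetical toy data (not from BRFSS or NHANES).
y = [1, 0, 1, 0, 0]
p = [0.8, 0.2, 0.6, 0.4, 0.1]
w = [1.0, 2.0, 1.0, 1.0, 1.0]

if __name__ == "__main__":
    print("Brier (unweighted):", weighted_brier(y, p))
    print("Brier (weighted):", weighted_brier(y, p, w))
    for pt in (0.05, 0.30, 0.50):
        print(f"net benefit at pt={pt:.2f}:", net_benefit(y, p, pt))
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds (for example 0.05 to 0.50, the range quoted in the abstract) and comparing the result with the treat-all and treat-none strategies.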
Matching journals
The top 7 journals account for 50% of the predicted probability mass.