Interpretable Multimodal Machine Learning Model for Predicting Health Risks of Patients with Heart Failure
Chae, R.; Zhou, J.; Chou, O. H. I.; Yang, B.; Pu, H.; Tse, G.; Cheung, B. M. Y.; Zhu, T.; Car, J.; Lu, L.
Heart failure (HF) is a major cause of morbidity and mortality worldwide, necessitating accurate tools for health outcome prediction and risk stratification. In this study, we propose an interpretable multimodal machine learning framework that integrates four clinical data modalities (demographics, medications, laboratory tests, and electrocardiograms [ECGs]) to predict 30-day all-cause mortality and hospital readmission in HF patients. Using clinical data from 2,868 HF patients across 43 local hospitals in Hong Kong, we trained and evaluated ten machine learning models for HF risk prediction; the best-performing model achieved an area under the receiver operating characteristic curve (AUC) of 0.881 for mortality and 0.709 for readmission. Notably, laboratory tests and ECG features accounted for most of the predictive power, and their combination alone yielded near-optimal results (AUC: 0.872), suggesting that these two modalities may be adequate for effective risk prediction in resource-constrained settings. SHapley Additive exPlanations (SHAP) analysis identified serum albumin, high-sensitivity troponin I, lactate dehydrogenase, and QT interval dispersion as key predictors. Feature redundancy analysis further revealed strong correlations within the laboratory test features and within the ECG features, suggesting opportunities for model simplification. To the best of our knowledge, this is the first study to comprehensively evaluate diverse configurations of four data modalities for HF risk prediction through ablation analysis, quantifying the marginal gains of each data modality and of their combinations. Our findings demonstrate that an interpretable multimodal machine learning model can enhance risk prediction in HF patients, supporting personalized management and scalable deployment across diverse healthcare settings.
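The modality ablation described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`modality_subsets`, `roc_auc`, `marginal_gain`) and the modality labels are hypothetical, and the abstract does not state how marginal gains were aggregated, so averaging over subsets is an assumption. The sketch enumerates every non-empty combination of the four modalities, computes AUC with the standard pairwise-ranking definition, and averages the AUC change from adding one modality.

```python
from itertools import combinations

# Hypothetical labels for the four modalities named in the abstract.
MODALITIES = ("demographics", "medications", "laboratory_tests", "ecg")


def modality_subsets(modalities=MODALITIES):
    """Yield every non-empty combination of modalities, smallest first."""
    for size in range(1, len(modalities) + 1):
        for subset in combinations(modalities, size):
            yield subset


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case, with ties counted as one half."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def marginal_gain(auc_by_subset, modality):
    """Average AUC change from adding `modality` to each evaluated subset
    that lacks it. Keys of `auc_by_subset` are frozensets of modality names.
    Averaging is an assumption; the source does not specify the aggregation."""
    gains = []
    for subset, auc in auc_by_subset.items():
        if modality in subset:
            continue
        extended = frozenset(subset | {modality})
        if extended in auc_by_subset:
            gains.append(auc_by_subset[extended] - auc)
    if not gains:
        raise ValueError("no comparable pair of subsets for this modality")
    return sum(gains) / len(gains)
```

In use, one model would be trained and scored per subset from `modality_subsets()` (15 configurations for four modalities), the resulting AUCs stored under `frozenset(subset)` keys, and `marginal_gain` called once per modality. The pairwise AUC loop is quadratic in cohort size, which is acceptable for a few thousand patients but would be replaced by a rank-based computation on larger data.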
Matching journals
The top 7 journals account for 50% of the predicted probability mass.