
CardioAI: An Explainable Machine Learning System for Cardiovascular Risk Prediction and Patient Retention in Nigerian Healthcare Settings

Gboh-Igbara, D. C.

2026-03-31 rehabilitation medicine and physical therapy
10.64898/2026.03.29.26349642 medRxiv

Abstract

Background: Cardiovascular disease is the leading cause of mortality in Nigeria and across sub-Saharan Africa, with rising incidence attributable to urbanisation, sedentary lifestyles, and limited access to early detection tools. Concurrently, patient dropout from rehabilitation programs remains a critical operational challenge for Nigerian clinics, with many patients failing to return after their initial consultation.

Methods: We developed CardioAI, an Explainable Artificial Intelligence system comprising two predictive modules. The cardiovascular risk module trained four machine learning models - Logistic Regression, Random Forest, Gradient Boosting (XGBoost), and a Multilayer Perceptron - on a combined UCI Heart Disease dataset of 1,025 patient records. A novel Lifestyle Risk Index was engineered from five modifiable clinical markers. SHAP (SHapley Additive exPlanations) was applied for per-prediction feature attribution. The patient retention module trained three classifiers on a synthetic dataset of 800 records, modelling 10 operational and behavioural dropout factors. An NLP and OCR pipeline using Tesseract v5.5 and spaCy was implemented for clinical document processing.

Results: The cardiovascular risk module achieved an AUC-ROC of 0.999 (XGBoost), 0.998 (Random Forest), 0.994 (MLP), and 0.927 (Logistic Regression) on the held-out test set. Cross-validated AUC with constrained tree depth was 0.97, confirming generalisation. SHAP analysis identified the Lifestyle Risk Index, ST depression, resting blood pressure, exercise-induced angina, and cholesterol as the five most influential predictors. The retention module achieved AUC-ROC of 0.66 (Logistic Regression), demonstrating the difficulty of dropout prediction with synthetic data.

Conclusions: CardioAI demonstrates that explainable machine learning can provide clinically actionable cardiovascular risk assessment and patient retention intelligence in a low-resource Nigerian healthcare context. The system is freely deployable, open-source, and designed for pilot validation in teaching hospitals across Lagos and Port Harcourt.

Keywords: cardiovascular risk prediction, machine learning, explainable AI, SHAP, patient retention, clinical decision support, Nigeria, sub-Saharan Africa, XGBoost, random forest, digital health
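
For readers unfamiliar with the AUC-ROC figures reported in the Results, the following is a minimal sketch of how the metric can be computed from labels and predicted scores. It is not the authors' code: the function name `auc_roc` and the toy labels and scores are assumptions made for illustration, and the calculation uses the standard rank-based (Mann-Whitney U) formulation, in which AUC-ROC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def auc_roc(labels, scores):
    """Rank-based AUC-ROC; tied scores contribute 0.5 (hypothetical helper)."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC-ROC needs at least one positive and one negative label")

    # Sort cases by predicted score so that ranks can be assigned.
    pairs = sorted(zip(scores, labels))
    n = len(pairs)

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied cases share the average of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        positives_in_run = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_run
        i = j

    # Mann-Whitney U statistic for the positive class, normalised to [0, 1].
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Toy data (illustrative only, not taken from the study).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_roc(toy_labels, toy_scores))  # 0.75
```

Under this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which helps place the reported 0.66 for the retention module and the values near 1.0 for the cardiovascular risk module.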

Matching journals

The top 1 journal accounts for 50% of the predicted probability mass.