AI-Driven Diagnosis of Non-Alcoholic Fatty Liver Disease and Associated Comorbidities

Kumar, S. N.; K S, G.; Chinnakanu, S. J.; Krishnan, H.; M, N.; Subramaniam, S.

2026-02-18 health informatics
10.64898/2026.02.12.26345169 medRxiv

Non-alcoholic fatty liver disease (NAFLD) is a globally prevalent hepatic condition caused by the buildup of fat in the liver. It is frequently associated with metabolic comorbidities such as hypertension, cardiovascular disease (CVD), and prediabetes. Early detection remains challenging because the disease progresses asymptomatically, and the primary diagnostic methods, such as imaging and liver biopsy, are often expensive and inaccessible in rural areas. This study proposes a two-stage, interpretable machine learning pipeline for the non-invasive and cost-effective prediction of NAFLD and its key comorbidities from routine clinical parameters. The NAFLD prediction model uses the XGBoost algorithm and is trained on a hybrid dataset that combines real patient data with rule-based synthetic data generated by simulating clinically plausible cases. When the NAFLD prediction is positive, three separate XGBoost models, trained on data labelled using thresholds, assess the individual risks of hypertension, CVD, and prediabetes. Explainability is provided by SHAP (SHapley Additive exPlanations), which gives insight into feature relevance, while biomarker radar plots support visual interpretation of the comorbidities. A user-friendly Streamlit interface enables real-time interaction with the tool for potential clinical application. The NAFLD model demonstrated robust performance, while the comorbidity models achieved perfect performance, which may reflect the limited size of the dataset used in the second stage. This work underscores the potential of AI-driven tools in NAFLD diagnosis, particularly when combined with explainable AI methods.
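The control flow of the two-stage design described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `run_two_stage`, `TwoStageResult`, the 0.5 decision threshold, the feature names, and the stand-in models are all hypothetical, and the stand-ins are simple functions in place of trained XGBoost models. The numeric cut-offs are arbitrary placeholders, not clinical thresholds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

Features = Dict[str, float]
# Any callable that maps patient features to a probability in [0, 1].
RiskModel = Callable[[Features], float]


@dataclass
class TwoStageResult:
    nafld_probability: float
    nafld_positive: bool
    # None when stage 2 was skipped because stage 1 was negative.
    comorbidity_risks: Optional[Dict[str, float]]


def run_two_stage(
    features: Features,
    nafld_model: RiskModel,
    comorbidity_models: Dict[str, RiskModel],
    threshold: float = 0.5,  # assumed cut-off, not taken from the paper
) -> TwoStageResult:
    """Stage 1 screens for NAFLD; stage 2 runs only on a positive result."""
    probability = nafld_model(features)
    positive = probability >= threshold
    if not positive:
        return TwoStageResult(probability, False, None)
    # One independent model per comorbidity, all fed the same features.
    risks = {name: model(features) for name, model in comorbidity_models.items()}
    return TwoStageResult(probability, True, risks)


# Hypothetical stand-in models with placeholder cut-offs (not clinical values).
def toy_nafld(f: Features) -> float:
    return 0.9 if f["marker_a"] > 1.0 else 0.1


toy_comorbidities: Dict[str, RiskModel] = {
    "hypertension": lambda f: 0.8 if f["marker_b"] > 2.0 else 0.2,
    "cardiovascular_disease": lambda f: 0.7 if f["marker_c"] > 3.0 else 0.3,
    "prediabetes": lambda f: 0.6 if f["marker_d"] > 4.0 else 0.4,
}

positive_case = run_two_stage(
    {"marker_a": 1.5, "marker_b": 2.5, "marker_c": 1.0, "marker_d": 5.0},
    toy_nafld,
    toy_comorbidities,
)
negative_case = run_two_stage(
    {"marker_a": 0.5, "marker_b": 2.5, "marker_c": 1.0, "marker_d": 5.0},
    toy_nafld,
    toy_comorbidities,
)
```

In this sketch a negative stage-1 result returns no comorbidity risks at all, which mirrors the abstract's statement that the comorbidity models run only after a NAFLD-positive prediction. Swapping in real classifiers would only require wrapping each one in a function that returns its positive-class probability.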

Matching journals

The top 7 journals account for 50% of the predicted probability mass.
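The 50% figure can be checked from the listed probabilities with a running total. The snippet below is a small sketch using the ten highest probabilities from the ranking; the function name `journals_needed` is chosen here for illustration only.

```python
from itertools import accumulate
from typing import Optional, Sequence

# Predicted probabilities (in percent) of the ten highest-ranked journals.
probabilities = [22.8, 8.3, 7.3, 4.0, 3.1, 2.6, 2.6, 2.4, 2.1, 2.1]


def journals_needed(probs: Sequence[float], target: float = 50.0) -> Optional[int]:
    """Return the smallest rank at which the cumulative mass reaches target."""
    for rank, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= target:
            return rank
    return None  # the target is never reached with the given probabilities


top_k = journals_needed(probabilities)  # 7
```

The cumulative total is about 48.1% after six journals and about 50.7% after seven, so the seventh journal is the first to cross the 50% mark.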

Rank | Journal | Papers in training set | Top % | Probability
1 | Computers in Biology and Medicine | 120 | Top 0.1% | 22.8%
2 | BMC Medical Informatics and Decision Making | 39 | Top 0.3% | 8.3%
3 | Scientific Reports | 3102 | Top 12% | 7.3%
4 | JAMIA Open | 37 | Top 0.3% | 4.0%
5 | npj Digital Medicine | 97 | Top 1% | 3.1%
6 | PLOS ONE | 4510 | Top 45% | 2.6%
7 | PLOS Digital Health | 91 | Top 1.0% | 2.6%
50% of probability mass above
8 | Frontiers in Artificial Intelligence | 18 | Top 0.1% | 2.4%
9 | Biology Methods and Protocols | 53 | Top 0.6% | 2.1%
10 | IEEE Journal of Biomedical and Health Informatics | 34 | Top 0.8% | 2.1%
11 | Expert Systems with Applications | 11 | Top 0.1% | 1.9%
12 | Computational and Structural Biotechnology Journal | 216 | Top 4% | 1.7%
13 | PLOS Computational Biology | 1633 | Top 16% | 1.7%
14 | Journal of Personalized Medicine | 28 | Top 0.4% | 1.7%
15 | Communications Medicine | 85 | Top 0.3% | 1.7%
16 | Biomedicines | 66 | Top 1% | 1.3%
17 | Bioinformatics | 1061 | Top 8% | 1.3%
18 | JMIR Medical Informatics | 17 | Top 1% | 1.2%
19 | Frontiers in Physiology | 93 | Top 4% | 1.1%
20 | eBioMedicine | 130 | Top 3% | 1.0%
21 | International Journal of Environmental Research and Public Health | 124 | Top 6% | 0.9%
22 | Journal of Biomedical Informatics | 45 | Top 1% | 0.8%
23 | NAR Genomics and Bioinformatics | 214 | Top 3% | 0.8%
24 | Nature Communications | 4913 | Top 61% | 0.8%
25 | Frontiers in Public Health | 140 | Top 8% | 0.8%
26 | Patterns | 70 | Top 2% | 0.8%
27 | Journal of Medical Internet Research | 85 | Top 4% | 0.8%
28 | JMIR Public Health and Surveillance | 45 | Top 4% | 0.8%
29 | npj Systems Biology and Applications | 99 | Top 2% | 0.8%
30 | Frontiers in Molecular Biosciences | 100 | Top 5% | 0.8%