A bibliometric review of explainable AI in diabetes risk prediction: Trends, gaps, and knowledge graph opportunities

Van, T. A.

2026-04-20 health informatics
10.64898/2026.04.16.26351069 medRxiv
Background: Type 2 diabetes mellitus (T2DM) is a leading global public health challenge. Machine learning (ML) combined with Explainable AI (XAI) is increasingly applied to T2DM risk prediction, but the field lacks a quantitative overview of methodological trends and integration gaps.

Methods: We present a structured synthesis and critical analysis of the XAI literature on T2DM risk prediction, combining (i) quantitative bibliometric analysis of a two-database corpus (N = 2,048 documents from Scopus and PubMed/MEDLINE, deduplicated via a transparent three-tier pipeline) and (ii) an in-depth selective review of 15 highly cited papers. Reporting follows PRISMA 2020, adapted for metadata-based synthesis; analyses include keyword frequency, rule-based thematic clustering, and publication trend analysis.

Results: The field grew rapidly, from 36 documents (2020) to 866 (2025). SHAP and LIME dominate XAI methods; XGBoost and Random Forest dominate ML models. Critically, KG/GNN terms appeared in only 17 documents (~0.83%) compared with 906 for XAI methods, a 53.3:1 disparity. This gap is consistent across both databases, which share 33.2% of their records, ruling out a single-database artifact. The selective review confirmed that none of the 15 highly cited papers combined all three components (ML, XAI, and KG) in T2DM risk prediction.

Conclusions: The XAI for T2DM risk prediction field exhibits a clinical interpretability gap: statistical explanations are rarely linked to structured clinical pathways. We propose a three-layer conceptual framework (Predictive → Explainability → Knowledge) that integrates KG as a supplementary semantic layer, with potential applications in clinical decision support and population-level screening. The framework does not perform true causal inference but structures explanations around established pathophysiological knowledge. This study contributes a transferable methodology and a quantified research gap to guide future work integrating ML, XAI, and structured medical knowledge.
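
The headline gap figures in the Results can be re-derived from the three counts quoted in the abstract. The short sketch below only repeats that arithmetic; the variable names are illustrative and nothing here comes from the authors' own pipeline.

```python
# Counts as quoted in the abstract (no other data is assumed).
total_docs = 2048   # deduplicated corpus size, N
kg_gnn_docs = 17    # documents mentioning KG/GNN terms
xai_docs = 906      # documents mentioning XAI methods

# Share of the corpus that mentions KG/GNN terms, in percent.
kg_share_pct = 100 * kg_gnn_docs / total_docs

# How many XAI-method documents exist per KG/GNN document.
xai_to_kg_ratio = xai_docs / kg_gnn_docs

print(f"KG/GNN share of corpus: {kg_share_pct:.2f}%")
print(f"XAI : KG/GNN ratio: {xai_to_kg_ratio:.1f}:1")
```

Both values agree with the figures reported in the abstract (about 0.83% and 53.3:1).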

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. BMC Medical Informatics and Decision Making | 39 papers in training set | Top 0.1% | 13.8%
2. JAMIA Open | 37 papers in training set | Top 0.1% | 13.8%
3. Journal of the American Medical Informatics Association | 61 papers in training set | Top 0.3% | 9.7%
4. Journal of Biomedical Informatics | 45 papers in training set | Top 0.2% | 7.9%
5. JMIR Medical Informatics | 17 papers in training set | Top 0.2% | 4.7%
6. Scientific Reports | 3102 papers in training set | Top 32% | 3.8%
(50% of probability mass above)
7. Journal of Medical Internet Research | 85 papers in training set | Top 1% | 3.7%
8. BMC Medical Research Methodology | 43 papers in training set | Top 0.3% | 3.5%
9. PLOS ONE | 4510 papers in training set | Top 41% | 3.5%
10. PLOS Digital Health | 91 papers in training set | Top 0.8% | 3.5%
11. npj Digital Medicine | 97 papers in training set | Top 2% | 2.0%
12. eBioMedicine | 130 papers in training set | Top 1% | 1.8%
13. JMIR Public Health and Surveillance | 45 papers in training set | Top 2% | 1.6%
14. Journal of Personalized Medicine | 28 papers in training set | Top 0.4% | 1.6%
15. BMJ Open | 554 papers in training set | Top 10% | 1.4%
16. International Journal of Medical Informatics | 25 papers in training set | Top 1.0% | 1.4%
17. BMJ Health & Care Informatics | 13 papers in training set | Top 0.6% | 1.3%
18. BMC Medicine | 163 papers in training set | Top 5% | 1.3%
19. The Lancet Digital Health | 25 papers in training set | Top 0.7% | 1.2%
20. Computers in Biology and Medicine | 120 papers in training set | Top 3% | 1.2%
21. Artificial Intelligence in Medicine | 15 papers in training set | Top 0.6% | 0.9%
22. Frontiers in Medicine | 113 papers in training set | Top 7% | 0.8%
23. Frontiers in Artificial Intelligence | 18 papers in training set | Top 0.7% | 0.8%
24. Bioinformatics | 1061 papers in training set | Top 10% | 0.7%
25. Acta Neuropsychiatrica | 12 papers in training set | Top 1% | 0.7%
26. European Journal of Epidemiology | 40 papers in training set | Top 0.8% | 0.7%
27. Expert Systems with Applications | 11 papers in training set | Top 0.6% | 0.6%
28. Communications Medicine | 85 papers in training set | Top 2% | 0.6%
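
The statement that the top 6 journals cover 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below assumes the displayed values are rounded to one decimal place, so the totals are approximate; the function name `ranks_to_reach` is hypothetical and not part of the site.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-8, copied from the list above.
probabilities = [13.8, 13.8, 9.7, 7.9, 4.7, 3.8, 3.7, 3.5]


def ranks_to_reach(values, threshold):
    """Return (rank, cumulative total) at the first rank whose running
    sum reaches the threshold, or None if it is never reached."""
    for rank, running_total in enumerate(accumulate(values), start=1):
        if running_total >= threshold:
            return rank, running_total
    return None


rank, cumulative = ranks_to_reach(probabilities, 50.0)
print(f"Threshold reached at rank {rank} with {cumulative:.1f}% cumulative mass")
```

With these rounded values the first five journals sum to about 49.9% and the sixth brings the total to about 53.7%, which is consistent with the cut-off shown after rank 6.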