Artificial Intelligence in Diabetes Care: Evaluating GPT-4's Competency in Reviewing Diabetic Patient Management Plan in Comparison to Expert Review

Mondal, A.; Naskar, A.

2024-04-14 endocrinology
10.1101/2024.04.12.24305732 medRxiv

Background: The escalating global burden of diabetes necessitates innovative management strategies. Artificial intelligence, particularly large language models like GPT-4, presents a promising avenue for improving guideline adherence in diabetes care. Such technologies could revolutionize patient management by offering personalized, evidence-based treatment recommendations.

Methods: A comparative, blinded design was employed, involving 50 hypothetical diabetes mellitus case summaries, emphasizing varied aspects of diabetes management. GPT-4 evaluated each summary for guideline adherence, classifying them as compliant or non-compliant, based on the ADA guidelines. A medical expert, blinded to GPT-4's assessments, independently reviewed the summaries. Concordance between GPT-4 and the expert's evaluations was statistically analyzed, including calculating Cohen's kappa for agreement.

Results: GPT-4 labelled 30 summaries as compliant and 20 as non-compliant, while the expert identified 28 as compliant and 22 as non-compliant. Agreement was reached on 46 of the 50 cases, yielding a Cohen's kappa of 0.84, indicating near-perfect agreement. GPT-4 demonstrated a 92% accuracy, with a sensitivity of 86.4% and a specificity of 96.4%. Discrepancies in four cases highlighted challenges in AI's understanding of complex clinical judgments related to medication adjustments and treatment modifications.

Conclusion: GPT-4 exhibits promising potential to support health-care professionals in reviewing diabetes management plans for guideline adherence. Despite high concordance with expert assessments, instances of non-agreement underscore the need for AI refinement in complex clinical scenarios. Future research should aim at enhancing AI's clinical reasoning capabilities and exploring its integration with other technologies for improved healthcare delivery.
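The reported figures can be cross-checked with a short calculation. The sketch below is illustrative only: the abstract does not give the individual confusion-matrix cells, so the four counts are inferred here from the stated totals (30/20 for GPT-4, 28/22 for the expert, 46 agreements) and the reported rates, under the assumption that "non-compliant" by expert review is the positive class. The function name `agreement_metrics` is hypothetical.

```python
def agreement_metrics(tp, fn, fp, tn):
    """Return accuracy, sensitivity, specificity and Cohen's kappa
    for a 2x2 table with the expert as the reference rater."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n            # observed agreement p_o
    sensitivity = tp / (tp + fn)        # expert-positive cases GPT-4 also flagged
    specificity = tn / (tn + fp)        # expert-negative cases GPT-4 also cleared
    # Chance agreement p_e from the two raters' marginal totals
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n**2
    kappa = (accuracy - p_e) / (1 - p_e)
    return accuracy, sensitivity, specificity, kappa


# Inferred cells (an assumption, not reported in the abstract):
# positive = non-compliant according to the expert.
TP = 19  # both raters: non-compliant
FN = 3   # expert: non-compliant, GPT-4: compliant
FP = 1   # expert: compliant, GPT-4: non-compliant
TN = 27  # both raters: compliant

acc, sens, spec, kappa = agreement_metrics(TP, FN, FP, TN)
print(f"accuracy={acc:.2%} sensitivity={sens:.1%} "
      f"specificity={spec:.1%} kappa={kappa:.2f}")
```

With these inferred cells, the marginals match the abstract (GPT-4: 30 compliant, 20 non-compliant; expert: 28 compliant, 22 non-compliant), and the computed values round to the reported 92%, 86.4%, 96.4%, and 0.84.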

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. JMIR Medical Informatics (17 papers in training set, Top 0.1%): 18.5%
2. npj Digital Medicine (97 papers in training set, Top 0.3%): 17.4%
3. Bioengineering (24 papers in training set, Top 0.1%): 6.3%
4. Journal of the American Medical Informatics Association (61 papers in training set, Top 0.5%): 4.8%
5. Expert Systems with Applications (11 papers in training set, Top 0.1%): 4.8%

50% of probability mass above

6. BMC Medical Informatics and Decision Making (39 papers in training set, Top 0.7%): 3.9%
7. PLOS ONE (4510 papers in training set, Top 40%): 3.6%
8. Scientific Reports (3102 papers in training set, Top 37%): 3.6%
9. PLOS Digital Health (91 papers in training set, Top 0.9%): 3.1%
10. JMIR Public Health and Surveillance (45 papers in training set, Top 1%): 2.4%
11. JAMIA Open (37 papers in training set, Top 0.6%): 2.3%
12. Journal of Medical Internet Research (85 papers in training set, Top 2%): 2.1%
13. Frontiers in Physiology (93 papers in training set, Top 3%): 1.7%
14. iScience (1063 papers in training set, Top 18%): 1.5%
15. Advanced Science (249 papers in training set, Top 12%): 1.5%
16. Frontiers in Public Health (140 papers in training set, Top 6%): 1.2%
17. Journal of Pathology Informatics (13 papers in training set, Top 0.2%): 1.2%
18. International Journal of Medical Informatics (25 papers in training set, Top 1%): 0.9%
19. BMC Medical Research Methodology (43 papers in training set, Top 1%): 0.9%
20. Artificial Intelligence in Medicine (15 papers in training set, Top 0.6%): 0.9%
21. BMJ Open Diabetes Research & Care (15 papers in training set, Top 1.0%): 0.8%
22. Cancer Medicine (24 papers in training set, Top 1%): 0.8%
23. BJPsych Open (25 papers in training set, Top 0.7%): 0.8%
24. BMJ Open (554 papers in training set, Top 13%): 0.7%
25. Biology (43 papers in training set, Top 3%): 0.7%
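
The 50% cutoff marked in the list can be reproduced by accumulating the listed probabilities in rank order. This is a minimal sketch, assuming the displayed percentages (which are rounded, so sums are approximate) are the only inputs; the function name `cutoff_rank` is hypothetical and is not part of the page.

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1 to 25, as listed above.
probs = [18.5, 17.4, 6.3, 4.8, 4.8, 3.9, 3.6, 3.6, 3.1, 2.4,
         2.3, 2.1, 1.7, 1.5, 1.5, 1.2, 1.2, 0.9, 0.9, 0.9,
         0.8, 0.8, 0.8, 0.7, 0.7]


def cutoff_rank(values, threshold=50.0):
    """Return (rank, cumulative total) at the first rank where the
    running sum reaches the threshold, or None if it never does."""
    for rank, total in enumerate(accumulate(values), start=1):
        if total >= threshold:
            return rank, total
    return None


rank, total = cutoff_rank(probs)
print(f"Top {rank} journals cover about {total:.1f}% of the probability mass")
```

The running total first passes 50% at rank 5 (about 51.8%), which agrees with the statement that the top 5 journals account for roughly half of the predicted probability mass.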