Artificial Intelligence in Diabetes Care: Evaluating GPT-4's Competency in Reviewing Diabetic Patient Management Plan in Comparison to Expert Review
Mondal, A.; Naskar, A.
Background: The escalating global burden of diabetes necessitates innovative management strategies. Artificial intelligence, particularly large language models such as GPT-4, presents a promising avenue for improving guideline adherence in diabetes care. Such technologies could improve patient management by offering personalized, evidence-based treatment recommendations.
Methods: A comparative, blinded design was employed, involving 50 hypothetical diabetes mellitus case summaries that emphasized varied aspects of diabetes management. GPT-4 evaluated each summary for adherence to the ADA guidelines, classifying it as compliant or non-compliant. A medical expert, blinded to GPT-4's assessments, independently reviewed the same summaries. Concordance between GPT-4's and the expert's evaluations was analyzed statistically, including calculation of Cohen's kappa for agreement.
Results: GPT-4 labelled 30 summaries as compliant and 20 as non-compliant, while the expert identified 28 as compliant and 22 as non-compliant. The two agreed on 46 of the 50 cases, yielding a Cohen's kappa of 0.84, which indicates near-perfect agreement. Taking the expert review as the reference, GPT-4 achieved an accuracy of 92%, a sensitivity of 86.4%, and a specificity of 96.4%. The four discrepant cases highlighted challenges in the AI's understanding of complex clinical judgments related to medication adjustments and treatment modifications.
Conclusion: GPT-4 shows promising potential to support healthcare professionals in reviewing diabetes management plans for guideline adherence. Despite high concordance with expert assessments, the instances of disagreement underscore the need for AI refinement in complex clinical scenarios. Future research should aim to enhance AI's clinical reasoning capabilities and explore its integration with other technologies for improved healthcare delivery.
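The reported figures can be cross-checked with a short calculation. The abstract does not give the underlying 2x2 table, so the cell counts below are a reconstruction, not the authors' data: they are the only counts consistent with the stated marginals (GPT-4: 30 compliant, 20 non-compliant; expert: 28 compliant, 22 non-compliant) and 46 agreements. The sketch also assumes that "non-compliant" is the positive class and that the expert review is the reference standard, which is the reading under which the reported sensitivity and specificity are reproduced. The function name `agreement_metrics` is illustrative.

```python
def agreement_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity, specificity and Cohen's kappa from a 2x2 table.

    Assumed convention: positive = non-compliant, reference = expert review.
    tp: both raters say non-compliant
    fp: GPT-4 says non-compliant, expert says compliant
    fn: GPT-4 says compliant, expert says non-compliant
    tn: both raters say compliant
    """
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n                 # observed agreement p_o
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Agreement expected by chance, from each rater's marginal proportions.
    p_both_pos = ((tp + fp) / n) * ((tp + fn) / n)
    p_both_neg = ((fn + tn) / n) * ((fp + tn) / n)
    p_e = p_both_pos + p_both_neg

    kappa = (accuracy - p_e) / (1 - p_e)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Reconstructed counts (assumption): 19 + 27 = 46 agreements, 1 + 3 = 4 disagreements.
metrics = agreement_metrics(tp=19, fp=1, fn=3, tn=27)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions the values round to the reported 92% accuracy, 86.4% sensitivity, 96.4% specificity, and a kappa of 0.84.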