
Evaluating the AI Potential as a Safety Net for Diagnosis: A Novel Benchmark of Large Language Models in Correcting Diagnostic Errors

medRxiv preprint | 2026-02-24 | health systems and quality improvement | Title + abstract only

Background: Diagnostic errors are a leading cause of preventable patient harm, often occurring during early clinical encounters where diagnostic uncertainty is maximal. Large language models (LLMs) have shown potential in medical reasoning, yet their ability to function as a diagnostic safety net, specifically by identifying and correcting human diagnostic errors, remains systematically unquantified. We evaluated whether state-of-the-art LLMs can effectively challenge, rather than merely confirm, ...

Predicted journal destinations

| Rank | Journal | Training papers | Percentile | Probability |
|---|---|---|---|---|
| 1 | npj Digital Medicine | 85 | #1 | 25.6% |
| 2 | PLOS ONE | 1737 | Top 51% | 12.6% |
| 3 | PLOS Digital Health | 88 | Top 1.0% | 10.4% |
| 4 | Scientific Reports | 701 | Top 21% | 9.0% |
| 5 | BMC Medical Informatics and Decision Making | 36 | Top 4% | 4.4% |
| 6 | BMC Health Services Research | 43 | Top 3% | 1.8% |
| 7 | Journal of Medical Internet Research | 81 | Top 9% | 1.8% |
| 8 | Nature Communications | 483 | Top 40% | 1.8% |
| 9 | Royal Society Open Science | 49 | Top 2% | 1.8% |
| 10 | PLOS Computational Biology | 141 | Top 8% | 1.8% |
| 11 | BMJ Open | 553 | Top 54% | 1.5% |
| 12 | Journal of Biomedical Informatics | 37 | Top 5% | 1.5% |
| 13 | BMJ Open Quality | 15 | Top 1% | 1.4% |
| 14 | JAMA Network Open | 125 | Top 16% | 1.4% |
| 15 | JMIRx Med | 29 | Top 2% | 1.3% |
| 16 | Frontiers in Public Health | 135 | Top 26% | 1.1% |
| 17 | Journal of the American Medical Informatics Association | 53 | Top 6% | 1.1% |
| 18 | Communications Medicine | 63 | Top 4% | 1.1% |
| 19 | Nature Medicine | 88 | Top 11% | 1.1% |
| 20 | Computers in Biology and Medicine | 39 | Top 9% | 0.8% |
| 21 | Cureus | 64 | Top 19% | 0.8% |
| 22 | eLife | 262 | Top 50% | 0.8% |
| 23 | Nature | 58 | Top 13% | 0.5% |
| 24 | Sensors | 18 | Top 3% | 0.5% |
| 25 | JMIR Formative Research | 31 | Top 10% | 0.5% |
| 26 | PLOS Medicine | 95 | Top 29% | 0.5% |
| 27 | Proceedings of the National Academy of Sciences | 100 | Top 24% | 0.5% |
| 28 | Journal of General Internal Medicine | 19 | Top 2% | 0.5% |
| 29 | British Journal of General Practice | 22 | Top 2% | 0.5% |