Top 0.2%, 18.9%
Top 11%, 12.6%
Top 59%, 9.6%
Top 15%, 6.9%
Top 0.4%, 4.4%
Top 1%, 4.1%
Top 6%, 4.1%
#1, 4.1%
Top 4%, 1.7%
Top 3%, 1.7%
Top 56%, 1.4%
Top 10%, 1.4%
Top 18%, 1.3%
Top 5%, 1.3%
Top 9%, 1.0%
Top 52%, 0.8%
Top 13%, 0.8%
Top 20%, 0.8%
Top 6%, 0.8%
Top 2%, 0.8%
Top 8%, 0.8%
Top 12%, 0.8%
Top 12%, 0.5%
Top 7%, 0.5%
Pathology's Last Exam: Stress-Testing Diagnostic Reasoning and Safety in Large Language Models
2025-12-15
pathology
Title + abstract only
View on medRxiv
Large language models (LLMs) are evolving into diagnostic co-pilots, yet current benchmarks fail to test the integrated, stepwise reasoning required in diagnostic pathology. Here, we present Pathology's Last Exam (PLE), a curated, highly detailed, text-based benchmark of 100 complex cases spanning organ systems, enriched for rare/challenging entities, plus 20 adversarial cases designed to stress-test model safety. Each case provides structured blocks (Primary, Clinical, Histopathology, IHC/Specia...
Predicted journal destinations
1. npj Digital Medicine (85 training papers)
2. Scientific Reports (701 training papers)
3. PLOS ONE (1737 training papers)
4. Nature Communications (483 training papers)
5. Nature Medicine (88 training papers)
6. Computers in Biology and Medicine (39 training papers)
7. PLOS Digital Health (88 training papers)
8. The Lancet Digital Health (25 training papers)
9. Journal of Clinical Microbiology (77 training papers)
10. eBioMedicine (82 training papers)
11. BMJ Open (553 training papers)
12. Cureus (64 training papers)
13. JAMA Network Open (125 training papers)
14. JAMIA Open (35 training papers)
15. Brain (69 training papers)
16. eLife (262 training papers)
17. PLOS Computational Biology (141 training papers)
18. Journal of Medical Internet Research (81 training papers)
19. BMC Cancer (21 training papers)
20. The Journal of Molecular Diagnostics (24 training papers)
21. Cancers (57 training papers)
22. iScience (74 training papers)
23. Human Brain Mapping (53 training papers)
24. International Journal of Medical Informatics (25 training papers)