Position Paper On Diagnostic Uncertainty Estimation from Large Language Models: Next-Word Probability Is Not Pre-test Probability

Gao, Y.; Myers, S.; Chen, S.; Dligach, D.; Miller, T.; Bitterman, D. S.; Chen, G.; Mayampurath, A.; Churpek, M. M.; Afshar, M.

medRxiv (health informatics), 2024-11-07
DOI: 10.1101/2024.11.06.24316848

Large language models (LLMs) are being explored for diagnostic decision support, yet their ability to estimate pre-test probabilities, vital for clinical decision-making, remains limited. This study evaluates two LLMs, Mistral-7B and Llama3-70B, using structured electronic health record data on three diagnosis tasks. We examined three current methods of extracting LLM probability estimations and revealed their limitations. We aim to highlight the need for improved techniques in LLM confidence estimation.

Matching journals

The top 5 journals together account for at least 50% of the predicted probability mass (54.4% combined).
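
As a quick check of this cutoff, the minimal Python sketch below accumulates the listed probabilities in rank order until the running total reaches 50%. The function name `top_k_for_mass` and its `threshold` parameter are hypothetical, introduced only for illustration; the percentages are the first ten values from the ranked list of matching journals.

```python
from itertools import accumulate


def top_k_for_mass(probs_percent, threshold=50.0):
    """Return (k, cumulative mass) for the smallest top-k reaching the threshold.

    Hypothetical helper for illustration; probabilities are given in percent.
    """
    # Rank from most to least probable, then take a running total.
    ordered = sorted(probs_percent, reverse=True)
    for k, total in enumerate(accumulate(ordered), start=1):
        if total >= threshold:
            return k, total
    return None  # threshold never reached with the values supplied


# Predicted probabilities (percent) for ranks 1-10, copied from the list.
listed = [17.4, 13.4, 11.7, 6.0, 5.9, 4.0, 3.4, 3.4, 3.4, 3.4]

k, mass = top_k_for_mass(listed)
print(k, round(mass, 1))
```

With these values the running total is 48.5% after four journals and 54.4% after five, which is why the cutoff falls at rank 5.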

Rank. Journal: papers in training set, percentile, predicted probability

1. Artificial Intelligence in Medicine: 15 papers in training set, Top 0.1%, 17.4%
2. Journal of Biomedical Informatics: 45 papers in training set, Top 0.1%, 13.4%
3. BMC Medical Informatics and Decision Making: 39 papers in training set, Top 0.2%, 11.7%
4. Journal of the American Medical Informatics Association: 61 papers in training set, Top 0.5%, 6.0%
5. Computers in Biology and Medicine: 120 papers in training set, Top 0.4%, 5.9%

50% of probability mass above this line

6. International Journal of Medical Informatics: 25 papers in training set, Top 0.3%, 4.0%
7. JMIR Medical Informatics: 17 papers in training set, Top 0.4%, 3.4%
8. JAMIA Open: 37 papers in training set, Top 0.5%, 3.4%
9. Scientific Reports: 3102 papers in training set, Top 40%, 3.4%
10. npj Digital Medicine: 97 papers in training set, Top 1%, 3.4%
11. Journal of Medical Internet Research: 85 papers in training set, Top 2%, 2.4%
12. PLOS ONE: 4510 papers in training set, Top 55%, 1.7%
13. JCO Clinical Cancer Informatics: 18 papers in training set, Top 0.5%, 1.6%
14. Journal of Personalized Medicine: 28 papers in training set, Top 0.4%, 1.6%
15. Frontiers in Artificial Intelligence: 18 papers in training set, Top 0.3%, 1.6%
16. Informatics in Medicine Unlocked: 21 papers in training set, Top 0.5%, 1.6%
17. PLOS Digital Health: 91 papers in training set, Top 2%, 1.4%
18. BMC Bioinformatics: 383 papers in training set, Top 5%, 1.4%
19. BMC Medical Research Methodology: 43 papers in training set, Top 0.9%, 1.1%
20. Computer Methods and Programs in Biomedicine: 27 papers in training set, Top 0.6%, 1.1%
21. IEEE Journal of Biomedical and Health Informatics: 34 papers in training set, Top 2%, 1.0%
22. Frontiers in Digital Health: 20 papers in training set, Top 1%, 0.9%
23. Cureus: 67 papers in training set, Top 5%, 0.7%
24. Database: 51 papers in training set, Top 1%, 0.7%
25. Biology Methods and Protocols: 53 papers in training set, Top 3%, 0.7%