DISCERN: A Clinical Impact-aware Framework for Radiology Report Comparison

Sharma, R.; Beeche, C.; Dong, J.; Zhuang, R.; Qu, H.; Zhang, R.; Gangaram, V.; Goswami, P.; Xin, J.; Ballard, J.; Goldberg, A.; Sagreiya, H.; Long, Q.; Chen, T.; Witschey, W. R.

2026-05-27 radiology and imaging
10.64898/2026.05.26.26353612 medRxiv
The surge in medical imaging has spurred the development of vision-language models (VLMs) to alleviate radiologist workloads. However, clinical deployment is hindered by the lack of meaningful evaluation frameworks. Current metrics, ranging from semantic similarity to large language model (LLM)-based judges, often fail to distinguish between clinically trivial and critical discrepancies, and so poorly reflect real-world clinical judgment. To address this, we introduce DISCERN (Discordance and Significance-aware Entity-level Radiology Report Comparison). DISCERN is a significance-aware framework that weighs report errors based on their potential impact on patient care. Our results demonstrate that DISCERN powered by closed-source LLMs aligns more closely with expert radiologist assessments than traditional metrics or current LLM evaluators, providing a more interpretable and clinically relevant benchmark. By modeling radiologist prioritization and entity-level feedback, DISCERN facilitates targeted model refinement and ensures the safer integration of generative AI into clinical workflows.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | npj Digital Medicine | 97 | Top 0.2% | 18.9%
2 | Scientific Reports | 3102 | Top 8% | 9.3%
3 | Nature Communications | 4913 | Top 26% | 6.9%
4 | Nature Machine Intelligence | 61 | Top 0.7% | 4.4%
5 | European Radiology | 14 | Top 0.2% | 4.0%
6 | Nature Medicine | 117 | Top 0.7% | 4.0%
7 | PLOS ONE | 4510 | Top 35% | 4.0%
50% of probability mass above
8 | JCO Clinical Cancer Informatics | 18 | Top 0.2% | 3.6%
9 | Proceedings of the National Academy of Sciences | 2130 | Top 23% | 3.1%
10 | GigaScience | 172 | Top 0.6% | 3.1%
11 | Medical Physics | 14 | Top 0.2% | 2.9%
12 | PLOS Digital Health | 91 | Top 1% | 1.9%
13 | iScience | 1063 | Top 12% | 1.8%
14 | npj Precision Oncology | 48 | Top 0.5% | 1.7%
15 | The Lancet Digital Health | 25 | Top 0.4% | 1.7%
16 | IEEE Transactions on Medical Imaging | 18 | Top 0.3% | 1.7%
17 | PLOS Computational Biology | 1633 | Top 16% | 1.7%
18 | Computers in Biology and Medicine | 120 | Top 3% | 1.3%
19 | Patterns | 70 | Top 1% | 1.3%
20 | Expert Systems with Applications | 11 | Top 0.2% | 1.2%
21 | eBioMedicine | 130 | Top 2% | 1.2%
22 | eLife | 5422 | Top 49% | 1.2%
23 | Journal of Medical Imaging | 11 | Top 0.2% | 1.0%
24 | Neurocomputing | 13 | Top 0.4% | 1.0%
25 | IEEE Access | 31 | Top 0.8% | 0.8%
26 | Science Advances | 1098 | Top 28% | 0.8%
27 | Diagnostics | 48 | Top 2% | 0.8%
28 | Science Translational Medicine | 111 | Top 6% | 0.8%
29 | Imaging Neuroscience | 242 | Top 3% | 0.8%
30 | Nature Computational Science | 50 | Top 2% | 0.5%
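
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, and the values are the first ten percentages copied from the ranking above, not output from the page's underlying model.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-10, copied from the ranking above.
probs = [18.9, 9.3, 6.9, 4.4, 4.0, 4.0, 4.0, 3.6, 3.1, 3.1]


def journals_to_reach(probs, threshold=50.0):
    """Return (count, cumulative %) for the fewest top-ranked journals
    whose summed probability reaches the threshold.

    Hypothetical helper for illustration only.
    """
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count, total
    raise ValueError("threshold not reached by the listed probabilities")


count, total = journals_to_reach(probs)
print(count, round(total, 1))
```

With these values, the first six journals sum to about 47.5% and the seventh brings the total to about 51.5%, which is consistent with the "50% of probability mass above" marker placed after rank 7.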