Clinician-Centered Evaluation of Large Language Model-Generated Discharge Summaries for Longer Hospitalizations: Insights from Hospitalists and Primary Care Physicians

Osborne, T.; Mahmud, T.; Zheng, X.; Jampala, S.; Abbasi, S.; Hong, S.; Kranz, K.; Lee, S.; Ng, P.; Odekon, K.; Schachter, L.; Sexton, R.; Spinnato, T.; Tharakan, M.; Wu, Z.; Wang, F.; Wong, R.

2026-06-05 health systems and quality improvement
10.64898/2026.06.03.26354858 medRxiv
Although large language models (LLMs) have shown promise for discharge summary generation, their value may be greater in longer hospitalizations, where increasing documentation volume and complexity increase both clinician burden and the risk of communication failures during transitions of care. Prior evaluations of LLM-generated discharge summaries have largely involved shorter stays and have rarely examined receiving-clinician priorities or incidental finding reporting. We compared LLM-generated and human-authored discharge summaries for 60 Internal Medicine hospitalizations lasting 7 to 21 days, with paired assessment by hospitalists and primary care physicians (PCPs). Clinician reviewers preferred LLM-generated summaries for 95% of encounters and rated them higher for quality, readability, factuality and completeness. PCPs, the primary recipients responsible for post-discharge care, found that LLM-generated summaries were better for understanding and communicating hospital care to patients, and providing follow-up care. LLM-generated summaries had fewer annotated errors, primarily due to fewer omissions, without increased estimated harm potential or likelihood compared with human-authored summaries. Benefits of LLM-generated summaries were especially salient for PCPs, who identified more omissions with greater downstream likelihood of harm than hospitalists. This underscores the importance of designing transition documents around the needs of clinicians assuming care post-discharge. LLM identification of radiology incidental findings was generally accurate and appropriate, suggesting potential to improve follow-up of clinically relevant findings. These findings extend prior work by demonstrating clinical value of LLMs in summarizing longer, complex hospitalizations and highlighting the value of stakeholder-centered design in clinical AI systems. Together, they support supervised LLM-assisted discharge summarization as a tool to reduce cognitive burden, improve documentation quality, and enhance transition-of-care communication.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. npj Digital Medicine, 97 papers in training set, Top 0.3%, 17.5%
2. BMC Medical Informatics and Decision Making, 39 papers in training set, Top 0.2%, 10.1%
3. PLOS ONE, 4510 papers in training set, Top 25%, 6.8%
4. PLOS Digital Health, 91 papers in training set, Top 0.3%, 6.8%
5. Journal of Biomedical Informatics, 45 papers in training set, Top 0.2%, 6.4%
6. Medical Decision Making, 10 papers in training set, Top 0.1%, 3.6%

50% of probability mass above

7. Journal of the American Medical Informatics Association, 61 papers in training set, Top 0.8%, 3.6%
8. BMJ Health & Care Informatics, 13 papers in training set, Top 0.2%, 3.1%
9. JAMA Network Open, 127 papers in training set, Top 1%, 3.1%
10. JMIRx Med, 31 papers in training set, Top 0.2%, 3.0%
11. Scientific Reports, 3102 papers in training set, Top 42%, 2.9%
12. Healthcare, 16 papers in training set, Top 0.5%, 1.9%
13. Frontiers in Digital Health, 20 papers in training set, Top 0.7%, 1.7%
14. Journal of General Internal Medicine, 20 papers in training set, Top 0.5%, 1.7%
15. European Heart Journal - Digital Health, 15 papers in training set, Top 0.3%, 1.7%
16. Canadian Medical Association Journal, 15 papers in training set, Top 0.2%, 1.3%
17. Computer Methods and Programs in Biomedicine, 27 papers in training set, Top 0.6%, 1.2%
18. Journal of Clinical and Translational Science, 11 papers in training set, Top 0.3%, 1.2%
19. Journal of Personalized Medicine, 28 papers in training set, Top 0.9%, 0.9%
20. JAMIA Open, 37 papers in training set, Top 1%, 0.8%
21. Computers in Biology and Medicine, 120 papers in training set, Top 5%, 0.7%
22. Artificial Intelligence in Medicine, 15 papers in training set, Top 0.7%, 0.7%
23. iScience, 1063 papers in training set, Top 32%, 0.7%
24. Nature Medicine, 117 papers in training set, Top 5%, 0.7%
25. Journal of Medical Internet Research, 85 papers in training set, Top 5%, 0.7%
26. Proceedings of the National Academy of Sciences, 2130 papers in training set, Top 45%, 0.7%
27. CMAJ Open, 12 papers in training set, Top 0.3%, 0.7%
28. International Journal of Medical Informatics, 25 papers in training set, Top 2%, 0.7%
29. IEEE Journal of Biomedical and Health Informatics, 34 papers in training set, Top 2%, 0.6%
30. Frontiers in Bioengineering and Biotechnology, 88 papers in training set, Top 3%, 0.6%