
Sentiment in Clinical Notes: A Predictor for Length of Stay?

Boyne, A.; Feygin, M.; Sholeen, J.; Zimolzak, A.

2026-03-18 health informatics
10.64898/2026.03.16.26348553 medRxiv

Background: Length of stay (LOS) is a critical metric for hospital operational efficiency. While structured clinical data is widely used to predict LOS, unstructured admission notes contain latent prognostic information regarding diagnostic uncertainty and disease complexity. This study evaluates the efficacy of extracting sentiment and direct LOS estimates from admission notes to predict patient hospitalization duration.

Methods: We conducted a retrospective study of 4,503 adult patients admitted with community-acquired pneumonia between 2013 and 2023. Admission history and physical notes were preprocessed and filtered to extract physician-generated narratives. We evaluated four natural language processing models (VADER, TextBlob, Longformer, and an open-source large language model, GPT-oss-20B) to generate zero-shot sentiment scores. Additionally, GPT-oss-20B was prompted to directly estimate LOS. Model outputs were correlated with actual LOS using linear regression and Pearson correlation coefficients.

Results: Sentiment models demonstrated statistically significant, albeit weak, correlations with actual LOS. Longformer achieved the highest variance explained among sentiment classifiers (R² = 0.019). Direct LOS estimation by the LLM outperformed sentiment-based approaches, demonstrating the strongest correlation with actual hospital duration (r = -0.218, p < 0.001). Model agreement was generally poor (ICC = 0.059), and computational time varied drastically, from 2.6 seconds per 100 notes (TextBlob) to over 370 seconds (GPT-oss-20B).

Conclusion: Zero-shot sentiment analysis of clinical notes yields a small but measurable correlation with LOS, limited primarily by the objective, non-evaluative nature of clinical documentation. Direct LLM estimation of clinical outcomes outperforms emotional sentiment extraction. Future predictive systems should integrate computationally efficient NLP models capable of capturing latent clinical complexity alongside established structured data variables.
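
The abstract reports both a Pearson r and an R² from linear regression. For a regression with a single predictor, R² equals r squared, so the two statistics can be put on one scale. The sketch below illustrates that relationship in plain Python. The `sentiment` and `los_days` values are synthetic placeholders, not study data, and the helper names `pearson_r` and `linear_fit` are hypothetical; nothing here is the authors' actual analysis code.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic illustration only: one sentiment score and one LOS (days) per note.
sentiment = [-0.6, -0.2, 0.0, 0.1, 0.4, 0.7]
los_days = [9, 4, 6, 3, 5, 2]

r = pearson_r(sentiment, los_days)
slope, intercept = linear_fit(sentiment, los_days)
r_squared = r ** 2  # variance explained by a one-predictor linear model

print(f"r = {r:.3f}, R^2 = {r_squared:.3f}, slope = {slope:.2f} days per unit")
```

Applying the same identity to the figures in the abstract, r = -0.218 corresponds to an R² of roughly 0.048, and R² = 0.019 corresponds to an |r| of roughly 0.14.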

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. International Journal of Medical Informatics (25 papers in training set; Top 0.1%; 16.9%)
2. Journal of Medical Internet Research (85 papers in training set; Top 0.3%; 13.8%)
3. JMIR Medical Informatics (17 papers in training set; Top 0.1%; 13.8%)
4. BMC Medical Informatics and Decision Making (39 papers in training set; Top 0.3%; 8.1%)
50% of probability mass above
5. Journal of the American Medical Informatics Association (61 papers in training set; Top 0.4%; 8.1%)
6. Journal of Biomedical Informatics (45 papers in training set; Top 0.3%; 4.7%)
7. npj Digital Medicine (97 papers in training set; Top 1%; 3.8%)
8. JAMIA Open (37 papers in training set; Top 0.5%; 3.5%)
9. Scientific Reports (3102 papers in training set; Top 39%; 3.5%)
10. Frontiers in Digital Health (20 papers in training set; Top 0.5%; 2.0%)
11. PLOS ONE (4510 papers in training set; Top 56%; 1.6%)
12. BMC Medical Research Methodology (43 papers in training set; Top 0.8%; 1.3%)
13. Healthcare (16 papers in training set; Top 1.0%; 1.3%)
14. Artificial Intelligence in Medicine (15 papers in training set; Top 0.5%; 1.2%)
15. BMJ Health & Care Informatics (13 papers in training set; Top 0.7%; 0.9%)
16. Biology Methods and Protocols (53 papers in training set; Top 2%; 0.8%)
17. Computers in Biology and Medicine (120 papers in training set; Top 5%; 0.7%)
18. Journal of the American Heart Association (119 papers in training set; Top 4%; 0.7%)
19. The Lancet Digital Health (25 papers in training set; Top 1%; 0.7%)
20. Heliyon (146 papers in training set; Top 7%; 0.7%)
21. IEEE Journal of Biomedical and Health Informatics (34 papers in training set; Top 2%; 0.7%)
22. PLOS Digital Health (91 papers in training set; Top 3%; 0.6%)
23. Frontiers in Artificial Intelligence (18 papers in training set; Top 1.0%; 0.6%)
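
The page does not say how the 50% cutoff is computed. One plausible reading is that it marks the smallest number of top-ranked journals whose predicted probabilities sum to at least 50%. The sketch below checks that reading against the percentages listed above; the function name `smallest_top_k` is hypothetical, and the cutoff rule itself is an assumption.

```python
def smallest_top_k(probabilities, threshold):
    """Return (k, mass): the fewest top entries whose sum reaches threshold."""
    total = 0.0
    for k, p in enumerate(sorted(probabilities, reverse=True), start=1):
        total += p
        if total >= threshold:
            return k, total
    raise ValueError("threshold is never reached")


# Predicted probabilities (percent) copied from the ranked list above.
probs = [16.9, 13.8, 13.8, 8.1, 8.1, 4.7, 3.8, 3.5, 3.5, 2.0, 1.6, 1.3,
         1.3, 1.2, 0.9, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7, 0.6, 0.6]

k, mass = smallest_top_k(probs, 50.0)
print(f"top {k} journals cover {mass:.1f}% of the predicted probability mass")
```

Under this assumption the top 4 journals cover about 52.6%, while the top 3 cover only 44.5%, which is consistent with where the page places its 50% marker.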