Development and Validation of a Two-Stage NLP-LLM System for Automated Extraction of Deprescribing Recommendations from Discharge Summaries

Fujita, K.; Matheson, M.; Valecha, B.; Hilmer, S. N.

2026-04-30 geriatric medicine
10.64898/2026.04.29.26352010 medRxiv

Introduction: Polypharmacy in older adults is associated with increased risks of adverse drug events and functional decline. Discharge summaries often contain deprescribing recommendations, but these are frequently overlooked due to documentation complexity.

Objective: To develop and validate a two-stage hybrid system combining rule-based natural language processing (NLP) and a large language model (LLM) for automated extraction of deprescribing recommendations from discharge summaries.

Methods: This retrospective cohort study included 850 discharge summaries from patients aged ≥65 years with hospitalisation ≥48 hours across six public hospitals in New South Wales, Australia. Model 1 (rule-based NLP) extracted discharge medications and candidate sentences containing pre-defined deprescribing keywords. Model 2 (open-source LLM) classified candidate sentences into five categories. Data were split into training (80%) and test (20%) sets. Gold standard classifications were established by independent reviews, followed by adjudication of discrepancies.

Results: Model 1 extracted 9,631 discharge medications (median 11 per patient) and 1,061 candidate sentences from 850 patients (median age 82.8 years). Model 2 achieved an F1 score of 0.91 and accuracy of 0.90 on the test set. Inter-rater reliability showed substantial agreement (Cohen's kappa = 0.70). The most frequently identified medications recommended for deprescribing were antibiotics and opioids. The most common misclassification was incorrectly identifying actions completed during hospitalisation as post-discharge recommendations. The combined processing time averaged 12.6 seconds per discharge summary.

Conclusions: A two-stage hybrid approach combining rule-based NLP and an open-source LLM can accurately extract deprescribing recommendations from discharge summaries, enabling cost-efficient, privacy-compliant local deployment.
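The inter-rater agreement statistic reported in the abstract, Cohen's kappa, compares observed agreement between two reviewers with the agreement expected by chance. The following is a minimal sketch of that calculation on invented toy labels; the label names and the example ratings are assumptions for illustration only and are not taken from the study, which does not list its five categories here.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Toy ratings (hypothetical labels, not study data).
reviewer_1 = ["stop", "stop", "continue", "continue"]
reviewer_2 = ["stop", "continue", "continue", "continue"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 2))
```

In this toy case the reviewers agree on 3 of 4 items (0.75) against a chance agreement of 0.5, giving a kappa of 0.5.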
Key Points
- A two-stage system combining rule-based NLP and an open-source LLM extracted and classified deprescribing recommendations from 850 discharge summaries, achieving an F1 score of 0.91 and accuracy of 0.90.
- The use of an open-source LLM (Llama 3.3) enables cost-efficient, privacy-compliant local deployment in healthcare institutions.
- Antibiotics and opioids were the most frequently identified medications recommended for deprescribing in discharge summaries.
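The two-stage design described here can be sketched roughly as follows: a keyword filter selects candidate sentences, and a separate classifier labels each candidate. This is an illustrative sketch only, not the authors' implementation. The keyword list, the sentence splitter, the label names, and the `toy_classifier` stand-in for the LLM call are all assumptions; the study's actual keywords, prompt, and five categories are not given in this listing.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical keyword stems; the study's pre-defined list is not shown here.
DEPRESCRIBING_KEYWORDS = ["cease", "stop", "discontinue", "wean", "taper", "withhold"]

_KEYWORD_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, DEPRESCRIBING_KEYWORDS)) + r")\w*",
    re.IGNORECASE,
)

def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def candidate_sentences(summary: str) -> List[str]:
    """Stage 1: keep sentences containing a deprescribing keyword."""
    return [s for s in split_sentences(summary) if _KEYWORD_RE.search(s)]

def classify_candidates(
    sentences: List[str], classifier: Callable[[str], str]
) -> List[Tuple[str, str]]:
    """Stage 2: label each candidate with any sentence-to-label callable."""
    return [(s, classifier(s)) for s in sentences]

def toy_classifier(sentence: str) -> str:
    """Stand-in for the LLM; a real system would call a locally hosted model."""
    if "during admission" in sentence.lower():
        return "completed_in_hospital"          # hypothetical label
    return "post_discharge_recommendation"      # hypothetical label

# Invented example text, not patient data.
summary = (
    "Patient admitted with pneumonia. Oxycodone was ceased during admission. "
    "GP to wean pantoprazole over 4 weeks. Continue metoprolol."
)
candidates = candidate_sentences(summary)
results = classify_candidates(candidates, toy_classifier)
for sentence, label in results:
    print(label, "|", sentence)
```

Passing the classifier in as a callable keeps Stage 1 testable on its own and lets the LLM backend be swapped without touching the filter. The toy rule also mirrors the distinction the abstract flags as the most common error: actions completed in hospital versus post-discharge recommendations.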

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1. PLOS ONE: 4510 papers in training set; Top 18%; 10.1%
2. Biology Methods and Protocols: 53 papers in training set; Top 0.1%; 8.4%
3. Journal of the American Medical Informatics Association: 61 papers in training set; Top 0.4%; 8.2%
4. BMC Medical Informatics and Decision Making: 39 papers in training set; Top 0.4%; 6.4%
5. JAMIA Open: 37 papers in training set; Top 0.3%; 4.9%
6. Cureus: 67 papers in training set; Top 1.0%; 4.0%
7. Computer Methods and Programs in Biomedicine: 27 papers in training set; Top 0.1%; 3.6%
8. Journal of Medical Internet Research: 85 papers in training set; Top 1%; 3.6%
9. Pilot and Feasibility Studies: 12 papers in training set; Top 0.1%; 3.1%
50% of probability mass above
10. BMC Geriatrics: 15 papers in training set; Top 0.1%; 3.1%
11. BMC Medicine: 163 papers in training set; Top 2%; 2.7%
12. Journal of Biomedical Informatics: 45 papers in training set; Top 0.6%; 2.7%
13. International Journal of Medical Informatics: 25 papers in training set; Top 0.6%; 2.1%
14. Frontiers in Public Health: 140 papers in training set; Top 4%; 1.9%
15. BMJ Open: 554 papers in training set; Top 8%; 1.9%
16. BMC Neurology: 12 papers in training set; Top 0.4%; 1.7%
17. PLOS Digital Health: 91 papers in training set; Top 1%; 1.7%
18. Scientific Reports: 3102 papers in training set; Top 57%; 1.7%
19. npj Digital Medicine: 97 papers in training set; Top 2%; 1.7%
20. Frontiers in Digital Health: 20 papers in training set; Top 0.8%; 1.5%
21. BJGP Open: 12 papers in training set; Top 0.4%; 1.3%
22. BMC Medical Research Methodology: 43 papers in training set; Top 0.8%; 1.3%
23. Journal of the American Medical Directors Association: 13 papers in training set; Top 0.2%; 1.3%
24. Journal of the American Geriatrics Society: 12 papers in training set; Top 0.1%; 1.3%
25. JMIR Medical Informatics: 17 papers in training set; Top 1%; 1.2%
26. Frontiers in Physiology: 93 papers in training set; Top 4%; 1.0%
27. Age and Ageing: 27 papers in training set; Top 0.4%; 1.0%
28. DIGITAL HEALTH: 12 papers in training set; Top 0.6%; 0.9%
29. Archives of Public Health: 12 papers in training set; Top 0.7%; 0.7%
30. Heliyon: 146 papers in training set; Top 6%; 0.7%
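
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked with a running sum over the ranked list. The sketch below assumes that the final percentage on each row is that journal's predicted probability, which the page does not label explicitly. Only the first ten listed values are used.

```python
from itertools import accumulate

# Final percentage for ranks 1-10 as listed above (assumed to be the
# predicted probability for each journal).
probabilities = [10.1, 8.4, 8.2, 6.4, 4.9, 4.0, 3.6, 3.6, 3.1, 3.1]

def ranks_to_reach(probs, threshold):
    """Return the smallest rank whose cumulative sum reaches the threshold."""
    for rank, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return rank
    return None  # threshold never reached within the given list

rank_for_half = ranks_to_reach(probabilities, 50.0)
print(rank_for_half)
```

The cumulative total is about 49.2% after rank 8 and about 52.3% after rank 9, consistent with the 50% marker placed after the ninth journal.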