Development and Temporal Evaluation of Multimodal Machine Learning Models to Predict High Inpatient Opioid Exposure

Kale, S.; Singh, D.; Truumees, E.; Geck, M.; Stokes, J.

2026-04-02 health informatics
10.64898/2026.03.31.26349842 medRxiv

High inpatient opioid exposure is associated with increased risk of persistent opioid use. Early identification of high-risk patients may improve opioid stewardship. We developed machine learning models to predict high opioid exposure during hospitalization using electronic health record data from MIMIC-IV. We conducted a retrospective study of 223,452 unique first hospital admissions in MIMIC-IV. The outcome was high opioid exposure, defined as the top decile among opioid-exposed admissions (MME/day ≥ 225), representing 2.65% of all admissions. Structured early-admission features included demographics, admission characteristics, laboratory utilization and abnormality summaries, and 24-hour procedural indicators. Discharge-note data were incorporated using ClinicalBERT embeddings and interpretable bigram features. Models were trained using an 80/10/10 split and evaluated with temporal validation on the most recent 10% of admissions. Performance was assessed using ROC-AUC and PR-AUC with 95% confidence intervals. Among structured-only models, XGBoost achieved the best test performance (ROC-AUC 0.932 [0.924-0.940]; PR-AUC 0.223 [0.193-0.262]). The combined structured and notes model improved precision-recall performance (ROC-AUC 0.932 [0.920-0.943]; PR-AUC 0.276 [0.229-0.331]). Temporal evaluation showed similar discrimination (ROC-AUC 0.929; PR-AUC 0.223). High-risk bigrams included procedural terms such as "external fixation" and "cervical discectomy." Integration of structured and text-derived features improved risk stratification compared to structured data alone. Interpretable bigram signals reflected procedural complexity and orthopedic pathology, reinforcing the clinical plausibility of model predictions. Multimodal EHR-based models accurately predict high inpatient opioid exposure and may support targeted opioid stewardship during hospitalization.
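
The outcome definition and the temporal holdout described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `admit_time` field, the toy values, and the choice of `statistics.quantiles` (default "exclusive" method) for the 90th percentile are all assumptions made for illustration. The abstract does not state which quantile convention or ordering variable was used.

```python
import statistics


def label_high_exposure(mme_per_day):
    """Return (labels, threshold).

    A label is 1 when MME/day falls in the top decile of opioid-exposed
    admissions (here assumed to mean MME/day > 0), else 0. Unexposed
    admissions are excluded from the quantile but still receive a label.
    """
    exposed = [v for v in mme_per_day if v > 0]
    # quantiles(..., n=10) returns nine cut points; the last is the 90th percentile.
    threshold = statistics.quantiles(exposed, n=10)[-1]
    labels = [1 if v >= threshold else 0 for v in mme_per_day]
    return labels, threshold


def temporal_holdout(admissions, holdout_fraction=0.10):
    """Split admissions into (earlier, most_recent) by a hypothetical admit_time key."""
    ordered = sorted(admissions, key=lambda a: a["admit_time"])
    n_holdout = round(holdout_fraction * len(ordered))
    cut = len(ordered) - n_holdout
    return ordered[:cut], ordered[cut:]


# Toy data only: two unexposed admissions followed by ten exposed ones.
mme = [0, 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
labels, threshold = label_high_exposure(mme)

admissions = [
    {"id": i, "admit_time": year, "high_exposure": y}
    for i, (year, y) in enumerate(zip(range(2010, 2022), labels))
]
earlier, most_recent = temporal_holdout(admissions)
```

Because the decile is taken among exposed admissions only, the share of all admissions labeled positive is smaller than 10%, which is consistent with the 2.65% prevalence reported in the abstract.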

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Probability
1 | npj Digital Medicine | 97 | Top 0.1% | 25.6%
2 | Journal of the American Medical Informatics Association | 61 | Top 0.1% | 17.3%
3 | International Journal of Medical Informatics | 25 | Top 0.1% | 9.1%
(50% of probability mass above)
4 | JMIR Medical Informatics | 17 | Top 0.1% | 8.3%
5 | JAMIA Open | 37 | Top 0.3% | 4.8%
6 | Scientific Reports | 3102 | Top 25% | 4.8%
7 | BMC Medical Informatics and Decision Making | 39 | Top 1% | 2.6%
8 | Journal of Biomedical Informatics | 45 | Top 0.6% | 2.3%
9 | Nature Communications | 4913 | Top 46% | 2.3%
10 | The Lancet Digital Health | 25 | Top 0.4% | 1.7%
11 | PLOS ONE | 4510 | Top 59% | 1.3%
12 | Journal of Medical Internet Research | 85 | Top 3% | 1.2%
13 | Science Advances | 1098 | Top 25% | 0.9%
14 | BMC Medicine | 163 | Top 6% | 0.9%
15 | JMIR Public Health and Surveillance | 45 | Top 3% | 0.9%
16 | BMJ Open | 554 | Top 12% | 0.8%
17 | British Journal of Anaesthesia | 14 | Top 0.7% | 0.8%
18 | Frontiers in Medicine | 113 | Top 7% | 0.7%
19 | Communications Medicine | 85 | Top 1% | 0.7%
20 | Annals of Neurology | 57 | Top 2% | 0.7%
21 | Heliyon | 146 | Top 8% | 0.6%
22 | Journal of General Internal Medicine | 20 | Top 1% | 0.6%
23 | eClinicalMedicine | 55 | Top 2% | 0.6%
24 | PLOS Digital Health | 91 | Top 3% | 0.6%
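
The statement that the top 3 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum. The sketch below is illustrative only: `journals_for_mass` is a hypothetical helper, and the input is the five largest probabilities listed for this preprint, entered as percentages.

```python
def journals_for_mass(probabilities, target=50.0):
    """Return (k, cumulative): the smallest number of top-ranked journals
    whose summed probability (in percent) reaches the target."""
    ordered = sorted(probabilities, reverse=True)
    cumulative = 0.0
    for k, p in enumerate(ordered, start=1):
        cumulative += p
        if cumulative >= target:
            return k, cumulative
    # Target not reached with the probabilities supplied.
    return None, cumulative


# Five largest listed probabilities, in percent.
top_probs = [25.6, 17.3, 9.1, 8.3, 4.8]
k, mass = journals_for_mass(top_probs)
print(k, round(mass, 1))  # 3 journals, about 52.0% cumulative
```

The first two journals sum to 42.9%, so the third is the one that crosses the 50% line.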