
Machine Learning Prediction of Pharmacogenetic Test Uptake Among Opioid-Prescribed Patients Using Electronic Health Records: A Retrospective Cohort Study

Yaseliani, M.; Hong, J.-W.; Bian, J.; Cavallari, L.; Duarte, J.; Nelson, D.; Lo-Ciganic, W.-H.; Nguyen, K. A.; Hasan, M. M.

2025-09-28 health informatics
10.1101/2025.09.26.25336591 medRxiv

Background: Opioids are a widely prescribed class of medication for pain management. However, their efficacy and adverse effects vary among patients because of a complex interplay between biological and clinical factors. Pharmacogenetic (PGx) testing can be used to individualize opioid therapy according to patients' genetic profiles, improving pain relief and reducing the risk of adverse effects. Despite its potential, PGx uptake (the utilization of PGx testing) remains low due to a range of barriers at the patient, health care provider, infrastructure, and financial levels. Since testing typically involves a shared decision between the provider and patient, predicting the likelihood that a patient will undergo PGx testing and understanding the factors influencing that decision can help optimize resource use and improve outcomes in pain management.

Objective: To develop machine learning (ML) models that predict patients' likelihood of PGx uptake based on their demographics, clinical variables, medication use, and social determinants of health (SDoH).

Methods: We used electronic health record (EHR) data from a single-center healthcare system to identify patients prescribed opioids. We extracted patients' demographics, clinical variables, medication use, and SDoH, and developed and validated ML models, including neural networks (NN), logistic regression (LR), random forests (RF), gradient boosting (XGB), naive Bayes (NB), and support vector machines (SVM), for PGx uptake prediction based on procedure codes. We performed 5-fold cross-validation (CV) and created an ensemble probability-based classifier from the best-performing ML models. Model performance was evaluated using several performance metrics, uptake stratification analysis, and feature importance analysis.

Results: The ensemble model combining the XGB and SVM-RBF classifiers had the highest C-statistic (79.61%), followed by XGB (78.94%) and NN (78.05%). While XGB was the best-performing individual model, the ensemble model achieved an accuracy of 67.38%, recall of 76.50%, specificity of 67.25%, and negative predictive value of 99.49%. The uptake stratification analysis showed that the ensemble model can effectively distinguish patients across uptake probability deciles: those in the higher strata were more likely to undergo PGx testing in the real world (6.59% in the highest decile compared with 0.12% in the lowest). Furthermore, SHAP value analysis of the XGB model indicated that age, hypertension, and household income were the most influential factors for PGx uptake prediction.

Conclusions: The proposed ensemble model demonstrated high performance in PGx uptake prediction among patients using opioids for pain. This model can be used as a decision support tool, assisting clinicians in identifying patients' likelihood of PGx uptake and guiding appropriate decision-making.
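The reported ensemble metrics are linked through the underlying uptake prevalence, which the abstract does not state. The sketch below back-calculates an implied prevalence from the reported accuracy, recall, and specificity, then derives the predictive values from it. This is an illustration rather than the authors' analysis: the function names are hypothetical, and because the inputs are rounded, the implied prevalence is only approximate.

```python
# Reported ensemble metrics from the abstract, expressed as fractions.
accuracy = 0.6738
recall = 0.7650          # sensitivity
specificity = 0.6725
reported_npv = 0.9949

def implied_prevalence(accuracy, recall, specificity):
    """Solve accuracy = p * recall + (1 - p) * specificity for prevalence p."""
    return (accuracy - specificity) / (recall - specificity)

def predictive_values(prevalence, recall, specificity):
    """Return (PPV, NPV) for a given prevalence, using per-patient proportions."""
    tp = prevalence * recall
    fn = prevalence * (1 - recall)
    tn = (1 - prevalence) * specificity
    fp = (1 - prevalence) * (1 - specificity)
    return tp / (tp + fp), tn / (tn + fn)

# Assumption: prevalence is inferred from rounded metrics, not reported in the abstract.
prevalence = implied_prevalence(accuracy, recall, specificity)
ppv, npv = predictive_values(prevalence, recall, specificity)

print(f"implied prevalence: {prevalence:.2%}")
print(f"implied PPV: {ppv:.2%}, implied NPV: {npv:.2%}")
```

Under these assumptions the implied prevalence is roughly 1.4%, and the derived negative predictive value agrees with the reported 99.49% to within rounding. When uptake is this rare, a high negative predictive value follows largely from the low base rate, so it is best read alongside recall and the decile stratification.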

Matching journals

The top 5 journals account for 50% of the predicted probability mass.
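The 50% statement can be checked against the listed probabilities with a running sum. The snippet below is a minimal sketch that uses the ten highest values from the list; `journals_to_reach` is a hypothetical helper, not part of the site that produced this ranking.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the ten top-ranked journals, copied from the list.
probs = [23.0, 9.3, 9.3, 7.3, 7.0, 5.0, 3.7, 3.7, 1.9, 1.8]

def journals_to_reach(probs_pct, threshold_pct=50.0):
    """Return (k, mass): the smallest k whose top-k sum reaches the threshold.

    If the threshold is never reached, k is None and mass is the total sum.
    """
    for k, total in enumerate(accumulate(probs_pct), start=1):
        if total >= threshold_pct:
            return k, total
    return None, sum(probs_pct)

k, mass = journals_to_reach(probs)
print(f"top {k} journals cover {mass:.1f}% of the predicted probability mass")
```

The top four entries sum to just under 50%, so the fifth journal is the one that crosses the threshold.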

1. International Journal of Medical Informatics; 25 papers in training set; Top 0.1%; 23.0%
2. Scientific Reports; 3102 papers in training set; Top 8%; 9.3%
3. JMIR Medical Informatics; 17 papers in training set; Top 0.1%; 9.3%
4. JAMIA Open; 37 papers in training set; Top 0.1%; 7.3%
5. BMC Medical Informatics and Decision Making; 39 papers in training set; Top 0.4%; 7.0%

50% of probability mass above

6. PLOS ONE; 4510 papers in training set; Top 30%; 5.0%
7. JMIR Public Health and Surveillance; 45 papers in training set; Top 0.5%; 3.7%
8. Journal of Medical Internet Research; 85 papers in training set; Top 1%; 3.7%
9. Journal of the American Medical Informatics Association; 61 papers in training set; Top 1%; 1.9%
10. Frontiers in Digital Health; 20 papers in training set; Top 0.5%; 1.8%
11. Journal of Personalized Medicine; 28 papers in training set; Top 0.4%; 1.7%
12. eClinicalMedicine; 55 papers in training set; Top 0.6%; 1.7%
13. npj Digital Medicine; 97 papers in training set; Top 2%; 1.7%
14. Frontiers in Public Health; 140 papers in training set; Top 6%; 1.3%
15. Pharmacology Research & Perspectives; 11 papers in training set; Top 0.1%; 1.3%
16. JMIR Formative Research; 32 papers in training set; Top 1%; 1.0%
17. BMJ Open; 554 papers in training set; Top 11%; 1.0%
18. PLOS Digital Health; 91 papers in training set; Top 2%; 0.9%
19. International Journal of Environmental Research and Public Health; 124 papers in training set; Top 6%; 0.8%
20. Biomedicines; 66 papers in training set; Top 2%; 0.8%
21. Cureus; 67 papers in training set; Top 4%; 0.8%
22. Computers in Biology and Medicine; 120 papers in training set; Top 4%; 0.8%
23. Healthcare; 16 papers in training set; Top 2%; 0.7%
24. Pain; 70 papers in training set; Top 0.9%; 0.7%
25. Journal of General Internal Medicine; 20 papers in training set; Top 1%; 0.7%
26. BMC Medical Research Methodology; 43 papers in training set; Top 2%; 0.7%
27. BMC Health Services Research; 42 papers in training set; Top 3%; 0.5%
28. Frontiers in Medicine; 113 papers in training set; Top 9%; 0.5%
29. Clinical and Translational Science; 21 papers in training set; Top 1%; 0.5%
30. BMJ Health & Care Informatics; 13 papers in training set; Top 1%; 0.5%