Machine learning-driven prediction of opioid and stimulant-related drug overdose fatalities: Analysis of the potential fourth wave
Eze, C. D.; Hansen, R.; Abate, M.; Smith, G.; Al-Mamun, M. A.
Between 2010 and 2021, deaths co-involving fentanyl and stimulants increased from 0.6% to 32.3% of all overdose deaths in the U.S. The Centers for Disease Control and Prevention monitors overdose deaths, but reports are delayed by about 4-6 months. Therefore, advanced methods are needed to improve trend monitoring and prepare the healthcare system. We developed and compared traditional and machine learning (ML)-based time series prediction models for forecasting opioid- and stimulant-involved death rates. Forensic research data (2015 to 2023) built from West Virginia (WV) Office of the Chief Medical Examiner data were used for this study. Decedents with any opioid- or stimulant-involved death were identified and placed into three cohorts: opioid-only, stimulant-only, and opioid and stimulant co-involved deaths. The monthly death rate per 100,000 was calculated for each cohort using total cases per month and West Virginia population data from the Census Bureau. Autoregressive Integrated Moving Average (ARIMA), Random Forest (RF), and Extreme Gradient Boosting (XGBoost) variant models (differenced, non-differenced, and blended) were trained on 80% of each cohort's time-ordered data. Iterative forecasting of the remaining 20% (the test data) was then conducted. Model performance on the test predictions was evaluated using root mean square error (RMSE), R2, mean absolute error (MAE), and mean absolute percentage error (MAPE). Counts and percentages of cases per year were obtained for each cohort. Death rates and model predictions were plotted as time series. Model performance for each cohort was compared using the performance metrics. A total of 10,812 cases were identified from 2015 to 2023, with 4,295 involving opioid-only, 1,392 involving stimulant-only, and 4,175 co-involving an opioid and a stimulant.
Stimulant-only and opioid and stimulant co-involved death rates trended upward, with co-involved deaths peaking in 2021. Although the opioid-only death rate trended downward over time, it peaked in 2020. The non-differenced XGBoost model performed best for opioid-only (R2 = 0.92, RMSE = 0.12, MAE = 0.10, MAPE = 6.59%) and stimulant-only (R2 = 0.91, RMSE = 0.07, MAE = 0.06, MAPE = 7.35%) death rate prediction. The blended XGBoost model had the best performance for opioid and stimulant co-involved death rate prediction (R2 = 0.78, RMSE = 0.31, MAE = 0.27, MAPE = 8.87%). Differenced XGBoost models outperformed other models for short-term forecasting, while the non-differenced variants performed better for long-term predictions. Machine learning models, especially the XGBoost variants, outperformed other models for predicting opioid-only, stimulant-only, and opioid and stimulant co-involved death rates. The differenced models can be used for early detection of death rate signals, while the non-differenced XGBoost models can aid long-term forecasts for overdose death monitoring, planning, and allocation of resources in health systems.
Author summary
The United States has faced the problem of opioid abuse and overdose death for several decades. Currently, drug overdose deaths co-involving an opioid and a stimulant are rising. Although the CDC monitors overdose deaths and produces a provisional count, this report is often delayed by 4-6 months. There is a need for a high-accuracy predictive tool that yields reliable forecasts of these overdose deaths, so that policy decisions can be guided without waiting out the delay. Here we developed machine learning forecasting models (Extreme Gradient Boosting (XGBoost) and Random Forest) and compared them with a traditional statistical model (ARIMA) for predicting overdose death rates involving an opioid, a stimulant, or both.
We found that the XGBoost models performed better than ARIMA and Random Forest at making these predictions. Our study provides a tool that can be used to predict future overdose deaths and to inform health systems and communities so that they can respond better to overdose deaths and develop policies targeting drug overdose prevention, especially for opioids and stimulants.
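The evaluation workflow described in the abstract (monthly rate per 100,000, an 80/20 time-ordered split, iterative forecasting, and the four error metrics) can be illustrated with a minimal Python sketch. This is not the authors' code. All function names are hypothetical, the standard textbook definitions of RMSE, R2, MAE, and MAPE are assumed, and the walk-forward loop assumes that each observed test value is revealed before the next one-step forecast, a detail the abstract does not specify.

```python
import math


def rate_per_100k(deaths, population):
    """Death rate per 100,000 residents for one month."""
    return deaths / population * 100_000


def time_ordered_split(series, train_fraction=0.8):
    """Split a chronologically ordered series without shuffling."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def walk_forward(train, test, predict_next):
    """One-step-ahead iterative forecast over the test period.

    predict_next is any callable mapping the history so far to the next
    value; it stands in for a fitted ARIMA, RF, or XGBoost model.
    Assumption: the observed value is appended after each step.
    """
    history = list(train)
    predictions = []
    for observed in test:
        predictions.append(predict_next(history))
        history.append(observed)
    return predictions


def forecast_metrics(actual, predicted):
    """RMSE, R2, MAE, and MAPE (in percent) under their usual definitions."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(e * e for e in errors)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "RMSE": math.sqrt(ss_res / n),
        "R2": 1 - ss_res / ss_tot,
        "MAE": sum(abs(e) for e in errors) / n,
        # MAPE is undefined when an actual value is zero.
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }
```

As a usage example, `walk_forward(train, test, lambda history: history[-1])` produces a naive last-value baseline, and passing its output to `forecast_metrics(test, predictions)` gives a reference point against which any fitted model's scores can be judged.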
Matching journals
The top 4 journals account for 50% of the predicted probability mass.