Predicting Vaping Cessation in Young Adults: A Machine Learning and Explainable Artificial Intelligence (XAI) Approach to Public Health Intervention.

Satheeshkumar, P. S.; Lango, I.; Zafor, S.; Ebanks, M.; Das, R. K.; Cheung, K. w.; Pili, R.; Mahajan, S. D.

2025-09-18 health informatics
10.1101/2025.09.15.25335817 medRxiv
The public health impact of vaping in the United States reflects a complex balance of potential benefits and emerging risks. While e-cigarettes can substantially reduce exposure to toxic combustion byproducts and may aid smoking cessation in adult tobacco users, evidence links e-cigarette use to respiratory and cardiovascular injury, raising concerns about long-term health outcomes in vapers. Vaping has also become deeply entrenched among youth. In 2024, 38.4% of adolescent e-cigarette users reported habitual vaping patterns, indicating persistent nicotine dependence within this vulnerable population. This youth uptake highlights the urgent need for comprehensive, evidence-based policies that both advance cessation support for adults and enable robust prevention strategies for adolescents. To inform the optimal use of predictive technologies in vaping cessation efforts, we conducted a social-media-based survey targeting young adult vapers. Our aims were to (1) characterize cessation-related behaviors and attitudes and (2) evaluate machine learning and XAI methods for predicting quit attempts and successes within this population. We employed both forward selection and backward elimination to identify key predictors of successful vaping cessation. The dataset was partitioned into training (70%) and testing (30%) subsets for model development and evaluation. We applied a range of machine learning algorithms to the training data and then validated their performance on the test set. For linear modeling, we used least absolute shrinkage and selection operator (LASSO), ridge regression, and elastic net. We also included non-linear approaches, random forest (RF) and support vector machine (SVM), to capture more complex relationships within the data. We assessed model performance using the C-statistic (area under the curve, AUC) and further validated it using Brier scores. Among the models evaluated, linear approaches demonstrated superior overall performance, while the non-linear models (RF and SVM) exhibited strong predictive accuracy on the training data. LASSO regression yielded robust results, with AUC values of 0.89 for the training set and 0.91 for the test set. Ridge regression followed closely, achieving AUCs of 0.88 and 0.93, respectively. Elastic net regression performed consistently across both datasets, with an AUC of 0.91 in both training and testing. Key predictors of successful vaping cessation included age, environmental triggers, vaping frequency, sex, and long-term behavioral outlook. Age emerged as a particularly influential factor, with individuals under 25 exhibiting increased vulnerability, likely due to neurodevelopmental sensitivity and elevated usage rates. Environmental cues, especially social exposure, were strongly associated with relapse risk, highlighting the importance of trigger management in cessation strategies. Interestingly, vaping frequency served as a counterintuitive indicator: erratic usage patterns correlated with lower cessation success, suggesting that consistent use may reflect a greater readiness for behavioral change. Sex-based differences were also notable, with males demonstrating more intense withdrawal symptoms and higher consumption levels, underscoring the need for gender-responsive interventions. These findings underscore the utility of machine learning in uncovering nuanced behavioral and contextual determinants of addiction and cessation outcomes, offering valuable insights for the design of targeted public health interventions.
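
The evaluation steps named in the abstract (a 70/30 split, the C-statistic/AUC, and the Brier score) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the random seed, and the toy labels and probabilities below are hypothetical and chosen only to show how the two metrics are computed from predicted probabilities.

```python
import random
from typing import List, Sequence, Tuple


def split_indices(n: int, test_fraction: float = 0.3, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle row indices and return (train, test) index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded for reproducibility (assumed value)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """C-statistic: probability that a random positive case is scored
    above a random negative case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                wins += 1.0
            elif p_pos == p_neg:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome
    (0 is perfect; lower is better)."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy example (hypothetical data): 1 = quit successfully, 0 = did not.
y_true = [1, 0, 1, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]

train_idx, test_idx = split_indices(10)  # 7 training rows, 3 test rows
print(f"AUC = {auc(y_true, y_prob):.3f}")
print(f"Brier score = {brier_score(y_true, y_prob):.3f}")
```

The AUC measures discrimination (how well the model ranks people who quit above those who did not), whereas the Brier score also reflects calibration of the predicted probabilities, which is why reporting both gives a fuller picture than either alone.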

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. International Journal of Drug Policy | 11 papers in training set | Top 0.1% | 19.6%
2. Nicotine and Tobacco Research | 13 papers in training set | Top 0.1% | 14.5%
3. Scientific Reports | 3102 papers in training set | Top 9% | 8.5%
4. JMIR Public Health and Surveillance | 45 papers in training set | Top 0.3% | 4.9%
5. PLOS ONE | 4510 papers in training set | Top 31% | 4.9%
50% of probability mass above
6. Nicotine & Tobacco Research | 11 papers in training set | Top 0.1% | 4.0%
7. Preventive Medicine Reports | 14 papers in training set | Top 0.1% | 2.6%
8. Frontiers in Digital Health | 20 papers in training set | Top 0.4% | 2.1%
9. International Journal of Medical Informatics | 25 papers in training set | Top 0.7% | 1.9%
10. Journal of the American Heart Association | 119 papers in training set | Top 3% | 1.9%
11. eClinicalMedicine | 55 papers in training set | Top 0.4% | 1.9%
12. BMJ Open | 554 papers in training set | Top 8% | 1.9%
13. BMC Medicine | 163 papers in training set | Top 4% | 1.5%
14. Addiction | 25 papers in training set | Top 0.3% | 1.5%
15. Frontiers in Public Health | 140 papers in training set | Top 5% | 1.5%
16. International Journal of Environmental Research and Public Health | 124 papers in training set | Top 5% | 1.3%
17. Drug and Alcohol Dependence | 37 papers in training set | Top 0.4% | 1.3%
18. Addiction Biology | 47 papers in training set | Top 0.6% | 1.2%
19. Frontiers in Artificial Intelligence | 18 papers in training set | Top 0.4% | 1.2%
20. npj Digital Medicine | 97 papers in training set | Top 3% | 1.2%
21. American Journal of Preventive Medicine | 11 papers in training set | Top 0.4% | 1.1%
22. JAMIA Open | 37 papers in training set | Top 1% | 1.0%
23. NeuroImage: Clinical | 132 papers in training set | Top 3% | 1.0%
24. BMC Public Health | 147 papers in training set | Top 5% | 0.9%
25. Heliyon | 146 papers in training set | Top 6% | 0.8%
26. Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences | 15 papers in training set | Top 0.9% | 0.7%
27. PeerJ | 261 papers in training set | Top 17% | 0.7%
28. Addiction Neuroscience | 17 papers in training set | Top 0.6% | 0.5%
29. Medicine & Science in Sports & Exercise | 15 papers in training set | Top 0.6% | 0.5%
30. PLOS Digital Health | 91 papers in training set | Top 3% | 0.5%