Dynamic and Baseline Multi-Task Learning for Predicting Substance Use Initiation in the ABCD Study

Wei, M.; Zhang, H.; Peng, Q.

2026-04-13 addiction medicine
10.64898/2026.04.10.26350655 medRxiv

Background: Early initiation of substance use is linked to later adverse outcomes, and risk factors come from multiple domains and are shared across substances. In our previous work, traditional time-to-event Cox models identified individual risk factors, but these models are not designed to jointly model multiple outcomes or capture complex non-linear relationships. Multi-task learning (MTL) can leverage shared structure across related outcomes to improve prediction and distinguish common versus substance-specific predictors. However, most MTL studies rely on baseline features and focus on single outcomes, which limits their ability to capture shared risk and temporal changes. Substance use initiation is a time-dependent process that unfolds during development and reflects changing exposures over time. Baseline-only models cannot capture these changes or represent risk dynamics. Discrete-time modeling provides a practical approach by estimating interval-level initiation risk and combining it into cumulative risk at the subject level. By integrating multi-task learning with dynamic modeling, it is possible to share information across outcomes while capturing how risk evolves over time, which may improve prediction performance.

Methods: Using the Adolescent Brain Cognitive Development (ABCD) Study (release 5.1), we developed two complementary multi-task learning (MTL) frameworks to predict initiation of alcohol, nicotine, cannabis, and any substance use. A baseline MTL model predicted fixed-horizon (48-month) initiation using one record per participant, while a dynamic discrete-time MTL model incorporated longitudinal interval data to model time-varying risk. Both models used multi-domain environmental exposures, core covariates, and polygenic risk scores (PRS). Performance was evaluated on a held-out test set using AUROC, PR-AUC, and calibration metrics, and compared with single-task logistic regression (LR). Feature importance was assessed using permutation importance and compared with Cox proportional hazards models.

Results: MTL showed comparable or improved performance relative to LR, with larger gains for low-prevalence outcomes (cannabis and nicotine). Incorporating longitudinal information led to consistent improvements across all outcomes. Dynamic models increased AUROC by +0.044 to +0.062 for MTL and +0.050 to +0.084 for LR, indicating that temporal information was the primary driver of performance gains. Feature importance analyses showed modest overlap across methods, with higher agreement between dynamic MTL and Cox models than static MTL. A small set of features, including externalizing behavior, parental monitoring, and developmental factors, were consistently identified across all approaches.

Conclusions: Dynamic multi-task learning improves the prediction of substance use initiation by leveraging longitudinal structure and shared information across outcomes. While MTL provides additional gains, incorporating time-varying information is the dominant factor for improving performance. Combining baseline and dynamic frameworks offers a comprehensive strategy for identifying robust risk factors and modeling adolescent substance use initiation.
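
The abstract describes discrete-time modeling as estimating interval-level initiation risk and combining it into cumulative risk at the subject level. The abstract does not give the formula, so the sketch below assumes the standard discrete-time survival identity, in which cumulative risk is one minus the product of the per-interval probabilities of not initiating. The function name `cumulative_risk` and the hazard values are hypothetical and are not taken from the study.

```python
from typing import Sequence


def cumulative_risk(interval_hazards: Sequence[float]) -> float:
    """Combine per-interval initiation probabilities into a cumulative risk.

    Assumes the standard discrete-time survival identity (an assumption,
    not a formula stated in the abstract):
        risk = 1 - prod_t (1 - h_t)
    where h_t is the predicted probability of initiating in interval t,
    given no initiation before interval t.
    """
    survival = 1.0  # probability of not having initiated so far
    for h in interval_hazards:
        if not 0.0 <= h <= 1.0:
            raise ValueError(f"hazard must lie in [0, 1], got {h}")
        survival *= 1.0 - h
    return 1.0 - survival


# Hypothetical interval-level predictions for one participant
# (illustrative values only, not results from the paper).
hazards = [0.02, 0.03, 0.05, 0.08]
risk = cumulative_risk(hazards)
print(f"cumulative initiation risk: {risk:.4f}")
```

In a multi-task setting, a function like this would presumably be applied separately to each outcome's interval-level predictions, yielding one subject-level risk per substance.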

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1 | JAMA Network Open | 127 papers in training set | Top 0.1% | 22.7%
2 | Computational Psychiatry | 12 papers in training set | Top 0.1% | 10.5%
3 | Drug and Alcohol Dependence | 37 papers in training set | Top 0.1% | 10.2%
4 | PLOS Digital Health | 91 papers in training set | Top 0.3% | 6.4%
5 | Addiction | 25 papers in training set | Top 0.2% | 4.9%
50% of probability mass above
6 | Developmental Cognitive Neuroscience | 81 papers in training set | Top 0.1% | 4.3%
7 | Frontiers in Artificial Intelligence | 18 papers in training set | Top 0.1% | 4.2%
8 | Frontiers in Psychiatry | 83 papers in training set | Top 0.9% | 4.0%
9 | Biological Psychiatry: Cognitive Neuroscience and Neuroimaging | 62 papers in training set | Top 0.4% | 4.0%
10 | Statistics in Medicine | 34 papers in training set | Top 0.1% | 3.6%
11 | Biological Psychiatry Global Open Science | 54 papers in training set | Top 0.6% | 1.7%
12 | Human Brain Mapping | 295 papers in training set | Top 3% | 1.7%
13 | PLOS ONE | 4510 papers in training set | Top 53% | 1.7%
14 | Neuropsychopharmacology | 134 papers in training set | Top 2% | 1.3%
15 | International Journal of Drug Policy | 11 papers in training set | Top 0.2% | 1.2%
16 | Translational Psychiatry | 219 papers in training set | Top 3% | 1.0%
17 | PLOS Computational Biology | 1633 papers in training set | Top 22% | 0.9%
18 | The British Journal of Psychiatry | 21 papers in training set | Top 0.8% | 0.9%
19 | Scientific Reports | 3102 papers in training set | Top 71% | 0.9%
20 | Biological Psychiatry | 119 papers in training set | Top 2% | 0.9%
21 | The Lancet Public Health | 20 papers in training set | Top 0.6% | 0.8%
22 | Addiction Biology | 47 papers in training set | Top 0.7% | 0.8%
23 | American Journal of Epidemiology | 57 papers in training set | Top 1% | 0.8%
24 | eLife | 5422 papers in training set | Top 61% | 0.6%
25 | Communications Biology | 886 papers in training set | Top 32% | 0.5%
26 | Psychopharmacology | 59 papers in training set | Top 0.8% | 0.5%
27 | Addiction Neuroscience | 17 papers in training set | Top 0.6% | 0.5%
28 | BJPsych Open | 25 papers in training set | Top 0.9% | 0.5%
29 | European Child & Adolescent Psychiatry | 14 papers in training set | Top 0.5% | 0.5%