
Constructing and analyzing a synthetic life course cohort based on pooling two data sources: A case study of early adulthood depression symptomatology and late-life cognition

Zimmerman, S. C.; Buto, P.; Kezios, K.; Zeki Al Hazzouri, A.; Glymour, M. M.

2026-02-27 epidemiology
10.64898/2026.02.25.26347113 medRxiv

Background: Synthetic cohorts created by combining two cohorts can be useful when no single data set includes both the exposure and outcome data of interest. We estimate the effects of depression in early adulthood on later-life memory using two nationally representative cohorts separately and in a synthetic sample.

Methods: We used the National Longitudinal Survey of Youth 1979 (NLSY; N=5,747) and the Health and Retirement Study (HRS; N=6,846), and a synthetic cohort combining exposure data from N=5,680 NLSY participants (born 1957-1965) aged 55-63 in 2020 who completed midlife cognitive assessment between 2006 and 2020 with outcome data from N=9,726 HRS participants born 1957-1964 who completed cognitive assessments when 47-63 years old and every 2 years thereafter. A 6-item version of the Center for Epidemiologic Studies Depression (CES-D) scale (score range 0-6) was measured from late adolescence through midlife in NLSY and in midlife in HRS. Memory was measured as the sum of immediate and delayed word recall scores up to twice in NLSY at age 48+ and up to 10 times in HRS at age 50+. We generated a synthetic life course cohort by matching HRS participants to NLSY participants on 10 variables measured in midlife in both cohorts and posited to either confound or mediate the association between early-life depressive symptoms and late-life memory. Matching variables included midlife depression and memory. We used confounder-adjusted linear mixed models to estimate the association between earliest reported depressive symptoms in NLSY and HRS with memory in the respective data sets, and evaluated associations of early-life depressive symptoms with the repeated later-life memory measures in the synthetic cohort.

Results: In NLSY, each increment in CES-D at age 23-31 was associated with lower average memory scores in midlife (βNLSY_level = -0.050; 95% CI: -0.097, -0.003) but no detectable difference in rate of memory decline (βNLSY_slope = -0.070; 95% CI: -0.382, 0.242). In HRS, CES-D at average age 53 was associated with lower average memory (βHRS_level = -0.163; 95% CI: -0.199, -0.128) but not rate of decline (βHRS_slope = -0.021; 95% CI: -0.062, 0.020). In the synthetic cohort, CES-D at age 23-27 was associated with lower memory score at age 50+ (βsynth_level = -0.044; 95% CI: -0.085, -0.003) but not with rate of cognitive decline (βsynth_slope = 0.005; 95% CI: -0.052, 0.062).

Conclusions: Depressive symptoms at ages 23-31 predicted mid- to late-life memory function but had no clear association with memory decline. Combining data across cohorts spanning separate, but overlapping, parts of the life course is a promising approach to overcome data limitations in life course research, but it requires careful implementation to ensure that assumptions are met and estimates are appropriately interpreted.
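The matching step described in the abstract can be illustrated with a minimal sketch. The abstract does not state which matching algorithm was used, so the nearest-neighbor rule below (squared Euclidean distance on standardized variables, donors reused freely) is an assumption made for illustration only. All data are simulated, and the names `standardize`, `match_nearest`, `nlsy_midlife`, and `hrs_midlife` are hypothetical.

```python
import random
import statistics


def standardize(rows, ref_rows):
    """Z-score each column of `rows` using means and SDs computed from `ref_rows`."""
    cols = list(zip(*ref_rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def match_nearest(recipients, donors):
    """For each recipient row, return the index of the closest donor row.

    Assumed rule: nearest neighbor by squared Euclidean distance on
    variables standardized over the pooled sample, with replacement.
    """
    pooled = recipients + donors
    rec_z = standardize(recipients, pooled)
    don_z = standardize(donors, pooled)
    matches = []
    for row in rec_z:
        dists = [sum((a - b) ** 2 for a, b in zip(row, d)) for d in don_z]
        matches.append(min(range(len(don_z)), key=dists.__getitem__))
    return matches


# Simulated stand-ins: 10 midlife matching variables per participant.
rng = random.Random(0)
N_VARS = 10
nlsy_midlife = [[rng.gauss(0, 1) for _ in range(N_VARS)] for _ in range(50)]
nlsy_early_cesd = [rng.randint(0, 6) for _ in range(50)]  # CES-D score, range 0-6
hrs_midlife = [[rng.gauss(0, 1) for _ in range(N_VARS)] for _ in range(80)]

# Each simulated HRS participant borrows the early-adulthood CES-D score
# of the matched simulated NLSY participant.
donor_idx = match_nearest(hrs_midlife, nlsy_midlife)
hrs_early_cesd = [nlsy_early_cesd[i] for i in donor_idx]
```

In this sketch the borrowed exposure (`hrs_early_cesd`) would then be paired with the recipients' own repeated memory measures for modeling. Whether the borrowed values can stand in for the unobserved ones depends on how well the midlife matching variables capture the pathways between early exposure and later outcome, which is the assumption the abstract flags as needing careful checking.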

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. American Journal of Epidemiology (57 papers in training set; Top 0.1%): 18.9%
2. Journal of Affective Disorders (81 papers in training set; Top 0.2%): 10.3%
3. Epidemiology (26 papers in training set; Top 0.1%): 4.9%
4. Psychological Medicine (74 papers in training set; Top 0.3%): 4.9%
5. BMC Medicine (163 papers in training set; Top 0.7%): 4.9%
6. Brain, Behavior, and Immunity (105 papers in training set; Top 0.5%): 4.0%
7. PLOS ONE (4510 papers in training set; Top 35%): 4.0%

50% of probability mass above

8. JAMA Network Open (127 papers in training set; Top 1%): 3.3%
9. Journal of Psychiatric Research (28 papers in training set; Top 0.2%): 3.1%
10. SSM - Population Health (17 papers in training set; Top 0.1%): 2.6%
11. Translational Psychiatry (219 papers in training set; Top 2%): 2.4%
12. JAMA Psychiatry (13 papers in training set; Top 0.2%): 2.1%
13. Scientific Reports (3102 papers in training set; Top 57%): 1.7%
14. Alzheimer's & Dementia (143 papers in training set; Top 2%): 1.7%
15. PLOS Medicine (98 papers in training set; Top 3%): 1.5%
16. Social Science & Medicine (15 papers in training set; Top 0.5%): 1.5%
17. The British Journal of Psychiatry (21 papers in training set; Top 0.6%): 1.4%
18. European Psychiatry (10 papers in training set; Top 0.4%): 1.4%
19. International Journal of Epidemiology (74 papers in training set; Top 2%): 1.4%
20. Biological Psychiatry Global Open Science (54 papers in training set; Top 1%): 1.1%
21. Journal of Child Psychology and Psychiatry (25 papers in training set; Top 0.3%): 1.0%
22. Social Psychiatry and Psychiatric Epidemiology (11 papers in training set; Top 0.4%): 0.9%
23. Brain, Behavior, & Immunity - Health (27 papers in training set; Top 0.4%): 0.9%
24. European Journal of Epidemiology (40 papers in training set; Top 0.6%): 0.8%
25. BMC Public Health (147 papers in training set; Top 6%): 0.8%
26. Molecular Psychiatry (242 papers in training set; Top 3%): 0.8%
27. Journal of Affective Disorders Reports (10 papers in training set; Top 0.3%): 0.7%
28. Epigenetics (43 papers in training set; Top 1%): 0.7%
29. Nature Human Behaviour (85 papers in training set; Top 5%): 0.7%
30. Clinical Epigenetics (53 papers in training set; Top 1%): 0.7%
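
The 50% cutoff marked in the list above can be checked with a running sum of the listed probabilities. This is a small arithmetic sketch, not the page's own code. The values are the first ten percentages from the list, and the function name `journals_needed` is hypothetical.

```python
# Predicted probabilities (in percent) for the ten highest-ranked journals, as listed.
probabilities = [18.9, 10.3, 4.9, 4.9, 4.9, 4.0, 4.0, 3.3, 3.1, 2.6]


def journals_needed(probs, threshold):
    """Return (rank, cumulative mass) at the first rank where the running sum reaches `threshold`."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None  # threshold never reached with the values supplied


rank, mass = journals_needed(probabilities, 50.0)
print(f"{rank} journals reach {mass:.1f}% of the probability mass")
```

The running sum is about 47.9% after six journals and about 51.9% after seven, which agrees with the statement that the top 7 journals account for 50% of the predicted probability mass.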