Development of Longitudinal, Linked Maternal-Infant Cohorts using the Epic Cosmos Electronic Health Record Dataset

Leonard, S. A.; Dysart, K.; Callahan, A.; Siadat, S.; Zhang, J.; Handley, S. C.; Huybrechts, K. F.; Igbinosa, I.; Bateman, B. T.

2026-06-04 epidemiology
10.64898/2026.06.02.26354757 medRxiv
Background: Epic Cosmos is a relatively new centralized electronic health record dataset with high potential utility in perinatal epidemiologic research. Objectives: The study objectives were to develop replicable steps to create longitudinal, linked maternal-infant cohorts in Cosmos, assess completeness of key variables, evaluate potential selection bias with restrictions for longitudinal healthcare encounters, and provide an example epidemiologic analysis. Methods: We created maternal-infant cohorts by starting with live births during 2023-2024 recorded in the BirthFact data table and joining with additional data tables as needed. We selected and created variables for perinatal characteristics, common comorbidities, and routinely measured vital signs and laboratory values, and assessed variable completeness. We sequentially restricted the birth cohort for maternal-infant linkage and longitudinal healthcare from first-trimester prenatal care encounter through infant follow-up care within 12 weeks post-discharge from birth hospitalization. Finally, we conducted an example analysis of the association between high systolic blood pressure in the first trimester (≥140 mm Hg) and later onset of preeclampsia among those with chronic hypertension. Results: The total linked birth cohort included 2,624,186 pregnancies. Completeness was >90% for most variables assessed but was 77% for racial and ethnic group and 76% for body mass index at delivery. Characteristics of the cohort were similar to those reported for the entire United States birth population based on birth certificate data, including similar regional and racial-ethnic composition. Longitudinal cohort restriction requiring linked records from first trimester prenatal care through infant follow-up care reduced the cohort size to 509,148 pregnancies. However, restriction had minimal effects on cohort characteristics. In the example analysis, high systolic blood pressure was associated with increased risk of preeclampsia among those with chronic hypertension (aRR: 1.26; 95% CI: 1.22, 1.30). Conclusions: This study provides a rigorous and reproducible approach to creating longitudinal, linked maternal-infant cohorts in Epic Cosmos and the analytical findings suggest high data quality and representativeness.

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. International Journal of Epidemiology; 74 papers in training set; Top 0.1%; 10.7%
2. Journal of the American Heart Association; 119 papers in training set; Top 0.7%; 9.4%
3. PLOS ONE; 4510 papers in training set; Top 21%; 8.6%
4. Circulation; 66 papers in training set; Top 0.7%; 5.0%
5. Hypertension; 32 papers in training set; Top 0.2%; 5.0%
6. Pharmacoepidemiology and Drug Safety; 13 papers in training set; Top 0.1%; 4.4%
7. Scientific Reports; 3102 papers in training set; Top 29%; 4.1%
8. PLOS Medicine; 98 papers in training set; Top 1%; 3.3%
50% of probability mass above
9. BMJ Open; 554 papers in training set; Top 7%; 2.8%
10. Journal of Biomedical Informatics; 45 papers in training set; Top 0.5%; 2.8%
11. Epidemiology; 26 papers in training set; Top 0.2%; 2.7%
12. Wellcome Open Research; 57 papers in training set; Top 0.5%; 2.4%
13. BMC Medical Research Methodology; 43 papers in training set; Top 0.5%; 1.7%
14. BMC Medicine; 163 papers in training set; Top 4%; 1.4%
15. Annals of Epidemiology; 19 papers in training set; Top 0.2%; 1.4%
16. BMC Pregnancy and Childbirth; 20 papers in training set; Top 0.5%; 1.3%
17. American Journal of Epidemiology; 57 papers in training set; Top 1%; 0.9%
18. BMC Cardiovascular Disorders; 14 papers in training set; Top 1%; 0.9%
19. Circulation: Genomic and Precision Medicine; 42 papers in training set; Top 1%; 0.9%
20. JMIR Formative Research; 32 papers in training set; Top 1%; 0.8%
21. BMC Medical Informatics and Decision Making; 39 papers in training set; Top 2%; 0.8%
22. Journal of the American Medical Informatics Association; 61 papers in training set; Top 2%; 0.8%
23. Clinical Infectious Diseases; 231 papers in training set; Top 4%; 0.8%
24. Nature Communications; 4913 papers in training set; Top 61%; 0.8%
25. JAMIA Open; 37 papers in training set; Top 1%; 0.8%
26. Trials; 25 papers in training set; Top 2%; 0.7%
27. Database; 51 papers in training set; Top 1.0%; 0.7%
28. JAMA Network Open; 127 papers in training set; Top 5%; 0.7%
29. International Journal of Obesity; 25 papers in training set; Top 0.7%; 0.7%
30. Open Heart; 19 papers in training set; Top 1%; 0.7%