Mother-infant linked UK electronic birth cohorts representing 17.5 million births harmonised to the OMOP common data model

Seaborne, M.; Durbaba, S.; Mendez-Villalon, A.; Giles, T.; Gonzalez-Izquierdo, A.; Hough, A.; Sanchez-Soriano, C.; Snell, H.; Cockburn, N.; Nirantharakumar, K.; Poston, L.; Reynolds, R.; Santorelli, G.; Brophy, S.

2026-03-25 public and global health
10.64898/2026.03.23.26349078 medRxiv

We describe the harmonisation of five UK electronic birth cohorts to the Observational Medical Outcomes Partnership (OMOP) Common Data Model, creating a large-scale, standardised resource for maternal and child health research. The Mother and Infant Research Data Analysis (MIREDA) partnership developed and implemented reproducible guidelines for mapping maternal-infant relationships and identifying pregnancy episodes within routinely collected healthcare data. Cohorts from England, Scotland, and Wales were transformed despite substantial heterogeneity in data structure, coding systems, and variable definitions. The resulting harmonised resource preserves each cohort as an independent dataset while enabling federated analyses to be conducted across sites without the need to share individual-level data. Collectively, the cohorts capture over 17.5 million live births, providing sufficient scale to investigate rare exposures and outcomes, support trial emulation, and evaluate population-level policy impacts across the UK. This article details the transformation pipeline and provides reusable methods to support extension to additional cohorts and networks. The harmonised datasets enable interoperable, reproducible research and facilitate cross-national comparative studies in maternal and child health.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Nature Communications: 4913 papers in training set, Top 8%, 17.2%
2. International Journal of Epidemiology: 74 papers in training set, Top 0.1%, 14.4%
3. npj Digital Medicine: 97 papers in training set, Top 0.7%, 6.7%
4. Nature Medicine: 117 papers in training set, Top 0.3%, 6.2%
5. The Lancet Digital Health: 25 papers in training set, Top 0.1%, 6.2%

50% of probability mass above

6. BMC Medicine: 163 papers in training set, Top 0.8%, 4.8%
7. BMJ Open: 554 papers in training set, Top 6%, 3.5%
8. PLOS ONE: 4510 papers in training set, Top 41%, 3.5%
9. Nature Genetics: 240 papers in training set, Top 2%, 3.5%
10. eLife: 5422 papers in training set, Top 27%, 3.5%
11. Genome Medicine: 154 papers in training set, Top 3%, 2.3%
12. PLOS Medicine: 98 papers in training set, Top 2%, 2.0%
13. Pharmacoepidemiology and Drug Safety: 13 papers in training set, Top 0.2%, 1.7%
14. Thorax: 32 papers in training set, Top 0.6%, 1.3%
15. BMJ: 49 papers in training set, Top 0.9%, 0.9%
16. The Lancet Infectious Diseases: 71 papers in training set, Top 2%, 0.9%
17. Wellcome Open Research: 57 papers in training set, Top 2%, 0.9%
18. Scientific Reports: 3102 papers in training set, Top 71%, 0.9%
19. The Lancet Public Health: 20 papers in training set, Top 0.5%, 0.9%
20. PLOS Global Public Health: 293 papers in training set, Top 5%, 0.8%
21. Journal of the American Medical Informatics Association: 61 papers in training set, Top 2%, 0.8%
22. Med: 38 papers in training set, Top 0.9%, 0.7%
23. British Journal of General Practice: 22 papers in training set, Top 0.6%, 0.7%
24. Communications Medicine: 85 papers in training set, Top 1%, 0.7%
25. JAMIA Open: 37 papers in training set, Top 2%, 0.6%
26. Epidemics: 104 papers in training set, Top 2%, 0.6%
27. GENETICS: 189 papers in training set, Top 2%, 0.6%
28. International Journal of Medical Informatics: 25 papers in training set, Top 2%, 0.6%
29. Science Advances: 1098 papers in training set, Top 34%, 0.6%
30. iScience: 1063 papers in training set, Top 38%, 0.6%