OMOP CDM for breast cancer research: transforming the Breast Cancer Now Biobank data

Abdollahyan, M.; BCNB-BCI; Chelala, C.

2025-12-15 health informatics
10.64898/2025.12.14.25342223
Common data models (CDMs) are essential for health data standardisation, which facilitates the governance and management of data, improves data quality and enhances the findability, accessibility, interoperability and reusability of data. They allow researchers to efficiently integrate health datasets and perform joint analysis on them, promoting collaboration and maximising translation of research outputs for patient benefit. We describe the process of transforming the biobank data for over 2,850 donors recruited at the Barts Cancer Institute (BCI) site of the Breast Cancer Now Biobank (BCNB) - the UK's first national breast cancer biobank hosting longitudinal biospecimens and associated clinical, genomic and imaging data - into the Observational Medical Outcomes Partnership (OMOP) CDM. Our transformation pipeline achieved high coverage, with 83% of source concepts mapped, and our OMOP CDM achieved a total pass rate of 100% in quality assessments. We present the breast cancer characteristics of the resultant patient cohort. We report several challenges faced during the transformation process and explain how we addressed them, and discuss the strengths and limitations of adopting the OMOP CDM for breast cancer research. The OMOP-mapped BCNB-BCI dataset is a valuable resource that can now be explored and analysed alongside other health datasets.

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. JCO Clinical Cancer Informatics; based on 14 papers; Top 0.1%; 9.8%
2. Journal of Biomedical Informatics; based on 37 papers; Top 0.6%; 9.8%
3. BMC Medical Informatics and Decision Making; based on 36 papers; Top 2%; 7.3%
4. Journal of the American Medical Informatics Association; based on 53 papers; Top 2%; 7.3%
5. JAMIA Open; based on 35 papers; Top 2%; 6.2%
6. Scientific Data; based on 30 papers; Top 0.1%; 5.6%
7. Nature Communications; based on 483 papers; Top 15%; 5.1%
50% of probability mass above
8. PLOS ONE; based on 1737 papers; Top 72%; 4.5%
9. BMJ Health & Care Informatics; based on 13 papers; Top 0.2%; 4.3%
10. Bioinformatics; based on 24 papers; Top 0.4%; 2.7%
11. npj Digital Medicine; based on 85 papers; Top 8%; 2.4%
12. BMJ Open; based on 553 papers; Top 37%; 2.2%
13. Nature Medicine; based on 88 papers; Top 7%; 1.7%
14. PLOS Digital Health; based on 88 papers; Top 8%; 1.7%
15. JMIR Medical Informatics; based on 16 papers; Top 3%; 1.5%
16. International Journal of Medical Informatics; based on 25 papers; Top 4%; 1.5%
17. The Lancet Digital Health; based on 25 papers; Top 3%; 1.3%
18. Frontiers in Digital Health; based on 18 papers; Top 3%; 1.3%
19. Wellcome Open Research; based on 34 papers; Top 2%; 1.3%
20. iScience; based on 74 papers; Top 7%; 0.8%
21. eLife; based on 262 papers; Top 31%; 0.8%
22. Scientific Reports; based on 701 papers; Top 85%; 0.8%
23. BMC Medical Research Methodology; based on 41 papers; Top 7%; 0.6%
24. BMC Medical Genomics; based on 12 papers; Top 2%; 0.6%
25. Med; based on 26 papers; Top 2%; 0.6%