A methodological framework for accommodating Cancer Genomics Information in OMOP-CDM using Variation Representation Specification (VRS).

Benetti, E.; Scicolone, G.; Tajwar, M.; Masciullo, C.; Bucci, G.; Riba, M.

2026-02-10 bioinformatics
10.64898/2026.02.09.702490 bioRxiv
The OMOP Common Data Model (OMOP CDM), in which observational health data are organized and stored, is a broadly accepted data standard that supports clinical research by facilitating federated study protocols. In cancer studies, there is a growing need to incorporate cancer genomics data in a standardized way. Starting from a brief overview of the basic features of the OMOP CDM, we outline a path of increasing complexity, from including known biomarker genomic data drawn from pathology reports or clinical laboratory findings to storing thousands of known and unknown variants from genome sequencing data. Data should be stored using standardized identifiers, including those defined by the Global Alliance for Genomics and Health (GA4GH). We propose a scalable strategy for storing genomic variants in increasingly complex scenarios and present KOIOS-VRS, a pipeline that automates the conversion of VCF files into an OMOP-compatible format.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Bioinformatics: 1061 papers in training set; Top 1%; 22.4%
2. Genome Medicine: 154 papers in training set; Top 0.3%; 12.6%
3. Nucleic Acids Research: 1128 papers in training set; Top 2%; 8.4%
4. Database: 51 papers in training set; Top 0.1%; 7.1%

50% of probability mass above

5. BMC Bioinformatics: 383 papers in training set; Top 1%; 6.8%
6. PLOS ONE: 4510 papers in training set; Top 36%; 3.9%
7. GigaScience: 172 papers in training set; Top 0.5%; 3.6%
8. NAR Genomics and Bioinformatics: 214 papers in training set; Top 0.7%; 3.6%
9. Bioinformatics Advances: 184 papers in training set; Top 2%; 2.7%
10. PLOS Computational Biology: 1633 papers in training set; Top 12%; 2.7%
11. Computational and Structural Biotechnology Journal: 216 papers in training set; Top 3%; 2.1%
12. Briefings in Bioinformatics: 326 papers in training set; Top 3%; 1.9%
13. iScience: 1063 papers in training set; Top 16%; 1.7%
14. Nature Communications: 4913 papers in training set; Top 55%; 1.3%
15. JCO Clinical Cancer Informatics: 18 papers in training set; Top 0.6%; 1.3%
16. Genomics, Proteomics & Bioinformatics: 171 papers in training set; Top 4%; 1.3%
17. Genome Research: 409 papers in training set; Top 3%; 1.2%
18. BMC Medical Genomics: 36 papers in training set; Top 0.7%; 1.2%
19. Frontiers in Genetics: 197 papers in training set; Top 7%; 1.2%
20. Genome Biology: 555 papers in training set; Top 6%; 1.2%
21. Scientific Reports: 3102 papers in training set; Top 69%; 0.9%
22. European Journal of Human Genetics: 49 papers in training set; Top 1.0%; 0.9%