
From Genomics Data to Causality: An Integrated Pipeline for Mendelian Randomization

Sharma, J.; Jangale, V.; Swain, A. K.; Yadav, P.

2023-11-04 epidemiology
10.1101/2023.11.04.23298053 medRxiv

Background: Mendelian randomization (MR) has emerged as a valuable tool for causal inference in genetic epidemiology. Existing MR methods are affected by pleiotropy and are limited in comprehensiveness. Here, we introduce an integrated MR analysis pipeline designed for GWAS summary statistics data. Our pipeline integrates feature selection, harmonization, and checkpoint mechanisms to improve the accuracy and reliability of MR analysis.

Methods: In classical GWAS, the p-value threshold usually does not guarantee the identification of causal single-nucleotide polymorphisms (SNPs). In such cases, the t-statistic can serve as an important and robust criterion for identifying causal SNPs. Therefore, in this study, we computed the t-statistic for all independent SNPs that remained after linkage disequilibrium pruning. Next, prior to harmonization, we removed SNPs with a t-statistic below the average t-statistic value. Furthermore, our pipeline incorporates sensitivity analysis tests at each step to reduce the chance of directional pleiotropy.

Results and Conclusion: We applied our pipeline to single-sample and two-sample MR study designs, encompassing diverse populations and a wide range of diseases. Our results demonstrate superior performance compared to existing MR methods. In conclusion, our research presents an integrated MR analysis pipeline that significantly enhances the accuracy and reliability of MR studies. By outperforming existing methods and providing comprehensive validation, this pipeline represents a valuable tool for researchers in genetics and epidemiology.
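The filtering step described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define the t-statistic, so the sketch assumes the usual ratio of effect size to standard error (taken in absolute value), and the `Snp` record, the function names, and the example values are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Snp:
    """Hypothetical record for one SNP from GWAS summary statistics."""
    rsid: str
    beta: float  # estimated effect size
    se: float    # standard error of beta


def t_statistic(snp: Snp) -> float:
    # Assumption: t = |beta / se|; the abstract does not give the formula.
    return abs(snp.beta / snp.se)


def filter_by_mean_t(snps: list[Snp]) -> list[Snp]:
    """Keep SNPs whose t-statistic is at or above the mean t-statistic.

    The input is assumed to be already LD-pruned (independent SNPs).
    """
    if not snps:
        return []
    t_values = [t_statistic(s) for s in snps]
    threshold = mean(t_values)
    return [s for s, t in zip(snps, t_values) if t >= threshold]


# Made-up example values, for illustration only.
pruned = [
    Snp("rs_a", 0.10, 0.02),
    Snp("rs_b", 0.03, 0.02),
    Snp("rs_c", 0.08, 0.01),
    Snp("rs_d", 0.01, 0.02),
]
kept = filter_by_mean_t(pruned)
print([s.rsid for s in kept])
```

In this sketch the threshold is recomputed from whatever SNP set is passed in, so the outcome depends on which SNPs survive pruning; whether the pipeline treats ties or signs differently is not stated in the abstract.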

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. PLOS ONE (4510 papers in training set): Top 7%, 22.5%
2. Bioinformatics (1061 papers in training set): Top 3%, 8.4%
3. BMC Bioinformatics (383 papers in training set): Top 2%, 6.3%
4. BMC Research Notes (29 papers in training set): Top 0.1%, 4.8%
5. Genetic Epidemiology (46 papers in training set): Top 0.1%, 4.8%
6. BMC Medical Research Methodology (43 papers in training set): Top 0.2%, 4.8%

50% of probability mass above

7. PLOS Genetics (756 papers in training set): Top 3%, 4.3%
8. International Journal of Epidemiology (74 papers in training set): Top 0.4%, 4.3%
9. Scientific Reports (3102 papers in training set): Top 37%, 3.6%
10. Frontiers in Genetics (197 papers in training set): Top 3%, 2.1%
11. F1000Research (79 papers in training set): Top 1%, 2.1%
12. Gene (41 papers in training set): Top 0.7%, 1.8%
13. Statistics in Medicine (34 papers in training set): Top 0.2%, 1.7%
14. PeerJ (261 papers in training set): Top 7%, 1.7%
15. Epidemiology and Infection (84 papers in training set): Top 1%, 1.7%
16. PLOS Computational Biology (1633 papers in training set): Top 17%, 1.7%
17. BMC Genomics (328 papers in training set): Top 3%, 1.5%
18. IEEE/ACM Transactions on Computational Biology and Bioinformatics (32 papers in training set): Top 0.3%, 1.3%
19. JMIR Public Health and Surveillance (45 papers in training set): Top 3%, 0.8%
20. European Journal of Epidemiology (40 papers in training set): Top 0.8%, 0.7%
21. IEEE Journal of Biomedical and Health Informatics (34 papers in training set): Top 2%, 0.7%
22. BMC Medical Informatics and Decision Making (39 papers in training set): Top 3%, 0.7%
23. Genomics, Proteomics & Bioinformatics (171 papers in training set): Top 7%, 0.6%
24. Briefings in Bioinformatics (326 papers in training set): Top 7%, 0.6%