Incorporating discovery and replication GWAS into summary data Mendelian randomization studies: A review of current methods and a simple, general and powerful alternative

Mounier, N.; Robertson, D. S.; Kutalik, Z.; Dudbridge, F.; Bowden, J.

2023-01-13 genetics
10.1101/2023.01.12.523708 bioRxiv

Mendelian Randomization (MR) is a popular method for using genetics to estimate the causal effect of a modifiable exposure on a health outcome. Single Nucleotide Polymorphisms (SNPs) are typically selected for inclusion if they pass a genome-wide significance threshold, in order to guarantee that they are strong genetic instruments, but this also induces Winner's curse, as SNP-exposure associations tend to be overestimated. In this paper, we consider how to combine SNP-exposure data from discovery and replication samples using two-sample and three-sample approaches to best account for Winner's curse, weak instrument bias, and pleiotropy within a summary data MR framework, using only GWAS summary statistics. After reviewing several existing methods, which often correct for Winner's curse at the individual SNP level, we propose a simple alternative based on the technique of regression calibration that enacts a global correction to the causal effect estimate directly. This approach not only corrects for Winner's curse, but also simultaneously accounts for weak instrument bias. Regression calibration can be used with a wide range of existing MR methods, including pleiotropy-robust methods such as median-based and mode-based estimators. Extensive simulations and real data examples are used to illustrate the utility of the new approach. Software is provided for users to implement the method in practice.

Author Summary

Mendelian randomization is a method to explore causation in health research which exploits the random inheritance of genes from parents to offspring as a natural experiment. It attempts to quantify the effect of intervening on and modifying a health exposure, such as a person's body mass, on a downstream outcome such as blood pressure. Causal estimates obtained using this method can be strongly influenced by the set of genes used, or more specifically, the rationale used to select them. For example, selecting only genes that are strongly associated with the health exposure can induce bias due to the Winner's curse. Unfortunately, using genes with a small association can lead to so-called weak instrument bias, creating a no-win paradox. In this paper, we present a novel approach based on the technique of regression calibration to de-bias causal estimates in an MR study. Our approach relies on the use of two independent samples for the exposure (discovery and replication) to estimate the amount of bias that is expected for a specific set of genes, so that causal estimates can be re-calibrated accordingly. We use extensive simulations and applied examples to compare our approach to current methods and provide software for researchers to implement our approach in future studies.
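
The idea of a global correction from discovery and replication samples can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' software: it assumes a three-sample design (independent discovery, replication, and outcome GWAS), no pleiotropy, equal standard errors for every SNP, and an inverse-variance-weighted (IVW) regression through the origin. The function names (`ivw_slope`, `calibrated_ivw`, `simulate`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def ivw_slope(x, y, se_y):
    """Inverse-variance-weighted regression of y on x through the origin."""
    w = 1.0 / se_y**2
    return np.sum(w * x * y) / np.sum(w * x**2)


def calibrated_ivw(bx_disc, se_disc, bx_rep, se_rep, by, se_by, z_threshold):
    """Return (naive estimate, calibration slope, calibrated estimate).

    Assumed reading of regression calibration: SNPs are selected on the
    discovery z-score, the outcome associations are regressed on the
    discovery associations (naive estimate), and the result is divided by
    the slope from regressing replication on discovery associations.
    """
    keep = np.abs(bx_disc / se_disc) > z_threshold
    naive = ivw_slope(bx_disc[keep], by[keep], se_by[keep])
    calibration = ivw_slope(bx_disc[keep], bx_rep[keep], se_rep[keep])
    return naive, calibration, naive / calibration


def simulate(seed=1, n_snps=20000, theta=0.5, sd_effect=0.02, se=0.01):
    """Simulate summary statistics with a true causal effect `theta`.

    All values here are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    beta_x = rng.normal(0.0, sd_effect, n_snps)        # true SNP-exposure effects
    se_vec = np.full(n_snps, se)                       # same SE for every SNP
    bx_disc = beta_x + rng.normal(0.0, se, n_snps)     # discovery GWAS
    bx_rep = beta_x + rng.normal(0.0, se, n_snps)      # replication GWAS
    by = theta * beta_x + rng.normal(0.0, se, n_snps)  # outcome GWAS
    return bx_disc, se_vec, bx_rep, se_vec, by, se_vec


if __name__ == "__main__":
    naive, calibration, corrected = calibrated_ivw(*simulate(), z_threshold=4.0)
    print(f"naive IVW:         {naive:.3f}")
    print(f"calibration slope: {calibration:.3f}")
    print(f"calibrated IVW:    {corrected:.3f}")
```

In this toy setting the naive estimate is shrunk toward zero because the selected discovery associations overstate the true SNP-exposure effects, while the calibration slope measures that same shrinkage from the replication data, so dividing by it moves the estimate back toward the simulated causal effect. Real analyses would also need to handle pleiotropy, unequal standard errors, and the uncertainty in the calibration slope, none of which this sketch attempts.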

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Genetic Epidemiology: 46 papers in training set, Top 0.1%, 23.1%
2. Bioinformatics: 1061 papers in training set, Top 3%, 10.7%
3. Statistics in Medicine: 34 papers in training set, Top 0.1%, 8.6%
4. PLOS Genetics: 756 papers in training set, Top 1%, 8.6%
50% of probability mass above
5. PLOS ONE: 4510 papers in training set, Top 21%, 8.6%
6. International Journal of Epidemiology: 74 papers in training set, Top 0.2%, 7.0%
7. BMC Bioinformatics: 383 papers in training set, Top 2%, 4.1%
8. PLOS Computational Biology: 1633 papers in training set, Top 9%, 4.0%
9. Frontiers in Genetics: 197 papers in training set, Top 2%, 3.2%
10. Biometrics: 22 papers in training set, Top 0.1%, 2.8%
11. The American Journal of Human Genetics: 206 papers in training set, Top 2%, 1.9%
12. European Journal of Epidemiology: 40 papers in training set, Top 0.5%, 1.3%
13. BMC Medical Research Methodology: 43 papers in training set, Top 0.8%, 1.3%
14. GENETICS: 189 papers in training set, Top 0.9%, 1.3%
15. Scientific Reports: 3102 papers in training set, Top 70%, 0.9%
16. Journal of the American Medical Informatics Association: 61 papers in training set, Top 2%, 0.8%
17. Human Brain Mapping: 295 papers in training set, Top 4%, 0.8%
18. American Journal of Epidemiology: 57 papers in training set, Top 1%, 0.8%
19. Human Molecular Genetics: 130 papers in training set, Top 4%, 0.7%
20. Journal of The Royal Society Interface: 189 papers in training set, Top 5%, 0.7%
21. European Journal of Human Genetics: 49 papers in training set, Top 1%, 0.7%
22. Biostatistics: 21 papers in training set, Top 0.1%, 0.7%
23. The Annals of Applied Statistics: 15 papers in training set, Top 0.1%, 0.7%
24. BMC Research Notes: 29 papers in training set, Top 0.9%, 0.5%
25. Mathematics: 11 papers in training set, Top 0.6%, 0.5%