Using summary statistics to evaluate the genetic architecture of multiplicative combinations of initially analyzed phenotypes with a flexible choice of covariates

Wolf, J. M.; Westra, J.; Tintle, N.

2021-03-09 genetics
10.1101/2021.03.08.433979 bioRxiv
While the promise of electronic medical record and biobank data is large, major questions remain about patient privacy, computational hurdles, and data access. One promising area of recent development is pre-computing non-individually identifiable summary statistics to be made publicly available for exploration and downstream analysis. In this manuscript we demonstrate how to use pre-computed linear association statistics between individual genetic variants and phenotypes to infer genetic relationships between products of phenotypes (e.g., ratios; logical combinations of binary phenotypes using "and" and "or") with customized covariate choices. We propose a method to approximate covariate-adjusted linear models for products and logical combinations of phenotypes using only pre-computed summary statistics. We evaluate our method's accuracy through several simulation studies and an application modeling various fatty acid ratios using data from the Framingham Heart Study. These studies show that the method consistently recapitulates the results of analyses performed on individual-level data, including maintenance of the Type I error rate, power, and effect size estimates. An implementation of the proposed method is available in the public R package pcsstools.
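
The idea of fitting a covariate-adjusted linear model from summary statistics alone rests on a standard fact: ordinary least squares coefficients depend on the data only through means, variances, and covariances. The following is a minimal Python sketch of that fact for one genetic variant and one covariate. It is not the pcsstools API and does not show the paper's approximation for products or logical combinations of phenotypes. The function names (`summarize`, `ols_from_summary`) and the toy data are hypothetical, chosen only for illustration.

```python
from statistics import mean


def cov(a, b):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def summarize(y, g, c):
    """Reduce individual-level data to means and covariances (hypothetical helper)."""
    return {
        "mean": {"y": mean(y), "g": mean(g), "c": mean(c)},
        "cov": {
            "gg": cov(g, g), "cc": cov(c, c), "gc": cov(g, c),
            "gy": cov(g, y), "cy": cov(c, y),
        },
    }


def ols_from_summary(stats):
    """Fit y ~ 1 + g + c from summary statistics by solving the 2x2 normal equations."""
    m, s = stats["mean"], stats["cov"]
    det = s["gg"] * s["cc"] - s["gc"] ** 2
    beta_g = (s["cc"] * s["gy"] - s["gc"] * s["cy"]) / det
    beta_c = (s["gg"] * s["cy"] - s["gc"] * s["gy"]) / det
    intercept = m["y"] - beta_g * m["g"] - beta_c * m["c"]
    return intercept, beta_g, beta_c


# Toy data invented for illustration: genotype dosage g, covariate c,
# and a noise-free phenotype y = 1 + 2*g + 3*c.
g = [0, 1, 2, 0, 1, 2, 1, 0]
c = [0.5, 1.0, 0.2, 1.5, 0.3, 0.9, 1.1, 0.7]
y = [1.0 + 2.0 * gi + 3.0 * ci for gi, ci in zip(g, c)]

# Only the summary statistics are passed to the fitting step.
intercept, beta_g, beta_c = ols_from_summary(summarize(y, g, c))
print(intercept, beta_g, beta_c)
```

Because the toy phenotype is noise-free, the fit from summary statistics recovers the generating coefficients (1, 2, 3) up to floating-point error, the same values an individual-level regression would return. Standard errors would additionally require the sample size and the variance of y, which this sketch omits.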

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Genetic Epidemiology (46 papers in training set; Top 0.1%): 22.8%
2. The American Journal of Human Genetics (206 papers in training set; Top 0.3%): 14.5%
3. PLOS Genetics (756 papers in training set; Top 1%): 8.5%
4. PLOS Computational Biology (1633 papers in training set; Top 5%): 6.5%
50% of probability mass above
5. Bioinformatics (1061 papers in training set; Top 4%): 4.9%
6. Journal of the American Medical Informatics Association (61 papers in training set; Top 0.6%): 4.2%
7. Frontiers in Genetics (197 papers in training set; Top 2%): 4.0%
8. PLOS ONE (4510 papers in training set; Top 38%): 3.6%
9. BMC Bioinformatics (383 papers in training set; Top 3%): 2.4%
10. Human Genetics and Genomics Advances (70 papers in training set; Top 0.2%): 2.1%
11. Nature Communications (4913 papers in training set; Top 53%): 1.5%
12. Genome Research (409 papers in training set; Top 3%): 1.5%
13. iScience (1063 papers in training set; Top 19%): 1.3%
14. GENETICS (189 papers in training set; Top 0.8%): 1.3%
15. Bioinformatics Advances (184 papers in training set; Top 3%): 1.3%
16. Human Molecular Genetics (130 papers in training set; Top 2%): 1.3%
17. G3 Genes|Genomes|Genetics (351 papers in training set; Top 2%): 1.2%
18. Scientific Reports (3102 papers in training set; Top 66%): 1.2%
19. International Journal of Epidemiology (74 papers in training set; Top 2%): 1.0%
20. Briefings in Bioinformatics (326 papers in training set; Top 6%): 0.9%
21. Biostatistics (21 papers in training set; Top 0.1%): 0.8%
22. Biometrics (22 papers in training set; Top 0.2%): 0.8%
23. Journal of Computational Biology (37 papers in training set; Top 0.5%): 0.8%
24. IEEE/ACM Transactions on Computational Biology and Bioinformatics (32 papers in training set; Top 0.7%): 0.7%
25. Genome Biology (555 papers in training set; Top 8%): 0.7%
26. Statistics in Medicine (34 papers in training set; Top 0.4%): 0.7%
27. Communications Biology (886 papers in training set; Top 32%): 0.5%
28. Genetics (225 papers in training set; Top 5%): 0.5%
29. eLife (5422 papers in training set; Top 63%): 0.5%