Functionally informed cis and trans proteome-wide association studies prioritize disease-critical genes

Hou, K.; Pazokitoroudi, A.; Strober, B.; Jiang, X.; Price, A. L.

2026-04-27 genetic and genomic medicine
10.64898/2026.04.24.26351667 medRxiv
Proteome-wide association studies (PWAS) typically link genetically predicted protein levels to disease using cis-pQTLs, which can be limited by low cis-heritability for disease-critical genes under negative selection and by tagging due to co-regulation among nearby genes. Trans-pQTLs provide complementary information when large sample sizes are available to detect weak polygenic effects, enabling associations between trans-predicted protein levels and disease. We developed PolyPWAS, a functionally informed, summary statistics-based framework for associating both cis- and trans-predicted protein levels to disease. PolyPWAS integrates 96 functional annotations with proteome-wide pleiotropy to improve protein prediction, while correcting for PCs of predicted protein levels to limit tagging effects. We applied PolyPWAS to 2.8K plasma proteins measured in 34K UKB-PPP participants, analyzing GWAS summary statistics for 88 diseases and complex traits (average N=336K). Trans-predicted protein levels explained 21% of disease heritability (vs. 9.6% for cis-predicted protein levels), leveraging a 24% relative improvement in trans-prediction accuracy from functional priors. Trans-PWAS identified more significant protein-disease associations (and more conditionally significant associations) than cis-PWAS. Cis and trans associations showed only modest excess overlap (1.18, 95% CI: 1.11-1.26). Accordingly, combining evidence from cis and trans associations improved disease gene prioritization evaluated using gene sets from rare variant association studies (+11% relative improvement) and PoPS (+7.0% relative improvement) relative to cis-only approaches. PWAS associations to disease replicated across protein level cohorts, with strong UKB-PPP/deCODE concordance after adjusting for cohort-specific prediction accuracy. We provide examples where trans-regulatory effects link multiple disease-critical genes, underscoring the importance of integrating cis- and trans-regulatory effects to map protein-mediated disease biology.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Cell Genomics: 162 papers in training set; Top 0.1%; 13.9%
2. Nature Communications: 4913 papers in training set; Top 12%; 13.9%
3. Nature Genetics: 240 papers in training set; Top 0.6%; 10.1%
4. The American Journal of Human Genetics: 206 papers in training set; Top 0.7%; 7.0%
5. Genome Biology: 555 papers in training set; Top 1%; 6.2%

50% of probability mass above

6. Nature: 575 papers in training set; Top 5%; 6.1%
7. Cell: 370 papers in training set; Top 4%; 4.7%
8. Cell Systems: 167 papers in training set; Top 3%; 4.2%
9. Genome Medicine: 154 papers in training set; Top 2%; 4.2%
10. Science: 429 papers in training set; Top 9%; 3.5%
11. Nature Neuroscience: 216 papers in training set; Top 4%; 2.0%
12. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 30%; 1.8%
13. eLife: 5422 papers in training set; Top 44%; 1.6%
14. Molecular Systems Biology: 142 papers in training set; Top 0.7%; 1.6%
15. Nature Immunology: 71 papers in training set; Top 1%; 1.6%
16. Nature Medicine: 117 papers in training set; Top 2%; 1.6%
17. Science Advances: 1098 papers in training set; Top 26%; 0.9%
18. Nature Metabolism: 56 papers in training set; Top 2%; 0.9%
19. Developmental Cell: 168 papers in training set; Top 11%; 0.9%
20. Cell Metabolism: 49 papers in training set; Top 2%; 0.7%
21. Nature Biomedical Engineering: 42 papers in training set; Top 2%; 0.7%
22. Neuron: 282 papers in training set; Top 10%; 0.6%
23. Nature Microbiology: 133 papers in training set; Top 5%; 0.6%
24. Nature Structural & Molecular Biology: 218 papers in training set; Top 6%; 0.6%
25. Cancer Discovery: 61 papers in training set; Top 2%; 0.6%
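
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked by summing the listed probabilities in rank order. The following is a minimal sketch, not the listing site's own method: the function name `journals_to_reach` and the variable `predictions` are hypothetical, and the values are copied from the first six entries of the list above.

```python
# (journal, predicted probability in percent), in ranked order.
# Values are the first six entries of the journal list; the names are illustrative.
predictions = [
    ("Cell Genomics", 13.9),
    ("Nature Communications", 13.9),
    ("Nature Genetics", 10.1),
    ("The American Journal of Human Genetics", 7.0),
    ("Genome Biology", 6.2),
    ("Nature", 6.1),
]


def journals_to_reach(ranked, threshold_pct):
    """Return (count, cumulative_pct) for the shortest top-ranked prefix
    whose summed probability reaches threshold_pct, or (None, total)
    if the threshold is never reached."""
    cumulative = 0.0
    for count, (_, pct) in enumerate(ranked, start=1):
        cumulative += pct
        if cumulative >= threshold_pct:
            return count, cumulative
    return None, cumulative


count, mass = journals_to_reach(predictions, 50.0)
print(count, round(mass, 1))
```

With these values the first four journals sum to 44.9%, so the fifth (Genome Biology) is the one that crosses the 50% mark, bringing the cumulative mass to about 51.1%.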