Development and validation of polygenic scores for within-family prediction of disease risks

Moore, S.; Davidson, I.; Anomaly, J.; Li, J. H.; Ahangari, M.; Moissiy, L.; Christensen, M.; Young, A. S.; Stern, D.; Wolfram, T.

2025-08-08 genetic and genomic medicine
10.1101/2025.08.06.25333145 medRxiv
The clinical implementation of polygenic scores (PGSs) for disease risk prediction, particularly in reproductive health applications, requires rigorous validation. Here, we develop seventeen disease PGSs by conducting large-scale GWAS meta-analyses, and we validate our scores in out-of-sample prediction analyses. We achieve state-of-the-art predictive performance, consistently matching or outperforming academic and commercial benchmarks, with liability R² reaching up to 0.21 (type 2 diabetes). The performance of a PGS for embryo screening depends on its predictive ability within-family, which can be lower than its predictive ability among unrelated individuals. However, very few disease PGSs have been tested within-family. We perform systematic within-family validation of our disease PGSs, finding no decrease in predictive performance within-family for 16 of 17 scores. PGS performance typically declines with genetic distance from training data, an effect that needs to be accounted for to give properly calibrated predictions across ancestries. We perform extensive calibration of our scores' performance across different ancestries, finding improved cross-ancestry performance compared to previous approaches, especially in African and East Asian populations. This is likely because our scores are constructed using a method that incorporates functional genomic annotations on more than 7 million variants, enabling a degree of fine-mapping of causal variants shared across ancestries. We illustrate clinical utility by examining the risk reduction that could be achieved through embryo screening for type 2 diabetes: selecting among 10 embryos is expected to reduce absolute disease risk by 12-20% in families where both parents are affected, with similar relative risk reductions across ancestries. These findings establish a framework for implementing PGS in reproductive medicine while demonstrating both the technology's potential for disease prevention and the methodological standards required for responsible clinical translation.
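
To make the embryo-screening claim more concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes a standard liability threshold model in which the score explains a fraction R² of liability variance and sibling embryos vary around the mid-parent score with half of that variance. The function name `embryo_selection_risk`, the 10% prevalence, and the simulation settings are hypothetical choices made for illustration; only the R² of 0.21 and the choice among 10 embryos come from the abstract. The sketch considers random families and does not condition on parental disease status, so it is not expected to reproduce the 12-20% figure reported for families where both parents are affected.

```python
import random
from statistics import NormalDist


def embryo_selection_risk(r2, prevalence, n_embryos, n_families=20000, seed=0):
    """Average disease risk for a random embryo vs. the lowest-score embryo.

    Assumed model (illustrative only): liability = score + residual, with
    total variance 1. Half of the score variance is shared within a family
    (mid-parent score) and half varies among sibling embryos.
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    # Liability threshold above which an individual is affected.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    sd_half = (r2 / 2.0) ** 0.5      # SD of each half of the score variance
    sd_resid = (1.0 - r2) ** 0.5     # SD of liability not captured by the score

    def risk(score):
        # P(score + residual > threshold) given the embryo's score.
        return 1.0 - std_normal.cdf((threshold - score) / sd_resid)

    baseline_total = 0.0
    selected_total = 0.0
    for _ in range(n_families):
        midparent = rng.gauss(0.0, sd_half)
        scores = [midparent + rng.gauss(0.0, sd_half) for _ in range(n_embryos)]
        baseline_total += risk(scores[0])    # embryo chosen without screening
        selected_total += risk(min(scores))  # embryo with the lowest score
    return baseline_total / n_families, selected_total / n_families


if __name__ == "__main__":
    # R² = 0.21 is taken from the abstract; prevalence = 0.10 is an assumption.
    baseline, selected = embryo_selection_risk(r2=0.21, prevalence=0.10, n_embryos=10)
    print(f"risk without screening: {baseline:.3f}")
    print(f"risk after selecting lowest score: {selected:.3f}")
    print(f"absolute risk reduction: {baseline - selected:.3f}")
```

Under these assumptions the baseline risk should sit close to the assumed prevalence, and the gap between the two averages is the absolute risk reduction. Halving the score variance within families is what makes within-family validation matter: if a score predicts less well between siblings than between unrelated people, the achievable reduction shrinks accordingly.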

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

1. The American Journal of Human Genetics; 206 papers in training set; Top 0.1%; 44.2%
2. Human Genetics and Genomics Advances; 70 papers in training set; Top 0.1%; 7.2%
(50% of probability mass above)
3. Cell Genomics; 162 papers in training set; Top 0.4%; 6.8%
4. PLOS Genetics; 756 papers in training set; Top 2%; 6.8%
5. Genome Medicine; 154 papers in training set; Top 1%; 4.6%
6. Nature Communications; 4913 papers in training set; Top 42%; 3.3%
7. Scientific Reports; 3102 papers in training set; Top 48%; 2.2%
8. Nature Genetics; 240 papers in training set; Top 4%; 2.0%
9. Human Molecular Genetics; 130 papers in training set; Top 2%; 1.6%
10. Nature Human Behaviour; 85 papers in training set; Top 2%; 1.6%
11. Genetic Epidemiology; 46 papers in training set; Top 0.5%; 1.4%
12. Cell Reports Medicine; 140 papers in training set; Top 5%; 1.3%
13. Nature Medicine; 117 papers in training set; Top 4%; 0.9%
14. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 42%; 0.8%
15. European Journal of Human Genetics; 49 papers in training set; Top 1%; 0.8%
16. Frontiers in Genetics; 197 papers in training set; Top 9%; 0.8%
17. Bioinformatics; 1061 papers in training set; Top 9%; 0.8%
18. Journal of the American Medical Informatics Association; 61 papers in training set; Top 2%; 0.7%
19. Circulation: Genomic and Precision Medicine; 42 papers in training set; Top 1%; 0.7%
20. Human Genetics; 25 papers in training set; Top 0.5%; 0.5%
21. Genetics in Medicine; 69 papers in training set; Top 1%; 0.5%
22. iScience; 1063 papers in training set; Top 39%; 0.5%
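
The statement that the top 2 journals account for 50% of the predicted probability mass can be checked directly from the listed percentages. The short sketch below does this; the function name `journals_to_reach` is a hypothetical helper, and the values are the per-journal probabilities exactly as listed above, in rank order.

```python
def journals_to_reach(probabilities, target):
    """Return (count, cumulative) for the first prefix whose sum reaches target.

    Returns None if the listed probabilities never reach the target.
    """
    cumulative = 0.0
    for count, p in enumerate(probabilities, start=1):
        cumulative += p
        if cumulative >= target:
            return count, cumulative
    return None


# Predicted probabilities (percent) for ranks 1 to 22, as listed.
listed_probabilities = [
    44.2, 7.2, 6.8, 6.8, 4.6, 3.3, 2.2, 2.0, 1.6, 1.6, 1.4,
    1.3, 0.9, 0.8, 0.8, 0.8, 0.8, 0.7, 0.7, 0.5, 0.5, 0.5,
]

if __name__ == "__main__":
    result = journals_to_reach(listed_probabilities, target=50.0)
    print(result)
```

The first journal alone falls short of 50%, and adding the second crosses it, which is why the divider sits between ranks 2 and 3.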