
An in silico framework for evaluating PRS-guided prognostic enrichment in clinical trial design

Cai, R.; Gillard, J.; Yang, S.; Gasparyan, S. B.; Lu, Y.; Tian, L.; Vedin, O.; Ashley, E. A.; Rivas, M. A.; O'Sullivan, J. W.

2026-03-24 genetic and genomic medicine
10.64898/2026.03.21.26348974 medRxiv

Background: Clinical trials are essential for therapeutic development but increasingly face challenges from imprecise inclusion criteria, which lead to low event rates and large required sample sizes. This inefficiency makes modern trials costly and time-consuming. Despite the availability of extensive clinical, genomic, and biological data, current enrollment strategies do not fully leverage this information. Incorporating genomic information into trial design could enable risk-based participant enrichment by preferentially enrolling individuals at higher disease risk, thereby increasing event rates and improving trial efficiency.

Methods: We developed an in silico framework for evaluating prognostic enrichment guided by polygenic risk scores (PRS) in clinical trial design, using genomic and electronic health record data from large-scale biobanks. Naturally occurring protective genetic variants served as analogs of therapeutic interventions, with variant carriers treated as 'treatment' arms and non-carriers as 'control' arms. We compared unenriched designs, in which carriers and non-carriers were drawn from the full population, against PRS-enriched designs in which both arms were restricted to participants in the upper 75%, 50%, or 25% of the PRS distribution. Across these four designs, we quantified disease prevalence, statistical power, sample size requirements, and time-to-event accrual.

Results: We applied this approach to the UK Biobank using three model gene-disease pairs: the protective variant p.Arg46Leu in PCSK9 for coronary artery disease (CAD), p.Gln175His in ANGPTL7 for glaucoma, and p.Arg381Gln in IL23R for inflammatory bowel disease (IBD). Across all three disease contexts, PRS-enriched designs increased disease prevalence, improved empirical power, and accelerated event accrual relative to unenriched cohorts. At 80% power, restricting enrollment to the upper 25% of the PRS distribution reduced required per-arm sample sizes by approximately 60% for CAD-PCSK9 and 78% for IBD-IL23R. Consistent reductions in time-to-event were also observed across enriched strata, suggesting that PRS-enriched trials could reach target event counts with both smaller sample sizes and shorter follow-up. For glaucoma-ANGPTL7, however, the most restrictive threshold did not yield additional gains over moderate enrichment, as reduced sample size attenuated the detectable difference between arms. These results highlight the need to balance enrichment for higher-risk participants against retaining a sufficient eligible population, and underscore that optimal PRS thresholds depend on disease context.

Conclusions: These findings establish a generalizable, data-driven framework for prospectively evaluating PRS-guided prognostic enrichment before trial initiation. In general, PRS-guided designs improved empirical power, lowered required sample sizes, and shortened trials. As population-scale genomic data become increasingly available within healthcare systems and biobanks, this framework provides a scalable foundation for integrating genetic risk information into clinical trial design.
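The link between a higher control-arm event rate and a smaller required sample size can be illustrated with a textbook power calculation. The sketch below is not the authors' framework, which estimates empirical power from biobank data. It uses the standard unpooled normal approximation for a two-sided two-proportion test, and the event rates, relative risk, and function name are hypothetical values chosen for illustration only. It also assumes the relative effect is constant across PRS strata, which may not hold in practice.

```python
from math import ceil
from statistics import NormalDist


def per_arm_sample_size(control_rate, relative_risk, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided two-proportion z-test (unpooled normal approximation)."""
    treated_rate = control_rate * relative_risk
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    variance = control_rate * (1 - control_rate) + treated_rate * (1 - treated_rate)
    effect = control_rate - treated_rate
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical control-arm event rates per design; NOT taken from the study.
scenarios = {"unenriched": 0.05, "top 50% PRS": 0.07, "top 25% PRS": 0.10}

# Assumed constant relative risk of 0.7 in every stratum (illustrative only).
sizes = {
    name: per_arm_sample_size(rate, relative_risk=0.7)
    for name, rate in scenarios.items()
}

for name, n in sizes.items():
    print(f"{name}: {n} participants per arm")
```

Under these assumptions the required per-arm sample size falls as the control-arm event rate rises. The calculation ignores the shrinking pool of eligible participants at stricter thresholds, which is the trade-off the glaucoma-ANGPTL7 result points to.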

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1 | Trials | 25 papers in training set | Top 0.1% | 28.7%
2 | Circulation: Genomic and Precision Medicine | 42 papers in training set | Top 0.2% | 8.7%
3 | BMC Medicine | 163 papers in training set | Top 0.7% | 5.0%
4 | The American Journal of Human Genetics | 206 papers in training set | Top 1% | 3.7%
5 | Genetics in Medicine | 69 papers in training set | Top 0.4% | 3.7%
6 | Genome Medicine | 154 papers in training set | Top 2% | 3.7%
50% of probability mass above
7 | PLOS ONE | 4510 papers in training set | Top 43% | 3.0%
8 | Circulation | 66 papers in training set | Top 1% | 2.8%
9 | Scientific Reports | 3102 papers in training set | Top 44% | 2.7%
10 | Journal of the American Medical Informatics Association | 61 papers in training set | Top 1% | 2.2%
11 | Nature Communications | 4913 papers in training set | Top 46% | 2.2%
12 | npj Digital Medicine | 97 papers in training set | Top 2% | 1.5%
13 | Clinical and Translational Science | 21 papers in training set | Top 0.5% | 1.4%
14 | Journal of Translational Medicine | 46 papers in training set | Top 1% | 1.3%
15 | PLOS Medicine | 98 papers in training set | Top 3% | 1.1%
16 | BMC Medical Genomics | 36 papers in training set | Top 0.8% | 1.0%
17 | International Journal of Epidemiology | 74 papers in training set | Top 2% | 1.0%
18 | Nature Human Behaviour | 85 papers in training set | Top 3% | 0.9%
19 | Genetic Epidemiology | 46 papers in training set | Top 0.7% | 0.8%
20 | Human Genomics | 21 papers in training set | Top 0.3% | 0.8%
21 | Human Genetics and Genomics Advances | 70 papers in training set | Top 0.7% | 0.8%
22 | Frontiers in Genetics | 197 papers in training set | Top 9% | 0.8%
23 | JAMA | 17 papers in training set | Top 0.3% | 0.8%
24 | Nature | 575 papers in training set | Top 15% | 0.7%
25 | PLOS Computational Biology | 1633 papers in training set | Top 25% | 0.7%
26 | Human Molecular Genetics | 130 papers in training set | Top 3% | 0.7%
27 | BMC Medical Informatics and Decision Making | 39 papers in training set | Top 3% | 0.7%
28 | The Lancet Rheumatology | 11 papers in training set | Top 0.3% | 0.5%
29 | European Journal of Human Genetics | 49 papers in training set | Top 2% | 0.5%
30 | Journal of the American College of Cardiology | 12 papers in training set | Top 0.8% | 0.5%