An in silico framework for evaluating PRS-guided prognostic enrichment in clinical trial design
Cai, R.; Gillard, J.; Yang, S.; Gasparyan, S. B.; Lu, Y.; Tian, L.; Vedin, O.; Ashley, E. A.; Rivas, M. A.; O'Sullivan, J. W.
Background: Clinical trials are essential for therapeutic development but increasingly face challenges due to imprecise inclusion criteria, leading to low event rates and the need for large sample sizes. This inefficiency makes modern trials costly and time-consuming. Despite the availability of extensive clinical, genomic, and biological data, current trial enrollment strategies do not fully leverage this information. Incorporating genomic information into trial design could enable risk-based participant enrichment by preferentially enrolling individuals with higher disease risk, thereby increasing event rates and improving trial efficiency.

Methods: In this study, we developed an in silico framework for evaluating prognostic enrichment guided by polygenic risk scores (PRS) in clinical trial design using genomic and electronic health record data from large-scale biobanks. Naturally occurring protective genetic variants were used as analogs of therapeutic interventions, with variant carriers treated as 'treatment' arms and non-carriers as 'control' arms. We compared unenriched designs, in which carriers and non-carriers were drawn from the full population, against PRS-enriched designs in which both arms were restricted to participants in the upper 75%, 50%, or 25% of the PRS distribution. Across these four designs, we quantified disease prevalence, statistical power, sample size requirements, and time-to-event accrual.

Results: We applied this approach to the UK Biobank using three model gene-disease pairs: the protective variant p.Arg46Leu in PCSK9 for coronary artery disease (CAD), p.Gln175His in ANGPTL7 for glaucoma, and p.Arg381Gln in IL23R for inflammatory bowel disease (IBD). Across all three disease contexts, PRS-enriched designs increased disease prevalence, improved empirical power, and accelerated event accrual relative to unenriched cohorts. At 80% power, restricting enrollment to the upper 25% of the PRS distribution reduced required per-arm sample sizes by approximately 60% for CAD-PCSK9 and 78% for IBD-IL23R. Consistent reductions in time-to-event were also observed across enriched strata, suggesting that PRS-enriched trials could achieve target event counts with both smaller sample sizes and shorter follow-up. However, for glaucoma-ANGPTL7, the most restrictive threshold did not yield additional gains over moderate enrichment, as reduced sample size attenuated the detectable difference between arms. These results highlight the need to balance enrichment for higher-risk participants against retaining a sufficient eligible population, and underscore that optimal PRS thresholds are disease-context dependent.

Conclusions: These findings establish a generalizable, data-driven framework for prospectively evaluating PRS-guided prognostic enrichment prior to trial initiation. In general, PRS-guided study designs lead to improved empirical power, lower required sample sizes, and faster trials. As population-scale genomic data become increasingly available within healthcare systems and biobanks, this framework provides a scalable foundation for integrating genetic risk information into clinical trial design.
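The link between enrichment and sample size can be illustrated with a minimal sketch. It is not the authors' framework, which estimates power empirically from biobank data; it instead assumes a standard two-sided two-proportion z-test under the normal approximation. The control-arm event rates per design and the relative risk below are hypothetical placeholders, not values from the study, and `per_arm_sample_size` is an illustrative helper name.

```python
from math import ceil, sqrt
from statistics import NormalDist


def per_arm_sample_size(p_control, relative_risk, alpha=0.05, power=0.80):
    """Per-arm n for a two-sided two-proportion z-test (normal approximation)."""
    p_treat = p_control * relative_risk
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treat) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_treat * (1 - p_treat))
    ) ** 2
    return ceil(numerator / (p_control - p_treat) ** 2)


# HYPOTHETICAL control-arm event rates for each design (not study results).
designs = {"unenriched": 0.05, "top 75%": 0.06, "top 50%": 0.075, "top 25%": 0.10}
RELATIVE_RISK = 0.7  # HYPOTHETICAL protective effect, held constant across designs

sizes = {name: per_arm_sample_size(p, RELATIVE_RISK) for name, p in designs.items()}
baseline = sizes["unenriched"]
for name, n in sizes.items():
    print(f"{name:>10}: n per arm = {n:5d} ({1 - n / baseline:.0%} reduction)")
```

Under these assumptions, a higher baseline event rate lowers the required per-arm sample size when the relative effect is held fixed. The sketch does not capture the trade-off reported for glaucoma-ANGPTL7, where a stricter threshold shrinks the eligible pool; modeling that would require the actual number of carriers and non-carriers available in each PRS stratum.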