
Constructing a Literature-Derived Database for Benchmarking Polygenic Risk Score Construction Methods with Spectral Ranking Inferences

Sebastian, C.; Yu, M.; Jin, J.

2026-03-03 genetic and genomic medicine
10.64898/2026.03.01.26347258 medRxiv

Polygenic risk scores (PRSs) have emerged as a valuable tool for genetic risk prediction and stratification in human diseases. Over the past decade, extensive methodological efforts have focused on improving the predictive power of PRSs, leading to the development of numerous methods for PRS construction. Benchmarking these methods is therefore essential for guiding future PRS applications. While studies have benchmarked subsets of these methods on specific phenotypes and cohorts, the resulting evidence remains fragmented, and little work has comprehensively assessed the relative performance of the various PRS methods. In this study, we addressed this gap by systematically constructing a PRS method benchmarking database that synthesizes published results from 2009 to 2025. We applied a spectral ranking inference framework with uncertainty quantification to rank 14 PRS methods that had been adequately compared against each other in the literature. We constructed rankings using two complementary sources: original method-development studies and applications/benchmarking studies. While the highest-ranked methods (LDpred2 and AnnoPred) and the lowest-ranked method (C+T) were consistently identified from both sources, the relative ordering of most methods showed moderate variability. We further constructed phenotype-specific rankings, providing more detailed insights into the robustness and phenotype-specific strengths of individual methods. Collectively, the overall and phenotype-specific rankings of the PRS methods, along with the curated benchmarking data from the literature, provide a dynamic and practical reference database that can be continually updated as new PRS methods and benchmarking results are published, to guide future PRS applications.
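To make the idea of spectral ranking from pairwise comparisons concrete, here is a minimal sketch in the style of the Rank Centrality algorithm. It is not the authors' framework: it omits uncertainty quantification entirely, and the function name `spectral_rank`, the `wins` matrix layout, and the toy counts are all hypothetical choices made for illustration.

```python
def spectral_rank(wins, n_iter=10000, tol=1e-12):
    """Rank items from pairwise comparison counts (Rank Centrality style sketch).

    wins[i][j] = number of comparisons in which item i outperformed item j
    (hypothetical input layout). Returns stationary-distribution scores that
    sum to 1; a higher score means a higher rank.
    """
    n = len(wins)
    # Normalizer: the largest number of distinct opponents any item has faced.
    d_max = max(
        sum(1 for j in range(n) if j != i and wins[i][j] + wins[j][i] > 0)
        for i in range(n)
    )
    # Markov chain: from item i, move to j with probability proportional
    # to the fraction of i-vs-j comparisons that j won.
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                P[i][j] = (wins[j][i] / total) / d_max
        P[i][i] = 1.0 - sum(P[i])  # self-loop keeps each row summing to 1
    # Power iteration for the stationary distribution pi = pi P.
    pi = [1.0 / n] * n
    for _ in range(n_iter):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(new, pi)) < tol
        pi = new
        if converged:
            break
    return pi


# Toy example with three hypothetical methods A, B, C (made-up counts).
toy_wins = [
    [0, 7, 9],  # A beat B 7 times and C 9 times
    [3, 0, 6],  # B beat A 3 times and C 6 times
    [1, 4, 0],  # C beat A once and B 4 times
]
scores = spectral_rank(toy_wins)
ranking = sorted(range(len(scores)), key=lambda k: -scores[k])
```

In this toy input, `ranking` orders the items A, B, C. The sketch assumes the comparison graph is connected; with disconnected groups of methods the stationary distribution is not unique, which is one reason to restrict a ranking to methods that have been adequately compared against each other.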

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Genome Medicine: 154 papers in training set; Top 0.2%; 18.0%
2. Briefings in Bioinformatics: 326 papers in training set; Top 0.2%; 14.2%
3. Bioinformatics: 1061 papers in training set; Top 3%; 7.9%
4. Genome Biology: 555 papers in training set; Top 1%; 6.1%
5. The American Journal of Human Genetics: 206 papers in training set; Top 0.8%; 6.1%
50% of probability mass above
6. Nucleic Acids Research: 1128 papers in training set; Top 4%; 4.7%
7. Cell Genomics: 162 papers in training set; Top 1%; 3.8%
8. Nature Communications: 4913 papers in training set; Top 40%; 3.6%
9. Frontiers in Genetics: 197 papers in training set; Top 2%; 3.5%
10. Human Genetics: 25 papers in training set; Top 0.1%; 3.5%
11. Human Genetics and Genomics Advances: 70 papers in training set; Top 0.1%; 3.5%
12. NAR Genomics and Bioinformatics: 214 papers in training set; Top 2%; 1.6%
13. International Journal of Molecular Sciences: 453 papers in training set; Top 8%; 1.6%
14. Scientific Reports: 3102 papers in training set; Top 60%; 1.6%
15. PLOS Computational Biology: 1633 papers in training set; Top 18%; 1.4%
16. Genetic Epidemiology: 46 papers in training set; Top 0.5%; 1.4%
17. Genomics, Proteomics & Bioinformatics: 171 papers in training set; Top 5%; 1.2%
18. Communications Biology: 886 papers in training set; Top 16%; 1.1%
19. Bioinformatics Advances: 184 papers in training set; Top 4%; 0.9%
20. Human Genomics: 21 papers in training set; Top 0.3%; 0.9%
21. PLOS ONE: 4510 papers in training set; Top 69%; 0.7%
22. PLOS Genetics: 756 papers in training set; Top 16%; 0.7%
23. European Journal of Human Genetics: 49 papers in training set; Top 1%; 0.7%
24. Frontiers in Bioinformatics: 45 papers in training set; Top 1%; 0.7%
25. Computers in Biology and Medicine: 120 papers in training set; Top 6%; 0.6%
26. Genomics: 60 papers in training set; Top 3%; 0.6%