
Evaluation of phenotyping errors on polygenic risk score predictions

Li, R.; Tong, J.; Duan, R.; Chen, Y.; Moore, J. H.

2019-08-04 genomics
10.1101/724534 bioRxiv

Accurate disease risk prediction is essential in healthcare for providing personalized disease prevention and treatment strategies, not only to patients but also to the general population. In addition to demographic and environmental factors, advances in genomic research have revealed that genetics plays an important role in determining susceptibility to disease. However, for most complex diseases, individual genetic variants are only weakly to moderately associated with the disease and are therefore not clinically informative for determining disease risk. Nevertheless, recent findings suggest that the combined effect of multiple disease-associated variants, summarized as a polygenic risk score (PRS), can stratify disease risk to a degree similar to that of rare monogenic mutations. The development of PRS provides a promising tool for evaluating the genetic contribution to disease risk; however, the quality of the risk prediction depends on many factors, including the precision of the target phenotypes. In this study, we evaluated the impact of phenotyping errors on the accuracy of PRS risk prediction. We used Electronic Medical Records and Genomics (eMERGE) Network data to simulate various types of disease phenotypes. For each phenotype, we quantified the impact of phenotyping errors generated by differential and non-differential mechanisms by comparing the prediction accuracy of PRS on independent testing data. Our results showed that the rate of accuracy degradation depended on both the phenotype and the mechanism of phenotyping error.
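The distinction between the two error mechanisms can be illustrated with a minimal sketch. The abstract does not describe the authors' simulation design, so everything here is an assumption for illustration: the function names, the binary `risk_group` flag, the prevalences, and the flip rates are all hypothetical. The sketch uses the usual epidemiological reading, in which non-differential error flips labels at the same rate for everyone, while differential error flips them at a rate that depends on another variable.

```python
import random


def add_phenotyping_error(labels, risk_group, rate_low, rate_high, seed=0):
    """Return a copy of binary labels with some entries flipped.

    rate_low / rate_high are the flip probabilities for individuals whose
    (hypothetical) risk_group flag is 0 / 1. Equal rates give
    non-differential error; unequal rates give differential error.
    """
    rng = random.Random(seed)
    noisy = []
    for y, g in zip(labels, risk_group):
        rate = rate_high if g == 1 else rate_low
        noisy.append(1 - y if rng.random() < rate else y)
    return noisy


def flip_fraction(original, noisy, risk_group, group):
    """Fraction of labels that changed among individuals in one group."""
    idx = [i for i, g in enumerate(risk_group) if g == group]
    return sum(original[i] != noisy[i] for i in idx) / len(idx)


# Toy cohort: an assumed binary risk flag and a phenotype whose
# prevalence is higher in the flagged group (values are illustrative).
rng = random.Random(42)
n = 10_000
risk_group = [rng.randint(0, 1) for _ in range(n)]
labels = [1 if rng.random() < (0.3 if g else 0.1) else 0 for g in risk_group]

# Same flip rate in both groups versus a group-dependent flip rate.
non_diff = add_phenotyping_error(labels, risk_group, 0.10, 0.10, seed=1)
diff = add_phenotyping_error(labels, risk_group, 0.02, 0.20, seed=1)

for name, noisy in [("non-differential", non_diff), ("differential", diff)]:
    low = flip_fraction(labels, noisy, risk_group, 0)
    high = flip_fraction(labels, noisy, risk_group, 1)
    print(f"{name}: flipped {low:.3f} in group 0, {high:.3f} in group 1")
```

In a fuller experiment, the noisy labels would replace the true ones when fitting or evaluating a PRS, and prediction accuracy would be compared across error rates; that step is omitted here because it depends on genotype data and modeling choices the abstract does not specify.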

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1 | BMC Medical Genomics | 36 papers in training set | Top 0.1% | 12.2%
2 | Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 0.7% | 9.1%
3 | Journal of Personalized Medicine | 28 papers in training set | Top 0.1% | 7.1%
4 | Frontiers in Genetics | 197 papers in training set | Top 0.7% | 6.7%
5 | Journal of Genetics and Genomics | 36 papers in training set | Top 0.1% | 6.3%
6 | Bioinformatics | 1061 papers in training set | Top 4% | 6.3%
7 | PLOS Computational Biology | 1633 papers in training set | Top 7% | 4.8%
50% of probability mass above
8 | BioData Mining | 15 papers in training set | Top 0.1% | 4.3%
9 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 2% | 3.6%
10 | Scientific Reports | 3102 papers in training set | Top 38% | 3.6%
11 | Briefings in Bioinformatics | 326 papers in training set | Top 2% | 3.0%
12 | Genetic Epidemiology | 46 papers in training set | Top 0.3% | 2.7%
13 | IEEE Transactions on Computational Biology and Bioinformatics | 17 papers in training set | Top 0.1% | 2.6%
14 | Human Genetics | 25 papers in training set | Top 0.1% | 2.1%
15 | PLOS ONE | 4510 papers in training set | Top 49% | 2.1%
16 | Genome Medicine | 154 papers in training set | Top 5% | 1.6%
17 | Human Molecular Genetics | 130 papers in training set | Top 2% | 1.5%
18 | Human Mutation | 29 papers in training set | Top 0.5% | 1.1%
19 | PLOS Genetics | 756 papers in training set | Top 12% | 0.9%
20 | BMC Bioinformatics | 383 papers in training set | Top 6% | 0.9%
21 | European Journal of Human Genetics | 49 papers in training set | Top 1% | 0.8%
22 | Database | 51 papers in training set | Top 1% | 0.7%
23 | eLife | 5422 papers in training set | Top 58% | 0.7%
24 | iScience | 1063 papers in training set | Top 35% | 0.7%
25 | Human Genomics | 21 papers in training set | Top 0.4% | 0.7%
26 | Genes | 126 papers in training set | Top 4% | 0.6%
27 | IEEE Journal of Biomedical and Health Informatics | 34 papers in training set | Top 3% | 0.6%
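
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below is only an arithmetic check on the first ten values from the list; the function name `journals_to_reach` is a hypothetical helper, not part of any tool used to produce the ranking.

```python
# Predicted probabilities (in percent) for ranks 1-10, copied from the list.
probabilities = [12.2, 9.1, 7.1, 6.7, 6.3, 6.3, 4.8, 4.3, 3.6, 3.6]


def journals_to_reach(probs, threshold):
    """Return (count, cumulative mass) at the first rank reaching threshold."""
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None, total


count, mass = journals_to_reach(probabilities, 50.0)
print(f"{count} journals reach {mass:.1f}% of the probability mass")
```

The cumulative sum after six journals is 47.7%, and the seventh brings it to 52.5%, which matches the position of the 50% marker in the list.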