Ancestry-specific performance of variant effect predictors in clinical variant classification
Hoffing, R.; Zeiberg, D.; Stenton, S. L.; Mort, M.; Cooper, D. N.; Hahn, M. W.; O'Donnell-Luria, A.; Ward, L. D.; Radivojac, P.
Predicting the effects of genetic variants and assessing prediction performance are key computational tasks in genomic medicine. It has been shown that well-calibrated variant effect predictors can be reliably used as evidence towards establishing pathogenicity (or benignity) of missense variants, thereby rendering these variants suitable for use in (or exclusion from) the genetic diagnosis of rare Mendelian conditions. However, most predictors have been trained or calibrated on data that may not be sufficiently representative to lead to similar performance across all genetic ancestries. This raises questions about the responsible deployment of these tools to improve human health. To better understand the utility of computational predictors, we set out to assess their ancestry-specific performance in terms of accuracy and evidence strength according to the ACMG/AMP guidelines. First, we determined that the expected count of rare variants in an individual's genome and the allele frequency distribution of these variants are the key confounders when evaluating a predictor's performance across different genetic ancestries. Second, we found that a predictor's accuracy itself inversely correlates with the allele frequency of the rare variant. After stratifying according to allele frequency, we show that established methods for predicting the pathogenicity of missense variants have comparable performance levels across major ancestry groups. Our results therefore support the wide deployment of such models in the context of genetic diagnosis and related applications.
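The stratification idea in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the bin edges in AF_EDGES, the record layout, the function names, and the toy records are all assumptions made for illustration. The point is only that accuracy is compared across ancestry groups within the same allele-frequency bin, so that differences in allele-frequency distribution do not masquerade as differences in predictor performance.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical allele-frequency bin edges; not taken from the paper.
AF_EDGES = [1e-5, 1e-4, 1e-3, 1e-2]


def af_bin(af, edges=AF_EDGES):
    """Return the index of the allele-frequency bin that contains `af`."""
    return bisect_right(edges, af)


def stratified_accuracy(records, edges=AF_EDGES):
    """Compute accuracy separately for each (AF bin, ancestry) pair.

    records: iterable of (ancestry, allele_frequency, predicted, truth).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for ancestry, af, predicted, truth in records:
        key = (af_bin(af, edges), ancestry)
        total[key] += 1
        correct[key] += int(predicted == truth)
    return {key: correct[key] / total[key] for key in total}


# Made-up records, used only to exercise the function.
toy = [
    ("A", 5e-6, 1, 1),
    ("A", 5e-6, 0, 1),
    ("B", 5e-6, 1, 1),
    ("B", 5e-4, 0, 0),
]
result = stratified_accuracy(toy)
```

Comparing result[(0, "A")] with result[(0, "B")] contrasts the two groups at matched allele frequency, whereas pooling all bins would mix in each group's allele-frequency distribution.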
Matching journals
The top 6 journals account for 50% of the predicted probability mass.