Personalized Feature Statistics: Individual-Level Variant Inference under Genetic Ancestry Continuum
Wang, J. F.; Yu, R.; Edelson, J.; Park, J.; Le Guen, Y.; Liu, X.; Belloy, M.; Ionita-Laza, I.; Greicius, M.; Tang, H.; He, Z.
Genome-wide association studies (GWAS) have successfully identified numerous genetic variants associated with complex diseases. However, the extent to which the effects of these variants vary across populations of diverse ancestries remains poorly understood. Furthermore, in these contexts, genetic ancestry is treated as a categorical variable, which oversimplifies its continuous nature and the more nuanced ways in which it can influence genetic effects on disease. Here, we propose personalized feature statistics (PFstatistics), a statistical framework that quantifies the importance of genetic variants to a phenotype based on each individual's ancestry background and profiles heterogeneous genetic effects across the genetic ancestry continuum. We demonstrate the utility of this framework through both simulations and real data analysis using sequencing data from ancestrally diverse cohorts in the Alzheimer's Disease Sequencing Project (ADSP). We show that Alzheimer's disease (AD) risk variants span a spectrum from ancestry-homogeneous to ancestry-dependent effects, and that PFstatistics characterizes this spectrum at individual resolution across the ancestry continuum. PFstatistics also provides individual-level variant selection with the false discovery rate (FDR) controlled at a target level, yielding selection sets that vary across individuals according to their ancestry background. Although demonstrated in the context of genetic ancestry, the proposed method is broadly applicable to other sources of heterogeneity, such as environmental factors, offering a robust tool for understanding complex genetic contributions across diverse populations.