
Personalized Feature Statistics: Individual-Level Variant Inference under Genetic Ancestry Continuum

Wang, J. F.; Yu, R.; Edelson, J.; Park, J.; Le Guen, Y.; Liu, X.; Belloy, M.; Ionita-Laza, I.; Greicius, M.; Tang, H.; He, Z.

2026-04-29 neurology
10.64898/2026.04.28.26351879 medRxiv

Genome-wide association studies (GWAS) have successfully identified numerous genetic variants associated with complex diseases. However, the extent to which the effects of these variants vary across populations of diverse ancestries remains poorly understood. Furthermore, in these contexts, genetic ancestry is treated as a categorical variable, which oversimplifies its continuous nature and the more nuanced ways in which it can influence genetic effects on disease. Here, we propose personalized feature statistics (PFstatistics), a statistical framework that quantifies the importance of genetic variants to a phenotype based on each individual's ancestry background and profiles heterogeneous genetic effects across the genetic ancestry continuum. We demonstrate the utility of this framework through both simulations and real data analysis using sequencing data from ancestrally diverse cohorts in the Alzheimer's Disease Sequencing Project (ADSP). We show that Alzheimer's Disease (AD) risk variants span a spectrum from ancestry-homogeneous to ancestry-dependent effects, and that PFstatistics characterizes this spectrum at individual resolution across the ancestry continuum. PFstatistics also provides individual-level variant selection with the false discovery rate (FDR) controlled at a target level, yielding distinct selection sets that vary across individuals according to their ancestry background. While demonstrated in the context of genetic ancestry, the proposed method is broadly applicable to other heterogeneity features such as environmental factors, offering a robust tool for understanding complex genetic contributions across diverse populations.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1 | Nature Computational Science | 50 papers in training set | Top 0.1% | 18.2%
2 | Nature Genetics | 240 papers in training set | Top 0.4% | 14.0%
3 | Nature Communications | 4913 papers in training set | Top 24% | 8.0%
4 | The American Journal of Human Genetics | 206 papers in training set | Top 0.6% | 8.0%
5 | Advanced Science | 249 papers in training set | Top 5% | 3.9%
50% of probability mass above
6 | Nature Medicine | 117 papers in training set | Top 0.7% | 3.9%
7 | Human Genetics and Genomics Advances | 70 papers in training set | Top 0.1% | 3.5%
8 | Nature Biomedical Engineering | 42 papers in training set | Top 0.3% | 3.5%
9 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 23% | 3.2%
10 | Neuron | 282 papers in training set | Top 5% | 2.0%
11 | Alzheimer's & Dementia | 143 papers in training set | Top 2% | 1.7%
12 | Cell Systems | 167 papers in training set | Top 7% | 1.7%
13 | Scientific Reports | 3102 papers in training set | Top 59% | 1.7%
14 | Briefings in Bioinformatics | 326 papers in training set | Top 4% | 1.7%
15 | Genome Medicine | 154 papers in training set | Top 5% | 1.7%
16 | Science Translational Medicine | 111 papers in training set | Top 3% | 1.5%
17 | Brain | 154 papers in training set | Top 3% | 1.5%
18 | Nucleic Acids Research | 1128 papers in training set | Top 12% | 1.5%
19 | PLOS ONE | 4510 papers in training set | Top 59% | 1.3%
20 | PLOS Computational Biology | 1633 papers in training set | Top 20% | 1.2%
21 | Genome Biology | 555 papers in training set | Top 6% | 0.9%
22 | Imaging Neuroscience | 242 papers in training set | Top 3% | 0.8%
23 | Communications Biology | 886 papers in training set | Top 25% | 0.7%
24 | Nature Neuroscience | 216 papers in training set | Top 6% | 0.7%
25 | Med | 38 papers in training set | Top 1.0% | 0.7%
26 | Nature | 575 papers in training set | Top 18% | 0.6%
27 | Nature Aging | 51 papers in training set | Top 2% | 0.6%
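
The "50% of probability mass" marker can be checked with a running total of the listed probabilities. The sketch below is only an illustration of that arithmetic, not the site's actual procedure: the function name `cutoff_rank` and the 50% threshold argument are assumptions made here, and the input values are the first six probabilities copied from the ranking.

```python
from itertools import accumulate

# Predicted probabilities (%) for ranks 1-6, copied from the ranking above.
probs = [18.2, 14.0, 8.0, 8.0, 3.9, 3.9]

def cutoff_rank(probs, threshold=50.0):
    """Hypothetical helper: return (rank, running total) at the first rank
    where the cumulative probability reaches the threshold, or None."""
    for rank, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return rank, total
    return None

rank, total = cutoff_rank(probs)
print(rank, round(total, 1))
```

With these values the running total first reaches the threshold at rank 5, at about 52.1%, which agrees with the marker sitting below the fifth journal.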