Partitioning Fraction of Variance Explained into Strong Localized Effects and Weak Diffuse Effects

Nan, F.; Azriel, D.; Schwartzman, A.

2026-01-07 genetics
10.64898/2026.01.06.697735 bioRxiv

High-dimensional genetic data present substantial challenges for estimating the fraction of variance explained (FVE) by genome-wide single-nucleotide polymorphisms (SNPs). Standard approaches for SNP heritability estimation, such as GWAS heritability (GWASH) and linkage disequilibrium score (LDSC) regression, typically assume Gaussian distributions for SNP effect sizes. However, empirical evidence indicates that SNP effects are often heavy-tailed, with a small subset of variants exerting disproportionately large influence. Such settings violate the recently established bounded-kurtosis effect (BKE) condition, under which these FVE estimators are consistent. Consequently, widely used methods may yield severely biased estimates when strong effects are present. We introduce a decomposed FVE estimation framework that accommodates heavy-tailed and heterogeneous SNP effect distributions. The proposed approach partitions total heritability into contributions from strong and weak genetic effects, estimating the former using low-dimensional adjusted R² and the latter using an extension of FVE estimation methodology that remains valid under BKE compliance. We further develop a test for detecting violations of the BKE condition and compare several high-dimensional screening procedures for identifying strong-effect SNPs when they are not known in advance. Simulation studies show that the proposed decomposition substantially improves estimation accuracy over existing approaches in the presence of heavy-tailed effects. Application to the Adolescent Brain Cognitive Development (ABCD) Study demonstrates the practical utility of the method, yielding more reliable heritability estimates for the PolyVoxel Score, a neuroimaging-based biomarker linked to iron accumulation. Our results highlight the importance of accommodating effect heterogeneity in large-scale genomic studies.
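
The strong/weak partition described in the abstract can be illustrated with a toy sketch. This is not the authors' estimator: the weak-effect step below is a simplified moment estimator that assumes independent SNPs (no linkage disequilibrium), the rule for combining the two parts (scaling the residual FVE by the variance left after the strong SNPs) is an assumption, and the function names `adjusted_r2`, `moment_fve`, and `decomposed_fve` are hypothetical. The strong-effect indices are taken as known.

```python
import numpy as np

def adjusted_r2(y, X):
    """OLS of y on X with an intercept; return (adjusted R^2, residuals)."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1), resid

def moment_fve(y, X):
    """Simplified moment estimator of FVE.

    ASSUMPTION: columns of X are independent (no LD), so no LD
    correction is applied. Stand-in only, not GWASH or LDSC.
    """
    n, m = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    u = Xs.T @ ys / np.sqrt(n)          # per-SNP correlation scores
    return (m / n) * (np.mean(u ** 2) - 1.0)

def decomposed_fve(y, X, strong_idx):
    """Return (strong part, weak part) of the FVE for known strong SNPs."""
    strong = np.zeros(X.shape[1], dtype=bool)
    strong[strong_idx] = True
    fve_strong, resid = adjusted_r2(y, X[:, strong])
    # ASSUMPTION: weak FVE is estimated on the residuals, then rescaled
    # by the share of variance not explained by the strong SNPs.
    fve_weak = (1.0 - fve_strong) * moment_fve(resid, X[:, ~strong])
    return fve_strong, fve_weak

# Toy data: 2 strong SNPs (0.15 each), 498 weak SNPs (about 0.3 in total),
# noise variance 0.4, so the true total FVE is about 0.6.
rng = np.random.default_rng(0)
n, m = 2000, 500
X = rng.standard_normal((n, m))         # Gaussian stand-ins for genotypes
beta = rng.standard_normal(m) * np.sqrt(0.3 / (m - 2))
beta[:2] = np.sqrt(0.15)
y = X @ beta + rng.standard_normal(n) * np.sqrt(0.4)

fve_strong, fve_weak = decomposed_fve(y, X, [0, 1])
fve_total = fve_strong + fve_weak
```

In practice the strong SNPs would have to be found by a screening step, and correlated SNPs would need an LD-aware estimator in place of `moment_fve`.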

Matching journals

The top 6 journals account for 50% of the predicted probability mass.
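
The 50% cutoff can be checked by accumulating the listed probabilities in rank order. A minimal sketch, using the first ten percentages from the ranking on this page; the helper name `journals_to_reach` is hypothetical:

```python
from itertools import accumulate

# Predicted probabilities (percent) for ranks 1-10, as listed on this page.
probs = [22.7, 10.2, 6.4, 4.2, 4.0, 3.7, 3.6, 3.6, 3.6, 3.1]

def journals_to_reach(probs, threshold=50.0):
    """Smallest number of top-ranked entries whose cumulative sum reaches threshold."""
    for k, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return k
    return None  # threshold never reached with the given entries

k = journals_to_reach(probs)
```

The running total after five journals is 47.5%, and adding the sixth (3.7%) brings it to 51.2%, which is where the cutoff falls.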

1. Genetic Epidemiology, 46 papers in training set, Top 0.1%, 22.7%
2. The American Journal of Human Genetics, 206 papers in training set, Top 0.4%, 10.2%
3. Bioinformatics, 1061 papers in training set, Top 4%, 6.4%
4. PLOS Computational Biology, 1633 papers in training set, Top 8%, 4.2%
5. Human Brain Mapping, 295 papers in training set, Top 2%, 4.0%
6. Scientific Reports, 3102 papers in training set, Top 34%, 3.7%
50% of probability mass above
7. BMC Bioinformatics, 383 papers in training set, Top 3%, 3.6%
8. International Journal of Epidemiology, 74 papers in training set, Top 0.6%, 3.6%
9. Briefings in Bioinformatics, 326 papers in training set, Top 2%, 3.6%
10. PLOS Genetics, 756 papers in training set, Top 5%, 3.1%
11. Biological Psychiatry, 119 papers in training set, Top 1%, 2.6%
12. PLOS ONE, 4510 papers in training set, Top 47%, 2.1%
13. NeuroImage, 813 papers in training set, Top 3%, 2.1%
14. Human Genetics and Genomics Advances, 70 papers in training set, Top 0.2%, 2.1%
15. Human Molecular Genetics, 130 papers in training set, Top 1%, 2.1%
16. Frontiers in Genetics, 197 papers in training set, Top 4%, 1.8%
17. Genetics, 225 papers in training set, Top 2%, 1.7%
18. American Journal of Epidemiology, 57 papers in training set, Top 0.7%, 1.7%
19. Behavior Genetics, 15 papers in training set, Top 0.1%, 1.3%
20. Nature Communications, 4913 papers in training set, Top 56%, 1.2%
21. G3 Genes|Genomes|Genetics, 351 papers in training set, Top 2%, 1.2%
22. Biostatistics, 21 papers in training set, Top 0.1%, 1.2%
23. GENETICS, 189 papers in training set, Top 1%, 0.8%
24. Bioinformatics Advances, 184 papers in training set, Top 4%, 0.8%
25. Biometrics, 22 papers in training set, Top 0.2%, 0.8%
26. Proceedings of the National Academy of Sciences, 2130 papers in training set, Top 45%, 0.7%
27. Statistics in Medicine, 34 papers in training set, Top 0.4%, 0.6%
28. Genome Research, 409 papers in training set, Top 5%, 0.5%
29. Nature Human Behaviour, 85 papers in training set, Top 6%, 0.5%
30. eLife, 5422 papers in training set, Top 63%, 0.5%