Bias in diversity estimators and neutrality tests induced by neutral polymorphic structural variants
Ramos-Onsins, S. E.; Ross-Ibarra, J.; Caceres, M.; Ferretti, L.
Estimators of genetic diversity and neutrality tests derived from the site frequency spectrum (SFS), such as Watterson's θ_W, nucleotide diversity π, Tajima's D, and Fay and Wu's H, are designed to be interpreted relative to a baseline defined by the standard neutral SFS. In genomic regions strongly linked to a polymorphic structural variant (SV), deviations from these baselines occur even under strict neutrality: conditioning on an SV at known frequency partitions samples into SV and non-SV haplotypes and distorts the SFS for linked neutral mutations. These deviations are well understood for genomic inversions under long-term balancing selection. However, not all SVs are under strong selection, and the evolution of some SVs may be better approximated as neutral. Here we derive analytical expectations for the unfolded (and, when necessary, folded) SFS of single nucleotide polymorphisms conditional on neutral linked polymorphic SVs, including inversions, deletions, insertions, and introgressions. We use these expectations to quantify the resulting bias in standard diversity estimators and neutrality tests as a function of SV frequency and type. Finally, we discuss approaches to build corrected estimators of diversity and neutrality tests that are unbiased/centered after accounting for the presence and frequency of the SV.
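To make the baseline concrete, the following is a minimal sketch (not the authors' code) of how the estimators named in the abstract are computed as linear combinations of an unfolded SFS. It uses the textbook definitions of Watterson's estimator, nucleotide diversity, and Fay and Wu's θ_H. The function name `sfs_estimators`, the sample size `n = 10`, and the value `theta = 5.0` are illustrative assumptions. Under the standard neutral expectation E[ξ_i] = θ/i, all three estimators return θ, so the numerators of Tajima's D (π − θ_W) and of Fay and Wu's H (π − θ_H) are zero. The variance normalization of the test statistics is omitted.

```python
from math import comb

def sfs_estimators(xi):
    """Estimators from an unfolded SFS.

    xi[i-1] is the number of sites whose derived allele appears in
    i of the n sampled sequences, for i = 1..n-1.
    """
    n = len(xi) + 1
    pairs = comb(n, 2)                          # number of sequence pairs
    a_n = sum(1.0 / i for i in range(1, n))     # harmonic number a_n
    s = sum(xi)                                 # segregating sites
    theta_w = s / a_n                           # Watterson's estimator
    pi = sum(i * (n - i) * x for i, x in enumerate(xi, start=1)) / pairs
    theta_h = sum(i * i * x for i, x in enumerate(xi, start=1)) / pairs
    return {
        "theta_w": theta_w,
        "pi": pi,
        "theta_h": theta_h,
        "tajima_d_numerator": pi - theta_w,     # unnormalized Tajima's D
        "fay_wu_h_numerator": pi - theta_h,     # unnormalized Fay and Wu's H
    }

# Illustrative values (assumptions, not taken from the paper).
n = 10
theta = 5.0

# Standard neutral expectation: E[xi_i] = theta / i.
neutral = [theta / i for i in range(1, n)]
est = sfs_estimators(neutral)

# A hypothetical distorted spectrum with extra intermediate-frequency
# mutations; this is an arbitrary perturbation, not the SV-conditional
# SFS derived in the paper.
distorted = list(neutral)
distorted[n // 2 - 1] += 3.0
est_distorted = sfs_estimators(distorted)

for name, value in est.items():
    print(f"neutral   {name}: {value:.4f}")
for name, value in est_distorted.items():
    print(f"distorted {name}: {value:.4f}")
```

Any departure of the expected SFS from θ/i, such as the one induced by conditioning on a linked SV, shifts these linear combinations away from θ and the test numerators away from zero, which is the bias the abstract describes.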
Matching journals
The top 1 journal accounts for 50% of the predicted probability mass.