Taming Strong Selection with Large Sample Sizes

Gravel, S.; Krukov, I.

2021-03-30 genetics
10.1101/2021.03.30.437711 bioRxiv
The fate of mutations and the genetic load of populations depend on the relative importance of genetic drift and natural selection. In addition, the accuracy of numerical models of evolution depends on the strength of both selection and drift: strong selection breaks the assumptions of the nearly neutral model, and drift coupled with large sample sizes breaks Kingman's coalescent model. Thus, the regime with strong selection and large sample sizes, relevant to the study of pathogenic variation, appears particularly daunting. Surprisingly, we find that the interplay of drift and selection in that regime can be used to define asymptotically closed recursions for the distribution of allele frequencies that are accurate well beyond the strong selection limit. Selection becomes more analytically tractable when the sample size n is larger than twice the population-scaled selection coefficient, n ≥ 2Ns (4Ns in diploids), that is, when the expected number of coalescent events in the sample is larger than the number of selective events. We construct the relevant transition matrices, show how they can be used to accurately compute distributions of allele frequencies, and show that the distribution of deleterious allele frequencies is sensitive to details of the evolutionary model.
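
As an illustration of the sample-size condition stated in the abstract, the following minimal sketch checks whether a given sample size n meets the threshold n ≥ 2Ns (or 4Ns in diploids). The function names `selection_threshold` and `is_tractable` and the example parameter values are hypothetical and chosen only for this illustration; they do not come from the authors' software.

```python
def selection_threshold(N, s, diploid=False):
    """Return the sample-size threshold from the abstract:
    2*N*s for haploids, 4*N*s for diploids.

    N is the population size and s the selection coefficient
    (both are illustrative inputs, not values from the paper).
    """
    factor = 4 if diploid else 2
    return factor * N * s


def is_tractable(n, N, s, diploid=False):
    """True when the sample size n satisfies n >= 2Ns (4Ns in diploids)."""
    return n >= selection_threshold(N, s, diploid=diploid)


if __name__ == "__main__":
    # Hypothetical example values: N = 10,000 and s = 0.001.
    N, s = 10_000, 0.001
    for n in (10, 30, 100):
        print(n, is_tractable(n, N, s), is_tractable(n, N, s, diploid=True))
```

With these assumed values the haploid threshold is about 20 and the diploid threshold about 40, so a sample of 30 meets the condition in the haploid case but not in the diploid one.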

Matching journals

The top 1 journal accounts for 50% of the predicted probability mass.