Gene-level complexity explains genome-wide variation in the distribution of fitness effects
Yildirim, B.; James, J.
The distribution of fitness effects (DFE), which describes how harmful, neutral, or beneficial new mutations are, is central to understanding how populations evolve. Although the DFE varies across genomes and species, it remains unclear which aspects of genomic organization drive this variation. Here, we inferred gene-level selective constraints across the genomes of Mus musculus castaneus, Drosophila melanogaster, and Saccharomyces cerevisiae using a combination of population genetics and machine learning trained on diverse gene features. Many gene features contributed to selective constraint, with conservation, gene structure, and expression being the most informative. These constraints delineated gene classes with distinct DFEs. Genes with higher connectivity and expression (features reflecting how many traits a gene influences) experienced stronger and less dispersed deleterious effects, and the rate of adaptation peaked at intermediate levels of selective constraint. When compared in a Fisher's geometric model (FGM) framework, this variation was consistent with predictions based on complexity considered at the gene level rather than at the organism level, whereas between-species comparisons alone were less consistent with FGM. Our results suggest that gene-level complexity, captured by genomic feature proxies, explains DFE variation better than organism-level labels do, and they highlight the value of modeling the combined effects of gene features when linking genomic architecture to fitness landscapes and patterns of molecular evolution.
Matching journals
The top 5 journals account for 50% of the predicted probability mass.