GENETICS
Oxford University Press (OUP)
Preprints posted in the last 30 days, ranked by how well they match GENETICS's content profile, based on 189 papers previously published here. The average preprint has a 0.05% match score for this journal, so anything above that is already an above-average fit.
Lee, H.; Terhorst, J.
Across many complex traits, genetic variants with larger effect sizes tend to occur at lower frequencies, which is often interpreted as a signature of stabilizing selection. In statistical genetics, the so-called α-model captures this relationship by assuming that effect size variance is inversely proportional to heterozygosity raised to a power 0 ≤ α ≤ 1. Although empirically useful, the α-model is phenomenological rather than mechanistic and lacks a direct population-genetic interpretation. In this paper, we derive an alternative to the α-model based on evolutionary theory. Our approach yields a linear mixed model in which the frequency dependence of effect size emerges naturally as a function of interpretable evolutionary quantities describing mutational variance, selection intensity, and coupling between the focal and selected traits. These quantities enter through two identifiable variance components that can be estimated by restricted maximum likelihood (REML). The resulting framework links a fitness-landscape model to standard mixed-model methodology, enabling both inference on evolutionary parameters and downstream prediction by best linear unbiased prediction (BLUP). In forward simulations, the model accurately recovers the focal-trait variance and generally improves genetic prediction relative to conventional α-model baselines.
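As a rough illustration of the frequency-dependent effect-size model this abstract describes (often called the α-model), the sketch below computes per-variant effect-size variances that scale inversely with heterozygosity raised to a power between 0 and 1. The function name `alpha_model_variance`, the scale parameter `sigma2`, and the use of 2p(1-p) as heterozygosity are assumptions made for illustration; this is not the authors' implementation.

```python
def alpha_model_variance(freqs, alpha, sigma2=1.0):
    """Per-variant effect-size variance under a simple alpha-model sketch.

    Assumes Var(beta_j) = sigma2 * [2 * p_j * (1 - p_j)] ** (-alpha),
    with 0 <= alpha <= 1 (hypothetical parameterization for illustration).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    variances = []
    for p in freqs:
        if not 0.0 < p < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
        het = 2.0 * p * (1.0 - p)  # expected heterozygosity at this variant
        variances.append(sigma2 * het ** (-alpha))
    return variances


if __name__ == "__main__":
    example_freqs = [0.01, 0.1, 0.5]  # hypothetical allele frequencies
    for a in (0.0, 0.5, 1.0):
        print(a, alpha_model_variance(example_freqs, a))
```

With the power set to 0 every variant gets the same variance; as it approaches 1, rarer variants receive progressively larger effect-size variance.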
Kornilov, S. A.
Shenhar et al. (2026) report 50% "intrinsic" lifespan heritability after calibrating a one-component correlated-frailty survival model to Scandinavian twin lifespans. Their framework is mathematically coherent, but the intrinsic component is not identified if heritable, mortality-relevant extrinsic susceptibility is omitted at calibration. We show that one-component calibration absorbs omitted familial extrinsic structure into the intrinsic frailty scale parameter σ_θ, and that this variance absorption is visible through separate diagnostics. (1) Variance absorption. Under misspecification, σ_θ is inflated by +22.1% (95% CI: 21.5-22.7%), corresponding to +49% inflation in [Formula]. Falconer h² is downstream of calibration and inherits a +9.2 pp bias (95% CI: 8.7-9.7). The σ_θ inflation is model-general: +22% (GM), +18% (MGG), +14% (SR); any dependence summary that is strictly increasing in σ_θ inherits this inflation, so Falconer h² is one affected downstream quantity among many (Corollary B3). (2) Structural fingerprint. In the joint twin survival surface S(t1, t2), misspecification produces systematic dependence errors (ISE 48x that of the recovery model). Conditional twin dependence is inflated at all ages, peaking at age 80 (Δr = 0.048). (3) Specificity. The bias requires an omitted component that is both heritable and mortality-relevant. Three negative controls, a boundary check (ρ = 0), and a two-component recovery refit (σ_θ restored to within -3.2%) establish specificity. ACE decomposition yields C ≈ 0 throughout: the omitted extrinsic component loads onto A (because it is shared 1.0/0.5 in MZ/DZ), so switching summary statistics does not restore identification. (4) Sensitivity and falsifiability. Over an empirically anchored regime (σ_γ ∈ [0.30, 0.65], ρ ∈ [0.20, 0.50]), Falconer bias ranges from +2.8 to +18.9 pp (mean 9 pp).
If ρ is sufficiently negative, the bias reverses sign in all three model families (Corollary B4). A full-likelihood robustness check shows that this upward pull is partly structural and partly estimator-specific: in the same misspecified one-component model, ML still inflates σ_θ (+3%), whereas matching only r_MZ inflates it much more (+21%). These results do not resolve true intrinsic heritability but establish that Shenhar et al.'s 50% estimate carries a structured, model-general upward bias originating in the fitted latent variance σ_θ.
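For readers unfamiliar with the Falconer estimator mentioned in this abstract, the sketch below shows the standard formula, h² = 2(r_MZ - r_DZ), and how a bias in percentage points could be expressed. The helper names and the twin correlations in the usage example are hypothetical and are not taken from the preprint.

```python
def falconer_h2(r_mz, r_dz):
    """Falconer's heritability estimate from twin correlations: 2 * (r_MZ - r_DZ)."""
    return 2.0 * (r_mz - r_dz)


def bias_in_percentage_points(h2_fitted, h2_reference):
    """Difference between two heritability estimates, in percentage points (pp)."""
    return 100.0 * (h2_fitted - h2_reference)


if __name__ == "__main__":
    # Hypothetical twin correlations, chosen only to illustrate the arithmetic.
    reference = falconer_h2(r_mz=0.40, r_dz=0.20)
    inflated = falconer_h2(r_mz=0.45, r_dz=0.20)
    print(reference, inflated, bias_in_percentage_points(inflated, reference))
```

Because the estimate depends only on the two correlations, anything that inflates r_MZ relative to r_DZ at calibration carries directly into h².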
Gupta, M.; Holmes, C. M.; Belousova, J.; Gopalakrishnan, S.; Rego-Costa, A.; Desai, M. M.
Mapping the genetic basis of complex traits is complicated by the presence of epistatic interactions between loci. While work in molecular genetics identifies numerous specific genetic interactions, statistical analyses of quantitative traits frequently conclude that additive (nonepistatic) models explain most heritable variation. However, these conclusions are typically limited by the narrow range of genetic relatedness (e.g., in F1 offspring of a biparental or circular cross). Here, we use a barcoded panel of Saccharomyces cerevisiae genotypes with a broad range of relatedness to quantify the effects of epistasis on the genetic architecture of seven complex traits. We find limited contributions of epistasis to the genetic basis of these traits. These results indicate that epistasis beyond that detected in standard yeast crosses may exist, yet it contributes little to phenotypic variance in these systems.
Monyak, T.; Morris, G.
Global networks of crop breeding programs leverage diverse germplasm, but diversity increases the complexity of maintaining stability in their elite genepools. To characterize genetic heterogeneity in breeding metapopulations and develop insights on how to manage it, we simulated the evolution of breeding populations on fitness landscapes. We revealed the geometric decrease in the average effect size of alleles segregating as standing variation that become fixed along an adaptive walk. We also demonstrated how independent adaptive walks of subpopulations are influenced by genetic drift, leading to cryptic genetic heterogeneity among elite genepools. This variation is released when elite lines derived from independent subpopulations are crossed, leading to segregation for 2-4X more major QTL in admixed families than in unadmixed families, and 2-4X more epistatic interactions. The emergent property of fitness epistasis for traits under stabilizing selection is well understood in evolutionary genetics, but underappreciated in crop quantitative genetics. To highlight the importance of this phenomenon, we constructed an empirical genotype-to-fitness landscape from the sorghum NAM, a global admixed prebreeding resource, demonstrating the utility of fitness landscapes for inferring genetic compatibilities within metapopulations. Our findings suggest that in breeding networks, strategies for effective germplasm exchange must account for epistasis in the oligogenic component of the genetic architecture of locally adapted traits. Article summary: Modern public sector crop improvement happens in networks of breeding programs that routinely exchange genetic information. Traditional models for understanding quantitative traits have limited predictiveness in situations with such genetic heterogeneity.
This study uses breeding simulations and empirical data to show the utility of the fitness landscape framework for characterizing the genetic architecture of complex traits in breeding metapopulations. By simulating the evolution of breeding programs and integration into networks, it demonstrates how epistatic interactions between large-effect alleles are a fundamental property that must be accounted for when exchanging germplasm.
Johnson, O. L.; Tobler, R.; Schmidt, J. M.; Huber, C. D.
Genetic evidence for fluctuating selection has begun to accumulate for different species over the past few decades, especially for the Drosophila genus where studies have reported hundreds of loci undergoing putatively adaptive oscillations across successive seasons. However, most theoretical and simulation studies of fluctuating selection have relied on abstract or weakly parameterized models, making it difficult to assess their relevance for natural populations. In this study, we simulate multilocus seasonally fluctuating selection under a recently developed model and examine its effect on the variance effective population size (Ne) at a genome-wide scale. By recapitulating genomic, demographic, and evolutionary parameters from natural Drosophila populations in our simulations, we were able to reproduce allele frequency oscillations reported in recent studies and show that these lead to ~50% genome-wide reductions in Ne. We also demonstrate that Ne reductions are well predicted by the maximum frequency amplitude among all adaptively fluctuating loci, and that the frequency amplitudes are largely determined by the number of adaptively fluctuating loci and the strength of their epistatic interactions. Our results demonstrate that fluctuating selection can substantially reduce effective population size and underscore the importance of temporally variable selection in shaping genome-wide patterns of variation beyond classical models. Article Summary: Genetic studies of fluctuating selection in natural populations have grown steadily over the past decade, with reports suggesting that hundreds of loci undergo adaptive oscillations over seasonal timescales in cosmopolitan Drosophila populations.
By simulating seasonally fluctuating selection under a recently developed model and ecological scenarios informed by published studies, the authors show that this mode of selection can reduce effective population size by ~50%, with the magnitude of the reduction correlated with the locus exhibiting the largest allele frequency fluctuations. These findings highlight fluctuating selection as an important factor shaping genome-wide patterns of genetic variation and effective population size.
Wang, H.; Wainschtein, P.; Sidorenko, J.; Fikere, M.; Zhang, Y.; Kemper, K. E.; Zheng, Z.; Hivert, V.; Zeng, J.; Goddard, M. E.; Visscher, P. M.; Yengo, L.
Assessing the contribution of ultra-rare variants (minor allele frequency <0.01%) to the heritability of complex traits remains challenging due to limited understanding of potential biases. Here, we focus on singletons (that is, variants observed only once in the study sample), the most abundant class of ultra-rare variants, to showcase various confounders of heritability estimates and underline pitfalls in their interpretation. We show through theory, simulations, and analysis of 5,330,210 exome-sequenced singletons in 305,813 unrelated European-ancestry individuals in the UK Biobank that (i) population stratification induces both upward and downward biases in singleton-based heritability estimates, (ii) estimates capture non-additive genetic effects, and (iii) asymptotic standard errors of estimates from likelihood-based procedures are generally mis-calibrated when traits are not normally distributed. We further showcase these biases in real-data analyses of 22 quantitative phenotypes and report, after accounting for these pitfalls, significant estimates for number of children (3.4%), peak expiratory flow (1.9%), red blood cell count (2.5%), white blood cell count (1.9%) and heel bone mineral density (2.4%). Overall, our study provides recommendations for robust inference of heritability from ultra-rare variants and underscores that reliable estimates for ordinal and binary traits will require far larger sample sizes and improved methods, given that confounding in these traits remains difficult to detect and correct.
Kinney, J. B.
Additive fitness landscapes--also called Mount Fuji landscapes--are the simplest and most widely used models of sequence-function relationships. As such, they play essential roles across multiple areas of biology, including evolutionary theory, quantitative genetics, gene regulation, and protein science. One of the most basic properties of any fitness landscape is its genotypic density--the number of sequences near a given fitness value. Understanding this density is especially important near fitness peaks, as it quantifies the supply of high-fitness genotypes. Here I study the genotypic density of additive landscapes near fitness peaks. Although this density is well known to be approximately Gaussian near the middle of the fitness range, its behavior near maximal fitness has not been reported. I begin by deriving a saddle-point approximation that accurately describes the genotypic density of additive landscapes over virtually the entire fitness range. I then show that the log density follows a power law near maximal fitness, with an exponent determined by how much the best allele at each position outperforms its nearest competitor. This power-law behavior holds over a substantial fraction of fitness values, besting the Gaussian approximation on both simulated and empirical landscapes across roughly a quarter to a third of the fitness range. Under certain conditions this behavior also extends to globally epistatic landscapes (defined as nonlinear functions over one or more additive traits), though with a reduced range of validity. These findings advance our understanding of one of the most fundamental models of sequence-function relationships. In particular, they reveal that the uppermost reaches of Mount Fuji landscapes, rather than being sharply peaked, are actually quite stubby.
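To make the notion of genotypic density concrete, the sketch below enumerates every sequence of a small additive landscape and counts how many lie within a given fraction of the fitness range below the peak, alongside the count a Gaussian with matched mean and variance would predict. It is a brute-force illustration under assumed, randomly drawn allele effects; it does not implement the saddle-point approximation or the power law derived in the preprint, and all function names are hypothetical.

```python
import itertools
import math
import random


def additive_fitnesses(effects):
    """Fitness of every sequence: the sum of one allele effect per position."""
    return [sum(combo) for combo in itertools.product(*effects)]


def count_near_peak(fitnesses, fraction):
    """Number of sequences within `fraction` of the fitness range below the maximum."""
    f_max, f_min = max(fitnesses), min(fitnesses)
    cutoff = f_max - fraction * (f_max - f_min)
    return sum(1 for f in fitnesses if f >= cutoff)


def gaussian_count_near_peak(fitnesses, fraction):
    """The same count as predicted by a Gaussian with matched mean and variance."""
    n = len(fitnesses)
    mean = sum(fitnesses) / n
    var = sum((f - mean) ** 2 for f in fitnesses) / n
    f_max, f_min = max(fitnesses), min(fitnesses)
    cutoff = f_max - fraction * (f_max - f_min)
    # Upper-tail probability of a normal distribution, scaled by the sequence count.
    return n * 0.5 * math.erfc((cutoff - mean) / math.sqrt(2.0 * var))


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical landscape: 8 positions, 4 alleles each, Gaussian allele effects.
    effects = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(8)]
    fitnesses = additive_fitnesses(effects)
    for frac in (0.05, 0.1, 0.2, 0.3):
        print(frac, count_near_peak(fitnesses, frac),
              gaussian_count_near_peak(fitnesses, frac))
```

Enumeration scales as the number of alleles raised to the sequence length, so it is practical only for short sequences.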
Snell, H.; McCallum, S.; Raghavan, D.; Singh, R.; Ramachandran, S.; Sugden, L.
Adaptive mutations, or mutations that confer a fitness benefit, can leave behind distinct signals in genetic data. Computational methods have improved the localization of adaptive mutations in genetic samples using a range of statistical and machine learning classification techniques. However, these methods miss the opportunity to jointly integrate statistics at both the site and window-based level, thus failing to harness all available statistical evidence to detect selection. Our method, WINDEX, combines these different resolutions of statistics to improve the detection of adaptive mutations among hitchhiking signals. Our model simultaneously integrates emissions at different resolutions by defining site-based and window-based latent states corresponding to neutral, linked, and sweep regions, with the site-based states and transition models nested within the window-based states. Using evolutionary simulations with varying selection parameters, we validate the ability of WINDEX to classify positive selective sweeps. Using data from the 1000 Genomes Project, we show that WINDEX is able to identify regions harboring signals of selective sweeps, and provides improved localization within those regions over existing methods. In addition, using WINDEX genome-wide allows for estimation of the proportion of whole genomes that are under positive selective pressures; our estimates of 9.7-10.5% across different populations provide support for other preliminary estimates of these quantities. Author summary: Population geneticists often seek evidence for positive selective sweeps, or an evolutionary event in which a beneficial allele increases in frequency over time in a population, resulting in increased fitness of the individuals that have said allele.
Positive selective sweeps, however, are difficult to detect due to varying patterns of linkage disequilibrium (LD), or the nonrandom association of alleles, and detecting these signals reliably among differing LD structures remains a challenge in the field. In this work, we present WINDEX, a probabilistic framework designed to leverage signals of positive selective sweeps at both the site and window levels in the form of a hierarchical hidden Markov model (HHMM), to localize regions of positive selective sweeps in aligned haplotype data. We validate WINDEX in evolutionary simulations over varying positive selective sweep scenarios, showcasing the improved resolution that the HHMM structure provides. We apply WINDEX in comparative genomic scans of canonical sites of positive selection as well as whole-genome scans to demonstrate the tool's power in localizing functionally validated signals of selection and to offer insights into the proportion of the human genome currently under positive selective pressures. WINDEX is publicly available and easy to apply to many cases of human genetic data.
Reyes Castellon, G. A.; Aimadeddine, G.; Chiao, C. R.; Guruprasad, S.; Halbert, P. E.; Hassan, S. A.; Luong, M. Q.; Mailanperuma Arachchillage, K. S.; Martinez, Y.; Mukhtarov, M.; Nair, G.; Nguyen, E. N.; Onochie, C. L.; Patel, O.; Than, J. T.; Manat, Y.; IISAGE; Meisel, R. P.
Life history traits are often correlated, creating trade-offs that may impede the response to natural selection and be responsible for the evolution of senescence. These trade-offs may arise through pleiotropic effects, which can affect the response to selection in ways that resemble intra-locus sexual antagonism. Despite these hypothesized relationships, we lack clear connections between pleiotropy, sexual antagonism, and the evolution of life histories. Empirical tests for inter-sexual differences in life-history traits, including sex-specific aging, can be used to evaluate hypotheses about how pleiotropy and sexual conflict affect evolutionary trade-offs. To those ends, we measured lifespan, development time, and body size in Drosophila pseudoobscura males and females, each of which carried one of six third chromosome inversion genotypes. Temperature affected lifespan and development more than any other factor; higher temperatures increased mortality rate, decreased lifespan, and accelerated development. However, we also observed sex differences in mortality rates and development times that depended on genotype and temperature. Notably, temperature elevated the initial mortality rate across all flies, yet increasing temperatures reduced the rate of aging in some genotype-sex combinations. Similarly, direct effects of genotype on mortality rate and development time depended greatly on sex and temperature, but there was no genotype effect on body size. Despite these context-dependent genotype effects on life history traits, we failed to identify any correlations that would serve as clear evidence for sexual conflict or trade-offs. Our results therefore suggest that either historical conflicts have been resolved or any conflicts that may exist do not result in the correlations predicted by existing models.
Ortiz-Barrientos, D.; Cooper, M.
Article summary: Gene interactions are common, yet additive genetic models often predict short-term evolution and breeding response. This study argues that additivity can arise because populations sample only a small neighbourhood of a curved fitness landscape. In additive channels, genetic variation is small enough that local curvature contributes little to heritable fitness differences. The study defines an additivity index ([A]g) that compares variance from the local slope of log-fitness with variance from curvature, and links this ratio to expected prediction accuracy under Gaussian assumptions. A selection-inheritance framework shows when additive channels persist and when populations leave them. It yields testable predictions.
Li, J.; Hermisson, J.; Sachdeva, H.
We study one of the simplest scenarios of polygenic selection that can be imagined: a subdivided population of diploid individuals expressing an additive trait under spatially homogeneous stabilizing selection. We are interested in the amounts of variation that can be maintained at mutation-selection-migration-drift equilibrium, at individual loci and at the level of the trait, within and among subpopulations. We derive analytical approximations for variance components and summary statistics such as FST and QST under the assumptions of the infinite-island model and compare these with individual-based simulations. We find that: (i) There is a critical migration threshold (which depends on effect sizes of trait loci) below which population structure strongly inflates genic variance in the subdivided population to levels well above those in a panmictic population. Variation within each subpopulation is maximized close to the critical migration rate. (ii) The genetic basis of trait variation across subpopulations is most similar close to this migration threshold and (counter-intuitively) decreases for higher migration rates. This has consequences for the portability of Genome-Wide Association Studies (GWAS) between subpopulations, i.e., the extent to which loci with large contributions to variance in one subpopulation explain variance in other subpopulations. (iii) An analytical mean-field approach based on the single-locus diffusion approximation, together with effective migration and selection parameters (to account for associations between loci), very accurately predicts various quantities.
Hoyt, S. H.; Reddy, T. E.; Gordan, R.; Allen, A. S.; Majoros, W. H.
Interpreting the effects of novel mutations on phenotypic traits remains challenging, particularly for cis-regulatory variants. For rare variants, individuals typically possess at most one affected copy of the causal allele, leading to allelic imbalance, and thus the ability to infer inheritance of allelic imbalance can inform genetic studies of phenotypic traits. While many methods for detection of allele-specific expression (ASE) exist, they largely focus on ASE in one individual. We show that performing joint inference across multiple individuals in a trio allows for simultaneously improving estimates of ASE and identifying its likely mode of inheritance. Our Bayesian approach has the benefit of being able to (1) aggregate information across individuals so as to improve statistical power, (2) estimate uncertainty in estimates, and (3) rank modes of inheritance by posterior probability. We demonstrate that this model is also applicable to other forms of imbalance such as allele-specific chromatin accessibility. Applying the model to ATAC-seq and RNA-seq from several trios, we uncover examples in which ASE can be linked to imbalance in chromatin state of cis-regulatory elements and to potential causal variants. As the cost of sequencing continues to decrease, we expect that powerful methodologies such as the one presented here will promote more routine collection of samples from related individuals and improve our understanding of genetic effects on gene regulation and their contribution to phenotypic traits.
Brud, E.; Guerrero, R. F.
Alleles with opposing effects on fitness characters are said to exhibit selectional antagonistic pleiotropy (broadly construed so that effects are not necessarily confined to the same individual). A number of theoretical investigations considered the case where a pair of alleles at a locus influences two fitness components and derived the conditions giving rise to stable polymorphism under various assumptions about the mode of trait-interaction. Strikingly, many of these analyses concluded that the potential for maintaining polymorphism is strongly constrained by the joint influence of two factors: (1) the prevalence of weak selection coefficients over coefficients of large magnitude, and (2) the absence of beneficial dominance reversals (where the deleterious effects of each allele are partially or completely masked in the heterozygous genotype). Consequently, the conclusion that selective polymorphism is unlikely to be maintained by intralocus mechanisms of antagonistic pleiotropy has achieved widespread acceptance. Here we argue that such conclusions do not apply to any of the following models of antagonism: (i) additive trait-interaction, (ii) multiplicative trait-interaction, (iii) bivoltine selection, (iv) soft selection, (v) hard selection, and (vi) sexual antagonism. We demonstrate that the parameter space giving rise to stable allelic variation is quite large throughout, and moreover, the plenitude of suitable parameters neither depends on the strength of selection nor requires dominance reversal. Dominance coefficients associated with stringent conditions for stable polymorphism are shown to be atypical as compared to all feasible parameters, and best regarded as an outcome of adherence to a special relation: dominance with a constant magnitude and direction, which includes the case of additive allelic effects at a locus. 
Properties of single-locus equilibria (heterozygosity, allele frequency differentiation) are investigated, as well as the contribution of dominance schemes to the genetic variance in fitness characters in populations at multilocus linkage equilibrium. Author summary: Allelic variants at a locus with opposing effects on multiple fitness components (antagonistic fitness pleiotropy) have long been appreciated as a possible source of balancing selection. The prevalence of polymorphism owing to this form of natural selection, however, has been doubted on theoretical grounds because standard assumptions of genetic models (namely, constant magnitudes for the dominance coefficients) are hardly conducive to the maintenance of polymorphism. The major exception to this conclusion lies with schemes that exhibit dominance reversal (where the direction of dominance for antagonistic alleles flips across fitness components). Here we conduct a geometric analysis of the space of polymorphism-promoting dominance parameters and conclude that the conditions for maintaining balanced alleles are unrestrictive, with non-reversals playing an underappreciated role.
Stephens, E.; Hamza, A.; Driessen, M. R. M.; O'Neil, N. J.; Stirling, P. C.; Hieter, P.
The cohesin complex has conserved roles in sister chromatid cohesion, DNA replication, genome organization, and the DNA damage response. We heterologously expressed the human cohesin complex in yeast to probe the behaviour of human cohesin. Human cohesin was unable to complement loss of function mutations in yeast cohesin, either as single subunits or as complexes, including in the context of co-expressing up to 12 human cohesin-associated genes. Heterologous expression of human cohesin in yeast expressing wildtype yeast cohesin resulted in dominant cohesion dysregulation and DNA damage sensitivity phenotypes. We used co-immunoprecipitation to demonstrate that human SMC proteins interact with endogenous yeast cohesin rings creating dominant-negative hybrid complexes that disrupt endogenous cohesin biology.
Lin, R.; Reynolds, M. J.; Shankar, N. R.; Johnson, A.
The correct assembly of ribosomes is essential for viability and faithful gene expression. In eukaryotic cells, the pre-40S and pre-60S ribosomal subunits are largely pre-assembled in the nucleolus before they are exported to the cytoplasm for final maturation. Although most ribosomal proteins of the large subunit are loaded onto pre-60S particles in the early nucleolar steps, a few, including eL24, are loaded in the cytoplasm. eL24 is thought to recruit the zinc-finger protein Rei1 (ZNF622 in humans). In yeast, Rei1 has a paralog, Reh1. While we and others have previously shown that Rei1 facilitates the removal of Arx1, Rei1 and Reh1 appear to have an additional unknown function. To identify this function, we first examined the protein composition of pre-60S subunits isolated from rei1Δ reh1Δ mutant cells and found that these subunits were specifically defective for eL24. However, the absence of eL24 did not impair Rei1 binding to pre-60S. Moreover, overexpression of eL24 suppressed the growth defect of the double mutant. As an alternative approach to understanding the function of Rei1 and Reh1, we screened for bypass suppressors of the growth defect of rei1Δ reh1Δ cells. We identified mutations in the genes coding for ribosomal protein uL3, the GTPase Lsg1 and the protein phosphatase Ppq1. Importantly, these suppressors all partially reversed the eL24 loading defect of rei1Δ reh1Δ cells. Based on these results, we propose a revised order of cytoplasmic assembly events where Rei1 and Reh1 facilitate the recruitment of eL24 to the pre-60S particle.
Barbosa, G. O.; Solis-Calero, C.; Kornberg, T.
Binding of Fibroblast growth factor (FGF) to a heparan sulfate proteoglycan (HSPG) is required for paracrine FGF signaling. To improve our understanding of FGF:HSPG association, we developed a method to monitor export of the Drosophila FGF ortholog Branchless (Bnl) in vivo. We detected Bnl on the surface of approximately 10% of Bnl-producing cells, but Bnl on the surface of cells depleted of HS was much reduced. HS depletion also non-autonomously decreased the activity of cytonemes that extend from cells that receive Bnl. These results are consistent with the idea that Bnl export to the cell surface is regulated, that intracellular binding of an HSPG to Bnl in producing cells is essential for export, and that cells that take up Bnl actively participate in its release from producing cells. Summary: Levels of FGF exported to the surface of FGF-expressing cells are dependent on intracellular heparan sulfate proteoglycans.
Cataldo-Ramirez, C.; Lin, M.; McMahon, A.; Gignoux, C.; Weaver, T. D.; Henn, B. M.
Genome-wide association studies (GWAS) and polygenic score (PGS) development are typically constrained by the data available in biobank repositories in which European cohorts are vastly overrepresented. Here, we increase the utility of non-European participant data within the UK Biobank (UKB) by characterizing the genetic affinities of UKB participants who self-identify as Bangladeshi, Indian, Pakistani, "White and Asian" (WA), and "Any Other Asian" (AOA), towards creating a more robust South Asian sample size for future genetic analyses. We assess the relationships between genetic structure and self-selected ethnic identities and use consistent patterns of clustering in the dataset to train a support vector machine (SVM). The SVM was used to reassign n = 1,853 AOA and WA participants at the subcontinental level, increasing the sample size of the UKB South Asian group by 1,381 additional participants. We further leverage these samples to assess GWAS performance and PGS development. We include environmental covariates in the height GWAS by implementing a rigorous covariate selection procedure, and compare the outputs of two GWAS models: GWASnull and GWASenv. We show that PGS derived from both GWAS models yield prediction comparable to that of PGS models developed with an order of magnitude larger training samples, and that environmentally adjusted PGS models reduce the sex bias in predictive performance. In summary, we demonstrate how GWAS performance can be improved by leveraging ambiguous ethnicity codes, using ancestry-matched imputation panels, and including environmental covariates.
Salomon, J.; Enjalbert, J.; Flutre, T.
The genetics of interspecific groups remains largely unexplored, despite the central role of social (or indirect) genetic effects in shaping phenotypic expression within communities. Intercropping, i.e. the simultaneous cultivation of multiple crop species in the same field, offers a powerful model to harness these interspecific social effects. Such species mixtures provide well-documented agricultural benefits, yet few breeding frameworks have integrated the genetics of social interactions. Here, we address this gap by extending quantitative genetic theory to interspecific groups, with intercropping as a concrete and applied model case. We propose a quantitative genetic model that jointly analyzes intra and interspecific interactions within a unifying framework. Breeding values are decomposed into a direct component, shared in mono and mixed-crops, an interspecific social component corresponding to the effect of one species on another, and an intraspecific component that captures the social effects within a mono-genotypic stand of cloned plants. Statistically, this consists in simultaneously fitting several linear mixed models, one per stand type, all having direct breeding values in common. As no open-source software can fit such a complex mixed model, we provide such an implementation in R/C++. Simulations across various genetic (co)variance structures and sparse experimental designs showed accurate estimation of all genetic (co)variances and breeding values. With an incomplete, yet balanced design combining sole crops and intercrops, genetic gains in both systems were achievable simultaneously, enabling breeding strategies that progressively integrate intercropping into existing, sole-crop-only schemes. More broadly, this framework allows dissecting direct and social genetic effects when genotypes are observed in mono- and mixed-species situations, cultivated or not.
Mah, J. C.; Lohmueller, K. E.
Accurate estimation of population demographic history is central to population genetics yet remains challenging due to the sensitivity of inference methods to the number of individuals and the demographic scenario assumed in inference. The site-frequency spectrum (SFS) of neutral variants, a widely used summary statistic of genetic variation, is particularly sensitive to demographic processes, but studies have shown that qualitative results from demographic inference, i.e., population expansion vs. contraction, can depend strongly on the number of individuals in the dataset. Here, we analyzed two simulated datasets and one empirical dataset characterized by an ancient population bottleneck followed by a recent population expansion. Fitting a two-epoch demographic model across a range of sample sizes, we found that inference shifted from signals of ancient population contraction at small sample sizes to signals of recent population expansion at large sample sizes. Other summary statistics, including Tajima's D and the proportion of singletons, also changed with sample size. We found that these changes in inferred evolutionary signals under a two-epoch model can be explained by the epoch that contributes the highest mean proportion of coalescent branch lengths. Our results highlight that demographic inference depends critically on the number of individuals analyzed and suggest that analyzing datasets at multiple sample sizes can reveal complementary aspects of population history.
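To make the sample-size dependence of these summary statistics concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes an unfolded SFS stored as a list whose entry `i - 1` counts sites with `i` derived copies among `n` chromosomes, uses standard hypergeometric projection to mimic subsampling, and computes Tajima's D with the usual Tajima (1989) constants. The function names `project_sfs`, `tajimas_d`, and `singleton_proportion` are hypothetical, and no two-epoch model is fitted here.

```python
import math


def project_sfs(sfs, n, m):
    """Hypergeometrically project an unfolded SFS from n to m < n chromosomes.

    sfs[i - 1] is the number of sites with i derived copies (i = 1 .. n-1).
    Sites that become monomorphic in the subsample are dropped.
    """
    total = math.comb(n, m)
    out = [0.0] * (m - 1)
    for i, count in enumerate(sfs, start=1):
        for j in range(1, m):
            # Probability that j of the i derived copies land in the subsample.
            out[j - 1] += count * math.comb(i, j) * math.comb(n - i, m - j) / total
    return out


def singleton_proportion(sfs):
    """Fraction of segregating sites whose derived allele is seen once."""
    return sfs[0] / sum(sfs)


def tajimas_d(sfs, n):
    """Tajima's D computed from an unfolded SFS over n chromosomes."""
    seg = sum(sfs)  # number of segregating sites, S
    if seg == 0:
        return float("nan")
    # Mean pairwise differences (pi) from the SFS.
    pi = sum(c * 2 * i * (n - i) for i, c in enumerate(sfs, start=1)) / (n * (n - 1))
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    return (pi - seg / a1) / math.sqrt(e1 * seg + e2 * seg * (seg - 1))
```

Calling `tajimas_d(project_sfs(sfs, n, m), m)` for several values of `m` shows how the statistic moves as the sample shrinks. Under the standard neutral expectation (entries proportional to `1 / i`), the projection preserves the `1 / i` shape and D stays at zero for every `m`, so any drift with sample size in real data reflects departures from that expectation.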
Selenius, E.; Keaney, T.; Winters, S.; Mappes, J.; Kokko, H.
Population genetic models excel at identifying the conditions for polymorphisms based on balancing selection but typically disregard the ecological processes that yield particular values of selection coefficients. We model a system that combines antagonistic pleiotropy, dominance reversal, and heterozygote advantage: the wood tiger moth Arctia plantaginis, where alternative haplotypes at a major-effect locus determine male hindwing coloration. Yellow offers better protection against predators, while white is often associated with better mating success. The effects of mortality and reproductive success overlap in time because protandrous males can mate as long as they are alive, but they need to avoid predation for several days before the bulk of females emerge. We show that protandry aids polymorphism maintenance whenever the second-fittest genotype (after the heterozygote) is the poorly surviving but mating-advantaged homozygote, while increased protandry harms polymorphism when the second-best fitness is that of the survival-advantaged morph. Ecologically plausible protandry times predict that dominance reversal does not have to be strong for polymorphism to be maintained. Our study highlights the importance of timing traits in maintaining polymorphisms in Lepidoptera and showcases the benefits of deriving fitness explicitly in place of abstract selection coefficients that lack temporal components within the life cycle.
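For readers unfamiliar with the "abstract selection coefficient" baseline that this abstract contrasts itself with, here is a minimal Python sketch of the textbook one-locus, two-allele viability recursion. It is not the authors' protandry model: the genotype fitnesses `w11`, `w12`, `w22` are fixed constants assumed for illustration, with no timing or predation component, and the function names are hypothetical.

```python
def next_freq(p, w11, w12, w22):
    """One generation of selection on allele frequency p (random mating)."""
    q = 1.0 - p
    mean_fitness = p * p * w11 + 2 * p * q * w12 + q * q * w22
    return p * (p * w11 + q * w12) / mean_fitness


def iterate(p0, w11, w12, w22, generations):
    """Apply the recursion repeatedly, starting from frequency p0."""
    p = p0
    for _ in range(generations):
        p = next_freq(p, w11, w12, w22)
    return p


def equilibrium(w11, w12, w22):
    """Internal equilibrium frequency; stable only when w12 > w11 and w12 > w22."""
    return (w12 - w22) / (2 * w12 - w11 - w22)
```

With heterozygote advantage (for example `w11 = 0.8`, `w12 = 1.0`, `w22 = 0.6`), iterating from any intermediate starting frequency approaches `equilibrium(...)`, so both alleles persist. The abstract's point is that such constants hide the ecology: in the moth system the effective genotype fitnesses would themselves depend on how long males must survive before females emerge.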