
Genetics

Oxford University Press (OUP)

Preprints posted in the last 90 days, ranked by how well they match Genetics's content profile, based on 225 papers previously published here. The average preprint has a 0.22% match score for this journal, so anything above that is already an above-average fit.

1
Bias in diversity estimators and neutrality tests induced by neutral polymorphic structural variants

Ramos-Onsins, S. E.; Ross-Ibarra, J.; Caceres, M.; Ferretti, L.

2026-02-28 genetics 10.64898/2026.02.26.708357 medRxiv
Top 0.1%
89.7%

Estimators of genetic diversity and neutrality tests derived from the site frequency spectrum (SFS), such as Watterson's θ_W, nucleotide diversity π, Tajima's D, and Fay and Wu's H, are designed to be interpreted relative to a baseline defined by the standard neutral SFS. In genomic regions strongly linked to a polymorphic structural variant (SV), deviations from these baselines occur even under strict neutrality: conditioning on an SV at known frequency partitions samples into SV and non-SV haplotypes and distorts the SFS for linked neutral mutations. These deviations are well understood for genomic inversions under long-term balancing selection. However, not all SVs are under strong selection, and the evolution of some SVs may be better approximated as neutral. Here we derive analytical expectations for the unfolded (and, when necessary, folded) SFS of single nucleotide polymorphisms conditional on neutral linked polymorphic SVs, including inversions, deletions, insertions, and introgressions. We use these expectations to quantify the resulting bias in standard diversity estimators and neutrality tests as a function of SV frequency and type. Finally, we discuss approaches to build corrected estimators of diversity and neutrality tests that are unbiased/centered after accounting for the presence and frequency of the SV.
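
For readers unfamiliar with the statistics named in this abstract, the following is a minimal sketch of how they are conventionally computed from an unfolded SFS. It uses the standard textbook definitions (Watterson 1975, Tajima 1989, the unnormalized Fay and Wu 2000 H) and is not the corrected estimators the preprint proposes. The function names `sfs_statistics` and `neutral_sfs` are illustrative and do not come from the preprint.

```python
from math import comb, sqrt


def neutral_sfs(n, theta):
    """Expected unfolded SFS under the standard neutral model: E[xi_i] = theta / i."""
    return [theta / i for i in range(1, n)]


def sfs_statistics(sfs):
    """Summary statistics from an unfolded SFS.

    sfs[i - 1] is the number of sites whose derived allele is carried by
    i of the n sampled sequences, for i = 1 .. n - 1.
    """
    n = len(sfs) + 1
    counts = range(1, n)
    S = sum(sfs)                                  # number of segregating sites
    a1 = sum(1 / i for i in counts)
    a2 = sum(1 / i**2 for i in counts)
    pairs = comb(n, 2)

    theta_w = S / a1                              # Watterson's estimator
    pi = sum(i * (n - i) * x for i, x in zip(counts, sfs)) / pairs
    theta_h = sum(i * i * x for i, x in zip(counts, sfs)) / pairs

    # Constants for the variance of (pi - theta_W), following Tajima (1989).
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * S + e2 * S * (S - 1)

    tajima_d = (pi - theta_w) / sqrt(variance) if variance > 0 else float("nan")
    fay_wu_h = pi - theta_h                       # unnormalized H
    return {"theta_w": theta_w, "pi": pi, "tajima_d": tajima_d, "fay_wu_h": fay_wu_h}
```

Under the neutral expectation returned by `neutral_sfs`, all three θ estimators coincide, so D and H are centered at zero. That is the baseline which, according to the abstract, linkage to a polymorphic SV shifts.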

2
Laboratory yeast crosses reveal limited epistasis in the genetic basis of complex traits

Gupta, M.; Holmes, C. M.; Belousova, J.; Gopalakrishnan, S.; Rego-Costa, A.; Desai, M. M.

2026-04-06 genetics 10.64898/2026.04.04.716439 medRxiv
Top 0.1%
66.0%

Mapping the genetic basis of complex traits is complicated by the presence of epistatic interactions between loci. While work in molecular genetics identifies numerous specific genetic interactions, statistical analyses of quantitative traits frequently conclude that additive (nonepistatic) models explain most heritable variation. However, these conclusions are typically limited by the narrow range of genetic relatedness (e.g., in F1 offspring of a biparental or circular cross). Here, we use a barcoded panel of Saccharomyces cerevisiae genotypes with a broad range of relatedness to quantify the effects of epistasis on the genetic architecture of seven complex traits. We find limited contributions of epistasis to the genetic basis of these traits. These results indicate that epistasis beyond that detected in standard yeast crosses may exist, yet it contributes little to phenotypic variance in these systems.

3
Histone H3 availability is more important for development than H3.2 versus H3.3 subtype identity

McPherson, J.-M. E.; Sykes, C.; Grossmann, L. C.; Hill, C. H.; Leatham-Jensen, M. P.; Duronio, R. J.; McKay, D. J.

2026-02-10 genetics 10.64898/2026.02.09.704946 medRxiv
Top 0.1%
65.1%

The distinct contributions of replication-dependent and replication-independent histones to development and genome function remain unclear. In this study, we investigate how the distinct protein identities of the histone H3.2 and H3.3 subtypes contribute to development and gene regulation in Drosophila. Comparing animals in which the replication-independent H3.3 genes were mutated to produce the replication-dependent H3.2 protein with those carrying deletions of the replication-independent H3.3 genes revealed that replication-independent H3.3 is essential for fertility, adult locomotor behavior, and normal longevity. However, development to adulthood does not depend on which replication-independent H3 subtype is expressed from the H3.3 loci. Moreover, replication-independent H3.3 is not required to establish or maintain global patterns of chromatin accessibility or gene expression in the adult brain. Surprisingly, we find that expression of H3.2 from the replication-dependent HisC locus is essential in post-replicative cells in the absence of replication-independent H3.3, and we uncover a critical role for the HIRA histone chaperone complex in preserving genome function when replication-independent H3.3 is deleted. We conclude that an available pool of H3 is more critical than the specific identity of H3 in the pool.

4
Effect of population structure and stabilizing selection on quantitative genetic variation

Li, J.; Hermisson, J.; Sachdeva, H.

2026-04-01 evolutionary biology 10.64898/2026.03.29.714437 medRxiv
Top 0.1%
58.0%

We study one of the simplest scenarios of polygenic selection that can be imagined: a subdivided population of diploid individuals expressing an additive trait under spatially homogeneous stabilizing selection. We are interested in the amounts of variation that can be maintained at mutation-selection-migration-drift equilibrium, at individual loci and at the level of the trait, within and among subpopulations. We derive analytical approximations for variance components and summary statistics such as FST and QST under the assumptions of the infinite-island model and compare these with individual-based simulations. We find that: (i) There is a critical migration threshold (which depends on effect sizes of trait loci) below which population structure strongly inflates genic variance in the subdivided population to levels well above those in a panmictic population. Variation within each subpopulation is maximized close to the critical migration rate. (ii) The genetic basis of trait variation across subpopulations is most similar close to this migration threshold and (counter-intuitively) becomes less similar at higher migration rates. This has consequences for the portability of Genome-Wide Association Studies (GWAS) between subpopulations, i.e., the extent to which loci with large contributions to variance in one subpopulation explain variance in other subpopulations. (iii) An analytical mean-field approach based on the single-locus diffusion approximation, together with effective migration and selection parameters (to account for associations between loci), very accurately predicts various quantities.

5
General moment closure for the neutral two-locus Wright-Fisher dynamics

Kundagrami, R.; Yetter, S.; Steinruecken, M.

2026-01-20 genetics 10.64898/2026.01.16.700021 medRxiv
Top 0.1%
51.4%

The Wright-Fisher diffusion and its dual, the coalescent process, are at the core of many results and methods in population genetics. Approaches have been developed to study the dynamics of its moments under genetic drift, mutation, and recombination using ordinary differential equations. The dynamics of these moments can be used to study population genetic processes and are key building blocks of efficient methods to infer population genetic parameters, like demographic histories or fine-scale recombination rates. However, the system of equations does not close under recombination; that is, computing moments of a certain order requires knowledge of moments of higher order. By applying a coordinate transformation to the diffusion generator, we show that the canonical moments in these alternative coordinates yield a closed system, enabling more accurate numerical computations. Compared to previous approaches in the literature, we believe that this approach can be more readily extended to general scenarios. Through simulations, we verify that the derived system of differential equations can accurately capture the dynamics of the moments, and can be used to efficiently compute expected diversity and linkage statistics in population genetic samples.

6
Parameterizing the genetic architecture under stabilizing selection

Lee, H.; Terhorst, J.

2026-03-27 genetics 10.64898/2026.03.27.714826 medRxiv
Top 0.1%
43.4%

Across many complex traits, genetic variants with larger effect sizes tend to occur at lower frequencies, which is often interpreted as a signature of stabilizing selection. In statistical genetics, the so-called α-model captures this relationship by assuming that effect size variance is inversely proportional to heterozygosity raised to a power 0 ≤ α ≤ 1. Although empirically useful, the α-model is phenomenological rather than mechanistic and lacks a direct population-genetic interpretation. In this paper, we derive an alternative to the α-model based on evolutionary theory. Our approach yields a linear mixed model in which the frequency dependence of effect size emerges naturally as a function of interpretable evolutionary quantities describing mutational variance, selection intensity, and coupling between the focal and selected traits. These quantities enter through two identifiable variance components that can be estimated by restricted maximum likelihood (REML). The resulting framework links a fitness-landscape model to standard mixed-model methodology, enabling both inference on evolutionary parameters and downstream prediction by best linear unbiased prediction (BLUP). In forward simulations, the model accurately recovers the focal-trait variance and generally improves genetic prediction relative to conventional α-model baselines.
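
The frequency dependence described in the second sentence of this abstract can be written out as follows. The notation is assumed here rather than taken from the preprint: β_j is the effect size of variant j, p_j its allele frequency, and α the exponent of the model.

```latex
\operatorname{Var}(\beta_j) \;\propto\; \bigl[\, 2 p_j (1 - p_j) \,\bigr]^{-\alpha},
\qquad 0 \le \alpha \le 1 .
```

At α = 0 the effect-size variance does not depend on frequency. At α = 1 it is inversely proportional to heterozygosity, so the expected contribution of each variant to trait variance, 2 p_j (1 − p_j) β_j², is the same at every frequency.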

7
The fate of mutations on Y chromosomes and autosomes: a unified Wright-Fisher framework accounting for segregation time

Offenstadt, A.; Billiard, S.; Giraud, T.; Veber, A.; Jay, P.

2026-04-03 evolutionary biology 10.64898/2026.04.01.715871 medRxiv
Top 0.1%
43.1%

Understanding how mutations evolve on Y chromosomes is central to explaining the origin, diversity and persistence of sex chromosomes. Mutations occurring on the Y chromosome in sexual populations experience selective dynamics that differ markedly from those on autosomes, due to a reduced effective population size and the presence of large non-recombining regions containing alleles maintained in a permanently heterozygous state. These specific features alter gene transmission in the Y chromosome population compared to autosomes, even within the same pedigree. Here, we provide a two-sex diploid Wright-Fisher model that explicitly incorporates both sex chromosomes and autosomes within a unified population framework, in order to capture the influence of these specificities on the fate of mutations, not only considering fixation probabilities but also segregation times. We use diffusion approximations and provide analytical and numerical tools to compute these quantities across a wide range of parameters and selection regimes. We recover classical results on fixation probabilities in various scenarios, including purely beneficial, deleterious or overdominant mutations, and extend them in the light of mean segregation time, a key but often overlooked determinant of evolutionary outcomes over finite timescales. In particular, our analyses show that overdominant mutations are overall more likely to fix in observable time windows on the Y chromosome than on autosomes. Individual-based simulations corroborate our approximations and highlight parameter regimes where the theoretical approach is particularly useful, especially for parameter values inducing long segregation times or small fixation probabilities, for which simulations are impractical. Our results provide a comprehensive and tractable framework for clarifying how chromosome-specific features shape evolutionary dynamics beyond fixation probabilities alone.

8
Stabilizing selection on a polygenic trait from the gene's-eye view.

Courau, P.; Schertzer, E.; Lambert, A.

2026-03-06 evolutionary biology 10.64898/2026.02.23.706325 medRxiv
Top 0.1%
40.6%

We study a polygenic trait under stabilizing selection at statistical equilibrium, where genetic effect, mutation rate and mutational bias are heterogeneous across loci. The model assumes L biallelic sites subject to reversible mutations, each allele described by its frequency in the population. Using a diffusion approximation, a mean-field approximation and neglecting linkage disequilibrium, we predict consistent phenomena across several regimes of selection: (1) a small deviation Δ* of the trait mean from its optimal value appears and persists due to genetic mutations not aligned with selection; (2) while this deviation is often undetectable at the trait level, it leaves a substantial signature at the locus level by favoring alleles reducing it, resulting in genic selection with mean coefficient s* proportional to -Δ* acting pervasively; (3) with stronger selection on the trait, (3a) the value of Δ* is decreased but the intensity of genic selection is increased in inverse proportion, resulting in an essentially constant, non-negligible value of s*. We show how the stationary distribution of allelic frequencies can be obtained from Δ*. The latter can then be characterized as the solution to a fixed-point equation. Finally, we quantify several macroscopic observables of interest (genetic variance, description of the fluctuations of the trait mean as an Ornstein-Uhlenbeck process). The orders of magnitude of the macroscopic observables can be derived on a wide region of the parameter space. The model shows good fit and can straightforwardly be extended to accommodate pleiotropy, dominance, and some forms of epistasis. We also discuss the different breakdowns that may occur (Bulmer effect, Hill-Robertson effect, breakdown of the Ornstein-Uhlenbeck approximation for the dynamics of the trait mean, depletion of genetic variability due to low mutation rates).

9
Altering dosage of meiotic crossover-associated RING finger proteins affects crossover number and interference in Drosophila

Frantz, E.; Santa Rosa, P.; McMahan, S.; Sekelsky, J.

2026-02-19 genetics 10.64898/2026.02.18.706578 medRxiv
Top 0.1%
38.0%

Crossovers play a critical role in ensuring correct reductional segregation of homologous chromosomes in the first meiotic division. Crossing over is initiated by formation of DNA double-strand breaks (DSBs), but the number of DSBs is greater than the number of crossovers. Which recombination sites become crossovers, versus being repaired as non-crossovers, is not random, but is subject to several crossover patterning phenomena, including crossover assurance and crossover interference. One current model for crossover designation proposes that crossover-associated RING finger proteins (CORs) undergo the biophysical process of coarsening, in which larger accumulations continue to get larger and smaller accumulations go away. Genetic and cytological studies of the three CORs in Drosophila melanogaster, Vilya, Narya, and Nenya, are consistent with this model. In females heterozygous for a deletion of vilya, fewer double crossovers are observed. Conversely, crossovers are elevated in females carrying a duplication of vilya and in females coordinately overexpressing Vilya, Narya, and Nenya. These findings support a model in which crossover designation occurs through coarsening of COR proteins within the synaptonemal complex.

10
Genetic determinants of intraspecific variation in crossover frequencies in the honeybee, Apis mellifera

Everitt, T.; Trinh, L. H.; Taliadoros, D.; Ronneburg, T.; Olsson, A.; de Miranda, J. R.; Servin, B.; Webster, M. T.

2026-02-05 genomics 10.64898/2026.02.03.702590 medRxiv
Top 0.1%
32.7%

Meiotic recombination facilitates natural selection and is necessary for correct chromosomal segregation in most sexually reproducing species. Crossover rates vary greatly both within and among species, but the determinants of this variation are not fully understood. The honeybee Apis mellifera has extremely high recombination rates. Honeybee males (drones) are haploid, which enables the distribution of crossovers to be directly estimated from the progeny of a single reproductive female (queen). Here we map crossover events in the honeybee using whole genome sequencing of 1509 drone progeny of 184 queens. This allows us to assay intra-specific variation in recombination rate and its genetic and non-genetic determinants. We estimate the average crossover rate as 23 cM/Mb, with between 22 and 88 crossover events detected in individual offspring. We estimate 28% of this variation is additive heritable variation among queens. There is no effect of queen age or genetic background on crossover rate. A genome-wide association study identifies variation in the gene mlh1 as associated with mean crossover rate. We estimate that variation in the gene is associated with a 10% difference in crossover rate between the two homozygous genotypes at the most significant SNP. This gene has a well-established role in recombination and variation in the gene could affect crossover rates by affecting resolution of Holliday junctions as crossovers. This is the first gene discovered to be associated with recombination rate variation in an insect. Adaptive evolution of this gene could potentially underlie the extremely high recombination rates in honeybees.

11
Elevated mutation in haploid yeast driven by translesion synthesis

Fredette-Roman, J.; Smith, D. R.; Omari, S. B.; Sharp, N.

2026-01-23 evolutionary biology 10.64898/2026.01.22.701062 medRxiv
Top 0.1%
28.5%

The impact of selection versus genetic drift on the evolution of mutation patterns is unclear. In Saccharomyces cerevisiae, which is predominantly diploid in nature, there is evidence that haploid cells have a higher mutation rate than diploids, suggesting that a haploid-specific mutator phenotype may have evolved due to the limited opportunity for selection to act on this rare cell type. Mutation in haploids was primarily elevated in late-replicating regions of the genome, implicating error-prone translesion synthesis (TLS) repair. Additional research has demonstrated that removing REV1, a gene responsible for initiating TLS, causes a reduction in haploid mutation rate. To assess whether the preferential use of this error-prone repair pathway by haploids explains the difference in genome-wide mutation patterns between cell types, we deleted REV1 in both diploid and haploid S. cerevisiae and estimated their mutation rates using a mutation accumulation experiment. Consistent with a previous study, we found a 50% higher single nucleotide mutation rate in REV1+ haploids than in REV1+ diploids. Deleting the REV1 gene caused this difference to vanish, with mutation rates in haploid and diploid rev1Δ lines converging on 2.4 × 10⁻¹⁰. Our results suggest that the mutagenic effect of translesion synthesis is much stronger in haploids, reflecting a limited opportunity for selection to act on mutation rates in rarer cells or smaller populations. We also find evidence that REV1 plays an important role in mitochondrial genome maintenance in both cell types.

12
Mapping Gene Drive Dynamics onto Mendelian Models

Wen, Z.; Wan, M.; Greenbaum, G.; Carja, O.

2026-01-30 evolutionary biology 10.64898/2026.01.28.702305 medRxiv
Top 0.1%
25.8%

CRISPR-based gene drives bias their own transmission and can spread even when deleterious, giving rise to evolutionary dynamics that can be substantially more complex than those governed by standard Mendelian inheritance. Identifying conditions under which gene-drive dynamics can be faithfully approximated by Mendelian models would therefore enable the extensive theoretical toolkit of classical population genetics to be applied to gene-drive systems. Here, we develop a general mapping framework that translates gene-drive models into dynamically equivalent Mendelian models, allowing their behavior to be analyzed using classical theory. By deriving both haploid and diploid effective-parameter mappings, we identify Mendelian models that closely reproduce allele-frequency trajectories of gene drives across a wide range of conversion rates, fitness costs, and dominance effects. We delineate the regions of the parameter space where a one-parameter haploid approximation provides an accurate first-order representation, and where incorporating dominance in a diploid mapping substantially improves fidelity and recovers internal equilibria and threshold behavior. Analytic approximations yield efficient mappings across most of the drive parameter space, while a trajectory-based grid search further improves accuracy near nonlinear regime boundaries. To demonstrate the utility of this framework, we apply it to predicting gene swamping in a two-deme migration-selection model and show that the mapped Mendelian system accurately forecasts transitions between fixation and loss under three relevant release scenarios: environmental variation in fitness, engineered fitness asymmetries, and environment-dependent conversion. 
Together, these results establish a theoretical bridge between non-Mendelian gene drives and classical population genetic models, providing an interpretable and computationally efficient foundation for predicting gene-drive outcomes and guiding the design of gene drive systems and deployment strategies.
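
To make the idea of mapping a drive onto a Mendelian model concrete, here is a toy sketch. It is not the authors' mapping framework. It assumes an idealized homing drive with conversion rate `c`, no fitness cost, and random mating, and compares it with ordinary haploid selection in which the selective advantage is simply set to `s = c` as a first-order match. All function names are illustrative.

```python
def drive_step(p, c):
    """One generation of a cost-free homing drive under random mating.

    Heterozygotes transmit the drive allele with probability (1 + c) / 2,
    so p' = p^2 + p(1 - p)(1 + c) = p + c * p * (1 - p).
    """
    return p + c * p * (1 - p)


def haploid_step(p, s):
    """One generation of Mendelian haploid selection with advantage s."""
    return p * (1 + s) / (1 + s * p)


def trajectory(step, p0, param, generations):
    """Iterate a one-generation update and return the allele-frequency path."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(step(freqs[-1], param))
    return freqs


if __name__ == "__main__":
    c = 0.05  # assumed conversion rate, chosen only for illustration
    drive = trajectory(drive_step, 0.01, c, 300)
    mendel = trajectory(haploid_step, 0.01, c, 300)
    gap = max(abs(d - m) for d, m in zip(drive, mendel))
    print(f"largest frequency gap between the two trajectories: {gap:.4f}")
```

In this simplified setting the two updates agree to first order in `c`, which is why a single effective selection coefficient can stand in for a weak drive. Per the abstract, fitness costs and dominance are what make the diploid mapping and the grid-search refinement necessary.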

13
Experimental Evolution of Yeast Reveals Trade-offs Between Early and Late Stationary Phase

Tarkington, J. A.; Sherlock, G. J.; Mahadevan, A.

2026-03-12 evolutionary biology 10.64898/2026.03.12.711341 medRxiv
Top 0.1%
25.7%

Stationary phase in yeast and other microorganisms begins when a limiting nutrient in the environment is exhausted and cell division ceases. Most cells subsequently enter quiescence and lose viability. In spent media, without metabolic byproducts being diluted, cellular processes can modify the environment and cause the relative growth rates of different genotypes to vary over the course of stationary phase. In this work we experimentally evolve S. cerevisiae in batch culture, varying the time spent in stationary phase between growth cycles. We measure the relative fitness of the resulting adaptive clones across a range of environments: with different amounts of time in stationary phase and in two different carbon sources. By comparing the inferred performance (relative growth rate during a period of the growth cycle) of a mutant to that of its ancestor, we can estimate the effects of each observed mutation on performance during various phases of growth. We show that when an adaptive mutation emerges in growth cycles that include a stationary phase, its effect on stationary phase performance is largely independent of the type of carbon source provided. However, for the same group of mutants, mutational effects on performance in early stationary phase are negatively correlated with those effects in late stationary phase, suggesting a trade-off. We also show that increased intervals of stationary phase result in larger fitness effects of adaptive mutations and distinct routes of adaptation. Together, these results demonstrate that stationary phase consists of more than one distinct fitness-related phenotype, and that the phenotypes that allow for high performance in the first few days of stationary phase trade off with those that allow for high performance in later stationary phase.

14
Zinc-Finger Motifs Unique To Arabidopsis Thaliana Paralogs Rpa1C And Rpa1E Are Required For Rpa-Dependent DNA Repair

Mills, I.; Culligan, K. M.

2026-01-28 genetics 10.64898/2026.01.26.701817 medRxiv
Top 0.1%
23.2%

RPA is a heterotrimeric ssDNA binding protein that is highly conserved across all eukaryotes. Arabidopsis (Arabidopsis thaliana) has five RPA1 paralogs divided into three groups (A, B, C) each with unique functions in DNA replication and repair. The group C paralogs (RPA1C and RPA1E in Arabidopsis) function specifically in DNA-damage repair and carry a C-terminal extension unique to group-C paralogs. This C-terminal extension contains a zinc finger motif (ZFM) that is highly conserved and is therefore predicted to be critical to the functionality of the paralogs during DNA damage repair. To address this, we employed a CRISPR-Cas9 strategy to specifically remove the ZFM from RPA1C or RPA1E while leaving the genes otherwise intact (termed C-ZFKO and E-ZFKO). C-ZFKO and E-ZFKO lines were challenged with DNA damaging agents, and their susceptibility was compared to both WT (Col-0) lines and to previously characterized T-DNA null mutants (rpa1c and rpa1e). To address the role of the respective ZFMs in homologous recombination pathways (HRR), we employed a GUS-reporter system to compare WT lines to C-ZFKO and E-ZFKO lines. We find here that C-ZFKO and E-ZFKO lines displayed hypersensitivity to DNA damaging agents at a level comparable to previously characterized T-DNA null mutants (rpa1c and rpa1e). When studying the rate of HRR, both C-ZFKO and rpa1c showed a drastic reduction in single-strand annealing (SSA) while E-ZFKO and rpa1e had a more modest, but still significant decrease. All mutant lines had a comparable decrease in synthesis-dependent strand annealing (SDSA) compared to WT. Thus, we show here that the respective RPA1C and RPA1E-encoded ZFM is crucial for the ability of each paralog to function during DNA damage repair.

15
Natural selection on synonymous genetic variation in the major histocompatibility complex

Roved, J.

2026-02-23 evolutionary biology 10.64898/2026.02.23.707394 medRxiv
Top 0.1%
22.9%

Protein coding DNA sequences harbor synonymous nucleotide variation that does not change amino acid sequences but influences phenotypes via multiple effects on the pathway from gene to protein. Synonymous variation has recently been shown to coevolve between viruses and their natural hosts, but its potential role in host immune defenses has not been explored. Here, I present evidence of natural selection on synonymous variation in the major histocompatibility complex (MHC), a highly polymorphic multigene locus that plays a crucial role in pathogen recognition by the adaptive immune system of vertebrate species. Using data from a wild population of Great Reed Warblers, I show that codon usage in exon 3 of MHC class I (MHC-I) genes is under strong purifying selection in 56 out of 87 codon sites. Scanning the Great Reed Warbler genome for tRNA genes revealed that, for most amino acids, bias towards preferred codons was associated with abundances of tRNA isotypes, indicating that the purifying selection is likely driven by selection for increased translational efficiency. However, spikes of synonymous variation appeared in 31 of the 87 sites in the MHC-I exon 3, and in those sites, codon usage bias and correlations with tRNA abundances were reduced. The distribution of the spikes of synonymous variation showed no consistent association with structural domains of the MHC-I protein, nor with sites under positive selection for amino acid change, which are considered important for antigen binding properties. Intriguingly, the amount of synonymous variation in genotypes showed a positive correlation with Darwinian fitness, indicating that important evolutionary forces are at play that neutralize purifying selection in the 31 sites. 
From an ultimate perspective, the release of purifying selection among certain sites in MHC genes may indicate an arms race with pathogens, and I propose that the spikes of synonymous variation may reveal a footprint of natural selection on MHC genes to escape inhibitory molecular interactions between intracellular pathogens and MHC mRNA. Unravelling the mechanisms of such interactions should be of great importance to our understanding of this extremely important locus and I hope that the results and methodological advancements presented here will spark future studies of synonymous variation in the MHC and its biological effects.

16
Optimal spatial release strategies for confined gene drives and Wolbachia

Wang, Z.; Champer, J.

2026-03-06 genetics 10.64898/2026.03.04.709515 medRxiv
Top 0.1%
22.8%

Gene drives are genetic elements that can rapidly spread through populations, offering potential solutions for controlling disease vectors and pests. In some scenarios, it is necessary to utilize drives that can be confined to only target populations. The success of these threshold-dependent gene drives, which require a minimum local frequency to establish, depends critically on the spatial strategy used for introduction. Here, we use a reaction-diffusion model to systematically identify optimal release patterns that maximize the per-capita efficiency for four distinct gene drive designs as well as use of Wolbachia bacteria, which spread similarly to frequency-dependent gene drives. We find that the most efficient release strategy is highly dynamic, transitioning from a broad "everywhere" release for short timeframes to a "multiple-ring" pattern for intermediate times, and finally to a focused "center" release for longer timeframes. These timeframes depend on the specific type of drive, with more powerful variants transitioning more quickly to center releases. Our results demonstrate that these optimized, variable release strategies can be substantially more effective than simple uniform releases. This study provides a quantitative framework for designing effective gene drive implementations, highlighting that a carefully planned spatial strategy is essential for maximizing impact, making optimal use of available resources.

17
A mdg4 Retrotransposon Screen for X-linked Female Sterile Alleles and its Relationship with the Transcription Factor OVO

Benner, L.; Oliver, B. C.

2026-02-14 genetics 10.64898/2026.02.12.705638 medRxiv
Top 0.1%
22.3%

In the germline, the mdg4 retrotransposon integrates in close proximity to the location of OVO DNA binding motifs, suggesting that insertion bias is driven by the OVO transcription factor. A classical genetic example of this is the reversion of the dominant female-sterile allele, ovoD1, by the transposition of mdg4 into the ovo promoter where OVO protein binds. We wanted to take advantage of this relationship and determine if we could recover female sterile alleles along the X chromosome due to mdg4 insertion, with the hypothesis that these would be genes that OVO binds and transcriptionally regulates in the germline. We mobilized the mdg4 retrotransposon with the use of mutants for the lncRNA gene flamenco (flam) and recovered 17 recessive female sterile alleles out of a total of 1,192 chromosomes screened. We identified 11 complementation groups, in 7 of which an mdg4 insertion was responsible for female sterility. Notably, a complementation group consisting of 6 alleles was found to be the result of a Doc transposable element insertion into the gene Grip91 and is potentially evidence for a Doc insertional hotspot in the genome. Our screen also uncovered that 7/17 recessive female sterile chromosomes contained multiple transposable element insertions, indicating that flam- females derepress numerous transposable elements that can lead to multiple transposon insertions along a single chromosome, as has been suggested previously. Altogether, we found that mdg4 did have an insertion bias into OVO-bound regions of the genome that can result in female sterility; however, this was the case for a minority of the female sterile alleles recovered with this method.

Article Summary: The retrotransposon mdg4 preferentially inserts near binding sites of the female germline transcription factor OVO in Drosophila melanogaster, most notably at the ovo locus itself. We leveraged this relationship to screen for X-linked recessive female-sterile mutations generated by mdg4 mobilization in flamenco mutant females. From 1,192 chromosomes, we recovered 17 female-sterile alleles across 11 complementation groups. mdg4 insertions were significantly enriched in OVO-bound regions but accounted for only a subset of sterility phenotypes, revealing substantial background mutagenesis by other transposable elements. These results refine the OVO-mdg4 relationship and highlight both the promise and limitations of transposon-based genetic screens.

18
Increased variability and reduced phenotypic robustness in clonal Drosophila mercatorum

Kahraman, A.; Wirth, M.; Hammoud, H.; Reslan, M.; Haidar, M. A.; Djuhadi, G.; Mathejzyk, T.; Reifenstein, E.; Balke, J.; von Kleist, M.; Linneweber, G. A.

2026-04-06 genetics 10.64898/2026.04.02.716190 medRxiv
Top 0.1%
22.3%

Phenotypic variation arises from the interplay of genetic, environmental, and stochastic developmental factors. Quantitative genetics predicts that reducing genetic variation through inbreeding or clonality should reduce phenotypic variation, an assumption that underlies the widespread use of inbred and clonal model organisms in biomedical research. Here, we test this assumption in the facultatively parthenogenic fly Drosophila mercatorum, in which parthenogenesis results in complete homozygosity and clonality after a single generation. Contrary to expectation, clonal parthenogenic flies showed broad shifts in trait means, altered interindividual variability, increased fluctuating asymmetry, and reduced behavioral and developmental canalization relative to sexually reproducing controls. Inbreeding reproduced substantial parts of this phenotype, whereas outcrossing restored robustness, identifying loss of heterozygosity as a major driver of the effect. Our findings show that extreme genetic uniformity can amplify rather than constrain stochastic phenotypic divergence, suggesting that controlled heterozygosity may, in some contexts, provide a more robust and reproducible experimental substrate than highly inbred, isogenic, or clonal animals.

19
Linking Codon- and Protein-Level Mutation Scores to Population Genetics Reveals Heterogeneous Selection Efficiency Across Escherichia coli Lineages

Mischler, M.; Vigue, L.; Croce, G.; Weigt, M.; Tenaillon, O.

2026-03-18 genetics 10.64898/2026.03.16.711857 medRxiv
Top 0.1%
22.0%

Quantifying the selective effects of individual mutations is essential to understand how their population-wise frequencies evolve under natural selection and genetic drift. Large genomic datasets provide a real-life experiment that we exploit to characterize the efficiency of selection across different mutation types and populations. Using Direct Coupling Analysis, a model from statistical physics, we derive protein-informed scores for individual non-synonymous mutations identified in 81,440 Escherichia coli genomes. We show that these scores act as a latent variable capturing the probability that a mutation is beneficial, neutral, or mildly to highly deleterious. We contribute to the debate on the importance of synonymous mutations by demonstrating that their selection intensities span a single order of magnitude in the E. coli species, whereas non-synonymous mutations span six orders of magnitude. We further relate selection efficiency to genetic drift, defined as the inverse of population size, and to ecological lifestyle, and we identify a 10,000-fold reduction in selection efficiency between the entire E. coli species and its most pathogenic populations. Together, these results highlight how population genetics and protein variant fitness predictors inform one another: variation in selection efficiency is associated with shifts in the distribution of mutation scores, and population genetics data provide a benchmark to assess the accuracy of these scores.

Graphical abstract: Schematic representation of the analysis of polymorphism in 81,440 Escherichia coli genomes. 458,443 polymorphic codon sites were identified and oriented using homologous sequences from closely related species. Mutations can be classified as synonymous or non-synonymous based on whether they alter the encoded amino-acid sequence, and real-valued scores predictive of fitness effects can be attributed to mutations within each of these classes. Codon scores reflect the global codon usage preference within the E. coli genome. DCA scores capture position- and amino-acid-specific preference as well as epistatic constraints and are obtained for each protein from a set of distantly related homologous sequences. Coupled with the abundance of polymorphic sites within different E. coli subpopulations, these different polymorphism classifications allow a precise comparison of the intensity of selection between different types of mutations and across populations with distinct lifestyles, illustrated here by their pathogenic power.

20
Position-dependent variant effects reveal importance of context in genomic regulation

Aninta, S. I.; Tewhey, R.; de Boer, C. G.

2026-03-18 genomics 10.64898/2026.03.17.712488 medRxiv
Top 0.1%
22.0%

Gene expression is governed by the DNA sequence, which is read out through complex interactions between transcription factors (TFs), co-activators, and chromatin. Massively Parallel Reporter Assays (MPRAs) provide a high-throughput framework for functionally characterizing how regulatory DNA sequences impact the expression of a model gene. MPRAs have also proven to be useful for measuring the effects of genetic variation, where each allele is typically tested in the center of ~200 bp of genomic context cloned into the MPRA; but the impact of variant position and local context remains largely unexplored. In this study, we systematically investigate how shifting the position of a variant within an MPRA probe influences its regulatory activity using models that predict expression in MPRAs from DNA sequence. We find that while the direction of variant effects is usually preserved across positions, the magnitude of expression changes can vary substantially depending on where the variant is placed within the construct. This positional bias appears to be largely explained by the strong position-dependent activity of TFs whose binding the variants perturb. In a subset of cases, interactions consistent with cooperativity between TFs also contribute to position-specific effects. About 1% of variants appear to disrupt RNA polymerase III (Pol III) promoters within Alu elements, resulting in position-specificity because both A and B boxes are required for function and exclusion of either motif due to window shifts disrupts the variant's effect. However, we saw little evidence to support the hypothesis that the positional dependence of variant effects resulted from redundancy of motifs. Overall, our study demonstrates the complexity of cis-regulatory grammar and how it can confound the interpretation of regulatory variants.