GENETICS — Latest Matching Preprints

1

The impact of P-Element-induced hybrid dysgenesis on the male germline in Drosophila simulans

Griffin, J. S.; Harney, E.; Capes, C.; Connell, R.; Betancourt, A. J.; Romero-Soriano, V.

2026-07-01 genetics 10.64898/2026.06.28.735054 medRxiv

Top 0.1%

51.1%

The P-element, a DNA transposon, has independently invaded two Drosophila species, accompanied by rapid evolution of suppression. In the germline, suppression is mediated primarily by maternally expressed piRNAs, a class of regulatory small RNAs associated with PIWI proteins. The offspring of females that lack P-element-specific piRNAs and males that contain P-elements suffer a syndrome of deleterious phenotypes, including sterility, genome rearrangements, gonadal atrophy, and mutations, while the offspring of the reciprocal cross are normal. These effects, collectively termed hybrid dysgenesis, have been investigated primarily in female D. melanogaster. Here, we study hybrid dysgenesis in male D. simulans. Using an attached-X chromosome stock, we generated genetically identical F1 males that differed only in maternal suppression of the P-element. Using targeted sequencing of P-element breakpoints, we show that P-element transposition is elevated in dysgenic males and confirm a preference for insertion near origins of replication. Using transcriptomics, we show that dysgenic males have elevated P-element expression and reduced splicing suppression, with patterns of gene expression suggesting the loss of mature sperm cells. Fertility assays show higher rates of male sterility but otherwise modest effects on fertility. In conjunction with the transcriptomic data, small RNA sequencing confirms that the piRNA pathway functions in testes. Our results suggest that the P-element may spread more readily through males than females, as transposition rates are similar while fertility defects are less severe in males.

2

Why linkage disequilibrium measures disagree: Fisher geometry of rare common haplotype structure

Ichikawa, Y.

2026-07-07 genetics 10.64898/2026.07.02.736022 medRxiv

Top 0.1%

34.2%

Conventional LD measures such as r² perform poorly in the rare-common regime, particularly in asymmetric configurations such as nested haplotype structure. Because r² is symmetric and quadratic, it removes directional structure in two ways: squaring discards the sign, or phase, retained by the signed LD coefficient D, while symmetric normalization hides the asymmetry between the conditional probabilities P(A|B) and P(B|A). Although D recovers the phase, it is locus-symmetric and unnormalized; its magnitude is hard to compare across frequency regimes and it does not by itself express which way the asymmetry runs. We therefore analyze the conditional-probability asymmetry Δ = P(A|B) - P(B|A), together with r² and D, as distinct scalar functions on the haplotype simplex under the Fisher information metric. The conditional probabilities P(A|B) and P(B|A) are bounded in [0, 1], directly express carrier-set inclusion, and are more readily visualized than D. Moreover, their difference admits the exact decomposition Δ = M + C into a marginal frequency term M and an LD-coupled term C. Prior work has characterized either the mathematical behavior of LD normalizations across allele-frequency space or the Fisher geometry of the haplotype simplex, but not their connection. We bridge this gap by showing that the geometric structure of the simplex explains why LD measures disagree in the rare-common regime and why symmetric normalizations such as r² lose directional information. We show that the fixed-frequency leaf is intrinsically anisotropic, positively curved, and frequency-dependent under the Fisher metric. These geometric predictions are tested empirically, in phased 1000 Genomes data [1] and a two-locus Wright-Fisher model, in a companion paper (Ichikawa, preprint); the present note develops the geometry itself.
Keywords: linkage disequilibrium; Fisher information metric; haplotype simplex; rare variant; conditional-probability asymmetry; nested haplotype structure
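
As a rough illustration of the quantities this abstract compares, the sketch below computes D, r² and the two conditional probabilities from four haplotype frequencies, using the standard textbook definitions. The function name `ld_summary`, its argument names and the example frequencies are assumptions made for illustration and do not come from the preprint. The decomposition Δ = M + C is not reproduced because the abstract does not define M and C.

```python
def ld_summary(x_AB, x_Ab, x_aB, x_ab):
    """Summarize two-locus LD from haplotype frequencies (hypothetical helper).

    x_AB, x_Ab, x_aB, x_ab are the frequencies of the four haplotypes
    and must sum to 1. Both loci must be polymorphic.
    """
    total = x_AB + x_Ab + x_aB + x_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    p_A = x_AB + x_Ab          # marginal frequency of allele A
    p_B = x_AB + x_aB          # marginal frequency of allele B
    if not (0.0 < p_A < 1.0 and 0.0 < p_B < 1.0):
        raise ValueError("both loci must be polymorphic")

    D = x_AB - p_A * p_B       # signed LD coefficient
    r2 = D * D / (p_A * (1.0 - p_A) * p_B * (1.0 - p_B))
    p_A_given_B = x_AB / p_B   # fraction of B carriers that also carry A
    p_B_given_A = x_AB / p_A   # fraction of A carriers that also carry B

    return {
        "p_A": p_A,
        "p_B": p_B,
        "D": D,
        "r2": r2,
        "p_A_given_B": p_A_given_B,
        "p_B_given_A": p_B_given_A,
        "delta": p_A_given_B - p_B_given_A,
    }


# Illustrative nested configuration: every A carrier also carries B.
summary = ld_summary(x_AB=0.05, x_Ab=0.0, x_aB=0.45, x_ab=0.5)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

In this made-up nested configuration, P(B|A) equals 1 while r² is only about 0.053, which is the kind of disagreement between measures that the abstract describes.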

3

Formation, persistence, and breakdown of carrier-set topology in linkage disequilibrium: empirical structure in 1000 Genomes and a two locus Wright Fisher model

Ichikawa, Y.

2026-07-01 genetics 10.64898/2026.07.01.735767 medRxiv

Top 0.1%

31.1%

Linkage disequilibrium between two biallelic loci is usually summarized by scalar association measures such as r² and D'. These measures quantify how visible an allelic association is to a symmetric LD scan, but they do not directly represent the topology of carrier sets: whether the carriers of one variant are contained within, partially overlap with, or are disjoint from the carriers of the other. This distinction is structural. On the haplotype-frequency simplex, carrier-set inclusion corresponds to a boundary face where one haplotype class is absent. In the rare-common regime, a nested rare variant is further constrained by the ceiling r² ≤ pA/pB, so that complete carrier-set inclusion can remain nearly invisible to r². Here, as a companion to the Fisher-geometry preprint [1], we examine the empirical and dynamic behavior of this carrier-set topology. In 1000 Genomes Phase 3, across 156,604,320 SNP pairs from the MHC and NEGR1 regions, pairs on the |D'| = 1 boundary span a wide range of r² and |C|. Within fixed r² strata, r² poorly distinguishes nested from non-nested carrier-set configurations, with AUROC values of approximately 0.54-0.62, whereas the boundary-sensitive normalization |D'| separates them much more effectively, with AUROC values of approximately 0.90-0.92. The empirical data also obey the predicted r² ≤ pA/pB ceiling. We then introduce a temporal axis using a two-locus Wright-Fisher model on the same simplex. Carrier-set topology evolves through three motions relative to the |D'| = 1 boundary: formation or persistence, in which recombination suppression establishes and maintains inclusion without requiring selection; visibility change, in which selection or drift moves r² along the boundary while preserving the inclusion relation; and breaking, in which a recombination pulse introduces the previously absent haplotype and dissolves inclusion.
A fourth mode, specificity erosion, expands the partner carrier set while preserving inclusion, thereby lowering P(A|B) while keeping P(B|A) and |D'| equal to one. This mode shows that asymmetric conditional probabilities are best understood as diagnostic coordinates for carrier-set topology, not as the primary object itself. Together, these results show that topology and visibility are separable axes of LD structure. Conventional r²-based scans and carrier-set topology scans therefore answer complementary, not interchangeable, questions.
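
The r² ceiling quoted in this abstract can be checked with a short derivation from the standard definitions of D and r². The haplotype-frequency symbol x_AB is notation assumed here for illustration; the abstract names only pA, pB, r² and D'. Assume A is the rarer variant (pA ≤ pB < 1) and that every carrier of A also carries B.

```latex
\begin{align*}
x_{AB} &= p_A
  && \text{(nesting: no $A$ haplotype lacks $B$)} \\
D &= x_{AB} - p_A p_B = p_A (1 - p_B) \\
r^2 &= \frac{D^2}{p_A (1 - p_A)\, p_B (1 - p_B)}
     = \frac{p_A (1 - p_B)}{p_B (1 - p_A)}
     = \frac{p_A}{p_B} \cdot \frac{1 - p_B}{1 - p_A}
     \le \frac{p_A}{p_B}
\end{align*}
```

The final inequality holds because (1 - pB)/(1 - pA) ≤ 1 whenever pA ≤ pB. For example, with pA = 0.01 and pB = 0.5, a fully nested pair has r² ≈ 0.0101 even though P(B|A) = 1.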

4

Paralogs of the Candida albicans TLO gene family form interconnected functional networks with incomplete redundancy

Simonton, E.; Cangelosi, N.; Zhou, M.; Hendricks, P. S.; Woodruff, A. L.; Anderson, M. Z.

2026-07-02 genetics 10.64898/2026.06.29.735307 medRxiv

Top 0.1%

30.5%

Gene duplications typically fail to confer a selective advantage on an organism, prompting their removal from a population. In the rare instance that a duplication either does not incur a fitness cost or enhances fitness, gene families can form through repeated duplication. While the function of gene duplicates has been studied in detail, little work has explored how repeated duplication impacts paralog redundancy and may restrict the emergence of new paralogs or novel function. Here, we constructed a panel of single deletion mutants for each of the 14 members of the Candida albicans telomere-associated (TLO) gene family to test the redundancy in molecular and biological function among paralogs from a lineage-specific expansion. Tlo proteins function as interchangeable subunits of the Mediator transcriptional regulatory complex and have the potential to alter gene expression and an array of cellular responses. Redundancy was the most common outcome, being observed for approximately 80% of the phenotypic assays in strains lacking single TLO genes. However, mutants for all 14 paralogs displayed non-redundant functions in phenotypes ranging from carbon utilization to in vivo virulence. Analysis of gene expression in single TLO mutants found similar trends in redundancy, and loss of single TLOs disproportionately affected genes involved in filamentation, adhesion, redox reactions, and transporter activity at the cell surface. Importantly, sequence divergence between paralogs positively correlated with the frequency of altered phenotypes in single TLO mutants, indicating the acquisition of non-redundant function with increased evolutionary distance. Double mutants lacking two TLO genes produced both positive and negative synergistic phenotypes, suggesting that crosstalk or coordinated regulation is common among paralogs.
Together, this study demonstrates that recently emergent paralogs acquire non-redundant functions despite often retaining redundancy with other gene family members to form a highly interconnected functional network.

5

Nemo2.4: fast and accurate quantitative genetics forward-time simulations

Guillaume, F.; Cotto, O.; Chebib, J.; Beeravolu Reddy, C.; Schmid, M.

2026-07-08 evolutionary biology 10.64898/2026.07.02.736177 medRxiv

Top 0.1%

30.1%

We present Nemo 2.4, an advanced forward-time individual-based simulation framework designed to model the complex eco-evolutionary dynamics and genetic basis of quantitative traits. This tool addresses current challenges in evolutionary quantitative genetics by providing unprecedented flexibility and computational efficiency. Nemo 2.4's modular architecture allows researchers to design custom life cycles by combining specialized Life Cycle Event (LCE) modules, from reproduction and dispersal to selection, crossing, and phenotype expression. The software supports diverse population models, including both Wright-Fisher (WF) and non-WF dynamics, spatially explicit models, and varying demography. Nemo 2.4 handles a wide range of genetic architectures, including both multi-allelic Quantitative Trait Loci (QTL) for general trait studies, and dense di-allelic Quantitative Trait Nucleotides (QTN) implemented with highly optimized bit-wise data structures. Crucially, it allows the simulation of QTNs on comprehensive genetic maps that incorporate other genetic elements, providing genomic-scale resolution. Key biological complexities are integrated natively: the model accommodates modular pleiotropy, dominance, and pairwise epistasis across multiple traits, facilitating the study of complex genotype-phenotype mappings. Furthermore, Nemo 2.4 models phenotypic plasticity through reaction norms and incorporates underlying liability thresholds, enabling the simulation of environmental influences on trait evolution with various forms of selection (e.g., Gaussian, linear, truncation). Due to its compiled design and memory-efficient data representations for large numbers of loci, Nemo provides a robust platform for running high-throughput simulations critical for testing theoretical predictions in polygenic adaptation and understanding evolutionary responses to changing environments.

6

Model-free inference of evolution from allele frequency timeseries using permutation tests

Bertram, J.; Kushnir, A.

2026-07-03 evolutionary biology 10.64898/2026.07.01.735864 medRxiv

Top 0.2%

21.8%

Allele frequency (AF) timeseries allow us to directly observe the dynamics of evolution at a genetic level. However, extracting useful inferences from AF timeseries has proved difficult due to the model uncertainties and noisiness inherent in AF change at fine temporal scales. Here we present three new permutation tests, which assume neither a model of evolutionary change nor a parametric statistical model, to detect AF timeseries features of evolutionary interest. The features identified by these approaches are: 1) any evolutionary change (as opposed to apparent change due to measurement error); 2) directional selection; 3) fluctuating selection with a propensity to change sign (negative autocorrelation). We are not aware of existing tests for features 1 and 3. Feature 2 is commonly tested using standard evolutionary models such as the Wright-Fisher model; we show that the permutation approach has comparable statistical power. We apply our new approaches to AF timeseries data from Drosophila melanogaster and Daphnia pulex.
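
To make the general idea of a permutation test on an AF timeseries concrete, here is a minimal generic sketch for feature 3 (negative autocorrelation). It is not the authors' procedure, whose statistics and permutation schemes are not given in the abstract. The statistic (lag-1 autocovariance of successive AF changes), the shuffling scheme, and the names `lag1_autocovariance` and `negative_autocorr_pvalue` are all assumptions made for illustration.

```python
import random


def lag1_autocovariance(values):
    """Sum of products of consecutive mean-centred values."""
    mean = sum(values) / len(values)
    return sum((a - mean) * (b - mean) for a, b in zip(values, values[1:]))


def negative_autocorr_pvalue(freqs, n_perm=2000, seed=0):
    """One-sided permutation p-value for negative lag-1 autocorrelation.

    Assumed null hypothesis: the order of the AF increments carries no
    information, so shuffling them gives the null distribution.
    """
    rng = random.Random(seed)
    increments = [b - a for a, b in zip(freqs, freqs[1:])]
    observed = lag1_autocovariance(increments)

    shuffled = increments[:]
    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if lag1_autocovariance(shuffled) <= observed:
            as_extreme += 1

    # Add-one correction keeps the p-value strictly positive.
    return (as_extreme + 1) / (n_perm + 1)


# Toy series whose changes alternate in sign at every step.
alternating = [0.5, 0.6] * 6
print(negative_autocorr_pvalue(alternating))
```

Shuffling the order of increments is only one possible permutation scheme. It leaves the net AF change untouched, so it cannot detect directional change and is suited only to order-dependent features such as sign switching.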

7

A chemical-genetic approach for stress-independent activation of the fission yeast stress-activated protein kinase pathway

Sawin, K. E.; Gupta, A.; Dudnakova, T.; Bayrak, B.; Kovac, A.; Modaffari, D.; Rodriguez-Rodriguez, A. I.; Scott, M. L.; Tay, Y. D.

2026-07-09 cell biology 10.64898/2026.06.30.735518 medRxiv

Top 0.2%

18.8%

Background: The fission yeast stress-activated protein kinase (SAPK) pathway includes a conserved mitogen-activated protein (MAP) kinase cascade that regulates multiple cellular processes and is activated by several types of external stress. Understanding how Sty1, the MAP kinase in the SAPK pathway, controls these processes is complicated by the fact that different stressors can have stressor-specific effects that may be difficult to separate from the effects of Sty1 activation itself. Moreover, upon stress, Sty1 activation is usually short-lived. Previously, we developed a fission yeast strain, SISA, in which Sty1 kinase activity can be switched on in a sustained manner in the absence of external stress. This required combining multiple mutations in the SAPK pathway, including an analog-sensitive version of Sty1. When SISA cells are grown in the presence of analog-sensitive kinase inhibitors, Sty1 is inhibited, but when inhibitor is removed, Sty1 becomes hyperactive. While this strain was useful, it had several limitations. Results: Here we describe and validate a more rationally designed strain, SISA4, that retains the features of the original SISA strain while overcoming its limitations. SISA4 is more stable genetically than SISA, easier to use in genetic crosses, and easy to identify by phenotype or genotyping. We show that analog-sensitive kinase inhibitors 4-Amino-1-tert-butyl-3-(1-naphthylmethyl)pyrazolo[3,4-d]pyrimidine (1-NM-PP1) and 4-Amino-1-tert-butyl-3-(3-bromobenzyl)pyrazolo[3,4-d]pyrimidine (3-BrB-PP1) are equally potent for inhibiting analog-sensitive Sty1 in vivo, and we determine optimal inhibitor concentrations for converting SISA4 cells from a Sty1-inhibited state to a Sty1-hyperactive state. We also find that both 1-NM-PP1 and 3-BrB-PP1 have measurable off-target effects in wild-type cells, although these are modest and generally do not affect interpretation of experiments.
Finally, using SISA4, we show that the Sty1-activated transcription factor Atf1 plays an unexpected role in maintaining cell-polarity disruption after Sty1 hyperactivation. Conclusions: SISA4 will be useful for investigating how SAPK pathway activation regulates diverse cellular processes.

8

Determinants of dicentric chromosome breakage in Drosophila

Ridges, J. T.; Hill, H. J.; Baldwin-Brown, J. G.; Golic, K.; Phadnis, N.

2026-07-13 genetics 10.64898/2026.07.09.737500 medRxiv

Top 0.2%

18.7%

Eukaryotic genomes often have fragile sites where chromosomes are particularly prone to break. In Drosophila, when dicentric ring chromosomes try to segregate, they break at nonrandom hotspots. Here, we precisely map breakage hotspots produced by dicentric ring chromosomes in Drosophila. Our study provides three key results about the nature of dicentric chromosome breakage. First, duplications produced by dicentric ring chromosome breakage are surprisingly complex and involve many structural rearrangements, indicating that healing of these breaks is not a simple process. Second, characterization of one particular hotspot showed that new termini all occurred within a single intron of a large testis-expressed gene, suggesting that replication-transcription conflict may be a key determinant of chromosome fragile sites. Third, the new ends are often located near preexisting transposons, suggesting that transposon insertions may contribute to fragility or participate in stabilization of broken ends.

9

Integrating Bottleneck Size into Selection Tests for Biological Diversity Data

Le, T. M. T.; Gjini, E.

2026-07-10 genetics 10.64898/2026.07.07.737025 medRxiv

Top 0.2%

18.6%

Population bottlenecks profoundly shape genetic diversity, but distinguishing stochastic drift from selective pressure requires precise estimation and accounting for bottleneck size. While deep-sequencing data enable inference via frameworks like beta-binomial modeling, integrating these estimates directly into selection tests remains a critical challenge. In this study, based on existing computational approaches, we propose a new method that explicitly incorporates bottleneck size estimates into neutrality tests for biological diversity data. Designed for variant frequency data, our framework accounts for sequencing errors and sampling biases to improve the precision and interpretability of selection signature detection. We validate this framework using previously published Streptococcus pneumoniae in vivo experimental data, successfully replicating established fitness results, while uncovering novel genes relevant to infection and pathogenesis. This integrated new model with explicit bottleneck effects narrows down the set of candidate genes under selection and provides a robust, generalizable tool for disentangling drift from selection across a wide range of biological systems.

10

Promoter Structural Variants are Drivers of Genome-Wide Differential Expression in Maize

Munasinghe, M.; Read, A.; Schulz, A. J.; Brandvain, Y. J.; Springer, N. M.; Hirsch, C.

2026-06-28 genomics 10.64898/2026.06.23.734061 medRxiv

Top 0.3%

18.0%

Background: Structural variants (SVs) are large insertions or deletions of DNA sequences. While less numerous than single nucleotide polymorphisms, SVs often account for a greater proportion of nucleotide differences between genomes. Their size and frequent association with repetitive sequences have historically hindered their detection, which has limited the ability to associate this variation with molecular and phenotypic trait variation. While some SVs have been linked to observable traits, it remains unclear whether such effects are rare or broadly distributed across the genome. Results: To test for genome-wide relationships between SVs and gene expression, we analyzed genome assemblies and transcriptomic data from 10 tissues across 26 diverse maize inbred lines. We identified SVs amongst these lines and examined variants located within the 1 kb promoter region upstream of genes. Thousands of genes showed expression differences associated with promoter SVs, often in a tissue-specific manner. One common feature of these SVs was the presence of transposable element sequences. LTR retrotransposons were enriched amongst promoter SVs associated with differential expression and often reduced expression of the nearby gene. Despite widespread expression changes, we found no enrichment for specific biological functions or pathways among affected genes. Conclusions: Our findings indicate that extant TE-mediated promoter SVs play a significant role in shaping gene expression patterns across the maize genome. However, their phenotypic effects appear limited or context-dependent, suggesting that many variants may have minimal impact outside specific developmental stages or environmental conditions.

11

Efficient septum formation is essential for chromosome segregation in Bacillus subtilis when SMC function is impaired

Lai, N. K.; Lastra, L. C.; Adebiyi, K. O.; Rudner, D. Z.; Jacobson, S. C.; Kearns, D. B.; Wang, X.

2026-06-22 microbiology 10.64898/2026.06.22.733809 medRxiv

Top 0.3%

18.0%

Structural maintenance of chromosomes (SMC) complexes play conserved roles in chromosome organization, segregation, and repair in all domains of life. In Bacillus subtilis, SMC is required for segregation of newly replicated origins. To investigate whether other proteins function with SMC in this process, we performed a synthetic lethal screen with an smc hypomorphic allele (smc*) that is mildly defective in chromosome segregation. In addition to recovering previously reported interactions of smc with parB and spoIIIE, our screen identified minJ and divIVA as essential in the smc* background. We show that the synthetic lethality between smc* and ΔminJ or ΔdivIVA arises from defects in segregating the replication terminus. Importantly, deletion of minD, which suppresses the cell division defects of ΔminJ and ΔdivIVA, restored terminus segregation and viability in the smc* background. These findings support a model in which proper septum formation promotes chromosome terminus resolution and segregation by enabling SpoIIIE-mediated DNA clearance from the division septum during cytokinesis. These findings highlight the interdependence between chromosome segregation and cell division. Importance: The SMC complex plays a central role in chromosome organization and segregation in Bacillus subtilis, but the cellular functions that become important when SMC activity is reduced are not well understood. Using a hypomorphic smc allele, we discovered that mutations affecting cell division become essential when chromosome organization and segregation are impaired. Our findings support a model in which efficient septum formation enables proper localization of the SpoIIIE DNA translocase, which in turn resolves and segregates the chromosome terminus region. These results highlight the critical role of cell division in supporting chromosome segregation.

12

A recombinant dilp2GS-rpr donor line for adult-inducible IPC ablation across Drosophila genetic backgrounds

Chen, Y.; Bai, Y.; Zhuang, X.

2026-06-22 genetics 10.64898/2026.06.17.733056 medRxiv

Top 0.3%

15.2%

Genetic-background studies require defined perturbations that can be crossed reproducibly into many recipient backgrounds. We generated a Drosophila dilp2GS-rpr donor line for adult-inducible ablation of insulin-producing cells (IPCs), which secrete insulin-like peptides and provide a tractable model of insulin-deficient metabolic physiology. This line carries dilp2-GeneSwitch-GAL4 and UAS-reaper in cis on the same second chromosome homolog over a balancer. PCR genotyping and sequencing confirmed both transgenic elements in the candidate recombinant line. RU486 induction reduced dilp2 mRNA expression, supporting partial IPC ablation. Treatment-duration testing identified 8 days of RU486 as sufficient to increase whole-body glucose in the dilp2GS-rpr line but not in the background-matched control; food intake did not differ between RU486- and vehicle-treated flies. Across metabolic assays, whole-body glucose showed the clearest RU486- and line-dependent phenotype. This validated dilp2GS-rpr line enables testing how recipient genetic backgrounds modify inducible IPC/DILP metabolic phenotypes and provides a framework for similar linked donor-line resources.

13

CoalMiner: a coalescent model generator for fastsimcoal2

Esplin-Stout, R.; Sethuraman, A.

2026-06-30 evolutionary biology 10.64898/2026.06.25.734618 medRxiv

Top 0.3%

15.1%

Demographic inference using the Site Frequency Spectrum (SFS) is often constrained by the number and complexity of models tested. Here we present a coalescent model generator called CoalMiner for use with fastsimcoal2. CoalMiner utilizes a decision tree framework to generate biologically plausible models, with user input dictating the number and ranges of demographic parameters and histories, which can then be plugged into the fastsimcoal2 pipeline. Using extensive simulations and empirical data, we show that CoalMiner is an effective helper tool to explore demographic model space. CoalMiner is written in Python and is freely available on GitHub: https://github.com/raywray/coalminer with numerous vignettes and tutorials.

14

Genetic Association of Somatic Incompatibility and NLR-like Protein Domains in Coprinopsis cinerea

Auxier, B.; Ament Velasquez, L.; Baars, J. J. P.; Scholtmeijer, K.; van Peer, A. F.; Debets, A. J.; Aanen, D. K.

2026-06-27 genetics 10.64898/2026.06.24.733965 medRxiv

Top 0.3%

15.0%

In fungi, hyphal fusion is beneficial within an individual, but fusion between individuals comes with the risks of infection or exploitation. To manage this risk, fungi have evolved mechanisms, collectively called allorecognition, that restrict sustained fusion to within a genetic individual. In Ascomycete fungi, this recognition is based on allelic identity at several polymorphic allorecognition genes, often triggering cell death. However, the genetic basis of allorecognition is unknown in basidiomycetes, the clade that includes mushroom-forming fungi. Here, we map the first locus for this trait, which we call somA, in the mushroom-forming fungus Coprinopsis cinerea. We combined F1 offspring phenotypes with independent backcross lines to identify a region on chromosome 5 linked with the production of a barrage zone, a classic allorecognition phenotype. Fine-mapping narrowed this region to an interval containing a set of kinases and NACHT domain proteins, flanked by a leucine-rich repeat (LRR) protein. While the NACHT and kinase proteins are diverse between the parents, the LRR protein shows signs of purifying selection. Additional C. cinerea genomes show that this region contains several highly divergent alleles, consistent with long-term balancing selection. These polymorphic alleles all contain a single monomorphic LRR, which may indicate a novel mechanism for fungal nonself recognition. Based on a phylogenetic survey of related Basidiomycetes, this specific locus architecture appears to be restricted to closely related species. This finding of a multiallelic locus may explain the general trend of few nonself recognition loci in basidiomycetes. These results provide a first understanding of how individuality is maintained in basidiomycetes.

15

Revising the genetic and epigenetic architecture of in vitro regeneration capacity in natural Arabidopsis thaliana populations

Arima, K.; Chen, Y.; Sugimoto, K.; Sasaki, E.

2026-07-01 genetics 10.64898/2026.06.26.734650 medRxiv

Top 0.3%

14.7%

Plant regeneration is a dynamic developmental process that spans from cell dedifferentiation to organ reconstruction in response to inductive cues, such as wounding stress and hormonal signals. Although this capacity varies widely both between and within species, the genetic and epigenetic bases of this variation remain incompletely understood. To address this issue, we revisited published datasets on natural variation in in vitro regeneration capacity in Arabidopsis thaliana. Using quantitative genetic approaches, including meta-analyses of genome-wide association studies (GWAS) and multi-locus models, we dissected the genetic architecture underlying regeneration traits. Our results showed that shoot regeneration capacity is primarily explained by allelic variation in the cis-regulatory region of WUSCHEL (WUS), a key regulator of shoot meristem formation. Notably, these polymorphisms are also associated with epigenetic variants of the DNA transposon ATDNA2T9C, which is located within the regulatory region. Furthermore, allelic variation in ARABIDOPSIS RESPONSE REGULATOR 2 (ARR2), a positive regulator of cytokinin signaling, is associated with callus formation and greening traits and may promote shoot formation through genetic interactions with WUS alleles. Although in vitro regeneration is controlled by complex, multilayered gene regulatory networks, our results suggest that, in A. thaliana, natural variation in regeneration capacity is largely shaped by a small number of major-effect modifiers together with epigenetic variation and genetic interactions, despite the substantial heterogeneity observed among natural populations.

16

Driver-independent lexAop-tdTomato.nls reporter signal in the adult Drosophila proventriculus

Zhou, X.; Zhang, T.; Kim, W. J.

2026-07-11 genetics 10.64898/2026.07.07.737111 medRxiv

Top 0.4%

13.6%

Reporters are widely used in Drosophila genetics to visualize gene expression and cell lineages. However, uncharacterized limitations in specific reporter lines can lead to data misinterpretation. Here, we identify a consistent, driver-independent tdTomato signal in the adult proventriculus from the widely used lexAop-tdTomato.nls reporter line. This signal was observed across multiple lexA driver combinations and was directly detectable in lexAop-tdTomato.nls responder-alone adult proventriculi lacking any lexA driver and without antibody staining. In contrast, no comparable native red fluorescence was detected in larval proventriculi under the same no-antibody imaging condition. Mouse and rabbit anti-RFP immunostaining further supported the presence of proventriculus-associated tdTomato/RFP antigen in adult responder-alone animals. In larval responder-alone proventriculi, antibody-amplified staining was antibody-source-dependent: a detectable signal was observed only with rabbit anti-RFP, whereas mouse and rat anti-RFP produced no reliable detectable signal under the same staining condition. A driver-matched comparison using lexAop-RFP.nls did not reproduce the proventricular signal, arguing against detectable ectopic activity of the tested lexA driver in this tissue. However, because lexAop-tdTomato.nls and lexAop-RFP.nls differ in reporter/transgene architecture and possibly genomic insertion context, the underlying cause cannot be assigned specifically to the lexAop sequence. Our findings highlight the necessity of including driver-negative and no-antibody controls when using this reporter line in adult Drosophila proventriculus and gut studies.

17

Evo 2's Perception of Single Nucleotide Substitutions in the Genes of Two Plant Model Organisms

Mantegazza, O.; Bertolini, L.; Leoni, G.; Colaiacovo, M.; Petrillo, M.; Bonfini, L.; Savini, C.; Ceresa, M.; Zaoui, X.

2026-07-03 genomics 10.64898/2026.07.01.729829 medRxiv

Top 0.4%

13.1%

Although DNA Large Language Models (DNA-LLMs) offer a path to decoding genetic complexity, our ability to evaluate these models is constrained by our incomplete understanding of the very same genetic syntax and functional logic that these models are trained to learn. In this study we use single nucleotide substitutions that have or have not been observed in living organisms, to evaluate how the DNA-LLM Evo 2 interprets gene sequences from two plant model organisms, Arabidopsis thaliana and Oryza sativa japonica. Using perplexity as a measure of the model's confidence, we observe that alleles containing simulated substitutions are perceived, on average, as less likely than those observed in vivo. Although the size of the effect is modest, the effect is statistically significant and robust, suggesting that Evo 2 is aligned with our current understanding of evolutionary selective constraints. This approach is designed to be model-agnostic and species-agnostic and could serve as a generic framework for evaluating the performance of DNA-LLMs.

18

Female genetic variation controlling timing of mating plug ejection in Drosophila melanogaster

Carlisle, J. A.; Craig, R. M. J.; Matera-Vatnick, M.; Villanuenva, B. M.; Andrus, A. R.; Cosgrove, E. J.; Chen, D. S.; Clark, A. G.; Wolfner, M. F.

2026-07-01 genetics 10.64898/2026.06.27.734984 medRxiv

Top 0.4%

13.0%

In multiply-mating species, male-female postcopulatory, prezygotic interactions can influence reproductive outcomes. In Drosophila melanogaster, females can bias sperm storage and usage and thereby influence paternity outcomes. One mechanism by which females may regulate paternity contributions from specific males is through modulation of mating plug ejection timing. The D. melanogaster mating plug is composed of seminal fluid proteins, and some female-derived proteins, that coagulate in the female reproductive tract during mating. The mating plug facilitates sperm storage; thus, timing of female mating plug ejection is associated with sperm storage and relative paternity contributions in cases of multiple mating. However, whether natural genetic variation among females shapes mating plug ejection timing, and which genes or phenomena might mediate it, are unknown. We examined mating plug ejection in females from 69 lines of the Drosophila Genetic Reference Panel (DGRP) and observed dramatic differences in median plug ejection timing, ranging from less than 1 to over 6 hours. We used this variation to perform a genome-wide association study (GWAS) to identify gene candidates associated with this phenotype. Many gene candidates are expressed in the brain and/or function in neurodevelopment. The candidate pool was also enriched for genes expressed in the ovary and functioning in oogenesis, indicating a link between female reproductive physiology and mating plug ejection. Consistent with this interpretation, females without a germline delay mating plug ejection. Our results demonstrate that female mating plug ejection is a physiologically integrated reproductive trait with a genetic basis that can be shaped by selection. Article Summary: The D. melanogaster mating plug is composed of seminal fluid proteins and some female-derived proteins that coagulate in the female reproductive tract during mating.
The mating plug facilitates sperm storage; thus, timing of female mating plug ejection is associated with sperm storage and relative paternity contributions in cases of multiple mating. Using the DGRP, we observed heritable genetic variation in female timing of mating plug ejection and, through a GWAS, found associated gene candidates. Gene candidates are enriched for neurodevelopment and oogenesis functions. We experimentally validate the connection between female mating plug ejection and the ovary.

19

Inference of fitness landscapes with heterogeneous patterns of epistasis across sites

Marti-Gomez, C.; McCandlish, D. M.

2026-06-28 evolutionary biology 10.64898/2026.06.25.734428 medRxiv

Top 0.4%

12.8%

Fitness landscapes provide a framework for understanding how genetic variation shapes evolutionary outcomes. Although these landscapes were long treated as abstract conceptual objects, recent advances in genetic engineering and high-throughput phenotyping have enabled the empirical measurement of phenotypic values across large combinatorial sequence spaces. These developments create a need for statistical frameworks that can summarize, infer, and interpret fitness landscapes in the presence of complex genetic interactions. Here, we introduce a framework for summarizing the structure of genetic interactions across sites based on the average squared local k-way epistatic coefficients between mutations at different subsets of sites, and derive the precise manner in which the variance in these local k-way epistatic coefficients across backgrounds relates to epistasis of orders higher than k. These statistics can be computed exactly for complete combinatorial landscapes and are related to classical statistics in the fitness landscape literature. Moreover, they can be estimated from empirical correlations when data are incomplete or noisy, and used to define an empirical Bayes prior for fitness landscape inference that differentially penalizes interactions involving different subsets of sites. We apply this inference method to diverse high-throughput protein and RNA combinatorial mutagenesis datasets and find that fitness landscapes often show highly structured patterns of genetic interactions across positions. Finally, we use this model to infer a fitness landscape for a dynamic self-splicing intron comprising 65,536 genotypes, and describe in detail the main genetic interactions that shape the structure of this landscape and how they relate to the underlying molecular mechanism. Together, these results provide new tools for summarizing and modeling complex fitness landscapes, and for linking large-scale empirical data to the mathematical theory of fitness landscapes.

20

Knock-in ≠ knock-out: differential fitness effects of cardinal mutations in Anopheles stephensi

Larrosa-Godall, M.; Shackleford, L.; Leftwich, P. T.; Gonzalez, E.; Ang, J. X.; Edwards, M.; Nevard, K.; Luk, J. C. Y.; Mckee, M.; Noad, R.; Anderson, M.; Alphey, L.

2026-07-09 genetics 10.64898/2026.07.07.737011 medRxiv

Top 0.4%

12.6%

The kynurenine pathway metabolizes tryptophan into 3-hydroxykynurenine (3-HK), a precursor for ommochrome eye pigments synthesized via the cardinal (cd) gene in mosquitoes. While cd disruption was presumed neutral, we observed fitness costs in Anopheles stephensi knock-in but not knock-out cd mutants. Here we investigated this anomaly further by assessing survival, fecundity, and midgut integrity across multiple cd mutant lines. Heterozygous knock-in lines, expressing a fluorescent marker and guide RNA for CRISPR/Cas9, exhibited reduced survival post-blood feeding, larva-to-adult survival deficits, and midgut barrier dysfunction, whereas knock-outs showed no such costs. Oral supplementation with xanthurenic acid partially rescued knock-in mortality, implicating oxidative stress linked to 3-HK metabolism. Expression analyses suggest transgene insertion effects, rather than cd disruption, underlie these fitness costs. These findings highlight the importance of evaluating insertional effects in gene drive target selection and support cd as a viable target for genetic control strategies in An. stephensi.