G3 — Latest Matching Preprints

1

An axiomatic approach to cultivar ranking in multi-environment trials

Kondratev, A. Y.; Ianovski, E.; Voronina, E.; Crossa, J.

2026-07-01 genetics 10.64898/2026.06.27.734959 medRxiv

Top 0.4%

1.1%

Multi-environment trials are central to cultivar evaluation because they reveal how candidate cultivars perform across locations, years, management conditions, and stress environments. The resulting yield matrix is a rich source of data on genotype-by-environment interaction, and a wide literature on estimation, decomposition, visualisation, and prediction of yield potential and stability has flourished. However, the ultimate question of which cultivar to recommend on the basis of such a matrix is often left implicit. The question is far from trivial, and in this paper we formulate cultivar recommendation as an axiomatic ranking problem. This framework is rich enough to encompass the existing literature on stability indices, as well as any other deterministic ranking procedure. We show that many commonly used stability-based procedures can violate minimal criteria of efficiency or consistency. The result of such violations is that a cultivar with uniformly high yield could be ranked below a cultivar with uniformly low yield, or the relative ranks of two cultivars could depend on whether or not a third cultivar is present in the matrix. Our results prove that under a small number of such criteria the space of admissible rules collapses to the family of power means and their limiting cases. If we further wish to allow multiplicative normalisation of yield, we are left with the geometric mean as the unique solution.
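The role of the geometric mean can be illustrated with a small sketch. It assumes that normalisation means rescaling every yield in an environment by that environment's own positive constant, which is one reading of the abstract and not a definition taken from the paper. The yield matrix, the scale factors, and the function names are hypothetical. Under this assumption, the ranking by geometric mean (power mean with p = 0) is unchanged by the rescaling, while the ranking by arithmetic mean (p = 1) can flip.

```python
import math


def power_mean(values, p):
    """Power (generalised) mean of positive values; p = 0 is the geometric mean."""
    n = len(values)
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)


def rank_cultivars(yields, p):
    """Return cultivar names ordered from best to worst by power mean of yield."""
    return sorted(yields, key=lambda name: power_mean(yields[name], p), reverse=True)


# Hypothetical yield matrix: cultivar -> yields in three environments.
yields = {
    "A": [2.0, 8.0, 4.5],
    "B": [4.0, 4.0, 4.0],
    "C": [1.0, 1.5, 1.2],
}

# Hypothetical per-environment scale factors (the assumed normalisation).
scale = [10.0, 0.5, 1.0]
rescaled = {name: [v * s for v, s in zip(row, scale)] for name, row in yields.items()}

geo_before = rank_cultivars(yields, 0)
geo_after = rank_cultivars(rescaled, 0)
arith_before = rank_cultivars(yields, 1)
arith_after = rank_cultivars(rescaled, 1)

print("geometric: ", geo_before, "->", geo_after)
print("arithmetic:", arith_before, "->", arith_after)
```

The invariance follows because rescaling multiplies every cultivar's geometric mean by the same factor (the geometric mean of the scale factors), so the order cannot change.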

2

Characterizing the Small Non-Coding RNA Pathways in the Invasive Zebra Mussel (Dreissena polymorpha)

Hernandez Elizarraga, V. H.; O'Brien, L. G.; Ballantyne, S.; Gohl, D. M.

2026-07-11 genomics 10.64898/2026.07.10.737777 medRxiv

Top 0.7%

0.8%

The zebra mussel (Dreissena polymorpha) is an invasive species that causes extensive economic and ecological damage. Here, we identify and characterize the key components of the small RNA (sRNA) and RNA interference (RNAi) pathways in zebra mussels. Like other mollusks, zebra mussels have extensive microRNA (miRNA) and Piwi-interacting RNA (piRNA) machinery but lack or have modified canonical factors needed to produce small interfering RNA (siRNA). Specifically, the zebra mussel Dicer sequence displays substitutions in the conserved DEAD box motif that is required for substrate processivity, and this organism also lacks some attendant accessory factors such as R2D2. We sequenced the small RNA found in both isolated somatic tissue (adductor muscle) and whole animals (including germline), and identified both conserved and novel miRNA and diverse piRNA sequences, but few endogenous siRNAs. To determine whether their remaining sRNA machinery could still be co-opted to initiate gene silencing, we injected dsRNA targeting several genes into zebra mussel adductor muscle. The injected rpn8-targeting dsRNA reduced rpn8 mRNA levels and was processed into sRNA that resemble endogenous miRNAs and piRNAs. The levels of both sRNA types correlated with mRNA knockdown, suggesting that they may act together to initiate RNAi as seen elsewhere. dsRNA targeting other genes produced variable results suggesting that particular criteria may be needed to trigger an RNAi response in this assay. Our results characterize endogenous sRNA pathways in zebra mussels, establish that dsRNA can induce RNAi, and lay the groundwork for further optimizations to establish RNAi-based genetic manipulation tools for this damaging invasive species.

3

Genetic Association of Somatic Incompatibility and NLR-like Protein Domains in Coprinopsis cinerea

Auxier, B.; Ament Velasquez, L.; Baars, J. J. P.; Scholtmeijer, K.; F. van Peer, A.; Debets, A. J.; Aanen, D. K.

2026-06-27 genetics 10.64898/2026.06.24.733965 medRxiv

Top 0.8%

0.6%

In fungi, hyphal fusion is beneficial within an individual, but fusion between individuals comes with the risks of infection or exploitation. To manage this risk, fungi have evolved mechanisms, collectively called allorecognition, that restrict sustained fusion to within a genetic individual. In ascomycete fungi, this recognition is based on allelic identity at several polymorphic allorecognition genes, often triggering cell death. However, the genetic basis of allorecognition is unknown in basidiomycetes, the clade that includes mushroom-forming fungi. Here, we map the first locus for this trait, which we call somA, in the mushroom-forming fungus Coprinopsis cinerea. We combined F1 offspring phenotypes with independent backcross lines to identify a region on chromosome 5 linked with the production of a barrage zone, a classic allorecognition phenotype. Fine-mapping narrowed this region to an interval containing a set of kinases and NACHT domain proteins, flanked by a leucine-rich repeat (LRR) protein. While the NACHT and kinase proteins are diverse between the parents, the LRR-encoding protein shows signs of purifying selection. Additional C. cinerea genomes show that this region contains several highly divergent alleles, consistent with long-term balancing selection. These polymorphic alleles all contain a single monomorphic LRR, which may indicate a novel mechanism for fungal nonself recognition. Based on a phylogenetic survey of related basidiomycetes, this specific locus architecture appears to be restricted to closely related species. This finding of a multiallelic locus may explain the general trend of few nonself recognition loci in basidiomycetes. These results provide a first understanding of how individuality is maintained in basidiomycetes.

4

Transposon-associated genetic structure of a fungal phytopathogen population of wheat

Phan, H. T. T.; Shankar, M.; Jones, D. A. B.; Furuki, E.; Rybak, K.; Kamphuis, F.; Golzar, H.; Oliver, R. P.

2026-06-26 pathology 10.64898/2026.06.22.733729 medRxiv

Top 0.9%

0.6%

Septoria nodorum blotch (SNB) is an economically important fungal disease of wheat caused by Parastagonospora nodorum. It is primarily controlled by the breeding of resistant wheat cultivars, but experience over the last 50 years shows that new pathogen populations soon evolve that are more virulent on the currently popular cultivars. In this study, we assembled a panel of 360 P. nodorum isolates. The collection resolved into eight subpopulations: one core and seven transient populations, with contrasting characters in terms of spatial and temporal distribution, mating type, effector haplotypes, and patterns of intact and degraded copies of a Tc-1 mariner transposon called Molly. Molly can proliferate and randomly insert throughout the fungal genome. Its multiplication in sexual populations likely triggered repeat-induced point mutation (RIP), which partially explains the extensive genetic diversity, the ability to form new adapted lineages, and the observed population structure of this important pathogen of wheat. When tested on wheat, the recently emerged groups exhibited greater pathogenicity on modern elite cultivars, consistent with the low-amplitude boom-and-bust cycle observed previously. It is possible that active copies of Molly transpose and contribute to both the birth and death of the transient groups. This study identified and characterised a fungal-specific transposable element (TE) that plays a vital role in shaping Australian P. nodorum population structure and creating extensive genetic diversity, which potentially leads to better adaptation of the pathogen. The study suggests practical measures to improve the efficiency and longevity of resistance breeding for SNB.

5

Knowledge-guided Bayesian optimization using pre-trained LLMs speeds up the identification of superior genotypes from germplasm collection

Hamazaki, K.; Tsuda, K.

2026-07-02 bioinformatics 10.64898/2026.06.28.735149 medRxiv

Top 1.0%

0.5%

Background: Germplasm collections contain wide genetic diversity that is valuable for plant breeding, but conducting phenotypic evaluation for all genotypes in field trials is rarely feasible. Bayesian optimization offers a way to decide, season by season, which genotypes to cultivate in order to identify superior genotypes with fewer evaluations. However, standard Bayesian optimization commonly starts from randomly selected genotypes and mainly relies on surrogate models built from marker genotype information, while the text-based passport information that accompanies germplasm is not fully used. We examined whether pre-trained large language models can provide prior knowledge that improves these decisions in germplasm evaluation. Results: We constructed a large-language-model-guided Bayesian optimization framework that introduces large language models into two parts of the Bayesian optimization workflow. In zero-shot warmstarting, a large language model proposes initial genotypes using passport information such as cultivar name, country of origin, and subpopulation, optionally together with principal component scores derived from genome-wide single-nucleotide-polymorphism markers. In addition, we evaluated a large-language-model-based surrogate model that predicts phenotypic values for untested genotypes using in-context learning from previously evaluated genotypes. Using a rice germplasm panel and two target traits (seed number per panicle for maximization and protein content for minimization), we compared strategies. For seed number per panicle, zero-shot warmstarting with a general-purpose instruction-following model reduced the number of evaluated genotypes needed to reach the best genotype, whereas improvements were small for protein content. 
When genomic information was available, Gaussian-process-based Bayesian optimization was the strongest overall approach, while the large-language-model-based surrogate model outperformed random baselines and was competitive in some settings. When genomic information was not available, predictions based on passport information improved efficiency compared with fully random strategies. Conclusions: Pre-trained large language models can inject useful agronomic knowledge into Bayesian optimization for germplasm evaluation, particularly by improving early-stage genotype selection, and can also support optimization when genomic information is unavailable. As models better handle long genomic sequences together with passport information, large-language-model-guided Bayesian optimization may become a practical and explainable decision-support approach for agricultural optimization.

6

Comparison of localGEBV and Optimal Haplotype Stacking Fitness Functions using a Novel R Package: HapSelect

Shaffer, W.; Papin, V.; Carter, Z.; Brunner, S. M.; Tong, J.; Villiers, K.; Robinson, H.; Voss-Fels, K.; Hayes, B. J.; Hickey, L.; Dinglasan, E.

2026-07-13 genetics 10.64898/2026.07.08.737160 medRxiv

Top 1%

0.5%

Haplotype-based breeding strategies have emerged as promising approaches to maximize long-term genetic gain by identifying complementary parental combinations while maintaining genetic diversity. However, these methods typically require phased genotypes and more intensive workflow pipelines and skillsets. We developed a novel local genomic estimated breeding value (localGEBV) fitness function with similar intent to the optimal haplotype stacking (OHS) framework fitness function and implemented both in the novel R package, HapSelect. Our aim was to evaluate whether phased haplotypes provide additional benefit over the more easily available dosage-based unphased genotypes in highly inbred crops. A subset of a bread wheat nested association mapping (NAM) population comprising 444 lines genotyped with 6,054 DArT-Seq markers was analysed. Marker effects were estimated using rrBLUP, localGEBV and haplotype effects were calculated across linkage disequilibrium-defined haploblocks, and genetic algorithms (GA) were used to identify optimal sets of 30 founders using either a localGEBV-derived fitness function with unphased dosage inputs or the OHS fitness function with phased inputs. Selected parental sets were compared with conventional truncation selection (TS) through 150 generations of forward simulation. The OHS fitness function achieved a marginally greater optimized ultimate GEBV than the localGEBV fitness function during GA optimization, with only 18 of the 30 selected founders overlapping between the two methods. Despite these differences, forward simulations demonstrated nearly identical long-term genetic gain for localGEBV and OHS-selected founders, with both approaches outperforming conventional truncation selection by maintaining greater genetic diversity and delaying the genetic plateau. The minimal difference between localGEBV and OHS is likely attributable to the high homozygosity of the population, where localGEBV and haplotype effects are nearly confounded. 
These results demonstrate that dosage-based localGEBV provides a practical alternative to phased haplotype approaches for parent selection in inbred crops, substantially simplifying genomic workflows while maintaining long-term breeding performance. Future work should evaluate these methods in more diverse inbred populations and outbred species, where greater haplotypic diversity may increase the advantage of true haplotype-based optimizations.
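The contrast between stacking-style selection and truncation selection can be sketched on a toy example. This is not the HapSelect implementation, and the fitness function below is an assumption: it scores a founder set by summing, over haploblocks, the best block effect carried by any selected founder, treating each inbred founder as carrying one haplotype per block. The founder names and block effects are hypothetical, and an exhaustive search stands in for the genetic algorithm because the example is tiny.

```python
from itertools import combinations


def stacking_fitness(selected, block_effects):
    """Sum over haploblocks of the best effect carried by any selected founder (assumed fitness)."""
    n_blocks = len(next(iter(block_effects.values())))
    return sum(max(block_effects[f][b] for f in selected) for b in range(n_blocks))


def truncation_selection(block_effects, k):
    """Pick the k founders with the highest total (whole-genome) value."""
    ranked = sorted(block_effects, key=lambda f: sum(block_effects[f]), reverse=True)
    return tuple(ranked[:k])


def best_stacking_set(block_effects, k):
    """Exhaustively search for the k-founder set with the highest stacking fitness."""
    return max(
        combinations(sorted(block_effects), k),
        key=lambda s: stacking_fitness(s, block_effects),
    )


# Hypothetical per-block effects for four inbred founders across three haploblocks.
block_effects = {
    "P1": [3.0, 3.0, 0.0],
    "P2": [3.0, 2.0, 0.0],
    "P3": [0.0, 0.0, 4.0],
    "P4": [1.0, 1.0, 1.0],
}

ts_set = truncation_selection(block_effects, 2)
stack_set = best_stacking_set(block_effects, 2)

print("truncation:", ts_set, stacking_fitness(ts_set, block_effects))
print("stacking:  ", stack_set, stacking_fitness(stack_set, block_effects))
```

In this toy case truncation selection picks two founders that are strong in the same blocks, whereas the stacking search pairs complementary founders, which is the intuition behind selecting sets of parents instead of individually top-ranked lines.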

7

A chemical-genetic approach for stress-independent activation of the fission yeast stress-activated protein kinase pathway

Sawin, K. E.; Gupta, A.; Dudnakova, T.; Bayrak, B.; Kovac, A.; Modaffari, D.; Rodriguez-Rodriguez, A. I.; Scott, M. L.; Tay, Y. D.

2026-07-09 cell biology 10.64898/2026.06.30.735518 medRxiv

Top 1%

0.4%

Background: The fission yeast stress-activated protein kinase (SAPK) pathway includes a conserved mitogen-activated protein (MAP) kinase cascade that regulates multiple cellular processes and is activated by several types of external stress. Understanding how Sty1, the MAP kinase in the SAPK pathway, controls these processes is complicated by the fact that different stressors can have stressor-specific effects that may be difficult to separate from the effects of Sty1 activation itself. Moreover, upon stress, Sty1 activation is usually short-lived. Previously, we developed a fission yeast strain, SISA, in which Sty1 kinase activity can be switched on in a sustained manner in the absence of external stress. This required combining multiple mutations in the SAPK pathway, including an analog-sensitive version of Sty1. When SISA cells are grown in the presence of analog-sensitive kinase inhibitors, Sty1 is inhibited, but when inhibitor is removed, Sty1 becomes hyperactive. While this strain was useful, it had several limitations. Results: Here we describe and validate a more rationally designed strain, SISA4, that retains the features of the original SISA strain while overcoming its limitations. SISA4 is more stable genetically than SISA, easier to use in genetic crosses, and easy to identify by phenotype or genotyping. We show that analog-sensitive kinase inhibitors 4-Amino-1-tert-butyl-3-(1-naphthylmethyl)pyrazolo[3,4-d]pyrimidine (1-NM-PP1) and 4-Amino-1-tert-butyl-3-(3-bromobenzyl)pyrazolo[3,4-d]pyrimidine (3-BrB-PP1) are equally potent for inhibiting analog-sensitive Sty1 in vivo, and we determine optimal inhibitor concentrations for converting SISA4 cells from a Sty1-inhibited state to a Sty1-hyperactive state. We also find that both 1-NM-PP1 and 3-BrB-PP1 have measurable off-target effects in wild-type cells, although these are modest and generally do not affect interpretation of experiments. 
Finally, using SISA4, we show that the Sty1-activated transcription factor Atf1 plays an unexpected role in maintaining cell-polarity disruption after Sty1 hyperactivation. Conclusions: SISA4 will be useful for investigating how SAPK pathway activation regulates diverse cellular processes.

8

Paralogs of the Candida albicans TLO gene family form interconnected functional networks with incomplete redundancy

Simonton, E.; Cangelosi, N.; Zhou, M.; Hendricks, P. S.; Woodruff, A. L.; Anderson, M. Z.

2026-07-02 genetics 10.64898/2026.06.29.735307 medRxiv

Top 1%

0.4%

Gene duplication typically fails to confer a selective advantage to an organism, prompting removal of the duplicate from the population. In the rare instance that duplication either does not incur a fitness cost or enhances fitness, gene families can form through repeated duplication. While the function of gene duplicates has been studied in detail, little work has explored how repeated duplication impacts paralog redundancy and may restrict the emergence of new paralogs or novel function. Here, we constructed a panel of single deletion mutants for each of the 14 members of the Candida albicans telomere-associated (TLO) gene family to test the redundancy in molecular and biological function among paralogs from a lineage-specific expansion. Tlo proteins function as interchangeable subunits of the Mediator transcriptional regulatory complex and have the potential to alter gene expression and an array of cellular responses. Redundancy was the most common outcome, being observed for approximately 80% of the phenotypic assays in strains lacking single TLO genes. However, mutants for all 14 paralogs displayed non-redundant functions in phenotypes ranging from carbon utilization to in vivo virulence. Analysis of gene expression in single TLO mutants found similar trends in redundancy, and loss of single TLOs disproportionately affected genes involved in filamentation, adhesion, redox reactions, and transporter activity at the cell surface. Importantly, sequence divergence between paralogs positively correlated with the frequency of altered phenotypes in single TLO mutants, indicating the acquisition of non-redundant function with increased evolutionary distance. Double mutants lacking two TLO genes produced both positive and negative synergistic phenotypes, suggesting that crosstalk or coordinated regulation is common among paralogs. 
Together, this study demonstrates that recently emergent paralogs acquire non-redundant functions despite often retaining redundancy with other gene family members to form a highly interconnected functional network.

9

Genetic Variation in Drosophila melanogaster Aggression

Gleason, J. M.; Kessen, C. M.; Verma, V.; Bath, E.

2026-07-09 genetics 10.64898/2026.07.04.736468 medRxiv

Top 1%

0.4%

Animals fight for resources to obtain fitness benefits; most contests are intrasexual, and males tend to fight more than females. Although the genetic basis of male aggression is well studied, we know little about the genetic variation of female aggression. Female aggression varies with reproductive status and is potentially influenced not only by her genotype, but also by the genotype of her mate. Here we measured both male and female aggression in a set of Drosophila melanogaster inbred lines by competing each line against a standard competitor. Aggression varied among lines for both sexes, but male and female aggression were not correlated. Female aggression for many lines increased with mating, as expected, but not all lines changed aggression. However, when females were mated to males of different lines, male genotype did not affect the post-mating change in aggression, suggesting that ejaculate-mediated effects do not vary across these lines. The aggression level of the standard opponent was positively correlated with that of focal individuals indicating that individuals modulate their behavior according to the genotype of their opponent.

10

Enhancing predictive accuracy of yield traits in cassava through multi-trait genomic prediction

de Freitas, G. M.; Certuche, D. S.; Jannink, J.-L.; de Oliveira, E. J.; Garcia, A. A. F.

2026-07-06 genetics 10.64898/2026.07.01.735838 medRxiv

Top 1%

0.4%

Multi-trait genomic prediction offers a practical route to improve selection for costly, complex traits in clonally propagated crops such as cassava. In a Brazilian breeding panel of 1,078 cassava clones genotyped with 25,923 SNPs and phenotyped for six agronomic traits, we compared single-trait (ST) and multi-trait (MT) GBLUP models. Stage-wise mixed models produced BLUEs that fed into ST and MT-GBLUP. We tested five cross-validation schemes that mimic breeder realities: ST baseline (CV1); naive all-traits MT prediction for unphenotyped candidates (CV2); MT prediction using auxiliary trait phenotypes in the test set (CV3); and two sparse-phenotyping regimes with missingness by trait (CV4) or by clone (CV5) at 25%, 50%, and 75% levels. The main results were that, under the ST baseline (CV1), predictive ability ranged from 0.50 for DMC and 0.45 for FRY down to 0.13 for Le.Dis. A naive full MT model (CV2) performed approximately on par with ST-GBLUP. In contrast, MT designs (CV3) that included informative auxiliary traits, such as shoot yield and combinations with plant vigor and leaf disease severity, yielded small gains for DMC with predictive ability of approximately 0.51 (+2%), while FRY predictive ability increased to approximately 0.65 (+44%), accompanied by RMSE reductions for FRY up to approximately 13.5% (e.g. RMSE approximately 6.2). Sparse-phenotyping simulations (CV4/CV5) demonstrated that MT models sustain or even improve predictive ability under realistic missing-data regimes (PA ≈ 0.62-0.65). Selection concordance between MT and ST top-10% sets was generally high (>0.80), and MT configurations produced measurable improvements in expected selection response and genetic gain per cycle for several target traits. 
These results indicate that strategically implemented MT-GBLUP, using a small set of biologically and operationally informative auxiliary traits and optimized sparse phenotyping, can materially increase predictive accuracy and selection efficiency for economically critical cassava traits while reducing phenotyping burden.
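The percentage gains quoted above can be checked with a short sketch. It assumes that predictive ability is the Pearson correlation between predicted and observed values in the test set, which is a common convention in genomic prediction but is not stated in the abstract. The function names and the toy vectors are hypothetical; the baseline and improved values (0.45 to 0.65 for FRY, 0.50 to 0.51 for DMC) are the ones reported in the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def relative_gain(baseline, improved):
    """Percentage change of a metric relative to its baseline value."""
    return 100.0 * (improved - baseline) / baseline


# Toy illustration of predictive ability (hypothetical observed and predicted values).
observed = [10.0, 12.0, 9.0, 15.0, 11.0]
predicted = [10.5, 11.0, 9.5, 14.0, 12.0]
toy_pa = pearson(observed, predicted)

# Gains implied by the values reported in the abstract.
fry_gain = relative_gain(0.45, 0.65)
dmc_gain = relative_gain(0.50, 0.51)

print(f"toy predictive ability: {toy_pa:.2f}")
print(f"FRY gain: {fry_gain:.0f}%  DMC gain: {dmc_gain:.0f}%")
```

The reported +44% and +2% are relative changes in predictive ability, not absolute differences in correlation.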

11

Haplotype-specific chromosome painting unveils recombination patterns in the holocentric species Rhynchospora breviuscula H.Pfeiff.

Nascimento, T.; Marques, A.

2026-06-29 genetics 10.64898/2026.06.24.733714 medRxiv

Top 1%

0.4%

The genus Rhynchospora Vahl (beak-sedges) comprises approximately 381 accepted species with a worldwide distribution, all of which possess holocentric chromosomes, where centromeric activity is distributed along almost the entire chromosome. Despite recent advances, the mechanisms governing the dynamics of meiotic recombination in holocentric plants remain poorly understood. Here, we developed haplotype-specific oligo-FISH probes for chromosomes 1, 2, and 3 based on a haplotype-phased genome assembly of Rhynchospora breviuscula (n = 5), enabling homolog-specific chromosome painting. Each probe set was labelled with a distinct fluorophore and hybridised in situ to metaphase chromosomes of the reference plant and seven F1 individuals derived from self-crossed reference plants. This approach allowed the unambiguous discrimination of homologous haplotypes and the indirect visualisation of crossover (CO) events in recombined chromosomes. We observed that recombination events were predominantly located in terminal chromosomal regions, consistent across individuals. These results corroborate previous findings from single-cell recombination mapping and provide independent cytological validation of the recombination landscape in this species. Our study establishes haplotype-specific chromosome painting as a robust tool for high-resolution mapping of meiotic recombination in holocentric plants across generations. Furthermore, these probes provide a foundation for future investigations into inverted meiosis, a mechanism characterized by an alternative pattern of chromosome segregation in holocentric species.

12

Gene model for the ortholog of tgo in Drosophila busckii

Perez, J.; Giunta, A. A.; Wittke-Thompson, J. K.

2026-07-01 genomics 10.64898/2026.06.26.734908 medRxiv

Top 2%

0.3%

Gene model for the ortholog of tango (tgo) in the Sep. 2015 (UC Berkeley ASM127793v1/DbusGB1) Genome Assembly (GenBank Accession: GCA_001277935.1) of Drosophila busckii. This ortholog was characterized as part of a developing dataset to study the evolution of the Insulin/insulin-like growth factor signaling pathway (IIS) across the genus Drosophila using the Genomics Education Partnership gene annotation protocol for Course-based Undergraduate Research Experiences.

13

Towards genetic indicators in ectomycorrhizal fungi: estimating the effective population size

Champion, A.; Bazzicalupo, A.; Heuertz, M.; Gargiulo, R.

2026-07-03 genetics 10.64898/2026.06.30.735680 medRxiv

Top 2%

0.3%

Ectomycorrhizal (EM) fungi are vital to forest ecosystems, supporting tree growth and survival. However, their inclusion in conservation policy and action remains limited, and little is known about the status of their genetic diversity, which is essential for their long-term survival and adaptation. The Global Biodiversity Framework adopted a genetic indicator based on the effective population size, Ne, to monitor genetic diversity in all species. To date, it is still uncertain how Ne, a key parameter, can be reliably assessed in species with complex life history traits. Ectomycorrhizal fungi are a highly diverse group of taxa displaying haplodiplontic life cycles with partially clonal reproduction. Here, we review the literature to understand how these life history traits might affect Ne and its estimation in six species of EM fungi. We estimated Ne in 19 populations using eight genetic and genomic datasets from selected studies. We compared Ne estimates using Linkage Disequilibrium (LD) and Sibship Frequency (SF) methods. We tested how Ne estimates change due to partial clonality and genetic structure gradients and whether the number of genetic markers influences the precision of the estimates. We show a systematic bias in Ne estimates when large clones are present and when populations are not correctly delimited. We found that neither method is robust to these factors, which makes them unreliable for conservation assessment purposes in EM fungi. This study provides new perspectives for further research into the links between life history traits and the effective population size of ectomycorrhizal fungi.

14

Ire1-triggered hxl1 mRNA splicing coordinates stress tolerance and virulence in the pathogenic fungus Trichosporon asahii

Shimizu, Y.; Matsumoto, Y.; Sugita, T.

2026-06-27 microbiology 10.64898/2026.06.27.734954 medRxiv

Top 2%

0.3%

The pathogenic fungus Trichosporon asahii causes severe mycoses in immunocompromised hosts, such as neutropenic patients. In Cryptococcus neoformans, the unfolded protein response (UPR) sensor Ire1 induces hxl1 mRNA splicing and contributes to stress responses and virulence. The function of Ire1-triggered hxl1 mRNA splicing in stress tolerance and virulence of T. asahii, however, remains unclear. Here, we demonstrated that ire1- and hxl1 gene-deficient T. asahii mutants are sensitive to dithiothreitol (DTT), an inducer of endoplasmic reticulum stress, and exhibit reduced virulence in a silkworm infection model. DTT treatment induced hxl1 mRNA splicing in the wild-type strain, whereas ire1 gene-deficient mutants did not undergo hxl1 mRNA splicing. The ire1 gene-deficient mutants were more sensitive than the parent strain to DTT, H2O2, Congo red, and SDS, and showed reduced virulence in silkworms. Similarly, hxl1 gene-deficient mutants exhibited increased sensitivity to these stressors and reduced virulence. Both the ire1 gene-deficient and hxl1 gene-deficient mutants showed decreased expression of reactive oxygen species-detoxifying related genes CAT2, SOD1, and SOD2, compared with the parent strain. Together, these findings suggest that Ire1-triggered hxl1 mRNA splicing contributes to stress resistance and virulence in T. asahii.

15

Novel Drosophila cis-regulatory elements can be uncovered by footprinting transcription factor binding sites in ATAC-seq data

Mei, C.; Ness, J.; Nakai, K.; Wunderlich, Z.

2026-06-25 genomics 10.64898/2026.06.22.733832 medRxiv

Top 2%

0.3%

Developmental processes depend on carefully coordinated gene expression. Expression is modulated by the binding of transcription factors (TFs) to cis-regulatory elements (CREs), like enhancers and promoters. Many computational and experimental approaches have been developed to find CREs, particularly enhancers, in the genome, each with strengths and caveats. Given the increasing availability of ATAC-seq data and methods to find TF binding therein, we hypothesized that we could use TF footprinting tools to find clusters of TF binding events within accessible chromatin that may act as CREs. Using the Drosophila anterior-posterior patterning network as a test bed, we used a digital genomic footprinting tool (DGT), TOBIAS, on previously published early embryo ATAC-seq data to characterize the TF footprint landscape of 16 TFs essential for embryonic patterning. Even in this system, with its extensive enhancer annotation, most footprinted TF binding sites lie outside of known enhancers, with intergenic and intronic regions hosting the highest TF footprint count, albeit at low density. To find potential novel enhancers, we identified high-density TF footprint clusters that are highly conserved and overlap with active enhancer histone mark signals. Five high-confidence candidates were selected for reporter assay validation and all five were found to drive spatially patterned expression in the embryo. This study shows that even in a highly characterized system, the analysis of footprinted TF binding sites in ATAC-seq data can uncover new regulatory regions and suggests this approach may be helpful in using existing ATAC-seq data to find novel CREs. ARTICLE SUMMARY: Given the increasing availability of ATAC-seq datasets, workflows to exploit the data to uncover new cis-regulatory elements (CREs), including enhancers, are valuable. 
Using early anterior-posterior patterning in the Drosophila embryo as a test case, we find that previously published transcription factor footprinting tools and ATAC-seq data can be analyzed to yield new candidate CREs. Experimental validation confirms the activity of selected candidate CREs, suggesting that existing data can be analyzed to find novel regulatory elements.

16

Genome-Wide Markers Predict Metribuzin Tolerance in Southern Soft Red Winter Wheat

Sellani, J.; Anzueto, H.; Arcenaux, K.; Price, P. T.; Brown-Guedira, G.; Harrison, S.; DeWitt, N.

2026-07-03 genomics 10.64898/2026.06.28.733875 medRxiv

Top 2%

0.3%

Metribuzin is a versatile herbicide effective against various annual grasses and broadleaf weeds found in wheat fields. However, it can cause foliar damage to wheat, impacting plant health and yield. A clearer understanding of the genetic architecture associated with metribuzin tolerance is necessary to guide marker-based breeding strategies. This study evaluated 351 historic Gulf Atlantic Wheat Nursery (GAWN) wheat breeding lines representative of southern US soft red winter wheat (SRWW) germplasm. Field trials were conducted at Winnsboro (WN) and Baton Rouge (BR), Louisiana, in 2016 and 2017. Metribuzin was applied at specific growth stages, and tolerance was assessed based on visual foliar damage. Genomic data from 6,252 filtered single nucleotide polymorphism (SNP) markers were used to estimate narrow-sense heritability, conduct genome-wide association studies (GWAS), and assess genomic prediction accuracy using genomic best linear unbiased prediction (GBLUP). Broad-sense heritability ranged from 0.54 to 0.69 within environments and reached 0.77 across environments, while narrow-sense heritability ranged from 0.35 to 0.47, indicating moderate additive genetic control. No SNP surpassed the significance threshold, but genomic prediction (GP) showed moderate to strong predictive ability (PA) across environments, with the highest accuracy (r = 0.62) observed between BR17 and WN17. These results indicate that metribuzin tolerance in SRWW is primarily controlled by multiple small-effect loci and that genomic selection (GS) provides a more effective breeding strategy than marker-assisted selection for improving tolerance in southern wheat germplasm.

17

Gene model for the ortholog of raptor in Drosophila grimshawi

Lieser, B. C.; Lose, B.; Kiser, C. A.; Butterfield, S.; Laschober, L.; Laskowski, L. F.; Nielsen, J.; Pulford, J.; Thompson, J. S.; Rele, C. P.; Wittke-Thompson, J. K.

2026-07-11 genomics 10.64898/2026.07.07.737051 medRxiv

Top 2%

0.3%

Gene model for the ortholog of raptor in the May 2011 (Agencourt dgri_caf1/DgriCAF1) Genome Assembly (GenBank Accession: GCA_000005155.1) of Drosophila grimshawi. This ortholog was characterized as part of a developing dataset to study the evolution of the Insulin/insulin-like growth factor signaling pathway (IIS) across the genus Drosophila using the Genomics Education Partnership gene annotation protocol for Course-based Undergraduate Research Experiences.

18

Heat stress drives opposing redox shifts in temperate versus tropical Drosophila melanogaster embryos

O'Leary, T. S.; Lockwood, B. L.

2026-07-03 evolutionary biology 10.64898/2026.06.30.733001 medRxiv

Top 2%

0.3%

Redox balance is central to aerobic metabolism, yet acute heat stress can destabilize this balance by increasing metabolic rates and shifting the balance of critical electron carriers such as NADH. In early Drosophila melanogaster embryos, maintaining redox balance is particularly critical as embryos undergo a developmental redox shift and rely on oxidative phosphorylation to power nuclear divisions. Here, we assayed six isofemale D. melanogaster lines from temperate (Vermont, USA; France; Japan) and tropical (St. Kitts; Ghana; India) climates to assess metabolic responses to heat in heat-sensitive versus heat-tolerant embryos. We used untargeted LC-MS to measure 33 metabolites and the major redox couples (NADH/NAD+, NADPH/NADP+, and GSH/GSSG) at 25°C and after a 32°C heat shock. In all embryos, heat shock induced shared shifts in metabolic profiles, with increases in nucleotide monophosphates (e.g., AMP, CMP, and GMP) and amino acids (e.g., alanine, glutamic acid, serine). In contrast, redox metabolites diverged by region: heat-sensitive temperate embryos shifted toward a more oxidized state (46.6% decrease in NADH/NAD+ ratio and 4-fold increase in oxidized glutathione), while heat-tolerant tropical embryos maintained glutathione balance and increased the NADH/NAD+ ratio by 52.9%, indicating a more reduced state. These patterns are consistent with higher NADH oxidation and greater oxidative stress (inferred from oxidized glutathione) in the temperate embryos, versus better maintenance of redox balance in tropical embryos. Together, our results suggest that maintaining redox balance is a key determinant of acute heat tolerance, and healthy development overall, during early embryogenesis.

19

Comparison of directional random walk and weighted least squares modeling of sparse fossil data

Ergon, R.

2026-07-01 evolutionary biology 10.64898/2026.06.26.734751 medRxiv

Top 2%

0.3%

The general random walk model (GRW) of Hunt (2006) is used to infer directional evolution in mean trait values from sparse fossil data by modeling phenotypic change as the accumulated result of small steps with mean step sizes and step variances. Using simulations and real data cases, Ergon (2026) showed that the step variances can be estimated reasonably well only when the mean trait values have small measurement errors, while for fossil data with realistic measurement errors they appear to be extremely difficult to find, and they are often found to be negative. In the simulations Ergon (2026) assumed that the true phenotypic mean values were known. Here, I essentially repeat these simulations under the assumption that only mean trait values with large measurement errors are known, and based on weighted mean squared error (WMSE) comparisons the conclusion is that weighted least squares (WLS) is a better method than GRW. A second conclusion is that WLS is a better method also in the possibly rare cases with large measurement errors where the GRW parameters are estimated well. The GRW method is simply not flexible enough to handle such cases. A third conclusion is that Akaike Information Criterion (AIC) results for GRW models with large measurement errors relative to the step variance may be overly optimistic.
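As an illustration of the weighted least squares (WLS) approach named in this abstract, the following is a minimal sketch, not the author's implementation. It assumes a straight-line trend in mean trait value over time and that each sample mean comes with a known measurement-error variance, so that observations are weighted by the inverse of that variance. The function name `wls_line` and the example numbers are hypothetical.

```python
import numpy as np

def wls_line(t, y, var):
    """Weighted least squares fit of y = a + b*t with weights 1/var."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(var, dtype=float)     # noisier means get less weight
    X = np.column_stack([np.ones_like(t), t])  # design matrix: intercept and slope
    XtW = X.T * w                              # weight each observation's column
    XtWX = XtW @ X
    beta = np.linalg.solve(XtWX, XtW @ y)      # solve the weighted normal equations
    cov = np.linalg.inv(XtWX)                  # parameter covariance if var is correct
    return beta, cov

# Hypothetical example: four sample means with unequal measurement-error variances.
t = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
var = [0.1, 0.5, 0.2, 1.0]
beta, cov = wls_line(t, y, var)
print(beta)
```

The slope `beta[1]` plays the role of a directional trend here; a sample with a large measurement-error variance pulls the fitted line less than a precisely measured one.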

20

Gene model for the ortholog of raptor in Drosophila erecta

Backlund, A. E.; Nielsen, J.; Pulford, J.; Cook, B.; Anderson, J.; Robert, M.; Thompson, J. S.; Rele, C. P.; Wittke-Thompson, J. K.

2026-07-14 genomics 10.64898/2026.07.09.737526 medRxiv

Top 2%

0.3%

Gene model for the ortholog of raptor in the May 2011 (Agencourt Dere_CAF1/DereCAF1) Genome Assembly (GenBank Accession: GCA_000005135.1) of Drosophila erecta. This ortholog was characterized as part of a developing dataset to study the evolution of the Insulin/insulin-like growth factor signaling pathway (IIS) across the genus Drosophila using the Genomics Education Partnership gene annotation protocol for Course-based Undergraduate Research Experiences.