G3
Oxford University Press (OUP)
Preprints posted in the last 30 days, ranked by how well they match G3's content profile, based on 33 papers previously published here. The average preprint has a 0.02% match score for this journal, so anything above that is already an above-average fit.
Adamu Bukari, A.-R.; Sidney, B.; Gerstein, A. C.
Nakaseomyces glabratus is a globally distributed opportunistic fungal pathogen. An ongoing discussion in studies of N. glabratus population structure has been whether genetic clusters are best defined using multilocus sequence typing (MLST) or short-read whole-genome sequencing (WGS). To assess the concordance between MLST- and WGS-based phylogenies, we analyzed a dataset of 548 N. glabratus WGS sequences from 12 countries. Clusters identified from WGS largely recapitulated the MLST-defined sequence type (ST) groups: fourteen WGS clusters were composed of a single MLST ST, and the remaining clusters contained STs with very closely related MLST profiles. We thus propose a pragmatic naming convention, consistent with the system used in other microbial species, which specifies WGS cluster labels based on the primary ST. From the large WGS isolate dataset, we determined the prevalence of admixture and genomic variants. Interestingly, seven of the nine singleton isolates were admixed, in addition to 58 isolates from six different clusters. Aneuploidy was detected in 4% of isolates, most commonly in chrE, which contains ERG11, the gene encoding the enzyme targeted by azole antifungals. Aneuploid chromosomes did not exhibit elevated heterozygosity relative to the sequencing error rate, consistent with instability of extra chromosome copies. Copy number variants (CNVs) were found in 3% of the isolates; some of the CNVs co-occurred with aneuploidies, and they were primarily identified on chrD, chrE, chrI, and chrM. Our findings demonstrate that deep splits between clusters preserve the utility of MLST STs for clade-level designation, yet underscore the value of WGS for high-resolution genomic analyses.
Article Summary: There is an ongoing debate in studies on Nakaseomyces glabratus about whether traditional MLST analysis is sufficient to determine population structure, or whether the precision of whole genome sequencing (WGS) is necessary. We analyzed WGS data from 548 isolates from around the world. We found very strong agreement between the two methods. We propose a hybrid naming system, where cluster names are based on the dominant MLST group. We used the WGS data to show that admixed isolates and those with extra chromosomes or CNVs are rare (<7% of isolates in each class) and are distributed throughout the phylogeny.
Hodehou, D. A. T.; Diatta, C.; Bodian, S.; Ndour, M.; Sambakhe, D.; Sine, B.; Felderhoff, T.; Diouf, D.; Morris, G. P.; Kane, N. A.; Faye, J. M.
Grain mold severely constrains sorghum [Sorghum bicolor (L.) Moench] productivity and grain quality in subhumid environments. Photoperiod-sensitive flowering plays a key role in mold avoidance and yield stability along north-south rainfall gradients. In response to the high susceptibility of elite cultivars in subhumid zones of Senegal, we developed and characterized a recombinant inbred line (RIL) population derived from Nganda (grain mold-susceptible) and Grinkan (photoperiod-sensitive) varieties. The population was evaluated across three distinct agro-ecological zones over two years. Environmental indices derived from genotype-environment interactions, together with defined growth windows, strongly influenced flag leaf appearance (FLA), a photoperiodic flowering trait. Plasticity parameters (intercept and slope) for environmental indices, FLA, grain mold severity, and yield enabled identification of loci contributing to flowering response, mold resistance, and yield stability. The maturity gene Ma1 and two QTLs for FLA, qFLA6.2 and qFLA6.3, were identified, stable across environments, and colocalized with grain mold and yield QTLs. The wild-type Ma1 allele from Grinkan delayed FLA and reduced grain mold damage but was not associated with increased yield. The Ma1 effect was confirmed using the developed breeder-friendly KASP marker, Sbv3.1_06_40312464K, in 174 F3 three-way cross families. Photoperiod-sensitive lines with intermediate-to-late FLA alleles showed strong negative associations with mold damage. Overall, the identified stable loci and candidate lines provide foundations for effective molecular breeding of climate-resilient varieties.
Plain Language Summary: Grain mold is a fungal disease that reduces sorghum grain yield and quality, particularly in subhumid climates. With the limited number of resistant elite varieties, photoperiod-sensitive flowering, which responds to day-length variation, can contribute to grain mold escape at the end of rainy seasons. We characterized 286 sorghum recombinant inbred lines across three contrasting environments over two years along rainfall gradients in Senegal. Using flag leaf appearance (FLA), a photoperiodic flowering trait, we revealed strong genotype-environment interactions for FLA and genotypic plasticity. We identified and validated the common genomic locus associated with FLA variation and its plasticity across environments, the canonical maturity gene Ma1, which was influenced by temperature variation across environments. The presence of Ma1 in the background of photoperiod-sensitive lines enhances grain mold avoidance and yield stability along rainfall gradients in Senegal.
Core Ideas
- We investigated photoperiodic flowering plasticity in sorghum as a contributor to grain mold resistance and yield stability along rainfall gradients.
- The Maturity locus Ma1 (qFLA6.1) is the major contributor of photoperiodic flowering and its plasticity across semi-arid and subhumid environments.
- Hybrid genotypes carrying two stable loci qFLA6.1 and qFLA6.2 sustain high grain mold avoidance in diverse environments.
- Photoperiod-sensitive lines with medium to late flowering times are effective in avoiding grain mold, while maintaining yield stability in subhumid regions.
Madrigal, M.; Dowell, J. A.; Moseley, J. C.; Kliebenstein, D.
Botrytis cinerea is a necrotrophic fungal pathogen that infects thousands of plant species. During infection, these diverse plant hosts produce different specialized metabolites that can inhibit pathogen growth and shape pathogen fitness. However, the genetic architecture of pathogen resistance toward individual host defense metabolites remains poorly understood. To address this question, we exposed 83 B. cinerea isolates to the metabolite linalool and quantified metabolic and structural responses. Exposure revealed extensive phenotypic diversity across isolates. Genome-wide association identified 101 genes of interest associated with membrane transport and stress response regulation. Genetic associations were stronger for morphological traits than for metabolic traits, suggesting that hyphal architecture may have a complex genetic architecture contributing to linalool resistance. Together, these results establish natural variation in linalool response and provide candidate loci for understanding how generalist pathogens respond to host-derived chemical defenses.
Article Summary: To understand how a generalist pathogen responds to host defenses, we asked how Botrytis cinerea responds to linalool, a widespread monoterpene involved in plant defense. We exposed 83 B. cinerea isolates to 1000 µM of linalool for 72 hours and quantified metabolic traits (growth curves and growth dynamics over time) and morphological traits (hyphal network features). Using GWA, we linked phenotypic variation to genetic variants. Results indicate substantial natural variation in linalool resistance and distinct genetic architectures across trait classes: metabolic responses are driven by a relatively small number of loci with larger effects, whereas structural/morphological responses appear more polygenic.
Kuster, R. D.; Sisler, P.; Sandhu, K.; Yin, L.; Niece, S.; Krueger, R.; Dardick, C.; Keremane, M.; Ramadugu, C.; Staton, M. E.
Background: Pangenomes are a promising new approach to genomics that can reduce reference bias in genotyping, but the reliability of such a data model remains unclear in tracking variation across species. To test the utility of graph-based pangenomes for interspecific breeding, we developed a Minigraph-Cactus super-pangenome representing four Citrus species derived from the founder lines of a citrus breeding program. To benchmark SNP calling accuracy using graph and linear-based approaches, we performed whole genome short read sequencing for two sets of pedigreed progeny: 30 F1 hybrids and 244 advanced hybrids from an F1 crossed with a parent not included in the pangenome.
Results: The linear approach yielded more SNP calls than the graph-based approach; however, both methods exhibited similar Mendelian Inheritance Error Rates (MIER) in a tool-dependent manner. Reconstruction of parental haplotype blocks in the advanced hybrids revealed a striking improvement in performance in the pangenome graph-based calls, suggesting MIER is vulnerable to error when reference bias influences both parental and progeny genotype calls. Masking of regions diverged from the reference path improved MIER accuracy metrics and haplotype block reconstruction in both the linear and graph-based SNP calls.
Conclusions: In non-model systems, inheritance patterns observed from pedigreed hybrids provide a framework for benchmarking variant-calling accuracy using pangenomes. SNP miscalls originating from diverged regions can falsely satisfy MIER filters, so we recommend benchmarking with haplotype blocks. The inherent structure of the pangenome graph has promising applications for removing regions of unreliable mapping quality, which cannot otherwise be reliably removed using traditional filtering metrics.
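As an illustration of the benchmarking idea in this abstract, the following minimal sketch computes a Mendelian inheritance error rate over parent-parent-progeny trios. It is not the authors' pipeline: the genotype encoding (alternate-allele counts 0, 1, or 2 at biallelic diploid sites, `None` for a missing call) and the function names `is_mendelian` and `mier` are assumptions made for this example.

```python
from itertools import product

# Assumed encoding: genotype = number of alternate alleles (0, 1, 2).
# Alleles each genotype can transmit: 0 = reference, 1 = alternate.
_GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def is_mendelian(parent1, parent2, child):
    """Return True if the child genotype can arise from the two parents."""
    possible = {a + b for a, b in product(_GAMETES[parent1], _GAMETES[parent2])}
    return child in possible


def mier(trios):
    """Fraction of fully called trios that violate Mendelian inheritance.

    `trios` is an iterable of (parent1, parent2, child) genotypes;
    trios containing a missing call (None) are skipped.
    """
    called = [t for t in trios if None not in t]
    if not called:
        return float("nan")
    errors = sum(not is_mendelian(*t) for t in called)
    return errors / len(called)


if __name__ == "__main__":
    example = [(0, 0, 0), (0, 0, 1), (1, 1, 2), (2, 2, 0), (None, 1, 1)]
    print(mier(example))  # two of the four called trios are inconsistent
```

A trio in which reference bias pulls all three calls to 0 still passes this check, which is the weakness the abstract attributes to MIER-based filtering.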
Kesälahti, R.; Cervantes, S.; Niskanen, A.; Pyhäjärvi, T.
Genomic imprinting is a rare epigenetic phenomenon in plants and animals, defined by parent-of-origin specific gene expression. Its molecular mechanisms and evolutionary significance remain incompletely understood. In this study, we investigated whether genomic imprinting occurs in Scots pine and, by extension, in other conifers to gain insight into the evolutionary origins of imprinting. We performed reciprocal crosses to assess imprinting in seed embryos and applied a unique approach that used exome-capture data from the haploid, maternally inherited megagametophyte tissue to identify maternal alleles, thereby allowing us to infer paternal alleles in the embryos of the same seeds. Our findings show that maternally inherited haploid megagametophyte tissue offers an effective strategy for resolving parental alleles in offspring while simultaneously removing extensive paralogous variation from the dataset. This framework is broadly applicable to other conifer species and to taxa that possess comparable maternally derived haploid tissues. No evidence of genomic imprinting was detected. Although the limited overlap between the exome-capture and RNA-sequencing datasets and the stringent paralog filtering reduced the amount of analyzable data considerably, the absence of detectable imprinting may also reflect genuinely weak or absent imprinting signals in conifers. We identified several limitations in this preliminary study and outline recommendations for future work to overcome them; additional research will be necessary to determine whether genomic imprinting occurs in conifers.
Bankina, B.; Fomins, N.; Gudra, D.; Kaneps, J.; Bimsteine, G.; Roga, A.; Stoddard, F.; Fridmanis, D.
Leaf diseases pose a serious threat to faba bean production. Leaf blotch of faba bean, caused by Alternaria spp., has become increasingly widespread and destructive in several countries. Infection of plants by pathogens can be influenced by various factors associated with the host plant, environmental conditions, and the presence of other microorganisms. The phyllosphere and endosphere play a critical role in plant health and disease development. This study aimed to evaluate the factors shaping the structure and diversity of fungal communities associated with faba beans. Plant samples were collected in 2004 from two intensively managed faba bean production fields in the central region of Latvia. Fungal assemblages were characterized using an ITS region metabarcoding approach based on Illumina MiSeq sequencing. Among the assigned amplicon sequence variants (ASVs), 65% belonged to the phylum Ascomycota, while approximately 4% were classified as Basidiomycota. Alternaria and Cladosporium were the dominant genera across samples. The alpha and beta diversities of fungal communities were higher during flowering of faba beans than during ripening. A higher abundance of Basidiomycota yeasts was observed during flowering; in contrast, the genus Cladosporium was significantly more abundant during ripening. Alternaria DNA was found on leaves that showed no symptoms of the disease. The diversity and composition of fungal communities were significantly influenced by sampling time and the presence of leaf blotch caused by Alternaria spp.
Couturier, F.; Cravero, C.; Lesur, I.; Confais, J.; Belmonte, E.; Piat, L.; Marande, W.; Rellstab, C.; Valbuena, M.; Saez-Laguna, E.; Duvaux, L.
We present a genome assembly from a specimen of Quercus canariensis (Fagaceae; Fagales; Magnoliopsida). The assembly was generated using PacBio HiFi long reads with an approximate sequencing depth of 39X and scaffolded using a reference-guided approach. The genome sequence has a total length of 816.0 megabases for haplotype 1 and 804.8 megabases for haplotype 2. The two haplotypes are each resolved into 12 chromosomal pseudomolecules, with only 3.48% and 1.36% of sequences remaining unplaced in haplotypes 1 and 2, respectively. Assembly completeness is supported by BUSCO scores of 98.3% and 98.2% complete genes for haplotypes 1 and 2, respectively. Structural annotation identified 51,882 and 46,482 protein-coding genes in haplotypes 1 and 2, respectively. This genome assembly provides the first chromosome-scale reference genome for Q. canariensis, laying the base for future genomic and evolutionary studies in this understudied species of the hybridizing white oak species complex.
Taxonomy: Lineage: cellular organisms; Eukaryota; Viridiplantae; Streptophyta; Embryophyta; Tracheophyta; Spermatophyta; Magnoliopsida; eudicotyledons; Gunneridae; Pentapetalae; rosids; fabids; Fagales; Fagaceae; Quercus. EBI:txid568684 Quercus canariensis Willd. 1809 (Willdenow)
Sattler, M. C.; Singh, A.; Bass, H. W.; Mondin, M.
Background: Maize knobs are regions of constitutive heterochromatin that are readily identified in both meiotic and somatic chromosomes. These structures have been characterized as stable throughout the cell cycle, exhibiting late replication during the S-phase, and are composed of two specific families of highly repetitive DNA sequences: K180 and TR-1. Although widely used as cytogenetic markers due to their variability in number and chromosomal position across inbred lines, hybrids, and landraces, little is known about their chromatin structure and dynamics. In this study, we analyzed chromatin accessibility of knobs using DNS-seq data across four maize tissues representing distinct developmental stages.
Results: Our results reveal that K180 knobs exhibit tissue-specific variation in chromatin accessibility, transitioning between open and closed states during development. In contrast, the TR-1 knob of chromosome 4 remained consistently inaccessible across all tissues analyzed. A knob composed of both K180 and TR-1 further supported this observation, with only the K180 region showing dynamic accessibility. To validate these findings, we also analyzed other repetitive regions such as centromeres, which showed a uniformly closed chromatin structure similar to TR-1. These results suggest a unique developmental modulation of chromatin accessibility associated with K180 repeats. While the chromatin accessibility of knobs does not reach the levels observed at Transcription Start Sites (TSS), the comparison among different classes of repetitive DNA within maize constitutive heterochromatin provides compelling evidence for sequence-specific and tissue-specific chromatin dynamics.
Conclusions: Our findings uncover a previously unrecognized property of maize knobs and establish a reference for future studies on chromatin organization and epigenetic regulation of repetitive DNA in plant genomes.
Shaik, A.; Sacks, E.; Leakey, A. D. B.; Zhao, H.; Kjeldsen, J. B.; Jorgensen, U.; Ghimire, B. K.; Lipka, A. E.; Njuguna, J. N.; Yu, C. Y.; Seong, E. S.; Yoo, J. H.; Nagano, H.; Anzoua, K. G.; Yamada, T.; Chebukin, P.; Jin, X.; Clark, L. V.; Petersen, K. K.; Peng, J.; Sabitov, A.; Dzyubenko, E.; Dzyubenko, N.; Glowacka, K.; Nascimento, M.; Campana Nascimento, A. C.; Dwiyanti, M. S.; Bagment, L.; Proma, S.; Garcia-Abadillo, J.; Jarquin, D.
Environmental factors affect crop growth and development, so their consideration across sites and years becomes essential for genotypic evaluation. Genomic selection (GS) has been broadly implemented to accelerate breeding cycles by skipping field evaluations, thus allowing early identification of outperforming genotypes. In this study, 7,740 phenotypic records corresponding to 516 Miscanthus sacchariflorus genotypes evaluated in five locations across three years were considered for analysis. Additionally, environmental data on six weather covariates were used to characterize similarities between locations. Sets of locations of variable size were used for model calibration under two cross-validation schemes (CV00 and CV0), leaving out one location at a time. Predictive ability across locations of the best model varied between 0.45 and 0.90 for both schemes. We then related predictive ability to the similarity of weather patterns between training and testing sets in order to optimize model calibration. We found it is feasible to optimize resource allocation by considering environmentally correlated sets. In most cases, the information from only one and, at most, two locations was enough to deliver better results than using all four locations, reducing training sets by up to 75%. The results obtained help breeders make informed decisions that take weather data into account when designing evaluations.
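The leave-one-location-out design described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the record layout (dictionaries with a `location` key), the helper names `leave_one_location_out` and `pearson`, and the use of the Pearson correlation between predicted and observed values as "predictive ability" are choices made for this example and are not taken from the preprint.

```python
import math


def leave_one_location_out(records):
    """Yield (held_out_location, training_records, testing_records).

    Each record is assumed to be a dict with at least a "location" key.
    """
    locations = sorted({r["location"] for r in records})
    for loc in locations:
        train = [r for r in records if r["location"] != loc]
        test = [r for r in records if r["location"] == loc]
        yield loc, train, test


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical toy records; real data would carry genotype IDs and traits.
    data = [
        {"location": "A", "yield": 1.0},
        {"location": "B", "yield": 2.0},
        {"location": "C", "yield": 3.0},
        {"location": "A", "yield": 1.5},
    ]
    for loc, train, test in leave_one_location_out(data):
        print(loc, len(train), len(test))
```

Restricting `train` to a subset of the remaining locations, for example those whose weather covariates correlate best with the held-out site, would mimic the training-set optimization the abstract reports.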
Hayes, R. A.; Kern, A. D.; Ponisio, L. C.
Pollen is a robust and widespread substance that captures a historical snapshot of a specific time and place, and it can be used to track movements through space by examining the pollen deposited on various objects. Palynology, the study of pollen, is used across fields such as conservation, natural history, and forensics, where it is particularly useful for tracing the origin and movement of objects. However, pollen has remained underutilized due to the difficulty of distinguishing many pollen taxa beyond the family level and limited pollen reference material to support location predictions. Recent developments in pollen DNA metabarcoding have addressed these issues, but much of the available pollen data comes from wind-pollinated species, which are widespread and less informative of specific sample locations. Bee-collected pollen presents an untapped resource for training predictive models to geolocate sample origin. Here we compiled bee-collected pollen DNA sequence relative abundance data from three projects in the western U.S. and assessed the accuracy of supervised machine learning models in predicting the location of sample origin based solely on pollen assemblage, without incorporating additional data. Random Forest and k-Nearest Neighbors models yielded high accuracy across all projects. We also found that models trained on taxonomically clustered pollen amplicon sequence variants (ASVs) performed slightly better than those trained on raw sequence data, but the difference was minor, indicating that models trained on raw sequence data can reliably predict location and avoid the time-consuming taxonomic assignment process. Our results demonstrate the utility of repurposing bee-collected pollen for geolocation and provide a framework for employing supervised machine learning in future geolocation efforts.
Highlights
- Bee-collected pollen metabarcoding data was used to accurately predict sample origin
- Random Forest and k-Nearest Neighbors algorithms were most accurate with lowest error
- Taxonomically-classified and raw DNA sequence data training sets performed comparably
Bachler, A.; Walsh, T. K.; Andrews, D.; Williams, M.; Tay, W. T.; Gordon, K. H.; James, B.; Fang, C.; Wang, L.; Wu, Y.; Stone, E. A.; Padovan, A.
Background: The cotton bollworm Helicoverpa armigera is a major global pest controlled by genetically engineered crops expressing Bacillus thuringiensis (Bt) toxins, including Vip3Aa. While Vip3Aa is widely deployed, the genetic basis of resistance remains poorly understood. Previous work identified disruption of a thyroglobulin-like gene (HaVipR1) as one mechanism of resistance, suggesting additional loci may be involved.
Results: Using linkage analysis, transcriptomics, long-read sequencing, and CRISPR-Cas9 gene editing, we identify a second thyroglobulin-like gene, HaVipR2, as a novel mediator of Vip3Aa resistance. Resistance in a field-derived H. armigera line was shown to be monogenic, recessive, and autosomal, mapping to chromosome 29. Long-read sequencing revealed a ~16 kb transposable element insertion disrupting HaVipR2, which was undetectable using standard short-read approaches. CRISPR-Cas9 knockout of HaVipR2 conferred >900-fold resistance, confirming its causal role. Comparative analyses show that HaVipR1 and HaVipR2 share conserved domain architecture, indicating that thyroglobulin-domain proteins represent a recurrent target of resistance evolution.
Conclusions: Our findings establish thyroglobulin-domain proteins as a new class of Bt resistance genes in Lepidoptera and demonstrate that transposable element insertions can drive adaptive resistance while evading detection by conventional methods. These results highlight the importance of long-read sequencing and accurate genome annotation for resistance monitoring and provide new insights into the molecular basis and evolution of Vip3Aa resistance.
Proma, S.; Garcia-Abadillo, J.; Sagae, V. S.; Sacks, E.; Leakey, A. D. B.; Zhao, H.; Ghimire, B. K.; Lipka, A. E.; Njuguna, J. N.; Yu, C. Y.; Seong, E. S.; Yoo, J. H.; Nagano, H.; Anzoua, K. G.; Yamada, T.; Chebukin, P.; Jin, X.; Clark, L. V.; Petersen, K. K.; Peng, J.; Sabitov, A.; Dzyubenko, E.; Dzyubenko, N.; Glowacka, K.; Nascimento, M.; Campana Nascimento, A. C.; Dwiyanti, M. S.; Bagment, L.; Shaik, A.; Jarquin, D.
Genomic selection holds the potential to serve as a strategic tool to enhance the genetic gain of complex traits in Miscanthus breeding programs. The development of improved cultivars requires their assessment for various traits across diverse environments to ensure suitable overall performance. Hence, multi-trait multi-environment (MTME) genomic prediction (GP) models offer an opportunity to improve selection accuracy. This study aims to evaluate the potential of five GP models: (1) three MTME models including genotype-by-trait-by-environment interaction (GxExT) and (2) two single-trait multi-environment (STME) models (with and without GxE interaction). A Miscanthus sacchariflorus population comprising 336 genotypes evaluated in three environments and scored for four traits (biomass yield YDY, total culm number TCM, average internode length AIL, and culm node number CNN) was analyzed. The predictive ability of the models was evaluated considering three cross-validation schemes resembling realistic scenarios (CV1: predicting new genotypes, CVP: predicting missing traits in a given environment, and CV2: predicting partially observed genotypes). On average across all cross-validation schemes, the predictive ability of the MTME models was 10% to 70% higher than that of the STME models for TCM and AIL. On the other hand, for YDY and CNN, both STME models performed similarly to or slightly better than the MTME models (by 5% to 64%) in most environments. While the MTME models were not successful for all traits when compared to their STME counterparts, MTME models improved the prediction of the performance of genotypes that were untested across environments or lacked trait information in a specific environment. Overall, our study suggests that MTME GP models can be implemented in Miscanthus breeding programs to improve the predictive ability for complex traits, shorten breeding cycles, and accelerate selection decisions.
Herrighty, E. M.; Specht, C. D.; Gore, M. A.; Solano, L.; Estrada-Gamboa, J.; Hernandez, C. E.; Tufan, H. A.; Landis, J. B.
Understanding crop genetic diversity is essential for conservation and breeding, yet farmer-maintained germplasm remains largely underrepresented in genomic studies. Theobroma cacao L. has a complex domestication history and extensive global diversity, and cacao currently cultivated in Central America, particularly in Costa Rica, has been understudied compared to South American and Mexican cultivars despite cultural and historical importance. In this study, we investigate the genetic diversity of cacao from farmer-managed systems across Costa Rica to search for Criollo germplasm and identify and characterize any unique local genetic groups. Ninety-four trees were sampled from 17 farms across four regions of the country and sequenced using whole genome resequencing. Farmer materials were analyzed alongside 166 previously characterized reference accessions representing major cacao genetic groups. Population structure analyses, phylogenetic reconstruction, and network approaches revealed that Costa Rican cacao encompasses multiple known genetic groups, including Criollo-derived lineages, while also harboring locally distinct diversity not fully represented in current global reference collections. Analyses revealed close kinship between many accessions with no clear geographic patterns corresponding to the observed population differentiation, reflecting the effects of farmers in creating dominant patterns of gene flow through seed-saving, clonal propagation, and sharing genotypes among farms. Heterozygosity levels varied substantially among individuals, consistent with a mixture of highly inbred Criollo trees and more heterozygous, admixed genotypes. We find that farmer-managed cacao systems are reservoirs of genetic diversity, including possibly rare or historically important lineages, underscoring the value of these farming systems for effective conservation and management of genomic resources for cacao resilience and improvement.
Li, F.; Lima, D.; Bashir, S.; Yadro Garcia, C.; Lopes, A. R.; Verbinnen, G.; de Graaf, D. C.; De Smet, L.; Rodriguez, A.; Rosa-Fontana, A.; Rufino, J.; Martin-Hernandez, R.; Medibees Consortium; Pinto, M. A.; Henriques, D.
The western honey bee (Apis mellifera) is an essential pollinator facing unprecedented threats from pesticide exposure. While pesticide resistance evolution is well documented in agricultural pests, our understanding of genetic variation in honey bee detoxification systems remains limited. This represents a missed opportunity, as harnessing naturally occurring detoxification diversity could provide new avenues for pollinator protection. Cytochrome P450 monooxygenases (CYPs), which are central to xenobiotic metabolism, offer a promising starting point. Here, we present the first comprehensive analysis of CYP genetic diversity in A. mellifera. We analysed the CYPome of 1,467 individuals representing 18 A. mellifera subspecies from 25 countries and identified 5,756 single-nucleotide polymorphisms (SNPs) in 46 CYP genes. Imputed McDonald-Kreitman testing revealed that 56% of non-synonymous CYP substitutions were driven by positive selection. Of the 1,302 haplotypes identified, 84% resided in CYP3, concentrated in the CYP9 and CYP6AS subfamilies implicated in xenobiotic detoxification. Population-level analysis of nucleotide diversity, Tajima's D selection signatures, FST-based differentiation, and McDonald-Kreitman testing pointed to CYP3 clan genes as the primary locus of adaptive variation. This work provides the first step toward building a comprehensive pharmacogenomic resource for honey bees, enabling the prediction of population-specific pesticide vulnerabilities and leveraging naturally occurring detoxification variants to enhance pollinator resilience, a critical step toward sustainable pollinator management.
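The proportion of non-synonymous substitutions attributed to positive selection is conventionally estimated from McDonald-Kreitman counts as α = 1 − (Ds·Pn)/(Dn·Ps). The sketch below implements that standard estimator only. The imputed variant used in the study additionally corrects for slightly deleterious polymorphism, which is not reproduced here, and the function name `mk_alpha` and the example counts are hypothetical.

```python
def mk_alpha(dn, ds, pn, ps):
    """Standard McDonald-Kreitman estimate of alpha.

    dn, ds: non-synonymous and synonymous fixed differences (divergence)
    pn, ps: non-synonymous and synonymous polymorphisms
    Returns 1 - (ds * pn) / (dn * ps).
    """
    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when dn or ps is zero")
    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Hypothetical counts chosen only to show the arithmetic.
    print(mk_alpha(dn=10, ds=10, pn=5, ps=10))
```

A value of 0 means divergence and polymorphism ratios agree with neutrality; positive values are read as the fraction of non-synonymous substitutions fixed by positive selection, and negative values usually point to segregating deleterious variants.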
Iitsuka, R.; Haruta, N.; Oomura, S.; Sugimoto, A.
Dauer larvae are a dormant developmental stage in nematodes that is induced by a range of environmental cues. The molecular mechanisms that transduce these cues to regulate dauer entry have been well characterized in Caenorhabditis elegans, whereas those in other nematode species remain unclear. The closest known sibling species of C. elegans, Caenorhabditis inopinata, occupies a distinct ecological niche and shows an extremely low frequency of dauer formation by starvation in laboratory conditions, suggesting that it could serve as a useful comparative model for analyzing dauer-inducing mechanisms. To support such analysis, we generated a fluorescent dauer reporter, Cin-col-183p::mCherry, in C. inopinata based on a previously reported dauer-specific reporter in C. elegans. This reporter showed fluorescence specifically in the pre-dauer and dauer stages, but not in other developmental stages, indicating that it functions as a dauer-specific marker in C. inopinata. Using these marker strains, we compared the responses to high temperature and RNAi-mediated knockdown of insulin/IGF-1 pathway genes (daf-2, age-1, and pdk-1), and found that dauer induction differs mechanistically between C. elegans and C. inopinata. This dauer-specific fluorescent strain will be a useful tool for investigating the diversity of dauer-inducing mechanisms across nematode species.
Article Summary: Dauer is a dormant developmental stage in nematodes induced by environmental stress. Although its regulation is well studied in Caenorhabditis elegans, the mechanisms in other species remain unclear. Here, we developed a fluorescent dauer reporter, Cin-col-183p::mCherry, in Caenorhabditis inopinata, a close relative of C. elegans. The reporter was specifically expressed in pre-dauer and dauer stages, confirming its usefulness as a dauer marker. Using this strain, we found that responses to high temperature and insulin/IGF-1 pathway gene knockdown differ between C. elegans and C. inopinata. This reporter will help reveal diversity in dauer-inducing mechanisms across nematode species.
Guilbaud, R.; Bagnoli, F.; Ben-Sadoun, S.; Biselli, C.; Buret, C.; Buiteveld, J.; Cativelli, L.; Copini, P.; Drouaud, J.; Esselink, D.; Fricano, A.; Benoit, V.; Kelly, L. J.; Kodde, L.; Metheringham, C. L.; Pinosio, S.; Rogier, O.; Segura, V.; Spanu, I.; Tumino, G.; Buggs, R. J.; Gonzalez-Martinez, S. C.; Vietto, L.; Nervo, G.; Jorge, V.; Dowkiw, A.; Smulders, M. J.; Sanchez, L.; Vendramin, G. G.; Bastien, C.; Faivre Rampant, P.
Within the framework of the European Adaptive BREEDING for Better FORESTs project (B4EST, https://b4est.eu/), we have developed genotyping tools for Poplar, Ash, and Pine forest tree species. SNP arrays are attractive genotyping tools because of the user-friendly genotype calling system and the robust transferability among laboratories. Here we describe the development of an Axiom SNP array for Pinus pinaster (13,407 SNPs), Pinus pinea (5,671 SNPs), Poplar spp. (13,408 SNPs), and Fraxinus spp. (13,407 SNPs) based on a two-step process. We first assembled a high-density (>100,000 SNPs/species) screening array that served to test a large panel of candidate SNPs on a diversity panel involving at least 120 individual trees per species or species group. In the second step, we selected and combined the most informative SNPs to build the final 50,000 SNP 4TREE array. This approach resulted in high genotyping success rates, including for species lacking previously validated high-quality SNP resources. The 4TREE SNP array provides a valuable and transferable genomic tool to support genomic prediction, breeding, and adaptive management of forest tree species.
Oiki, S.; Abe, M.; Hirasawa, A.; Koizumi, A.; Otani, A.; Shinohara, T.; Miyazaki, Y.
Candida auris (Candidozyma auris) is an emerging multidrug-resistant fungal pathogen that poses a significant global health threat. However, the molecular mechanisms underlying its virulence remain incompletely understood. In this study, we performed in vivo transcriptome analysis using an immunosuppressed mouse gastrointestinal infection model to identify genes associated with host adaptation and virulence during infection. By comparing fungal transcriptomes obtained from colonization and dissemination sites with those from in vitro cultures, we identified genes that were consistently upregulated during infection. Among these genes, the unfolded protein response regulator HAC1 was selected as a candidate virulence-associated gene for further analysis. RT-PCR and sequencing analyses revealed that HAC1 mRNA in C. auris undergoes an unconventional splicing event of 287 bp that is enhanced under ER stress conditions. The excised region spans the annotated open reading frame boundary, suggesting that the translated region of HAC1 may require re-evaluation. Notably, a proportion of HAC1 transcripts appeared to be spliced even under non-stress conditions, indicating a detectable basal level of UPR activation. Differences in splicing dynamics were also observed among clade strains. Functional analyses demonstrated that deletion of HAC1 increased sensitivity to ER stress and heat stress. The HAC1 deletion mutant also exhibited reduced virulence in both Galleria mellonella and immunosuppressed mouse infection models, as evidenced by delayed host mortality and decreased fungal burdens, respectively. These findings indicate that HAC1 contributes to ER stress adaptation, thermotolerance, and survival in the host environment, and identify HAC1 as a virulence-associated gene in C. auris.
Tomimoto, S.; Satake, A.
Trees accumulate somatic mutations throughout their long lifespan, resulting in genetic mosaicism among branches. While recent genomic studies quantified these mutations, they were largely limited to describing static patterns of variation. In this study, we developed a mathematical model to infer the dynamic processes of somatic mutation accumulation from snapshot genomic data obtained from four tropical trees (Dipterocarpaceae), which dominate tropical rain forests in Southeast Asia. Our model focuses on genetic differences between shoot apical meristems (SAMs) at branch tips and explicitly incorporates stem cell dynamics within SAMs during shoot elongation and branching, enabling us to quantify somatic genetic drift arising from stem cell lineage replacement. By comparing model predictions with empirical data from Dipterocarpaceae trees, we estimated key parameters governing stem cell dynamics and somatic mutation rates. Our results indicate that both shoot elongation and branching involve replacement of stem cell lineages, leading to a moderate degree of somatic genetic drift. Accounting for stem cell dynamics resulted in slightly lower mutation rate estimates than previous approaches that ignored these processes. Using the estimated parameters, we further performed stochastic simulations to predict patterns of somatic mutations, including features not directly observed in the sampled trees, such as occasional deviations of somatic mutation phylogenies from physical architecture. Together, our modeling framework provides insights into how genetic mosaicism is shaped within tropical trees and reveals the stem cell dynamics underlying their long-term growth and accumulation of somatic mutations.
Highlights
- We built mathematical models to predict the genetic differences between branch tips caused by somatic mutations.
- The model considers the varying dynamics of stem cells in the shoot meristem during shoot elongation and branching.
- We compared the model predictions with empirical data from tropical trees (Dipterocarpaceae) and estimated the stem cell dynamics and mutation rate.
- Somatic mutation dynamics were shaped by somatic genetic drift arising from stem cell lineage replacement during shoot elongation and branching.
- Accounting for stem cell dynamics led to slightly smaller estimates of mutation rates compared with previous estimates that ignored the dynamics.
- Our models offer insights into how genetic variability is shaped in tropical trees and into the stem cell dynamics underlying their long-term growth.
Leal, C.; Bujanda, R.; Eichmeier, A.; Pecenka, J.; Hakalova, E.; Antonielli, L.; Compant, S.; Gramaje, D.
Cadophora luteo-olivacea is an ecologically versatile fungus associated with grapevine trunk diseases, yet the extent to which strains from different hosts and environments differ in genome composition, functional potential, and pathogenicity remains poorly understood. Here, we performed a comparative genomic analysis of 12 C. luteo-olivacea isolates recovered from grapevine, almond, apple, Crocus bulbs, soil, air, wastewater, and deep-sea sediment. Genome assemblies were highly complete (BUSCO >99%) and ranged from 46.94 to 50.70 Mbp. Pairwise average nucleotide identity (ANI) revealed a cohesive 11-strain group and one markedly divergent strain, CBS 266.93. Phylogenomic analysis based on 2,645 single-copy orthologs further showed that CBS 266.93 lies outside the main C. luteo-olivacea clade and forms a sister relationship with Cadophora malorum, indicating that its taxonomic placement warrants reassessment. Across the remaining strains, broad functional conservation was observed, including similar KOG profiles, extensive carbohydrate-active enzyme repertoires (798-849 genes per genome), and abundant biosynthetic gene clusters (26-35 per genome). Transposable element content varied substantially among strains (0.67-4.45% of genome), but this variation did not parallel overall functional profiles. All isolates colonized grapevine leaves in vitro, although lesion severity differed significantly among strains, indicating conserved plant-colonizing capacity with quantitative variation in aggressiveness. Small RNA profiling of inoculated grapevine leaves further revealed isolate-associated differences in host miRNA family expression, particularly for miR398, miR827, and miR156. Together, these results show that most C. luteo-olivacea strains share a conserved genomic framework compatible with plant colonization, while retaining lineage- and strain-level phenotypic and host-associated variation.
Lourenco, V. M.; Ogutu, J. O.; Piepho, H.-P.
Data contamination, from recording errors to extreme outliers, can compromise statistical models by biasing predictions, inflating prediction errors, and, in severe cases, destabilizing performance in high-dimensional settings. Although contamination can affect responses and covariates, we focus on response contamination and evaluate Random Forests through simulation. Using a synthetic animal-breeding dataset, we assess robust Random Forests across several contamination scenarios and validate them on plant and animal datasets. We thereby clarify the consequences of contamination for prediction, develop a robust Random Forest framework, and evaluate its performance. We examine preprocessing or data-transformation strategies, algorithmic modifications, and hybrid approaches for robustifying Random Forests. Across these approaches, data transformation emerges as the most effective strategy, delivering the strongest performance under contamination. This strategy is simple, general, and transferable to other Machine Learning methods, offering a remedy for robust genomic prediction. In real breeding data, robust Random Forests are useful when substantial contamination, phenotypic corruption, misrecording, or train-deployment mismatch is plausible and the goal is to recover a latent signal for genomic prediction and selection; ranking-based robust Random Forests are the dependable first option, whereas weighting-based Random Forests should be used only when their weighting scheme preserves rank structure and improves prediction. Robustification is not universally necessary, but it becomes important when contamination distorts the link between observed responses and the predictive target; standard Random Forests remain the default for clean data, whereas robust Random Forests should be fitted alongside them whenever contamination is plausible, with the final choice guided by data, trait, and breeding objective.
Author summary: Machine learning (ML) methods are widely used for prediction with high-dimensional, complex data, and supervised approaches such as Random Forests (RF) have proved effective for genomic prediction (GP) and selection. Yet their performance can be severely compromised by data contamination if the algorithms rely on classical data-driven procedures that are sensitive to atypical observations. Robustifying ML methods is therefore important both for improving predictive performance under contamination and for guiding their practical use in high-dimensional prediction problems. To address this need, we develop robust preprocessing, algorithm-level, and hybrid strategies for improving RF performance with contaminated data. Using simulated animal data, we show that ranking- and weighting-based robust RF provide the strongest overall compromise for genomic prediction and selection under contamination. Validation on several plant and animal breeding datasets further shows that the benefits of robustification are not universal, but depend on the dataset, trait, and breeding objective. Although motivated by RF, the framework we propose is general, practical, and readily transferable to other ML methods. It also offers a basis for deciding when robustness should complement standard RF rather than replace it outright.
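As an illustration only: the abstract above does not specify how the ranking-based transformation is implemented, so the following is a minimal sketch of one plausible reading, in which each response (phenotype) is replaced by its scaled average rank before any Random Forest is fitted. The function name `rank_transform`, the scaling by `n + 1`, and the example phenotype values are all assumptions made for this sketch, not the authors' method.

```python
from typing import List, Sequence


def rank_transform(y: Sequence[float]) -> List[float]:
    """Replace each response by its average rank, scaled into (0, 1).

    Assumed illustration of a rank-based response transformation;
    tied values share the average of the ranks they occupy.
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])  # indices sorted by value
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        # Extend the block over tied values so they share one average rank.
        while end + 1 < n and y[order[end + 1]] == y[order[start]]:
            end += 1
        avg_rank = (start + end) / 2 + 1  # 1-based average rank of the block
        for k in range(start, end + 1):
            ranks[order[k]] = avg_rank / (n + 1)
        start = end + 1
    return ranks


# Hypothetical phenotypes; the last value mimics a gross recording error.
phenotypes = [2.1, 1.8, 2.4, 2.0, 250.0]
transformed = rank_transform(phenotypes)
```

After the transformation, the contaminated value is still the largest response, but it sits only one rank step above its neighbour rather than two orders of magnitude away, so it can no longer dominate a squared-error split criterion. The transformed responses could then be passed to any Random Forest implementation in place of the raw phenotypes; predictions would be on the rank scale, which is sufficient when the goal is ranking candidates for selection.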