Microbiome
Springer Science and Business Media LLC
Preprints posted in the last 30 days, ranked by how well they match Microbiome's content profile, based on 139 papers previously published here. The average preprint has a 0.13% match score for this journal, so anything above that is already an above-average fit.
Trubl, G.; Roux, S.; Kellom, M.; Vyshenska, D.; Tomatsu, A.; Singh, K.; Kimbrel, J.; Eloe-Fadrosh, E. A.; Malmstrom, R. R.; Pett-Ridge, J.; Blazewicz, S. J.
Viruses are abundant and ecologically important in soils, yet the persistence and production dynamics of extracellular virions remain poorly understood. We applied a genome-resolved stable isotope probing viromics (SIP-viromics) approach, combining H₂¹⁸O labeling with viral metagenomics, to track virion turnover in seasonally dry grassland soils following rewetting. We identified 354 viral populations (vOTUs) using individual-sample and combined metagenome assemblies. Only 22% of vOTUs exhibited significant ¹⁸O enrichment, indicating active replication and new virion production during the 1-week incubation; the majority (78%) persisted without detectable replication, consistent with a viral seed bank. Active vOTUs accounted for 4.76-5.15% of total virions per gram of soil, with viral loads ranging from 3.15 × 10¹⁰ to 6.59 × 10¹⁰ virions per gram. Probabilistic and deterministic sensitivity analyses spanning viral DNA fraction and genome length reinforced that persistent virions represented the majority of the extracellular viral pool post-wet-up, regardless of parameter assumptions. Host predictions linked both active and persistent vOTUs primarily to Actinomycetota and Pseudomonadota, bacterial groups known to rapidly resuscitate following rewetting, suggesting that some viruses exhibit rapid turnover while others persist over longer timescales, forming a stable viral pool capable of reinitiating infections during favorable conditions. These results demonstrate that SIP-viromics can distinguish newly produced from persistent virions and reveal host-associated patterns of lytic infection and virion production. Our findings advance understanding of soil virus-host interactions and highlight the ecological role of persistent virions as a genetic reservoir contributing to microbial turnover and biogeochemical cycling following environmental disturbance.
Importance: Understanding the persistence and production dynamics of soil viruses is critical for elucidating their roles in microbial community dynamics and nutrient cycling, yet these processes have remained largely uncharacterized due to methodological limitations. By integrating stable isotope probing with viromics, this study provides a robust framework for directly distinguishing newly produced from persistent virions in situ. Unlike conventional viromics, which only catalogs viral diversity, SIP-viromics enables quantification of active viral replication and persistence under natural soil conditions. Our results demonstrate that most virions in a seasonally dry soil persisted through a rewetting event, with active replication limited to a minority of viral populations. Persistent virions were primarily linked to dominant bacterial groups, indicating that host ecophysiology and environmental stability strongly influence lytic infection. Collectively, these findings highlight viruses as long-term reservoirs of genetic material, capable of shaping microbial dynamics and ecosystem processes over time. This work establishes SIP-viromics as a powerful approach for studying virus-host interactions and their ecological significance in terrestrial environments.
Galbraith, M.; Williams, D.; Shaw, L. P.; Lipworth, S.; Stoesser, N.
Metagenomes offer the potential to characterise Escherichia coli strain-level diversity within the human gut microbiome, informing our understanding of colonisation diversity and the genetic features distinguishing infection from carriage. Among numerous reference-based tools for short-read metagenomic strain-level profiling, the best approach remains unclear. Here, we benchmarked six published tools (PanTax, PathoScope, StrainGE, Strainify, StrainR2 and StrainScan) for their ability to detect co-existing strains of E. coli and estimate their relative abundance across real and simulated metagenomes of increasing complexity with varying reference database composition. In the ZymoBIOMICS® D6331 dataset, only PanTax achieved zero error when predicting the equal abundance of five E. coli strains. In a differentially abundant four-strain mock community dataset (SRR13355226), StrainScan had the lowest mean absolute proportional error (0.89), driven by reduced sensitivity (0.5), followed by PathoScope (4.08). Across simulated metagenomes reflecting the healthy adult gut microbiome, all tools demonstrated high sensitivity (≥0.833), but specificity, precision and F1 score were selectively improved in some tools through detection thresholds to remove low abundance false positives. Outright, StrainGE achieved the highest F1 score (0.978). Predicted relative abundances of the E. coli K12-MG1655 (phylogroup A) and O157:H7 Sakai (phylogroup E) strains spiked into simulated metagenomes across varying abundance ratios were generally accurate, with PanTax and StrainR2 showing the lowest mean absolute proportional error (0.06). When truly present strains were removed from the reference database, out-of-phylogroup assignments were observed for some tools. Collectively, our results demonstrate that published metagenomic strain-level profiling tools vary in their ability to profile E. coli strains, indicating that method selection should be guided by intended application.
These findings will facilitate characterisation of E. coli strain-level diversity within short-read gut metagenomes with greater accuracy than previously possible.
Impact statement: Strain-level diversity within the human gut microbiome can be important for human health, with species such as Escherichia coli existing as both commensal and pathogenic strains. Most existing gut microbiome datasets are from short-read (i.e., Illumina) sequencing, and numerous bioinformatic tools have been developed to profile strain-level variation from these data. However, the existing literature is often difficult to navigate given that the available tools have been benchmarked in various ways and are subject to author bias. This is, to our knowledge, the first independent benchmarking of six published tools for profiling E. coli at strain-level resolution from short-read metagenomes. Using both real and simulated datasets of increasing complexity, we demonstrate substantial variation in tool performance in terms of strain detection and relative abundance estimation, highlighting that tool choice should be guided by the specific research question, as no single method performs optimally across all scenarios. This work provides an unbiased framework for tool selection and will support more accurate and reproducible E. coli strain-level analyses in gut microbiome research from short-read metagenomic data.
Data summary: The authors confirm all supporting data, code and protocols have been provided within the article or through supplementary data files. Supplementary methods, six supplementary tables and four supplementary figures are available in the online Supplementary Material. Code for simulating metagenomes using InSilicoSeq, SLURM job scripts for the simulated metagenomes dataset and R visualization and statistical analysis scripts are available within a dedicated public GitHub repository (https://github.com/mattgal11/benchmarking_short_read_strain_profilers).
The following supplementary data are available on FigShare (https://doi.org/10.6084/m9.figshare.32125474):
- Normalised per-contig relative abundances for 98 species assemblies used to construct the baseline gut microbiome profile for InSilicoSeq metagenome simulation (Normalised_relative_abundance_for_InSilicoSeq_simulated_metagenomes_gut_microbiome_profile.csv)
- ZymoBIOMICS® D6331 gut microbiome standard dataset predicted relative abundance data (Zymobiomics_D6331_raw_predicted_abundance.csv)
- SRR13355226 mock community (99% human reads; 1% E. coli reads) paired-end reads with human reads depleted (SRR13355226_depleted_R1.fastq.gz & SRR13355226_depleted_R2.fastq.gz)
- SRR13355226 mock community dataset raw predicted abundance data, with and without human read removal (SRR13355226_raw_predicted_abundance_with_and_without_human_read_removal.csv)
- Simulated metagenomes dataset raw call types and detection metric values with increasing detection thresholds (Simulated_metagenomes_raw_call_type_assingments_and_detection_thresholds.csv)
- Simulated metagenomes dataset (all references) predicted relative abundance data (Simulated_metagenomes_all_references_raw_predicted_abundances.csv)
- Simulated metagenomes dataset (all references) mapped reads for PathoScope and Strainify (all_refs_pathoscope_reads_mapped.csv & all_refs_strainify_reads_mapped.csv)
- Simulated metagenomes dataset (reduced reference database) predicted relative abundance data (Simulated_metagenomes_K12_and_Sakai_removed_from_reference_database_raw_predicted_abundance.csv)
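The detection metrics named in the abstract above (sensitivity, precision, F1) and the effect of an abundance threshold on low-abundance false positives can be illustrated with a minimal sketch. This is not the authors' benchmarking code: the function names, the 0.05 threshold, and the strain label `hypothetical_strain_X` are assumptions chosen for illustration, and the abundance values are made up.

```python
def detection_metrics(truth, predicted):
    """Return sensitivity, precision and F1 for a set of detected strains."""
    truth, predicted = set(truth), set(predicted)
    tp = len(truth & predicted)   # strains correctly detected
    fp = len(predicted - truth)   # strains reported but not present
    fn = len(truth - predicted)   # strains present but missed
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1}


def apply_threshold(abundances, min_abundance):
    """Keep only strains whose predicted relative abundance meets the threshold."""
    return {s for s, a in abundances.items() if a >= min_abundance}


if __name__ == "__main__":
    # Illustrative values only: two strains truly present, one spurious low-abundance call.
    truth = {"K12-MG1655", "Sakai"}
    predicted = {"K12-MG1655": 0.48, "Sakai": 0.50, "hypothetical_strain_X": 0.02}
    print(detection_metrics(truth, predicted.keys()))
    print(detection_metrics(truth, apply_threshold(predicted, 0.05)))
```

In this toy case the threshold removes the spurious call, so precision and F1 rise while sensitivity is unchanged; a threshold set too high would instead start removing true strains and lower sensitivity.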
Galaras, A.; Chasapi, I. N.; Aplakidou, E.; Chasapi, M. N.; Lamari, E.; Diplari, S.; Georgakopoulos-Soares, I.; Karatzas, E.; Baltoumas, F. A.; Kyrpides, N.; Pavlopoulos, G.
Wastewater surveillance has emerged as a critical tool for global epidemiology, yet the functional diversity of wastewater microbiomes remains poorly characterized at the protein level. Here, we present WasteFams, the first comprehensive database dedicated to the systematic exploration of protein families in wastewater metagenomic and metatranscriptomic studies worldwide. Integrating data from 580 metagenomes, 132 metatranscriptomes, and 1,709 reference genomes, WasteFams catalogs 3,887 non-redundant protein families (containing ≥100 members) derived from over 105 million predicted proteins. Each protein family is enriched with multi-layered annotations, including AlphaFold3 structural predictions, taxonomic classifications, and biome-specific metadata. To further expand their functional annotation, we integrated deep genomic context analysis to link protein families to Mobile Genetic Elements (MGEs), Biosynthetic Gene Clusters (BGCs), Antibiotic Resistance Genes (ARGs), and CRISPR elements. Accessible through the EnvoFams portal, WasteFams provides a user-friendly interface featuring advanced search capabilities, sequence and structural similarity tools, and interactive visualization modules. As global initiatives increasingly leverage wastewater for public health and environmental insights, WasteFams can serve as a critical resource for discovering novel microbial functions, monitoring resistance mechanisms, and exploring the biotechnological potential of secondary metabolites within wastewater-engineered ecosystems.
He, Y.; Du, Y.; Nguyen, L.; Wang, Y.
The prevailing taxonomic profiling methods for environmental samples rely heavily on PCR amplification of SSU ribosomal RNA (rRNA) genes and on genome-based reference databases. Extracting SSU rRNA fragments directly from Illumina metagenomic sequencing data is PCR-independent, but recognizing those fragments is technically challenging. Here we present Mitag4taxa, a computational pipeline designed for taxonomic profiling of microbial communities from metagenomic Illumina sequencing reads containing rRNA tags (mitag). Hidden Markov Models (HMMs) were built for SSU rRNA genes, for the V4 region of 16S rRNA genes, and for the V9 region of 18S rRNA genes, using the representative sequences of different families and the corresponding hypervariable regions in the SILVA database. The pipeline identifies and extracts 16S and 18S rRNA gene fragments, along with their quality scores, from metagenomic or metatranscriptomic datasets using an HMM search against these models. The hypervariable regions, including the V4 region of 16S rRNA and the V9 region of 18S rRNA genes, can then be scanned and recruited for taxonomic classification and biodiversity estimation. To assess its reliability, the performance of Mitag4taxa was evaluated using both real and simulated datasets. In human gut metagenomic assessments, taxonomic profiles derived from Mitag4taxa showed high consistency with those based on conventional 16S rRNA gene amplicons, identifying dominant families such as Bacteroidaceae and Prevotellaceae with similar relative abundances. Statistical analyses confirmed highly significant positive correlations between Mitag4taxa and amplicon-based community structures. The 18S V9 module was further validated using shotgun metagenomic data from deep-sea sediment cores, successfully recovering key eukaryotic taxa such as Collodaria and Leotiomycetes.
Furthermore, benchmarking against the RiboTagger software using CAMI marine simulated datasets revealed that Mitag4taxa achieved a higher average F1 score and lower error metrics. Overall, Mitag4taxa provides a complementary rRNA gene amplicon- and genome-independent strategy for microbial community profiling, enabling improved detection of both prokaryotic and eukaryotic taxa from metagenomic and metatranscriptomic sequencing data.
Cumbo, F.; Felici, G.; Blankenberg, D.; Valeriani, F.; Romano Spica, V.; Santoni, D.
Background: The exponential growth of public metagenomic datasets offers an unprecedented opportunity to explore microbial diversity. However, analyzing this vast amount of data presents significant computational challenges. While shotgun metagenomics provides deep functional and taxonomic resolution, its high cost still limits its application. On the other hand, 16S rRNA gene sequencing remains a cost-effective and widely used alternative, but tools are needed to maximize its discovery potential. Traditional clustering is not scalable, obstructing the creation of a comprehensive and continuously updated catalog of microbial life from 16S data. Methods: We developed a reproducible and scalable Snakemake pipeline for the incremental clustering of 16S rRNA amplicons. The workflow begins by constructing a reference database from bacterial and archaeal genomes. It then processes 16S rRNA samples sequentially. For each new sample, sequences are first mapped against the existing cluster centroids. Sequences that match known centroids are assigned accordingly, while unmapped sequences are clustered independently to form novel operational taxonomic units (OTUs). These new centroids are then merged with the existing database, allowing it to grow dynamically without the need for computationally prohibitive all-at-once re-clustering. Results: Our pipeline enables the efficient and continuous expansion of a 16S rRNA cluster database. By processing a large corpus of public 16S rRNA samples, we generated a comprehensive atlas of tens of thousands of OTUs. A significant fraction of these clusters, particularly at the genus and family levels, were classified as unknown. Conclusions: This work provides a powerful, open-source tool for large-scale analysis of 16S rRNA samples. The incremental clustering strategy overcomes the scalability limitations of traditional methods, allowing researchers to leverage public data and discover novel microbes in their own microbiome samples.
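The incremental step described in the Methods (map to existing centroids, cluster what is left, merge the new centroids) can be sketched in a few lines. This is a toy illustration, not the authors' Snakemake pipeline: `similarity` uses Python's `difflib.SequenceMatcher` as a stand-in for an alignment-based identity, the greedy centroid clustering and the 0.97 default threshold are assumptions, and all function names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Stand-in similarity in [0, 1]; a real pipeline would use an alignment-based identity."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def cluster_greedy(seqs, threshold):
    """Greedy centroid clustering: a sequence matching no centroid founds a new cluster."""
    centroids = []
    for s in seqs:
        if not any(similarity(s, c) >= threshold for c in centroids):
            centroids.append(s)
    return centroids


def add_sample(centroids, sample, threshold=0.97):
    """Map a sample to existing centroids, cluster the leftovers, and return the grown database."""
    assignments, unmapped = {}, []
    for s in sample:
        best = max(centroids, key=lambda c: similarity(s, c), default=None)
        if best is not None and similarity(s, best) >= threshold:
            assignments[s] = best          # known centroid
        else:
            unmapped.append(s)             # candidate for a novel OTU
    novel = cluster_greedy(unmapped, threshold)
    for s in unmapped:
        assignments[s] = max(novel, key=lambda c: similarity(s, c))
    # Existing centroids are never re-clustered; only novel ones are appended.
    return centroids + novel, assignments
```

Because each sample is compared only against the current centroids, the cost per sample depends on the database size rather than on the total number of sequences seen so far, which is the scalability argument the abstract makes.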
Ademola-Popoola, I. J.; Grogen, K. E.; Abdul-Aziz, M. A.; Ta, C. K.; Tang, K.; Blekhman, R.; Barreiro, L. S.; Perry, G. H.; Weyrich, L. S.
Industrialization has been identified as the single biggest factor driving global microbiome diversity. While many studies examining gut microbiomes attribute these shifts to dietary increases in fat and reductions in protein, oral microbiome responses to industrialization remain debated. The oral microbiome is more resilient due to long-standing coevolution with host tissues and biofilm stability. However, limited geographic and historical representation has constrained our understanding of how these transitions unfolded globally in oral microbiomes. Here, we investigate oral microbiome variation in Batwa rainforest hunter-gatherers and neighboring Bakiga subsistence farmers from southwestern Uganda, comparing them with publicly available data from Tanzanian, Venezuelan, and industrialized populations from North America, Europe, and Australia. Using 16S rRNA gene sequencing, we characterized salivary microbiota and evaluated differences in local and global diversity, composition, and differential abundance. Ugandan populations showed significant compositional differences but similar levels of diversity, suggesting that shared environments and dietary overlap may shape microbial assemblages despite distinct cultural histories. Globally, strong continental and industrialization effects were observed in the oral microbiome, with all industrial populations clustering separately from people living in other locations. African populations also clustered separately from non-African groups. Oral microbiome diversity was highest in Ugandan individuals and lowest in industrialized populations, mirroring patterns previously observed in the gut microbiome. Together, these findings demonstrate that both geography and subsistence strategy structure global oral microbiome variation.
They also clarify the position that oral microbial communities record biocultural transitions and highlight the need to better understand the industrial mechanisms that shape microbial diversity in the oral cavity.
Ho, J. Y.; Hu, D.; Kang, D. Y.; Sim, C. B. W.; Wijaya, W.; Boucher, Y. F.
Coastal marine environments are increasingly recognised as reservoirs of antimicrobial-resistant (AMR) pathogens. However, it remains challenging to recover high-quality genomes of clinically relevant bacteria present at low abundance from complex natural systems. Here, we applied culture-enriched metagenomics to systematically track the diversity and dynamics of major AMR pathogens within the coastal marine system of St. John's Island, Singapore, as a model ecosystem for pathogen surveillance. Selective media-based enrichment recovered 773 metagenome-assembled genomes (MAGs) from 92 multi-matrix environmental samples, including coastal water, sediment, and seaweed, capturing diverse AMR ESKAPE and Vibrio species. Distinct bacterial signatures and dispersal patterns were observed in each niche; for example, microbes that signal human impact were detected at the beach, while fish-associated pathogens were present at the aquaculture facility outlet. Notably, the high-quality MAGs enabled subspecies-level identification and supported AMR gene detection across six distinct coastal habitats. Detailed differences in the recovery of specific pathogens across enrichment media were also identified, demonstrating the method's efficacy in finding media suitable for surveillance of specific organisms, such as deciding between liquid or solid formulations. MAGs recovered from culture-enriched metagenomics were highly similar to genomes obtained from pure isolates, as demonstrated for Klebsiella pneumoniae. The preserved culture-enriched stocks were capable of recovering organisms of interest when individual isolates were required for further study. Overall, our findings highlight the utility of culture-enriched metagenomics as a cost-effective, sensitive approach to uncovering the genomic landscape of pathogens with environmental reservoirs, with implications for AMR surveillance and ecological risk assessment.
Turner, A. A. B.; Stahn, M.; Millard, A.; Sauvageau, D.; Stein, L. Y.
Agriculture is a major source of anthropogenic greenhouse-gas emissions, being the largest source of nitrous oxide (N2O), an extremely potent greenhouse gas and ozone-depleting agent. Soil N2O emissions are largely driven by microbial nitrification, in which ammonia-oxidizing microorganisms catalyze the rate-limiting oxidation of ammonia to nitrite. Nitrification not only mediates N2O fluxes but also reduces fertilization efficiency and contributes to eutrophication through nitrate leaching. Bacteriophage (phage)-based control of microbial communities is rapidly garnering interest in a number of fields; however, phages infecting ammonia-oxidizers are largely uncharacterized, with only one lytic phage having been described, limiting the potential for phage-mediated nitrification inhibition. Here, we present the largest set of phages infecting ammonia-oxidizing bacteria (AOB) to date: 45 dsDNA phages identified from urban wastewater, infecting four AOB species, with 16 demonstrating cross-genus host ranges and capable of eliminating nitrification activity in liquid cultures. Phylogenetic and taxonomic analyses revealed six proposed families of Caudoviricetes and numerous monophyletic clades, likely representing higher-level lineages. Structure-guided genome annotation revealed these phages to carry diverse and seldom-seen auxiliary metabolic genes, ranging from a complete ABC transporter cassette to a large antimicrobial resistance gene cluster. These results unveil the previously unrecognized diversity of AOB phages and their potential to alter host physiology. Our data demonstrate a broad taxonomic and functional repertoire of cultured AOB phages, greatly expanding the panel of known AOB phages and suggesting that viruses play a more significant and complex role in nitrification than previously understood. Moreover, we outline an effective methodological framework for isolating AOB phages from environmental samples.
These results will help reframe our understanding of environmental nitrification and enable intensified selection and use of phages for its control.
Szeto, C. Y. Y.; Kwan, H. S.
Dietary and lifestyle microbiome interventions often produce mild but heterogeneous remodeling rather than uniform community shifts. In this setting, scalar diversity or group-level summaries can appear weak or inconclusive even when participants move in organized but magnitude-limited directions, or move substantially in divergent directions. We developed a response-geometry framework that jointly describes baseline-referenced response magnitude and cross-participant directional coherence within a compositional feature space. The framework complements diversity, ordination, trajectory, PERMANOVA, PERMDISP, and beta-diversity analyses by asking whether paired responses differ in size, shared direction, or both. Methods: A response vector for each participant was defined as the follow-up minus baseline profile after adding a 0.5 pseudocount and applying centered log-ratio transformation in Aitchison-based response space. Response magnitude was the Euclidean length of this vector. Directional coherence was quantified as cosine alignment between participant-level response vectors and the mean group response vector, with sign-flip permutations as a paired-structure-preserving diagnostic null. We evaluated the framework using workflow-sensitive diversity comparisons, 198,000 logistic-normal compositional simulations with 100 or 500 features and small-to-large shared-direction effects, public-data-derived implementation stress tests, a synbiotic and dietary-intervention cohort, and a fiber/fermented-food application in 16S rRNA gene amplicon and shotgun-derived CAZyme gene-family feature spaces. A beta research-preview repository accompanying the preprint is available at https://github.com/carolyyszeto/microbiome-response-interpreter-beta as v6.5-beta, including documented scripts, a toy dataset, environment notes, output-interpretation guidance, and exploratory implementation utilities.
Results: Workflow comparisons showed that richness-sensitive differences were concentrated in rare-tail and low-abundance structure, informing the analytical feature-space context for response interpretation. In simulations, null and magnitude-only random-direction scenarios showed near-null detection rates of 0.061 and 0.062, close to nominal alpha = 0.05, whereas shared-direction scenarios showed increasing coherence with stronger effects and larger sample sizes. Mixed-responder and opposing-subgroup scenarios attenuated or cancelled pooled coherence, supporting separation between response magnitude and directional organization. The synbiotic and dietary-intervention cohort showed modest, heterogeneous displacement with limited within-arm coherence, with permutation p values from 0.575 to 0.653. In the fiber/fermented-food application, fermented-food exposure showed stronger 16S response organization than the baseline-period reference, while CAZyme estimates used non-identical sampling endpoints and remained feature-space-specific. Conclusions: This response-geometry framework helps distinguish paired microbiome movement size from shared response orientation. It is intended as an interpretively cautious response-organization descriptor for mild, heterogeneous intervention settings, not as a replacement for existing multivariate methods. Its interpretation depends on sample size, effect structure, endpoint alignment, zero handling, group-direction stability, and feature-space definition. The framework does not convert weak, null, endpoint-limited, or sensitivity-dependent findings into efficacy, predictive, or mechanistic claims.
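The quantities defined in the Methods of this abstract (CLR response vector with a 0.5 pseudocount, Euclidean magnitude, cosine alignment to the mean group response, sign-flip permutations) can be written down as a short NumPy sketch. It is not taken from the linked repository: the function names, the use of the mean cosine as the test statistic, the inclusion of each participant in the group mean, and the one-sided p value are all assumptions made for illustration.

```python
import numpy as np


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of a participants x features count matrix."""
    logx = np.log(np.asarray(counts, dtype=float) + pseudocount)
    return logx - logx.mean(axis=1, keepdims=True)


def response_geometry(baseline, followup, n_perm=999, seed=0):
    """Per-participant magnitude, cosine alignment to the mean response, and a sign-flip p value."""
    resp = clr(followup) - clr(baseline)           # paired response vectors
    magnitude = np.linalg.norm(resp, axis=1)       # Euclidean length in CLR space

    def mean_cosine(r):
        g = r.mean(axis=0)                         # mean group response vector
        denom = np.linalg.norm(r, axis=1) * np.linalg.norm(g)
        cos = np.divide(r @ g, denom, out=np.zeros(len(r)), where=denom > 0)
        return cos, cos.mean()

    cosines, observed = mean_cosine(resp)
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Flip the sign of whole participants, which keeps the pairing intact.
        signs = rng.choice([-1.0, 1.0], size=(len(resp), 1))
        null[i] = mean_cosine(resp * signs)[1]
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return magnitude, cosines, observed, p_value
```

Keeping `magnitude` and `cosines` as separate outputs mirrors the distinction the abstract draws: participants can move a long way in unrelated directions (large magnitude, low coherence) or a short way in a shared direction (small magnitude, high coherence).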
Ossowicki, A.; Griffioen, T.; Mileti, E.; Attanasi, V.; Hames, C.; Carrion, V. J.; Oyserman, B.
Scalable soil microbiome monitoring requires sampling methods that are reproducible across operators, field sites, and logistical constraints. Here, we evaluated three key methodological choices that commonly limit comparability in agricultural rhizosphere studies: how the rhizosphere sampling unit is operationally defined, sample pooling strategies, and preservation methods. We introduce the RhizoCore, a standardized root-zone soil core defined by core diameter, depth, position relative to the plant, and subsample volume, as a practical proxy for traditional rhizosphere sampling. The RhizoCore method captured more than 92% of the sequencing depth found in traditional rhizosphere samples, with differences limited predominantly to low-abundance taxa. Preservation methods significantly affected bacterial communities, while sample pooling showed greater impact on fungal diversity and substantially reduced within-group variability across all treatments. Despite these effects, differential abundance analysis revealed minimal compositional changes, with only a small fraction of microbial taxa significantly affected by either pooling or preservation method. Our findings demonstrate that the RhizoCore method provides a reproducible and scalable approach for rhizosphere sampling that balances scientific rigor with practical field implementation, offering a framework for large-scale soil microbiome monitoring programs and for improving comparability among agricultural microbiome studies across diverse environmental conditions.
Luecking, D.; Manzano-Marin, A.; Willemsen, A.
Viruses of the phylum Nucleocytoviricota are paradigm-shifting entities due to their exceptionally large genomes and complex gene repertoires, which blur the lines between viral and cellular life. Previous research has leveraged computational approaches to map their extensive diversity, while experimental work has started to elucidate the intricate networks they form with hosts, bacterial and other symbionts, co-infecting virophages and other mobile genetic elements. Here, we analyzed deeply sequenced metagenomes sampled from wastewater treatment plants in Denmark, an environment with rapid abiotic changes and known to be a hotbed of dense microbial communities. We discovered 61 novel nucleocytoviruses, 15 virophages and 14 polinton-like viruses. By integrating them with microbial contigs into a multilayered interaction network, we explore the role of these entities on a mesocosm scale. We demonstrate the centrality of nucleocytoviruses, positioning them as important players shaping microbial community structure and evolution in wastewater treatment plants.
Pellegrinetti, T. A.; Molligan, J.; Almeida Santos, A.; Plante, N.; Jacques, J.; Gregoire-Taillefer, A.; Canale, M. C.; Rodrigues Duffeck, M.; Faris, A. M.; Olmedo-Velarde, A.; Valmorbida, I.; Perez-Lopez, E.
Background: Leafhoppers are among the most important insect vectors of plant pathogens worldwide and depend on microbial symbionts to exploit nutrient-poor phloem diets. However, most studies of leafhopper-associated microbiota have focused on a limited number of taxa or marker-gene surveys, leaving the genomic diversity, ecological organization, and functional potential of these microbial communities poorly understood. Here, we generated the Global Leafhopper Microbiome Catalog by integrating genome-resolved metagenomics from 171 leafhopper species across 11 subfamilies and 13 countries, including the first microbiomes characterized from Arctic leafhoppers. Results: De novo assembly and genome reconstruction generated 337 high-quality non-redundant microbial genomes and 18.6 million non-redundant genes, substantially expanding the known microbial diversity associated with Cicadellidae, including several previously undescribed bacterial lineages. Comparative analyses revealed a recurrent modular microbiome architecture composed of: (i) a conserved core of obligate nutritional symbionts, dominated by Candidatus Karelsulcia and Candidatus Nasuia; (ii) a heterogeneous layer of secondary symbionts, including Wolbachia, Arsenophonus, Rickettsia, and Diplorickettsia; and (iii) a dynamic pool of environmentally acquired bacteria. While obligate symbionts remained highly conserved across divergent hosts, secondary and environmental taxa varied substantially among species and regions, suggesting repeated acquisition shaped by ecological filtering rather than host phylogeny alone. Comparative analyses between the specialist corn leafhopper Dalbulus maidis and the more polyphagous aster leafhopper Macrosteles quadrilineatus further showed that closely related vectors can maintain conserved ancestral symbionts while harboring markedly distinct accessory microbiomes.
Arctic populations contained unique microbial assemblages enriched in functions associated with cold tolerance, oxidative stress, and reproductive manipulation. In addition, we identified numerous plant-associated bacteria, including phytoplasmas, spiroplasmas, Pantoea, and Erwinia, alongside taxa with predicted nutritional and plant growth-promoting functions. Conclusions: Our findings reveal that leafhopper microbiomes are structured through the interaction of ancient obligate symbioses and flexible environmentally responsive microbial layers. This work establishes a genome-resolved framework for understanding microbiome evolution in insect vectors and highlights the potential role of microbial community structure in host adaptation, pathogen ecology, and sustainable pest management.
Wu, Q.; Ning, Z.; Zhang, A.; Cheng, K.; Figeys, D.
Taxonomic interpretation of metaproteomic peptides remains difficult because many peptide sequences are present in proteins from different organisms, reducing taxonomic specificity. Current peptide-centric workflows can report taxonomic summaries or taxon-level confidence scores, but they do not provide formal statistical evidence that a taxon is present in the microbiome. Here we present MetaUmbra, a tool that derives genome-level statistical significance values from identified peptides. MetaUmbra builds theoretical peptide lists by in silico digestion of the taxon-specific proteins and matches observed peptides against these references. It then combines a conservative significance estimate from unique peptides with a Monte Carlo-based p-value for shared peptide evidence estimated under an empirical null model. In the defined community benchmark SIHUMIx, MetaUmbra identified the expected genomes without introducing false-positive genomes after embedding the SIHUMIx genomes in a large gut reference background. In the single-strain benchmark Mix24X, all expected genomes were identified with the best statistical significances even after near-neighbor and full background expansion. In a hamster gut genome panel, MetaUmbra further preserved an interpretable ranking of candidate genomes in a dense real-data setting. Together, these results show that MetaUmbra can statistically identify the presence of specific microbes in a complex microbiome while maintaining low false-positive calls. MetaUmbra therefore provides a practical framework for converting peptide evidence into genome-level statistical inference in metaproteomics.
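The first two steps named in this abstract, in silico digestion and the split between unique and shared peptide evidence, can be illustrated with a small sketch. It is not MetaUmbra's code and does not attempt its statistical model (the unique-peptide significance estimate and the Monte Carlo p-value are left out). The trypsin rule (cleave after K or R unless followed by P), the absence of missed cleavages, the minimum peptide length, and every function name are assumptions; the abstract does not state which protease or settings are used.

```python
import re
from collections import defaultdict


def tryptic_digest(protein, min_len=6):
    """Cleave after K or R unless followed by P (assumed trypsin rule, no missed cleavages)."""
    # Zero-width split points require Python 3.7 or later.
    peptides = re.split(r"(?<=[KR])(?!P)", protein)
    return {p for p in peptides if len(p) >= min_len}


def peptide_index(genomes):
    """Map each theoretical peptide to the set of genomes whose proteins produce it."""
    index = defaultdict(set)
    for genome, proteins in genomes.items():
        for protein in proteins:
            for pep in tryptic_digest(protein):
                index[pep].add(genome)
    return index


def classify_observed(observed, index):
    """Split observed peptides into genome-unique and shared evidence per genome."""
    unique, shared = defaultdict(set), defaultdict(set)
    for pep in observed:
        owners = index.get(pep, set())      # unmatched peptides contribute nothing
        target = unique if len(owners) == 1 else shared
        for genome in owners:
            target[genome].add(pep)
    return unique, shared
```

A peptide found in only one genome's theoretical list is direct evidence for that genome, while a shared peptide is ambiguous on its own, which is why the abstract describes treating the two kinds of evidence with different statistical procedures.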
Inda-Diaz, J. S.; Adegoke, F.; Löber, U.; Jarquin-Diaz, V. H.; Duan, Y.; Bengtsson-Palme, J.; Ugarcina Perovic, S.; Coelho, L. P.
Identifying antibiotic resistance genes (ARGs) from metagenomic data is critical for studying antimicrobial resistance across microbial communities and pathogens. However, there is no standardized methodology for ARG annotation. Here, we compare ten commonly used ARG detection pipelines by analysing over 270 million prokaryotic genes from the Global Microbial Gene Catalogue across 13 distinct habitats. We observed up to a 45-fold difference in the number of reported ARGs, with a mean Jaccard index of only 16% between pipelines. Pipeline selection profoundly impacted downstream biological interpretations, with drastic changes to estimates of ARG relative abundance and richness, to the characterization of pan- and core-resistomes, and to the class-level composition of the inferred resistome. ARG detection pipelines make different, defensible trade-offs, and no single approach should be treated as authoritative. Therefore, users should justify and communicate choices carefully, as our analyses show that, taken uncritically, the same data can support conflicting biological and ecological interpretations.
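The Jaccard index reported above is the size of the intersection of two gene sets divided by the size of their union. The sketch below computes it for every pair of pipelines and averages the result. The pipeline names and gene identifiers are placeholders, and treating two empty sets as identical (index 1.0) is a convention chosen here, not something stated in the abstract.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A & B| / |A | B|; two empty sets are treated as identical."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(calls):
    """Average Jaccard index over all unordered pairs of pipelines.

    `calls` maps a pipeline name to the set of gene IDs it reports as ARGs.
    """
    pairs = list(combinations(sorted(calls), 2))
    if not pairs:
        raise ValueError("need at least two pipelines")
    return sum(jaccard(calls[x], calls[y]) for x, y in pairs) / len(pairs)


if __name__ == "__main__":
    # Placeholder data, not taken from the study.
    calls = {
        "pipeline_A": {"g1", "g2", "g3"},
        "pipeline_B": {"g2", "g3", "g4"},
        "pipeline_C": {"g3"},
    }
    print(round(mean_pairwise_jaccard(calls), 3))
```

A low mean index means that most genes called by one pipeline are not called by another, which is why downstream richness and abundance estimates diverge.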
Mathlouthi, N. E. H.; Gdoura-Ben Amor, M.; Belguith, I.; Derouich, R.; Ammar Keskes, L.; Gdoura, R.
Microbiome research has expanded globally, yet the Middle East and North Africa (MENA) region remains severely under-represented in international sequencing repositories. Here we present the MENA Microbiome Database, the first systematically harmonized catalog of publicly available metagenomic sequencing data from 24 MENA countries, consolidating 60,126 runs across 51,365 biological samples and 2,373 BioProjects deposited between 2008 and 2026. Records were retrieved from ENA, NCBI SRA, and PubMed, enriched with BioSample and study-level metadata, and classified into microbiome subtypes using a 73-rule keyword-based harmonization framework. Amplicon sequencing accounted for 80.6% of runs, with Illumina platforms dominating at 92.7%. Geographic coverage is highly skewed: Saudi Arabia and Turkey together contribute over half of all records, while five countries (Libya, Syria, Palestine, Yemen, and South Sudan) remain critically under-sampled. Metadata completeness averaged 73.97% under a MIxS-MIMS proxy framework, with geographic coordinates available for fewer than 15% of runs. Ecological analyses revealed that country-level factors significantly structure environmental, animal-associated, and plant-associated microbiomes, but not human-associated microbiomes. Spatial autocorrelation confirmed non-random clustering of sampling effort around Red Sea coastal and eastern Mediterranean hotspots. This open, reproducible resource, comprising harmonized data files, analysis code, and an interactive browsing platform, establishes a foundational infrastructure for regional microbiome science and equitable global comparative studies.
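A keyword-based harmonization step of the kind mentioned above can be sketched as an ordered rule list in which the first matching rule assigns the label. The four rules, their keywords, and the labels below are hypothetical stand-ins; the database's actual 73 rules are not given in the abstract.

```python
import re

# Hypothetical rules for illustration only; order matters (first match wins).
RULES = [
    ("human-associated", ("human gut", "stool", "saliva")),
    ("plant-associated", ("rhizosphere", "root", "leaf")),
    ("animal-associated", ("camel", "rumen", "poultry")),
    ("environmental", ("soil", "seawater", "sediment")),
]


def classify(text, rules=RULES, default="unclassified"):
    """Return the label of the first rule with a whole-word keyword match."""
    normalized = re.sub(r"\s+", " ", text.lower())
    for label, keywords in rules:
        for keyword in keywords:
            if re.search(r"\b" + re.escape(keyword) + r"\b", normalized):
                return label
    return default


if __name__ == "__main__":
    print(classify("Date palm rhizosphere soil, Saudi Arabia"))
```

Because the first match wins, a record mentioning both "rhizosphere" and "soil" is labeled plant-associated here; rule order is therefore part of the classification logic and should be documented alongside the keywords.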
Son, Y.; Craft, E. J.; Pineros, M. A.; Mathieson, O. L.; Awan, A.; Blakeley-Ruiz, J. A.; Kleiner, M.; Kao-Kniffin, J.
Urban agriculture increasingly relies on compost-based substrates for sustainable production, yet we lack a clear characterization of how these systems respond to biological amendments aimed at introducing beneficial microbiota. Here we investigated how developmental stage and co-inoculation with arbuscular mycorrhizal fungi (AMF) and phosphate-solubilizing bacteria (PSB) reshape rhizosphere microbial function in Solanum lycopersicum grown in compost-based urban farm substrate. Using plant physiology assays, 16S rRNA amplicon sequencing, and metagenome-informed metaproteomics, we characterized tomato physiological responses and rhizosphere microbial activity during flowering and fruiting across control, single AMF, single PSB, and AMF and PSB co-inoculation treatments. Co-inoculation synergistically enriched beneficial taxa, improved fruit nutrient accumulation, elevated nutrient transporter and quorum sensing protein production, and drove stress-driven dormancy in competitively excluded taxa, with responses varying between developmental stages. Our findings establish metagenome-informed metaproteomics as essential for resolving stage-specific rhizosphere microbiome functional responses to tomato development and AMF and PSB co-inoculation.
Rahlff, J.; Lang-Yona, N.; Lahav, E.; Westmeijer, G.; Das, R.; Buder, K.; Bueschel, R.; Micheel, J.; Eckhardt, S.; Evangeliou, N.; Groot Zwaaftink, C.; van Pinxteren, M.
Background: Cloud water harbors diverse microbial communities despite its extreme oligotrophic conditions. However, the ecological and evolutionary dynamics of viruses in these transient atmospheric habitats remain poorly understood. Clouds have traditionally been regarded primarily as passive carriers of microorganisms rather than as active ecological environments supporting microbial interactions. In this study, cloud water was sampled at Mount Verde, Cape Verde Islands (744 m a.s.l.). We performed metagenomic analyses of iron-flocculated cloud water alongside genome analyses of a bacterial isolate and metagenome-assembled genomes using established bioinformatic approaches. Viral diversity, virus-host interactions, metabolic functions, genetic adaptations, and viral population dynamics across cloud events were investigated. In addition, UV-B resistance experiments were conducted for a novel cloud-water isolate. Results: We isolated 24 cloud water bacteria, including four novel species lineages, and recovered 62 high-quality metagenome-assembled genomes, including 10 novel species lineages. We identified 458 viral operational taxonomic units and 237 virus-host linkages across diverse prokaryotic hosts, revealing active viral predation across diverse bacterial taxa. In addition, CRISPR spacer matches from isolates of novel bacterial lineages such as Deinococcus nubigenus MPC36 were found. Viruses carried genes involved in host adaptation to environmental stressors, including cold-shock response, UV radiation resistance, and osmotic stress. In addition, viral populations exhibited SNP-level microdiversity and shifts in single-nucleotide variant composition across temporally proximate cloud events, indicating rapid population turnover.
Experimental characterization of the cloud isolate Curtobacterium nubigenum MPC39 further revealed pronounced resistance to UV-B radiation and the presence of an inducible prophage, Curtobacterium phage vB_CnuS_Cirrus1, assigned to the new viral family Nebulaviridae, which could be validated by transmission electron microscopy. Reconstructed genomes from cloud-associated bacteria encoded carbon monoxide dehydrogenase genes and UV resistance genes, suggesting trace gas metabolism and enhanced UV protection as survival strategies in oligotrophic cloud droplets. In silico replication rates estimated using iRep were consistent with active bacterial replication at the time of sampling. Conclusions: Together, these findings demonstrate that clouds are not merely passive carriers of microorganisms, but dynamic atmospheric ecosystems in which virus-host interactions shape microbial diversity and contribute to microbial turnover, atmospheric dispersal, and cloud-water biogeochemistry.
Warren, F.; Petropoulou, K.; Harris, H.; Barbas-Bernardos, C.; Kasapi, M.; Garcia, A.; Holmes, E.; Domoney, C.; Wist, J.; Garcia-Perez, I.; Frost, G.
The human duodenum harbours a complex, dynamic microbial community that is challenging to study due to inaccessibility, particularly postprandially when nutrient-rich chyme and fluctuating metabolites create unique microbial niches. We used naso-duodenal intubation to longitudinally sample duodenal luminal contents following pea-based meals of differing food structure, alongside parallel blood collection. Shotgun metagenomic sequencing, comprehensive metabolomic profiling and gut hormone measurements were combined to explore microbe-metabolite-hormone interactions. Food structure significantly affected postprandial bacterial composition, with saccharolytic oral taxa increasing after meals with intact structure. Alpha diversity was influenced by structure type (P = 0.025), with whole pea seeds promoting greater diversity than pea flour. Network analysis revealed complex interactions between the duodenal microbiome, luminal metabolites and gut hormones, with most microbial associations linked to glucose-dependent insulinotropic polypeptide (GIP) rather than glucagon-like peptide-1 (GLP-1). Metabolic profiling showed meal-dependent changes in amino acid metabolism, including shifts in D/L amino acid ratios over time consistent with microbial metabolism. The duodenal microbiome showed close phylogenetic relationships with the oral microbiome, with composition influenced by food structuring and swallowing. These findings reveal dynamic microbe-metabolite interplay in the human duodenum during digestion and its relationship to gut hormone responses.
Procter, M.; Kundu, B.; Sudalaimuthuasari, N.; AlMaskari, R. S.; Shah, I.; Alnuaimi, S.; Husain, F.; Aldhaheri, K.; Hazzouri, K. M.; Amiri, K. M.
Aridification and climate stress threaten global plant productivity, but the survival strategies of desert plants remain only partly understood. In this study, we examined how the microbiome of Citrullus colocynthis, a hardy desert cucurbit valued for its ecological and medicinal benefits, may influence the plant's ability to withstand harsh conditions. Using 16S rRNA amplicon sequencing, shotgun metagenomics, and culture-based methods, we analyzed microbiome changes across two regions of the UAE during the rainy and dry seasons. Leaf and root bacterial communities showed clear seasonal shifts, with greater richness in winter and higher evenness in summer, while soil microbiomes remained stable. Dominant bacterial groups, Actinomycetota and Pseudomonadota, varied seasonally, indicating trade-offs between stress tolerance and metabolic flexibility. Fungal communities (mainly Ascomycota and Basidiomycota) were stable at the phylum level but reorganized by order between seasons; archaeal populations showed little change. Among 24 cultured bacterial isolates, including three potential new species, we identified multiple stress tolerance and plant growth-promoting traits. Genomic data revealed biosynthetic clusters for antimicrobial and stress-protective functions, as well as adaptation genes in Pseudomonas orientalis. These results demonstrate that the dynamic, functionally diverse microbiome of C. colocynthis enhances its resilience to desert stress, offering potential for arid-land agriculture.
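Richness and evenness, the two quantities contrasted between seasons above, can move in opposite directions, as a small Python sketch shows. The abstract does not say which indices were used; observed richness, Shannon entropy, and Pielou's evenness are assumed here as common choices, and the count vectors are invented.

```python
import math


def richness(counts):
    """Observed richness: number of taxa with a non-zero count."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon entropy H' (natural log) of a vector of taxon counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou's J' = H' / ln(S); defined as 0.0 here when fewer than two taxa."""
    s = richness(counts)
    if s < 2:
        return 0.0
    return shannon(counts) / math.log(s)


if __name__ == "__main__":
    # Invented counts: more taxa but one dominant, versus fewer taxa evenly spread.
    rich_but_uneven = [970, 10, 10, 5, 5]
    poorer_but_even = [250, 250, 250, 250]
    for counts in (rich_but_uneven, poorer_but_even):
        print(richness(counts), round(pielou_evenness(counts), 3))
```

The first vector has higher richness but lower evenness than the second, which is the kind of pattern that lets one season rank higher on one measure and lower on the other.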
Srinak, N.; Lachnit, T.; Ulrich, L.; Fraune, S.; Kaleta, C.; Taubenheim, J.
Host-associated microbiomes are typically maintained in stable configurations that support host fitness, yet the mechanisms by which metabolic perturbations destabilize these communities remain poorly understood. Using the freshwater cnidarian Hydra vulgaris AEP, we systematically assessed microbiome responses to 326 single-metabolite perturbations. Only 17 metabolites, mostly amino acid-related compounds, induced significant compositional shifts in the microbial community. Most shifts were accompanied by transitions from Curvibacter- to Pseudomonas-dominated or Legionella-dominated states, indicating the existence of three alternative community states that can be induced by metabolic triggers. Integrating 16S sequences with functional genomic information, we found that β-diversity strongly predicted functional shifts, whereas reduced α-diversity was associated with loss of metabolic functions. The metabolite perturbations also altered host-microbe interactions, affecting pathogenicity-, glycocalyx-, and nitrogen-related functions. In particular, nitrogen metabolism shifted from ammonia oxidation in Curvibacter-dominated communities to ammonia reduction in Pseudomonas-dominated states. Experimental validation confirmed that Pseudomonas metabolizes L-arginine and drives environmental ammonia accumulation to levels that could impair Hydra's fitness and induce disease phenotypes. Conversely, Limnobacter was found to scavenge the environmental ammonia, potentially mitigating the adverse effects. These results demonstrate that metabolite-driven niche reconfiguration can destabilize host-associated microbiomes by coupling compositional shifts to functional change and host pathology, identifying metabolite-driven niche restructuring as a mechanism linking microbial community instability to host dysfunction.