Towards a holistic epidemiology of Streptococcus agalactiae using the BakRep repository
Fenske, L.; Schwengers, O.; Goesmann, A.
Streptococcus agalactiae is a versatile multi-host pathogen that can cause major neonatal disease in humans, as well as mastitis in dairy animals. Its ability to infect a wide range of hosts is largely driven by its high genomic plasticity and the acquisition of distinct accessory genes. The global population of S. agalactiae is characterized by multiple capsular serotypes and clonal complexes that differ in their propensity to cause invasive disease: hypervirulent CC17 (often serotype III) is associated with neonatal meningitis, whereas CC1, CC19, and CC23 are more often colonizing lineages. Although the species is widely studied, most research is limited to particular regions or single outbreak events, offering only fragmented snapshots instead of a comprehensive global picture. To move beyond region- or outbreak-limited studies, this work analyzed 37,970 S. agalactiae genomes from BakRep, integrating serotypes, MLST, AMR genes, lineage-specific genes, and descriptive metadata to map current trends and identify potential gaps in public data. The dataset largely matched the known population structure, with serotypes III, Ia, and V being most common and with stable serotype/clonal complex lineages (e.g. III-2/CC17, Ia/CC23, V/CC1), while also showing rising serotype diversity. Lineages differed in their accessory-gene profiles: III-2/CC17 was enriched for virulence and adhesion genes, while other groups showed either greater genomic plasticity (mobile/phage genes) or niche specialization. AMR was widespread, with very high tetracycline resistance (>80%), frequent MLSB resistance determinants, and emerging aminoglycoside resistance in some genomes. Overall, however, it became evident that the associated metadata contained substantial gaps. Missing or incomplete information limits biological interpretation, underscoring that rigorously curated, structured metadata is essential for maximizing the value of ongoing sequencing efforts.