
High-resolution species delimitation in Acinetobacter baumannii using a novel Core-Gene Consensus Delimitation approach

El Mchachti, K.; Valcek, A.; Van der Henst, C.; Flot, J.-F.

2026-05-05 microbiology
10.64898/2026.05.01.722318 bioRxiv

Acinetobacter baumannii is a highly adaptable nosocomial pathogen with extensive antibiotic resistance, a disproportionately large accessory genome, and high genomic plasticity. Owing to these features, the World Health Organization (WHO) classifies A. baumannii as a critical-priority pathogen. In this study, we analyzed 47 isolates from our VUB (Vrije Universiteit Brussel) collection and applied two distance-based species-delimitation algorithms, Automatic Barcode Gap Discovery (ABGD) and Assemble Species by Automatic Partitioning (ASAP), for the first time at the bacterial core-genome scale. By integrating conspecificity matrices, we extended these traditionally single-locus methods into a multi-locus framework, which we term Core-Gene Consensus Delimitation (CGCD). Across a range of gene-level co-occurrence thresholds, CGCD consistently recovered 11 stable groups using both ABGD and ASAP. Larger-scale validation using 856 A. baumannii genomes recovered the same 11 well-separated groups, demonstrating the robustness and reliability of our clustering approach. Mapping these groups onto a core-genome phylogeny revealed that each group forms a distinct clade, indicating that they represent evolutionarily independent lineages rather than arbitrary clusters. We further constructed a clustering tree based on accessory gene presence-absence patterns. In this tree, only one strain (AB231-VUB) clustered within group 11; otherwise, the groups remained tightly cohesive, sharing characteristic sets of accessory genes. Together, these results show that the groups defined by CGCD are genomically, evolutionarily, and functionally distinct, supporting their interpretation as separate species. Our findings highlight CGCD as a powerful, high-resolution framework for species delimitation.
CGCD is threshold-free, gene-based, and universally applicable: the first species-delimitation approach that can be applied across all domains of life, from bacteria to animals and plants.
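The idea of integrating per-gene partitions through a conspecificity matrix can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the choice of connected components to turn the matrix into groups are all assumptions made for illustration. In practice each per-gene partition would come from running ABGD or ASAP on that gene's alignment; here the partitions are simply written out by hand.

```python
from itertools import combinations


def conspecificity_matrix(partitions, isolates):
    """Count, for each pair of isolates, how many genes place them in the same group.

    `partitions` holds one partition per core gene; each partition is a list
    of sets of isolate names (hypothetical input format).
    """
    index = {name: i for i, name in enumerate(isolates)}
    n = len(isolates)
    counts = [[0] * n for _ in range(n)]
    for partition in partitions:
        for group in partition:
            for a, b in combinations(sorted(group), 2):
                i, j = index[a], index[b]
                counts[i][j] += 1
                counts[j][i] += 1
    return counts


def consensus_groups(counts, isolates, n_genes, threshold):
    """Link isolates co-grouped in at least `threshold` (a fraction) of genes.

    Groups are the connected components of the resulting graph, found with a
    small union-find. Using connected components is an assumption of this sketch.
    """
    n = len(isolates)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if counts[i][j] / n_genes >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, name in enumerate(isolates):
        groups.setdefault(find(i), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Toy data: four hypothetical isolates and three hypothetical core genes.
isolates = ["s1", "s2", "s3", "s4"]
partitions = [
    [{"s1", "s2"}, {"s3", "s4"}],
    [{"s1", "s2"}, {"s3"}, {"s4"}],
    [{"s1", "s2", "s3"}, {"s4"}],
]
counts = conspecificity_matrix(partitions, isolates)
for threshold in (0.3, 0.5, 0.9):
    print(threshold, consensus_groups(counts, isolates, len(partitions), threshold))
```

Scanning several thresholds, as in the loop at the end, mirrors the abstract's check that the same groups are recovered across a range of gene-level co-occurrence thresholds; groups that persist across the scan are the stable ones.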

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. Nucleic Acids Research (1128 papers in training set; Top 1%): 12.3%
2. Nature Communications (4913 papers in training set; Top 15%): 12.1%
3. Microbial Genomics (204 papers in training set; Top 0.2%): 9.9%
4. Genome Biology (555 papers in training set; Top 2%): 4.8%
5. ISME Communications (103 papers in training set; Top 0.4%): 4.8%
6. PLOS Biology (408 papers in training set; Top 2%): 4.8%
7. Nature Microbiology (133 papers in training set; Top 0.8%): 4.2%
50% of probability mass above
8. mBio (750 papers in training set; Top 4%): 4.2%
9. eLife (5422 papers in training set; Top 20%): 4.2%
10. Genome Medicine (154 papers in training set; Top 2%): 3.5%
11. mSystems (361 papers in training set; Top 3%): 3.0%
12. Scientific Reports (3102 papers in training set; Top 42%): 2.8%
13. Communications Biology (886 papers in training set; Top 4%): 2.3%
14. Frontiers in Microbiology (375 papers in training set; Top 4%): 2.0%
15. mSphere (281 papers in training set; Top 3%): 2.0%
16. Microbiome (139 papers in training set; Top 2%): 1.8%
17. Proceedings of the National Academy of Sciences (2130 papers in training set; Top 33%): 1.7%
18. Cell Host & Microbe (113 papers in training set; Top 3%): 1.7%
19. Frontiers in Cellular and Infection Microbiology (98 papers in training set; Top 4%): 1.3%
20. The ISME Journal (194 papers in training set; Top 2%): 1.1%
21. PLOS ONE (4510 papers in training set; Top 66%): 0.8%
22. Journal of Clinical Microbiology (120 papers in training set; Top 2%): 0.8%
23. Cell Reports (1338 papers in training set; Top 32%): 0.8%
24. Genome Research (409 papers in training set; Top 4%): 0.7%
25. PLOS Genetics (756 papers in training set; Top 17%): 0.6%