Hidden from plain sight: Novel Chlamydiota diversity emerging from screening genomic and metagenomic data.

Davison, H. R.; Hurst, G. D. D.

2023-03-18 evolutionary biology
10.1101/2023.03.17.533158 bioRxiv
Chlamydiota are an ancient and hyperdiverse phylum of obligate intracellular bacteria. The best characterized representatives are pathogens or parasites of mammals, but it is thought that their most common hosts are microeukaryotes like Amoebozoa. The diversity in taxonomy, evolution, and function of non-pathogenic Chlamydiota is slowly being described. Here we use data mining techniques and genomic analysis to extend our current knowledge of Chlamydiota diversity and its hosts, in particular the Order Parachlamydiales. We extract one Rhabdochlamydiaceae and three Simkaniaceae genomes from NCBI Short Read Archive deposits of ciliate and algal genome sequencing projects. We then use these to identify a further 14 and 8 genomes respectively amongst existing, unidentified environmental assemblies. From these data we identify two novel clades with host-associated data, for which we propose the names Candidatus Sacchlamydia (Family Rhabdochlamydiaceae) and Candidatus Amphrikania (Family Simkaniaceae), as well as a third new clade of environmental MAGs, Candidatus Acheromydia (Family Rhabdochlamydiaceae). The extent of uncharacterized diversity within the Rhabdochlamydiaceae and Simkaniaceae is indicated by 16 of the 22 MAGs being evolutionarily distant from currently characterized genera. Within our limited data, we observe great predicted diversity in Parachlamydiales metabolism and evolution, including the potential for metabolic and defensive symbioses as well as pathogenicity. These data provide an imperative to link genomic diversity in metagenomic data to its associated eukaryotic hosts, and to develop onward understanding of the functional significance of symbiosis with this hyperdiverse clade.
Graphical Abstract: Figure 1

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Environmental Microbiome | 26 | Top 0.1% | 14.3% |
| 2 | ISME Communications | 103 | Top 0.1% | 12.4% |
| 3 | Microbiome | 139 | Top 0.3% | 8.3% |
| 4 | Frontiers in Microbiology | 375 | Top 0.8% | 8.3% |
| 5 | mSystems | 361 | Top 2% | 4.8% |
| 6 | Genome Biology and Evolution | 280 | Top 0.5% | 3.6% |
50% of probability mass above
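
The 50% cutoff can be checked directly from the six probabilities listed above. The following is a minimal sketch, not part of the original page; the function name `journals_to_reach` and the variable names are illustrative assumptions.

```python
from itertools import accumulate

# Predicted probabilities (percent) of the six top-ranked journals,
# copied from the listing above.
top_probs = [14.3, 12.4, 8.3, 8.3, 4.8, 3.6]


def journals_to_reach(probs, threshold=50.0):
    """Return how many leading entries are needed for the running
    total to reach `threshold`, or None if it is never reached."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


# Running totals, rounded to one decimal to hide floating-point noise.
cumulative = [round(total, 1) for total in accumulate(top_probs)]

print(cumulative)
print(journals_to_reach(top_probs))
```

The running total passes 50% only once the sixth journal is included, which is consistent with the cutoff marker sitting between ranks 6 and 7.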
| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 7 | Science China Life Sciences | 26 | Top 0.7% | 2.1% |
| 8 | Scientific Reports | 3102 | Top 59% | 1.7% |
| 9 | BMC Biology | 248 | Top 1% | 1.7% |
| 10 | Communications Biology | 886 | Top 9% | 1.7% |
| 11 | The ISME Journal | 194 | Top 1% | 1.7% |
| 12 | Ecology and Evolution | 232 | Top 2% | 1.7% |
| 13 | mBio | 750 | Top 8% | 1.7% |
| 14 | mSphere | 281 | Top 4% | 1.5% |
| 15 | eLife | 5422 | Top 45% | 1.5% |
| 16 | Journal of Genetics and Genomics | 36 | Top 1% | 1.5% |
| 17 | Nature Communications | 4913 | Top 55% | 1.3% |
| 18 | BMC Ecology and Evolution | 49 | Top 1% | 1.3% |
| 19 | Microbiology Spectrum | 435 | Top 4% | 1.2% |
| 20 | PLOS ONE | 4510 | Top 60% | 1.2% |
| 21 | PLOS Biology | 408 | Top 14% | 1.2% |
| 22 | PeerJ | 261 | Top 12% | 0.9% |
| 23 | Proceedings of the National Academy of Sciences | 2130 | Top 40% | 0.9% |
| 24 | Journal of Systematics and Evolution | 11 | Top 0.2% | 0.9% |
| 25 | Molecular Phylogenetics and Evolution | 61 | Top 0.3% | 0.9% |
| 26 | Molecular Biology and Evolution | 488 | Top 4% | 0.8% |
| 27 | Environmental Microbiology | 119 | Top 3% | 0.7% |
| 28 | Global Ecology and Biogeography | 41 | Top 0.6% | 0.7% |
| 29 | iScience | 1063 | Top 33% | 0.7% |
| 30 | Genomics | 60 | Top 3% | 0.7% |