Genome size distributions in bacteria and archaea are strongly linked to phylogeny

Aylward, F. O.; Martinez-Gutierrez, C. A.

2021-12-16 evolutionary biology
10.1101/2021.12.15.472816 bioRxiv

The evolutionary forces that determine genome size in bacteria and archaea have been the subject of intense debate over the last few decades. Although the preferential loss of genes observed in prokaryotes is explained by a deletional bias, the factors that promote or prevent the fixation of such gene losses remain unclear. Moreover, statistical analyses on this topic have typically been limited to a narrow diversity of bacteria and archaea and have not considered the potential bias introduced by the shared recent ancestry of many lineages. In this study, we used a phylogenetic generalized least-squares (PGLS) analysis to evaluate the effect of different factors on genome size across a broad diversity of bacteria and archaea. We used dN/dS to estimate the strength of purifying selection, and 16S rRNA copy number as a proxy for ecological strategy; both have been postulated to play a role in shaping genome size. After model fitting, Pagel's lambda indicated a strong phylogenetic signal in genome size, suggesting that the diversification of this trait is strongly influenced by shared evolutionary history. As a predictor variable, dN/dS showed poor predictive power and was not significant when phylogeny was considered, consistent with the view that genome reduction can occur under either weak or strong purifying selection depending on the ecological context. 16S rRNA copy number also showed poor predictive power but remained significant when accounting for non-independence in the residuals, suggesting that ecological strategy, as approximated by 16S rRNA copy number, might play a minor role in genome size variation. Altogether, our results indicate that genome size is a complex trait that is not driven by any single underlying evolutionary force, but rather depends on lineage- and niche-specific factors that vary widely across bacteria and archaea.

Author Summary

The evolutionary forces driving genome size in bacteria and archaea have been debated over recent decades. Independent comparative analyses have each suggested that a single variable, such as the strength of selection, environmental complexity, or mutation rate, is the main driver of this trait, which complicates generalizations across the Tree of Life. Here, we applied a phylogeny-based statistical approach to assess how tightly genome size is linked to evolutionary history in bacteria and archaea. We also evaluated how well genome size can be predicted from the strength of purifying selection and from ecological strategy across a broad diversity of bacterial and archaeal genomes. Our approach indicates that genome size in prokaryotes is strongly dependent on phylogenetic history, and that genome size results from the interaction of clade-dependent variables such as past events, current selection regimes, and environmental complexity.
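
To make the PGLS idea in the abstract concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a Brownian-motion covariance matrix built from a hypothetical four-taxon tree, applies Pagel's lambda by scaling the off-diagonal (shared-history) covariances, and solves the generalized least-squares normal equations with NumPy. The function names, the tree, and all numeric values are illustrative assumptions; in a real analysis lambda would normally be estimated (for example by maximum likelihood) rather than fixed by hand.

```python
import numpy as np

def pagel_lambda_transform(C, lam):
    """Scale off-diagonal (shared-history) covariances by lambda; keep the diagonal."""
    V = lam * C                      # new array; C itself is not modified
    np.fill_diagonal(V, np.diag(C))  # restore the original variances
    return V

def gls_fit(X, y, V):
    """Generalized least squares: beta = (X' V^-1 X)^-1 X' V^-1 y."""
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)

# Hypothetical tree ((A,B),(C,D)) with root-to-tip depth 1.0; each pair
# shares 0.6 units of branch length. Under Brownian motion, the covariance
# between two tips equals their shared branch length.
C = np.array([
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

# Hypothetical values for illustration only (not data from the study).
x = np.array([0.0, 0.3, 1.0, 1.4])  # e.g. log 16S rRNA copy number
y = np.array([1.8, 2.1, 4.9, 5.6])  # e.g. genome size in Mbp
X = np.column_stack([np.ones_like(x), x])

for lam in (0.0, 0.5, 1.0):
    beta = gls_fit(X, y, pagel_lambda_transform(C, lam))
    print(f"lambda={lam:.1f}  intercept={beta[0]:.3f}  slope={beta[1]:.3f}")
```

With lambda = 0 the covariance matrix is diagonal and the fit reduces to ordinary least squares; with lambda = 1 the full Brownian-motion covariance is used. A lambda close to 1, as the abstract reports for genome size, means closely related taxa cannot be treated as independent observations.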

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Genome Biology and Evolution: 280 papers in training set; Top 0.1%; 23.0%
2. Molecular Biology and Evolution: 488 papers in training set; Top 0.3%; 10.7%
3. Journal of Molecular Evolution: 21 papers in training set; Top 0.1%; 8.6%
4. Journal of Evolutionary Biology: 98 papers in training set; Top 0.1%; 6.5%
5. Molecular Ecology: 304 papers in training set; Top 2%; 3.7%
50% of probability mass above
6. PeerJ: 261 papers in training set; Top 3%; 3.1%
7. BMC Ecology and Evolution: 49 papers in training set; Top 0.6%; 2.7%
8. Philosophical Transactions of the Royal Society B: 51 papers in training set; Top 2%; 2.1%
9. BMC Genomics: 328 papers in training set; Top 2%; 2.1%
10. Frontiers in Ecology and Evolution: 60 papers in training set; Top 2%; 2.1%
11. Evolution Letters: 71 papers in training set; Top 0.9%; 2.1%
12. PLOS Genetics: 756 papers in training set; Top 7%; 1.9%
13. eLife: 5422 papers in training set; Top 41%; 1.7%
14. Global Ecology and Biogeography: 41 papers in training set; Top 0.3%; 1.7%
15. Ecology and Evolution: 232 papers in training set; Top 2%; 1.7%
16. PLOS ONE: 4510 papers in training set; Top 58%; 1.4%
17. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 36%; 1.4%
18. Frontiers in Microbiology: 375 papers in training set; Top 6%; 1.4%
19. Journal of Theoretical Biology: 144 papers in training set; Top 1%; 1.1%
20. Peer Community Journal: 254 papers in training set; Top 3%; 1.0%
21. G3 Genes|Genomes|Genetics: 351 papers in training set; Top 2%; 0.9%
22. Evolution: 199 papers in training set; Top 2%; 0.9%
23. Scientific Reports: 3102 papers in training set; Top 70%; 0.9%
24. BMC Biology: 248 papers in training set; Top 3%; 0.8%
25. Nature Communications: 4913 papers in training set; Top 60%; 0.8%
26. PLOS Biology: 408 papers in training set; Top 18%; 0.8%
27. Molecular Phylogenetics and Evolution: 61 papers in training set; Top 0.3%; 0.8%
28. Heredity: 53 papers in training set; Top 0.3%; 0.8%
29. Philosophical Transactions of the Royal Society B: Biological Sciences: 53 papers in training set; Top 2%; 0.7%
30. Genome Research: 409 papers in training set; Top 5%; 0.7%
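
The "50% of probability mass" cutoff in the list above can be checked with a short cumulative sum. This is only an illustrative sketch: the function name is hypothetical, and the probabilities are the first seven percentages from the ranking, copied as listed.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the seven top-ranked journals above.
probs = [23.0, 10.7, 8.6, 6.5, 3.7, 3.1, 2.7]

def journals_to_reach(probs, threshold):
    """Return the smallest k whose first k probabilities sum to >= threshold, else None."""
    for k, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return k
    return None

print(journals_to_reach(probs, 50.0))
```

The first four journals sum to 48.8% and the first five to 52.5%, so five is the smallest number of journals that reaches the 50% mark.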