Long-read metagenomic sequencing reveals novel lineages and functional diversity in urban soil microbiome

Duan, Y.; Cusco, A.; Zhang, Y.; Inda-Diaz, J. S.; Zhu, C.; Castro, A. A.; Yang, X.; Yu, J.; Jiang, G.; Zhao, X.-M.; Coelho, L. P.

2026-03-21 bioinformatics
10.64898/2026.03.20.713087 bioRxiv
City parks and other urban green spaces can bring significant benefits to the physical and mental health of city residents. However, there is limited knowledge about the microbial communities inhabiting these urban soils. Here, we applied long-read metagenomic sequencing to 58 urban soil samples from two major cities in China, enabling genome-resolved reconstruction of microbial diversity at unprecedented contiguity. We recovered 7,949 medium- and high-quality metagenome-assembled genomes, comprising 4,171 species-level genome bins, of which over 97% represent previously undescribed species. Long-read assemblies revealed extensive secondary metabolic capacity, including more than 30,000 biosynthetic gene clusters, which were highly contiguous compared with those from fragmented short-read assemblies. Beyond secondary metabolism, we uncovered over 2 million small protein families, including hundreds that are strongly enriched in the neighbourhood of defense systems and mobile genetic elements, highlighting their overlooked role in urban soils. These findings expand our understanding of the functional diversity of urban soil microbiomes and provide new insights with implications for urban public health.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1 | Nature Communications | 4913 papers in training set | Top 2% | 23.4%
2 | Advanced Science | 249 papers in training set | Top 0.2% | 19.3%
3 | Microbiome | 139 papers in training set | Top 0.9% | 3.8%
4 | The Innovation | 12 papers in training set | Top 0.1% | 3.7%
50% of probability mass above
5 | Genomics, Proteomics & Bioinformatics | 171 papers in training set | Top 2% | 3.7%
6 | Cell | 370 papers in training set | Top 7% | 3.2%
7 | Genome Biology | 555 papers in training set | Top 3% | 3.2%
8 | Protein & Cell | 25 papers in training set | Top 1% | 1.8%
9 | Science China Life Sciences | 26 papers in training set | Top 0.8% | 1.8%
10 | mSystems | 361 papers in training set | Top 5% | 1.8%
11 | Scientific Reports | 3102 papers in training set | Top 63% | 1.4%
12 | PLOS ONE | 4510 papers in training set | Top 58% | 1.4%
13 | Molecular Plant | 36 papers in training set | Top 0.9% | 1.4%
14 | Science | 429 papers in training set | Top 16% | 1.3%
15 | Communications Biology | 886 papers in training set | Top 13% | 1.3%
16 | Science of The Total Environment | 179 papers in training set | Top 4% | 1.0%
17 | Science Advances | 1098 papers in training set | Top 24% | 1.0%
18 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 39% | 1.0%
19 | Cell Discovery | 54 papers in training set | Top 4% | 1.0%
20 | Nature Microbiology | 133 papers in training set | Top 4% | 0.9%
21 | National Science Review | 22 papers in training set | Top 2% | 0.9%
22 | Journal of Genetics and Genomics | 36 papers in training set | Top 2% | 0.9%
23 | Nature Biotechnology | 147 papers in training set | Top 6% | 0.9%
24 | Cell Systems | 167 papers in training set | Top 10% | 0.9%
25 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 9% | 0.8%
26 | mBio | 750 papers in training set | Top 11% | 0.8%
27 | Journal of Structural Biology | 58 papers in training set | Top 1% | 0.8%
28 | Cell Reports Physical Science | 18 papers in training set | Top 0.7% | 0.8%
29 | Frontiers in Microbiology | 375 papers in training set | Top 9% | 0.7%
30 | Genome Medicine | 154 papers in training set | Top 9% | 0.7%
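
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The sketch below is only an illustration: the function name `journals_to_reach` is hypothetical, and it assumes the cutoff is defined as the first rank at which the running total reaches or exceeds the threshold, which the page does not state explicitly.

```python
# Predicted probabilities (percent) for the ten highest-ranked journals,
# copied from the list above in rank order.
probs = [23.4, 19.3, 3.8, 3.7, 3.7, 3.2, 3.2, 1.8, 1.8, 1.8]


def journals_to_reach(probs, threshold):
    """Return (rank, cumulative_mass) at the first rank whose running
    total is >= threshold, or None if the threshold is never reached.

    Assumption: the cutoff rule is 'first rank at or above the threshold'.
    """
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None


rank, mass = journals_to_reach(probs, 50.0)
print(rank, round(mass, 1))
```

With the listed values, the running total after the third journal is 46.5% and after the fourth is 50.2%, which is consistent with the cutoff line placed after rank 4.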