SIPdb: A stable isotope probing database and analytical dashboard for linking amplicon sequences to microbial activity using a reverse ecology approach

Trentin, A. B.; Simpson, A.; Kimbrel, J. A.; Blazewicz, S. J.; Wilhelm, R. C.

2026-02-11 bioinformatics
10.64898/2026.02.09.704843 bioRxiv
Stable isotope probing (SIP) provides a powerful means to connect microbial sequence data with diverse metabolic activities, but the lack of a framework for SIP-derived data has limited its integration into broader strategies for ecological inference. Here, we introduce the SIPdb, an extensible SQLite database of curated nucleic acid SIP experiments (also in phyloseq format) paired with an interactive RShiny dashboard for analysis and visualization. The initial release compiles 22 studies covering 21 isotopolog substrates across diverse environments, with data standardized using the MISIP metadata standard. In creating the SIPdb, we have provided a standardized pipeline that accommodates the three most common SIP gradient fractionation strategies (binary, multi-fraction, and density-resolved), two isotope incorporator designation strategies (fixed- and sliding-window), and four complementary differential abundance methods (DESeq2, edgeR, limma-voom, and ALDEx2). Using our pipeline, we identified more than 42,000 unique amplicon sequence variants as isotope incorporators across 62 phyla. Benchmarking with synthetic datasets demonstrated consistent performance across incorporator designation strategies, with ALDEx2 providing the highest specificity. Validation against original publications showed that, on average, SIPdb recovered 70.1% of author-reported incorporator taxa, with discrepancies arising from differences in phylotyping or classification approaches. Finally, our reanalysis of a non-SIP study of 1,4-dioxane degradation showed how SIPdb can both validate known degraders and uncover additional candidate taxa involved in community metabolism. The SIPdb establishes a scalable platform for reverse ecology, enabling hypothesis generation, cross-study meta-analysis, and linking taxa to metabolic processes, while serving as an open, extensible resource to accelerate ecological interpretation in microbiome research.

Matching journals

The top 3 journals together account for at least 50% of the predicted probability mass (56.1% combined).

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Nucleic Acids Research | 1128 | Top 0.4% | 19.6% |
| 2 | Microbiome | 139 | Top 0.1% | 18.8% |
| 3 | Nature Communications | 4913 | Top 8% | 17.7% |

50% of probability mass above

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 4 | mSystems | 361 | Top 1% | 6.4% |
| 5 | Nature Biotechnology | 147 | Top 2% | 4.0% |
| 6 | PLOS Computational Biology | 1633 | Top 12% | 2.8% |
| 7 | Scientific Data | 174 | Top 0.8% | 2.1% |
| 8 | Nature Microbiology | 133 | Top 2% | 1.7% |
| 9 | ISME Communications | 103 | Top 1% | 1.7% |
| 10 | NAR Genomics and Bioinformatics | 214 | Top 2% | 1.7% |
| 11 | Genome Biology | 555 | Top 4% | 1.7% |
| 12 | mSphere | 281 | Top 4% | 1.2% |
| 13 | Cell Reports Methods | 141 | Top 3% | 1.2% |
| 14 | Bioinformatics | 1061 | Top 8% | 1.0% |
| 15 | The ISME Journal | 194 | Top 2% | 1.0% |
| 16 | PLOS ONE | 4510 | Top 64% | 0.9% |
| 17 | Communications Biology | 886 | Top 18% | 0.9% |
| 18 | Cell Systems | 167 | Top 10% | 0.9% |
| 19 | Nature Methods | 336 | Top 6% | 0.7% |
| 20 | Computational and Structural Biotechnology Journal | 216 | Top 10% | 0.7% |
| 21 | mBio | 750 | Top 12% | 0.6% |
| 22 | npj Biofilms and Microbiomes | 56 | Top 2% | 0.6% |
| 23 | Scientific Reports | 3102 | Top 78% | 0.6% |
| 24 | Journal of Proteome Research | 215 | Top 3% | 0.5% |
| 25 | Methods in Ecology and Evolution | 160 | Top 3% | 0.5% |
| 26 | eLife | 5422 | Top 63% | 0.5% |