A Permutation-Based Framework for Evaluating Bias in Microbiome Differential Abundance Analysis

Zeng, K.; Fodor, A. A.

2026-03-18 bioinformatics
10.64898/2026.03.14.711836 bioRxiv

Background: In microbiome research, differential abundance analysis aids in identifying significant differences in microbial taxa across two or more conditions. Statistical approaches used for this purpose include classical tests such as the t-test and Wilcoxon test, as well as methods designed to account for the compositional nature of microbiome data, including ALDEx2, ANCOM-BC2, and metagenomeSeq. In addition, methods originally developed for RNA sequencing data, such as DESeq2 and edgeR, have been frequently applied to microbiome studies. However, the use of these methods has been controversial. One area of concern is whether different modeling frameworks produce accurate p-values when the null hypothesis is true.

Results: We evaluated eight methods across six publicly available datasets. Four permutation strategies were applied to generate data under the null hypothesis: shuffling sample names, shuffling counts within samples, shuffling counts within taxa, and fully randomizing the counts table. Methods based on the negative binomial distribution (DESeq2 and edgeR) produced p-values that were consistently smaller than expected under the null hypothesis. In contrast, methods that attempt to correct for compositionality (ALDEx2, ANCOM-BC2, and metagenomeSeq) tended to produce larger-than-expected p-values, even when only sample labels were shuffled, a permutation strategy that does not alter compositional structure. These deviations were dependent on dataset characteristics and permutation strategy, suggesting complex interactions between underlying data structure and algorithm performance. Generating data to follow the expected negative binomial distribution did not eliminate the tendency of DESeq2 and edgeR to exaggerate statistical significance. Although similar patterns were observed in RNA sequencing (RNAseq) datasets, the deviations were less pronounced than in microbiome data. In contrast, the classical t-test and Wilcoxon test yielded p-value distributions consistent with theoretical expectations across datasets and permutation strategies.

Conclusions: These results indicate that the performance of several widely used differential abundance methods can be problematic under null conditions and may affect biological interpretation. Our findings emphasize the importance of careful method selection and highlight the robustness of simpler statistical approaches for reliable inference.
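
The four permutation strategies named in the Results can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes a count table stored as a list of rows with taxa as rows and samples as columns, it interprets "shuffling sample names" as permuting the group labels attached to the samples, and every function name here is hypothetical.

```python
import random

def shuffle_sample_labels(labels, rng):
    """Strategy 1 (assumed reading): permute group labels across samples.
    The count table itself is left untouched."""
    shuffled = list(labels)
    rng.shuffle(shuffled)
    return shuffled

def shuffle_within_samples(counts, rng):
    """Strategy 2: within each sample (column), permute counts across taxa."""
    n_taxa, n_samples = len(counts), len(counts[0])
    out = [row[:] for row in counts]
    for j in range(n_samples):
        column = [counts[i][j] for i in range(n_taxa)]
        rng.shuffle(column)
        for i in range(n_taxa):
            out[i][j] = column[i]
    return out

def shuffle_within_taxa(counts, rng):
    """Strategy 3: within each taxon (row), permute counts across samples."""
    out = []
    for row in counts:
        permuted = row[:]
        rng.shuffle(permuted)
        out.append(permuted)
    return out

def shuffle_full_table(counts, rng):
    """Strategy 4: permute every cell of the table, ignoring rows and columns."""
    n_samples = len(counts[0])
    flat = [value for row in counts for value in row]
    rng.shuffle(flat)
    return [flat[k:k + n_samples] for k in range(0, len(flat), n_samples)]

# Toy data (made up for illustration): 3 taxa x 4 samples.
counts = [[10, 0, 3, 7],
          [5, 2, 8, 1],
          [0, 0, 4, 9]]
labels = ["case", "case", "control", "control"]

rng = random.Random(0)  # fixed seed so the permutations are reproducible
new_labels = shuffle_sample_labels(labels, rng)
by_sample = shuffle_within_samples(counts, rng)
by_taxon = shuffle_within_taxa(counts, rng)
full = shuffle_full_table(counts, rng)
```

In this sketch the label shuffle leaves every count in place, which matches the abstract's remark that this strategy does not alter compositional structure. The within-sample shuffle preserves each sample's total count, and the within-taxon shuffle preserves each taxon's total, while the full randomization preserves only the overall set of values.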

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. PeerJ | 261 papers in training set | Top 0.1% | 18.3%
2. BMC Bioinformatics | 383 papers in training set | Top 0.5% | 14.1%
3. PLOS ONE | 4510 papers in training set | Top 23% | 8.3%
4. Bioinformatics | 1061 papers in training set | Top 3% | 8.1%
5. mSystems | 361 papers in training set | Top 1% | 7.1%
50% of probability mass above
6. Scientific Reports | 3102 papers in training set | Top 28% | 4.2%
7. Methods in Ecology and Evolution | 160 papers in training set | Top 0.9% | 3.5%
8. PLOS Computational Biology | 1633 papers in training set | Top 11% | 3.0%
9. Microbiology Spectrum | 435 papers in training set | Top 2% | 1.9%
10. Environmental DNA | 49 papers in training set | Top 0.2% | 1.7%
11. mSphere | 281 papers in training set | Top 3% | 1.7%
12. Microbial Genomics | 204 papers in training set | Top 1% | 1.7%
13. BMC Genomics | 328 papers in training set | Top 3% | 1.5%
14. Frontiers in Microbiology | 375 papers in training set | Top 6% | 1.3%
15. F1000Research | 79 papers in training set | Top 2% | 1.3%
16. BMC Microbiology | 35 papers in training set | Top 1.0% | 1.2%
17. Frontiers in Bioinformatics | 45 papers in training set | Top 0.6% | 0.9%
18. GigaScience | 172 papers in training set | Top 3% | 0.9%
19. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 8% | 0.9%
20. Molecular Ecology Resources | 161 papers in training set | Top 0.9% | 0.9%
21. Microbiome | 139 papers in training set | Top 3% | 0.8%
22. Microorganisms | 101 papers in training set | Top 2% | 0.8%
23. Metabarcoding and Metagenomics | 12 papers in training set | Top 0.1% | 0.7%
24. Bioinformatics Advances | 184 papers in training set | Top 5% | 0.7%
25. NAR Genomics and Bioinformatics | 214 papers in training set | Top 4% | 0.7%
26. Frontiers in Genetics | 197 papers in training set | Top 12% | 0.6%