
An Exponential Scale Mixture Model for Metatranscriptomics Data with Application to Inflammatory Bowel Disease

Kim, H.; Ma, L.

2026-05-15 genomics
10.64898/2026.05.15.725552 bioRxiv

Metatranscriptomic (MTX) sequencing enables profiling of gene expression across microbial communities, providing a framework for linking genetic potential with functional activity. However, standard pipelines report normalized abundances rather than raw counts, limiting the use of count-based RNA-seq methods, while Gaussian-based alternatives rely on transformations and assumptions that are often poorly suited to MTX data. We propose a new modeling framework for differential expression analysis of MTX data, built on a scale mixture of exponential distributions, that incorporates DNA abundance to adjust for genomic potential, accommodates subject-specific random effects, treats zeros as left-censored, and employs a mixture prior to handle extreme sparsity. Applied to the IBDMDB multi-omics cohort, differential expression results vary substantially across models, including among Gaussian approaches with different pseudocount choices. Our approach identifies a distinct subset of candidate genes not detected by existing Gaussian methods; these may provide useful leads toward a novel understanding of transcriptomic patterns associated with dysbiosis in inflammatory bowel disease. Estimated dysbiosis effect directions are consistent between our model and Gaussian-based approaches, while effect sizes from our model tend to be larger in absolute value.
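As a rough illustration of what treating zeros as left-censored can mean, the sketch below computes a log-likelihood under a single exponential distribution. It is not the authors' model: it omits the scale mixture, the DNA abundance adjustment, the subject-specific random effects, and the mixture prior. The function name, the fixed `rate`, and the detection `limit` are assumptions introduced here for illustration only.

```python
import math


def censored_exp_loglik(values, rate, limit):
    """Log-likelihood of an exponential model with zeros left-censored at `limit`.

    `rate` and `limit` are hypothetical inputs, not quantities from the paper.
    """
    if rate <= 0 or limit <= 0:
        raise ValueError("rate and limit must be positive")
    total = 0.0
    for y in values:
        if y == 0:
            # A zero is read as "below the limit": P(Y < limit) = 1 - exp(-rate * limit)
            total += math.log(-math.expm1(-rate * limit))
        else:
            # An observed value contributes the exponential log-density
            total += math.log(rate) - rate * y
    return total
```

Under this reading a zero contributes the probability of falling below the limit rather than a density at zero, which is why no pseudocount is needed before taking logarithms.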

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | Nature Biotechnology | 147 | Top 0.6% | 12.1%
2 | Bioinformatics | 1061 | Top 3% | 9.9%
3 | mSystems | 361 | Top 1% | 8.2%
4 | Microbiome | 139 | Top 0.6% | 6.2%
5 | Cell Reports Methods | 141 | Top 0.5% | 4.7%
6 | PLOS Computational Biology | 1633 | Top 8% | 4.2%
7 | BMC Bioinformatics | 383 | Top 2% | 4.1%
8 | Frontiers in Genetics | 197 | Top 2% | 3.9%
(50% of probability mass above)
9 | BMC Genomics | 328 | Top 0.8% | 3.6%
10 | Genome Biology | 555 | Top 2% | 3.6%
11 | Briefings in Bioinformatics | 326 | Top 2% | 3.6%
12 | NAR Genomics and Bioinformatics | 214 | Top 0.8% | 3.5%
13 | Microbial Genomics | 204 | Top 0.7% | 3.0%
14 | Computational and Structural Biotechnology Journal | 216 | Top 3% | 2.0%
15 | Nucleic Acids Research | 1128 | Top 11% | 1.7%
16 | PLOS ONE | 4510 | Top 55% | 1.7%
17 | Nature Communications | 4913 | Top 56% | 1.3%
18 | Bioinformatics Advances | 184 | Top 4% | 1.2%
19 | Scientific Reports | 3102 | Top 67% | 1.2%
20 | Genome Medicine | 154 | Top 6% | 1.2%
21 | Genetic Epidemiology | 46 | Top 0.6% | 1.2%
22 | Physiological Genomics | 15 | Top 0.2% | 1.2%
23 | npj Systems Biology and Applications | 99 | Top 2% | 0.9%
24 | Communications Biology | 886 | Top 25% | 0.7%
25 | Methods in Ecology and Evolution | 160 | Top 2% | 0.7%
26 | Genome Research | 409 | Top 5% | 0.7%
27 | npj Biofilms and Microbiomes | 56 | Top 2% | 0.6%
28 | Proceedings of the National Academy of Sciences | 2130 | Top 48% | 0.6%
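The statement that the top 8 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The following is a minimal sketch; the function name is hypothetical, and the list holds the percentages of the first ten ranked journals as shown above.

```python
def rank_reaching_mass(probabilities, threshold):
    """Return the 1-based rank at which the running total first reaches `threshold`."""
    running = 0.0
    for rank, p in enumerate(probabilities, start=1):
        running += p
        if running >= threshold:
            return rank
    return None  # threshold never reached


# Predicted probabilities (in percent) for ranks 1 through 10, copied from the listing
top_probabilities = [12.1, 9.9, 8.2, 6.2, 4.7, 4.2, 4.1, 3.9, 3.6, 3.6]
cutoff_rank = rank_reaching_mass(top_probabilities, 50.0)
```

With these values the running total is 49.4% after rank 7 and 53.3% after rank 8, so the threshold is first crossed at rank 8.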