
PACMON: Pathway-guided Multi-Omics data integration for interpreting large-scale perturbation screens

Qoku, A.; Stickel, T.; Amerifar, S.; Wolf, S.; Oellerich, T.; Buettner, F.

2026-03-24 bioinformatics
10.64898/2026.03.20.713295 bioRxiv

High-throughput perturbation screens coupled with single-cell molecular profiling enable systematic interrogation of gene function, yet interpreting the resulting data in terms of biological pathways remains challenging. Existing approaches either identify latent gene modules without linking them to perturbations, or model perturbation effects without incorporating prior biological knowledge, limiting interpretability and scalability. Here, we introduce PACMON (Pathway-guided Multi-Omics data integration for interpreting large-scale perturbation screens), a Bayesian latent factor model that jointly infers pathway-level programs and their modulation by experimental perturbations. PACMON decomposes multimodal molecular measurements into shared latent factors aligned with known biological pathways through structured sparsity priors, while simultaneously estimating how each perturbation activates or represses these pathway programs. The framework naturally accommodates multiple data modalities and employs stochastic variational inference for scalable application to large datasets. We evaluate PACMON in three settings of increasing complexity. On synthetic data with known ground truth, PACMON achieves near-perfect recovery of pathway structure and perturbation effects, outperforming existing methods in both accuracy and computational scalability. Applied to a multimodal Perturb-CITE-seq screen of melanoma cells, PACMON recovers coherent interferon-signaling and cell-cycle programs spanning RNA and surface-protein modalities and identifies interpretable perturbation-pathway associations consistent with known immune-evasion mechanisms. Finally, we apply PACMON to the Tahoe-100M perturbation atlas (approximately 100 million cells and over 1,000 drug-dose combinations), producing the first pathway-level latent factor analysis at this scale and revealing biologically meaningful drug-response landscapes across Hallmark pathway programs.
PACMON provides a unified, scalable and interpretable framework for mapping perturbation effects onto biological pathways in modern large-scale perturbation experiments.
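
To make the model structure described in the abstract more concrete, the following is a minimal toy sketch of a pathway-masked linear factor model with perturbation-dependent factor scores. It is not PACMON's actual specification: the hard 0/1 mask (standing in for a structured sparsity prior), the Gaussian noise, the single modality, and all names (`simulate`, `mask`, `effects`, `assignments`) are assumptions made for illustration, and no inference is performed.

```python
import random


def simulate(pathway_mask, effects, assignments, noise_sd=0.1, seed=0):
    """Draw cells from a toy pathway-masked linear factor model (illustrative only)."""
    rng = random.Random(seed)
    n_factors = len(pathway_mask)
    n_genes = len(pathway_mask[0])
    # Loadings are nonzero only where a gene is annotated to the pathway.
    # Assumption: a hard mask replaces the sparsity prior a Bayesian model would use.
    loadings = [
        [rng.gauss(0.0, 1.0) if pathway_mask[k][g] else 0.0 for g in range(n_genes)]
        for k in range(n_factors)
    ]
    cells = []
    for p in assignments:
        # Factor scores: perturbation-specific shift plus cell-level noise.
        z = [effects[p][k] + rng.gauss(0.0, noise_sd) for k in range(n_factors)]
        # Expression: linear combination of pathway programs plus measurement noise.
        x = [
            sum(z[k] * loadings[k][g] for k in range(n_factors)) + rng.gauss(0.0, noise_sd)
            for g in range(n_genes)
        ]
        cells.append(x)
    return loadings, cells


# Two hypothetical pathways over five genes; the third gene belongs to both.
mask = [[1, 1, 1, 0, 0],
        [0, 0, 1, 1, 1]]
# Rows: control, a perturbation activating pathway 0, a perturbation repressing pathway 1.
effects = [[0.0, 0.0], [2.0, 0.0], [0.0, -2.0]]
assignments = [0] * 50 + [1] * 50 + [2] * 50
loadings, cells = simulate(mask, effects, assignments)
```

In this sketch a positive entry in `effects` shifts a pathway factor up (activation) and a negative entry shifts it down (repression); a model of the kind the abstract describes would infer both the loadings and these effects from data rather than fix them in advance.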

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Cell Systems | 167 | Top 0.2% | 22.3% |
| 2 | Nature Biotechnology | 147 | Top 0.5% | 12.2% |
| 3 | Nature Methods | 336 | Top 1.0% | 10.0% |
| 4 | Nature Communications | 4913 | Top 25% | 7.1% |
| | 50% of probability mass above | | | |
| 5 | Genome Biology | 555 | Top 2% | 3.6% |
| 6 | Nucleic Acids Research | 1128 | Top 6% | 3.6% |
| 7 | Nature Genetics | 240 | Top 2% | 3.6% |
| 8 | Nature | 575 | Top 8% | 2.9% |
| 9 | Proceedings of the National Academy of Sciences | 2130 | Top 26% | 2.4% |
| 10 | Bioinformatics | 1061 | Top 7% | 2.1% |
| 11 | Genome Medicine | 154 | Top 4% | 1.9% |
| 12 | Nature Cell Biology | 99 | Top 2% | 1.9% |
| 13 | Nature Machine Intelligence | 61 | Top 2% | 1.7% |
| 14 | Science | 429 | Top 14% | 1.7% |
| 15 | PLOS Computational Biology | 1633 | Top 18% | 1.5% |
| 16 | Advanced Science | 249 | Top 13% | 1.3% |
| 17 | Briefings in Bioinformatics | 326 | Top 5% | 1.3% |
| 18 | Nature Computational Science | 50 | Top 1.0% | 1.2% |
| 19 | Scientific Reports | 3102 | Top 67% | 1.2% |
| 20 | Communications Biology | 886 | Top 15% | 1.2% |
| 21 | Cell Reports | 1338 | Top 31% | 0.9% |
| 22 | The American Journal of Human Genetics | 206 | Top 3% | 0.9% |
| 23 | Nature Microbiology | 133 | Top 4% | 0.9% |
| 24 | Cancer Research | 116 | Top 3% | 0.8% |
| 25 | Nature Biomedical Engineering | 42 | Top 2% | 0.7% |
| 26 | Cell Genomics | 162 | Top 7% | 0.7% |
| 27 | Science Advances | 1098 | Top 33% | 0.6% |
| 28 | Nature Neuroscience | 216 | Top 7% | 0.6% |
| 29 | Genome Research | 409 | Top 5% | 0.6% |
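
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked against the listed percentages with a short cumulative sum. The helper name `journals_to_reach` is hypothetical, and the input is simply the first five probabilities from the ranking above.

```python
def journals_to_reach(probabilities, threshold=50.0):
    """Return (count, cumulative mass) for the first prefix reaching the threshold."""
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= threshold:
            return count, total
    # Threshold never reached with the values supplied.
    return None, total


# Predicted probabilities (in percent) for ranks 1 to 5, copied from the listing.
top = [22.3, 12.2, 10.0, 7.1, 3.6]
count, mass = journals_to_reach(top)
```

With these values the running total stays below 50% after three journals (44.5%) and crosses it at the fourth, which matches the page's summary line.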