Analysis of Transcriptograms in Epithelial-Mesenchymal Transition (EMT)

Santos, O. J.; Dalmolin, R. J.; de Almeida, R. M. C.

2026-02-18 bioinformatics
10.64898/2026.02.16.706231 bioRxiv

Single-cell RNA sequencing (single-cell RNA-seq) has represented a revolution in gene expression analysis. However, high dropout rates and stochastic noise often reduce the amount of information captured in these experiments. The epithelial-mesenchymal transition (EMT), which is fundamental to tumor progression and organismal development, is particularly difficult to fully characterize due to the existence of intermediate states. In this work, we demonstrate that projecting transcriptomic data onto gene lists ordered using protein-protein interaction (PPI) information acts as a "biological low-pass filter", attenuating technical noise and increasing the statistical power of the analyses. We propose and validate an innovative pipeline that integrates the Transcriptogram method with Principal Component Analysis (PCA). By applying a moving average over functionally ordered genes, we drastically increase the signal-to-noise ratio, enabling the inference of cellular trajectories. The method was applied to a public dataset of TGF-β1-induced MCF10A cells, with rigorous batch-effect correction based on biological controls. The results reveal that EMT is not merely a morphological change, but a coordinated, systemic reprogramming. This approach enabled the identification of critical modules that would remain hidden in conventional analyses: (i) a massive "Metabolic Switch" (Cluster 2), indicating a transition toward oxidative phosphorylation to sustain invasion; (ii) a strategic blockade of the cell cycle (Cluster 4); and (iii) a "Detoxification Shield" and chemoresistance program (Cluster 5), characterized by endogenous activation of metallothioneins. We conclude that the combination of PPI network topology and dimensionality reduction offers superior resolution for dissecting cellular plasticity.
The method not only validates classical markers, but also reveals the hidden functional architecture of the transition, showing that EMT is not a single, uniform process, but rather one in which cells can follow distinct trajectories, halting at different stages of differentiation.
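
The core idea in the abstract, a moving average over functionally ordered genes followed by PCA, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `transcriptogram` and `pca_scores`, the window `radius`, the truncated-window handling at the ends of the gene list, and the synthetic data are all assumptions made for illustration, and the gene columns are assumed to be already arranged in a PPI-derived order.

```python
import numpy as np

def transcriptogram(expr, radius):
    """Moving average along the (already ordered) gene axis.

    expr:   array of shape (n_cells, n_genes); columns are assumed to be in
            a PPI-derived order (the ordering step itself is not shown).
    radius: window half-width; interior positions average 2*radius + 1 genes.
    Windows are truncated at both ends of the list (an assumption here).
    """
    expr = np.asarray(expr, dtype=float)
    n_cells, n_genes = expr.shape
    # Prefix sums with a leading zero column, so any window sum is a difference.
    csum = np.concatenate([np.zeros((n_cells, 1)), np.cumsum(expr, axis=1)], axis=1)
    idx = np.arange(n_genes)
    lo = np.clip(idx - radius, 0, n_genes)
    hi = np.clip(idx + radius + 1, 0, n_genes)
    return (csum[:, hi] - csum[:, lo]) / (hi - lo)

def pca_scores(x, n_components=2):
    """Project rows of x onto the leading principal components via SVD."""
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Synthetic example (hypothetical data): two groups of cells that differ in a
# contiguous block of neighboring genes, plus independent noise per gene.
rng = np.random.default_rng(0)
n_cells, n_genes = 20, 50
signal = np.zeros((n_cells, n_genes))
signal[n_cells // 2:, 10:30] = 1.0
noisy = signal + rng.normal(scale=1.0, size=signal.shape)

smoothed = transcriptogram(noisy, radius=5)
scores = pca_scores(smoothed, n_components=2)
```

Because noise that is independent across neighboring genes averages out while a signal shared by a contiguous block of genes survives, the smoothed profiles have a higher signal-to-noise ratio than the raw ones, which is the "low-pass filter" behavior the abstract describes. The choice of window radius trades noise suppression against resolution along the gene list.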

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. NAR Genomics and Bioinformatics: 214 papers in training set, Top 0.1%, 18.1%
2. Bioinformatics: 1061 papers in training set, Top 2%, 12.0%
3. Scientific Reports: 3102 papers in training set, Top 11%, 8.2%
4. BMC Bioinformatics: 383 papers in training set, Top 1%, 8.2%
5. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 0.7%, 6.1%
50% of probability mass above
6. Frontiers in Cell and Developmental Biology: 218 papers in training set, Top 1%, 3.9%
7. iScience: 1063 papers in training set, Top 6%, 3.5%
8. PLOS Computational Biology: 1633 papers in training set, Top 11%, 3.2%
9. Nucleic Acids Research: 1128 papers in training set, Top 7%, 3.0%
10. PLOS ONE: 4510 papers in training set, Top 43%, 3.0%
11. npj Systems Biology and Applications: 99 papers in training set, Top 0.9%, 2.0%
12. Advanced Science: 249 papers in training set, Top 12%, 1.6%
13. Communications Biology: 886 papers in training set, Top 10%, 1.6%
14. Cell Reports Methods: 141 papers in training set, Top 3%, 1.4%
15. Nature Communications: 4913 papers in training set, Top 55%, 1.3%
16. Frontiers in Genetics: 197 papers in training set, Top 6%, 1.3%
17. Proceedings of the National Academy of Sciences: 2130 papers in training set, Top 39%, 1.1%
18. Cancers: 200 papers in training set, Top 4%, 0.9%
19. RNA Biology: 70 papers in training set, Top 0.4%, 0.9%
20. Cell Systems: 167 papers in training set, Top 11%, 0.9%
21. Development: 440 papers in training set, Top 3%, 0.8%
22. International Journal of Molecular Sciences: 453 papers in training set, Top 15%, 0.8%
23. Journal of Cell Science: 353 papers in training set, Top 2%, 0.8%
24. Life Science Alliance: 263 papers in training set, Top 1%, 0.8%
25. Physical Biology: 43 papers in training set, Top 2%, 0.7%
26. eLife: 5422 papers in training set, Top 59%, 0.7%
27. BMC Genomics: 328 papers in training set, Top 7%, 0.6%
28. Frontiers in Molecular Biosciences: 100 papers in training set, Top 6%, 0.6%