Decoupling Lineage and Intrinsic Information in Single-Cell Lineage Tracing Data with Deep Disentangled Representation Learning

Wen, Y.; Xiong, J.; Gong, F.; Ma, L.; Wan, L.

2026-03-11 cell biology
10.64898/2026.03.10.710716 bioRxiv
Single-cell RNA sequencing combined with lineage tracing technologies provides rich opportunities to study development and tumor evolution, yet existing computational methods struggle to disentangle intrinsic transcriptional states from lineage-driven effects. We introduce DeepTracing, a deep generative framework that integrates disentangled representation learning with lineage-aware Gaussian processes to explicitly separate intrinsic cellular variation from lineage constraints. The model constructs a layered latent space and enforces independence via Total Correlation regularization, producing intrinsic, lineage, and unified embeddings. Across extensive benchmarks, DeepTracing consistently outperforms existing approaches. In TedSim simulations, it achieves superior clustering of cell states and effectively recovers phylogenetic structure, surpassing both the original expression data and scVI. Applied to mouse tumor lineage-tracing data, DeepTracing attains higher ARI/NMI for tumor-type classification than scVI and PORCELAN, accurately separating primary and metastatic tumors and recovering known trajectories such as early lymph-node divergence and liver-to-kidney cross-seeding. In larger datasets, it maintains strong performance while preserving both transcriptomic continuity and lineage fidelity. DeepTracing also reconstructs continuous developmental trajectories in mouse ventral midbrain, isolating temporal effects from intrinsic differentiation. These results establish DeepTracing as a scalable and interpretable framework for analyzing multimodal single-cell data in tumor progression.

Code availability: The source code is publicly available at https://github.com/Yuhong-Wen/DeepTracing.

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Nature Machine Intelligence | 61 papers in training set | Top 0.1% | 14.0%
2. Nature Communications | 4913 papers in training set | Top 12% | 14.0%
3. Nature | 575 papers in training set | Top 3% | 9.8%
4. Genome Biology | 555 papers in training set | Top 1% | 6.2%
5. Nature Methods | 336 papers in training set | Top 2% | 6.2%
50% of probability mass above
6. Cell Systems | 167 papers in training set | Top 3% | 4.7%
7. Science | 429 papers in training set | Top 9% | 3.5%
8. Cell Reports | 1338 papers in training set | Top 16% | 3.5%
9. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 21% | 3.5%
10. Nature Medicine | 117 papers in training set | Top 1% | 3.0%
11. Nature Biotechnology | 147 papers in training set | Top 3% | 2.5%
12. Nature Cell Biology | 99 papers in training set | Top 2% | 1.8%
13. Patterns | 70 papers in training set | Top 1.0% | 1.6%
14. Nucleic Acids Research | 1128 papers in training set | Top 11% | 1.6%
15. Advanced Science | 249 papers in training set | Top 13% | 1.4%
16. Science Advances | 1098 papers in training set | Top 24% | 1.2%
17. Journal of Cell Biology | 333 papers in training set | Top 3% | 1.2%
18. Nature Genetics | 240 papers in training set | Top 6% | 1.2%
19. Communications Biology | 886 papers in training set | Top 18% | 0.9%
20. PLOS ONE | 4510 papers in training set | Top 65% | 0.9%
21. Cell | 370 papers in training set | Top 16% | 0.8%
22. Cell Genomics | 162 papers in training set | Top 6% | 0.8%
23. Nature Biomedical Engineering | 42 papers in training set | Top 2% | 0.7%
24. Development | 440 papers in training set | Top 4% | 0.7%
25. Nature Neuroscience | 216 papers in training set | Top 6% | 0.7%
26. Bioinformatics | 1061 papers in training set | Top 10% | 0.7%
27. Scientific Reports | 3102 papers in training set | Top 77% | 0.7%
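
The 50% cutoff stated for the ranked list can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch in Python, not part of the original page: the percentages are copied from the list, and the function name `journals_needed` is a hypothetical helper introduced only for illustration.

```python
from itertools import accumulate

# Predicted probabilities (%) copied from the ranked journal list, in rank order.
probs = [
    14.0, 14.0, 9.8, 6.2, 6.2, 4.7, 3.5, 3.5, 3.5, 3.0,
    2.5, 1.8, 1.6, 1.6, 1.4, 1.2, 1.2, 1.2, 0.9, 0.9,
    0.8, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7,
]


def journals_needed(probabilities, mass):
    """Return the smallest number of top-ranked entries whose running
    total reaches `mass`, or None if the listed entries never reach it.

    Hypothetical helper for illustration; not taken from the source page.
    """
    for count, running_total in enumerate(accumulate(probabilities), start=1):
        if running_total >= mass:
            return count
    return None


# Number of top-ranked journals needed to cover 50% of the probability mass.
print(journals_needed(probs, 50.0))  # → 5
```

The 27 listed journals do not sum to 100%, so a threshold above their combined total returns `None` rather than a count.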