Novel Parameter-Free and Interpretable Integration of CITE-seq RNA and ADT Profiles via Tensor Decomposition-Based Unsupervised Feature Extraction
Taguchi, Y.-h.; Turki, T.
CITE-seq jointly profiles cellular transcripts and surface proteins, but integrating RNA and antibody-derived tags (ADTs) remains challenging because the two modalities differ markedly in dimensionality, sparsity, and noise characteristics. We present a tensor decomposition-based unsupervised feature extraction framework for the parameter-free integration of CITE-seq data. By constructing a gene × cell × protein tensor and applying higher-order singular value decomposition (HOSVD), the method derives shared latent representations of genes, cells, and proteins without prior gene filtering or modality-weight tuning. Across five ImmGen T-cell CITE-seq datasets, the resulting cell embeddings were generally more consistent with annotated cell types than RNA-only, protein-only, or totalVI-based embeddings, whereas organ-level consistency did not improve. The latent factors also enabled post hoc unsupervised gene selection, and the selected genes showed biologically meaningful enrichment for T-cell-related terms. In addition, the method's failure on a poor-quality dataset served as a useful quality-control signal. Together with a blocked sparse-matrix implementation for large tensors, these results indicate that tensor decomposition-based unsupervised feature extraction provides an interpretable, scalable, and competitive approach for integrating RNA and ADT measurements in CITE-seq experiments.
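The tensor construction and HOSVD step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state how the tensor is formed, so the sketch assumes each entry is the product of a gene's and a protein's value in the same cell, x_ijk = rna_ij * adt_kj. The random toy matrices, the array sizes, and the names `mode_product` and `hosvd` are hypothetical, and normalization and the blocked sparse-matrix handling for large tensors are omitted.

```python
import numpy as np

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode` (mode-n product)."""
    # Contract the columns of `matrix` with the chosen tensor axis, then
    # move the resulting axis back to its original position.
    return np.moveaxis(np.tensordot(matrix, tensor, axes=(1, mode)), 0, mode)

def hosvd(tensor):
    """Full HOSVD: return the core tensor and one factor matrix per mode."""
    factors = []
    for mode in range(tensor.ndim):
        # Unfold: bring `mode` to the front and flatten the remaining axes.
        unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
        factors.append(u)
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    return core, factors

# Toy data standing in for real counts (assumption: rows are features,
# columns are the same cells in the same order for both modalities).
rng = np.random.default_rng(0)
n_genes, n_cells, n_proteins = 6, 5, 3
rna = rng.random((n_genes, n_cells))      # gene x cell
adt = rng.random((n_proteins, n_cells))   # protein x cell

# Assumed construction: x[i, j, k] = rna[i, j] * adt[k, j].
tensor = np.einsum("ij,kj->ijk", rna, adt)  # gene x cell x protein

core, (gene_factors, cell_factors, protein_factors) = hosvd(tensor)

# Rows of `cell_factors` are per-cell coordinates; the leading columns
# would serve as a low-dimensional cell embedding.
cell_embedding = cell_factors[:, :2]
```

In this sketch the leading columns of `cell_factors` play the role of the cell embedding, while `gene_factors` and `protein_factors` give the corresponding gene and protein loadings that a post hoc gene-selection step could draw on.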
Matching journals
The top 6 journals account for 50% of the predicted probability mass.