GraphBG: Fast Bayesian Domain Detection via Spectral Graph Convolutions for Multi-slice and Multi-modal Spatial Transcriptomics
Do, V. H.; Tran, T. P. L.; Canzar, S.
Spatial transcriptomics (ST) technologies enable measurement of gene expression with spatial context, offering unprecedented insight into tissue architecture and cellular microenvironments. A fundamental analysis task is the identification of spatial domains, i.e., contiguous regions with distinct molecular profiles. As ST datasets scale to larger tissue areas, multiple slices, and multiple molecular modalities, there is a growing need for clustering methods that are accurate, scalable, and capable of integrating diverse spatial and molecular signals. We present GraphBG, a unified and scalable framework for spatial domain detection in ST data. GraphBG integrates approximate spectral graph convolutions with a variational Bayesian Gaussian mixture model, enabling robust representation learning and clustering of spatially coherent domains. We extend this core model to support multi-slice analysis (GraphBG-MS) through metacell aggregation, batch correction, and joint clustering, and to multi-modal spatial omics data (GraphBG-MM) via modality-specific graph encodings and kernel canonical correlation analysis. Across diverse real and simulated datasets, GraphBG consistently outperforms existing methods in domain coherence, scalability, and biological interpretability. Notably, it accurately clusters over 370,000 cells from 31 MERFISH tissue slices in just 5 minutes and integrates spatial transcriptomic and proteomic data for improved domain resolution. Applying GraphBG-MS to mouse liver ST data, we show that it captures canonical lobular zonation and disease-specific remodeling, highlighting its ability to reveal biologically meaningful tissue organization.
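The abstract does not give implementation details, so the following is only a minimal sketch of what an approximate spectral graph convolution over a spatial neighbour graph could look like. It assumes a k-nearest-neighbour graph on spot coordinates and a simplified, parameter-free propagation with the symmetrically normalized adjacency matrix; the function names `knn_adjacency` and `propagate`, the choice of `k`, the number of hops, and the toy data are all hypothetical and are not taken from GraphBG.

```python
import numpy as np


def knn_adjacency(coords, k=6):
    """Symmetric k-nearest-neighbour adjacency from 2-D spot coordinates."""
    n = coords.shape[0]
    # Pairwise squared Euclidean distances (dense; fine for a small toy example).
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)  # a spot is not its own neighbour
    nbrs = np.argsort(d2, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1.0
    return np.maximum(adj, adj.T)  # symmetrize


def propagate(features, adj, n_hops=2):
    """Low-pass graph filter: apply S = D^{-1/2} (A + I) D^{-1/2} n_hops times."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    s = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    out = features
    for _ in range(n_hops):
        out = s @ out
    return out


# Toy data (assumption): two spatial domains split at x = 0.5.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 1, size=(200, 2))
labels = (coords[:, 0] > 0.5).astype(int)
expr = rng.normal(size=(200, 10)) + 2.0 * labels[:, None]

adj = knn_adjacency(coords, k=6)
smoothed = propagate(expr, adj, n_hops=2)
```

The smoothed features could then be passed to a variational Bayesian Gaussian mixture model for clustering, for example scikit-learn's `BayesianGaussianMixture`; whether GraphBG uses that particular implementation is not stated in the abstract.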
Matching journals
The top 2 journals account for 50% of the predicted probability mass.