
GatorDuo: Global-Consistency Dual-Graph Refinement With Pseudo-Label Agreement for Spatial Transcriptomics

Zhang, Z.; Jimeno Yepes, A.; Bian, J.; Li, F.; Liu, Y.

2026-05-13 bioinformatics
10.64898/2026.05.10.724039 bioRxiv

Spatial transcriptomics (ST) measures gene expression together with spatial coordinates, enabling spatial domain identification of coherent tissue regions. Many recent approaches rely on graph-based modeling to combine spatial neighborhoods and transcriptomic (gene-expression) similarity, yet neighborhood construction is often unreliable under sparsity and technical noise. As a result, spurious cross-domain shortcut edges can persist in static graphs and propagate misleading signals during message passing, ultimately blurring domain boundaries and weakening cluster separability. In this paper, we propose GatorDuo, a topology-aware dual-graph contrastive self-supervised framework for robust spatial domain identification that couples gene-expression similarity with spatial proximity through complementary neighborhood graphs. GatorDuo introduces global-consistency-based graph refinement that uses a pseudo-label agreement mask to suppress cross-domain shortcut edges in both views, thus stabilizing neighborhood topology for representation learning. To avoid manual tuning of domain resolution, GatorDuo further employs a contextual bandit reinforcement-learning strategy to adaptively select the clustering granularity (the number of clusters) used for refinement. The refined view-specific embeddings are integrated via a hybrid-routing Mixture-of-Experts (MoE) module to generate a unified embedding, optimized with contrastive objectives augmented by an MoE-alignment term. Across eight public benchmarks spanning sequencing- and imaging-based ST at spot and single-cell resolution, and compared with ten representative baselines, GatorDuo consistently delivers strong and robust spatial domain identification performance across multiple clustering metrics, while yielding informative unified embeddings that can support downstream biological analyses.
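The pseudo-label agreement mask described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `refine_edges`, the edge-list representation, and the toy labels are all assumptions made for illustration. It shows only the general idea of dropping edges whose endpoints receive different pseudo-labels, applied to both a spatial graph and an expression-similarity graph.

```python
# Hypothetical sketch of pseudo-label agreement masking on two graph views.
# Assumption: each graph is a list of (i, j) spot-index pairs, and
# pseudo_labels[i] is the current cluster assignment of spot i.

def refine_edges(edges, pseudo_labels):
    """Keep only edges whose two endpoints share the same pseudo-label."""
    return [(i, j) for i, j in edges if pseudo_labels[i] == pseudo_labels[j]]

# Toy data: 6 spots split into two assumed domains (0 and 1).
pseudo_labels = [0, 0, 0, 1, 1, 1]

# Spatial-proximity view: a chain of neighbouring spots.
spatial_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]

# Expression-similarity view: includes two cross-domain shortcut edges.
expression_edges = [(0, 2), (0, 5), (3, 5), (1, 4)]

# Apply the same agreement mask to both views.
refined_spatial = refine_edges(spatial_edges, pseudo_labels)
refined_expression = refine_edges(expression_edges, pseudo_labels)

print(refined_spatial)
print(refined_expression)
```

In this toy case the mask removes the single boundary-crossing edge from the spatial view and both shortcut edges from the expression view. How the pseudo-labels are obtained and updated, and how the number of clusters is chosen, is left out of the sketch.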

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Nature Methods: 336 papers in training set; Top 0.1%; 26.5%
2. Nature Communications: 4913 papers in training set; Top 13%; 12.8%
3. Nature Biotechnology: 147 papers in training set; Top 0.6%; 10.7%
4. Bioinformatics: 1061 papers in training set; Top 4%; 7.0%

50% of probability mass above

5. Nucleic Acids Research: 1128 papers in training set; Top 5%; 4.0%
6. Advanced Science: 249 papers in training set; Top 5%; 3.7%
7. Genome Biology: 555 papers in training set; Top 3%; 3.1%
8. Briefings in Bioinformatics: 326 papers in training set; Top 3%; 2.1%
9. Genome Medicine: 154 papers in training set; Top 3%; 2.1%
10. Genome Research: 409 papers in training set; Top 2%; 2.1%
11. Cell Systems: 167 papers in training set; Top 6%; 2.1%
12. Nature Machine Intelligence: 61 papers in training set; Top 2%; 1.5%
13. PLOS Computational Biology: 1633 papers in training set; Top 17%; 1.5%
14. Cell Reports Methods: 141 papers in training set; Top 4%; 1.0%
15. Bioinformatics Advances: 184 papers in training set; Top 4%; 1.0%
16. Nature Computational Science: 50 papers in training set; Top 1%; 1.0%
17. PLOS ONE: 4510 papers in training set; Top 63%; 0.9%
18. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 41%; 0.9%
19. Communications Biology: 886 papers in training set; Top 23%; 0.8%
20. NAR Genomics and Bioinformatics: 214 papers in training set; Top 4%; 0.8%
21. Scientific Reports: 3102 papers in training set; Top 75%; 0.7%
22. Nature: 575 papers in training set; Top 16%; 0.7%
23. Patterns: 70 papers in training set; Top 3%; 0.7%
24. Science Advances: 1098 papers in training set; Top 35%; 0.5%
25. BMC Bioinformatics: 383 papers in training set; Top 8%; 0.5%
26. GigaScience: 172 papers in training set; Top 4%; 0.5%