ClumPyCells resolves spatial aggregation in complex tissues overcoming size biases

Zhao, Z.; Cui, L.; Aguilar-Navarro, A. G.; Monajemzadeh, M.; Chang, Q.; Chen, Z.; Tsui, H.; Flores-Figueroa, E.; Schwartz, G. W.

2026-03-30 bioinformatics
10.64898/2026.03.26.714529 bioRxiv
The spatial arrangement of cells within a tissue microenvironment shapes their interactions and cell states, which are essential for tissue development, homeostasis, and disease. Spatial -omics technologies can precisely map the location of each cell within complex tissue structures, while also profiling their protein content and transcriptional diversity. Various approaches have been developed to analyze spatial patterns of cell aggregation, repulsion, or random distribution within tissues. However, differences in cell morphology within a tissue can introduce significant bias. Cell size in particular is not accounted for and introduces challenges when quantifying the aggregation of cells or their molecular features. To overcome such limitations, we present ClumPyCells: a statistical framework that measures cell and marker aggregation within tissue while correcting for size morphology. ClumPyCells enables interpretation of cell aggregation, bypassing interfering cell types or tissue regions unrelated to the desired spatial correlation. We demonstrate the capabilities of ClumPyCells across several tumor types, including melanoma and colorectal cancer, and spatial -omics technologies such as spatial transcriptomics and proteomics, while benchmarking how cell-size differences contribute to misinterpretations. By correcting for disruptive cell types within a region of interest, ClumPyCells will determine new tissue patterns and structures without morphological interference.

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Nature Communications (4913 papers in training set): Top 11%, 14.1%
2. Cell Systems (167 papers in training set): Top 0.8%, 12.1%
3. Bioinformatics (1061 papers in training set): Top 3%, 9.9%
4. Genome Biology (555 papers in training set): Top 1%, 6.2%
5. Nature Biotechnology (147 papers in training set): Top 2%, 4.2%
6. Molecular Systems Biology (142 papers in training set): Top 0.1%, 4.1%

50% of probability mass above

7. Molecular & Cellular Proteomics (158 papers in training set): Top 0.6%, 3.9%
8. PLOS Computational Biology (1633 papers in training set): Top 10%, 3.6%
9. Nature Methods (336 papers in training set): Top 3%, 3.6%
10. Cell Reports Methods (141 papers in training set): Top 1.0%, 3.5%
11. iScience (1063 papers in training set): Top 7%, 3.0%
12. PLOS ONE (4510 papers in training set): Top 44%, 2.7%
13. Nucleic Acids Research (1128 papers in training set): Top 9%, 2.0%
14. Scientific Reports (3102 papers in training set): Top 54%, 1.9%
15. Briefings in Bioinformatics (326 papers in training set): Top 5%, 1.3%
16. Advanced Science (249 papers in training set): Top 13%, 1.3%
17. Cancer Research Communications (46 papers in training set): Top 0.7%, 1.2%
18. Journal of Proteome Research (215 papers in training set): Top 2%, 1.2%
19. Communications Biology (886 papers in training set): Top 17%, 0.9%
20. Genome Research (409 papers in training set): Top 3%, 0.9%
21. Genome Medicine (154 papers in training set): Top 7%, 0.9%
22. Computational and Structural Biotechnology Journal (216 papers in training set): Top 10%, 0.7%
23. BMC Bioinformatics (383 papers in training set): Top 7%, 0.7%
24. npj Systems Biology and Applications (99 papers in training set): Top 3%, 0.7%
25. Science Advances (1098 papers in training set): Top 32%, 0.7%
26. Proceedings of the National Academy of Sciences (2130 papers in training set): Top 48%, 0.6%
27. NAR Genomics and Bioinformatics (214 papers in training set): Top 4%, 0.6%
28. Nature Machine Intelligence (61 papers in training set): Top 4%, 0.6%
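
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked from the listed percentages. The sketch below is only an illustration: the helper name `journals_to_reach` is hypothetical, and the cutoff rule (the smallest number of top-ranked journals whose summed probability reaches 50%) is an assumption about how the page derives that line, not something the page states.

```python
from itertools import accumulate

# Predicted probabilities (%) of the eight top-ranked journals, copied from the list.
probabilities = [14.1, 12.1, 9.9, 6.2, 4.2, 4.1, 3.9, 3.6]


def journals_to_reach(probs, threshold):
    """Return how many leading entries are needed for their sum to reach threshold.

    Returns None if the entries never reach the threshold.
    The small tolerance guards against floating-point rounding in the running sum.
    """
    for count, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold - 1e-9:
            return count
    return None


if __name__ == "__main__":
    print(journals_to_reach(probabilities, 50.0))
```

Under that assumption the first five journals sum to about 46.5% and the sixth brings the total to about 50.6%, which is consistent with the divider placed after Molecular Systems Biology.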