
Representation Learning Methods for Single-Cell Microscopy are Confounded by Background Cells

Gupta, A.; Moses, A.; Lu, A. X.

2025-06-30 bioinformatics
10.1101/2025.06.26.661577 bioRxiv

Deep learning models are widely used to extract feature representations from microscopy images. Although these models are used for single-cell analyses, such as studying single-cell heterogeneity, they typically operate on image crops centered on individual cells, and these crops also contain background information such as other cells. It remains unclear to what extent this background alters the conclusions of single-cell analyses. In this paper, we introduce a novel evaluation framework that directly tests the robustness of crop-based models to background information. We create synthetic single-cell crops in which the center cell's localization is fixed and the background is swapped, e.g., with backgrounds from other protein localizations. We then measure how different backgrounds affect localization classification performance using model-extracted features. Applying this framework to three leading models for single-cell microscopy in the analysis of yeast protein localization, we find that all lack robustness to background cells. Localization classification accuracy drops by up to 15.8% when background cells differ in localization from the center cell compared to when the localization is the same. We further show that this lack of robustness can affect downstream biological analyses, such as estimating the proportions of cells for proteins with single-cell heterogeneity in localization. Ultimately, our framework provides a concrete way to evaluate single-cell model robustness to background information and highlights the importance of learning background-invariant features for reliable single-cell analysis.
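The background-swap idea in the abstract can be illustrated with a minimal sketch. It assumes a per-pixel mask of the center cell is available and represents crops as nested lists. The function names `swap_background` and `accuracy` and all toy values are hypothetical. The abstract does not specify how the authors build their synthetic crops, so this is not their implementation.

```python
def swap_background(center_crop, center_mask, background_crop):
    """Keep the masked center-cell pixels; take every other pixel from the new background.

    All three arguments are same-sized 2D nested lists (an assumption of this sketch).
    """
    if not (len(center_crop) == len(center_mask) == len(background_crop)):
        raise ValueError("inputs must have the same number of rows")
    return [
        [c if m else b for c, m, b in zip(c_row, m_row, b_row)]
        for c_row, m_row, b_row in zip(center_crop, center_mask, background_crop)
    ]


def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Toy 3x3 crop: the center pixel belongs to the center cell, the rest is background.
center_crop = [[5, 5, 5], [5, 9, 5], [5, 5, 5]]
center_mask = [[False, False, False], [False, True, False], [False, False, False]]
other_background = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
synthetic = swap_background(center_crop, center_mask, other_background)

# Made-up labels, purely to exercise the comparison (not results from the paper).
true_labels = ["nucleus", "nucleus", "cytoplasm", "cytoplasm"]
preds_same_background = ["nucleus", "nucleus", "cytoplasm", "cytoplasm"]
preds_swapped_background = ["nucleus", "cytoplasm", "cytoplasm", "cytoplasm"]
accuracy_gap = accuracy(preds_same_background, true_labels) - accuracy(
    preds_swapped_background, true_labels
)
```

A gap near zero would suggest the extracted features ignore the background, while a large gap would point to the kind of confounding the abstract describes.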

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1 | Bioinformatics | 1061 papers in training set | Top 2% | 14.5%
2 | Biological Imaging | 15 papers in training set | Top 0.1% | 9.2%
3 | BMC Bioinformatics | 383 papers in training set | Top 2% | 6.4%
4 | Scientific Reports | 3102 papers in training set | Top 17% | 6.4%
5 | PLOS Computational Biology | 1633 papers in training set | Top 5% | 6.4%
6 | PLOS ONE | 4510 papers in training set | Top 34% | 4.4%
7 | Journal of Microscopy | 18 papers in training set | Top 0.1% | 2.8%
50% of probability mass above
8 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 3% | 2.1%
9 | Biomedical Optics Express | 84 papers in training set | Top 0.6% | 2.1%
10 | BMC Methods | 11 papers in training set | Top 0.1% | 1.7%
11 | Frontiers in Bioinformatics | 45 papers in training set | Top 0.2% | 1.7%
12 | eLife | 5422 papers in training set | Top 41% | 1.7%
13 | Molecular Biology of the Cell | 272 papers in training set | Top 1% | 1.7%
14 | Briefings in Bioinformatics | 326 papers in training set | Top 4% | 1.7%
15 | Frontiers in Genetics | 197 papers in training set | Top 7% | 1.2%
16 | Physical Biology | 43 papers in training set | Top 2% | 1.1%
17 | Developmental Biology | 134 papers in training set | Top 2% | 1.1%
18 | Biology Methods and Protocols | 53 papers in training set | Top 2% | 0.9%
19 | Biophysical Journal | 545 papers in training set | Top 4% | 0.9%
20 | Bioinformatics Advances | 184 papers in training set | Top 4% | 0.9%
21 | Nature Methods | 336 papers in training set | Top 5% | 0.9%
22 | Development | 440 papers in training set | Top 3% | 0.8%
23 | Journal of Structural Biology | 58 papers in training set | Top 2% | 0.8%
24 | Patterns | 70 papers in training set | Top 3% | 0.7%
25 | Limnology and Oceanography: Methods | 11 papers in training set | Top 0.4% | 0.7%
26 | Cytometry Part A | 30 papers in training set | Top 0.3% | 0.7%
27 | Frontiers in Bioengineering and Biotechnology | 88 papers in training set | Top 3% | 0.7%
28 | Applied Sciences | 24 papers in training set | Top 1% | 0.7%
29 | Plant Methods | 39 papers in training set | Top 0.9% | 0.7%
30 | Journal of Cell Science | 353 papers in training set | Top 3% | 0.7%
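
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked from the listed percentages. The sketch below copies the first ten values from the ranking. The helper name `journals_needed` is hypothetical, and because the listed values are rounded, the sum only approximates the underlying probabilities.

```python
# Probabilities (in percent) copied from the first ten journals in the ranking.
probabilities = [14.5, 9.2, 6.4, 6.4, 6.4, 4.4, 2.8, 2.1, 2.1, 1.7]


def journals_needed(probs, threshold=50.0):
    """Return the smallest number of top-ranked entries whose sum reaches the threshold.

    Returns None if the threshold is never reached.
    """
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count
    return None


top_k = journals_needed(probabilities)
```

With these rounded values the running total first reaches the threshold at the seventh entry, which agrees with the cutoff marked in the ranking.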