SEG: Segmentation Evaluation in absence of Ground truth labels

Sims, Z.; Strgar, L.; Thirumalaisamy, D.; Heussner, R.; Thibault, G.; Chang, Y. H.

2023-02-24 bioinformatics
10.1101/2023.02.23.529809 bioRxiv

Identifying individual cells or nuclei is often the first step in the analysis of multiplex tissue imaging (MTI) data. Recent plug-and-play, end-to-end MTI analysis tools such as MCMICRO [1], though groundbreaking in their usability and extensibility, often cannot guide users toward the most appropriate model for their segmentation task among an ever-growing number of novel segmentation methods. Unfortunately, evaluating segmentation results on a user's dataset without ground truth labels is either purely subjective or eventually amounts to performing the original, time-intensive annotation. As a consequence, researchers rely on models pre-trained on other large datasets for their unique tasks. Here, we propose a methodological approach for evaluating MTI nuclei segmentation methods in the absence of ground truth labels by scoring each method relative to a larger ensemble of segmentations. To avoid potential sensitivity to collective bias in the ensemble, we refine the ensemble via a weighted average across segmentation methods, with weights derived from a systematic model ablation study. First, we demonstrate a proof of concept and the feasibility of the proposed approach by evaluating segmentation performance on a small dataset with ground truth annotation. To validate the ensemble and demonstrate the importance of our method-specific weighting, we compare the ensemble's detection and pixel-level predictions, derived without supervision, with the data's ground truth labels. Second, we apply the methodology to a larger, unlabeled tissue microarray (TMA) dataset that includes a diverse set of breast cancer phenotypes. By systematically evaluating the performance of individual segmentation approaches across the entire dataset, we provide decision guidelines that help the general user choose the most suitable segmentation methods for their own dataset.
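
The idea of scoring each method against a weighted ensemble can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary pixel masks, a simple weighted pixel-wise vote as the consensus, and the Dice coefficient as the agreement score. The method names, masks, and weights are hypothetical placeholders; in the paper the weights come from a model ablation study, which is not reproduced here.

```python
def weighted_consensus(masks, weights, threshold=0.5):
    """Pixel-wise weighted vote over equally sized binary masks.

    A pixel is foreground in the consensus when the weighted fraction
    of methods marking it as foreground reaches `threshold`.
    """
    total = sum(weights)
    rows, cols = len(masks[0]), len(masks[0][0])
    consensus = []
    for r in range(rows):
        row = []
        for c in range(cols):
            vote = sum(w * m[r][c] for m, w in zip(masks, weights)) / total
            row.append(1 if vote >= threshold else 0)
        consensus.append(row)
    return consensus


def dice(a, b):
    """Dice coefficient between two binary masks (1.0 if both are empty)."""
    inter = sum(x & y for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    size = sum(map(sum, a)) + sum(map(sum, b))
    return 1.0 if size == 0 else 2 * inter / size


def score_methods(masks_by_method, weights_by_method):
    """Score every method by its agreement with the weighted consensus."""
    names = list(masks_by_method)
    consensus = weighted_consensus(
        [masks_by_method[n] for n in names],
        [weights_by_method[n] for n in names],
    )
    return {n: dice(masks_by_method[n], consensus) for n in names}


# Hypothetical 2x3 masks from three made-up methods (illustration only).
masks = {
    "method_a": [[1, 1, 0], [0, 0, 0]],
    "method_b": [[1, 1, 0], [0, 1, 0]],
    "method_c": [[0, 0, 0], [0, 1, 1]],
}
# Placeholder weights; equal weighting reduces to a majority vote.
weights = {"method_a": 1.0, "method_b": 1.0, "method_c": 1.0}

scores = score_methods(masks, weights)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.2f}")
```

With equal weights the consensus is a majority vote, so the method that agrees most with the others ranks highest. Lowering a method's weight reduces its influence on the consensus, which is how a weighting scheme can limit collective bias. This sketch covers pixel-level agreement only; a detection-level score would require matching individual nuclei between masks.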

Matching journals

The top 11 journals account for 50% of the predicted probability mass.

1. PLOS ONE, 4510 papers in training set, Top 25%, 6.9%
2. Biological Imaging, 15 papers in training set, Top 0.1%, 6.9%
3. PLOS Computational Biology, 1633 papers in training set, Top 5%, 6.9%
4. Bioinformatics, 1061 papers in training set, Top 4%, 6.4%
5. BMC Bioinformatics, 383 papers in training set, Top 2%, 4.0%
6. Nature Communications, 4913 papers in training set, Top 37%, 3.9%
7. Briefings in Bioinformatics, 326 papers in training set, Top 2%, 3.7%
8. Scientific Reports, 3102 papers in training set, Top 36%, 3.6%
9. GigaScience, 172 papers in training set, Top 0.5%, 3.6%
10. Nature Methods, 336 papers in training set, Top 3%, 3.1%
11. Advanced Science, 249 papers in training set, Top 7%, 2.9%

50% of probability mass above

12. Computational and Structural Biotechnology Journal, 216 papers in training set, Top 3%, 2.6%
13. iScience, 1063 papers in training set, Top 11%, 1.9%
14. Communications Biology, 886 papers in training set, Top 6%, 1.9%
15. Genome Biology, 555 papers in training set, Top 4%, 1.8%
16. Cell Reports Methods, 141 papers in training set, Top 2%, 1.7%
17. Frontiers in Bioinformatics, 45 papers in training set, Top 0.2%, 1.7%
18. npj Systems Biology and Applications, 99 papers in training set, Top 1%, 1.7%
19. Journal of Microscopy, 18 papers in training set, Top 0.3%, 1.3%
20. Cell Systems, 167 papers in training set, Top 8%, 1.3%
21. Small Methods, 26 papers in training set, Top 0.6%, 1.2%
22. Plant Physiology, 217 papers in training set, Top 2%, 1.2%
23. Biomedical Optics Express, 84 papers in training set, Top 0.8%, 1.1%
24. Patterns, 70 papers in training set, Top 2%, 1.0%
25. Nucleic Acids Research, 1128 papers in training set, Top 15%, 0.9%
26. Scientific Data, 174 papers in training set, Top 2%, 0.9%
27. Biology Methods and Protocols, 53 papers in training set, Top 2%, 0.8%
28. NAR Genomics and Bioinformatics, 214 papers in training set, Top 4%, 0.8%
29. eLife, 5422 papers in training set, Top 59%, 0.7%
30. Frontiers in Cell and Developmental Biology, 218 papers in training set, Top 10%, 0.7%