MDMR: Balancing Diversity and Redundancy for Annotation-Efficient Fine-Tuning of Pretrained Cell Segmentation Models

Sheikh, E. M.; Tharwat, A.; Schwan, C.; Schenck, W.

2025-11-05 bioengineering
10.1101/2025.11.04.686267 bioRxiv
Pretrained cell segmentation models have simplified and accelerated microscopy image analysis, but they often perform poorly on new and challenging datasets. Although these models can be adapted to new datasets with only a few annotated images, the effectiveness of fine-tuning depends critically on which images are selected for annotation. To address this, we propose CGMD (Centrality-Guided Maximum Diversity), a novel algorithm that identifies a small set of images that are maximally diverse with respect to each other in the pretrained feature space. We evaluate CGMD under an extremely low annotation budget of just two images per dataset for fine-tuning the pretrained Cellpose Cyto2 model on four different 2D+t datasets from the Cell Tracking Challenge. CGMD consistently outperforms six competitive active learning and subset selection methods and approaches the performance of fully supervised fine-tuning. The results show that centrality-guided maximum diversity subset selection enables stable and annotation-efficient fine-tuning of pretrained cell segmentation models. The code is publicly available at: https://github.com/eiram-mahera/cgmd.
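The abstract describes selecting a small, maximally diverse subset of images in the pretrained feature space, guided by centrality. The exact algorithm is defined in the paper and repository; the following is only a minimal sketch of one plausible reading (start from the most central embedding, then greedily add the image farthest from the current selection, i.e. max-min diversity). The function name `cgmd_select` and all details here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cgmd_select(features: np.ndarray, budget: int = 2) -> list[int]:
    """Illustrative sketch of centrality-guided maximum-diversity selection.

    features: (n_images, d) array of embeddings from a pretrained model.
    Returns the indices of `budget` selected images.
    """
    # Pairwise Euclidean distances in the feature space.
    diffs = features[:, None, :] - features[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)

    # Centrality guide (assumption): seed with the most central image,
    # i.e. the one with the smallest mean distance to all others.
    selected = [int(np.argmin(dist.mean(axis=1)))]

    # Greedy max-min diversity: repeatedly add the image whose nearest
    # already-selected neighbor is farthest away.
    while len(selected) < budget:
        min_dist = dist[:, selected].min(axis=1)
        min_dist[selected] = -np.inf  # never re-pick a selected image
        selected.append(int(np.argmax(min_dist)))
    return selected
```

With a budget of two, as in the paper's experiments, this sketch returns one central image plus the image most distant from it in feature space.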

Matching journals

The top-ranked journal alone accounts for over 50% of the predicted probability mass.

Rank  Journal                                          Papers in training set  Percentile  Probability
1     Nature Methods                                   336                     Top 0.1%    54.6%
2     Nature Communications                            4913                    Top 36%     4.2%
3     ACS Photonics                                    13                      Top 0.1%    3.8%
4     Cell Systems                                     167                     Top 3%      3.8%
5     Communications Biology                           886                     Top 1%      3.8%
6     IEEE Transactions on Medical Imaging             18                      Top 0.2%    2.7%
7     PLOS ONE                                         4510                    Top 47%     2.2%
8     Nature Machine Intelligence                      61                      Top 1%      2.2%
9     Journal of Cell Biology                          333                     Top 2%      1.9%
10    Proceedings of the National Academy of Sciences  2130                    Top 31%     1.8%
11    Light: Science & Applications                    16                      Top 0.3%    1.8%
12    Bioinformatics                                   1061                    Top 8%      1.4%
13    Science                                          429                     Top 17%     1.2%
14    Advanced Science                                 249                     Top 16%     0.9%
15    PLOS Computational Biology                       1633                    Top 23%     0.8%
16    ACS Nano                                         99                      Top 3%      0.8%
17    Scientific Reports                               3102                    Top 73%     0.8%
18    Nature Biotechnology                             147                     Top 7%      0.8%
19    Cell Reports                                     1338                    Top 36%     0.5%
20    Science Advances                                 1098                    Top 34%     0.5%
21    Nature Biomedical Engineering                    42                      Top 3%      0.5%
22    Nano Letters                                     63                      Top 3%      0.5%
23    eLife                                            5422                    Top 62%     0.5%