UCF-MultiOrgan-Path: A Public Benchmark Dataset of Histopathologic Images for Deep Learning Model Based Organ Classification

Hossain, M. S. B.; Piazza, Y.; Braun, J.; Bilic, A.; Hsieh, M.; Fouissi, S.; Borowsky, A.; Kaseb, H.; Fraser, A.; Wray, B.-A.; Chen, C.; Wang, L.; Husain, M.; Hadley, D.

2024-11-06 pathology
10.1101/2024.11.05.24316736

A pathologist typically diagnoses tissue samples by examining glass slides under a light microscope. The entire tissue specimen can also be stored digitally as a Whole Slide Image (WSI) for further analysis. However, managing and diagnosing large numbers of images manually is time-consuming and requires specialized expertise. Consequently, computer-aided diagnosis of pathology images is an active research area, with deep learning showing promise in disease classification and cancer cell segmentation. Robust deep learning models need many annotated images, but public datasets are scarce and often restricted to specific organs, cancer types, or binary classification tasks, which limits the generalizability of models trained on them. To address this, we introduce the UCF multi-organ histopathologic (UCF-MultiOrgan-Path) dataset, containing 977 WSIs from cadaver tissues across 15 organ classes, including lung, kidney, liver, and pancreas. The dataset includes approximately 2.38 million patches of 512×512 pixels. For technical validation, we provide patch-based and slide-based approaches for patch- and slide-level classification. The dataset can serve as a benchmark for training and validating deep learning models in multi-organ classification.
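
As a rough illustration of how patch-level and slide-level classification relate, the sketch below tiles a slide into 512×512 patches and combines per-patch predictions into a single slide label. It is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not say whether patches overlap or how patch predictions are aggregated, so the non-overlapping grid and the majority vote are assumptions, and `patch_origins` and `slide_label` are hypothetical helper names.

```python
from collections import Counter

PATCH_SIZE = 512  # patch edge length in pixels, as stated in the abstract


def patch_origins(width, height, patch=PATCH_SIZE):
    """Return top-left (x, y) coordinates of full, non-overlapping patches.

    Assumption: a regular non-overlapping grid; partial patches at the
    right and bottom edges are dropped.
    """
    return [
        (x, y)
        for y in range(0, height - patch + 1, patch)
        for x in range(0, width - patch + 1, patch)
    ]


def slide_label(patch_labels):
    """Aggregate per-patch predicted labels into one slide-level label.

    Assumption: simple majority vote; the source does not specify the
    aggregation rule.
    """
    if not patch_labels:
        raise ValueError("no patch predictions for this slide")
    return Counter(patch_labels).most_common(1)[0][0]
```

For example, a hypothetical 1024×1536-pixel region yields six patch origins, and a slide whose patches are mostly predicted as one organ class receives that class as its slide-level label.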

Matching journals

The top 4 journals account for 50% of the predicted probability mass.