SortIT - A Tool For Assessing Observer Variability And Creating Ground Truth Image Classification Datasets

Uegami, W.; Bisson, T.; Okoshi, E. N.; Costa da Silva, F. G.; Jiragawasan, C.; Zerbe, N.; Bychkov, A.; Fukuoka, J.

2026-05-29 pathology
10.64898/2026.05.28.728616 bioRxiv
Interobserver variability in pathological assessments is a well-recognized challenge that impacts diagnostic reliability and disease understanding. This variability exists across many subspecialties due to the subjective nature of evaluations. Artificial intelligence (AI) applied to whole slide images has the potential to standardize procedures and reduce variability in pathology, but transitioning to these technologies does not guarantee improvement. Establishing reliable ground truth datasets with consensus annotations is crucial for developing robust AI solutions. We introduce SortIT, an open-source web application that facilitates systematic creation and evaluation of ground truth image tile annotations. SortIT enables multiple annotators to independently label tiles, with flexible user permission controls. Annotated data can be exported for statistical analysis of observer variation and for creating ground truth datasets from consensus tiles. We outline protocols using SortIT for several use cases: (1) mitosis segmentation in tumor regions, (2) evaluating AI solutions for prostate cancer grading by comparing to expert consensus, and (3) granuloma classification by annotating discriminative tile-level features. A key strength of SortIT lies in its ease of deployment, which makes it accessible to a wide range of users. Overall, SortIT provides a valuable tool to establish high-quality ground truth datasets and comprehensively assess observer variability. Critical evaluation of ground truth quality using systematic annotation methodologies is crucial for developing accurate and generalizable diagnostic AI tools. SortIT's open-source nature facilitates community adoption and further development.
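
The abstract mentions two downstream uses of exported annotations: measuring observer variation and keeping only consensus tiles as ground truth. A minimal sketch of both steps follows. It assumes a hypothetical export layout (tile ID mapped to one label per annotator), since the abstract does not describe SortIT's actual export format, and it uses pairwise Cohen's kappa and a simple agreement threshold as stand-ins for whatever statistics the authors applied. The names `annotations`, `pairwise_kappa`, and `consensus_tiles` are illustrative only.

```python
from collections import Counter
from itertools import combinations

# Hypothetical export: tile ID -> {annotator: label}. Not SortIT's real format.
annotations = {
    "tile_01": {"A": "granuloma", "B": "granuloma", "C": "granuloma"},
    "tile_02": {"A": "granuloma", "B": "other", "C": "granuloma"},
    "tile_03": {"A": "other", "B": "other", "C": "other"},
    "tile_04": {"A": "other", "B": "granuloma", "C": "other"},
}


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same tiles in order."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("need two equally long, non-empty label lists")
    # Observed agreement: fraction of tiles with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:  # both annotators used a single identical label
        return 1.0
    return (observed - expected) / (1.0 - expected)


def pairwise_kappa(annotations):
    """Kappa for every annotator pair, computed on the tiles both labeled."""
    tiles = sorted(annotations)
    annotators = sorted({a for labels in annotations.values() for a in labels})
    result = {}
    for a, b in combinations(annotators, 2):
        shared = [t for t in tiles if a in annotations[t] and b in annotations[t]]
        result[(a, b)] = cohen_kappa(
            [annotations[t][a] for t in shared],
            [annotations[t][b] for t in shared],
        )
    return result


def consensus_tiles(annotations, min_agreement=1.0):
    """Keep tiles whose most common label reaches the agreement threshold."""
    consensus = {}
    for tile, labels in annotations.items():
        label, votes = Counter(labels.values()).most_common(1)[0]
        if votes / len(labels) >= min_agreement:
            consensus[tile] = label
    return consensus


if __name__ == "__main__":
    for pair, kappa in pairwise_kappa(annotations).items():
        print(pair, round(kappa, 3))
    print(consensus_tiles(annotations))
```

With the default `min_agreement=1.0` only unanimously labeled tiles are kept; lowering the threshold trades dataset size against label certainty. For more than two annotators, a multi-rater statistic such as Fleiss' kappa may be a better summary than the pairwise values shown here.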

Matching journals

The top 2 journals account for 50% of the predicted probability mass.

1. Journal of Pathology Informatics | 13 papers in training set | Top 0.1% | 33.6%
2. Modern Pathology | 21 papers in training set | Top 0.1% | 22.9%
(50% of probability mass above)
3. PLOS ONE | 4510 papers in training set | Top 24% | 6.9%
4. Scientific Reports | 3102 papers in training set | Top 31% | 3.9%
5. Biology Methods and Protocols | 53 papers in training set | Top 0.3% | 3.7%
6. Journal of Medical Imaging | 11 papers in training set | Top 0.1% | 2.4%
7. PLOS Computational Biology | 1633 papers in training set | Top 13% | 2.1%
8. npj Digital Medicine | 97 papers in training set | Top 2% | 1.5%
9. GigaScience | 172 papers in training set | Top 2% | 1.4%
10. The Journal of Pathology | 22 papers in training set | Top 0.2% | 1.2%
11. Nature Communications | 4913 papers in training set | Top 56% | 1.2%
12. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 6% | 1.1%
13. Biological Imaging | 15 papers in training set | Top 0.2% | 1.0%
14. Breast Cancer Research | 32 papers in training set | Top 0.5% | 0.8%
15. BMC Medical Informatics and Decision Making | 39 papers in training set | Top 2% | 0.8%
16. iScience | 1063 papers in training set | Top 28% | 0.8%
17. Communications Medicine | 85 papers in training set | Top 0.9% | 0.8%
18. Physics in Medicine & Biology | 17 papers in training set | Top 0.5% | 0.8%
19. Computers in Biology and Medicine | 120 papers in training set | Top 4% | 0.8%
20. The Lancet Digital Health | 25 papers in training set | Top 1% | 0.7%
21. Journal of Visualized Experiments | 30 papers in training set | Top 0.9% | 0.7%
22. npj Precision Oncology | 48 papers in training set | Top 2% | 0.5%
23. Clinical Cancer Research | 58 papers in training set | Top 2% | 0.5%