A large expert-annotated single-cell peripheral blood dataset for hematological disease diagnostics

Shetab Boushehri, S.; Gruber, A.; Kazeminia, S.; Matek, C.; Spiekermann, K.; Pohlkamp, C.; Haferlach, T.; Marr, C.

2025-02-20 hematology
10.1101/2025.02.18.25322415 medRxiv

Distinguishing cell types in peripheral blood smears is critical for diagnosing blood diseases, such as leukemia subtypes. Artificial intelligence can assist in automating cell classification. For training robust machine learning algorithms, however, large and well-annotated single-cell datasets are pivotal. Here, we introduce a large, publicly available, annotated peripheral blood dataset comprising >40,000 single-cell images classified into 18 classes by cytomorphology experts from the Munich Leukemia Laboratory, the largest European laboratory for blood disease diagnostics. By making our dataset publicly available, we provide a valuable resource for medical and machine learning researchers and support the development of reliable and clinically relevant diagnostic tools for diagnosing hematological diseases.

Matching journals

The top 1 journal accounts for 50% of the predicted probability mass.

Rank  Journal                                            Papers in training set  Percentile  Probability
1     Cytometry Part A                                   30                      Top 0.1%    54.4%
      50% of probability mass above
2     npj Precision Oncology                             48                      Top 0.1%    4.5%
3     PLOS Computational Biology                         1633                    Top 8%      4.1%
4     Clinical Chemistry                                 22                      Top 0.2%    2.5%
5     PLOS ONE                                           4510                    Top 46%     2.5%
6     Scientific Reports                                 3102                    Top 49%     2.2%
7     Clinical Chemistry and Laboratory Medicine (CCLM)  12                      Top 0.1%    2.0%
8     Nature Communications                              4913                    Top 50%     1.8%
9     Science Advances                                   1098                    Top 16%     1.7%
10    Journal of The Royal Society Interface             189                     Top 3%      1.6%
11    Frontiers in Immunology                            586                     Top 5%      1.4%
12    Modern Pathology                                   21                      Top 0.3%    1.3%
13    Frontiers in Medicine                              113                     Top 5%      1.0%
14    eLife                                              5422                    Top 52%     0.9%
15    Heliyon                                            146                     Top 5%      0.8%
16    Transplantation                                    13                      Top 0.4%    0.8%
17    British Journal of Haematology                     15                      Top 0.4%    0.8%
18    Advanced Science                                   249                     Top 18%     0.8%
19    Diagnostics                                        48                      Top 2%      0.8%
20    Computers in Biology and Medicine                  120                     Top 5%      0.7%
21    Medical Image Analysis                             33                      Top 1%      0.7%
22    Leukemia                                           39                      Top 0.8%    0.7%
23    iScience                                           1063                    Top 39%     0.5%
24    PLOS Digital Health                                91                      Top 3%      0.5%
25    Genomics, Proteomics & Bioinformatics              171                     Top 7%      0.5%
26    Communications Biology                             886                     Top 31%     0.5%
27    Bioinformatics                                     1061                    Top 10%     0.5%
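
The page does not say how the "top 1 journal accounts for 50%" figure is derived. One plausible reading, sketched here as an assumption, is that it reports the smallest number of top-ranked journals whose predicted probabilities sum to at least 50%. The function name `journals_needed` is hypothetical; the probabilities are the first five values from the list above.

```python
from itertools import accumulate

# Predicted probabilities (in percent) of the five top-ranked journals,
# copied from the list above.
probabilities = [54.4, 4.5, 4.1, 2.5, 2.5]


def journals_needed(probs, threshold=50.0):
    """Return the smallest k such that the k largest probabilities
    sum to at least `threshold` (assumed reading of the summary line)."""
    ranked = sorted(probs, reverse=True)
    for k, cumulative in enumerate(accumulate(ranked), start=1):
        if cumulative >= threshold:
            return k
    return None  # threshold is never reached by the supplied values


print(journals_needed(probabilities))  # prints 1
```

Under this reading, the first journal alone (54.4%) already exceeds the 50% threshold, which matches the summary line.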