
Accurate, fast and memory efficient quantification of immune cell phenotypes in cytometry using machine learning

Exner, T.; Hackert, N. S.; Pohl, F.; Osmanusta, G.; Schmidt, F.; Lorenz, H.-M.; Wabnitz, G.; Schett, G.; Graw, F.; Henes, J.; Grieshaber-Bouyer, R.

2024-07-29 bioinformatics
10.1101/2024.07.26.605341 bioRxiv

To achieve accurate and reproducible cytometry data analysis, we benchmarked 19 machine learning algorithms for supervised and unsupervised cell classification. The underlying data encompassed 138 million cells from seven independent datasets including conventional flow cytometry, spectral flow cytometry and mass cytometry. We found that tree-based classifiers, and in particular decision trees, outperformed other approaches in classification accuracy, speed and memory use. Using decision trees, high accuracy was achieved even for cell populations rarer than 1%. We validated our decision tree-based approach in a clinical setting using diagnostic blood T cell phenotyping of 107 patients. Automatic quantification of CD4 helper T cell phenotypes achieved 99% accuracy compared to manual expert assessment. Finally, we combined automated data transformation, supervised and unsupervised gating, an application programming interface and a user-friendly desktop application into FACSPy and FACSPyUI, a fast and scalable open-source toolbox for the analysis and visualization of cytometry data.
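To illustrate the kind of supervised cell classification the abstract describes, the following is a minimal sketch of a decision tree trained on labelled events. It does not use FACSPy or reproduce the authors' pipeline: it relies on scikit-learn's `DecisionTreeClassifier`, and the two-marker synthetic data, the 0.5% rare population, and the `max_depth=3` setting are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for transformed marker intensities (hypothetical values):
# 9,950 "common" cells and 50 "rare" cells (0.5% of events), two markers each.
common = rng.normal(loc=[1.0, 1.0], scale=0.3, size=(9950, 2))
rare = rng.normal(loc=[4.0, 4.0], scale=0.3, size=(50, 2))
X = np.vstack([common, rare])
y = np.array([0] * 9950 + [1] * 50)  # label 1 marks the rare population

# Stratify so the rare population appears in both the training and test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# A shallow tree is enough for this well-separated toy example.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

# Overall accuracy, plus recall on the rare population specifically,
# because overall accuracy alone is dominated by the common cells.
accuracy = clf.score(X_test, y_test)
rare_recall = (clf.predict(X_test[y_test == 1]) == 1).mean()
print(f"accuracy={accuracy:.3f}, rare-population recall={rare_recall:.3f}")
```

Reporting recall on the rare population alongside overall accuracy matters here: with a 0.5% population, a classifier that never predicts the rare class would still score 99.5% accuracy.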

Matching journals

The top 2 journals together account for at least 50% of the predicted probability mass (44.6% + 10.8% = 55.4%).

Rank  Journal                          Papers in training set  Percentile  Probability
1     Cytometry Part A                 30                      Top 0.1%    44.6%
2     Nature Communications            4913                    Top 16%     10.8%
----- 50% of probability mass above -----
3     PLOS Computational Biology       1633                    Top 7%      4.6%
4     Scientific Reports               3102                    Top 32%     3.9%
5     Communications Biology           886                     Top 1%      3.9%
6     PLOS ONE                         4510                    Top 44%     2.8%
7     Bioinformatics                   1061                    Top 7%      1.9%
8     Genome Medicine                  154                     Top 4%      1.8%
9     eLife                            5422                    Top 44%     1.6%
10    Nucleic Acids Research           1128                    Top 12%     1.4%
11    BMC Bioinformatics               383                     Top 5%      1.4%
12    Frontiers in Immunology          586                     Top 6%      1.0%
13    NAR Genomics and Bioinformatics  214                     Top 3%      0.8%
14    Science Advances                 1098                    Top 27%     0.8%
15    The Journal of Immunology        146                     Top 2%      0.8%
16    Nature Methods                   336                     Top 6%      0.8%
17    Clinical Chemistry               22                      Top 0.8%    0.8%
18    mAbs                             28                      Top 0.4%    0.7%
19    Patterns                         70                      Top 3%      0.7%
20    Nature Machine Intelligence      61                      Top 4%      0.5%
21    iScience                         1063                    Top 39%     0.5%
22    Cell Reports Methods             141                     Top 6%      0.5%
23    Genome Biology                   555                     Top 9%      0.5%
24    Advanced Science                 249                     Top 22%     0.5%
25    BMC Medical Genomics             36                      Top 2%      0.5%
26    ImmunoInformatics                11                      Top 0.3%    0.5%
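
The "50% of probability mass" cutoff in the list above can be checked with a running sum over the ranked probabilities. The sketch below is only an illustration of that arithmetic; the helper name `journals_to_reach` is hypothetical, and only the five highest-ranked probabilities are copied in.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for the five top-ranked journals,
# copied from the list above.
probabilities = [44.6, 10.8, 4.6, 3.9, 3.9]


def journals_to_reach(mass_percent, probs):
    """Return how many top-ranked entries are needed for the cumulative
    sum of probs to reach mass_percent, or None if it is never reached."""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= mass_percent:
            return count
    return None  # threshold not reached with the listed entries


top_n = journals_to_reach(50.0, probabilities)
print(top_n)
```

The first journal alone contributes 44.6%, which falls short of 50%; adding the second brings the cumulative total to 55.4%, so two journals are needed.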