In-silico cell sorting revealed granulocyte-specific single-cell-type gene expression from peripheral blood bulk expression data and its application as host response biomarkers to discriminate bacterial and viral infections

Tang, N. L.-s.; Kwan, T.-K.; Huang, J.; Tang, M. L.; Wang, X.; Wu, J.; Lai, C.; Lui, G.; Ma, S.-L.; Leung, K.-S.

2026-04-13 immunology
10.64898/2026.04.09.717385 bioRxiv

Peripheral blood transcriptome analysis evaluates the bulk transcript abundance (TA) across all leukocyte populations. However, there are two main problems in using bulk expression as biomarkers: (1) a long list of differentially expressed genes (DEGs) is typically found, and (2) the DEGs cannot be attributed to the host response of any specific cell type. TA assays after conventional cell sorting, the gold-standard method, are too tedious for routine use. Recently, we showed that by using a ratio-based biomarker (RBB, the ratio of two stringently selected genes), it is feasible to interrogate the gene expression of a single cell type (monocyte and B lymphocyte) directly in peripheral whole blood (WB). Here, we apply this in-silico cell sorting algorithm (DIRECT LS-TA, Direct Leukocyte Single cell-type Transcript Abundance) to granulocytes in WB samples to reveal RBBs specific to granulocytes. This DIRECT LS-TA approach, which requires no cell sorting, was applied to public datasets to differentiate two types of infection (bacterial vs viral). The following RBBs measured in WB correlate with the expression of the target (numerator) genes in purified granulocytes, so cell sorting can be avoided by using them: ARG1/SRGN, ANXA3/SRGN, RSAD2/SRGN. Together with the monocyte DIRECT LS-TA biomarker IFI27/PSAP, direct quantification of 4 genes provided optimal differentiation of viral from bacterial infection. Meta-analysis and unsupervised machine learning classification confirmed the superior performance of DIRECT LS-TA biomarkers. In-silico cell sorting thus identified pairs of genes that can be formulated as RBBs representing granulocyte gene expression in whole-blood cell-mixture samples; these RBBs were useful for triaging febrile patients into two major categories, viral and bacterial infection, with a high degree of sensitivity and specificity.
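
The abstract defines an RBB as the ratio of two selected genes measured in whole blood. A minimal sketch of that arithmetic follows, assuming linear-scale transcript abundances. The function names and the sample values are hypothetical and are not taken from the paper, and the gene selection step of DIRECT LS-TA is not shown.

```python
import math


def ratio_based_biomarker(numerator_ta, denominator_ta):
    """Return numerator/denominator for two linear-scale transcript abundances."""
    if numerator_ta <= 0 or denominator_ta <= 0:
        raise ValueError("transcript abundances must be positive")
    return numerator_ta / denominator_ta


def log2_ratio_based_biomarker(numerator_ta, denominator_ta):
    """Same ratio on the log2 scale, i.e. a difference of log2 abundances."""
    if numerator_ta <= 0 or denominator_ta <= 0:
        raise ValueError("transcript abundances must be positive")
    return math.log2(numerator_ta) - math.log2(denominator_ta)


# Hypothetical whole-blood abundances in arbitrary linear units.
# These numbers are illustrative only; they are NOT data from the study.
sample = {"ARG1": 120.0, "ANXA3": 80.0, "RSAD2": 40.0, "SRGN": 1600.0}

# The three granulocyte RBBs named in the abstract share SRGN as denominator.
rbbs = {
    f"{gene}/SRGN": ratio_based_biomarker(sample[gene], sample["SRGN"])
    for gene in ("ARG1", "ANXA3", "RSAD2")
}

for name, value in rbbs.items():
    print(name, value)
```

On the log2 scale the ratio becomes a simple difference, which is often convenient when the abundances come from log-transformed microarray or qPCR data.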

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Scientific Reports: 3102 papers in training set; Top 0.9%; 18.6%
2. Cytometry Part A: 30 papers in training set; Top 0.1%; 17.5%
3. PLOS ONE: 4510 papers in training set; Top 24%; 7.2%
4. Frontiers in Immunology: 586 papers in training set; Top 2%; 4.2%
5. BioMed Research International: 25 papers in training set; Top 0.9%; 2.9%

50% of probability mass above

6. International Journal of Molecular Sciences: 453 papers in training set; Top 4%; 2.4%
7. BMC Bioinformatics: 383 papers in training set; Top 4%; 2.4%
8. BMC Genomics: 328 papers in training set; Top 2%; 2.1%
9. Frontiers in Genetics: 197 papers in training set; Top 4%; 2.1%
10. iScience: 1063 papers in training set; Top 10%; 2.1%
11. Heliyon: 146 papers in training set; Top 1%; 1.9%
12. PLOS Computational Biology: 1633 papers in training set; Top 15%; 1.8%
13. Biosensors and Bioelectronics: 52 papers in training set; Top 0.9%; 1.5%
14. European Journal of Immunology: 57 papers in training set; Top 0.3%; 1.5%
15. Frontiers in Cellular and Infection Microbiology: 98 papers in training set; Top 3%; 1.5%
16. Journal of Virological Methods: 36 papers in training set; Top 0.4%; 1.2%
17. Bioinformatics: 1061 papers in training set; Top 8%; 1.2%
18. Briefings in Bioinformatics: 326 papers in training set; Top 5%; 1.1%
19. Physical Biology: 43 papers in training set; Top 2%; 0.9%
20. Nature Communications: 4913 papers in training set; Top 60%; 0.9%
21. Life: 27 papers in training set; Top 0.3%; 0.8%
22. Communications Biology: 886 papers in training set; Top 21%; 0.8%
23. Frontiers in Physiology: 93 papers in training set; Top 6%; 0.7%
24. NAR Genomics and Bioinformatics: 214 papers in training set; Top 4%; 0.7%
25. Cells: 232 papers in training set; Top 8%; 0.6%
26. Computational and Structural Biotechnology Journal: 216 papers in training set; Top 11%; 0.6%
27. Computational Biology and Chemistry: 23 papers in training set; Top 0.7%; 0.6%
28. npj Systems Biology and Applications: 99 papers in training set; Top 3%; 0.6%
29. Archives of Clinical and Biomedical Research: 28 papers in training set; Top 3%; 0.6%
30. Analytica Chimica Acta: 17 papers in training set; Top 0.7%; 0.6%