An extensive evaluation of single-cell RNA-Seq contrastive learning generative networks for intrinsic cell-types distribution estimation

Alsaggaf, I.; Buchan, D.; Wan, C.

2025-09-17 bioinformatics
10.1101/2025.09.15.675691 bioRxiv

Contrastive learning has already been widely used to handle single-cell RNA-Seq data due to its outstanding performance in transforming original data distributions into hypersphere feature spaces. In this work, we conduct a large-scale empirical evaluation to investigate the generative encoder networks that are learned by five different state-of-the-art single-cell RNA-Seq contrastive learning methods. Unlike the conventional discriminative model-based cell-type prediction studies, this work is focused on the performance of contrastive learning-based generative encoder networks in terms of their capacity to estimate the intrinsic distributions of different cell-types - a fundamental property that directly affects the performance of any downstream single-cell RNA-Seq data analytics. The experimental results confirm that supervised contrastive learning-based encoder networks lead to better performance than self-supervised contrastive learning-based encoder networks, and the recently proposed Gaussian noise augmentation-based single-cell RNA-Seq contrastive learning method shows the best performance on estimating the intrinsic distribution of different cell-types.

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Briefings in Bioinformatics: 326 papers in training set, Top 0.1%, 33.1%
2. Bioinformatics: 1061 papers in training set, Top 3%, 8.4%
3. Nature Machine Intelligence: 61 papers in training set, Top 0.6%, 4.9%
4. Frontiers in Genetics: 197 papers in training set, Top 2%, 4.0%
50% of probability mass above
5. Nucleic Acids Research: 1128 papers in training set, Top 5%, 3.7%
6. PLOS Computational Biology: 1633 papers in training set, Top 10%, 3.6%
7. BMC Bioinformatics: 383 papers in training set, Top 3%, 2.6%
8. IEEE Journal of Biomedical and Health Informatics: 34 papers in training set, Top 0.7%, 2.4%
9. Advanced Science: 249 papers in training set, Top 8%, 2.4%
10. Nature Communications: 4913 papers in training set, Top 47%, 2.1%
11. Genomics, Proteomics & Bioinformatics: 171 papers in training set, Top 3%, 1.9%
12. IEEE Transactions on Computational Biology and Bioinformatics: 17 papers in training set, Top 0.2%, 1.9%
13. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 5%, 1.5%
14. Communications Biology: 886 papers in training set, Top 11%, 1.5%
15. NAR Genomics and Bioinformatics: 214 papers in training set, Top 3%, 1.2%
16. Scientific Reports: 3102 papers in training set, Top 66%, 1.2%
17. Bioinformatics Advances: 184 papers in training set, Top 4%, 1.1%
18. Genome Biology: 555 papers in training set, Top 6%, 1.0%
19. Quantitative Biology: 11 papers in training set, Top 0.5%, 1.0%
20. PLOS ONE: 4510 papers in training set, Top 62%, 1.0%
21. BMC Genomics: 328 papers in training set, Top 4%, 0.9%
22. iScience: 1063 papers in training set, Top 34%, 0.7%
23. Computers in Biology and Medicine: 120 papers in training set, Top 5%, 0.6%
24. Journal of Computational Biology: 37 papers in training set, Top 0.7%, 0.6%
25. Genome Research: 409 papers in training set, Top 5%, 0.6%