Considering Zeros in Single Cell Sequencing Data Correlation Analysis
Cai, G.; Yu, X.; Xiao, F.
Single-cell sequencing technology has enabled correlation analysis of genomic features at the cellular level. However, the high levels of noise and sparsity in single-cell sequencing data make it difficult to assess correlations accurately. This study provides a toolkit, SCSC (https://github.com/thecailab/SCSC), for estimating correlation coefficients in single-cell sequencing data. The study assessed four estimation strategies (classical, non-zero, dropout-weighted, and imputation) and the impact of data features across a range of simulated scenarios. It found that filtering out zeros significantly improves estimation accuracy, and that accounting for the dropout probability improves it further. The study also identified data features that affect correlation estimation, including expression level, library size, and biological variation.
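To make the difference between the strategies concrete, the sketch below contrasts a classical Pearson estimate (all cells, zeros included) with a non-zero estimate (only cells where both genes are observed) and a generic weighted Pearson estimate. It is an illustration only and does not use the SCSC toolkit or reproduce its method: the function names, the log-normal simulation, the 40% dropout rate, and the per-cell weights are all assumptions chosen for the example.

```python
import numpy as np


def classical_corr(x, y):
    """Pearson correlation over all cells, zeros included."""
    return float(np.corrcoef(x, y)[0, 1])


def nonzero_corr(x, y):
    """Pearson correlation over cells where both genes are non-zero."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = (x != 0) & (y != 0)
    if keep.sum() < 3:
        return float("nan")  # too few cells left to estimate anything
    return float(np.corrcoef(x[keep], y[keep])[0, 1])


def weighted_corr(x, y, w):
    """Weighted Pearson correlation with per-cell weights w.

    How the weights are derived (for example from an estimated dropout
    probability) is an assumption left to the caller.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    mx, my = np.sum(w * x), np.sum(w * y)
    cov = np.sum(w * (x - mx) * (y - my))
    vx = np.sum(w * (x - mx) ** 2)
    vy = np.sum(w * (y - my) ** 2)
    return float(cov / np.sqrt(vx * vy))


# Hypothetical simulation: two correlated log-normal genes with random dropout.
rng = np.random.default_rng(0)
n_cells = 2000
latent = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=n_cells)
expression = np.exp(latent)
dropped = rng.random((n_cells, 2)) < 0.4  # assumed 40% dropout per entry
observed = np.where(dropped, 0.0, expression)
gx, gy = observed[:, 0], observed[:, 1]

# Illustrative weights: strongly down-weight any cell that contains a zero.
weights = np.where((gx != 0) & (gy != 0), 1.0, 0.1)

print("true (no dropout):", classical_corr(expression[:, 0], expression[:, 1]))
print("classical:        ", classical_corr(gx, gy))
print("non-zero:         ", nonzero_corr(gx, gy))
print("weighted:         ", weighted_corr(gx, gy, weights))
```

With uniform weights, `weighted_corr` reduces to the classical estimate; with weights of zero on every cell that contains a zero, it reduces to the non-zero estimate. A dropout-weighted scheme sits between those two extremes.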