Recovering biological structure in sparse single-cell proteomics with GIRAFI
Zhong, H.; Chi, S.; Wong, R.; Rogalski, J.; Wang, Z.; Chan, S.; Bailey, M. L.; Ebrahimi, A.; Jayme, G.; Yin, J.; Gong, A.; Snutch, T. P.; Maier, C. S.; Marra, M. A.; Foster, L. J.; Tang, X.
Single-cell proteomics (SCP) based on liquid chromatography-mass spectrometry resolves protein-level cellular heterogeneity, but its interpretation remains limited by detection-linked sparsity. Because SCP profiles continuous, peptide-derived intensities and has lower throughput than single-cell RNA sequencing, denoising methods designed for large-scale, count-based transcriptomics are difficult to apply. Here we present GIRAFI, a graph-informed statistical learning framework that imputes missing values and reveals reproducible cell states by constraining inference to dataset-aware, prior-knowledge-informed protein neighborhoods. We evaluated GIRAFI across SCP datasets spanning diverse biological and technical contexts. In masking-based recovery experiments and cell-type-specific protein-protein interaction inference, GIRAFI outperformed existing methods; comparisons with matched bulk proteomics corroborated recovery accuracy, and ablations supported the graph-informed design. Beyond reducing replicate- and source-associated technical structure, GIRAFI recovered ground-truth cell-type annotations, improved cell-state-resolved pathway analysis, and enabled trajectory inference consistent with known time courses. These results establish graph-constrained imputation as an effective strategy for improving the robustness, biological structure, interpretability, and cross-dataset comparability of SCP data.
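To make the idea of a masking-based recovery experiment concrete, the following is a minimal sketch and not the GIRAFI implementation. It assumes a cells-by-proteins intensity matrix with NaN marking missing values and a binary protein adjacency matrix. The neighbor-mean imputer and the function names `neighbor_mean_impute` and `masked_recovery_rmse` are hypothetical stand-ins chosen for illustration; the abstract does not specify the model itself.

```python
import numpy as np


def neighbor_mean_impute(X, adjacency):
    """Fill NaNs in X (cells x proteins) with the mean of the observed
    intensities of graph-neighbor proteins measured in the same cell.

    Illustrative stand-in only; this is NOT the GIRAFI model.
    adjacency[j, k] == 1 means protein k is a neighbor of protein j.
    """
    X = np.asarray(X, dtype=float)
    A = np.asarray(adjacency, dtype=float)
    observed = ~np.isnan(X)
    X0 = np.where(observed, X, 0.0)

    # Per cell and protein: sum and count of observed neighbor intensities.
    neighbor_sum = X0 @ A.T
    neighbor_cnt = observed.astype(float) @ A.T

    # Fallback when no neighbor is observed: per-protein mean across cells
    # (0.0 if the protein was never observed at all).
    col_cnt = observed.sum(axis=0)
    col_mean = np.where(col_cnt > 0, X0.sum(axis=0) / np.maximum(col_cnt, 1), 0.0)

    estimate = np.where(
        neighbor_cnt > 0, neighbor_sum / np.maximum(neighbor_cnt, 1), col_mean
    )
    # Observed values are kept untouched; only missing entries are filled.
    return np.where(observed, X, estimate)


def masked_recovery_rmse(X, adjacency, mask_fraction=0.1, seed=0):
    """Hide a random fraction of the observed entries, impute them, and
    return the RMSE between the imputed and the held-out true values."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    obs_idx = np.argwhere(~np.isnan(X))
    n_hide = max(1, int(round(mask_fraction * len(obs_idx))))
    hide = obs_idx[rng.choice(len(obs_idx), size=n_hide, replace=False)]

    X_masked = X.copy()
    X_masked[hide[:, 0], hide[:, 1]] = np.nan

    X_hat = neighbor_mean_impute(X_masked, adjacency)
    err = X_hat[hide[:, 0], hide[:, 1]] - X[hide[:, 0], hide[:, 1]]
    return float(np.sqrt(np.mean(err ** 2)))
```

Any imputer with the same signature could be swapped in for `neighbor_mean_impute`, so the same masking loop can compare methods on identical held-out entries.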