Benchmarking Static Gene Regulatory Network Reconstruction and Dynamic Transition Probing in Single-Cell Foundation Models.
Ye, Z.; Yang, N.; Yang, X.; Mao, X.; Tang, C.
Single-cell foundation models may encode gene regulatory information, but it remains unclear which model components capture this signal and how it compares with conventional inference methods. Here, we introduce a unified benchmark that evaluates gene regulatory network (GRN) reconstruction from six single-cell foundation models and three classical baselines across six datasets and four reference network types. We disentangle three sources of regulatory signal within each model: pretrained token embeddings, final-layer hidden states, and attention-derived scores. Under a strict zero-shot setting, scGPT token-embedding similarity outperforms classical baselines on STRING and ChIP-seq references, recovers core transcription factors, and best preserves reference network topology. Because static GRNs cannot test whether learned gene-gene relationships are predictive of expression dynamics, we also introduce dynamic transition probing, which iteratively applies a model's reconstruction head to drive early-cell profiles toward late-cell states without temporal supervision. We find that pretrained models capture meaningful developmental transitions, with scFoundation showing the strongest overall performance. Together, our results show that single-cell foundation models encode transferable regulatory and dynamical priors, but how well these priors can be recovered depends on model architecture, pretraining design, and extraction strategy.
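As a rough illustration of the two ideas named in the abstract, the sketch below scores candidate gene-gene edges by cosine similarity of token embeddings and repeatedly applies a reconstruction function to an early-cell profile. It is not the authors' implementation: the function names `embedding_similarity_grn` and `probe_transition`, the toy embeddings, and the `reconstruct` callable (a stand-in for a model's reconstruction head) are all assumptions made for this example.

```python
import numpy as np


def embedding_similarity_grn(embeddings, genes, top_k=3):
    """Rank undirected gene pairs by cosine similarity of their embeddings.

    Hypothetical helper: `embeddings` is an (n_genes, dim) array standing in
    for a model's pretrained token embeddings.
    """
    embeddings = np.asarray(embeddings, dtype=float)
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # avoid division by zero
    sim = unit @ unit.T
    rows, cols = np.triu_indices(len(genes), k=1)  # each pair once, no self-edges
    order = np.argsort(-sim[rows, cols])[:top_k]
    return [(genes[rows[o]], genes[cols[o]], float(sim[rows[o], cols[o]]))
            for o in order]


def probe_transition(early, reconstruct, n_steps=5):
    """Iteratively apply `reconstruct` to a profile and record every state.

    Hypothetical helper: `reconstruct` is any callable mapping an expression
    vector to a new expression vector (assumed here, not a real model API).
    """
    state = np.asarray(early, dtype=float)
    trajectory = [state]
    for _ in range(n_steps):
        state = reconstruct(state)
        trajectory.append(state)
    return np.stack(trajectory)


if __name__ == "__main__":
    # Toy data only: three made-up genes with two-dimensional embeddings.
    genes = ["A", "B", "C"]
    emb = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]])
    print(embedding_similarity_grn(emb, genes, top_k=2))

    # Toy "reconstruction head" that pulls a profile halfway toward a target.
    late = np.array([2.0, 0.0])
    traj = probe_transition([0.0, 2.0], lambda x: 0.5 * x + 0.5 * late)
    print(np.linalg.norm(traj - late, axis=1))  # distance to the late state per step
```

In a benchmark of this kind, the ranked edges would be compared against a reference network, and the recorded trajectory would be compared against observed late-cell profiles. The metrics for both comparisons are left out here because the abstract does not specify them.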