SimpleFold-Turbo: Adaptive Inference Caching Yields 14-fold Acceleration of Flow-Matching Protein Structure Prediction
Taghon, G.
We apply TeaCache, an adaptive caching technique from video diffusion, to SimpleFold's flow-matching protein structure prediction and achieve 9- to 14-fold inference speedups with negligible quality loss. We determine that flow matching's near-linear generative trajectories make consecutive neural-network evaluations highly redundant. At a low redundancy threshold, SimpleFold-Turbo (SF-T) skips approximately 93% of forward passes while preserving near-baseline template-modeling (TM) scores across 300 structurally diverse CATH domains and all six SimpleFold model sizes (100 million to 3 billion parameters), at compute budgets where log-uniform step-skipping collapses. Speedup scales with model size because caching overhead is constant while per-step cost grows, and a general three-phase skip pattern emerges independent of protein size or fold. SF-T requires no retraining, no weight modification, and no MSA server dependencies. We release SF-T as fully open-source software enabling thousands of structure predictions per hour on commodity hardware.
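The core idea of the abstract, reusing a cached network output whenever consecutive evaluations along the near-linear flow-matching trajectory are redundant, can be sketched in a minimal Euler sampler. This is a hypothetical illustration, not the SF-T implementation: the function name `flow_match_sample_with_cache`, the L1 state-change proxy, and the `threshold` parameter are all assumptions made for the sketch.

```python
import numpy as np

def flow_match_sample_with_cache(model, x0, n_steps=200, threshold=0.05):
    """Euler sampler for flow matching with adaptive caching (sketch).

    Hypothetical TeaCache-style rule: when the relative change in the state
    since the last full network evaluation falls below `threshold`, reuse
    the cached velocity instead of running another forward pass.
    """
    x = x0.copy()
    dt = 1.0 / n_steps
    cached_v = None   # velocity from the last full forward pass
    prev_x = None     # state at the last full forward pass
    n_evals = 0       # count of actual network evaluations
    for step in range(n_steps):
        t = step * dt
        if cached_v is not None:
            # Cheap redundancy proxy: relative L1 drift of the state
            # since the last evaluation (assumption for this sketch).
            rel_change = np.abs(x - prev_x).sum() / (np.abs(prev_x).sum() + 1e-8)
        if cached_v is None or rel_change > threshold:
            cached_v = model(x, t)  # full forward pass
            prev_x = x.copy()
            n_evals += 1
        # Euler update with the (possibly cached) velocity
        x = x + dt * cached_v
    return x, n_evals
```

With a toy linear velocity field such as `lambda x, t: -x`, the sampler reuses the cached velocity for several consecutive steps before the drift proxy exceeds the threshold, so `n_evals` is far below `n_steps`; a lower threshold trades speedup for fidelity, which mirrors the redundancy-threshold trade-off the abstract describes.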