Multiple molecular and cellular properties jointly affect protein and site-specific evolutionary rates
Saini, A.; Usmanova, D. R.; Supo Escalante, R.; Vitkup, D.
Evolutionary rates vary widely both across proteins and among sites within a protein, reflecting multiple molecular, cellular, and functional constraints. Protein-level properties, such as expression and essentiality, and site-level structural and functional constraints are known to influence evolutionary rates, but how these constraints combine across scales to determine site-specific rates remains unclear. Moreover, because many protein features are strongly correlated, it is difficult to disentangle their individual contributions to evolutionary rate variance, and unified predictive models that integrate these properties are still lacking. Here, we use neural networks to predict evolutionary rates at both the protein level and the site level from molecular and cellular features. At the protein level, integrating molecular and cellular descriptors explains nearly 50% of the variance in evolutionary rates across human proteins and substantial fractions of the variance in other eukaryotic species. The model also allows us to identify proteins whose evolutionary rates deviate from expectations based on their molecular and cellular properties. At the site level, structural and functional features explain a comparable fraction of the variance in relative evolutionary rates. By integrating protein-level and site-level predictors, the model explains up to 37% of the variance in site-specific evolutionary rates across proteins. Our analysis demonstrates that constraints at these two scales combine largely additively, with protein-level properties setting the overall evolutionary context and site-level properties shaping variation within proteins. Together, these results provide a quantitative framework for understanding protein evolution across biological scales.
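To make "largely additive" concrete, the toy sketch below simulates a log-scale site rate as the sum of a protein-level term, a site-level term, and noise, and then compares the fraction of variance (R²) captured by each term alone and by their sum. Everything here is an assumption for illustration: the synthetic Gaussian data, the log-scale additive form, and the names `simulate` and `r_squared` are hypothetical and are not the authors' neural-network model or data.

```python
import numpy as np


def r_squared(y, y_hat):
    """Fraction of the variance in y explained by the predictions y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def simulate(n_proteins=200, sites_per_protein=50, seed=0):
    """Synthetic log rates: protein-level term + site-level term + noise.

    Assumption: the two scales are independent and add on a log scale.
    """
    rng = np.random.default_rng(seed)
    protein_term = rng.normal(0.0, 1.0, size=n_proteins)
    site_term = rng.normal(0.0, 1.0, size=(n_proteins, sites_per_protein))
    noise = rng.normal(0.0, 1.5, size=(n_proteins, sites_per_protein))
    # Broadcast the per-protein term across all sites of that protein.
    log_rate = protein_term[:, None] + site_term + noise
    return protein_term, site_term, log_rate


protein_term, site_term, log_rate = simulate()

# Protein-only prediction: every site in a protein gets the same value.
protein_pred = np.broadcast_to(protein_term[:, None], log_rate.shape)

r2_protein = r_squared(log_rate, protein_pred)
r2_site = r_squared(log_rate, site_term)
r2_both = r_squared(log_rate, protein_pred + site_term)

print(f"protein-level only: R^2 = {r2_protein:.3f}")
print(f"site-level only:    R^2 = {r2_site:.3f}")
print(f"combined (sum):     R^2 = {r2_both:.3f}")
```

In this toy setting the combined R² is close to the sum of the two individual values, which is the signature of additive, roughly independent contributions. With correlated or interacting predictors the combined value would depart from that sum.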