
Multiple molecular and cellular properties jointly affect protein and site-specific evolutionary rates

Saini, A.; Usmanova, D. R.; Supo Escalante, R.; Vitkup, D.

2026-05-23 evolutionary biology
10.64898/2026.05.20.726710 bioRxiv

Protein evolutionary rates vary widely across proteins and among sites within proteins, reflecting multiple molecular, cellular, and functional constraints. Protein-level properties, such as expression and essentiality, and site-level structural and functional constraints are known to influence evolutionary rates, but how these constraints combine across scales to determine site-specific evolutionary rates remains unclear. Moreover, because many protein features are strongly correlated, it is difficult to disentangle their individual contributions to evolutionary rate variance, and unified predictive models that integrate these properties are still lacking. Here, we use neural networks to predict protein evolutionary rates at both the protein and site levels from multiple molecular and cellular features. At the protein level, integrating molecular and cellular descriptors explains nearly 50% of the variance in evolutionary rates across human proteins and substantial fractions of the variance in other eukaryotic species. The model also allows us to identify proteins whose evolutionary rates deviate from expectations based on their molecular and cellular properties. At the site level, we find that structural and functional features explain a comparable fraction of the variance in relative evolutionary rates. By integrating protein-level and site-level predictors, the model explains up to 37% of the variance in site-specific evolutionary rates across proteins. Our analysis demonstrates that constraints at these two scales combine largely additively, with protein-level properties setting the overall evolutionary context and site-level properties shaping variation within proteins. Together, these results provide a quantitative framework for understanding protein evolution across biological scales.
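To make the phrase "combine largely additively" concrete, here is a toy sketch on synthetic data. It is not the authors' neural network and uses none of the paper's data: the additive model, the unit variances, the sample sizes, and every name in the code are assumptions chosen only for illustration. When a protein-level term and an independent site-level term add up to the site rate, the variance explained by the two together is roughly the sum of what each explains alone.

```python
import random


def r_squared(observed, predicted):
    """Fraction of variance in `observed` explained by `predicted` (no refitting)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


rng = random.Random(0)  # fixed seed so the toy example is reproducible

# Hypothetical additive model: site rate = protein term + site term + noise.
protein_term, site_term, log_rate = [], [], []
for _ in range(200):                 # 200 synthetic proteins
    p = rng.gauss(0.0, 1.0)          # one protein-level effect per protein
    for _ in range(50):              # 50 synthetic sites per protein
        s = rng.gauss(0.0, 1.0)      # independent site-level effect
        noise = rng.gauss(0.0, 1.0)  # unexplained variation
        protein_term.append(p)
        site_term.append(s)
        log_rate.append(p + s + noise)

r2_protein = r_squared(log_rate, protein_term)
r2_site = r_squared(log_rate, site_term)
r2_both = r_squared(log_rate, [p + s for p, s in zip(protein_term, site_term)])

print(f"protein-level only: {r2_protein:.2f}")
print(f"site-level only:    {r2_site:.2f}")
print(f"both combined:      {r2_both:.2f}")
```

In this toy setup the combined value is close to the sum of the two single-scale values because the two terms are independent by construction. Correlated or interacting features, which the abstract notes are common in real proteins, would break that equality.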

Matching journals

The top 3 journals together account for at least 50% of the predicted probability mass.

1. Molecular Biology and Evolution (488 papers in training set): Top 0.1%, 22.7%
2. Proceedings of the National Academy of Sciences (2130 papers in training set): Top 0.6%, 22.7%
3. PLOS Computational Biology (1633 papers in training set): Top 4%, 8.5%

50% of probability mass above

4. eLife (5422 papers in training set): Top 13%, 6.4%
5. PLOS Biology (408 papers in training set): Top 2%, 4.3%
6. Evolution (199 papers in training set): Top 1.0%, 2.8%
7. Scientific Reports (3102 papers in training set): Top 47%, 2.4%
8. Journal of Molecular Evolution (21 papers in training set): Top 0.1%, 2.1%
9. Nature Communications (4913 papers in training set): Top 47%, 2.1%
10. PLOS Genetics (756 papers in training set): Top 8%, 1.8%
11. Genome Biology and Evolution (280 papers in training set): Top 1.0%, 1.7%
12. Nature Ecology & Evolution (113 papers in training set): Top 2%, 1.7%
13. Evolution Letters (71 papers in training set): Top 1%, 1.7%
14. Proceedings of the Royal Society B: Biological Sciences (341 papers in training set): Top 4%, 1.5%
15. Science Advances (1098 papers in training set): Top 23%, 1.2%
16. PLOS ONE (4510 papers in training set): Top 59%, 1.2%
17. Genome Biology (555 papers in training set): Top 6%, 0.9%
18. Virus Evolution (140 papers in training set): Top 1%, 0.8%
19. BMC Ecology and Evolution (49 papers in training set): Top 2%, 0.6%
20. Genetics (225 papers in training set): Top 5%, 0.6%
21. Bioinformatics (1061 papers in training set): Top 10%, 0.6%
22. Communications Biology (886 papers in training set): Top 28%, 0.6%
23. NAR Genomics and Bioinformatics (214 papers in training set): Top 5%, 0.5%
24. New Phytologist (309 papers in training set): Top 5%, 0.5%
25. BMC Biology (248 papers in training set): Top 7%, 0.5%
26. Molecular Ecology (304 papers in training set): Top 5%, 0.5%
27. Current Biology (596 papers in training set): Top 16%, 0.5%
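
The 50% cutoff in the ranked list can be checked by accumulating the listed probabilities in rank order. The short sketch below does that for the first five entries, copied from the list above. The function name `journals_to_reach` is hypothetical, and the page does not say how its percentages are rounded, so the total is approximate.

```python
# (journal, predicted probability in percent), in ranked order,
# copied from the first five entries of the list above.
ranked = [
    ("Molecular Biology and Evolution", 22.7),
    ("Proceedings of the National Academy of Sciences", 22.7),
    ("PLOS Computational Biology", 8.5),
    ("eLife", 6.4),
    ("PLOS Biology", 4.3),
]


def journals_to_reach(entries, threshold_pct):
    """Return (count, cumulative_pct) for the shortest ranked prefix
    whose summed probability reaches threshold_pct, or None if it never does."""
    total = 0.0
    for count, (_journal, pct) in enumerate(entries, start=1):
        total += pct
        if total >= threshold_pct:
            return count, total
    return None


count, mass = journals_to_reach(ranked, 50.0)
print(count, round(mass, 1))
```

The first two journals sum to 45.4%, so the third is needed to cross 50%, and the first three together carry about 53.9% of the mass.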