
The most probable ancestral sequence reconstruction yields proteins without systematic bias in thermal stability or activity

Theobald, D.; Sennett, M. A.; Beckett, B. C.

2023-02-22 biochemistry
10.1101/2023.02.22.529562 bioRxiv

Ancestral sequence resurrection (ASR) is the inference of extinct biological sequences from extant sequences; the most popular ASR methods are based on probabilistic models of evolution. ASR is becoming a popular method for studying the evolution of enzyme characteristics. Resurrected ancestral enzymes are characterized biochemically and biophysically to learn about the origin of a given enzyme property. Current methodology relies on resurrecting the single most probable (SMP) sequence, a choice that is systematically biased. Previous theoretical work suggests that this bias will skew the thermostability, and even the activity, of resurrected SMP sequences, calling into question inferences derived from ancestral protein properties. We experimentally test this stability-bias hypothesis by resurrecting 40 malate and lactate dehydrogenases. Despite the methodological bias in resurrecting an SMP protein, the measured biophysical and biochemical properties of the SMP protein are not biased in comparison to other, less probable, resurrections. In addition, the properties of the SMP protein appear to be representative of the ancestral probability distribution. Therefore, conclusions and inferences drawn from the SMP protein are unlikely to be biased.

Significance

Ancestral sequence resurrection (ASR) is a powerful tool for determining how new protein functions evolve, inferring the properties of the environments in which species existed, and protein engineering. We demonstrate, using lactate and malate dehydrogenases (L/MDHs), that resurrecting the single most probable (SMP) sequence from a maximum likelihood phylogeny does not result in biased activity and stability relative to sequences sampled from the posterior probability distribution. Previous studies that used experimentally measured phenotypes of SMP sequences to make inferences about environmental conditions and the path of evolution are therefore unlikely to be biased in their conclusions. Serendipitously, we find that ASR is also a valid tool for protein engineering, because sampled reconstructions are both highly active and stable.
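
The distinction between the SMP sequence and sequences sampled from the posterior distribution can be illustrated with a minimal Python sketch. This is a toy example, not the authors' pipeline: the `posterior` table, its residues and probabilities, and the function names are all hypothetical, and sites are assumed to be independent so that a sequence's probability is the product of its per-site posterior probabilities.

```python
import math
import random

# Hypothetical per-site posterior probabilities for a 4-residue ancestor.
# Each dict maps amino acid -> posterior probability and sums to 1.
# These numbers are invented for illustration only.
posterior = [
    {"M": 0.90, "L": 0.10},
    {"K": 0.55, "R": 0.45},
    {"A": 0.40, "S": 0.35, "G": 0.25},
    {"D": 0.70, "E": 0.30},
]

def smp_sequence(posterior):
    """Pick the most probable residue at every site (the SMP sequence)."""
    return "".join(max(site, key=site.get) for site in posterior)

def sample_sequence(posterior, rng):
    """Draw one residue per site in proportion to its posterior probability."""
    return "".join(
        rng.choices(list(site), weights=list(site.values()), k=1)[0]
        for site in posterior
    )

def log_probability(seq, posterior):
    """Sum of per-site log posteriors (assumes sites are independent)."""
    return sum(math.log(site[res]) for res, site in zip(seq, posterior))

rng = random.Random(0)  # fixed seed so repeated runs draw the same samples
smp = smp_sequence(posterior)
samples = [sample_sequence(posterior, rng) for _ in range(5)]

print("SMP   :", smp, round(log_probability(smp, posterior), 3))
for s in samples:
    print("sample:", s, round(log_probability(s, posterior), 3))
```

Under these assumptions no sampled sequence can be more probable than the SMP sequence, which is the sense in which the sampled resurrections are "less probable". Whether that difference in probability translates into a difference in measured stability or activity is the experimental question the abstract addresses.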

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

Rank. Journal | Papers in training set | Percentile | Predicted probability

1. Journal of Molecular Evolution | 21 papers in training set | Top 0.1% | 18.2%
2. PLOS Computational Biology | 1633 papers in training set | Top 1% | 17.1%
3. Protein Science | 221 papers in training set | Top 0.1% | 9.9%
4. Molecular Biology and Evolution | 488 papers in training set | Top 0.5% | 8.2%
(50% of probability mass above)
5. Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 12% | 6.2%
6. Bioinformatics | 1061 papers in training set | Top 5% | 4.1%
7. Scientific Reports | 3102 papers in training set | Top 32% | 3.9%
8. eLife | 5422 papers in training set | Top 25% | 3.6%
9. Genome Biology and Evolution | 280 papers in training set | Top 0.7% | 2.3%
10. Biophysical Journal | 545 papers in training set | Top 2% | 2.0%
11. Proteins: Structure, Function, and Bioinformatics | 82 papers in training set | Top 0.6% | 1.5%
12. PLOS ONE | 4510 papers in training set | Top 59% | 1.3%
13. PLOS Biology | 408 papers in training set | Top 14% | 1.2%
14. Biochemical Journal | 80 papers in training set | Top 0.2% | 1.2%
15. Philosophical Transactions of the Royal Society B | 51 papers in training set | Top 5% | 0.9%
16. Open Biology | 95 papers in training set | Top 2% | 0.8%
17. Biochemistry | 130 papers in training set | Top 2% | 0.8%
18. Nucleic Acids Research | 1128 papers in training set | Top 18% | 0.7%
19. Cell Systems | 167 papers in training set | Top 12% | 0.7%
20. Protein Engineering, Design and Selection | 14 papers in training set | Top 0.1% | 0.7%
21. BMC Genomics | 328 papers in training set | Top 6% | 0.7%
22. PeerJ | 261 papers in training set | Top 16% | 0.7%
23. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 11% | 0.6%
24. Microbial Genomics | 204 papers in training set | Top 3% | 0.6%