Visualizing Amino Acid Substitutions in a Physicochemical Vector Space

Nemzer, L. R.

2021-07-16 bioinformatics
10.1101/2021.07.15.452549 bioRxiv
A three-dimensional representation of the twenty proteinogenic amino acids in a physicochemical space is presented. Vectors corresponding to amino acid substitutions are classified based on whether they are accessible via a single-nucleotide mutation. It is shown that the standard genetic code establishes a "choice architecture" that permits nearly independent tuning of the properties related to size and those related to hydrophobicity. This work sheds light on the non-arbitrary benefits of evolvability that may have shaped the development of the standard genetic code to increase the probability that adaptive point mutations will be generated. The usefulness of visualizing amino acid substitutions in a 3D physicochemical space is illustrated using recent datasets on the SARS-CoV-2 receptor binding domain. First, the substitutions most responsible for antibody escape are almost always inaccessible via a single-nucleotide mutation and change multiple properties concurrently. Second, it is shown that assays of ACE2 binding by sarbecovirus variants, including the viruses responsible for SARS and COVID-19, are more easily understood when plotted with this method. The results of this research can extend our understanding of certain hereditary disorders caused by point mutations, as well as guide rational protein and vaccine design.
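
The classification described in the abstract, whether a substitution is accessible via a single-nucleotide mutation, can be sketched in a few lines of Python. This is a minimal illustration built only on the standard genetic code, not the author's implementation; the names `CODON_TABLE`, `CODONS_BY_AA`, and `is_single_nucleotide_accessible` are hypothetical and chosen here for clarity.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest); '*' marks stop codons.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each of the 64 codons to its one-letter amino acid code.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invert the table: amino acid -> list of codons that encode it.
CODONS_BY_AA = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    CODONS_BY_AA[aa].append(codon)


def hamming(codon_a: str, codon_b: str) -> int:
    """Number of positions at which two codons differ."""
    return sum(x != y for x, y in zip(codon_a, codon_b))


def is_single_nucleotide_accessible(src: str, dst: str) -> bool:
    """True if some codon for `src` is one nucleotide change away
    from some codon for `dst` (one-letter amino acid codes)."""
    return any(
        hamming(c1, c2) == 1
        for c1 in CODONS_BY_AA[src]
        for c2 in CODONS_BY_AA[dst]
    )


if __name__ == "__main__":
    for src, dst in [("D", "E"), ("E", "K"), ("W", "F"), ("M", "W")]:
        print(src, "->", dst, is_single_nucleotide_accessible(src, dst))
```

Under this definition a substitution counts as accessible if any codon pair qualifies, so the answer for a specific site in a real sequence may be stricter, because only the codon actually present at that site matters.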

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 0.1% | 26.1%
2 | PLOS Computational Biology | 1633 papers in training set | Top 4% | 7.3%
3 | Journal of Chemical Information and Modeling | 207 papers in training set | Top 0.7% | 7.3%
4 | Scientific Reports | 3102 papers in training set | Top 13% | 6.9%
5 | PLOS ONE | 4510 papers in training set | Top 38% | 3.7%
(50% of probability mass above)
6 | Advanced Science | 249 papers in training set | Top 7% | 2.7%
7 | Computers in Biology and Medicine | 120 papers in training set | Top 1% | 2.7%
8 | Physical Review E | 95 papers in training set | Top 0.5% | 2.1%
9 | BMC Bioinformatics | 383 papers in training set | Top 4% | 1.8%
10 | Journal of Computational Chemistry | 11 papers in training set | Top 0.1% | 1.7%
11 | Frontiers in Immunology | 586 papers in training set | Top 4% | 1.7%
12 | Frontiers in Cell and Developmental Biology | 218 papers in training set | Top 4% | 1.7%
13 | Physical Biology | 43 papers in training set | Top 1% | 1.7%
14 | Frontiers in Molecular Biosciences | 100 papers in training set | Top 2% | 1.7%
15 | iScience | 1063 papers in training set | Top 15% | 1.7%
16 | Viruses | 318 papers in training set | Top 3% | 1.5%
17 | npj Systems Biology and Applications | 99 papers in training set | Top 1% | 1.2%
18 | International Journal of Molecular Sciences | 453 papers in training set | Top 12% | 1.0%
19 | The Journal of Physical Chemistry B | 158 papers in training set | Top 2% | 0.9%
20 | Proteins: Structure, Function, and Bioinformatics | 82 papers in training set | Top 0.8% | 0.8%
21 | Communications Biology | 886 papers in training set | Top 23% | 0.8%
22 | Nature Communications | 4913 papers in training set | Top 62% | 0.8%
23 | Frontiers in Bioengineering and Biotechnology | 88 papers in training set | Top 3% | 0.7%
24 | Frontiers in Plant Science | 240 papers in training set | Top 5% | 0.7%
25 | Briefings in Bioinformatics | 326 papers in training set | Top 7% | 0.7%
26 | Frontiers in Microbiology | 375 papers in training set | Top 10% | 0.7%
27 | ImmunoInformatics | 11 papers in training set | Top 0.2% | 0.7%
28 | eLife | 5422 papers in training set | Top 61% | 0.7%
29 | Bioinformatics | 1061 papers in training set | Top 10% | 0.7%
30 | Journal of The Royal Society Interface | 189 papers in training set | Top 5% | 0.7%
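
The 50% cutoff in the list above can be checked with a short cumulative sum. The sketch below assumes the final figure on each journal's entry is its predicted probability in percent; the function name `journals_to_reach` is hypothetical, and only the six highest-ranked probabilities are copied in.

```python
from itertools import accumulate

# Predicted probabilities (percent) for the six highest-ranked journals,
# copied from the list above in rank order.
top_probabilities = [26.1, 7.3, 7.3, 6.9, 3.7, 2.7]


def journals_to_reach(probabilities, threshold=50.0):
    """Return how many top-ranked entries are needed for the running
    total to reach `threshold`, or None if it is never reached."""
    for count, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return count
    return None


print(journals_to_reach(top_probabilities))  # prints 5
```

The running total is about 47.6% after four journals and about 51.3% after five, which is consistent with the stated cutoff falling after the fifth entry.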