Analysis of the Confidence in the Prediction of the Protein Folding by Artificial Intelligence

Tejera-Nevado, P.; Serrano, E.; Gonzalez-Herrero, A.; Bermejo-Moreno, R.; Rodriguez-Gonzalez, A.

2023-05-19 bioinformatics
10.1101/2023.05.17.540933 bioRxiv

Deep learning models that predict protein folding from protein sequences have made protein structure determination easier. In some cases, the predicted structure can be compared with existing information from classic methods such as nuclear magnetic resonance (NMR) spectroscopy, X-ray crystallography, or electron microscopy (EM). However, challenges arise when proteins are not abundant, their structure is heterogeneous, and sample preparation is difficult. Different metrics are provided to indicate the level of confidence that supports a prediction. These values are important in two ways: they offer information about the strength of the result, and they can supply an overall picture of the structure when different models are combined. This work provides an overview of the different deep learning methods used to predict protein folding and the metrics that support their outputs. The confidence of the model is evaluated in detail using two proteins that contain four domains of unknown function.

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1. Bioinformatics (1061 papers in training set): Top 3%, 10.1%
2. PLOS Computational Biology (1633 papers in training set): Top 4%, 8.4%
3. Scientific Reports (3102 papers in training set): Top 14%, 6.8%
4. Computational Biology and Chemistry (23 papers in training set): Top 0.1%, 6.3%
5. BMC Bioinformatics (383 papers in training set): Top 2%, 4.9%
6. Briefings in Bioinformatics (326 papers in training set): Top 1%, 4.9%
7. International Journal of Molecular Sciences (453 papers in training set): Top 3%, 3.6%
8. Computational and Structural Biotechnology Journal (216 papers in training set): Top 2%, 3.6%
9. Frontiers in Molecular Biosciences (100 papers in training set): Top 0.4%, 3.6%
50% of probability mass above
10. Journal of Chemical Information and Modeling (207 papers in training set): Top 2%, 2.4%
11. Molecules (37 papers in training set): Top 0.5%, 2.1%
12. Frontiers in Bioinformatics (45 papers in training set): Top 0.1%, 2.1%
13. Computers in Biology and Medicine (120 papers in training set): Top 2%, 2.1%
14. Journal of Molecular Biology (217 papers in training set): Top 1%, 1.8%
15. Physical Biology (43 papers in training set): Top 1.0%, 1.8%
16. The Journal of Physical Chemistry B (158 papers in training set): Top 1.0%, 1.8%
17. PLOS ONE (4510 papers in training set): Top 54%, 1.7%
18. Biology Methods and Protocols (53 papers in training set): Top 0.9%, 1.7%
19. Journal of Structural Biology (58 papers in training set): Top 0.9%, 1.5%
20. International Journal of Biological Macromolecules (65 papers in training set): Top 2%, 1.3%
21. Proteins: Structure, Function, and Bioinformatics (82 papers in training set): Top 0.6%, 1.2%
22. Protein Science (221 papers in training set): Top 1%, 1.1%
23. Biomolecules (95 papers in training set): Top 1%, 0.9%
24. Communications Biology (886 papers in training set): Top 19%, 0.9%
25. Bioinformatics Advances (184 papers in training set): Top 4%, 0.9%
26. ACS Omega (90 papers in training set): Top 4%, 0.8%
27. Biochemistry and Biophysics Reports (28 papers in training set): Top 2%, 0.7%
28. Journal of Biosciences (12 papers in training set): Top 0.1%, 0.7%
29. Physica A: Statistical Mechanics and its Applications (10 papers in training set): Top 0.3%, 0.7%
30. PeerJ (261 papers in training set): Top 16%, 0.7%
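
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch in Python, not the site's own method: the function name journals_to_reach is hypothetical, and the input values are the rounded percentages shown in the list, so the running total is only approximate.

```python
from itertools import accumulate

# Predicted probabilities (%) for the 30 listed journals, in rank order.
# These are the rounded values displayed on the page, not the model's raw output.
probabilities = [
    10.1, 8.4, 6.8, 6.3, 4.9, 4.9, 3.6, 3.6, 3.6, 2.4,
    2.1, 2.1, 2.1, 1.8, 1.8, 1.8, 1.7, 1.7, 1.5, 1.3,
    1.2, 1.1, 0.9, 0.9, 0.9, 0.8, 0.7, 0.7, 0.7, 0.7,
]


def journals_to_reach(probs, threshold):
    """Return how many top-ranked entries are needed for the running
    total to reach `threshold`, or None if it is never reached.
    (Hypothetical helper written for this illustration.)"""
    for count, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return count
    return None


print(journals_to_reach(probabilities, 50.0))  # prints 9
```

With these rounded values, the running total is about 48.6% after eight journals and about 52.2% after nine, which agrees with the cutoff marked in the list.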