Evaluating FoldX5.1 for MAVISp Stability Data Collection

Vliora, A.; Tiberti, M.; Papaleo, E.

2026-04-02 bioinformatics
10.64898/2026.03.31.715598 bioRxiv
MAVISp (Multi-layered Assessment of VarIants by Structure for proteins) is a structure-based framework that facilitates mechanistic interpretation of missense variants, with protein stability as one of its core analytical layers. When software tools are updated, a key consideration for database curation is whether the new version can be adopted without compromising compatibility with existing entries. This study evaluated the effect of replacing FoldX5 with FoldX5.1 on the results of the MAVISp stability workflow. We compared predicted changes in folding free energy for 539,809 shared variants across 119 proteins. We found high overall agreement, with a mean Pearson correlation of 0.933 and a mean Cohen's kappa coefficient of 0.814. Most proteins showed strong concordance, whereas only three (NUPR1, TSC1, and TMEM127) showed poor agreement. For NUPR1 and TSC1, disagreements were more frequent at sites with low AlphaFold2 confidence. These outliers did not display systematic inter-version bias, as mean shifts in folding free energies between versions were minimal. Collectively, these findings support adopting FoldX5.1 for future MAVISp data collection. We will include a transition period, during which existing entries retain FoldX5 annotations until their scheduled annual update, while new or updated entries are processed with FoldX5.1. To facilitate this transition, the FoldX software version has been added as a new metadata annotation in the MAVISp database.
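
The two agreement measures named in the abstract can be illustrated with a minimal Python sketch. It computes a Pearson correlation on paired folding free energy changes and a Cohen's kappa on the stability classes derived from them. The values in `foldx5` and `foldx51`, the `classify` helper, and its 3.0 kcal/mol cutoff are all hypothetical and chosen only for illustration; they are not the MAVISp data or its actual classification scheme.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def classify(ddg, cutoff=3.0):
    """Hypothetical three-class scheme; the cutoff is an assumption."""
    if ddg >= cutoff:
        return "destabilizing"
    if ddg <= -cutoff:
        return "stabilizing"
    return "neutral"


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement.

    Undefined (division by zero) when chance agreement equals 1,
    e.g. when both inputs contain a single identical class.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / n ** 2
    return (observed - expected) / (1 - expected)


# Made-up folding free energy changes (kcal/mol) for six variants.
foldx5 = [0.2, 3.5, -0.4, 5.1, 1.0, -3.2]
foldx51 = [0.3, 3.1, -0.2, 4.8, 3.4, -3.6]

r = pearson(foldx5, foldx51)
kappa = cohen_kappa([classify(v) for v in foldx5],
                    [classify(v) for v in foldx51])
print(f"Pearson r = {r:.3f}, Cohen's kappa = {kappa:.3f}")
```

In this toy example only the fifth variant changes class between versions, so the kappa drops below 1 even though the continuous values remain strongly correlated. This is why the two measures are complementary: the correlation tracks the continuous values, while kappa tracks the resulting classification.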

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

Rank  Journal                                             Papers in training set  Percentile  Probability
1     Protein Science                                     221                     Top 0.1%    10.1%
2     Bioinformatics Advances                             184                     Top 0.2%    8.4%
3     PLOS Computational Biology                          1633                    Top 5%      6.8%
4     Computational and Structural Biotechnology Journal  216                     Top 0.8%    4.8%
5     PLOS ONE                                            4510                    Top 32%     4.8%
6     Nucleic Acids Research                              1128                    Top 4%      4.8%
7     Bioinformatics                                      1061                    Top 5%      4.3%
8     International Journal of Molecular Sciences         453                     Top 2%      4.3%
9     Nature Communications                               4913                    Top 40%     3.6%
--- 50% of probability mass above ---
10    NAR Genomics and Bioinformatics                     214                     Top 0.7%    3.6%
11    BMC Bioinformatics                                  383                     Top 3%      3.6%
12    Journal of Chemical Information and Modeling        207                     Top 2%      2.4%
13    Frontiers in Bioinformatics                         45                      Top 0.1%    2.1%
14    Briefings in Bioinformatics                         326                     Top 4%      1.8%
15    Scientific Reports                                  3102                    Top 58%     1.7%
16    Structure                                           175                     Top 2%      1.7%
17    Journal of Molecular Biology                        217                     Top 2%      1.7%
18    Journal of Proteome Research                        215                     Top 1%      1.7%
19    Genome Biology                                      555                     Top 4%      1.7%
20    Frontiers in Molecular Biosciences                  100                     Top 2%      1.5%
21    Database                                            51                      Top 0.5%    1.5%
22    Genome Medicine                                     154                     Top 6%      1.2%
23    Proteins: Structure, Function, and Bioinformatics   82                      Top 0.7%    0.9%
24    Frontiers in Genetics                               197                     Top 8%      0.9%
25    Cell Systems                                        167                     Top 11%     0.8%
26    PeerJ                                               261                     Top 15%     0.7%
27    Human Mutation                                      29                      Top 0.7%    0.7%
28    Biophysical Journal                                 545                     Top 5%      0.7%
29    Molecular Systems Biology                           142                     Top 2%      0.6%
30    The American Journal of Human Genetics              206                     Top 4%      0.6%
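
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked with a short cumulative sum. The sketch below uses the rounded percentages listed for the first ten journals; the function name `journals_to_reach` is hypothetical, and because the displayed values are rounded, the total is only approximate.

```python
def journals_to_reach(probabilities, target=50.0):
    """Return (count, cumulative %) at the first point the running
    total reaches the target, or None if it is never reached."""
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= target:
            return count, total
    return None


# Rounded predicted probabilities (%) for ranks 1 to 10, as listed.
top_ranked = [10.1, 8.4, 6.8, 4.8, 4.8, 4.8, 4.3, 4.3, 3.6, 3.6]

result = journals_to_reach(top_ranked)
print(result)
```

With these values the running total is about 48.3% after eight journals and about 51.9% after nine, which is consistent with the cutoff placed after rank 9.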