
Why Many Molecular Simulation Research Findings Might Be False: An Analysis of Inter-Simulations Differences Based on Simulation Time and Number of Replicas

Knapp, B.; Deane, C. M.

2022-08-25 bioinformatics
10.1101/2022.08.23.504950 bioRxiv

Molecular simulations are a common technique to investigate the dynamics of proteins, DNA and RNA. A typical application is the simulation of a wild-type structure and a mutant structure where the mutant has a significantly higher (or lower) potency to trigger a signalling cascade. Such a study then analyses the observed differences between the wild-type and mutant simulations and links these to the difference in potency. However, differences between simulations cannot always be reproduced by other research groups, even if the same parameters as in the original simulations are used. This is caused by the rugged energy landscape of many biological structures, which means that minor differences in hardware or software can cause simulations to take different paths. This would not be a problem if the simulation time were infinitely long, but in practice the simulation time is always finite. In this study we use large-scale molecular simulations of four different systems (a 10-mer peptide as wild-type and mutant, as well as a T-cell receptor, peptide and MHC complex as wild-type and mutant) with 100 replicas each, totalling 620 000 ns, to quantify the magnitude of (non-)reproducibility when comparing inter-simulation differences (e.g. wild-type vs mutant). Using a bootstrapping approach we found that simulation times of at least 2 to 3 times the experimental folding time, using a minimum of 3 replicas, are necessary for reproducible results. However, for most complexes of interest such long simulation times are far out of reach, which means that it is only possible to sample the local phase space neighbourhood of the X-ray structure. To sample this neighbourhood reliably, around 10 to 20 replicas are needed.
Graphical abstract: Figure 1 (image not reproduced here).
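The abstract does not spell out the bootstrapping procedure, so the following is only a minimal sketch of one way such an analysis could look, not the authors' method. It assumes each replica has been reduced to a single scalar observable (for example a mean flexibility value), resamples a given number of replicas with replacement from the wild-type and mutant pools, and reports how often the sign of the resampled wild-type vs mutant difference agrees with the sign obtained from all replicas. The function name `sign_agreement`, its parameters, and the synthetic Gaussian data are all hypothetical and chosen for illustration.

```python
import random
import statistics


def sign_agreement(wild_type, mutant, n_replicas, n_draws=2000, seed=0):
    """Fraction of bootstrap draws whose mutant-minus-wild-type difference
    has the same sign as the difference computed from all replicas.

    wild_type, mutant: lists with one scalar observable per replica
    (an assumption of this sketch, not something stated in the abstract).
    """
    rng = random.Random(seed)
    full_diff = statistics.fmean(mutant) - statistics.fmean(wild_type)
    agree = 0
    for _ in range(n_draws):
        # Resample n_replicas replicas with replacement from each pool.
        wt_sample = rng.choices(wild_type, k=n_replicas)
        mu_sample = rng.choices(mutant, k=n_replicas)
        diff = statistics.fmean(mu_sample) - statistics.fmean(wt_sample)
        if diff * full_diff > 0:
            agree += 1
    return agree / n_draws


if __name__ == "__main__":
    # Synthetic stand-in data: 100 replicas per system, small true difference.
    data_rng = random.Random(42)
    wild_type = [data_rng.gauss(1.0, 0.5) for _ in range(100)]
    mutant = [data_rng.gauss(1.2, 0.5) for _ in range(100)]
    for n in (1, 3, 10, 20):
        print(n, sign_agreement(wild_type, mutant, n))
```

With real data, the per-replica observables would come from the trajectories themselves; the agreement fraction as a function of `n_replicas` then gives a rough sense of how many replicas are needed before a wild-type vs mutant difference becomes stable.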

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1 | Journal of Chemical Theory and Computation | 126 papers in training set | Top 0.1% | 17.2%
2 | Journal of Chemical Information and Modeling | 207 papers in training set | Top 0.4% | 14.1%
3 | Journal of Computational Chemistry | 11 papers in training set | Top 0.1% | 7.1%
4 | Bioinformatics | 1061 papers in training set | Top 4% | 7.1%
5 | The Journal of Physical Chemistry B | 158 papers in training set | Top 0.2% | 7.1%
(50% of probability mass above)
6 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 0.9% | 4.8%
7 | PLOS Computational Biology | 1633 papers in training set | Top 10% | 3.5%
8 | Frontiers in Molecular Biosciences | 100 papers in training set | Top 0.5% | 3.5%
9 | Biophysical Journal | 545 papers in training set | Top 2% | 2.6%
10 | International Journal of Molecular Sciences | 453 papers in training set | Top 5% | 2.0%
11 | Protein Science | 221 papers in training set | Top 0.9% | 1.7%
12 | Scientific Reports | 3102 papers in training set | Top 60% | 1.6%
13 | BMC Bioinformatics | 383 papers in training set | Top 5% | 1.3%
14 | Computational Biology and Chemistry | 23 papers in training set | Top 0.2% | 1.3%
15 | ACS Omega | 90 papers in training set | Top 3% | 1.2%
16 | Physical Biology | 43 papers in training set | Top 2% | 0.9%
17 | PeerJ | 261 papers in training set | Top 13% | 0.9%
18 | F1000Research | 79 papers in training set | Top 4% | 0.9%
19 | PLOS ONE | 4510 papers in training set | Top 65% | 0.9%
20 | Pharmaceuticals | 33 papers in training set | Top 2% | 0.8%
21 | Journal of Biomolecular Structure and Dynamics | 43 papers in training set | Top 1% | 0.8%
22 | SoftwareX | 15 papers in training set | Top 0.4% | 0.8%
23 | The Journal of Physical Chemistry Letters | 58 papers in training set | Top 2% | 0.7%
24 | Molecules | 37 papers in training set | Top 2% | 0.7%
25 | Journal of Molecular Graphics and Modelling | 16 papers in training set | Top 0.3% | 0.7%
26 | Journal of Molecular Biology | 217 papers in training set | Top 4% | 0.7%
27 | eLife | 5422 papers in training set | Top 62% | 0.6%
28 | Structure | 175 papers in training set | Top 4% | 0.6%
29 | Physical Chemistry Chemical Physics | 34 papers in training set | Top 0.8% | 0.6%