Why Many Molecular Simulation Research Findings Might Be False: An Analysis of Inter-Simulations Differences Based on Simulation Time and Number of Replicas
Knapp, B.; Deane, C. M.
Molecular simulations are a common technique to investigate the dynamics of proteins, DNA and RNA. A typical application is the simulation of a wild-type structure and a mutant structure where the mutant has a significantly higher (or lower) potency to trigger a signalling cascade. Such a study would then analyse the differences observed between the wild-type and mutant simulations and link these to the difference in potency. However, differences between simulations cannot always be reproduced by other research groups, even if the same parameters as in the original simulations are used. This is caused by the rugged energy landscape of many biological structures, which means that minor differences in hardware or software can cause simulations to take different paths. This would not be a problem if the simulation time were infinitely long, but in practice the simulation time is always finite. In this study we use large-scale molecular simulations of four different systems (a 10-mer peptide as wild-type and mutant, as well as a T-cell receptor, peptide and MHC complex as wild-type and mutant) with 100 replicas each, totalling 620 000 ns, to quantify the magnitude of (non-)reproducibility when comparing inter-simulation differences (e.g. wild-type vs mutant). Using a bootstrapping approach we found that simulation times of at least 2 to 3 times the experimental folding time, using a minimum of 3 replicas, are necessary for reproducible results. However, for most complexes of interest such long simulation times are far out of reach, which means that it is only possible to sample the local phase space neighbourhood of the X-ray structure. To sample this neighbourhood reliably, around 10 to 20 replicas are needed.
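The bootstrapping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure, which the abstract does not specify. It assumes that each replica has been reduced to a single scalar observable, and it asks how often a study run with only a few replicas would report a wild-type vs mutant difference with the same sign as the one seen across all replicas. The function name `sign_agreement`, its parameters, and the synthetic Gaussian data are all hypothetical.

```python
import random
import statistics


def sign_agreement(wild_type, mutant, n_replicas, n_boot=2000, seed=0):
    """Fraction of bootstrap draws whose mean difference (mutant - wild type)
    has the same sign as the difference computed from all replicas.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = random.Random(seed)
    full_diff = statistics.fmean(mutant) - statistics.fmean(wild_type)
    agree = 0
    for _ in range(n_boot):
        # Resample n_replicas values with replacement from each pool,
        # mimicking a study that ran only n_replicas simulations per system.
        wt_sample = rng.choices(wild_type, k=n_replicas)
        mt_sample = rng.choices(mutant, k=n_replicas)
        diff = statistics.fmean(mt_sample) - statistics.fmean(wt_sample)
        if diff * full_diff > 0:
            agree += 1
    return agree / n_boot


# Synthetic per-replica observables (assumed values, not data from the study):
# 100 replicas per system with a small true difference and a large spread.
data_rng = random.Random(42)
wild_type = [data_rng.gauss(1.0, 0.5) for _ in range(100)]
mutant = [data_rng.gauss(1.3, 0.5) for _ in range(100)]

for n in (1, 3, 10, 20):
    print(n, sign_agreement(wild_type, mutant, n))
```

Under these assumptions the agreement fraction rises with the number of replicas per draw, which is the qualitative behaviour the abstract describes. The actual thresholds reported in the study depend on the systems and observables analysed there.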
Graphical abstract: Figure 1
Matching journals
The top 5 journals account for 50% of the predicted probability mass.