
ProPrep: An Interactive and Instructional Interface for Proper Protein Preparation with AMBER

Walker, A.; Guberman-Pfeffer, M. J.

2026-03-02 bioinformatics
10.64898/2026.02.26.708365 bioRxiv

Millions of experimental and AI-predicted protein structures are now available, and the biosynthetic promise of bespoke proteins is increasingly within reach. The functional characterization challenge thus posed cannot be addressed by experimental techniques alone. Molecular dynamics (MD) simulations offer functional screening with atomic resolution, yet accessibility remains limited. Existing computational chemistry software presents stark trade-offs: powerful tools require extensive expertise and manual effort, whereas user-friendly programs function as black boxes that obscure critical preparation decisions. Herein, we present ProPrep, an interactive workflow manager that guides users through expert-quality MD preparation by showing the what, why, and how of each step while automating tedious manual operations. Within a single workspace, ProPrep integrates (1) downloading structures from multiple sources (PDB, AlphaFold, AlphaFill), (2) performing homology searches, (3) aligning structures, (4) curating and repairing structural issues, (5) applying mutations, (6) parameterizing specialized residues, (7) converting redox-active sites to forcefield-compatible forms, (8) generating topology and coordinate files, and (9) configuring, executing, and analyzing simulations with active monitoring of key quantities via ASCII visualizations. A key innovation is ProPrep's extensible transformer framework for detecting, defining, and transforming redox-active sites (including mono- and polynuclear metal centers, organic cofactors, and redox-active amino acids) for forcefield compatibility. We demonstrate the full workflow on a 64-heme cytochrome nanowire bundle (PDB: 9YUQ), proceeding from a PDB file to energy minimization of the solvated system (467,635 atoms) for constant pH molecular dynamics, a process demanding 4,819 PDB record modifications and 610 bond definitions, in 18 minutes of user interaction. The entire process is recorded in an interactive session log that can be shared and replayed for reproducibility, making simulation setup a fully transparent process that relies on what was done instead of what was remembered and reported.

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Journal of Chemical Information and Modeling | 207 papers in training set | Top 0.1% | 28.8%
2. Bioinformatics | 1061 papers in training set | Top 2% | 13.0%
3. Journal of Chemical Theory and Computation | 126 papers in training set | Top 0.1% | 8.7%
(50% of probability mass above)
4. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 0.9% | 4.5%
5. Protein Science | 221 papers in training set | Top 0.3% | 4.1%
6. PLOS Computational Biology | 1633 papers in training set | Top 11% | 2.8%
7. Journal of Computational Chemistry | 11 papers in training set | Top 0.1% | 2.2%
8. The Journal of Physical Chemistry B | 158 papers in training set | Top 0.9% | 2.0%
9. Journal of Molecular Biology | 217 papers in training set | Top 1% | 2.0%
10. SoftwareX | 15 papers in training set | Top 0.1% | 1.9%
11. The Journal of Physical Chemistry Letters | 58 papers in training set | Top 0.8% | 1.7%
12. PLOS ONE | 4510 papers in training set | Top 53% | 1.7%
13. BMC Bioinformatics | 383 papers in training set | Top 5% | 1.5%
14. Scientific Data | 174 papers in training set | Top 1% | 1.3%
15. Nature Communications | 4913 papers in training set | Top 55% | 1.3%
16. Communications Biology | 886 papers in training set | Top 15% | 1.2%
17. Nucleic Acids Research | 1128 papers in training set | Top 14% | 1.0%
18. Journal of Cheminformatics | 25 papers in training set | Top 0.5% | 0.9%
19. Scientific Reports | 3102 papers in training set | Top 70% | 0.9%
20. Chemical Science | 71 papers in training set | Top 2% | 0.8%
21. Biophysical Journal | 545 papers in training set | Top 4% | 0.8%
22. eLife | 5422 papers in training set | Top 54% | 0.8%
23. Bioinformatics Advances | 184 papers in training set | Top 4% | 0.8%
24. Structure | 175 papers in training set | Top 3% | 0.8%
25. NAR Genomics and Bioinformatics | 214 papers in training set | Top 4% | 0.7%
26. Frontiers in Molecular Biosciences | 100 papers in training set | Top 6% | 0.7%
27. mSphere | 281 papers in training set | Top 7% | 0.7%
28. Communications Chemistry | 39 papers in training set | Top 1% | 0.7%
29. ACS Omega | 90 papers in training set | Top 5% | 0.7%
30. International Journal of Molecular Sciences | 453 papers in training set | Top 17% | 0.7%
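
The statement that the top 3 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch of that arithmetic, not the site's actual method; the function name `journals_to_reach` and the list `probabilities` are hypothetical, and only the first five entries of the ranking are included.

```python
# Predicted probabilities (percent) for the five top-ranked journals,
# copied from the ranking above. The variable name is illustrative only.
probabilities = [
    ("Journal of Chemical Information and Modeling", 28.8),
    ("Bioinformatics", 13.0),
    ("Journal of Chemical Theory and Computation", 8.7),
    ("Computational and Structural Biotechnology Journal", 4.5),
    ("Protein Science", 4.1),
]


def journals_to_reach(threshold, ranked):
    """Return (count, cumulative) for the shortest ranked prefix whose
    summed probability reaches `threshold`, or (None, total) if the
    threshold is never reached."""
    cumulative = 0.0
    for count, (_, probability) in enumerate(ranked, start=1):
        cumulative += probability
        if cumulative >= threshold:
            return count, round(cumulative, 1)
    return None, round(cumulative, 1)


count, mass = journals_to_reach(50.0, probabilities)
print(count, mass)  # 3 50.5
```

The first three probabilities sum to 50.5%, which the page rounds to 50%.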