
LPATH: A semi-automated Python tool for clustering molecular pathways

Bogetti, A.; Leung, J. M.; Chong, L.

2023-08-20 biophysics
10.1101/2023.08.17.553774 bioRxiv

The pathways by which a molecular process transitions to a target state are highly sought-after as direct views of a transition mechanism. While great strides have been made in the physics-based simulation of such pathways, the analysis of these pathways can be a major challenge due to their diversity and variable lengths. Here we present the LPATH Python tool, which implements a semi-automated method for linguistics-assisted clustering of pathways into distinct classes (or routes). This method involves three steps: 1) discretizing the configurational space into key states, 2) extracting a text-string sequence of key visited states for each pathway, and 3) pairwise matching of pathways based on a text-string similarity score. To circumvent the prohibitive memory requirements of the first step, we have implemented a general two-stage method for clustering conformational states that exploits machine learning. LPATH is primarily designed for use with the WESTPA software for weighted ensemble simulations; however, the tool can also be applied to conventional simulations. As demonstrated for the C7eq to C7ax conformational transition of alanine dipeptide, LPATH provides physically reasonable classes of pathways and corresponding probabilities.
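The three steps named in the abstract can be illustrated with a minimal sketch. This is not LPATH's API or its actual similarity metric: the state labels, the example pathways, the helper names, the use of the standard library's `difflib.SequenceMatcher` as the similarity score, and the simple threshold grouping (a stand-in for a proper clustering step) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import groupby


def to_state_string(state_sequence):
    """Step 2 (sketch): collapse consecutive repeats so that only the
    order of visited states remains, e.g. AAABBC -> ABC."""
    return "".join(state for state, _ in groupby(state_sequence))


def pairwise_distances(strings):
    """Step 3 (sketch): distance = 1 - similarity ratio for every pair.
    SequenceMatcher is an assumed stand-in for the tool's own score."""
    n = len(strings)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            ratio = SequenceMatcher(None, strings[i], strings[j]).ratio()
            dist[i][j] = dist[j][i] = 1.0 - ratio
    return dist


def group_by_threshold(dist, threshold):
    """Assign each pathway to the first class whose first member lies
    within `threshold`; otherwise start a new class."""
    classes = []
    for i in range(len(dist)):
        for members in classes:
            if dist[i][members[0]] <= threshold:
                members.append(i)
                break
        else:
            classes.append([i])
    return classes


# Step 1 is assumed already done: each frame carries a hypothetical
# single-letter state label (A = initial state, C = target state).
pathways = [
    list("AAABBBCCC"),
    list("AABBBBBC"),
    list("AAADDDCC"),
    list("ADDDDC"),
]

strings = [to_state_string(p) for p in pathways]
dist = pairwise_distances(strings)
classes = group_by_threshold(dist, threshold=0.2)
print(strings)
print(classes)
```

In this toy input, pathways that pass through state B and those that pass through state D end up in separate classes even though their lengths differ, which is the point of comparing condensed state strings rather than raw trajectories.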

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1. Journal of Chemical Information and Modeling (207 papers in training set; Top 0.1%; 27.3%)
2. The Journal of Chemical Physics (49 papers in training set; Top 0.1%; 17.3%)
3. Journal of Chemical Theory and Computation (126 papers in training set; Top 0.1%; 9.0%)

50% of probability mass above

4. PLOS ONE (4510 papers in training set; Top 29%; 6.3%)
5. Journal of Computational Chemistry (11 papers in training set; Top 0.1%; 3.9%)
6. The Journal of Physical Chemistry B (158 papers in training set; Top 0.6%; 3.5%)
7. PLOS Computational Biology (1633 papers in training set; Top 10%; 3.5%)
8. Biophysical Journal (545 papers in training set; Top 2%; 3.0%)
9. Frontiers in Molecular Biosciences (100 papers in training set; Top 0.8%; 2.6%)
10. ACS Omega (90 papers in training set; Top 2%; 1.6%)
11. Physical Biology (43 papers in training set; Top 1%; 1.5%)
12. Scientific Reports (3102 papers in training set; Top 65%; 1.3%)
13. Computational and Structural Biotechnology Journal (216 papers in training set; Top 6%; 1.3%)
14. eLife (5422 papers in training set; Top 54%; 0.9%)
15. Bioinformatics (1061 papers in training set; Top 9%; 0.9%)
16. SoftwareX (15 papers in training set; Top 0.4%; 0.8%)
17. iScience (1063 papers in training set; Top 33%; 0.7%)
18. Physical Chemistry Chemical Physics (34 papers in training set; Top 0.7%; 0.7%)
19. Protein Science (221 papers in training set; Top 2%; 0.7%)
20. The Journal of Physical Chemistry Letters (58 papers in training set; Top 2%; 0.7%)
21. The European Physical Journal E (15 papers in training set; Top 0.2%; 0.6%)
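
The 50% cutoff in the ranked list above can be checked with a short cumulative sum. This is only an arithmetic sketch using the first five listed percentages; the function name is hypothetical and nothing here reflects how the page itself computes the cutoff.

```python
# First five predicted probabilities from the list above, in percent.
probabilities = [27.3, 17.3, 9.0, 6.3, 3.9]


def journals_to_reach(probs, threshold):
    """Return (count, cumulative total) for the smallest number of
    top-ranked entries whose summed probability reaches `threshold`."""
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count, total
    return None


count, total = journals_to_reach(probabilities, 50.0)
print(count, round(total, 1))
```

The first two journals sum to 44.6%, so the third is the one that carries the cumulative total past 50% (to 53.6%).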