Hashi: Bridging Statistical Model Derived 1D Microstate Encodings and Protein 3D Structural Ensembles

Naganathan, A. N.; Madhan, H.

2026-06-02 biophysics
10.64898/2026.06.01.729173 bioRxiv
Protein function is intimately linked to the conformational states sampled within the native ensemble, and generating ensembles from a single static structure is therefore an area of active research. In this application note, we introduce Hashi, a pipeline that rapidly generates realistic structural ensembles from the outputs of the structure-based Wako-Saitô-Muñoz-Eaton (WSME) statistical mechanical model of protein folding. The approach integrates the outputs of the block WSME model with the RANCH module of EOM (ensemble optimization method) from the ATSAS software suite, providing three-dimensional views of the structural ensembles within the model framework. These outputs are free energy profiles and strings of zeros and ones that describe the conformational status of every residue over thousands or millions of microstates, each assigned a statistical weight derived from physically grounded energy-entropy terms. Hashi is applicable to a variety of single-chain monomeric systems ranging from 30 to 500 residues in length, including globular and repeat proteins. The generated structural ensembles can also be rank-ordered by free energy within a given macrostate or over a range of reaction coordinate values. Because the statistical weights of the WSME model microstates can be reweighted or calibrated against experiments, the ensembles shed light not only on the folding mechanism but also on the structural excursions that determine function and open otherwise buried binding pockets.
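
As an illustration of the bookkeeping described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes each microstate is given as a binary string (with `1` taken to mean a structured residue and `0` an unstructured one) paired with a statistical weight, uses the number of structured residues as the reaction coordinate, and computes a free energy profile as F(n) = -RT ln(Z_n / Z). The function names, the toy weights, and the choice of reaction coordinate are all hypothetical; the actual Hashi input format and its coupling to RANCH are not shown.

```python
import math
from collections import defaultdict

R_KJ = 0.0083145  # gas constant in kJ/(mol K)


def free_energy_profile(microstates, temperature=298.0):
    """Map the number of structured residues n to F(n) in kJ/mol.

    `microstates` is an iterable of (encoding, weight) pairs, where
    `encoding` is a string of '0'/'1' characters (assumed: '1' = structured)
    and `weight` is a positive statistical weight.
    """
    partial = defaultdict(float)
    for encoding, weight in microstates:
        partial[encoding.count("1")] += weight  # partial partition function Z_n
    total = sum(partial.values())               # full partition function Z
    rt = R_KJ * temperature
    return {n: -rt * math.log(z / total) for n, z in sorted(partial.items())}


def rank_microstates(microstates, n_min, n_max, temperature=298.0):
    """Rank microstates with n_min <= n <= n_max by free energy, lowest first.

    The per-microstate free energy is taken as -RT ln(weight), which is an
    assumption about how the weights are defined.
    """
    rt = R_KJ * temperature
    selected = [
        (encoding, -rt * math.log(weight))
        for encoding, weight in microstates
        if n_min <= encoding.count("1") <= n_max
    ]
    return sorted(selected, key=lambda pair: pair[1])


# Toy six-residue system with made-up weights (illustrative only).
toy = [
    ("000000", 1.0),
    ("110000", 0.5),
    ("001100", 0.2),
    ("111100", 2.0),
    ("001111", 1.0),
    ("111111", 8.0),
]

profile = free_energy_profile(toy)
ranked = rank_microstates(toy, n_min=4, n_max=4)

for n, f in profile.items():
    print(f"n = {n}: F = {f:.2f} kJ/mol")
for encoding, f in ranked:
    print(encoding, f"{f:.2f} kJ/mol")
```

In a pipeline of the kind described, the top-ranked encodings within a chosen macrostate would be the ones passed on for three-dimensional model building; here they are only printed.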

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Bioinformatics Advances | 184 | Top 0.1% | 14.7% |
| 2 | Nature Communications | 4913 | Top 22% | 8.4% |
| 3 | Bioinformatics | 1061 | Top 3% | 7.2% |
| 4 | Journal of Chemical Information and Modeling | 207 | Top 1.0% | 4.9% |
| 5 | Journal of Molecular Biology | 217 | Top 0.4% | 4.3% |
| 6 | Nucleic Acids Research | 1128 | Top 5% | 4.0% |
| 7 | Nature Computational Science | 50 | Top 0.1% | 3.7% |
| 8 | PLOS Computational Biology | 1633 | Top 10% | 3.6% |

50% of probability mass above

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 9 | Protein Science | 221 | Top 0.4% | 3.6% |
| 10 | Frontiers in Molecular Biosciences | 100 | Top 0.7% | 2.7% |
| 11 | Nature Methods | 336 | Top 4% | 2.1% |
| 12 | Proceedings of the National Academy of Sciences | 2130 | Top 30% | 1.9% |
| 13 | Structure | 175 | Top 2% | 1.8% |
| 14 | Briefings in Bioinformatics | 326 | Top 4% | 1.8% |
| 15 | Journal of Chemical Theory and Computation | 126 | Top 0.5% | 1.7% |
| 16 | Acta Crystallographica Section D Structural Biology | 54 | Top 0.2% | 1.7% |
| 17 | Scientific Reports | 3102 | Top 57% | 1.7% |
| 18 | Proteins: Structure, Function, and Bioinformatics | 82 | Top 0.5% | 1.7% |
| 19 | Computational and Structural Biotechnology Journal | 216 | Top 4% | 1.7% |
| 20 | eLife | 5422 | Top 43% | 1.7% |
| 21 | IUCrJ | 29 | Top 0.2% | 1.7% |
| 22 | Cell Systems | 167 | Top 7% | 1.7% |
| 23 | iScience | 1063 | Top 19% | 1.3% |
| 24 | The Journal of Physical Chemistry Letters | 58 | Top 1.0% | 1.3% |
| 25 | Cell Reports Methods | 141 | Top 3% | 1.2% |
| 26 | Communications Biology | 886 | Top 17% | 1.0% |
| 27 | Journal of Computational Chemistry | 11 | Top 0.1% | 0.9% |
| 28 | Nature Biotechnology | 147 | Top 7% | 0.9% |
| 29 | Journal of Structural Biology | 58 | Top 1% | 0.9% |
| 30 | PLOS ONE | 4510 | Top 63% | 0.9% |