Exploring protein conformational ensembles using evolutionary conditional diffusion

Cui, X.; Ge, L.; Yang, X.; Li, X.; Hou, D.; Zhou, X.; Zhang, G.

2026-01-30 bioinformatics
10.64898/2026.01.30.702768 bioRxiv
Protein conformational ensembles encode the dynamic landscapes underlying biological function, regulation, and allostery. Accurately reconstructing such ensembles while balancing the accuracy of the conformational distribution against physical plausibility remains a fundamental challenge in structural biology, particularly when dynamic data are scarce. Here, we propose DiffEnsemble, a diffusion-based framework for modeling protein conformational ensembles. DiffEnsemble learns latent dynamical representations from static protein structures in the Protein Data Bank and uses a structural profile derived from the AlphaFold Protein Structure Database as conditional guidance during the diffusion process. Benchmarking on 72 protein targets from the ATLAS molecular dynamics simulation dataset demonstrates that DiffEnsemble outperforms existing methods, including BioEmu and AlphaFLOW. Compared with AlphaFLOW, DiffEnsemble improves the Pearson correlation coefficients for ensemble pairwise root-mean-square deviation and root-mean-square fluctuation by 28.9% and 11.3%, respectively. Importantly, DiffEnsemble successfully captures the dominant motions for 42% of the targets. These results demonstrate that latent dynamical information embedded in static structural data can effectively support the modeling of protein conformational ensembles.
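
The abstract reports Pearson correlation coefficients for root mean square fluctuation (RMSF) without spelling out the computation. As a rough illustration only, the sketch below computes per-residue RMSF for two toy ensembles and correlates them. The function names `rmsf` and `pearson`, the toy coordinates, and the assumption that all conformations are already superimposed on a common frame are illustrative choices, not the authors' benchmarking protocol.

```python
import math


def rmsf(ensemble):
    """Per-residue RMSF.

    ensemble[k][i] is the (x, y, z) position of residue i in conformation k.
    Assumes the conformations are already superimposed (hypothetical setup).
    """
    n_conf = len(ensemble)
    n_res = len(ensemble[0])
    result = []
    for i in range(n_res):
        coords = [conf[i] for conf in ensemble]
        # Mean position of residue i over all conformations.
        mean = [sum(c[d] for c in coords) / n_conf for d in range(3)]
        # Mean squared deviation from that mean position.
        msd = sum(
            sum((c[d] - mean[d]) ** 2 for d in range(3)) for c in coords
        ) / n_conf
        result.append(math.sqrt(msd))
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Toy data: two conformations of a three-residue chain (made-up coordinates).
reference = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    [(2.0, 0.0, 0.0), (5.0, 0.0, 0.0), (9.0, 4.0, 0.0)],
]
predicted = [
    [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (9.0, 0.0, 0.0)],
    [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (9.0, 2.0, 0.0)],
]

rmsf_correlation = pearson(rmsf(reference), rmsf(predicted))
```

In this toy case the predicted fluctuations are exactly half the reference ones, so the correlation is perfect even though the amplitudes differ. A correlation metric rewards matching the pattern of flexibility along the chain rather than its absolute scale.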

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Briefings in Bioinformatics: 326 papers in training set; Top 0.3%; 10.4%
2. Bioinformatics: 1061 papers in training set; Top 3%; 9.1%
3. Nature Communications: 4913 papers in training set; Top 24%; 8.2%
4. Cell Systems: 167 papers in training set; Top 2%; 6.8%
5. Nature Methods: 336 papers in training set; Top 2%; 4.8%
6. Nature Computational Science: 50 papers in training set; Top 0.1%; 4.8%
7. Nature Machine Intelligence: 61 papers in training set; Top 0.8%; 3.9%
8. PLOS Computational Biology: 1633 papers in training set; Top 10%; 3.6%

50% of probability mass above

9. Nature Biotechnology: 147 papers in training set; Top 3%; 3.6%
10. Communications Biology: 886 papers in training set; Top 4%; 2.6%
11. Proceedings of the National Academy of Sciences: 2130 papers in training set; Top 25%; 2.6%
12. Journal of Chemical Information and Modeling: 207 papers in training set; Top 2%; 2.3%
13. Nucleic Acids Research: 1128 papers in training set; Top 9%; 2.1%
14. Advanced Science: 249 papers in training set; Top 9%; 2.1%
15. Protein Science: 221 papers in training set; Top 0.9%; 1.7%
16. Computational and Structural Biotechnology Journal: 216 papers in training set; Top 5%; 1.7%
17. Cell Reports Methods: 141 papers in training set; Top 3%; 1.3%
18. Biophysical Journal: 545 papers in training set; Top 4%; 1.2%
19. Scientific Reports: 3102 papers in training set; Top 66%; 1.2%
20. Nano Letters: 63 papers in training set; Top 2%; 1.2%
21. eLife: 5422 papers in training set; Top 49%; 1.2%
22. Journal of Structural Biology: 58 papers in training set; Top 1%; 0.9%
23. Communications Chemistry: 39 papers in training set; Top 0.7%; 0.9%
24. Science: 429 papers in training set; Top 19%; 0.8%
25. PRX Life: 34 papers in training set; Top 0.8%; 0.8%
26. International Journal of Molecular Sciences: 453 papers in training set; Top 14%; 0.8%
27. Journal of Cheminformatics: 25 papers in training set; Top 0.6%; 0.7%
28. Genomics, Proteomics & Bioinformatics: 171 papers in training set; Top 6%; 0.7%
29. Genome Research: 409 papers in training set; Top 4%; 0.7%
30. BMC Bioinformatics: 383 papers in training set; Top 7%; 0.7%
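
The statement that the top 8 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below is only an arithmetic illustration: the function name `entries_to_reach` is hypothetical, and the inputs are the rounded percentages shown in the list, so the totals are approximate.

```python
from itertools import accumulate


def entries_to_reach(probabilities, threshold):
    """Smallest number of top-ranked entries whose running total reaches threshold.

    Returns None if the threshold is never reached.
    """
    for count, total in enumerate(accumulate(probabilities), start=1):
        if total >= threshold:
            return count
    return None


# Rounded predicted probabilities (in percent) for ranks 1 to 10, as listed.
top_probabilities = [10.4, 9.1, 8.2, 6.8, 4.8, 4.8, 3.9, 3.6, 3.6, 2.6]

cutoff_rank = entries_to_reach(top_probabilities, 50.0)
```

With these rounded values the running total is about 48.0% after seven journals and about 51.6% after eight, which matches the position of the 50% marker in the list.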