ProAR: Probabilistic Autoregressive Modeling for Molecular Dynamics

Cheng, K.; Liu, Y.; Nie, Z.; Lin, M.; Hou, Y.; Tao, Y.; Liu, C.; Chen, J.; Mao, Y.; Tian, Y.

2026-03-21 molecular biology
10.64898/2026.03.20.713063 bioRxiv
Understanding the structural dynamics of biomolecules is crucial for uncovering biological functions. As molecular dynamics (MD) simulation data becomes more available, deep generative models have been developed to synthesize realistic MD trajectories. However, existing methods produce fixed-length trajectories by jointly denoising high-dimensional spatiotemporal representations, which conflicts with MD's frame-by-frame integration process and fails to capture time-dependent conformational diversity. Inspired by MD's sequential nature, we introduce a new probabilistic autoregressive (ProAR) framework for trajectory generation. ProAR uses a dual-network system that models each frame as a multivariate Gaussian distribution and employs an anti-drifting sampling strategy to reduce cumulative errors. This approach captures conformational uncertainty and time-coupled structural changes while allowing flexible generation of trajectories of arbitrary length. Experiments on ATLAS, a large-scale protein MD dataset, demonstrate that for long trajectory generation, our model achieves a 7.5% reduction in reconstruction RMSE and an average 25.8% improvement in conformation change accuracy compared to previous state-of-the-art methods. On the conformation sampling task, it performs comparably to specialized time-independent models, providing a flexible and dependable alternative to standard MD simulations.

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1. Bioinformatics, 1061 papers in training set, Top 2%, 14.2%
2. Journal of Chemical Information and Modeling, 207 papers in training set, Top 0.8%, 6.7%
3. Nature Communications, 4913 papers in training set, Top 29%, 6.3%
4. Nature Methods, 336 papers in training set, Top 2%, 4.8%
5. Proceedings of the National Academy of Sciences, 2130 papers in training set, Top 14%, 4.8%
6. Communications Biology, 886 papers in training set, Top 1%, 3.9%
7. PLOS Computational Biology, 1633 papers in training set, Top 10%, 3.5%
8. Journal of Structural Biology, 58 papers in training set, Top 0.4%, 3.5%
9. Nature Computational Science, 50 papers in training set, Top 0.2%, 3.5%

50% of probability mass above

10. Briefings in Bioinformatics, 326 papers in training set, Top 2%, 3.0%
11. Nature Machine Intelligence, 61 papers in training set, Top 1%, 3.0%
12. PLOS ONE, 4510 papers in training set, Top 49%, 2.1%
13. Nucleic Acids Research, 1128 papers in training set, Top 9%, 2.1%
14. eLife, 5422 papers in training set, Top 38%, 1.9%
15. Scientific Reports, 3102 papers in training set, Top 54%, 1.9%
16. Journal of Chemical Theory and Computation, 126 papers in training set, Top 0.5%, 1.9%
17. Nature Biotechnology, 147 papers in training set, Top 4%, 1.8%
18. Computational and Structural Biotechnology Journal, 216 papers in training set, Top 5%, 1.7%
19. Biophysical Journal, 545 papers in training set, Top 3%, 1.5%
20. Frontiers in Molecular Biosciences, 100 papers in training set, Top 2%, 1.3%
21. Physical Review Research, 46 papers in training set, Top 0.5%, 1.2%
22. Genome Research, 409 papers in training set, Top 3%, 1.2%
23. Advanced Science, 249 papers in training set, Top 16%, 0.9%
24. Acta Crystallographica Section D Structural Biology, 54 papers in training set, Top 0.3%, 0.9%
25. BMC Bioinformatics, 383 papers in training set, Top 6%, 0.9%
26. Medical Image Analysis, 33 papers in training set, Top 0.9%, 0.9%
27. Journal of Computational Chemistry, 11 papers in training set, Top 0.2%, 0.7%
28. Cell Reports, 1338 papers in training set, Top 33%, 0.7%
29. Cell Reports Methods, 141 papers in training set, Top 5%, 0.7%
30. The Lancet Infectious Diseases, 71 papers in training set, Top 3%, 0.7%
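
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked from the listed percentages. The sketch below is only an illustration: the helper name `journals_to_reach` is hypothetical, and it assumes the displayed values are the (rounded) per-journal probabilities in rank order, so the running total is approximate.

```python
# Predicted probabilities (percent) for the 30 listed journals, in rank order.
# Assumption: these are the rounded values shown in the ranking above.
probs = [
    14.2, 6.7, 6.3, 4.8, 4.8, 3.9, 3.5, 3.5, 3.5, 3.0,
    3.0, 2.1, 2.1, 1.9, 1.9, 1.9, 1.8, 1.7, 1.5, 1.3,
    1.2, 1.2, 0.9, 0.9, 0.9, 0.9, 0.7, 0.7, 0.7, 0.7,
]


def journals_to_reach(probs, threshold):
    """Return (rank, cumulative mass) at the first rank whose running
    total meets or exceeds `threshold`; rank is None if never reached."""
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return rank, total
    return None, total


cutoff_rank, mass_at_cutoff = journals_to_reach(probs, 50.0)
print(cutoff_rank, round(mass_at_cutoff, 1))
```

With the listed values, the running total after rank 8 is still below 50%, and adding rank 9 pushes it past the threshold, which agrees with the placement of the "50% of probability mass above" marker.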