
Experimental Data Driven AI Framework for Flexible Protein Conformational Reconstruction

Yu, F.; Prince, S.; Tritt, A.; Pande, K.; Hura, G. L.; Ruebel, O.; Tsutakawa, S. E.

2026-03-14 biophysics
10.64898/2026.03.12.708611 bioRxiv

Deep learning has revolutionized structural biology by predicting static protein folds from amino acid sequence alone with near-experimental accuracy. However, proteins function as dynamic ensembles of conformational states, and current sequence-only models often fail to capture the specific states and heterogeneity dictated by cellular environments or ligand binding. While recent generative models can sample broad conformational landscapes, they remain unconstrained by physical reality, often hallucinating plausible but experimentally invalid states. Here, we present AlphaSAXS, an end-to-end framework that constrains artificial intelligence (AI) inference using experimental Small Angle X-ray Scattering (SAXS) data collected in solution. By integrating real-space pair distance distributions (P(r)) directly into the AlphaFold architecture, AlphaSAXS steers the structural hypothesis toward the experimentally observed structures. We demonstrate that AlphaSAXS resolves documented failure modes of sequence-only models in apo-holo transitions, successfully distinguishing between states with identical sequences but distinct scattering profiles. Furthermore, we introduce a hybrid inference protocol that couples deep learning with biophysical hydration modeling, enabling the reconstruction of solution-state protein ensembles compatible with experimental data. This work establishes a paradigm for experimentally guided AI, bridging the gap between probabilistic sampling and biophysical measurement.
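To make the P(r) quantity in the abstract concrete, here is a minimal sketch of a pair distance distribution computed as an unweighted histogram of pairwise distances between points. It is an illustration only, not the AlphaSAXS method: experimental P(r) is typically derived from the measured scattering curve and reflects scattering contrast, whereas this toy version uses bare geometry. The function name `pair_distance_distribution`, the `bin_width` parameter, and both coordinate sets are hypothetical.

```python
import math
from itertools import combinations


def pair_distance_distribution(coords, bin_width=1.0):
    """Return bin centers and a normalized histogram of pairwise distances.

    Simplifying assumption: every point contributes equally (no
    scattering-contrast weighting, no hydration layer).
    """
    if len(coords) < 2:
        raise ValueError("need at least two points")
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")

    # All unique pairwise Euclidean distances.
    distances = [math.dist(a, b) for a, b in combinations(coords, 2)]

    # Enough bins to hold the largest distance.
    n_bins = int(max(distances) // bin_width) + 1
    counts = [0] * n_bins
    for d in distances:
        counts[int(d // bin_width)] += 1

    total = len(distances)
    r = [(i + 0.5) * bin_width for i in range(n_bins)]
    p = [c / total for c in counts]  # fractions sum to 1
    return r, p


# Hypothetical toy data: the same four points in a compact square
# versus an extended line, standing in for two conformational states.
compact = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0), (1.5, 1.5, 0.0)]
extended = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]

r_c, p_c = pair_distance_distribution(compact)
r_e, p_e = pair_distance_distribution(extended)
```

The two arrangements contain the same number of points but give different histograms: the extended one spreads to longer distances. That is the sense in which states with identical sequences can still have distinct distance distributions.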

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

1. Proceedings of the National Academy of Sciences, 2130 papers in training set, Top 3%, 14.0%
2. Science, 429 papers in training set, Top 3%, 9.8%
3. Nature Methods, 336 papers in training set, Top 1%, 8.2%
4. Nature Communications, 4913 papers in training set, Top 24%, 8.2%
5. IUCrJ, 29 papers in training set, Top 0.1%, 6.2%
6. Cell Systems, 167 papers in training set, Top 3%, 4.7%

50% of probability mass above

7. Nature Biotechnology, 147 papers in training set, Top 3%, 3.6%
8. eLife, 5422 papers in training set, Top 27%, 3.5%
9. Structure, 175 papers in training set, Top 0.9%, 3.5%
10. Nature Computational Science, 50 papers in training set, Top 0.3%, 2.8%
11. Nature, 575 papers in training set, Top 8%, 2.7%
12. PLOS Computational Biology, 1633 papers in training set, Top 14%, 2.0%
13. Nucleic Acids Research, 1128 papers in training set, Top 9%, 2.0%
14. Science Advances, 1098 papers in training set, Top 19%, 1.6%
15. Journal of Structural Biology, 58 papers in training set, Top 0.8%, 1.6%
16. Biophysical Journal, 545 papers in training set, Top 3%, 1.6%
17. Scientific Reports, 3102 papers in training set, Top 65%, 1.3%
18. Advanced Science, 249 papers in training set, Top 15%, 1.2%
19. Neuron, 282 papers in training set, Top 7%, 1.2%
20. ACS Nano, 99 papers in training set, Top 3%, 1.2%
21. Protein Science, 221 papers in training set, Top 1%, 0.9%
22. Journal of the American Chemical Society, 199 papers in training set, Top 4%, 0.9%
23. Nature Structural & Molecular Biology, 218 papers in training set, Top 4%, 0.9%
24. The Journal of Physical Chemistry Letters, 58 papers in training set, Top 2%, 0.7%
25. PLOS ONE, 4510 papers in training set, Top 69%, 0.7%
26. Communications Biology, 886 papers in training set, Top 26%, 0.7%
27. Cell, 370 papers in training set, Top 18%, 0.7%
28. Journal of Chemical Information and Modeling, 207 papers in training set, Top 3%, 0.6%
29. Chemical Science, 71 papers in training set, Top 3%, 0.6%
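The statement that the top 6 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below is only an arithmetic check on the displayed (rounded) values; the helper name `journals_to_reach` is hypothetical and is not part of the site's prediction model.

```python
def journals_to_reach(probabilities, threshold=50.0):
    """Return (rank, cumulative mass) at which the running sum first
    meets the threshold, or (None, total) if it never does."""
    running = 0.0
    for rank, p in enumerate(probabilities, start=1):
        running += p
        if running >= threshold:
            return rank, running
    return None, running


# Predicted probabilities (%) for the ten highest-ranked journals, as listed.
top_probabilities = [14.0, 9.8, 8.2, 8.2, 6.2, 4.7, 3.6, 3.5, 3.5, 2.8]

rank, mass = journals_to_reach(top_probabilities)
print(rank, round(mass, 1))  # 6 51.1
```

The first five entries sum to 46.4%, so the sixth journal is the one that carries the total past the 50% mark.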