
Decoupling Topology from Geometry: Detecting Large-Scale Conformational Changes via Conformational Scanning

Lin, R.; Ahnert, S. E.

2026-03-31 bioinformatics
10.64898/2026.03.28.714756 bioRxiv

Protein function is fundamentally driven by structural dynamics, yet the majority of structural bioinformatics treats proteins as static rigid bodies. While Molecular Dynamics (MD) simulations attempt to capture these motions, they are computationally prohibitive for exploring large-scale conformational changes, such as domain movements or allostery, which occur on timescales often inaccessible to standard simulation. However, the Protein Data Bank (PDB) contains a latent wealth of dynamic information in the form of redundant entries: proteins solved in multiple distinct conformational states. Detecting these "shape-shifting" pairs remains challenging because standard structural alignment algorithms (e.g., TM-align) rely on rigid-body superposition, which fails when substantial geometric rearrangement occurs. In this study, we introduce a high-throughput method to systematically mine the PDB for proteins that share identical topology but exhibit divergent tertiary conformations. By utilizing a coarse-grained Secondary Structure Element (SSE) representation, we decouple topological connectivity from geometric rigidity, allowing for the detection of conformational homologues that share low global structural similarity despite high predicted structural similarity. We applied this "conformational scanning" across the entire RCSB database, identifying a curated dataset of proteins undergoing significant structural rearrangements. This work bridges the gap between static structural data and dynamic function, providing a critical "ground truth" dataset for benchmarking data-driven protein design and checking the plausibility of generative structure models.
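
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that each structure comes with a per-residue DSSP-style secondary structure string and that a rigid-body similarity score (for example a TM-score) has already been computed elsewhere. The function names, the `min_len` filter, and the `max_similarity` threshold of 0.5 are all hypothetical, and the sketch compares only the order of SSEs, not their spatial connectivity.

```python
from itertools import groupby


def sse_topology(dssp: str, min_len: int = 3) -> str:
    """Collapse a per-residue secondary structure string into an SSE string.

    Assumption: DSSP 8-state codes, with H/G/I treated as helix,
    E/B as strand, and everything else as coil.
    """
    reduced = "".join(
        "H" if c in "HGI" else "E" if c in "EB" else "-" for c in dssp
    )
    elements = []
    for state, run in groupby(reduced):
        length = len(list(run))
        # Keep only helices and strands that are long enough (hypothetical filter).
        if state != "-" and length >= min_len:
            elements.append(state)
    return "".join(elements)


def is_conformational_candidate(
    dssp_a: str,
    dssp_b: str,
    global_similarity: float,
    max_similarity: float = 0.5,  # hypothetical cutoff, not from the paper
) -> bool:
    """Flag a pair with identical SSE order but low rigid-body similarity."""
    same_topology = sse_topology(dssp_a) == sse_topology(dssp_b)
    return same_topology and global_similarity < max_similarity


# Two made-up strings with the same helix-strand-strand order.
state_a = "--HHHHHH---EEEE--EEEE-"
state_b = "--HHHHHH-EEEE-----EEEE"
flagged = is_conformational_candidate(state_a, state_b, global_similarity=0.35)
```

Here `flagged` is true because both strings reduce to the same SSE order while the supplied similarity score falls below the assumed cutoff. Because the SSE string ignores coordinates, a hinge motion that leaves secondary structure intact does not change it, which is the sense in which topology is separated from geometry.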

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

| Rank | Journal | Papers in training set | Percentile | Predicted probability |
|---|---|---|---|---|
| 1 | Bioinformatics | 1061 | Top 2% | 18.2% |
| 2 | Cell Systems | 167 | Top 1% | 9.9% |
| 3 | PLOS Computational Biology | 1633 | Top 5% | 6.7% |
| 4 | Protein Science | 221 | Top 0.2% | 6.2% |
| 5 | Bioinformatics Advances | 184 | Top 0.8% | 4.7% |
| 6 | Nature Communications | 4913 | Top 37% | 3.9% |
| 7 | Journal of Molecular Biology | 217 | Top 0.7% | 3.5% |
| | 50% of probability mass above | | | |
| 8 | Journal of Chemical Information and Modeling | 207 | Top 1% | 3.5% |
| 9 | Scientific Reports | 3102 | Top 39% | 3.5% |
| 10 | Structure | 175 | Top 0.9% | 3.5% |
| 11 | Proceedings of the National Academy of Sciences | 2130 | Top 26% | 2.3% |
| 12 | Computational and Structural Biotechnology Journal | 216 | Top 3% | 2.0% |
| 13 | BMC Bioinformatics | 383 | Top 4% | 1.8% |
| 14 | Proteins: Structure, Function, and Bioinformatics | 82 | Top 0.4% | 1.7% |
| 15 | Journal of Cheminformatics | 25 | Top 0.3% | 1.7% |
| 16 | Nature Computational Science | 50 | Top 0.7% | 1.7% |
| 17 | Molecular Biology and Evolution | 488 | Top 3% | 1.7% |
| 18 | NAR Genomics and Bioinformatics | 214 | Top 2% | 1.5% |
| 19 | eLife | 5422 | Top 46% | 1.5% |
| 20 | PLOS ONE | 4510 | Top 59% | 1.3% |
| 21 | Journal of Structural Biology | 58 | Top 1% | 1.2% |
| 22 | Nature Methods | 336 | Top 5% | 1.2% |
| 23 | Nature Biotechnology | 147 | Top 6% | 1.2% |
| 24 | Cell Reports Methods | 141 | Top 4% | 0.9% |
| 25 | Acta Crystallographica Section D Structural Biology | 54 | Top 0.3% | 0.9% |
| 26 | Nucleic Acids Research | 1128 | Top 15% | 0.9% |
| 27 | Communications Biology | 886 | Top 20% | 0.9% |
| 28 | iScience | 1063 | Top 30% | 0.8% |
| 29 | Frontiers in Molecular Biosciences | 100 | Top 4% | 0.8% |
| 30 | Briefings in Bioinformatics | 326 | Top 6% | 0.8% |
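
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The short sketch below does that for the first eight entries; the function name `journals_to_reach` is hypothetical, and the values are copied from the list above.

```python
def journals_to_reach(probabilities, target=50.0):
    """Return (count, cumulative mass) at the first rank where the
    running total of percentages reaches the target."""
    total = 0.0
    for count, p in enumerate(probabilities, start=1):
        total += p
        if total >= target:
            return count, total
    # Target never reached with the values supplied.
    return None, total


# Predicted probabilities (%) for ranks 1 to 8, as listed above.
top_ranked = [18.2, 9.9, 6.7, 6.2, 4.7, 3.9, 3.5, 3.5]
count, mass = journals_to_reach(top_ranked)
```

With these values the running total is about 49.6% after six journals and about 53.1% after seven, so the seventh journal is the first to push the cumulative mass past the 50% mark.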