Improving Protein Structure Prediction Using Integrative Cryo-EM and Ion Mobility Mass Spectrometry Modeling

Howard, J. B.; Narayanasamy, A.; Lindert, S.

2026-02-10 biochemistry
10.64898/2026.02.07.704481 bioRxiv

Proteins perform essential roles across nearly all cellular processes, and accurate three-dimensional structures remain critical for elucidating structure-function relationships and for drug discovery studies. Cryo-electron microscopy (cryo-EM), X-ray crystallography, and nuclear magnetic resonance can provide detailed structural information. However, for many proteins, structural information is available only as lower-resolution or sparse experimental data, such as low-resolution cryo-EM density maps, which are more difficult to translate into accurate atomic coordinates. In parallel, mass spectrometry-based methods, including ion mobility mass spectrometry (IM-MS), offer rapid, broadly applicable structural descriptors such as collisional cross section (CCS), a global measure of molecular shape and size, but CCS values likewise do not provide atomistic detail. Here we present CRIM (cryo-EM + IM-MS), an integrative Rosetta scoring function that combines low-resolution cryo-EM density information with IM-MS-derived CCS as restraints to improve monomeric protein structure prediction. CRIM incorporates the Rosetta REF2015 (RS) energy with a CCS agreement penalty (computed via PARCS) and an electron-density agreement term (elec_dens_fast). We tested CRIM on an ideal dataset of 60 monomeric proteins using simulated CCS values and density maps. Across the ideal dataset, the CRIM score function improved or maintained prediction quality for many targets, reducing the mean RMSD from 3.65 Å (RS) to 2.90 Å and increasing the mean TM-score from 0.88 to 0.90. Furthermore, an experimental benchmark dataset of 54 proteins was curated to include either experimental cryo-EM maps or published CCS values. On the experimental dataset, CRIM similarly improved model selection, lowering the mean RMSD from 6.65 Å to 4.38 Å and raising the mean TM-score from 0.73 to 0.79.
In comparison to AlphaFold3 predictions, CRIM frequently yielded competitive predictions and was able to substantially outperform AlphaFold3 for select difficult targets where sparse experimental restraints provide strong discriminatory power. The CRIM score function is freely available within the Rosetta software suite and provides a practical framework for leveraging complementary IM-MS and cryo-EM data to improve monomeric protein structure prediction.
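
The abstract describes CRIM as a combination of three terms: a Rosetta energy, a CCS agreement penalty, and a density agreement term. The sketch below illustrates how such a combined score could rank candidate models. It is a minimal illustration under stated assumptions, not the authors' implementation: the weights `w_ccs` and `w_density`, the absolute-deviation form of the penalty, the `Model` container, and all numeric values are hypothetical, and the abstract does not specify how CRIM actually weights or shapes its terms.

```python
from dataclasses import dataclass


@dataclass
class Model:
    """Hypothetical container for precomputed per-model terms."""
    name: str
    rosetta_energy: float  # e.g. a REF2015 total score; lower is better
    predicted_ccs: float   # predicted CCS in square angstroms
    density_score: float   # density agreement term; lower is better here


def ccs_penalty(predicted_ccs: float, experimental_ccs: float) -> float:
    # Assumed penalty form: absolute deviation from the experimental CCS.
    return abs(predicted_ccs - experimental_ccs)


def combined_score(model: Model, experimental_ccs: float,
                   w_ccs: float = 1.0, w_density: float = 1.0) -> float:
    # Weighted sum of the three terms; the weights are illustrative assumptions.
    return (model.rosetta_energy
            + w_ccs * ccs_penalty(model.predicted_ccs, experimental_ccs)
            + w_density * model.density_score)


def select_best(models, experimental_ccs: float, **weights) -> Model:
    # The lowest combined score is treated as the best model.
    return min(models, key=lambda m: combined_score(m, experimental_ccs, **weights))


# Made-up numbers, purely for illustration.
decoys = [
    Model("decoy_a", rosetta_energy=-200.0, predicted_ccs=1500.0, density_score=-50.0),
    Model("decoy_b", rosetta_energy=-210.0, predicted_ccs=1700.0, density_score=-20.0),
]
best = select_best(decoys, experimental_ccs=1480.0)
print(best.name)
```

In this toy case the Rosetta energy alone would favor `decoy_b`, while the combined score favors `decoy_a` because its CCS and density terms agree better with the assumed experimental data. That is the kind of re-ranking the abstract refers to as improved model selection.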

Matching journals

The top 3 journals account for 50% of the predicted probability mass.

1 | Structure | 175 papers in training set | Top 0.1% | 37.4%
2 | Journal of Structural Biology | 58 papers in training set | Top 0.2% | 6.3%
3 | Protein Science | 221 papers in training set | Top 0.2% | 6.3%
50% of probability mass above
4 | Nature Communications | 4913 papers in training set | Top 29% | 6.3%
5 | Nature Methods | 336 papers in training set | Top 3% | 3.6%
6 | Communications Biology | 886 papers in training set | Top 3% | 3.0%
7 | IUCrJ | 29 papers in training set | Top 0.1% | 3.0%
8 | Bioinformatics | 1061 papers in training set | Top 6% | 2.7%
9 | eLife | 5422 papers in training set | Top 34% | 2.3%
10 | Acta Crystallographica Section D Structural Biology | 54 papers in training set | Top 0.2% | 2.1%
11 | Cell Reports Methods | 141 papers in training set | Top 2% | 1.7%
12 | Journal of Structural Biology: X | 15 papers in training set | Top 0.1% | 1.7%
13 | Communications Chemistry | 39 papers in training set | Top 0.3% | 1.6%
14 | Science | 429 papers in training set | Top 14% | 1.6%
15 | Proceedings of the National Academy of Sciences | 2130 papers in training set | Top 33% | 1.6%
16 | Journal of Molecular Biology | 217 papers in training set | Top 2% | 1.2%
17 | PLOS Computational Biology | 1633 papers in training set | Top 20% | 1.1%
18 | Journal of Chemical Information and Modeling | 207 papers in training set | Top 3% | 0.9%
19 | Cell Research | 49 papers in training set | Top 2% | 0.8%
20 | Advanced Science | 249 papers in training set | Top 19% | 0.7%
21 | Briefings in Bioinformatics | 326 papers in training set | Top 7% | 0.7%
22 | Computational and Structural Biotechnology Journal | 216 papers in training set | Top 9% | 0.7%
23 | Nature Computational Science | 50 papers in training set | Top 2% | 0.7%
24 | Scientific Data | 174 papers in training set | Top 3% | 0.6%
25 | Cell Systems | 167 papers in training set | Top 14% | 0.6%
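
The statement that the top 3 journals account for 50% of the predicted probability mass can be checked from the listed percentages. The short sketch below does this with a running sum. The function name `journals_needed` is hypothetical, and because the listed percentages are rounded to one decimal place, the result only approximates the underlying model output.

```python
from itertools import accumulate

# Listed probabilities for the five top-ranked journals, in tenths of a percent
# (37.4% -> 374), so the running sum uses exact integer arithmetic.
listed_tenths = [374, 63, 63, 63, 36]


def journals_needed(probabilities_tenths, target_tenths):
    """Return how many top-ranked entries are needed to reach the target mass."""
    for count, running_total in enumerate(accumulate(probabilities_tenths), start=1):
        if running_total >= target_tenths:
            return count
    return None  # target not reached by the entries provided


top_k = journals_needed(listed_tenths, 500)  # 500 tenths = 50%
print(top_k)
```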