Detecting misfolded non-covalent lasso entanglements in protein structures, simulation trajectories, and mass spectrometry data

Sitarik, I.; Jiang, Y.; Song, H.; O'Brien, E. P.

2026-04-17 molecular biology
10.64898/2026.04.15.718775 bioRxiv

A previously overlooked class of protein entanglements, non-covalent lasso entanglements (NCLEs), has been found to play a role in widespread protein misfolding. However, understanding how NCLEs influence biological processes is hindered by the absence of dedicated algorithms and computational tools to detect and characterize these geometries in protein structures and molecular dynamics simulations, and to compare them with experimental data from limited proteolysis (LiP) and cross-linking (XL) mass spectrometry (MS). Here, we present EntDetect, a software tool designed to: (1) identify non-redundant NCLEs in protein structures, (2) detect misfolded states by identifying NCLE changes in pairwise comparisons of structures, (3) extract structural ensembles consistent with experimental signals from LiP-MS and XL-MS, and (4) investigate proteome-wide protein misfolding using high-throughput MS data. We demonstrate the utility of EntDetect on a simulated structural ensemble of phosphoglycerate kinase (PGK), alongside corresponding LiP-MS and XL-MS experimental data. Additionally, we detail how to apply EntDetect to a proteome-wide MS dataset to detect misfolding associated with native NCLEs and to select candidate proteins for further investigation. This protocol is intended for biophysicists, structural biologists, and molecular biologists who have domain knowledge of protein structure and mass spectrometry proteomics data, have beginner experience with Python, and want to interpret their experimental observations and computer simulation results in terms of the presence and potential misfolding of NCLE topologies. EntDetect is open-source and freely available (https://github.com/obrien-lab-psu/EntDetect). NCLEweb, a web server that identifies NCLEs in a user-uploaded structure, is also available (https://www.ncleweb.org/).

Matching journals

The top 5 journals together account for over 50% of the predicted probability mass (56.2% combined).

1. Journal of Proteome Research (215 papers in training set; Top 0.1%): 22.4%
2. Molecular & Cellular Proteomics (158 papers in training set; Top 0.2%): 12.3%
3. Journal of the American Society for Mass Spectrometry (33 papers in training set; Top 0.1%): 8.4%
4. Nature Communications (4913 papers in training set; Top 27%): 6.8%
5. Nature Methods (336 papers in training set; Top 2%): 6.3%
(50% of probability mass above this line)
6. PLOS Computational Biology (1633 papers in training set; Top 10%): 3.6%
7. PLOS ONE (4510 papers in training set; Top 48%): 2.1%
8. Journal of Molecular Biology (217 papers in training set; Top 1%): 1.9%
9. Protein Science (221 papers in training set; Top 0.8%): 1.8%
10. eLife (5422 papers in training set; Top 42%): 1.7%
11. Frontiers in Molecular Biosciences (100 papers in training set; Top 2%): 1.7%
12. Bioinformatics (1061 papers in training set; Top 7%): 1.7%
13. Communications Chemistry (39 papers in training set; Top 0.4%): 1.5%
14. Structure (175 papers in training set; Top 2%): 1.2%
15. Journal of Chemical Information and Modeling (207 papers in training set; Top 2%): 1.1%
16. Nucleic Acids Research (1128 papers in training set; Top 15%): 0.9%
17. International Journal of Molecular Sciences (453 papers in training set; Top 12%): 0.9%
18. Journal of Structural Biology (58 papers in training set; Top 1%): 0.9%
19. Communications Biology (886 papers in training set; Top 17%): 0.9%
20. PROTEOMICS (35 papers in training set; Top 0.6%): 0.9%
21. Journal of Structural Biology: X (15 papers in training set; Top 0.2%): 0.9%
22. Cell Systems (167 papers in training set; Top 11%): 0.9%
23. Computational and Structural Biotechnology Journal (216 papers in training set; Top 8%): 0.9%
24. Analytical Chemistry (205 papers in training set; Top 2%): 0.8%
25. Proceedings of the National Academy of Sciences (2130 papers in training set; Top 45%): 0.7%
26. Nature Machine Intelligence (61 papers in training set; Top 4%): 0.6%
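
The 50% cut-off marked in the ranked list can be checked with a running total of the listed probabilities. The snippet below is a minimal sketch of that arithmetic only: the helper name `rank_reaching` is hypothetical, the percentages are copied from the first six entries of the list, and nothing here reflects how the journal-matching model itself produces its scores.

```python
from itertools import accumulate

# (journal, predicted probability in percent), copied from the ranked list
predictions = [
    ("Journal of Proteome Research", 22.4),
    ("Molecular & Cellular Proteomics", 12.3),
    ("Journal of the American Society for Mass Spectrometry", 8.4),
    ("Nature Communications", 6.8),
    ("Nature Methods", 6.3),
    ("PLOS Computational Biology", 3.6),
]


def rank_reaching(threshold, rows):
    """Return (rank, cumulative %) for the first row whose running total
    reaches the threshold, or None if the threshold is never reached."""
    running = accumulate(prob for _, prob in rows)
    for rank, total in enumerate(running, start=1):
        if total >= threshold:
            return rank, round(total, 1)
    return None


cutoff = rank_reaching(50.0, predictions)
print(cutoff)
```

The first four entries sum to 49.9%, so the running total first reaches 50% at the fifth journal, which is where the list places its divider.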