
AI-first structural identification of pathogenic protein targets

Saluri, M.; Landreh, M.; Bryant, P.

2024-12-16 bioinformatics
10.1101/2024.12.12.628104 bioRxiv

The likelihood of pandemics is increasing as the world population grows and becomes more interconnected. Obtaining structural knowledge of protein-protein interactions between a pathogen and its host can inform pathogenic mechanisms and treatment or vaccine design. Currently, there are 52 nonredundant human-pathogen interactions with known structure in the PDB, although there are 21064 with experimental support in the HPIDB, meaning that only 0.2% of known interactions have known structure. Recent improvements in structure prediction of protein complexes based on AlphaFold have made it possible to model heterodimeric complexes with very high accuracy. However, it is not known how this translates to host-pathogen interactions, which share a different evolutionary relationship. Here, we analyse the structural protein-protein interaction network between ten different pathogens and their human host. We predict the structure of 9452 human-pathogen interactions, of which only 10 have known structure. We find that we can model 30 interactions with an expected TM-score of ≥0.9, expanding the structural knowledge in these networks three-fold. We select the high-scoring Francisella tularensis dihydroprolyl dehydrogenase (IPD) complex with human immunoglobulin kappa constant (IGKC) for detailed analysis with homology modeling and native mass spectrometry. Our results confirm the predicted 1:2:1 heterotetrameric complex, with potential implications for bacterial immune response evasion. We are entering a new era where structure prediction can be used to guide vaccine and drug development towards new pathogenic targets in very short time frames.
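The coverage figures quoted in the abstract can be reproduced with a few lines of arithmetic. The sketch below uses only the counts stated in the abstract; the function and variable names are illustrative, and reading "three-fold" as the ratio of 30 high-confidence models to 10 known structures is an assumption rather than something the abstract spells out.

```python
def percent_with_structure(n_structures: int, n_interactions: int) -> float:
    """Percentage of interactions that have a known structure."""
    return 100.0 * n_structures / n_interactions


# Counts as stated in the abstract.
pdb_structures = 52          # nonredundant human-pathogen structures in the PDB
hpidb_interactions = 21064   # interactions with experimental support in the HPIDB

coverage = percent_with_structure(pdb_structures, hpidb_interactions)

# Assumed reading of "three-fold": high-confidence models / known structures.
known_in_network = 10
high_confidence_models = 30
fold_expansion = high_confidence_models / known_in_network

print(f"Structural coverage: {coverage:.1f}%")
print(f"Fold expansion: {fold_expansion:.0f}x")
```

Rounded to one decimal place, the coverage matches the 0.2% given in the abstract.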

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology; 1633 papers in training set; Top 0.8%; 22.3%
2. Proteins: Structure, Function, and Bioinformatics; 82 papers in training set; Top 0.1%; 14.2%
3. Structure; 175 papers in training set; Top 0.3%; 7.1%
4. Journal of Chemical Information and Modeling; 207 papers in training set; Top 1%; 4.8%
5. Nature Communications; 4913 papers in training set; Top 38%; 3.8%
50% of probability mass above
6. eLife; 5422 papers in training set; Top 26%; 3.6%
7. Communications Biology; 886 papers in training set; Top 3%; 3.0%
8. Scientific Reports; 3102 papers in training set; Top 44%; 2.7%
9. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 26%; 2.3%
10. Protein Science; 221 papers in training set; Top 0.7%; 2.1%
11. Nucleic Acids Research; 1128 papers in training set; Top 9%; 1.9%
12. PLOS ONE; 4510 papers in training set; Top 52%; 1.8%
13. Journal of Molecular Biology; 217 papers in training set; Top 2%; 1.7%
14. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 5%; 1.7%
15. iScience; 1063 papers in training set; Top 16%; 1.6%
16. Cell Systems; 167 papers in training set; Top 8%; 1.5%
17. Frontiers in Immunology; 586 papers in training set; Top 6%; 1.2%
18. Bioinformatics; 1061 papers in training set; Top 8%; 1.2%
19. Biophysical Journal; 545 papers in training set; Top 4%; 1.1%
20. Journal of Structural Biology; 58 papers in training set; Top 1%; 1.1%
21. PLOS Pathogens; 721 papers in training set; Top 8%; 0.9%
22. Advanced Science; 249 papers in training set; Top 18%; 0.8%
23. Patterns; 70 papers in training set; Top 3%; 0.7%
24. Chemical Science; 71 papers in training set; Top 2%; 0.7%
25. Nature Methods; 336 papers in training set; Top 6%; 0.7%
26. Science Advances; 1098 papers in training set; Top 31%; 0.7%
27. Nature Machine Intelligence; 61 papers in training set; Top 4%; 0.7%
28. Frontiers in Molecular Biosciences; 100 papers in training set; Top 6%; 0.6%
29. Molecular Biology and Evolution; 488 papers in training set; Top 5%; 0.6%
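
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch using the first ten percentages from the list; the helper name `journals_needed` is hypothetical and is not part of the page's own tooling.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for ranks 1-10, copied from the list.
probabilities = [22.3, 14.2, 7.1, 4.8, 3.8, 3.6, 3.0, 2.7, 2.3, 2.1]


def journals_needed(probs, threshold):
    """Return how many top-ranked entries are needed for the running
    total to reach `threshold`, or None if it is never reached."""
    for rank, running_total in enumerate(accumulate(probs), start=1):
        if running_total >= threshold:
            return rank
    return None


top_n = journals_needed(probabilities, 50.0)
print(f"Journals needed to reach 50%: {top_n}")
```

The running total after four journals is about 48.4% and after five about 52.2%, so the 50% marker falls between ranks 5 and 6, as the list shows.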