Sifting Through the Noise: A Computational Pipeline for Accurate Prioritization of Protein-Protein Binding Candidates in High-Throughput Protein Libraries

Mondal, A.; Singh, B.; Felkner, R. H.; De Falco, A.; Swapna, G.; Montelione, G. T.; Roth, M.; Perez, A.

2024-01-23 biophysics
10.1101/2024.01.20.576374 bioRxiv

Identifying the interactome for a protein of interest is challenging due to the large number of possible binders. High-throughput experimental approaches narrow down the possible binding partners, but often include false positives. Furthermore, they provide no information about the binding region (e.g., the binding epitope). We introduce a novel computational pipeline based on an AlphaFold2 (AF) Competition Assay (AF-CBA) to identify proteins that bind a target of interest from a pull-down experiment, along with the binding epitope. Our focus is on proteins that bind the Extraterminal (ET) domain of Bromo and Extraterminal domain (BET) proteins, but we also introduce nine additional systems to show transferability to other peptide-protein systems. We describe a series of limitations of the methodology, based on intrinsic deficiencies of AF and AF-CBA, to help users identify scenarios where the approach will be most useful. Given the speed and accuracy of the methodology, we expect it to be generally applicable to facilitate target selection for experimental verification starting from high-throughput protein libraries.

[Table of Contents graphic: Figure 1]

Matching journals

The top 4 journals account for 50% of the predicted probability mass.

1. Protein Science: 221 papers in training set, Top 0.1%, 22.3%
2. PLOS ONE: 4510 papers in training set, Top 18%, 10.3%
3. Bioinformatics Advances: 184 papers in training set, Top 0.1%, 10.0%
4. Journal of Molecular Biology: 217 papers in training set, Top 0.1%, 10.0%

50% of probability mass above

5. Bioinformatics: 1061 papers in training set, Top 5%, 3.9%
6. Methods: 29 papers in training set, Top 0.1%, 3.6%
7. PLOS Computational Biology: 1633 papers in training set, Top 14%, 2.1%
8. Nucleic Acids Research: 1128 papers in training set, Top 9%, 1.9%
9. Proteins: Structure, Function, and Bioinformatics: 82 papers in training set, Top 0.4%, 1.8%
10. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 4%, 1.8%
11. Scientific Reports: 3102 papers in training set, Top 59%, 1.7%
12. Physical Biology: 43 papers in training set, Top 1%, 1.7%
13. Cell Reports Methods: 141 papers in training set, Top 3%, 1.6%
14. BMC Bioinformatics: 383 papers in training set, Top 5%, 1.3%
15. Frontiers in Bioinformatics: 45 papers in training set, Top 0.4%, 1.3%
16. iScience: 1063 papers in training set, Top 22%, 1.2%
17. Frontiers in Molecular Biosciences: 100 papers in training set, Top 3%, 1.1%
18. ACS Omega: 90 papers in training set, Top 3%, 0.9%
19. Journal of Chemical Information and Modeling: 207 papers in training set, Top 3%, 0.9%
20. Structure: 175 papers in training set, Top 3%, 0.9%
21. European Biophysics Journal: 11 papers in training set, Top 0.2%, 0.9%
22. Biophysical Journal: 545 papers in training set, Top 5%, 0.8%
23. Nature Communications: 4913 papers in training set, Top 63%, 0.7%
24. Journal of Proteome Research: 215 papers in training set, Top 2%, 0.7%
25. IUCrJ: 29 papers in training set, Top 0.4%, 0.7%
26. Biochemistry and Biophysics Reports: 28 papers in training set, Top 2%, 0.6%
27. Communications Biology: 886 papers in training set, Top 29%, 0.6%
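
The statement that the top 4 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed probabilities. The sketch below is illustrative only: the function name `journals_to_reach` is hypothetical, and it assumes the percentages are taken in ranked order exactly as listed; it says nothing about how the probabilities themselves were produced.

```python
from itertools import accumulate

def journals_to_reach(probabilities, threshold=50.0):
    """Return how many top-ranked journals are needed for the cumulative
    probability (in percent) to reach the threshold, or None if it never does."""
    for count, running_total in enumerate(accumulate(probabilities), start=1):
        if running_total >= threshold:
            return count
    return None

# Probabilities (percent) of the six highest-ranked journals, copied from the list.
top_probabilities = [22.3, 10.3, 10.0, 10.0, 3.9, 3.6]

print(journals_to_reach(top_probabilities))
```

With these values the running sum first reaches the 50% threshold at the fourth journal, consistent with where the "50% of probability mass above" marker sits.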