On the applicability domain of HADDOCK3 for protein-aptamer docking: documented failure modes from a 5x7 cross-target screening matrix and a 1676 aa receptor case study (P01031)

Dohi, E.

2026-05-12 bioinformatics
10.64898/2026.05.11.724398 bioRxiv
We screened a 5 receptor x 7 aptamer = 35-cell cross-target matrix with HADDOCK3 [1] under blind ambiguous-interaction-restraint (AIR) protocols on AlphaFold-modelled receptors. The screen surfaced 12 operationally distinct failure modes, which collapse to ~8 conceptual classes (§3.1). The K_D-calibration subset comprises n = 4 cells with literature K_D records under matched assay conditions; the broader cohort includes ≥ 6 biological cognate or intended-cognate cells. The principal case study is P01031 (complement C5, 1676 aa, ≥ 12 structural domains): all 7 panel members produced positive HADDOCK3 top-1 scores under a scale-adaptive AIR. Score-term decomposition locates the anomaly in the AIR term, which contributes +217 to +268 to the top-1 score. With the AIR term zeroed, scores fall to between -131 and -74, the small-receptor regime. Boltz-2 cofolding chain-pair ipTM (cpi_AB) provides an independent channel: P01031 shows the lowest median cpi_AB (0.211; 0/7 above the 0.5 confident-interface threshold). To our knowledge, this is the first reported case study of a 1676 aa multi-domain receptor exhibiting this signature under blind scale-adaptive AIR; it is an n = 1 mechanistic case, not a statistical generalisation. We adapt the QSAR applicability domain concept [14-16] to in silico aptamer screening. §3.7 reports an empirical Mode 1 mitigation (a pLDDT-aware AIR prefilter; cohort Jaccard recovery ~10x).
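The score-term decomposition mentioned in the abstract can be illustrated with a minimal sketch. It assumes the docking score is a weighted sum of van der Waals, electrostatic, desolvation and AIR energies. The weights below follow commonly documented HADDOCK refinement-stage defaults but are an assumption here, as are the function names (`decompose`, `score`) and the energy values, none of which come from the paper.

```python
# Assumed term weights; the settings actually used in the paper are not
# stated in the abstract.
WEIGHTS = {"vdw": 1.0, "elec": 0.2, "desolv": 1.0, "air": 0.1}


def decompose(energies, weights=WEIGHTS):
    """Return each term's weighted contribution to the total score."""
    return {term: weights[term] * energies[term] for term in weights}


def score(energies, weights=WEIGHTS, zero_air=False):
    """Sum the weighted contributions, optionally dropping the AIR term."""
    contributions = decompose(energies, weights)
    if zero_air:
        contributions["air"] = 0.0
    return sum(contributions.values())


# Illustrative raw energies only (hypothetical, not values from the paper).
example = {"vdw": -60.0, "elec": -250.0, "desolv": 5.0, "air": 2400.0}

if __name__ == "__main__":
    for term, value in decompose(example).items():
        print(f"{term:>7}: {value:+.1f}")
    print(f"total score:     {score(example):+.1f}")
    print(f"with AIR zeroed: {score(example, zero_air=True):+.1f}")
```

With these hypothetical inputs the AIR contribution alone makes the total positive, and zeroing it returns the score to negative values, which is the qualitative pattern the abstract describes for P01031.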

Matching journals

The top 5 journals account for 50% of the predicted probability mass.

1. Journal of Chemical Information and Modeling; 207 papers in training set; Top 0.2%; 23.0%
2. Bioinformatics; 1061 papers in training set; Top 2%; 12.6%
3. Journal of Cheminformatics; 25 papers in training set; Top 0.1%; 7.0%
4. International Journal of Molecular Sciences; 453 papers in training set; Top 2%; 4.0%
5. Bioinformatics Advances; 184 papers in training set; Top 1.0%; 4.0%

50% of probability mass above

6. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 1%; 3.7%
7. Nature Methods; 336 papers in training set; Top 3%; 2.8%
8. Chemical Science; 71 papers in training set; Top 0.5%; 2.7%
9. Nature Communications; 4913 papers in training set; Top 47%; 2.1%
10. Proceedings of the National Academy of Sciences; 2130 papers in training set; Top 27%; 2.1%
11. eLife; 5422 papers in training set; Top 37%; 1.9%
12. Scientific Reports; 3102 papers in training set; Top 52%; 1.9%
13. Journal of Molecular Biology; 217 papers in training set; Top 1%; 1.9%
14. Protein Science; 221 papers in training set; Top 0.8%; 1.7%
15. PLOS ONE; 4510 papers in training set; Top 53%; 1.7%
16. BMC Bioinformatics; 383 papers in training set; Top 5%; 1.4%
17. Artificial Intelligence in the Life Sciences; 11 papers in training set; Top 0.1%; 1.4%
18. Briefings in Bioinformatics; 326 papers in training set; Top 5%; 1.4%
19. Proteins: Structure, Function, and Bioinformatics; 82 papers in training set; Top 0.6%; 1.2%
20. Cell Systems; 167 papers in training set; Top 9%; 1.2%
21. Communications Chemistry; 39 papers in training set; Top 0.5%; 1.2%
22. Molecules; 37 papers in training set; Top 1%; 1.2%
23. PLOS Computational Biology; 1633 papers in training set; Top 20%; 1.1%
24. Journal of Chemical Theory and Computation; 126 papers in training set; Top 0.7%; 0.9%
25. Nucleic Acids Research; 1128 papers in training set; Top 15%; 0.9%
26. Frontiers in Molecular Biosciences; 100 papers in training set; Top 5%; 0.7%
27. Patterns; 70 papers in training set; Top 3%; 0.7%
28. Biomolecules; 95 papers in training set; Top 3%; 0.7%
29. Communications Biology; 886 papers in training set; Top 28%; 0.7%
30. Advanced Science; 249 papers in training set; Top 23%; 0.5%
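
The statement that the top 5 journals account for 50% of the predicted probability mass can be checked with a running total. The sketch below uses the five highest probabilities listed above; the function name `rank_reaching` is hypothetical and not part of the listing service.

```python
from itertools import accumulate

# Predicted probabilities (%) of the five top-ranked journals, as listed above.
top_probs = [23.0, 12.6, 7.0, 4.0, 4.0]


def rank_reaching(probs, threshold):
    """Return the 1-based rank at which the running total first reaches
    the threshold, or None if it is never reached."""
    for rank, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return rank
    return None


if __name__ == "__main__":
    print(rank_reaching(top_probs, 50.0))
```

The running total passes 50% only once the fifth journal is included, which matches the position of the "50% of probability mass above" marker.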