IMMREP25: Unseen Peptides

Richardson, E.; Aarts, Y. J. M.; Altin, J. A.; Baakman, C. A. B.; Bradley, P.; Chen, B.; Clifford, J.; Dhar, M.; Diepenbroek, D.; Fast, E.; Gowthaman, R.; He, J.; Karnaukhov, V.; Marzella, D. F.; Meysman, P.; Nielsen, M.; Nilsson, J. B.; Deleuran, S. N.; Parizi, F. M.; Pelissier, A.; Pierce, B. G.; Rodriguez Martinez, M.; Roran A R, D.; Saravanakumar, S.; Shao, Y.; Smit, N.; Van Houcke, M.; Visani, G. M.; Wan, Y.-T. R.; Wang, X.; Woods, L.; Wuyts, S.; Xiao, C.; Xue, L. C.; IMMREP25 Participant Consortium; Barton, J.; Noakes, M.; May, D. H.; Peters, B.

2026-04-01 bioinformatics
10.64898/2026.03.30.715276 bioRxiv
T cell receptors (TCRs) can bind to peptides presented by MHC molecules (pMHC) as a first step to trigger a T cell response. Reliable approaches to predict TCR:pMHC binding would have broad applications in clinical diagnostics, therapeutics, and the fundamental understanding of molecular interactions. IMMREP is a community-organized series of prediction contests that asks participants to predict TCR:pMHC binding on unpublished datasets. Previous iterations in 2022 and 2023 showed that multiple approaches can predict TCR:pMHC binding with significant accuracy (median AUC_0.1 ≥ 0.7) for peptides where experimental data is available ("seen" peptides). In contrast, models did not outperform random guessing for peptides that have no such data available ("unseen" peptides). Here we report on the results of IMMREP25, which focused solely on unseen peptides in order to evaluate the cutting edge of the field. We received 126 named submissions predicting the specificity of 1,000 TCRs against twenty unseen peptides, each restricted by one of two MHC molecules (HLA-A*02:01 and HLA-B*40:01). The best performing methods achieved a macro-AUC_0.1 of 0.60, significantly better than random, demonstrating significant advances in the field. The top performing methods incorporated structural modeling into their approach, indicating that, especially for unseen peptides, a structural understanding aids in the prediction of TCR:pMHC interactions. The results from this benchmark highlight the significant challenges remaining for TCR:pMHC predictions and will inform future method development.
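The abstract reports a macro-AUC_0.1 but does not define it. The sketch below shows one plausible reading, under two assumptions that are not confirmed by the text: that AUC_0.1 is the partial ROC AUC restricted to false positive rates up to 0.1 and rescaled with the McClish correction (so that random guessing scores 0.5), and that "macro" means an unweighted mean of the per-peptide values. The function names and the `(peptide, label, score)` row layout are hypothetical.

```python
from collections import defaultdict


def roc_points(labels, scores):
    """Return ROC points (fpr, tpr), one per distinct score threshold."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last item of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points


def partial_auc(labels, scores, max_fpr=0.1):
    """Partial AUC over FPR in [0, max_fpr], McClish-standardized (assumed)."""
    pts = roc_points(labels, scores)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 >= max_fpr:
            break
        if x1 > max_fpr:
            # Clip the segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2  # trapezoid rule
    min_area = 0.5 * max_fpr ** 2  # area under the chance diagonal
    max_area = max_fpr             # area under a perfect classifier
    return 0.5 * (1 + (area - min_area) / (max_area - min_area))


def macro_partial_auc(rows, max_fpr=0.1):
    """Unweighted mean of per-peptide partial AUCs (assumed meaning of 'macro').

    rows: iterable of (peptide, label, score) with label in {0, 1}.
    """
    by_peptide = defaultdict(lambda: ([], []))
    for peptide, label, score in rows:
        by_peptide[peptide][0].append(label)
        by_peptide[peptide][1].append(score)
    values = [partial_auc(l, s, max_fpr) for l, s in by_peptide.values()]
    return sum(values) / len(values)
```

Averaging per peptide rather than pooling all TCRs keeps a peptide with many binders from dominating the score, which is presumably why a macro average would be reported for a twenty-peptide benchmark.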

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. ImmunoInformatics | 11 papers in training set | Top 0.1% | 14.0%
2. Bioinformatics | 1061 papers in training set | Top 2% | 12.1%
3. Frontiers in Immunology | 586 papers in training set | Top 1% | 6.2%
4. Briefings in Bioinformatics | 326 papers in training set | Top 1% | 4.7%
5. PLOS Computational Biology | 1633 papers in training set | Top 8% | 4.1%
6. Nature Machine Intelligence | 61 papers in training set | Top 0.8% | 3.9%
7. Computational and Structural Biotechnology Journal | 216 papers in training set | Top 1% | 3.9%
8. BMC Bioinformatics | 383 papers in training set | Top 2% | 3.9%
50% of probability mass above
9. Frontiers in Bioinformatics | 45 papers in training set | Top 0.1% | 3.5%
10. Nucleic Acids Research | 1128 papers in training set | Top 8% | 2.5%
11. Scientific Reports | 3102 papers in training set | Top 48% | 2.3%
12. Journal of Chemical Information and Modeling | 207 papers in training set | Top 2% | 2.0%
13. Bioinformatics Advances | 184 papers in training set | Top 3% | 1.7%
14. PLOS ONE | 4510 papers in training set | Top 52% | 1.7%
15. iScience | 1063 papers in training set | Top 14% | 1.7%
16. International Journal of Molecular Sciences | 453 papers in training set | Top 8% | 1.7%
17. Communications Biology | 886 papers in training set | Top 10% | 1.7%
18. Genome Medicine | 154 papers in training set | Top 5% | 1.6%
19. GigaScience | 172 papers in training set | Top 2% | 1.5%
20. Science Advances | 1098 papers in training set | Top 22% | 1.3%
21. Cell Reports Methods | 141 papers in training set | Top 3% | 1.3%
22. mAbs | 28 papers in training set | Top 0.2% | 1.3%
23. Nature Communications | 4913 papers in training set | Top 55% | 1.3%
24. Journal of Proteome Research | 215 papers in training set | Top 2% | 1.2%
25. Cell Genomics | 162 papers in training set | Top 5% | 1.1%
26. Nature Methods | 336 papers in training set | Top 5% | 1.1%
27. Advanced Science | 249 papers in training set | Top 17% | 0.9%
28. Patterns | 70 papers in training set | Top 2% | 0.9%
29. NAR Genomics and Bioinformatics | 214 papers in training set | Top 3% | 0.9%
30. eLife | 5422 papers in training set | Top 59% | 0.7%
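
The 50% cutoff stated for this ranking can be checked by accumulating the listed percentages from the top until the running total reaches the target. The snippet below is a minimal sketch of that arithmetic. The function name is hypothetical, and the inputs are the rounded percentages displayed for the first ten journals, so the totals are approximate.

```python
def journals_to_reach(probabilities, target=50.0):
    """Count how many top-ranked entries are needed to reach `target` percent.

    Returns (count, cumulative_mass), or (None, total) if the target
    is never reached.
    """
    total = 0.0
    for count, p in enumerate(sorted(probabilities, reverse=True), start=1):
        total += p
        if total >= target:
            return count, total
    return None, total


# Rounded percentages for ranks 1-10, copied from the list above.
top_ranked = [14.0, 12.1, 6.2, 4.7, 4.1, 3.9, 3.9, 3.9, 3.5, 2.5]
count, mass = journals_to_reach(top_ranked)
print(count, round(mass, 1))
```

With these values the first seven journals sum to 48.9% and the first eight to 52.8%, which agrees with the statement that eight journals are needed to cover half of the predicted probability mass.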