t2pmhc: A Structure-Informed Graph Neural Network to predict TCR-pMHC Binding

Polster, M.; Stadelmaier, J.; Ball, E.; Scheid, J.; Bauer, J.; Nelde, A.; Claassen, M.; Dubbelaar, M. L.; Walz, J. S.; Nahnsen, S.

2026-03-02 bioinformatics
10.64898/2026.02.27.708137 bioRxiv

Mapping of T cell receptors (TCRs) to their cognate MHC-presented peptides (pMHC) is central for the development of precision immunotherapies and vaccine design. However, accurate prediction of TCR affinity to peptide antigens remains an open challenge. Most approaches rely solely on sequence information, although increasing evidence suggests that TCR-pMHC binding is primarily determined by three-dimensional structural interactions within the entire TCR-pMHC complex. Consequently, sequence-based methods often fail to generalize to peptides not included in the training data (unseen peptides). Here we introduce t2pmhc, a structure-based graph neural network framework for predicting TCR-pMHC binding using predicted structures of the entire TCR-pMHC complex. We evaluated a Graph Convolutional Network (GCN) and a Graph Attention Network, both demonstrating improved generalization to unseen peptides compared to state-of-the-art models across a variety of public datasets. Evaluation with crystallographic structures yields high-confidence predictions, indicating that current limitations of structure-based models are largely driven by the accuracy of structure prediction. Analysis of node attention patterns in t2pmhc-GCN reveals biologically consistent patterns, assigning high attention to the peptide and the CDR3 regions. Within the peptide sequence, canonical MHC anchor residues are consistently downweighted, whereas potential TCR-binding residues are upweighted. These findings establish t2pmhc as a structure-informed framework for robust TCR-pMHC binding prediction, enabling improved generalization to unseen antigens and providing a foundation for integrating TCR repertoire sequencing into vaccine design and immunotherapy.

Matching journals

The top 8 journals together account for just over 50% of the predicted probability mass (53.1% when the rounded values listed are summed).

Rank | Journal | Papers in training set | Top % | Probability
1 | Nature Machine Intelligence | 61 | Top 0.1% | 14.2%
2 | Nature Communications | 4913 | Top 21% | 9.0%
3 | Cell Systems | 167 | Top 1% | 8.3%
4 | Advanced Science | 249 | Top 3% | 6.3%
5 | Structure | 175 | Top 0.5% | 4.8%
6 | Nature Biotechnology | 147 | Top 3% | 3.5%
7 | PLOS Computational Biology | 1633 | Top 10% | 3.5%
8 | Nucleic Acids Research | 1128 | Top 6% | 3.5%
50% of probability mass above
9 | Frontiers in Immunology | 586 | Top 2% | 3.5%
10 | Bioinformatics | 1061 | Top 5% | 3.5%
11 | Nature Methods | 336 | Top 3% | 3.0%
12 | Proceedings of the National Academy of Sciences | 2130 | Top 26% | 2.3%
13 | Science Advances | 1098 | Top 13% | 2.1%
14 | eLife | 5422 | Top 36% | 2.0%
15 | Communications Biology | 886 | Top 7% | 1.9%
16 | Scientific Reports | 3102 | Top 59% | 1.7%
17 | Genome Medicine | 154 | Top 5% | 1.7%
18 | Cell Genomics | 162 | Top 3% | 1.7%
19 | Cell Reports Medicine | 140 | Top 5% | 1.3%
20 | Nano Letters | 63 | Top 2% | 1.2%
21 | iScience | 1063 | Top 25% | 0.9%
22 | mAbs | 28 | Top 0.3% | 0.9%
23 | Cell Reports | 1338 | Top 30% | 0.9%
24 | Briefings in Bioinformatics | 326 | Top 6% | 0.8%
25 | Patterns | 70 | Top 2% | 0.8%
26 | Cell Reports Methods | 141 | Top 5% | 0.8%
27 | Computational and Structural Biotechnology Journal | 216 | Top 10% | 0.7%
28 | Cell Research | 49 | Top 3% | 0.7%
29 | ACS Nano | 99 | Top 4% | 0.7%
30 | Journal of Chemical Information and Modeling | 207 | Top 3% | 0.7%
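
The 50% cut-off in the list above can be checked by accumulating the listed probabilities in rank order. The following is a minimal sketch, assuming the cut is placed at the first rank whose running total reaches 50%; the page does not state how it is computed, and the function name `ranks_to_reach` is hypothetical. The input values are the rounded percentages for ranks 1 to 10 as listed.

```python
from itertools import accumulate

# Probabilities (%) for ranks 1-10, copied from the list above (rounded values).
probabilities = [14.2, 9.0, 8.3, 6.3, 4.8, 3.5, 3.5, 3.5, 3.5, 3.5]


def ranks_to_reach(probs, threshold=50.0):
    """Return (rank, running_total) for the first rank whose running total
    reaches the threshold, or None if the threshold is never reached.

    Assumption: the cut-off rule is "first rank with cumulative sum >= threshold".
    """
    for rank, total in enumerate(accumulate(probs), start=1):
        if total >= threshold:
            return rank, total
    return None


rank, total = ranks_to_reach(probabilities)
print(rank, round(total, 1))  # 8 53.1
```

Under this assumption the running total is 49.6% after rank 7 and 53.1% after rank 8, which agrees with the marker placed between ranks 8 and 9.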