Modeling TCR-pMHC Binding with Dual Encoders and Cross-Attention Fusion
Wang, W.; Qi, C.; Wei, Z.
Accurately modeling the binding between T-cell receptors (TCRs) and peptide-MHC (pMHC) complexes is essential for guiding immunotherapy development and personalized vaccine design. However, the vast diversity of TCR repertoires and the scarcity of experimentally validated interactions make generalization to unseen epitopes challenging. This paper proposes TIDE, a dual-encoder framework with cross-attention fusion that uses large protein and molecular language models to learn discriminative representations of TCRs and peptides. In TIDE, TCR sequences are encoded with Evolutionary Scale Modeling (ESM), while peptides are converted to SMILES strings and processed by MolFormer to capture chemical and structural properties. Multi-layer cross-attention then refines and integrates the two sets of embeddings, highlighting interaction-relevant patterns without requiring explicit structural alignment. Evaluated on the TCHard benchmark in both zero-shot and few-shot settings, TIDE achieves higher predictive accuracy and greater robustness than state-of-the-art baselines such as ChemBERTa, TITAN, and NetTCR. These results demonstrate that combining pretrained language models with cross-attention fusion is an effective approach to TCR-pMHC binding prediction and can support more reliable computational immunology applications.
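The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the number of layers, attention heads, learned projections, or pooling strategy, so everything below is an assumption: a single-head, projection-free scaled dot-product cross-attention applied in both directions, followed by residual addition, mean pooling, and concatenation. Random vectors stand in for the ESM and MolFormer token embeddings, and the names `cross_attention` and `fuse` are hypothetical, not taken from TIDE.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def cross_attention(queries, keys_values):
    """Single-head scaled dot-product cross-attention (no learned projections).

    queries:     (n_q, d) token embeddings of one input, e.g. TCR residues
    keys_values: (n_k, d) token embeddings of the other, e.g. peptide SMILES tokens
    Returns the attended (n_q, d) output and the (n_q, n_k) attention weights.
    """
    d = len(queries[0])
    scores = matmul(queries, transpose(keys_values))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    attended = matmul(weights, keys_values)
    return attended, weights


def mean_pool(m):
    """Average token embeddings into one fixed-size vector."""
    return [sum(col) / len(m) for col in zip(*m)]


def fuse(tcr_emb, pep_emb):
    """Attend in both directions, add residuals, mean-pool, and concatenate.

    This layout is an assumption for illustration, not the published design.
    """
    tcr_ctx, _ = cross_attention(tcr_emb, pep_emb)
    pep_ctx, _ = cross_attention(pep_emb, tcr_emb)
    tcr_out = [[a + b for a, b in zip(r, c)] for r, c in zip(tcr_emb, tcr_ctx)]
    pep_out = [[a + b for a, b in zip(r, c)] for r, c in zip(pep_emb, pep_ctx)]
    return mean_pool(tcr_out) + mean_pool(pep_out)


# Random stand-ins for encoder outputs (assumed sizes: 12 TCR tokens,
# 20 peptide SMILES tokens, shared embedding width 8).
rng = random.Random(0)
D = 8
tcr_emb = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(12)]
pep_emb = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(20)]

fused = fuse(tcr_emb, pep_emb)
print(len(fused))  # 16: pooled TCR vector concatenated with pooled peptide vector
```

In a real model the fused vector would feed a classification head that outputs a binding score, and the two encoders would generally have different hidden sizes, so learned projections into a shared width would be needed before attention.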
Matching journals
The top 6 journals account for 50% of the predicted probability mass.