TRACE: a graph-based workflow for TCR-epitope prioritization and tumor-reactive T-cell identification

Chen, Y.; Giuliano, V.; Dacillo, I.; Lin, W.; Yan, Y.; Luo, P.

2026-05-31 bioinformatics
10.64898/2026.05.27.728217 bioRxiv
Accurate prioritization of T-cell receptor (TCR)-epitope interactions and identification of tumor-reactive T cells are important but difficult steps in immunotherapy-oriented bioinformatics workflows. Existing methods typically address these tasks separately and either model TCR-epitope pairs as independent observations or rely primarily on transcriptomic signatures. In this study, we present TRACE (TCR-epitope pRioritization And T-Cell idEntification), a graph-based computational workflow that unifies both applications within a single heterogeneous graph framework. The protocol represents TCRs, epitopes, and T cells as typed nodes connected by similarity and association edges, and combines pretrained sequence embeddings with edge-aware graph attention, Laplacian positional encoding, and bidirectional cross-domain attention. Applied to the IEDB and VDJdb benchmarks, TRACE achieved AUROC/AUPR values of 0.937/0.922 and 0.992/0.990, respectively, outperforming five state-of-the-art algorithms. In addition, on a single-cell RNA-seq dataset, the workflow achieved an AUROC of 0.984 and an AUPR of 0.984, substantially exceeding transcriptomic signature-based baselines for tumor-reactive T-cell identification. Ablation analysis showed that Laplacian positional encoding provided the largest performance gain, particularly in sparse graph settings. These results suggest that heterogeneous graph modeling can serve as a practical protocol for integrating receptor sequence, antigen context, and cellular phenotype in computational immunology.
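The abstract reports results as AUROC and AUPR. As a reminder of what those two numbers measure, here is a minimal pure-Python sketch. It is not the authors' evaluation code: the function names `auroc` and `average_precision` are hypothetical, and average precision is used here as one common estimate of AUPR (the preprint listing does not say which estimator was used).

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values taken at the rank of each positive."""
    # Rank candidates from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision among the top `rank` items
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


# Toy example (made-up labels and scores, not data from the paper).
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6]
toy_auroc = auroc(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
```

The pairwise loop is quadratic in the number of samples, which is fine for illustration; a rank-based formulation or a library routine would be the usual choice on benchmark-sized data.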

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

Rank  Journal                                             Papers in training set  Top %   Probability
1     Briefings in Bioinformatics                         326                     0.2%    14.3%
2     PLOS Computational Biology                          1633                    3%      10.0%
3     Bioinformatics                                      1061                    3%      10.0%
4     Nucleic Acids Research                              1128                    4%      4.8%
5     ImmunoInformatics                                   11                      0.1%    4.8%
6     NAR Genomics and Bioinformatics                     214                     0.3%    4.8%
7     BMC Bioinformatics                                  383                     2%      3.9%
----- 50% of probability mass above -----
8     Bioinformatics Advances                             184                     1%      3.9%
9     Patterns                                            70                      0.2%    3.6%
10    Genomics, Proteomics & Bioinformatics               171                     2%      3.6%
11    Cell Systems                                        167                     5%      2.7%
12    Nature Machine Intelligence                         61                      1%      2.7%
13    Genome Medicine                                     154                     4%      2.1%
14    Advanced Science                                    249                     10%     1.9%
15    Computational and Structural Biotechnology Journal  216                     4%      1.7%
16    Frontiers in Immunology                             586                     4%      1.7%
17    Nature Communications                               4913                    52%     1.7%
18    IEEE Journal of Biomedical and Health Informatics   34                      1%      1.3%
19    iScience                                            1063                    19%     1.3%
20    Scientific Reports                                  3102                    66%     1.2%
21    GigaScience                                         172                     2%      1.2%
22    Frontiers in Genetics                               197                     7%      1.1%
23    Science Advances                                    1098                    25%     0.9%
24    Cell Reports Medicine                               140                     7%      0.9%
25    PLOS ONE                                            4510                    68%     0.7%
26    Nature Methods                                      336                     6%      0.7%
27    npj Systems Biology and Applications                99                      3%      0.6%
28    Cell Reports Methods                                141                     6%      0.6%
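
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked with a running sum over the listed percentages. The sketch below is illustrative only: the helper name `journals_to_reach` is hypothetical, and the input is simply the first ten percentages as they appear in the ranking.

```python
from typing import Optional, Sequence, Tuple


def journals_to_reach(
    masses: Sequence[float], threshold: float = 50.0
) -> Optional[Tuple[int, float]]:
    """Return (number of journals, cumulative mass) once the threshold is reached.

    `masses` must already be sorted from most to least probable.
    Returns None if the threshold is never reached.
    """
    running = 0.0
    for count, mass in enumerate(masses, start=1):
        running += mass
        if running >= threshold:
            return count, running
    return None


# Percentages of the ten highest-ranked journals, copied from the ranking.
listed = [14.3, 10.0, 10.0, 4.8, 4.8, 4.8, 3.9, 3.9, 3.6, 3.6]
result = journals_to_reach(listed)
```

With these values the cumulative mass is about 48.7% after six journals and about 52.6% after seven, so the threshold is first crossed at the seventh entry, in line with the summary line.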