Deep learning enables direct HLA typing from immunopeptidomics data

Pilz, M.; Scheid, J.; Bauer, A.; Lemke, S.; Sachsenberg, T.; Bauer, J.; Nelde, A.; Stadelmaier, J.; Walter, A.; Rammensee, H.-G.; Nahnsen, S.; Kohlbacher, O.; Walz, J. S.

2026-04-10 bioinformatics
10.64898/2026.04.08.717021 bioRxiv
The immune system eliminates malignant and infected cells through T-cell-mediated recognition of peptides presented by human leukocyte antigen molecules. Mass spectrometry-based immunopeptidomics enables unbiased identification of naturally presented HLA-restricted peptides and has become central to the development of T-cell-based immunotherapies. However, immunopeptidomics data reflects the combined peptide presentation of multiple HLA alleles, and determining which allotypes are represented in this multi-allelic complexity remains an unmet computational challenge. Here, we introduce immunotype, a deep learning-based ensemble predictor for HLA class I allotyping directly from immunopeptidomics data. Immunotype integrates peptide and HLA protein sequence information through transformer encoders and a graph neural network, complemented by a curated mono-allelic reference of known peptide-HLA binding preferences. Immunotype achieves an overall accuracy of 87.2% at protein-level resolution across diverse tissues and thereby enables rapid, cost-effective HLA typing of large-scale immunopeptidomics datasets.

Matching journals

The top 5 journals together account for at least 50% of the predicted probability mass (55.0% combined; the top 4 reach only 48.8%).

Rank  Probability  Percentile  Papers in training set  Journal
1     18.3%        Top 6%      4913                    Nature Communications
2     12.3%        Top 0.5%    147                     Nature Biotechnology
3     9.9%         Top 0.2%    61                      Nature Machine Intelligence
4     8.3%         Top 1%      336                     Nature Methods
5     6.2%         Top 3%      249                     Advanced Science
50% of probability mass above
6     4.8%         Top 1%      154                     Genome Medicine
7     4.8%         Top 3%      167                     Cell Systems
8     2.6%         Top 8%      1128                    Nucleic Acids Research
9     2.0%         Top 4%      555                     Genome Biology
10    1.8%         Top 7%      1061                    Bioinformatics
11    1.7%         Top 3%      162                     Cell Genomics
12    1.6%         Top 34%     2130                    Proceedings of the National Academy of Sciences
13    1.6%         Top 19%     1098                    Science Advances
14    1.5%         Top 2%      104                     Nature Chemical Biology
15    1.3%         Top 16%     429                     Science
16    1.3%         Top 13%     886                     Communications Biology
17    1.2%         Top 5%      326                     Briefings in Bioinformatics
18    0.9%         Top 4%      141                     Cell Reports Methods
19    0.8%         Top 56%     5422                    eLife
20    0.7%         Top 17%     370                     Cell
21    0.7%         Top 8%      140                     Cell Reports Medicine
22    0.7%         Top 16%     575                     Nature
23    0.7%         Top 2%      42                      Nature Biomedical Engineering
24    0.7%         Top 7%      171                     Genomics, Proteomics & Bioinformatics
25    0.6%         Top 2%      81                      Science Immunology
26    0.6%         Top 72%     4510                    PLOS ONE
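
The 50% cutoff in the ranking above can be checked with a running sum over the listed probabilities. The following is a minimal sketch, not the site's actual method: the function name journals_to_reach and the threshold parameter are hypothetical, and only the first six probabilities from the ranking are copied in.

```python
from itertools import accumulate

# (journal, predicted probability in percent), copied from the ranking above.
ranked = [
    ("Nature Communications", 18.3),
    ("Nature Biotechnology", 12.3),
    ("Nature Machine Intelligence", 9.9),
    ("Nature Methods", 8.3),
    ("Advanced Science", 6.2),
    ("Genome Medicine", 4.8),
]


def journals_to_reach(ranked, threshold=50.0):
    """Hypothetical helper: return (count, cumulative mass) at the first
    rank where the running sum of probabilities reaches the threshold,
    or None if the listed journals never reach it."""
    running = accumulate(prob for _, prob in ranked)
    for count, total in enumerate(running, start=1):
        if total >= threshold:
            return count, round(total, 1)
    return None


count, mass = journals_to_reach(ranked)
print(count, mass)
```

Under these assumptions the running sum first reaches the threshold at the fifth journal, which is where the listing places its "50% of probability mass above" marker.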