GRIMM-II: A Two-Stage Real-Time Algorithm for Nine-Locus HLA Imputation and Matching with Up to Three Mismatches

Kirshenboim, O.; Kabya, A.; Yehezkel-Imra, R.; Tshuva, Y.; Maiers, M.; Gragert, L.; Bashyal, P.; Israeli, S.; Louzoun, Y.

2026-03-31 bioinformatics
10.64898/2026.03.28.715027 bioRxiv

Background: The success of hematopoietic stem cell transplantation (HSCT) depends critically on human leukocyte antigen (HLA) matching between donor and recipient. While traditional matching focuses on five classical HLA loci (A, B, C, DRB1, DQB1), clinical practice increasingly considers extended typing at nine loci, including DPA1, DQA1, DPB1, and DRB3/4/5. Furthermore, emerging evidence supports transplantation with up to three HLA mismatches under post-transplant cyclophosphamide (PTCy) regimens. However, current donor search algorithms cannot efficiently identify donors with multiple mismatches across extended HLA loci in real-time.

Methods: We developed GRIMM-II (GRaph IMputation and Matching, version II), which comprises two novel algorithms: ML-GRIM (Multi-Locus GRIM) for HLA imputation across multiple loci, and ML-GRMA (Multi-Locus GRMA) for real-time donor-patient matching with up to three mismatches. Both algorithms employ a two-stage approach that combines efficient candidate reduction through graph-theoretic frameworks with detailed genotype comparison. ML-GRIM partitions genotypes into class I (HLA-A, B, C) and class II (remaining loci) components, enabling memory-efficient storage and rapid candidate identification. ML-GRMA searches a pre-imputed donor graph composed of donor genotypes and their sub-components, then computes asymmetric graft-versus-host (GvH) and host-versus-graft (HvG) mismatch probabilities to provide clinically relevant compatibility assessments. Both imputation and matching tools are available as a web application at https://grimmard.math.biu.ac.il/ and through GitHub repositories at https://github.com/nmdp-bioinformatics/py-graph-imputation (imputation) and https://github.com/nmdp-bioinformatics/py-graph-match (matching).

Results: We validated ML-GRMA and ML-GRIM using the WMDA3 (World Marrow Donor Association) validation dataset, successfully reproducing all previously reported matches while identifying numerous additional candidate donors not detected by previous algorithms. Further validation of ML-GRMA using 3,000 patients with artificially introduced mismatches (0-3 allele substitutions) demonstrated 100% sensitivity and specificity in identifying matching donors at expected mismatch levels. We validated ML-GRIM using simulated nine-locus typings derived from 8,078,224 US donors in the NMDP registry. The algorithm successfully imputed genotypes across variable numbers of typed loci while incorporating multiethnic haplotype frequencies. The algorithm achieved real-time performance with typical imputation times under one second and matching times of 1-13 seconds per patient for up to three mismatches, even when searching databases exceeding 8 million donors. Notably, ML-GRMA identified substantially more potentially suitable donors than traditional algorithms by accounting for the biological reality that GvH and HvG mismatches often differ, particularly for donors homozygous at specific loci. To evaluate ML-GRIM performance with low-resolution typing, we tested it on simulated 3-locus typings from the same population. The resulting imputation accuracy correlated with the mutual information between typed loci and complete genotypes.

Conclusions: GRIMM-II provides a scalable, memory-efficient solution for nine-locus HLA imputation and real-time identification of donors with up to three mismatches. The graph-based framework supports dynamic registry updates and can readily accommodate additional HLA loci and matching criteria as clinical knowledge evolves. By expanding the pool of acceptable donors while maintaining computational efficiency, GRIMM-II addresses a critical need in contemporary transplantation practice, particularly for patients from underrepresented ethnic minorities who face lower probabilities of finding perfectly matched donors.
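
The abstract notes that GvH and HvG mismatches often differ, particularly for donors homozygous at a locus. The following is a minimal sketch of that asymmetry, not the ML-GRMA implementation: it assumes fully resolved (unambiguous) genotypes, whereas the paper computes mismatch probabilities over imputed genotypes. The function name `directional_mismatches`, the dictionary layout, and the convention of counting distinct alleles per locus are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Hypothetical layout: locus name -> the two alleles carried at that locus.
Genotype = Dict[str, Tuple[str, str]]


def directional_mismatches(patient: Genotype, donor: Genotype) -> Tuple[int, int]:
    """Return (gvh, hvg) mismatch counts over the loci typed in both genotypes.

    Assumed convention (a simplification for illustration):
      GvH counts distinct patient alleles that the donor does not carry.
      HvG counts distinct donor alleles that the patient does not carry.
    """
    gvh = 0
    hvg = 0
    for locus, patient_alleles in patient.items():
        donor_alleles = donor.get(locus)
        if donor_alleles is None:
            continue  # locus not typed in the donor; skipped in this sketch
        gvh += len(set(patient_alleles) - set(donor_alleles))
        hvg += len(set(donor_alleles) - set(patient_alleles))
    return gvh, hvg


# Donor is homozygous at HLA-A; patient is heterozygous and shares one allele.
patient = {"A": ("A*01:01", "A*02:01"), "B": ("B*07:02", "B*08:01")}
donor = {"A": ("A*01:01", "A*01:01"), "B": ("B*07:02", "B*08:01")}

gvh, hvg = directional_mismatches(patient, donor)
print(f"GvH mismatches: {gvh}, HvG mismatches: {hvg}")
```

In this example the patient carries an allele the donor lacks, so there is one mismatch in the GvH direction, while the homozygous donor carries nothing foreign to the patient, so there is none in the HvG direction. A symmetric count would not distinguish the two directions.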

Matching journals

The top 9 journals account for 50% of the predicted probability mass.

1. Nature Communications: 4913 papers in training set, Top 18%, 10.0%
2. Bioinformatics: 1061 papers in training set, Top 4%, 6.3%
3. Bioinformatics Advances: 184 papers in training set, Top 0.5%, 6.3%
4. Genome Medicine: 154 papers in training set, Top 1%, 6.3%
5. Journal of Thrombosis and Haemostasis: 28 papers in training set, Top 0.1%, 6.3%
6. Nucleic Acids Research: 1128 papers in training set, Top 4%, 4.8%
7. PLOS ONE: 4510 papers in training set, Top 32%, 4.8%
8. PLOS Computational Biology: 1633 papers in training set, Top 10%, 3.6%
9. Leukemia: 39 papers in training set, Top 0.3%, 3.6%

50% of probability mass above

10. Communications Biology: 886 papers in training set, Top 4%, 2.3%
11. JCO Clinical Cancer Informatics: 18 papers in training set, Top 0.4%, 2.1%
12. Blood: 67 papers in training set, Top 0.7%, 1.9%
13. Scientific Reports: 3102 papers in training set, Top 54%, 1.9%
14. Cell Reports Methods: 141 papers in training set, Top 2%, 1.7%
15. ImmunoInformatics: 11 papers in training set, Top 0.1%, 1.6%
16. Clinical and Translational Science: 21 papers in training set, Top 0.5%, 1.5%
17. JCI Insight: 241 papers in training set, Top 4%, 1.5%
18. Journal of the American Medical Informatics Association: 61 papers in training set, Top 2%, 1.2%
19. Computational and Structural Biotechnology Journal: 216 papers in training set, Top 7%, 0.9%
20. Science Advances: 1098 papers in training set, Top 25%, 0.9%
21. BMC Bioinformatics: 383 papers in training set, Top 6%, 0.9%
22. Cytometry Part A: 30 papers in training set, Top 0.3%, 0.9%
23. Journal of Translational Medicine: 46 papers in training set, Top 2%, 0.8%
24. Advanced Science: 249 papers in training set, Top 18%, 0.8%
25. Cell Reports Medicine: 140 papers in training set, Top 7%, 0.8%
26. Trials: 25 papers in training set, Top 1%, 0.8%
27. American Journal of Transplantation: 15 papers in training set, Top 0.2%, 0.7%
28. iScience: 1063 papers in training set, Top 33%, 0.7%
29. Cell Systems: 167 papers in training set, Top 12%, 0.7%
30. American Journal of Respiratory and Critical Care Medicine: 39 papers in training set, Top 0.9%, 0.7%
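
The statement that the top 9 journals account for 50% of the predicted probability mass can be checked by accumulating the listed percentages in rank order. The sketch below does this for the first ten entries; the function name `journals_to_reach` is hypothetical, and the inputs are the rounded values shown on the page, so the running totals are approximate.

```python
from itertools import accumulate
from typing import Optional, Sequence

# Predicted probabilities (percent) for ranks 1-10, as listed on the page.
top_probabilities = [10.0, 6.3, 6.3, 6.3, 6.3, 4.8, 4.8, 3.6, 3.6, 2.3]


def journals_to_reach(probabilities: Sequence[float], threshold: float) -> Optional[int]:
    """Return the smallest rank whose cumulative probability meets the threshold.

    Returns None when the listed probabilities never reach the threshold.
    """
    for rank, running_total in enumerate(accumulate(probabilities), start=1):
        if running_total >= threshold:
            return rank
    return None


print(journals_to_reach(top_probabilities, 50.0))
```

With these rounded values the running total is still below 50% after rank 8 and passes it at rank 9, which agrees with where the page places its 50% marker.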