Allogeneic hematopoietic cell transplantation (HCT) is a curative therapy for hematologic disorders and often requires a human leukocyte antigen (HLA)-matched donor. Because donor registries have recruited donors over time with evolving HLA genotyping technologies, registry typings require in silico ambiguity resolution and statistical imputation based on haplotype frequencies estimated from donor data stratified by self-identified race and ethnicity (SIRE). However, SIRE has limited genetic validity and poses a challenge for individuals with unknown or mixed SIRE. We present MR-GRIMM (Multi-Race Graph IMputation and Matching), which simultaneously imputes the race/ethnic category and the HLA genotype using a SIRE-based prior. Additionally, we propose a novel method to impute HLA typings that are inconsistent with current haplotype frequencies. The performance of MR-GRIMM was validated on a dataset of 170,000 donor-recipient pairs. On average, MR-GRIMM has a 20% lower matching error (1 - AUC) than single-race imputation. The recall (sensitivity) of the race/ethnic category imputation from HLA was measured by comparing the imputed donor race with the donor-provided SIRE, giving values of 0.74 and 0.55 for the prediction of 5 broad and 21 detailed US population groups, respectively. Operational implementation of this algorithm in registry search could help improve match predictions and access to HLA-matched donors.
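The idea of imputing the population category and the genotype jointly under a SIRE-based prior can be illustrated with a minimal Bayesian sketch. This is not the MR-GRIMM graph implementation; it only assumes that each candidate genotype compatible with an ambiguous typing has a population-specific frequency, and that the SIRE supplies a prior over populations. All names and numbers (`POP_A`, `G1`, the frequencies, the helper functions) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def joint_posterior(candidates, genotype_freqs, sire_prior):
    """Normalized P(population, genotype | typing, SIRE) for a toy model.

    Assumption: the weight of each (population, genotype) pair is
    prior(population | SIRE) * frequency(genotype | population),
    restricted to genotypes compatible with the observed typing.
    """
    joint = {}
    for pop, prior in sire_prior.items():
        for genotype in candidates:
            weight = prior * genotype_freqs.get(pop, {}).get(genotype, 0.0)
            if weight > 0.0:
                joint[(pop, genotype)] = weight
    total = sum(joint.values())
    if total == 0.0:
        # Typing is inconsistent with every known frequency: nothing to rank.
        return {}
    return {key: weight / total for key, weight in joint.items()}


def marginal(joint, axis):
    """Sum the joint posterior over the other axis (0 = population, 1 = genotype)."""
    out = defaultdict(float)
    for key, prob in joint.items():
        out[key[axis]] += prob
    return dict(out)


# Hypothetical toy inputs, not real haplotype or genotype frequencies.
GENOTYPE_FREQS = {
    "POP_A": {"G1": 0.08, "G2": 0.02},
    "POP_B": {"G1": 0.01, "G2": 0.04},
}
SIRE_PRIOR = {"POP_A": 0.5, "POP_B": 0.5}  # e.g. an uninformative prior for unknown SIRE
CANDIDATES = ["G1", "G2"]  # genotypes compatible with the ambiguous typing

joint = joint_posterior(CANDIDATES, GENOTYPE_FREQS, SIRE_PRIOR)
pop_post = marginal(joint, 0)   # imputed population category
geno_post = marginal(joint, 1)  # imputed genotype, averaged over populations

print(pop_post)
print(geno_post)
```

In this sketch the genotype posterior is a mixture over populations rather than a result conditioned on a single declared category, which is the contrast with single-race imputation that the abstract draws. An empty result stands in for the case of a typing that no current frequency explains, which the abstract handles with a separate method not modeled here.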
Keywords: EM; Ethnic groups; HLA; HSCT; Imputation.
Copyright © 2023 The Author(s). Published by Elsevier Inc. All rights reserved.