Validation of statistical imputation of allele-level multilocus phased genotypes from ambiguous HLA assignments

Tissue Antigens. 2014 Sep;84(3):285-92. doi: 10.1111/tan.12390. Epub 2014 Jul 11.

Abstract

Genetic matching for loci in the human leukocyte antigen (HLA) region between a donor and a patient in hematopoietic stem cell transplantation (HSCT) is critical to outcome; however, methods for HLA genotyping of donors in unrelated stem cell registries often yield results with allelic and phase ambiguity and/or do not query all clinically relevant loci. We present and evaluate a statistical method for in silico imputation of HLA alleles and haplotypes in large ambiguous population data from the Be The Match® Registry. Our method builds on haplotype frequencies estimated from registry populations and exploits patterns of linkage disequilibrium (LD) across HLA haplotypes to infer high resolution HLA assignments. We performed validation on simulated and real population data from the Registry with non-trivial ambiguity content. While real population datasets caused some predictions to deviate from expectation, validations still showed high percent recall for imputed results with average recall >76% when imputing HLA alleles from registry data. We simulated ambiguity generated by several HLA genotyping methods to evaluate the imputation performance on several levels of typing resolution. On average, imputation percent recall of allele-level HLA haplotypes was >95% for allele-level typing, >92% for intermediate resolution typing and >58% for serology (low-resolution) typing. Thus, allele-level HLA assignments can be imputed through the application of a set of statistical and population genetics inferences and with knowledge of haplotype frequencies and self-identified race and ethnicities.
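The general idea of frequency-based imputation described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pre-estimated haplotype frequency table for a single population, Hardy-Weinberg proportions, and only two loci. The function name `impute_haplotype_pairs`, the input layout, and all frequency values are hypothetical and chosen only for illustration. Each ambiguous typing is expanded into every consistent phased haplotype pair, each pair is weighted by the product of its haplotype frequencies, and the weights are normalized into posterior probabilities.

```python
from itertools import product


def impute_haplotype_pairs(ambiguous_genotype, hap_freqs):
    """Rank phased haplotype pairs consistent with an ambiguous typing.

    ambiguous_genotype: dict mapping locus -> list of candidate
        (allele, allele) pairs allowed by the typing result.
    hap_freqs: dict mapping a haplotype (tuple of alleles, one per
        locus in sorted locus order) -> population frequency.
    Returns a list of ((hap1, hap2), posterior), most probable first.
    """
    loci = sorted(ambiguous_genotype)
    weights = {}
    # Allelic ambiguity: choose one candidate allele pair per locus.
    for choice in product(*(ambiguous_genotype[locus] for locus in loci)):
        # Phase ambiguity: at each locus after the first, the two
        # alleles may be assigned to the haplotypes in either order.
        for rest in product((False, True), repeat=len(loci) - 1):
            flips = (False,) + rest
            hap1 = tuple(b if f else a for (a, b), f in zip(choice, flips))
            hap2 = tuple(a if f else b for (a, b), f in zip(choice, flips))
            f1 = hap_freqs.get(hap1, 0.0)
            f2 = hap_freqs.get(hap2, 0.0)
            if f1 == 0.0 or f2 == 0.0:
                continue  # haplotype not observed in the frequency table
            # Hardy-Weinberg weight (assumption): 2*f1*f2 for distinct
            # haplotypes, f1*f2 when both haplotypes are identical.
            weight = f1 * f2 if hap1 == hap2 else 2.0 * f1 * f2
            # Plain assignment (not +=) so that a pair reached through
            # more than one enumeration path is counted only once.
            weights[tuple(sorted((hap1, hap2)))] = weight
    total = sum(weights.values())
    if total == 0.0:
        return []
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return [(pair, weight / total) for pair, weight in ranked]


# Toy frequency table: values are invented for illustration only.
toy_freqs = {
    ("A*01:01", "B*08:01"): 0.100,
    ("A*02:01", "B*07:02"): 0.050,
    ("A*01:01", "B*07:02"): 0.010,
    ("A*02:01", "B*08:01"): 0.010,
    ("A*02:05", "B*07:02"): 0.002,
}

# The second A allele is ambiguous between A*02:01 and A*02:05,
# and the phase between the A and B loci is unknown.
toy_typing = {
    "A": [("A*01:01", "A*02:01"), ("A*01:01", "A*02:05")],
    "B": [("B*07:02", "B*08:01")],
}

ranked_pairs = impute_haplotype_pairs(toy_typing, toy_freqs)
for pair, posterior in ranked_pairs:
    print(pair, round(posterior, 4))
```

In this toy example, strong linkage disequilibrium in the assumed frequency table is what lets the ranking favor one phased pair over the alternatives. Exhaustive enumeration is only practical for small examples; handling more loci and registry-scale ambiguity would require pruning or other optimizations not shown here.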

Keywords: expectation maximization; human leukocyte antigen; imputation; maximum likelihood; typing ambiguity; typing resolution.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Validation Study

MeSH terms

  • Alleles
  • Computer Simulation / statistics & numerical data
  • Ethnicity*
  • Gene Frequency
  • Genetic Loci / genetics
  • Genotype
  • HLA Antigens / genetics*
  • Haplotypes
  • Hematopoietic Stem Cell Transplantation*
  • Histocompatibility Testing / methods*
  • Histocompatibility Testing / statistics & numerical data
  • Humans
  • Linkage Disequilibrium
  • Models, Genetic
  • Registries
  • Tissue Donors
  • United States

Substances

  • HLA Antigens