Objective: To investigate the utility of statistical tools in translating Affymetrix Drug Metabolizing Enzyme and Transporter (DMET) Assay single-nucleotide polymorphisms (SNPs) into common consensus star alleles.
Methods: DMET SNP data from clinical trials in subjects of different ethnicities were pooled for analysis. Three statistical methods (PHASE, Bayesian, and expectation-maximization [EM]) were first assessed by comparing their consistency in calling CYP2D6 alleles among 1108 Asians and 55 Caucasians. Subsequently, the performance of EM in deriving haplotype calls was evaluated against the Affymetrix Translation Table for CYP2B6, CYP2C19, CYP2C9, and CYP3A4/5 in 582 Asians, 296 Caucasians, and 369 Africans. Selected DNA samples were sequenced to verify the EM-predicted haplotype calls.
Results: The PHASE, Bayesian, and EM methods showed similar CYP2D6 star allele call rates. The EM method, with a 0.99 posterior probability cutoff, was chosen for further evaluation because of its low false-positive call rate. Haplotype calls obtained with the EM method were consistent with the Affymetrix Translation Table more than 95% of the time for all five CYPs, except for the CYP2B6 calls in subjects of African descent (83%). In addition, the EM method was superior to the Translation Table-only approach in resolving complex haplotype patterns, identifying novel haplotypes in CYP2B6 and CYP3A5, and determining genotype calls in the presence of missing SNP data.
Conclusion: A statistical method such as EM could be used to augment the translation of DMET assay SNP data into star alleles, especially for complex genes, to facilitate full utilization and interpretation of clinical pharmacogenetics data.
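
Illustrative sketch: the abstract does not name the EM software or describe its internals, so the following is only a minimal, generic example of how EM can estimate haplotype frequencies from unphased biallelic SNP genotypes and then make a diplotype call only when the best haplotype pair reaches a posterior probability cutoff (0.99, as in the Results). The genotype coding (0, 1, or 2 copies of the variant allele per SNP), the function names, and the toy data are assumptions made for illustration; mapping haplotypes to star alleles and handling missing SNPs are not shown.

```python
from itertools import combinations_with_replacement, product


def compatible_pairs(genotype):
    """List unordered haplotype pairs whose allele counts add up to the genotype."""
    haps = list(product((0, 1), repeat=len(genotype)))
    return [
        (a, b)
        for a, b in combinations_with_replacement(haps, 2)
        if all(x + y == g for x, y, g in zip(a, b, genotype))
    ]


def pair_posteriors(genotype, freqs):
    """Posterior probability of each compatible pair, assuming Hardy-Weinberg proportions."""
    weights = {}
    for a, b in compatible_pairs(genotype):
        # A heterozygous pair can arise in two orders, hence the factor 2.
        weights[(a, b)] = freqs[a] * freqs[b] * (1 if a == b else 2)
    total = sum(weights.values())
    if total == 0:
        return {}
    return {pair: w / total for pair, w in weights.items()}


def em_haplotype_freqs(genotypes, n_iter=100, tol=1e-9):
    """Estimate haplotype frequencies by EM, starting from uniform frequencies."""
    haps = list(product((0, 1), repeat=len(genotypes[0])))
    freqs = {h: 1.0 / len(haps) for h in haps}
    for _ in range(n_iter):
        # E-step: accumulate expected haplotype counts over all subjects.
        counts = dict.fromkeys(haps, 0.0)
        for g in genotypes:
            for (a, b), p in pair_posteriors(g, freqs).items():
                counts[a] += p
                counts[b] += p
        # M-step: each subject carries two haplotypes.
        new = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        delta = max(abs(new[h] - freqs[h]) for h in haps)
        freqs = new
        if delta < tol:
            break
    return freqs


def call_diplotype(genotype, freqs, cutoff=0.99):
    """Return (pair, posterior) if the best pair meets the cutoff, else None (no call)."""
    post = pair_posteriors(genotype, freqs)
    if not post:
        return None
    pair, p = max(post.items(), key=lambda kv: kv[1])
    return (pair, p) if p >= cutoff else None


# Toy data (hypothetical): two SNPs, mostly homozygotes plus one double heterozygote.
toy = [(0, 0)] * 10 + [(2, 2)] * 10 + [(1, 1)]
toy_freqs = em_haplotype_freqs(toy)
toy_call = call_diplotype((1, 1), toy_freqs)
```

In this toy cohort the homozygous subjects make the two haplotypes (0, 0) and (1, 1) common, so the double heterozygote is resolved with high posterior probability. If the cohort contained only double heterozygotes, the two possible phasings would remain equally likely and the function would return no call under the 0.99 cutoff.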