Evaluation of haplotype inference using definitive haplotype data obtained from complete hydatidiform moles, and its significance for the analyses of positively selected regions

PLoS Genet. 2009 May;5(5):e1000468. doi: 10.1371/journal.pgen.1000468. Epub 2009 May 8.

Abstract

The haplotype map constructed by the HapMap Project is a valuable resource in the genetic studies of disease genes, population structure, and evolution. In the Project, Caucasian and African haplotypes are fairly accurately inferred, based mainly on the rules of Mendelian inheritance using the genotypes of trios. However, the Asian haplotypes are inferred from the genotypes of unrelated individuals based on population genetics, and are less accurate. Thus, the effects of this inaccuracy on downstream analyses needs to be assessed. We determined true Japanese haplotypes by genotyping 100 complete hydatidiform moles (CHM), each carrying a genome derived from a single sperm, using Affymetrix 500 K Arrays. We then assessed how inferred haplotypes can differ from true haplotypes, by phasing pseudo-individualized true haplotypes using the programs PHASE, fastPHASE, and Beagle. We found that, at various genomic regions, especially the MHC locus, the expansion of extended haplotype homozygosity (EHH), which is a measure of positive selection, is obscured when inferred Asian haplotype data is used to detect the expansion. We then mapped the genome using a new statistic, XDiHH, which directly detects the difference between the true and inferred haplotypes, in the determination of EHH expansion. We also show that the true haplotype data presented here is useful to assess and improve the accuracy of phasing of Asian genotypes.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Asian People / genetics
  • Biometry
  • Chromosome Mapping
  • Databases, Genetic
  • Female
  • Genome-Wide Association Study
  • Haplotypes
  • Homozygote
  • Humans
  • Hydatidiform Mole / genetics*
  • Japan
  • Linkage Disequilibrium
  • Male
  • Oligonucleotide Array Sequence Analysis
  • Pregnancy
  • Principal Component Analysis
  • Uterine Neoplasms / genetics*