The effect of missing data on linkage disequilibrium mapping and haplotype association analysis in the GAW14 simulated datasets

BMC Genet. 2005 Dec 30;6 Suppl 1(Suppl 1):S151. doi: 10.1186/1471-2156-6-S1-S151.

Abstract

We used our newly developed linkage disequilibrium (LD) plotting software, JLIN, to plot LD between pairs of single-nucleotide polymorphisms (SNPs) for three chromosomes of the Genetic Analysis Workshop 14 Aipotu simulated population, in order to assess the effect of missing data on LD calculations. Our haplotype analysis program, SIMHAP, was used to assess the effect of missing data on haplotype-phenotype association. Genotype data were removed at random at levels of 1%, 5%, and 10%, and the LD calculations and haplotype association results at each level of missingness were compared with those for the complete dataset. We concluded that ignoring individuals with missing data substantially affects the number of regions of LD detected, which in turn could affect the tagging SNPs chosen to generate haplotypes.
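The missing-data design described above can be illustrated with a minimal sketch. It is not the JLIN or SIMHAP implementation: the function names `mask_genotypes` and `r_squared`, the toy genotype matrix, and the use of a squared genotype correlation as a stand-in LD measure for unphased data are all assumptions made for illustration. Genotypes are coded as allele counts (0, 1, 2), calls are removed at random at a given rate, and LD is then computed only from individuals with both SNPs observed.

```python
import random


def mask_genotypes(genotypes, rate, seed=0):
    """Return a copy of the matrix with a fraction `rate` of calls set to None."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(genotypes)) for j in range(len(genotypes[0]))]
    n_missing = round(rate * len(cells))
    masked = [row[:] for row in genotypes]  # leave the complete dataset untouched
    for i, j in rng.sample(cells, n_missing):
        masked[i][j] = None
    return masked


def r_squared(genotypes, snp_a, snp_b):
    """Squared correlation of allele counts over individuals typed at both SNPs.

    Hypothetical stand-in for an LD measure; returns None if it is undefined.
    """
    pairs = [(row[snp_a], row[snp_b]) for row in genotypes
             if row[snp_a] is not None and row[snp_b] is not None]
    n = len(pairs)
    if n < 2:
        return None
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return None
    return cov * cov / (var_a * var_b)


# Toy data (invented, not GAW14): SNP 1 usually copies SNP 0, SNP 2 is independent.
toy_rng = random.Random(1)
complete = []
for _ in range(200):
    a = toy_rng.choice([0, 1, 2])
    b = a if toy_rng.random() < 0.9 else toy_rng.choice([0, 1, 2])
    complete.append([a, b, toy_rng.choice([0, 1, 2])])

# Compare LD at each missingness level against the complete dataset (rate 0.0).
for rate in (0.0, 0.01, 0.05, 0.10):
    data = mask_genotypes(complete, rate, seed=42)
    print(rate, r_squared(data, 0, 1), r_squared(data, 0, 2))
```

With only two SNPs per comparison, dropping incomplete pairs costs few individuals. The loss grows quickly when whole individuals are excluded for a missing call at any SNP in a region, which is the setting the abstract's conclusion refers to.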

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromosome Mapping / methods*
  • Computer Simulation*
  • Congresses as Topic*
  • Databases, Genetic*
  • Genetics, Population
  • Genome-Wide Association Study / methods*
  • Haplotypes / genetics*
  • Humans
  • Linkage Disequilibrium / genetics*