A new model of multi-marker correlation for genome-wide tag SNP selection

Genome Inform. 2008;21:27-41.

Abstract

Tag SNP selection is an important problem in computational biology and genetics because a small set of tag SNP markers can reduce the cost of genotyping and, in turn, of genome-wide association studies. Several methods for selecting the smallest possible set of tag SNPs have been investigated in the literature. They differ in their formulation of tag SNP selection (block-based or genome-wide) and in their mathematical model of marker correlation. In this paper, we propose a new model of multi-marker correlation for genome-wide tag SNP selection, together with a simple greedy algorithm that selects as small a set of tag SNPs as possible according to the model. Our experimental results on several real datasets from the HapMap project demonstrate that the new model yields more succinct tag SNP sets than the previous methods.
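
The abstract does not spell out the multi-marker correlation model or the details of the greedy algorithm, so the following is only a minimal sketch of the general idea of greedy tag SNP selection. It assumes a simpler, commonly used pairwise r^2 criterion in place of the paper's multi-marker model, phased haplotypes coded as 0/1 tuples, and an r^2 threshold of 0.8. The function names `r_squared` and `greedy_tag_snps` and the toy data are hypothetical and not taken from the paper.

```python
from itertools import combinations


def r_squared(haplotypes, i, j):
    """Pairwise r^2 between biallelic SNP columns i and j (alleles coded 0/1)."""
    n = len(haplotypes)
    p_i = sum(h[i] for h in haplotypes) / n
    p_j = sum(h[j] for h in haplotypes) / n
    p_ij = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    d = p_ij - p_i * p_j  # linkage disequilibrium coefficient D
    return d * d / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Greedy set cover: repeatedly pick the SNP that covers the most
    still-uncovered SNPs, where 'covers' means pairwise r^2 >= threshold.
    This is an assumed stand-in, not the paper's multi-marker model."""
    m = len(haplotypes[0])
    covers = {i: {i} for i in range(m)}  # every SNP covers itself
    for i, j in combinations(range(m), 2):
        if r_squared(haplotypes, i, j) >= threshold:
            covers[i].add(j)
            covers[j].add(i)

    uncovered = set(range(m))
    tags = []
    while uncovered:
        # Ties are broken in favour of the lowest SNP index.
        best = max(range(m), key=lambda s: (len(covers[s] & uncovered), -s))
        tags.append(best)
        uncovered -= covers[best]
    return tags


# Toy data: SNPs 0 and 1 are identical, SNPs 2 and 3 are complementary.
haps = [
    (0, 0, 0, 1),
    (0, 0, 1, 0),
    (1, 1, 0, 1),
    (1, 1, 1, 0),
]
print(greedy_tag_snps(haps))  # prints [0, 2]
```

On this toy input two tags suffice, because each perfectly correlated pair is represented by one of its members. A multi-marker model such as the one proposed here would additionally let combinations of tags predict an untagged SNP, which is how it can produce smaller tag sets than a purely pairwise criterion.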

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Chromosome Mapping / methods
  • Gene Frequency*
  • Genetic Markers*
  • Genome-Wide Association Study*
  • Likelihood Functions
  • Linkage Disequilibrium / genetics
  • Models, Genetic*
  • Polymorphism, Single Nucleotide*
  • Selection, Genetic*

Substances

  • Genetic Markers