A reduced representation approach to population genetic analyses and applications to human evolution

Genome Res. 2011 Jul;21(7):1087-98. doi: 10.1101/gr.119792.110. Epub 2011 May 31.

Abstract

Second-generation sequencing technologies allow surveys of sequence variation on an unprecedented scale. However, despite the rapid decrease in sequencing costs, collecting whole-genome sequence data on a population scale is still prohibitive for many laboratories. We have implemented an inexpensive, reduced representation protocol for preparing resequencing targets, and we have developed the analytical tools necessary for making population genetic inferences. This approach can be applied to any species for which a draft or complete reference genome sequence is available. The new tools we have developed include methods for aligning reads, calling genotypes, and incorporating sample-specific sequencing error rates in the estimate of evolutionary parameters. When applied to 19 individuals from a total of 18 human populations, our approach allowed sampling regions that are largely overlapping across individuals and that are representative of the entire genome. The resequencing data were used to test the serial founder model of human dispersal and to estimate the time of the Out of Africa migration. Our results also represent the first attempt to provide a time frame for the colonization of Australia based on large-scale resequencing data.
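
The abstract does not describe how sample-specific sequencing error rates enter the genotype calls, so the following is only an illustrative sketch and not the authors' method. It assumes a simple biallelic site, independent reads, and a symmetric per-read error rate estimated separately for each sample. The names `genotype_log_likelihoods`, `call_genotype`, and `error_rate` are hypothetical.

```python
import math


def genotype_log_likelihoods(ref_count, alt_count, error_rate):
    """Log-likelihood of the read counts under each diploid genotype.

    Assumption (not from the paper): reads are independent and a read
    reports the wrong allele with probability `error_rate`.
    """
    if not 0.0 < error_rate < 0.5:
        raise ValueError("error_rate must lie strictly between 0 and 0.5")

    # Probability that a single read shows the alternate allele.
    p_alt_given_genotype = {
        "ref/ref": error_rate,        # alt reads arise only through errors
        "ref/alt": 0.5,               # symmetric errors cancel for heterozygotes
        "alt/alt": 1.0 - error_rate,  # ref reads arise only through errors
    }

    # The binomial coefficient is the same for every genotype, so it is omitted.
    return {
        genotype: alt_count * math.log(p) + ref_count * math.log(1.0 - p)
        for genotype, p in p_alt_given_genotype.items()
    }


def call_genotype(ref_count, alt_count, error_rate):
    """Return the maximum-likelihood genotype for one site in one sample."""
    log_likelihoods = genotype_log_likelihoods(ref_count, alt_count, error_rate)
    return max(log_likelihoods, key=log_likelihoods.get)


if __name__ == "__main__":
    # Same read counts, two hypothetical samples with different error rates.
    print(call_genotype(9, 1, error_rate=0.01))
    print(call_genotype(9, 1, error_rate=0.0001))
```

Under these assumptions, a single alternate read among ten is most plausibly an error in a sample with a 1% error rate, but is better explained by a heterozygous genotype in a sample with a 0.01% error rate. This is one reason a per-sample error estimate can change the calls and, in turn, downstream estimates of evolutionary parameters.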

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Africa
  • Australia
  • Biological Evolution*
  • Databases, Genetic
  • Female
  • Gene Frequency
  • Genetic Variation
  • Genetics, Population*
  • Genome, Human*
  • Genotype
  • Humans
  • Male
  • Models, Biological
  • Polymorphism, Single Nucleotide
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*