An evaluation of allele frequency estimation accuracy using pooled sequencing data

Int J Comput Biol Drug Des. 2013;6(4):279-93. doi: 10.1504/IJCBDD.2013.056709. Epub 2013 Sep 30.

Abstract

Next generation sequencing (NGS) technology has matured and, with its current affordability, will replace the SNP chip as the genotyping tool of choice. Even at current NGS prices, large-scale studies will require careful design to reduce cost. In this study, we designed an experiment to assess the accuracy of allele frequencies estimated from pooled sequencing data. We compared the allele frequencies estimated from sequencing data with those estimated from individual SNP chip data and observed high correlations between them. However, by calculating the error rate, we found that for many SNPs the allele frequency estimated from sequencing data differed significantly from the allele frequency estimated from SNP chip data. We conclude that correlation is not an ideal measure for comparing allele frequencies. For the purpose of estimating allele frequency, we do not recommend pooling with NGS as a cheaper alternative to genotyping each sample individually.
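The contrast between high correlation and frequent per-SNP disagreement can be illustrated with a small sketch. The abstract does not state how the error rate was computed, so everything below is an assumption for illustration only: the read counts and chip frequencies are made up, the helper names (`pooled_frequency`, `pearson`, `discordance_rate`) are hypothetical, and a fixed absolute tolerance of 0.05 stands in for whatever significance criterion the study actually used.

```python
import math

def pooled_frequency(alt_reads, total_reads):
    """Estimate allele frequency in a pool as the fraction of reads carrying the allele."""
    return alt_reads / total_reads

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def discordance_rate(pooled, chip, tolerance):
    """Fraction of SNPs whose two estimates differ by more than `tolerance`.

    Assumption: a simple stand-in for the study's (unspecified) error rate.
    """
    n_off = sum(1 for p, c in zip(pooled, chip) if abs(p - c) > tolerance)
    return n_off / len(pooled)

# Made-up example data, not taken from the study.
# Allele frequencies from individually genotyped samples (SNP chip):
chip_freq = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]
# (allele-carrying reads, total reads) at the same SNPs in the pooled sample:
read_counts = [(2, 100), (18, 100), (14, 100), (38, 100), (33, 100),
               (58, 100), (52, 100), (78, 100), (73, 100), (96, 100)]

pooled_freq = [pooled_frequency(alt, total) for alt, total in read_counts]

r = pearson(pooled_freq, chip_freq)
rate = discordance_rate(pooled_freq, chip_freq, tolerance=0.05)

print(f"Pearson r = {r:.3f}")
print(f"SNPs off by more than 0.05: {rate:.0%}")
```

In this toy data the correlation is above 0.95, yet nine of the ten SNPs differ from the chip estimate by more than 0.05. Correlation rewards agreement in overall trend across SNPs, so it can stay high even when most individual estimates are off.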

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Gene Frequency*
  • Genome-Wide Association Study
  • High-Throughput Nucleotide Sequencing*
  • Humans
  • Oligonucleotide Array Sequence Analysis
  • Polymorphism, Single Nucleotide
  • Sequence Analysis, DNA*