Genotyping common and rare variation using overlapping pool sequencing

BMC Bioinformatics. 2011;12 Suppl 6(Suppl 6):S2. doi: 10.1186/1471-2105-12-S6-S2. Epub 2011 Jul 28.

Abstract

Background: Recent advances in sequencing technologies set the stage for large, population-based studies, in which the DNA or RNA of thousands of individuals will be sequenced. Currently, however, such studies are still infeasible using a straightforward sequencing approach; as a result, a few multiplexing schemes have recently been suggested, in which a small number of DNA pools are sequenced, and the results are then deconvoluted using compressed sensing or similar approaches. These methods, however, are limited to the detection of rare variants.

Results: In this paper we provide a new algorithm for the deconvolution of DNA pool multiplexing schemes. The presented algorithm utilizes a likelihood model and linear programming. The approach allows for the addition of external data, particularly imputation data, resulting in a flexible framework that is suitable for different applications.
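
To make the idea of overlapping pools concrete, the following is a minimal toy sketch, not the likelihood and linear programming algorithm described in the paper. It assumes a hypothetical design in which each individual is placed in a distinct pair of pools, noise-free presence/absence calls per pool, and a naive decoder that keeps every individual whose pools are all positive. The function names (`build_design`, `pool_signals`, `decode`) are illustrative only.

```python
from itertools import combinations


def build_design(n_individuals, n_pools):
    """Assign each individual a distinct pair of pools (hypothetical toy design)."""
    pairs = list(combinations(range(n_pools), 2))
    if n_individuals > len(pairs):
        raise ValueError("not enough pool pairs for this many individuals")
    return {ind: set(pairs[ind]) for ind in range(n_individuals)}


def pool_signals(design, carriers, n_pools):
    """Per pool, report whether the variant allele would be seen (noise-free assumption)."""
    return [any(p in design[c] for c in carriers) for p in range(n_pools)]


def decode(design, signals):
    """Naive decoder: keep individuals whose pools are all positive."""
    positive = {p for p, seen in enumerate(signals) if seen}
    return sorted(ind for ind, pools in design.items() if pools <= positive)


# 10 individuals sequenced in only 5 overlapping pools.
design = build_design(10, 5)

# A single carrier is identified uniquely from the pattern of positive pools.
single = decode(design, pool_signals(design, {3}, 5))

# Two carriers already produce extra candidates under this naive decoder.
double = decode(design, pool_signals(design, {0, 7}, 5))
```

In this toy setting a single carrier is recovered exactly, while two carriers leave several additional candidates. This ambiguity as carriers become more frequent is the kind of limitation that motivates a probabilistic model for common variants.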

Conclusions: In particular, we demonstrate that SNPs with both low and high allele frequencies can be accurately genotyped when the DNA pooling scheme is performed in conjunction with microarray genotyping and imputation. Additionally, we demonstrate the use of our framework for the detection of cancer fusion genes from RNA sequences.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Base Sequence
  • Gene Frequency*
  • Gene Fusion
  • Genotype
  • Humans
  • Likelihood Functions
  • Neoplasms / genetics*
  • Polymorphism, Single Nucleotide*
  • Probability
  • Sequence Analysis, DNA*