RVD2: an ultra-sensitive variant detection model for low-depth heterogeneous next-generation sequencing data

Bioinformatics. 2015 Sep 1;31(17):2785-93. doi: 10.1093/bioinformatics/btv275. Epub 2015 Apr 29.

Abstract

Motivation: Next-generation sequencing technology is increasingly being used for clinical diagnostic tests. Clinical samples are often genomically heterogeneous due to low sample purity or the presence of genetic subpopulations. Therefore, a variant calling algorithm that can detect low-frequency polymorphisms in heterogeneous samples is needed.

Results: We present a novel variant calling algorithm that uses a hierarchical Bayesian model to estimate allele frequency and call variants in heterogeneous samples. We show that our algorithm improves upon current classifiers and has higher sensitivity and specificity over a wide range of median read depth and minor allele fraction. We apply our model and identify 15 mutated loci in the PAXP1 gene in a matched clinical breast ductal carcinoma tumor sample, two of which are likely loss-of-heterozygosity events.
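
To make the idea of Bayesian allele-frequency estimation concrete, the following is a minimal sketch and not the RVD2 model itself. The abstract does not specify the hierarchy, so this example assumes a much simpler single-locus Beta-Binomial model: non-reference read counts in a case and a control sample each get a conjugate Beta posterior, and a variant is called when the posterior probability that the case fraction exceeds the control fraction by more than a margin is high. The function names, the uniform Beta(1, 1) prior, the margin `tau` and the cutoff `threshold` are all illustrative assumptions.

```python
import random


def posterior_samples(non_ref_reads, depth, n_samples=5000,
                      prior_a=1.0, prior_b=1.0, rng=None):
    """Sample the Beta posterior of the non-reference read fraction.

    Assumes a Binomial likelihood with a Beta(prior_a, prior_b) prior,
    so the posterior is Beta(prior_a + k, prior_b + n - k).
    """
    rng = rng or random.Random(0)
    a = prior_a + non_ref_reads
    b = prior_b + depth - non_ref_reads
    return [rng.betavariate(a, b) for _ in range(n_samples)]


def call_variant(case, control, tau=0.0, threshold=0.95,
                 n_samples=5000, seed=0):
    """Compare one locus between a case and a control sample.

    case, control: (non_reference_reads, depth) tuples.
    tau, threshold: hypothetical tuning parameters for this sketch.
    Returns (is_variant, prob), where prob is a Monte Carlo estimate of
    P(case fraction - control fraction > tau | data).
    """
    rng = random.Random(seed)
    case_draws = posterior_samples(*case, n_samples=n_samples, rng=rng)
    control_draws = posterior_samples(*control, n_samples=n_samples, rng=rng)
    exceed = sum(1 for c, k in zip(case_draws, control_draws) if c - k > tau)
    prob = exceed / n_samples
    return prob >= threshold, prob


if __name__ == "__main__":
    # 30 of 1000 reads are non-reference in the case, 2 of 1000 in the control.
    print(call_variant((30, 1000), (2, 1000)))
    # Identical counts in both samples should not be called.
    print(call_variant((3, 1000), (3, 1000)))
```

A hierarchical model, as described in the abstract, would additionally share information across replicates and positions. This sketch treats each locus independently, which keeps it short but gives up that advantage at low read depth.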

Availability and implementation: http://genomics.wpi.edu/rvd2/.

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Algorithms*
  • Alleles
  • Bayes Theorem
  • Biomarkers, Tumor / genetics*
  • Breast Neoplasms / genetics
  • Carcinoma, Ductal, Breast / genetics
  • Female
  • Gene Frequency
  • Genomics
  • High-Throughput Nucleotide Sequencing / methods*
  • Humans
  • Loss of Heterozygosity
  • Mutation / genetics*
  • Polymorphism, Single Nucleotide / genetics*
  • Sequence Alignment
  • Sequence Analysis, DNA / methods*
  • Software*

Substances

  • Biomarkers, Tumor