Biased estimates of clonal evolution and subclonal heterogeneity can arise from PCR duplicates in deep sequencing experiments

Genome Biol. 2014 Aug 7;15(8):420. doi: 10.1186/s13059-014-0420-4.

Abstract

Accurate allele frequencies are important for measuring subclonal heterogeneity and clonal evolution. Deep targeted sequencing data can contain PCR duplicates, inflating perceived read depth. Here we adapted the Illumina TruSeq Custom Amplicon kit to include single molecule tagging (SMT) and show that SMT-identified duplicates arise from PCR. We demonstrate that retaining PCR duplicate reads can imply clonal evolution when none exists, while removing them effectively controls the false positive rate. Additionally, PCR duplicates alter estimates of subclonal heterogeneity in tumor samples. Our method simplifies the identification of PCR duplicates and underscores the need to remove them in studies of tumor heterogeneity and clonal evolution.
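The effect of duplicate reads on allele frequency can be illustrated with a minimal sketch. This is not the authors' pipeline: the read representation (amplicon, tag, allele), the function names `collapse_by_tag` and `allele_fraction`, the majority-vote consensus, and the toy read counts are all assumptions made for illustration. The sketch treats reads that share an amplicon and a molecular tag as copies of one original molecule, and compares the variant allele fraction before and after collapsing them.

```python
from collections import Counter, defaultdict


def collapse_by_tag(reads):
    """Collapse reads sharing (amplicon, tag) into one consensus allele.

    `reads` is an iterable of (amplicon, tag, allele) tuples. This layout is
    hypothetical; real data would be parsed from aligned reads.
    """
    groups = defaultdict(Counter)
    for amplicon, tag, allele in reads:
        groups[(amplicon, tag)][allele] += 1
    # Assumption: a simple majority vote stands in for consensus calling.
    return {key: counts.most_common(1)[0][0] for key, counts in groups.items()}


def allele_fraction(alleles, variant):
    """Fraction of observations that carry the variant allele."""
    alleles = list(alleles)
    if not alleles:
        return 0.0
    return sum(a == variant for a in alleles) / len(alleles)


# Toy data (invented): four original molecules, one of which carries the
# variant "T" and happened to be amplified six times.
reads = (
    [("amp1", "AAAC", "T")] * 6
    + [("amp1", "CGTA", "C")] * 1
    + [("amp1", "GGTT", "C")] * 2
    + [("amp1", "TTAG", "C")] * 1
)

raw_fraction = allele_fraction((allele for _, _, allele in reads), "T")
molecules = collapse_by_tag(reads)
dedup_fraction = allele_fraction(molecules.values(), "T")

print(f"reads: {len(reads)}, molecules: {len(molecules)}")
print(f"variant fraction with duplicates: {raw_fraction:.2f}")
print(f"variant fraction after collapsing: {dedup_fraction:.2f}")
```

In this invented example the ten reads come from only four molecules, so the apparent read depth and the variant fraction both overstate what was sampled. A shift of this kind between two time points could be mistaken for clonal evolution if duplicates were retained.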

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Clonal Evolution*
  • False Positive Reactions
  • Gene Frequency
  • Genetic Heterogeneity*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Leukemia, Lymphocytic, Chronic, B-Cell / genetics
  • Polymerase Chain Reaction / methods*
  • Sequence Analysis, DNA / methods
  • Software