ScanITD: Detecting internal tandem duplication with robust variant allele frequency estimation

Gigascience. 2020 Aug 1;9(8):giaa089. doi: 10.1093/gigascience/giaa089.

Abstract

Background: Internal tandem duplications (ITDs) are tandem duplications within coding exons and are important prognostic markers and drug targets in acute myeloid leukemia (AML). Next-generation sequencing has enabled the discovery of ITDs at single-nucleotide resolution. ITD allele frequency is used in the risk stratification of patients with AML; a higher ITD allele frequency is associated with poorer clinical outcomes. However, ITD allele frequency data are often unavailable to treating physicians, and detecting ITDs with accurate variant allele frequency (VAF) estimation remains challenging with short-read sequencing.

Results: Here we present ScanITD, which performs a stepwise seed-and-realignment procedure for ITD detection with accurate VAF prediction. Evaluations on simulated and real data demonstrate that ScanITD outperforms 3 state-of-the-art ITD detectors, especially in VAF estimation. Importantly, ScanITD yields better accuracy than general-purpose structural variation callers when detecting duplications in the ITD size range.
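To make the two quantities in this paragraph concrete, the following is a minimal illustrative sketch, not ScanITD's actual implementation. It assumes the common definition of VAF as the fraction of reads at a locus that support the variant allele, and it treats an insertion as a tandem duplication when the inserted sequence exactly copies the reference segment immediately upstream of the insertion point. The function names `estimate_vaf` and `is_tandem_duplication`, and the toy sequences, are hypothetical.

```python
def estimate_vaf(alt_reads: int, ref_reads: int) -> float:
    """Return the fraction of reads supporting the variant allele.

    Assumption: VAF = alt / (alt + ref). This is the generic definition,
    not necessarily the estimator ScanITD uses.
    """
    total = alt_reads + ref_reads
    if total == 0:
        raise ValueError("no reads cover the locus")
    return alt_reads / total


def is_tandem_duplication(reference: str, position: int, inserted: str) -> bool:
    """Check whether `inserted`, placed at `position` in `reference`,
    exactly copies the reference segment that ends at `position`.

    Assumption: only perfect, directly adjacent copies count; real ITDs
    may carry mismatches or extra inserted bases.
    """
    length = len(inserted)
    if length == 0 or position < length or position > len(reference):
        return False
    return reference[position - length:position] == inserted


if __name__ == "__main__":
    # Toy example: 30 reads support the duplication, 70 support the reference.
    print(estimate_vaf(30, 70))
    # "ACGT" inserted after the first four bases repeats them in tandem.
    print(is_tandem_duplication("ACGTACGGT", 4, "ACGT"))
```

In practice the difficulty described above lies in counting the supporting reads correctly, since short reads that only partially span a duplication are often soft-clipped or misaligned; the sketch deliberately takes those counts as given.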

Conclusions: ScanITD enables the accurate identification of ITDs with robust VAF estimation. ScanITD is written in Python and is open-source software that is freely accessible at https://github.com/ylab-hi/ScanITD.

Keywords: FLT3; TCGA; acute myeloid leukemia; chimeric alignment; internal tandem duplications; variant allele frequency.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Gene Frequency
  • High-Throughput Nucleotide Sequencing
  • Humans
  • Leukemia, Myeloid, Acute* / genetics
  • Mutation
  • Tandem Repeat Sequences*
  • fms-Like Tyrosine Kinase 3

Substances

  • fms-Like Tyrosine Kinase 3