Indel detection from DNA and RNA sequencing data with transIndel

BMC Genomics. 2018 Apr 19;19(1):270. doi: 10.1186/s12864-018-4671-4.

Abstract

Background: Insertions and deletions (indels) are a major class of genomic variation associated with human disease. Indels are primarily detected from DNA sequencing (DNA-seq) data but their transcriptional consequences remain unexplored due to challenges in discriminating medium-sized and large indels from splicing events in RNA-seq data.

Results: Here, we developed transIndel, a splice-aware algorithm that parses the chimeric alignments predicted by a short read aligner and reconstructs the mid-sized insertions and large deletions based on the linear alignments of split reads from DNA-seq or RNA-seq data. TransIndel exhibits competitive or superior performance over eight state-of-the-art indel detection tools on benchmarks using both synthetic and real DNA-seq data. Additionally, we applied transIndel to DNA-seq and RNA-seq datasets from 333 primary prostate cancer patients from The Cancer Genome Atlas (TCGA) and 59 metastatic prostate cancer patients from AACR-PCF Stand-Up-To-Cancer (SU2C) studies. TransIndel enhanced the taxonomy of DNA- and RNA-level alterations in prostate cancer by identifying recurrent FOXA1 indels as well as exitron splicing in genes implicated in disease progression.
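The idea of reconstructing an indel from the two linear pieces of a split read can be illustrated with a minimal sketch. This is not the transIndel implementation: the `Segment` record, the `infer_indel` function, and the `known_introns` set are hypothetical names introduced here, and the sketch assumes that each aligned piece is given with 0-based, half-open coordinates and that read coordinates are already expressed in reference-forward orientation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """One linear piece of a chimeric (split-read) alignment (hypothetical record)."""
    chrom: str
    strand: str
    ref_start: int   # 0-based, inclusive, on the reference
    ref_end: int     # exclusive, on the reference
    read_start: int  # 0-based, inclusive, on the read
    read_end: int    # exclusive, on the read


Indel = Tuple[str, str, int, int]      # (type, chrom, position, length)
Intron = Tuple[str, int, int]          # (chrom, start, end)


def infer_indel(
    a: Segment,
    b: Segment,
    known_introns: FrozenSet[Intron] = frozenset(),
) -> Optional[Indel]:
    """Infer a candidate indel from two pieces of the same read, or return None."""
    # Pieces on different chromosomes or strands suggest another event type.
    if a.chrom != b.chrom or a.strand != b.strand:
        return None

    # Order the pieces by where they sit on the read.
    left, right = sorted((a, b), key=lambda s: s.read_start)
    ref_gap = right.ref_start - left.ref_end
    read_gap = right.read_start - left.read_end

    # Read is contiguous but the reference is skipped: candidate deletion.
    if read_gap == 0 and ref_gap > 0:
        # Splice-aware step (assumed here): ignore gaps matching annotated introns.
        if (left.chrom, left.ref_end, right.ref_start) in known_introns:
            return None
        return ("DEL", left.chrom, left.ref_end, ref_gap)

    # Reference is contiguous but read bases are unaligned: candidate insertion.
    if ref_gap == 0 and read_gap > 0:
        return ("INS", left.chrom, left.ref_end, read_gap)

    return None
```

In this sketch, a gap that coincides exactly with an annotated intron is discarded, which is one simple way to separate splicing from deletions in RNA-seq data; the criteria used by transIndel itself are not stated in this abstract.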

Conclusions: Our study demonstrates that transIndel is a robust tool for elucidation of medium- and large-sized indels from DNA-seq and RNA-seq data. Including RNA-seq in indel discovery efforts leads to significant improvements in sensitivity for identification of medium-sized and large indels missed by DNA-seq, and reveals non-canonical RNA-splicing events in genes associated with disease pathology.

Keywords: Cancer genome; DNA-seq; Exitron; Indel detection; Metastasis; RNA-seq; TCGA.

MeSH terms

  • DNA Mutational Analysis*
  • Exons / genetics
  • Gene Expression Profiling
  • Humans
  • INDEL Mutation*
  • Male
  • Neoplasm Metastasis
  • Prostatic Neoplasms / genetics
  • Prostatic Neoplasms / pathology
  • RNA Splicing / genetics
  • Sequence Analysis, RNA*