DEBrowser: interactive differential expression analysis and visualization tool for count data

BMC Genomics. 2019 Jan 5;20(1):6. doi: 10.1186/s12864-018-5362-x.

Abstract

Background: Sequencing data has become a standard measure of diverse cellular activities. For example, gene expression is accurately measured by RNA sequencing (RNA-Seq) libraries; protein-DNA interactions are captured by chromatin immunoprecipitation sequencing (ChIP-Seq); protein-RNA interactions by crosslinking immunoprecipitation sequencing (CLIP-Seq) or RNA immunoprecipitation sequencing (RIP-Seq); and DNA accessibility by assay for transposase-accessible chromatin using sequencing (ATAC-Seq), DNase sequencing, or MNase sequencing libraries. Processing the data from each of these techniques involves library-specific approaches. However, in all cases, once the sequencing libraries are processed, the result is a count table specifying the estimated number of reads originating from each genomic locus. Differential analysis, which determines which loci have different cellular activity under different conditions, starts with the count table and iterates through a cycle of data assessment, preparation, and analysis. Such complex analysis often relies on multiple programs and is therefore a challenge for those without programming skills.

Results: We developed DEBrowser as an R Bioconductor project to interactively visualize every step of the differential analysis, without programming. The application provides a rich and interactive web-based graphical user interface built on R's Shiny infrastructure. DEBrowser allows users to visualize data with various types of graphs that can be explored further by selecting and re-plotting any desired subset of data. Using the visualization approaches provided, users can identify and correct technical variation, such as batch effects and differences in sequencing depth, that affects differential analysis. We show DEBrowser's ease of use by reproducing the analysis of two previously published data sets.
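
To make the sequencing-depth issue concrete, the following minimal sketch rescales a toy count table to counts per million (CPM), so that samples sequenced to different depths become comparable. It is an illustration only and not DEBrowser's own code: DEBrowser is an R package, and CPM is used here as a simple, generic scaling scheme. The function name `counts_per_million` and the toy gene names and counts are hypothetical.

```python
def counts_per_million(count_table):
    """Scale each sample so that its counts sum to one million.

    count_table maps a locus name to a list of raw read counts,
    one entry per sample (hypothetical layout, for illustration only).
    """
    n_samples = len(next(iter(count_table.values())))
    # Sequencing depth of a sample = total reads counted across all loci.
    depths = [sum(row[i] for row in count_table.values()) for i in range(n_samples)]
    return {
        locus: [1e6 * count / depth if depth else 0.0
                for count, depth in zip(row, depths)]
        for locus, row in count_table.items()
    }


# Toy data: the second sample was sequenced twice as deeply as the first,
# but the relative abundance of each gene is identical in both samples.
toy_counts = {
    "geneA": [100, 200],
    "geneB": [300, 600],
    "geneC": [600, 1200],
}
cpm = counts_per_million(toy_counts)
```

In the raw table every gene appears to double between the two samples. After scaling, each gene has the same value in both samples, which shows that the apparent difference came from sequencing depth alone.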

Conclusions: DEBrowser is a flexible, intuitive, web-based analysis platform that enables iterative and interactive analysis of count data without requiring any programming knowledge.

Keywords: Data visualization; Differential expression; Interactive data analysis.

MeSH terms

  • Chromatin / genetics
  • Chromatin Immunoprecipitation / statistics & numerical data*
  • DNA / genetics
  • DNA-Binding Proteins / genetics
  • Data Interpretation, Statistical
  • Genome, Human / genetics*
  • Genomics / statistics & numerical data
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Humans
  • Sequence Analysis, DNA
  • Sequence Analysis, RNA / statistics & numerical data*
  • Software*

Substances

  • Chromatin
  • DNA-Binding Proteins
  • DNA