CoBRA: Containerized Bioinformatics Workflow for Reproducible ChIP/ATAC-seq Analysis

Genomics Proteomics Bioinformatics. 2021 Aug;19(4):652-661. doi: 10.1016/j.gpb.2020.11.007. Epub 2021 Jul 18.

Abstract

Chromatin immunoprecipitation sequencing (ChIP-seq) and the Assay for Transposase-Accessible Chromatin with high-throughput sequencing (ATAC-seq) have become essential technologies for measuring protein-DNA interactions and chromatin accessibility. However, there is a need for a scalable and reproducible pipeline that incorporates proper normalization between samples, correction of copy number variations, and integration of new downstream analysis tools. Here we present Containerized Bioinformatics workflow for Reproducible ChIP/ATAC-seq Analysis (CoBRA), a modularized computational workflow that quantifies ChIP-seq and ATAC-seq peak regions and performs unsupervised and supervised analyses. CoBRA provides a comprehensive, state-of-the-art ChIP-seq and ATAC-seq analysis pipeline that can be used by scientists with limited computational experience. It enables researchers to gain rapid insight into protein-DNA interactions and chromatin accessibility through sample clustering, differential peak calling, motif enrichment, comparison of sites to a reference database, and pathway analysis. CoBRA is publicly available online at https://bitbucket.org/cfce/cobra.

Keywords: ATAC-seq; ChIP-seq; Docker; Snakemake; Workflow.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Chromatin / genetics
  • Chromatin Immunoprecipitation Sequencing*
  • Computational Biology*
  • DNA Copy Number Variations
  • High-Throughput Nucleotide Sequencing
  • Sequence Analysis, DNA
  • Workflow

Substances

  • Chromatin