The analysis of ChIP-Seq data

Methods Enzymol. 2011:497:51-73. doi: 10.1016/B978-0-12-385075-1.00003-2.

Abstract

Chromatin immunoprecipitation coupled with ultra-high-throughput parallel DNA sequencing (ChIP-seq) is an effective technology for investigating genome-wide protein-DNA interactions. Applications include studies of transcription by RNA polymerases, transcriptional regulation, and histone modifications. The technology provides accurate, high-resolution mapping of protein-DNA binding loci that are important for understanding many processes in development and disease. Since the introduction of ChIP-seq experiments in 2007, many statistical and computational methods have been developed to support the analysis of the massive datasets these experiments produce. However, because the analysis workflow is complex and multistaged, it is still difficult for experimental investigators to analyze their own ChIP-seq data. In this chapter, we review the basic design of ChIP-seq experiments and provide an in-depth tutorial on how to prepare, preprocess, and analyze ChIP-seq datasets. The tutorial is based on a revised version of our software package CisGenome, which was designed to encompass most standard tasks in ChIP-seq data analysis. Relevant statistical and computational issues are highlighted, discussed, and illustrated with real data examples.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Base Sequence
  • Chromatin Immunoprecipitation / instrumentation
  • Chromatin Immunoprecipitation / methods*
  • Computational Biology / instrumentation
  • Computational Biology / methods
  • Databases, Nucleic Acid
  • Genome
  • Humans
  • Sequence Analysis, DNA / instrumentation
  • Sequence Analysis, DNA / methods*
  • Software