Joint modeling of ChIP-seq data via a Markov random field model

Biostatistics. 2014 Apr;15(2):296-310. doi: 10.1093/biostatistics/kxt047. Epub 2013 Oct 30.

Abstract

Chromatin ImmunoPrecipitation-sequencing (ChIP-seq) experiments are now routine in biology for the detection of protein-binding sites. In this paper, we present a Markov random field model for the joint analysis of multiple ChIP-seq experiments. The proposed model naturally accounts for spatial dependencies in the data by assuming first-order Markov dependence, and for the large proportion of zero counts by using zero-inflated mixture distributions. In contrast to all other available implementations, the model allows for the joint modeling of multiple experiments by incorporating key aspects of the experimental design. In particular, the model uses the information about replicates and about the different antibodies used in the experiments. An extensive simulation study shows a lower false non-discovery rate for the proposed method, compared with existing methods, at the same false discovery rate. Finally, we present an analysis of real data for the detection of histone modifications of two chromatin modifiers from eight ChIP-seq experiments, including technical replicates with different IP efficiencies.
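As an illustration of the two ingredients named in the abstract (first-order Markov dependence between neighbouring regions and a zero-inflated count distribution), the following minimal sketch simulates bin counts from a two-state chain. It is not the paper's implementation. The binary background/enriched state, the Poisson component, and all names and parameter values (`simulate_bins`, `zip_pmf`, `zero_prob`, `lam_bg`, `lam_enriched`, the transition probabilities) are assumptions made for illustration only; the abstract specifies none of them.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (adequate for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def zip_pmf(k, zero_prob, lam):
    """Zero-inflated Poisson pmf: extra mass zero_prob at 0, Poisson(lam) otherwise."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return zero_prob * (1.0 if k == 0 else 0.0) + (1.0 - zero_prob) * poisson


def simulate_bins(n_bins, p_stay_bg=0.95, p_stay_enriched=0.80,
                  zero_prob=0.5, lam_bg=1.0, lam_enriched=8.0, seed=0):
    """Simulate latent states and counts for consecutive genomic bins.

    Hypothetical setup: state 0 = background, state 1 = enriched.
    Each state depends only on the previous bin's state (first-order Markov).
    Background counts are zero-inflated Poisson; enriched counts are Poisson.
    """
    rng = random.Random(seed)
    states, counts = [], []
    state = 0  # assumed: start in the background state
    for _ in range(n_bins):
        states.append(state)
        if state == 0:
            if rng.random() < zero_prob:
                counts.append(0)  # structural zero
            else:
                counts.append(sample_poisson(rng, lam_bg))
        else:
            counts.append(sample_poisson(rng, lam_enriched))
        # Transition to the next bin depends only on the current state.
        stay = p_stay_bg if state == 0 else p_stay_enriched
        if rng.random() >= stay:
            state = 1 - state
    return states, counts


if __name__ == "__main__":
    states, counts = simulate_bins(50)
    print(states)
    print(counts)
```

In this sketch, long runs of background bins with many zeros alternate with shorter runs of enriched bins, which is the qualitative pattern that a spatially dependent, zero-inflated model is meant to capture. A joint model over several experiments would additionally share information across replicates and antibodies, which this single-track simulation does not attempt.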

Keywords: ChIP-sequencing; Markov random field model; Mixture distributions.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromatin Immunoprecipitation / standards*
  • Markov Chains*
  • Models, Statistical*
  • Protein Binding
  • Sequence Analysis, DNA / standards*
  • Statistical Distributions