Nonparametric Bayesian Bi-Clustering for Next Generation Sequencing Count Data

Bayesian Anal. 2013 Dec;8(4):759-780. doi: 10.1214/13-ba822.

Abstract

Histone modifications (HMs), which are post-translational modifications of histone proteins, play important roles in transcription. Combinations of HMs, known as chromatin signatures, encode specific messages for gene regulation. We therefore expect that inferring a clustering of HMs, and annotating genomic locations on the basis of that clustering, can contribute new insights into the functions of regulatory elements and their relationships to combinations of HMs. We propose a nonparametric Bayesian local clustering Poisson model (NoB-LCP) to facilitate posterior inference on two-dimensional clustering of HMs and genomic locations. The NoB-LCP clusters HMs into HM sets and lets each HM set define its own clustering of genomic locations. Furthermore, it probabilistically excludes HMs and genomic locations that are irrelevant to clustering. By doing so, the proposed model effectively identifies important sets of HMs and groups regulatory elements with similar functionality based on HM patterns.

Keywords: Bi-Clustering; ChIP-Seq; Histone modifications; Markov chain Monte Carlo; Nonparametric Bayes.
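
The two-level structure summarized in the abstract (HMs grouped into HM sets, each HM set carrying its own partition of genomic locations, with some HMs and locations excluded) can be illustrated with a small simulation. The following is only a toy sketch under stated assumptions, not the NoB-LCP model itself: the function name, the fixed number of location clusters, the gamma-distributed rates, the single background rate, and the `-1` convention for excluded HMs and inactive locations are all hypothetical choices made for illustration, and the fixed cluster count stands in for the nonparametric prior used in the paper.

```python
import numpy as np


def simulate_local_clustering_counts(hm_set, n_locations, n_clusters=3,
                                     p_inactive=0.2, background_rate=2.0,
                                     seed=0):
    """Simulate a (locations x HMs) count matrix with local clustering.

    hm_set[j] is the HM-set label of HM j, or -1 if HM j is excluded
    from clustering (hypothetical convention for this sketch).
    """
    rng = np.random.default_rng(seed)
    hm_set = np.asarray(hm_set)
    n_hms = hm_set.size

    # Start from pure background: every cell shares one Poisson rate.
    rates = np.full((n_locations, n_hms), background_rate)
    location_clusters = {}

    for s in np.unique(hm_set[hm_set >= 0]):
        # Each HM set gets its own partition of the genomic locations.
        labels = rng.integers(0, n_clusters, size=n_locations)
        # Some locations are inactive (label -1) with respect to this HM set.
        labels[rng.random(n_locations) < p_inactive] = -1
        location_clusters[int(s)] = labels
        active = labels >= 0

        for j in np.flatnonzero(hm_set == s):
            # One Poisson rate per (HM, location cluster) pair.
            cluster_rates = rng.gamma(shape=2.0, scale=10.0, size=n_clusters)
            rates[active, j] = cluster_rates[labels[active]]

    # Counts are conditionally independent Poisson draws given the rates.
    counts = rng.poisson(rates)
    return counts, rates, location_clusters


if __name__ == "__main__":
    # Four HMs: two in set 0, one in set 1, one excluded from clustering.
    counts, rates, clusters = simulate_local_clustering_counts(
        hm_set=[0, 0, 1, -1], n_locations=50)
    print(counts.shape)
    print({s: np.unique(lab) for s, lab in clusters.items()})
```

In this sketch, HMs in the same set share one partition of locations but keep their own rates, which is one simple way to let an HM set "define its own clustering of genomic locations"; posterior inference over the partitions (for example by Markov chain Monte Carlo) is not attempted here.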