In silico pooling of ChIP-seq control experiments

Guannan Sun; Rajini Srinivasan; Camila Lopez-Anido; Holly A Hung; John Svaren; Sündüz Keleş

doi:10.1371/journal.pone.0109691

PLoS One. 2014 Nov 7;9(11):e109691. doi: 10.1371/journal.pone.0109691. eCollection 2014.

Authors

Guannan Sun¹, Rajini Srinivasan², Camila Lopez-Anido², Holly A Hung², John Svaren², Sündüz Keleş³

Affiliations

¹ Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America.
² Waisman Center, Department of Comparative Biosciences, University of Wisconsin, Madison, Wisconsin, United States of America.
³ Department of Statistics, University of Wisconsin, Madison, Wisconsin, United States of America; Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin, United States of America.

Abstract

As next-generation sequencing technologies become more economical, large-scale ChIP-seq studies are enabling the investigation of the roles of transcription factor binding and the epigenome in phenotypic variation. Studying such variation requires individual-level ChIP-seq experiments. Standard designs for ChIP-seq experiments employ a paired control per ChIP-seq sample. Genomic coverage for control experiments is often sacrificed to increase the resources for ChIP samples. However, the quality of ChIP-enriched regions identifiable from a ChIP-seq experiment depends on the quality and the coverage of the control experiments. Insufficient coverage leads to loss of power in detecting enrichment. We investigate the effect of in silico pooling of control samples within multiple biological replicates, multiple treatment conditions, and multiple cell lines and tissues across multiple datasets with varying levels of genomic coverage. Our computational studies suggest guidelines for performing in silico pooling of control experiments. Using vast amounts of ENCODE data, we show that pairwise correlations between control samples originating from multiple biological replicates, treatments, and cell lines/tissues can be grouped into two classes representing whether or not in silico pooling leads to power gain in detecting enrichment between the ChIP and the control samples. Our findings have important implications for multiplexing samples.
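The two operations the abstract refers to, pairwise correlation between control samples and in silico pooling, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the representation of each control as a vector of per-bin read counts, the use of Pearson correlation, and the sample names and counts are all assumptions made for illustration, and the abstract does not state which correlation values separate the two classes.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length count vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(controls):
    """Correlation for every pair of controls (dict: sample name -> bin counts)."""
    names = sorted(controls)
    return {
        (a, b): pearson(controls[a], controls[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


def pool(controls, names):
    """In silico pooling: add read counts bin by bin across the chosen controls."""
    return [sum(counts) for counts in zip(*(controls[n] for n in names))]


# Hypothetical per-bin read counts for three control samples (made-up numbers).
controls = {
    "rep1": [4, 9, 2, 7, 5],
    "rep2": [5, 8, 3, 6, 5],
    "other": [9, 1, 8, 2, 6],
}

corr = pairwise_correlations(controls)
pooled = pool(controls, ["rep1", "rep2"])
```

In this toy input, `rep1` and `rep2` are strongly correlated while `other` is not, so only the first two are pooled. Whether a given correlation level justifies pooling in real data is the question the study addresses.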

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Animals
CCCTC-Binding Factor
Chromatin Immunoprecipitation*
Cluster Analysis
Computational Biology / methods*
Computer Simulation*
High-Throughput Nucleotide Sequencing*
Histone Deacetylases / genetics
Histones / genetics
Humans
JNK Mitogen-Activated Protein Kinases / genetics
K562 Cells
Rats
Repressor Proteins / genetics

Substances

CCCTC-Binding Factor
CTCF protein, human
Histones
Repressor Proteins
JNK Mitogen-Activated Protein Kinases
HDAC8 protein, human
Histone Deacetylases
