Multiple testing methods for ChIP-Chip high density oligonucleotide array data

J Comput Biol. 2006 Apr;13(3):579-613. doi: 10.1089/cmb.2006.13.579.

Abstract

Cawley et al. (2004) have recently mapped the locations of binding sites for three transcription factors along human chromosomes 21 and 22 using ChIP-Chip experiments. ChIP-Chip experiments are a new approach to the genomewide identification of transcription factor binding sites and consist of chromatin (Ch) immunoprecipitation (IP) of transcription factor-bound genomic DNA followed by high density oligonucleotide hybridization (Chip) of the IP-enriched DNA. We investigate the ChIP-Chip data structure and propose methods for inferring the location of transcription factor binding sites from these data. The proposed methods involve testing for each probe whether it is part of a bound sequence using a scan statistic that takes into account the spatial structure of the data. Different multiple testing procedures are considered for controlling the familywise error rate and false discovery rate. A nested-Bonferroni adjustment, which is more powerful than the traditional Bonferroni adjustment when the test statistics are dependent, is discussed. Simulation studies show that taking into account the spatial structure of the data substantially improves the sensitivity of the multiple testing procedures. Application of the proposed methods to ChIP-Chip data for transcription factor p53 identified many potential target binding regions along human chromosomes 21 and 22. Among these identified regions, 18% fall within a 3 kb vicinity of the 5'UTR of a known gene or CpG island and 31% fall between the codon start site and the codon end site of a known gene but not inside an exon. More than half of these potential target sequences contain the p53 consensus binding site or very close matches to it. Moreover, these target segments include the 13 experimentally verified p53 binding regions of Cawley et al. (2004), as well as 49 additional regions that show higher hybridization signal than these 13 experimentally verified regions.
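The general idea of testing each probe with a scan statistic and then adjusting for multiplicity can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the per-probe scores are treated as independent standard normal under the null, the scan statistic is a standardized moving-window sum, and the function names, the window half-width, and the toy data are hypothetical. The nested-Bonferroni adjustment mentioned above is not reproduced, since the abstract does not define it; the traditional Bonferroni procedure (familywise error rate) and the Benjamini-Hochberg step-up procedure (false discovery rate) stand in for it.

```python
from statistics import NormalDist


def scan_statistics(scores, half_width):
    """Standardized moving-window sum of per-probe scores (illustrative only).

    Assumes each score is N(0, 1) and independent under the null, so a sum
    over k probes has standard deviation sqrt(k).
    """
    n = len(scores)
    stats = []
    for i in range(n):
        lo = max(0, i - half_width)          # window is truncated at the ends
        hi = min(n, i + half_width + 1)
        window = scores[lo:hi]
        stats.append(sum(window) / len(window) ** 0.5)
    return stats


def upper_tail_p_values(stats):
    """One-sided p-values under the assumed standard normal null."""
    norm = NormalDist()
    return [1.0 - norm.cdf(s) for s in stats]


def bonferroni_reject(p_values, alpha):
    """Traditional Bonferroni: reject when p <= alpha / m (controls FWER)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha):
    """Benjamini-Hochberg step-up procedure (controls FDR under independence
    or positive dependence of the test statistics)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            k_max = rank                      # largest rank passing its threshold
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Toy data (hypothetical): 20 probes, with an enriched stretch at probes 8-12.
toy_scores = [0.0] * 20
for i in range(8, 13):
    toy_scores[i] = 3.0

toy_stats = scan_statistics(toy_scores, half_width=2)
toy_p = upper_tail_p_values(toy_stats)
fwer_hits = [i for i, r in enumerate(bonferroni_reject(toy_p, 0.05)) if r]
fdr_hits = [i for i, r in enumerate(benjamini_hochberg_reject(toy_p, 0.05)) if r]
print("Bonferroni rejections:", fwer_hits)
print("Benjamini-Hochberg rejections:", fdr_hits)
```

Because neighbouring windows overlap, the window statistics are dependent even when the underlying scores are not. That dependence is the setting in which the abstract describes the nested-Bonferroni adjustment as more powerful than the traditional Bonferroni adjustment. At the same level, Benjamini-Hochberg rejects at least every probe that Bonferroni rejects.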

MeSH terms

  • 5' Untranslated Regions / genetics
  • 5' Untranslated Regions / metabolism
  • Binding Sites / genetics
  • Chromatin Immunoprecipitation*
  • Chromosome Mapping*
  • Chromosomes, Human, Pair 21 / genetics
  • Chromosomes, Human, Pair 21 / metabolism
  • Chromosomes, Human, Pair 22 / genetics
  • Chromosomes, Human, Pair 22 / metabolism
  • CpG Islands / genetics
  • Gene Expression Profiling
  • Genome, Human / genetics*
  • Humans
  • Models, Genetic*
  • Oligonucleotide Array Sequence Analysis*
  • Protein Binding
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Tumor Suppressor Protein p53 / genetics*
  • Tumor Suppressor Protein p53 / metabolism

Substances

  • 5' Untranslated Regions
  • Tumor Suppressor Protein p53