Gene set control analysis predicts hematopoietic control mechanisms from genome-wide transcription factor binding data

Exp Hematol. 2013 Apr;41(4):354-66.e14. doi: 10.1016/j.exphem.2012.11.008. Epub 2012 Dec 4.

Abstract

Transcription factors are key regulators of both normal and malignant hematopoiesis. Chromatin immunoprecipitation (ChIP) coupled with high-throughput sequencing (ChIP-Seq) has become the method of choice to interrogate the genome-wide effect of transcription factors. We have collected and integrated 142 publicly available ChIP-Seq datasets for both normal and leukemic murine blood cell types. In addition, we introduce the new bioinformatic tool Gene Set Control Analysis (GSCA). GSCA predicts likely upstream regulators for lists of genes based on the statistical significance of binding event enrichment within the gene loci of a user-supplied gene set. We show that applying GSCA to lineage-restricted gene sets reveals both expected and previously unrecognized candidate upstream regulators. Moreover, application of GSCA to leukemic gene sets allowed us to predict the reactivation of blood stem cell control mechanisms as a likely contributor to LMO2-driven leukemia. It also allowed us to clarify the recent debate on the role of Myc in leukemia stem cell transcriptional programs. GSCA thus provides a valuable new addition to the analysis of gene sets of interest, complementary to Gene Ontology and Gene Set Enrichment analyses. To make the tool accessible to the wider research community, we have implemented GSCA as a freely accessible web tool (http://bioinformatics.cscr.cam.ac.uk/GSCA/GSCA.html).
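
The enrichment idea described above can be illustrated with a minimal sketch. The abstract does not state which statistical test GSCA uses, so the example below assumes a one-sided hypergeometric test on gene-level overlaps; this is a common choice for such analyses, not necessarily the authors' method. The function names (`hypergeom_sf`, `rank_regulators`), the dataset names, and the toy gene identifiers are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N genes, K bound, n drawn)."""
    upper = min(K, n)
    total = comb(N, n)
    # Sum the probability mass from k up to the largest possible overlap.
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def rank_regulators(gene_set, bound_genes_by_dataset, background):
    """Rank ChIP-Seq datasets by over-representation of bound genes.

    Assumption: each dataset has already been reduced to the set of genes
    with at least one binding event in their locus.
    """
    background = set(background)
    gene_set = set(gene_set) & background
    results = []
    for name, bound in bound_genes_by_dataset.items():
        bound = set(bound) & background
        overlap = len(gene_set & bound)
        p = hypergeom_sf(overlap, len(background), len(bound), len(gene_set))
        results.append((name, overlap, p))
    # Smallest p-value first: the most strongly enriched candidate regulator.
    return sorted(results, key=lambda r: r[2])


# Toy data (hypothetical): 20 background genes, two ChIP-Seq datasets.
background = [f"g{i}" for i in range(20)]
datasets = {
    "TF_A": [f"g{i}" for i in range(0, 5)],
    "TF_B": [f"g{i}" for i in range(10, 15)],
}
query = ["g0", "g1", "g2", "g3"]

for name, overlap, p in rank_regulators(query, datasets, background):
    print(f"{name}\toverlap={overlap}\tp={p:.4g}")
```

With many datasets tested against one gene set (142 in the compendium described here), the raw p-values from a sketch like this would normally need a multiple-testing correction before any candidate regulator is interpreted.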

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Binding Sites / genetics
  • Cell Line
  • Chromatin Immunoprecipitation
  • Cluster Analysis
  • Computational Biology / methods*
  • Gene Expression Profiling
  • Gene Expression Regulation, Leukemic
  • Genome / genetics*
  • Hematopoiesis / genetics*
  • Hematopoietic Stem Cells / metabolism
  • Humans
  • Internet
  • Leukemia / genetics
  • Lymphocytes / metabolism
  • Mice
  • Reproducibility of Results
  • Sequence Analysis, DNA
  • Transcription Factors / genetics*
  • Transcription Factors / metabolism

Substances

  • Transcription Factors