High-resolution human core-promoter prediction with CoreBoost_HM

Genome Res. 2009 Feb;19(2):266-75. doi: 10.1101/gr.081638.108. Epub 2008 Nov 7.

Abstract

Correctly locating the transcription start site and the core-promoter of a gene is important for understanding the mechanisms of transcriptional regulation. Here we have integrated specific genome-wide histone modification and DNA sequence features to predict RNA polymerase II core-promoters in the human genome. Our new predictor, CoreBoost_HM, outperforms existing promoter prediction algorithms by providing significantly higher sensitivity and specificity at high resolution. We demonstrate that even though the histone modification data used in this study are from a specific cell type (CD4+ T-cell), our method can be used to identify both active and repressed promoters. We have applied it to search the upstream regions of microRNA genes, and show that CoreBoost_HM can accurately identify the known promoters of the intergenic microRNAs. We also identified a few intronic microRNAs that may have their own promoters. These results suggest that our new method can help to identify and characterize the core-promoters of both coding and noncoding genes.
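
The claim of higher sensitivity and specificity "at high resolution" can be made concrete with a small scoring sketch. The abstract does not give the evaluation protocol, so everything below is an assumption for illustration: the function name, the distance tolerance, and the use of positive predictive value as a stand-in for specificity. A prediction counts as correct if it falls within max_dist bp of an annotated transcription start site (TSS), and an annotated TSS counts as recovered if some prediction falls within the same distance.

```python
from bisect import bisect_left


def evaluate_at_resolution(predicted, annotated, max_dist):
    """Score TSS predictions against annotations at a distance tolerance.

    Hypothetical helper, not taken from the paper. Positions are integer
    coordinates on a single strand of a single chromosome.
    Returns (sensitivity, positive_predictive_value).
    """
    pred = sorted(predicted)
    anno = sorted(annotated)

    def has_neighbor(pos, sorted_positions):
        # First position >= pos - max_dist; it is a neighbor only if it is
        # also <= pos + max_dist.
        i = bisect_left(sorted_positions, pos - max_dist)
        return i < len(sorted_positions) and sorted_positions[i] <= pos + max_dist

    # Annotated sites that have at least one nearby prediction.
    recovered = sum(has_neighbor(a, pred) for a in anno)
    # Predictions that have at least one nearby annotated site.
    correct = sum(has_neighbor(p, anno) for p in pred)

    sensitivity = recovered / len(anno) if anno else 0.0
    ppv = correct / len(pred) if pred else 0.0
    return sensitivity, ppv


# Toy coordinates, invented for illustration only.
annotated_tss = [1000, 5000, 9000]
predicted_tss = [1020, 5400, 9010, 12000]

strict = evaluate_at_resolution(predicted_tss, annotated_tss, max_dist=50)
loose = evaluate_at_resolution(predicted_tss, annotated_tss, max_dist=500)
```

Tightening max_dist can only lower or preserve both scores, which is why a predictor that keeps them high at a small tolerance is described as high resolution.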

Publication types

  • Comparative Study
  • Evaluation Study
  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • CpG Islands / genetics
  • Forecasting / methods*
  • Gene Expression
  • Genetic Markers
  • Histones / metabolism
  • Humans
  • MicroRNAs / genetics
  • Promoter Regions, Genetic*
  • Protein Processing, Post-Translational / physiology
  • Software*

Substances

  • Genetic Markers
  • Histones
  • MicroRNAs