High-throughput identification of transcription start sites, conserved promoter motifs and predicted regulons

Nat Biotechnol. 2007 May;25(5):584-92. doi: 10.1038/nbt1294. Epub 2007 Apr 1.

Abstract

Using 62 probe-level datasets obtained with a custom-designed Caulobacter crescentus microarray chip, we identify transcriptional start sites of 769 genes, 53 of which are transcribed from multiple start sites. Transcriptional start sites are identified by analyzing probe signal cross-correlation matrices created from probe pairs tiled every 5 bp upstream of the genes; signals from probes binding the same message are correlated. For genes transcribed from multiple promoters, the contribution of each promoter is determined. Knowing the transcription start site enables targeted searching for regulatory-protein binding motifs in the promoter regions of genes with similar expression patterns. We identify 27 motifs, 17 of which share no similarity to the characterized motifs of other C. crescentus transcriptional regulators. Using these motifs, we predict coregulated genes. We verify novel promoter motifs that regulate stress-response genes, including those responding to uranium challenge, a stress-response sigma factor and a stress-response noncoding RNA.
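
The cross-correlation idea in the abstract can be illustrated with a small Python sketch. It is not the authors' pipeline: the function name `first_correlated_probe`, the `n_reference` and `threshold` parameters, the boundary rule and the toy signal matrix are all assumptions made for this example. Probes are assumed to be ordered from upstream to downstream, with the last few lying inside the transcribed region. Because probes that bind the same message should vary together across datasets, the sketch walks upstream from those reference probes until the correlation drops.

```python
import numpy as np

def first_correlated_probe(signals, n_reference=3, threshold=0.8):
    """Return the index of the most upstream probe that still correlates
    with the downstream reference probes, plus the correlation matrix.

    signals: array of shape (n_probes, n_datasets), probes ordered from
    upstream to downstream. The boundary rule is a hypothetical
    simplification, not the published method.
    """
    corr = np.corrcoef(signals)                  # probe-by-probe matrix
    # Mean correlation of every probe with the last n_reference probes.
    ref_corr = corr[:, -n_reference:].mean(axis=1)
    above = ref_corr >= threshold
    # Walk upstream from the reference block while probes stay correlated.
    idx = signals.shape[0] - n_reference
    while idx > 0 and above[idx - 1]:
        idx -= 1
    return idx, corr

# Toy data (invented for illustration): 8 probes across 8 datasets.
message = np.array([1.0, 4.0, 2.0, 8.0, 5.0, 7.0, 3.0, 6.0])
background = np.array([
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, -1, 1, 1, -1, -1, 1],
], dtype=float)

# Three probes upstream of the start site see only background patterns.
upstream = 2.0 + 0.5 * background
# Five probes on the transcript see the message, each with its own gain.
gains = np.array([0.8, 1.0, 1.2, 0.9, 1.1])
downstream = 2.0 + gains[:, None] * message[None, :]

signals = np.vstack([upstream, downstream])
tss_index, corr = first_correlated_probe(signals)

TILING_BP = 5  # probe spacing stated in the abstract
print(tss_index)              # 3
print(tss_index * TILING_BP)  # 15 (bp offset from the first tiled probe)
```

In this noise-free toy case the boundary falls exactly at the first probe carrying the message. With real probe-level data the correlations would be noisier, so the threshold and the number of reference probes would need tuning, and a gene with several start sites would call for a rule that can detect more than one boundary.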

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Base Sequence
  • Caulobacter crescentus / genetics*
  • Computer Simulation
  • Conserved Sequence / genetics*
  • DNA, Bacterial / genetics*
  • Models, Genetic*
  • Molecular Sequence Data
  • Oligonucleotide Array Sequence Analysis / methods*
  • Promoter Regions, Genetic / genetics
  • Regulon / genetics*
  • Sequence Analysis, DNA / methods
  • Transcription, Genetic / genetics*

Substances

  • DNA, Bacterial