Nucleotide variation of regulatory motifs may lead to distinct expression patterns

Bioinformatics. 2007 Jul 1;23(13):i440-9. doi: 10.1093/bioinformatics/btm183.

Abstract

Motivation: Current methodologies for the selection of putative transcription factor binding sites (TFBS) rely on various assumptions, such as over-representation of motifs in gene promoters, and on motif descriptions such as consensus sequences or position-specific scoring matrices (PSSMs). To avoid the bias introduced by such assumptions, we apply an unsupervised motif extraction (MEX) algorithm to promoter sequences. The extracted motifs are assessed for their likely cis-regulatory function by calculating the expression coherence (EC) of the corresponding genes across a set of biological conditions.
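The abstract does not spell out how EC is computed. As a rough illustration only, the sketch below assumes one common formulation: the fraction of gene pairs in a set whose normalized expression profiles lie closer than a distance threshold. The function names, the Euclidean distance and the `threshold` parameter are assumptions made for this example, not details taken from the paper.

```python
from itertools import combinations
import math


def normalize(profile):
    """Scale one expression profile to zero mean and unit variance."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in profile) / n)
    if sd == 0:
        # A flat profile carries no shape information; map it to zeros.
        return [0.0] * n
    return [(x - mean) / sd for x in profile]


def euclidean(a, b):
    """Euclidean distance between two equal-length profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def expression_coherence(profiles, threshold):
    """Fraction of gene pairs whose normalized profiles are closer than threshold.

    `profiles` is a list of per-gene expression vectors measured over the
    same time points or conditions. The threshold is a free parameter here.
    """
    normed = [normalize(p) for p in profiles]
    pairs = list(combinations(normed, 2))
    if not pairs:
        return 0.0
    close = sum(1 for a, b in pairs if euclidean(a, b) < threshold)
    return close / len(pairs)


# Toy data: the first two genes rise together, the third falls.
toy_profiles = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 2.0, 1.0]]
toy_ec = expression_coherence(toy_profiles, threshold=1.0)
```

In this toy set only one of the three gene pairs is coherent, so the score is one third. With real data, the score of a motif would be computed from the genes whose promoters contain it, separately for each biological condition.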

Results: Applying MEX to all Saccharomyces cerevisiae promoters, followed by EC analysis across 40 biological conditions, we obtained a high percentage of putative cis-regulatory motifs. We clustered motifs that obtained highly significant EC scores, based on both their sequence similarity and similarity in the biological conditions these motifs appear to regulate. We describe 20 clusters, some of which group together known TFBS. The clusters display different mRNA expression profiles, correlated with typical changes in the nucleotide composition of their relevant motifs. In several cases, a variation of a single nucleotide is shown to lead to distinct differences in expression patterns. These results are compared with additional information, such as binding of transcription factors to groups of genes. Detailed analysis is presented for clusters related to MCB/SCB, STRE and PAC. In the first two cases, we provide evidence for different binding mechanisms of different clusters of motifs. For PAC-related motifs we uncover a new cluster that has so far been overshadowed by the stronger effects of known PAC motifs.
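To make the notion of a single-nucleotide variation concrete, the following minimal sketch lists pairs of equal-length motifs that differ at exactly one position. It is not the clustering procedure used in the study, which also takes the regulated conditions into account. The helper names and the toy motif strings are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("motifs must have the same length")
    return sum(x != y for x, y in zip(a, b))


def single_nucleotide_variants(motifs):
    """Return all pairs of equal-length motifs that differ at one position."""
    return [
        (a, b)
        for a, b in combinations(motifs, 2)
        if len(a) == len(b) and hamming(a, b) == 1
    ]


# Toy motif strings, not taken from the paper's clusters.
toy_motifs = ["ACGCGT", "ACGCGA", "CGCGAAA"]
variant_pairs = single_nucleotide_variants(toy_motifs)
```

Pairs found this way could then be compared by the expression profiles of the genes that carry each variant, which is the kind of contrast the results describe.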

Supplementary information: Supplementary data are available at http://adios.tau.ac.il/regmotifs and at Bioinformatics online.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Gene Expression / genetics*
  • Genetic Variation / genetics*
  • Nucleotides / genetics*
  • Promoter Regions, Genetic / genetics*
  • Regulatory Sequences, Nucleic Acid / genetics*
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae Proteins / genetics
  • Sequence Analysis, DNA / methods*

Substances

  • Nucleotides
  • Saccharomyces cerevisiae Proteins