Large-scale cis-element detection by analysis of correlated expression and sequence conservation between Arabidopsis and Brassica oleracea

Plant Physiol. 2006 Dec;142(4):1589-602. doi: 10.1104/pp.106.085639. Epub 2006 Oct 6.

Abstract

The rapidly increasing amount of plant genomic sequence data allows for the detection of cis-elements through comparative methods. In addition, large-scale gene expression data for Arabidopsis (Arabidopsis thaliana) have recently become available. Coexpression and evolutionary sequence conservation are criteria widely used to identify shared cis-regulatory elements. In our study, we employ an integrated approach that combines these two sources of information. Best-candidate orthologous promoter sequences were identified by a bidirectional best BLAST hit strategy in genome survey sequences from Brassica oleracea. The analysis of 779 microarrays from 81 different experiments provided detailed expression information for Arabidopsis genes coexpressed in multiple tissues and under various conditions and developmental stages. We discovered candidate transcription factor binding sites in 64% of the Arabidopsis genes analyzed. Among them, we detected experimentally verified binding sites and showed strong enrichment of shared cis-elements within functionally related genes. This study demonstrates the value of partially shotgun-sequenced genomes and their combined use with functional genomics data to address complex questions in comparative genomics.
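
As an illustration of the bidirectional best BLAST hit strategy named in the abstract, the following Python sketch pairs sequences that are each other's top-scoring hit. It is not the authors' pipeline: the function names, the (query, subject, bit score) tuple layout, and the sequence identifiers are assumptions made for this example, and real BLAST output would first need to be parsed into that form.

```python
from typing import Dict, Iterable, Tuple

# Assumed hit layout for this sketch: (query_id, subject_id, bit_score)
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its highest-scoring subject (first seen wins ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(forward: Iterable[Hit],
                         reverse: Iterable[Hit]) -> Dict[str, str]:
    """Keep pairs (a, b) where b is a's best hit and a is b's best hit."""
    fwd = best_hits(forward)   # e.g. Arabidopsis -> B. oleracea
    rev = best_hits(reverse)   # e.g. B. oleracea -> Arabidopsis
    return {a: b for a, b in fwd.items() if rev.get(b) == a}


# Hypothetical identifiers and scores, for illustration only.
forward_hits = [
    ("AT_gene_1", "Bo_gss_1", 250.0),
    ("AT_gene_1", "Bo_gss_2", 120.0),
    ("AT_gene_2", "Bo_gss_2", 90.0),
]
reverse_hits = [
    ("Bo_gss_1", "AT_gene_1", 245.0),
    ("Bo_gss_2", "AT_gene_1", 118.0),
]
pairs = reciprocal_best_hits(forward_hits, reverse_hits)
```

Here only `AT_gene_1` and `Bo_gss_1` are retained, because `Bo_gss_2` has `AT_gene_1` rather than `AT_gene_2` as its best hit in the reverse direction. Requiring agreement in both directions is what makes the resulting pairs best-candidate orthologs rather than merely similar sequences.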

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis / genetics*
  • Arabidopsis / metabolism
  • Base Sequence
  • Binding Sites
  • Brassica / genetics*
  • Brassica / metabolism
  • Computational Biology
  • Conserved Sequence
  • Genomics
  • Molecular Sequence Data
  • Oligonucleotide Array Sequence Analysis
  • Plant Proteins / chemistry
  • Plant Proteins / genetics*
  • Plant Proteins / metabolism
  • Promoter Regions, Genetic*

Substances

  • Plant Proteins