Tracing the evolutionary history of Drosophila regulatory regions with models that identify transcription factor binding sites

Mol Biol Evol. 2003 May;20(5):703-14. doi: 10.1093/molbev/msg077. Epub 2003 Apr 2.

Abstract

Much of evolutionary change is mediated at the level of gene expression, yet our understanding of regulatory evolution remains unsatisfying. In light of recent data indicating that transcription factor binding sites undergo substantial turnover between species, we attempt to quantify the process of binding site turnover in regulatory regions of well-studied genes controlling embryonic patterning in Drosophila. We examine polymorphism and divergence data in Drosophila melanogaster and four related species from regulatory regions of five early development genes for which functional binding sites have been identified. This analysis reveals that Drosophila regulatory regions exhibit patterns of variation consistent with functional constraint. We develop a novel approach to binding site prediction which we use to characterize the process of binding site divergence in regulatory regions. This method uses sets of known binding sites to construct a model that predicts transcription factor specificity and bootstrap sampling to derive significance levels. This approach allows appropriate significance levels to be determined even in the face of skewed base composition in the background sequence. Using this approach, we show that, although functional elements exhibit conservation of sequence, there is substantial potential to gain new functional elements within the regulatory regions. Our results show that application of models that predict transcription factor binding sites can yield insights into the process and dynamics of binding site evolution within regulatory regions.
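The prediction approach described in the abstract (a specificity model built from known binding sites, with bootstrap sampling to set significance levels) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple log-odds position weight matrix with a uniform 0.25 background and a pseudocount, and it takes the significance threshold from the best score found in sequences resampled, base by base and with replacement, from the region being scanned, so that the null distribution carries the region's own base composition. All function names, parameter values, and sequences below are hypothetical toy choices.

```python
import math
import random

BASES = "ACGT"

def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Log-odds weight matrix from aligned, equal-length known sites.
    Assumption: uniform background and a fixed pseudocount."""
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(len(sites[0])):
        column = [s[i] for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, window):
    """Sum of per-position log-odds for one window of the matrix width."""
    return sum(col[b] for col, b in zip(pwm, window))

def best_score(pwm, seq):
    """Highest window score anywhere in the sequence."""
    w = len(pwm)
    return max(score(pwm, seq[i:i + w]) for i in range(len(seq) - w + 1))

def bootstrap_threshold(pwm, seq, alpha=0.05, n_boot=1000, seed=0):
    """Score cutoff from a bootstrap null: resample bases of seq with
    replacement (keeping its composition on average), record the best
    score of each resampled sequence, and return the (1 - alpha) quantile."""
    rng = random.Random(seed)
    maxima = sorted(
        best_score(pwm, "".join(rng.choices(seq, k=len(seq))))
        for _ in range(n_boot)
    )
    return maxima[min(n_boot - 1, int((1 - alpha) * n_boot))]

def predict_sites(pwm, seq, threshold):
    """Positions and sequences of windows scoring at or above threshold."""
    w = len(pwm)
    return [
        (i, seq[i:i + w])
        for i in range(len(seq) - w + 1)
        if score(pwm, seq[i:i + w]) >= threshold
    ]

if __name__ == "__main__":
    # Toy data, invented for illustration only.
    known_sites = ["TTAATCC", "TTAATCC", "CTAATCC", "TTAATCG", "TAAATCC"]
    region = "ATATTTAAATAT" + "TTAATCC" + "ATATAAATTTATAT"  # AT-rich background
    pwm = build_pwm(known_sites)
    cutoff = bootstrap_threshold(pwm, region)
    print(predict_sites(pwm, region, cutoff))
```

Because the null sequences are drawn from the scanned region itself, an AT-rich region yields a higher cutoff for AT-rich motifs than a uniform-composition null would, which is one simple way to keep significance levels appropriate under skewed base composition.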

Publication types

  • Comparative Study

MeSH terms

  • Animals
  • Base Sequence
  • Binding Sites / genetics
  • Databases, Nucleic Acid
  • Drosophila / genetics*
  • Evolution, Molecular*
  • Genes, Regulator / genetics*
  • Likelihood Functions
  • Models, Genetic*
  • Molecular Sequence Data
  • Polymorphism, Genetic / genetics*
  • Sequence Alignment
  • Sequence Analysis, DNA

Associated data

  • GENBANK/AY184070
  • GENBANK/AY184071
  • GENBANK/AY184072
  • GENBANK/AY184073
  • GENBANK/AY184074
  • GENBANK/AY184075
  • GENBANK/AY184076
  • GENBANK/AY184077
  • GENBANK/AY184078
  • GENBANK/AY184079
  • GENBANK/AY184080
  • GENBANK/AY184081
  • GENBANK/AY184082
  • GENBANK/AY184083
  • GENBANK/AY184084
  • GENBANK/AY184085
  • GENBANK/AY184086
  • GENBANK/AY184087
  • GENBANK/AY184088
  • GENBANK/AY184089
  • GENBANK/AY184090
  • GENBANK/AY184091
  • GENBANK/AY184092
  • GENBANK/AY184093
  • GENBANK/AY184094
  • GENBANK/AY184095
  • GENBANK/AY184096
  • GENBANK/AY184097
  • GENBANK/AY184098
  • GENBANK/AY184099
  • GENBANK/AY184100
  • GENBANK/AY184101
  • GENBANK/AY184102
  • GENBANK/AY184103
  • GENBANK/AY184104
  • GENBANK/AY184105
  • GENBANK/AY184106