Structural analysis of conserved base pairs in protein-DNA complexes

Nucleic Acids Res. 2002 Apr 1;30(7):1704-11. doi: 10.1093/nar/30.7.1704.

Abstract

Understanding protein-DNA interactions is crucial for predicting the DNA-binding specificity of transcription factors and for designing novel DNA-binding proteins. In this paper we develop a novel approach to the analysis of protein-DNA interactions. We bring together two sources of information: (i) structures of protein-DNA complexes (PDB/NDB database) and (ii) experimentally obtained sites recognized by DNA-binding proteins. The sites are used to compute the conservation (information content) of each base pair, which indicates the relative importance of that base pair in specific recognition. The main result of this study is that the conservation of base pairs in a site correlates significantly with the number of contacts the base pairs make with the protein. In particular, base pairs that have more contacts with the protein are more conserved in evolution. As natural as it is, this result has never been reported before. We also observe that, for most of the studied proteins, hydrogen bonds and hydrophobic interactions alone cannot explain the pattern of evolutionary conservation in the binding site, suggesting a cumulative contribution of different types of interactions to specific recognition. Implications for the prediction of DNA-binding specificity are discussed.
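
The analysis described above can be illustrated with a minimal sketch. It assumes a common definition of per-position information content (2 bits minus the Shannon entropy of the base frequencies, with a uniform background and no small-sample correction) and a plain Pearson correlation. The abstract does not specify the exact formulas used, so both are assumptions. The function names, the aligned sites, and the contact counts are hypothetical toy values, not data from the study.

```python
import math
from collections import Counter


def information_content(sites):
    """Per-position information content (bits) of aligned, equal-length sites.

    Assumes a uniform background over A, C, G, T, so a fully conserved
    position scores 2 bits and a uniformly random one scores 0 bits.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be aligned to the same length")
    ic = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        total = sum(counts.values())
        entropy = -sum((c / total) * math.log2(c / total)
                       for c in counts.values())
        ic.append(2.0 - entropy)
    return ic


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical aligned binding sites for one protein (toy data).
sites = ["ACGT", "ACGA", "ACCT", "AGGC"]
# Hypothetical number of protein contacts per base pair, as might be
# counted from a structure of the complex (toy data).
contacts = [5, 3, 2, 0]

ic = information_content(sites)
r = pearson(ic, contacts)
print("information content per position:", [round(v, 3) for v in ic])
print("correlation with contact counts:", round(r, 3))
```

In this toy example the fully conserved first position has the most contacts, so the correlation is positive. With real data, the sites would come from experimentally determined binding sites and the contact counts from the corresponding PDB/NDB structure.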

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism
  • Base Pairing / genetics*
  • Binding Sites / genetics
  • Carrier Proteins
  • Cyclic AMP Receptor Protein / genetics
  • Cyclic AMP Receptor Protein / metabolism
  • DNA / metabolism*
  • DNA-Binding Proteins / genetics*
  • DNA-Binding Proteins / metabolism
  • Databases, Protein
  • Escherichia coli / genetics
  • Escherichia coli Proteins*
  • Evolution, Molecular
  • Integration Host Factors
  • Protein Binding
  • Repressor Proteins / genetics
  • Repressor Proteins / metabolism
  • Statistics as Topic

Substances

  • Bacterial Proteins
  • Carrier Proteins
  • Cyclic AMP Receptor Protein
  • DNA-Binding Proteins
  • Escherichia coli Proteins
  • Integration Host Factors
  • PurR protein, Bacteria
  • PurR protein, E coli
  • Repressor Proteins
  • TRPR protein, E coli
  • methionine repressor protein, Bacteria
  • methionine repressor protein, E coli
  • DNA