Inference of relationships in the 'twilight zone' of homology using a combination of bioinformatics and site-directed mutagenesis: a case study of restriction endonucleases Bsp6I and PvuII

Nucleic Acids Res. 2005 Jan 31;33(2):661-71. doi: 10.1093/nar/gki213. Print 2005.

Abstract

Thus far, identification of functionally important residues in Type II restriction endonucleases (REases) has been difficult using conventional methods. Even though known REase structures share a fold and a marginally recognizable active site, their overall sequence similarities are statistically insignificant, except among proteins that recognize identical or very similar sequences. Bsp6I is a Type II REase that recognizes the palindromic DNA sequence 5'GCNGC and cleaves between the cytosine and the unspecified nucleotide in both strands, generating a double-strand break with 5'-protruding single nucleotides. There are no solved structures of REases that recognize similar DNA targets or generate cleavage products with similar characteristics. In straightforward comparisons, the Bsp6I sequence shows no significant similarity to REases with known structures. However, using a fold-recognition approach, we have identified a remote relationship between Bsp6I and the structure of PvuII. Starting from the sequence-structure alignment between Bsp6I and PvuII, we constructed a homology model of Bsp6I and used it to predict functionally significant regions in Bsp6I. The homology model was supported by site-directed mutagenesis of residues predicted to be important for dimerization, DNA binding and catalysis. Completing the picture of sequence-structure-function relationships in protein superfamilies becomes an essential task in the age of structural genomics, and our study may serve as a paradigm for future analyses of superfamilies comprising strongly diverged members with little or no sequence similarity.
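The recognition and cleavage pattern described for Bsp6I (site 5'GCNGC, cut between the cytosine and the unspecified nucleotide in both strands) can be illustrated with a minimal Python sketch. This is not part of the study's methods. The function name `find_bsp6i_cuts`, the returned field names, and the 0-based coordinate convention along the top strand are assumptions made for illustration only.

```python
import re

# Bsp6I site: GCNGC, where N is any nucleotide. A lookahead is used so that
# overlapping sites (e.g. GCAGCAGC) are all reported.
SITE = re.compile(r"(?=(GC[ACGT]GC))")


def find_bsp6i_cuts(seq):
    """Return predicted Bsp6I cut positions for a top-strand sequence (5'->3').

    Hypothetical helper. Positions are 0-based offsets along the top strand;
    a cut at position p falls between bases p-1 and p.
    """
    seq = seq.upper()
    cuts = []
    for match in SITE.finditer(seq):
        start = match.start()
        cuts.append({
            "site_start": start,
            # Top strand: GC^NGC, cut between C and N.
            "top_cut": start + 2,
            # Bottom strand is cut between its own C and N, which in
            # top-strand coordinates lies between N and the following G.
            "bottom_cut": start + 3,
            # The single nucleotide left as a 5' overhang (top-strand base).
            "overhang": seq[start + 2],
        })
    return cuts


if __name__ == "__main__":
    for cut in find_bsp6i_cuts("AAGCTGCTT"):
        print(cut)
```

Because the two cuts are offset by one position, each product end carries a single 5'-protruding nucleotide, consistent with the description above.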

Publication types

  • Comparative Study

MeSH terms

  • Amino Acid Sequence
  • Amino Acids / chemistry
  • Catalysis
  • Catalytic Domain
  • Circular Dichroism
  • Computational Biology
  • DNA / chemistry
  • DNA / metabolism
  • Deoxyribonucleases, Type II Site-Specific / chemistry*
  • Deoxyribonucleases, Type II Site-Specific / genetics
  • Deoxyribonucleases, Type II Site-Specific / metabolism
  • Dimerization
  • Magnesium / metabolism
  • Models, Molecular*
  • Molecular Sequence Data
  • Mutagenesis, Site-Directed
  • Mutation
  • Sequence Alignment
  • Sequence Homology, Amino Acid
  • Structural Homology, Protein
  • Substrate Specificity

Substances

  • Amino Acids
  • DNA
  • endodeoxyribonuclease Fnu4HI
  • CAGCTG-specific type II deoxyribonucleases
  • Deoxyribonucleases, Type II Site-Specific
  • Magnesium