A strategy for detecting the conservation of folding-nucleus residues in protein superfamilies

Fold Des. 1998;3(4):239-51. doi: 10.1016/S1359-0278(98)00035-2.

Abstract

Background: Nucleation-growth theory predicts that fast-folding peptide sequences fold to their native structure via structures in a transition-state ensemble that share a small number of native contacts (the folding nucleus). Experimental and theoretical studies of proteins suggest that residues participating in folding nuclei are conserved among homologs. We attempted to determine whether this also holds for proteins with highly diverged sequences but identical folds (superfamilies).

Results: We describe a strategy that compares residue conservation in natural superfamily sequences with conservation in simulated sequences (generated by a Monte Carlo sequence design procedure) for the same proteins. The strategy rests on three assumptions: natural sequences conserve residues needed for folding, stability and function; simulated sequences contain no functional conservation; and nucleus residues make native contacts with each other. Based on these assumptions, we identified seven potential nucleus residues in ubiquitin superfamily members. Non-nucleus conserved residues were also identified; these are proposed to be involved in stabilizing native interactions. We found that all superfamily members conserved the same potential nucleus residue positions, except those members whose structural topology is significantly different.
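
The comparison described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how conservation was scored, so Shannon entropy per alignment column, the 0.5-bit cutoff, the toy alignments, and the names column_entropy, conserved_positions and candidate_nucleus are all assumptions made for illustration. The sketch keeps positions that are conserved in both the natural and the designed alignments, then retains only those that make a native contact with another such position.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gaps ('-')."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return float("inf")  # an all-gap column is never treated as conserved
    n = len(residues)
    return -sum((c / n) * math.log2(c / n) for c in Counter(residues).values())


def conserved_positions(alignment, max_entropy):
    """Indices of columns whose entropy is at or below max_entropy."""
    return {
        i
        for i, column in enumerate(zip(*alignment))
        if column_entropy(column) <= max_entropy
    }


def candidate_nucleus(natural, designed, contacts, max_entropy=0.5):
    """Positions conserved in BOTH alignments that contact one another.

    natural, designed: lists of equal-length aligned sequences.
    contacts: set of (i, j) index pairs that are in contact in the native fold.
    max_entropy: hypothetical conservation cutoff (an assumption, not from the paper).
    """
    shared = conserved_positions(natural, max_entropy) & conserved_positions(
        designed, max_entropy
    )
    return {
        i
        for i in shared
        if any((i, j) in contacts or (j, i) in contacts for j in shared if j != i)
    }


# Toy data, invented purely for illustration.
natural = ["ACDEF", "ACDGF", "ACDHF", "ACDKF"]
designed = ["ACWYF", "ACLMF", "ACIVF", "GCKRF"]
contacts = {(1, 4)}

nucleus = candidate_nucleus(natural, designed, contacts)
natural_only = conserved_positions(natural, 0.5) - conserved_positions(designed, 0.5)
```

In this toy case, positions conserved only in the natural alignment (natural_only) would, under the stated assumptions, be candidates for functional conservation, since the designed sequences carry no functional constraint. Positions conserved in both alignments but lacking mutual contacts would correspond to the non-nucleus conserved residues mentioned above.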

Conclusions: Our results suggest that the conservation of the nucleus of a specific fold can be predicted by comparing designed simulated sequences with natural highly diverged sequences that fold to the same structure. We suggest that such a strategy could be used to help plan protein folding and design experiments, to identify new superfamily members, and to subdivide superfamilies further into classes having a similar folding mechanism.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Conserved Sequence / genetics
  • Ferredoxins / chemistry
  • Models, Molecular
  • Molecular Sequence Data
  • Monte Carlo Method
  • Protein Folding*
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Proto-Oncogene Proteins c-raf / chemistry
  • Sequence Alignment
  • Ubiquitins / chemistry

Substances

  • Ferredoxins
  • Proteins
  • Ubiquitins
  • Proto-Oncogene Proteins c-raf