Similarities and differences between nonhomologous proteins with similar folds: evaluation of threading strategies

Fold Des. 1997;2(5):307-17. doi: 10.1016/S1359-0278(97)00042-4.

Abstract

Background: There are many pairs and groups of proteins with similar folds and interaction patterns, but whose sequence similarity is below the threshold of easily recognizable sequence homology. The existence of multiple sequence solutions for a given fold has inspired fold prediction methods in which structural information from one protein is used to estimate the energy of another, putatively similar, structure.

Results: A set of 68 pairs of proteins with similar folds and sequence identity in the 8-30% range is identified from the literature. For each pair, the energy of one protein, calculated using knowledge-based statistical potentials, is compared to the estimated energy, calculated with the same potentials but using the structural information (burial status and interaction pattern) of another protein with the same fold. Different energy estimates, corresponding to approximations used in various fold recognition algorithms, are calculated and compared to each other, as well as to the correct energy. It is shown that the local energy terms, based on burial and secondary structure preferences, can be reliably estimated with an accuracy close to 70%. At the same time, the two-body nonlocal energy loses over 60% of its value due to the repacking of the structure. Further approximations, such as the 'frozen approximation', can bring it to an essentially random value.
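The comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The two-letter alphabet (H, P), the potential tables E_BURIAL and E_PAIR, and all function names are hypothetical, chosen only to show how a local (burial) term and a pair term might be estimated from another protein's structure. The 'frozen approximation' is read here as taking each contact partner from the template sequence rather than from the query, averaged over both directions of the contact; that averaging is an assumption of this sketch.

```python
from typing import Dict, List, Tuple

# Hypothetical toy potentials (not from the paper): H = hydrophobic, P = polar.
E_BURIAL: Dict[Tuple[str, str], float] = {
    ("H", "buried"): -1.0, ("H", "exposed"): 0.5,
    ("P", "buried"): 0.8, ("P", "exposed"): -0.3,
}
E_PAIR: Dict[Tuple[str, str], float] = {
    ("H", "H"): -1.0, ("H", "P"): 0.2, ("P", "H"): 0.2, ("P", "P"): 0.0,
}

Contacts = List[Tuple[int, int]]
Alignment = List[Tuple[int, int]]  # (query index, template index) pairs


def local_energy(seq: str, burial: List[str], alignment: Alignment) -> float:
    """Burial energy of `seq`, reading burial states at the aligned positions."""
    return sum(E_BURIAL[(seq[q], burial[t])] for q, t in alignment)


def pair_energy(seq: str, contacts: Contacts, alignment: Alignment) -> float:
    """Pair energy with both partners of each contact taken from `seq`."""
    t2q = {t: q for q, t in alignment}
    return sum(
        E_PAIR[(seq[t2q[i]], seq[t2q[j]])]
        for i, j in contacts
        if i in t2q and j in t2q
    )


def frozen_pair_energy(query: str, template_seq: str,
                       contacts: Contacts, alignment: Alignment) -> float:
    """Pair energy with one partner of each contact frozen to the template residue."""
    t2q = {t: q for q, t in alignment}
    total = 0.0
    for i, j in contacts:
        if i in t2q and j in t2q:
            # Average the two directions: (query_i, template_j) and (template_i, query_j).
            total += 0.5 * (E_PAIR[(query[t2q[i]], template_seq[j])]
                            + E_PAIR[(template_seq[i], query[t2q[j]])])
    return total


if __name__ == "__main__":
    identity = [(k, k) for k in range(5)]
    query = "HPPHH"
    query_burial = ["buried", "exposed", "exposed", "buried", "buried"]
    query_contacts = [(0, 3), (3, 4), (0, 4)]
    template_seq = "HHPHP"
    template_burial = ["buried", "buried", "exposed", "buried", "exposed"]
    template_contacts = [(0, 3), (1, 3), (0, 1)]

    # "Correct" energy: query sequence evaluated on its own structure.
    print("own local:", local_energy(query, query_burial, identity))
    print("own pair:", pair_energy(query, query_contacts, identity))
    # Estimated energy: query sequence evaluated on the template's structure.
    print("threaded local:", local_energy(query, template_burial, identity))
    print("threaded pair:", pair_energy(query, template_contacts, identity))
    print("frozen pair:", frozen_pair_energy(query, template_seq,
                                             template_contacts, identity))
```

In this toy setting the estimated terms differ from the query's own energy because the template's burial pattern and contact map differ from the query's. The numbers carry no meaning beyond showing which structural information each estimate uses.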

Conclusions: Local energy terms could be used safely to improve fold recognition algorithms. To utilize pair interaction information, specially designed pair potentials and/or a self-consistent description of pair interactions is necessary.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Databases, Factual
  • Models, Molecular
  • Models, Theoretical
  • Molecular Sequence Data
  • Protein Structure, Secondary*
  • Sequence Alignment*
  • Software
  • Thermodynamics