Predicting protein structural classes with pseudo-amino acid composition: approximate entropy and hydrophobicity pattern

J Theor Biol. 2008 Jan 7;250(1):186-93. doi: 10.1016/j.jtbi.2007.09.014. Epub 2007 Sep 15.

Abstract

Compared with the conventional amino acid (AA) composition, the pseudo-amino acid (PseAA) composition, originally introduced for protein subcellular location prediction, can incorporate much more information about a protein sequence and thereby markedly enhance the power of a discrete model to predict various attributes of a protein. In this study, based on the concept of PseAA composition, the approximate entropy and hydrophobicity pattern of a protein sequence are used to characterize the PseAA components. The immune genetic algorithm (IGA) is applied to search for the optimal weight factors in generating the PseAA composition. Thus, for a given protein sequence sample, a 27-D (dimensional) PseAA composition is generated as its descriptor. The fuzzy K nearest neighbors (FKNN) classifier is adopted as the prediction engine. The results obtained in predicting protein structural classes are quite encouraging, indicating that the current approach may also be used to improve the prediction quality of other protein attributes, or at least can play a complementary role to the existing methods in the relevant areas. Our algorithm is written in Matlab and is available from the corresponding author on request.
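
As an illustration of the approximate entropy component mentioned above, the following is a minimal Python sketch, not the authors' Matlab implementation. It assumes that the sequence is first mapped to numbers with the Kyte-Doolittle hydropathy index and that the tolerance is 0.2 times the standard deviation of the mapped values; the abstract specifies neither choice. The names `approximate_entropy` and `sequence_apen` and the parameters `m` and `r_factor` are hypothetical.

```python
import math
import statistics

# Kyte-Doolittle hydropathy index (an assumption; the abstract names no scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def approximate_entropy(series, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a numeric series (self-matches counted)."""
    n = len(series)
    if n < m + 2:
        raise ValueError("series is too short for the chosen window length m")

    def phi(k):
        # Average log-frequency with which length-k windows recur within tolerance r.
        count = n - k + 1
        windows = [series[i:i + k] for i in range(count)]
        total = 0.0
        for w in windows:
            # Chebyshev distance: windows match if every position differs by <= r.
            matches = sum(
                1 for v in windows
                if max(abs(a - b) for a, b in zip(w, v)) <= r
            )
            total += math.log(matches / count)  # matches >= 1 due to the self-match
        return total / count

    return phi(m) - phi(m + 1)


def sequence_apen(sequence, m=2, r_factor=0.2):
    """ApEn of a protein sequence after hydropathy encoding (assumed encoding)."""
    # Raises KeyError for anything other than the 20 standard residues.
    values = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    r = r_factor * statistics.pstdev(values)
    return approximate_entropy(values, m=m, r=r)
```

A perfectly regular input, such as `sequence_apen("A" * 30)`, yields exactly 0, while less predictable sequences give larger values. In a PseAA descriptor, values of this kind would be appended to the 20 amino acid frequencies after weighting; how the weights are chosen (by IGA in the study) is outside this sketch.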

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acids / analysis*
  • Chemical Phenomena
  • Chemistry, Physical
  • Entropy
  • Fuzzy Logic
  • Hydrophobic and Hydrophilic Interactions
  • Models, Chemical*
  • Protein Conformation*
  • Sequence Analysis, Protein / methods

Substances

  • Amino Acids