Two multi-classification strategies used on SVM to predict protein structural classes by using auto covariance

Interdiscip Sci. 2009 Dec;1(4):315-9. doi: 10.1007/s12539-009-0066-1. Epub 2009 Nov 14.

Abstract

Machine learning methods play a very important role in protein secondary structure prediction and related tasks. For a given approach, prediction quality depends mostly on how protein sequences are represented as numeric features. In this paper, two Support Vector Machine (SVM) multi-classification strategies, "one-against-one" (1-a-1) and "one-against-all" (1-a-a), were used to identify protein structural classes. Auto covariance (AC), which transforms the physicochemical properties of the amino acids in a protein into a data matrix, focuses on neighboring effects and interactions between residues in protein sequences. The "1-a-1" approach on SVM predicted protein structural classes with a very promising overall accuracy of 90.69% in the jackknife test, more than 10% higher than the accuracy obtained with "1-a-a". The experimental results indicate that an SVM predictor constructed with "1-a-1" can avoid biased prediction accuracy. The current method, which combines protein primary sequence information described by auto covariance (AC) with the "1-a-1" approach on SVM, should play an important complementary role in other related applications.
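The abstract does not give the auto covariance formula, so the following is only a minimal sketch under a commonly used definition: for each physicochemical property j and each lag, AC(lag, j) is the mean over residue positions i of (P[i, j] - mean_j) * (P[i + lag, j] - mean_j). The function name `auto_covariance`, the `max_lag` parameter, and the toy property values are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def auto_covariance(props, max_lag):
    """Turn a per-residue property matrix into a fixed-length AC vector.

    props   : array of shape (n_residues, n_properties); each row holds the
              (ideally normalized) physicochemical values of one residue.
    max_lag : largest sequence distance between residue pairs (assumed
              parameter; the abstract does not state the value used).
    Returns a vector of length max_lag * n_properties.
    """
    props = np.asarray(props, dtype=float)
    if props.ndim != 2:
        raise ValueError("props must be a 2-D array")
    n_res = props.shape[0]
    if not 1 <= max_lag < n_res:
        raise ValueError("max_lag must satisfy 1 <= max_lag < n_residues")

    # Center each property on its mean over the whole sequence.
    centered = props - props.mean(axis=0)

    features = []
    for lag in range(1, max_lag + 1):
        # Products of residue pairs that are `lag` positions apart,
        # averaged over the n_res - lag available pairs.
        pair_products = centered[:-lag] * centered[lag:]
        features.append(pair_products.sum(axis=0) / (n_res - lag))
    return np.concatenate(features)


# Toy input: 4 residues, 1 hypothetical property (values are made up).
toy = [[1.0], [2.0], [3.0], [4.0]]
ac_vector = auto_covariance(toy, max_lag=2)
```

Because the output length depends only on `max_lag` and the number of properties, sequences of different lengths map to vectors of the same size, which is what allows them to be fed to an SVM under either the "1-a-1" or the "1-a-a" strategy.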

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Artificial Intelligence*
  • Computational Biology / methods*
  • Computer Simulation
  • Genetic Vectors
  • Pattern Recognition, Automated / methods
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Proteins / classification*
  • Reproducibility of Results
  • Sequence Analysis, Protein / methods
  • Software

Substances

  • Proteins