Improving protein complex classification accuracy using amino acid composition profile

Comput Biol Med. 2013 Sep;43(9):1196-204. doi: 10.1016/j.compbiomed.2013.05.026. Epub 2013 Jun 6.

Abstract

Protein complex prediction approaches assume that complexes have dense protein-protein interactions and high functional similarity among their subunits. We investigated these assumptions by studying the subunits' interaction topology, sequence similarity and molecular function in human and yeast protein complexes. Including the physicochemical properties of amino acids can provide a better understanding of protein complex properties. Principal component analysis is carried out to determine the major features. Combining amino acid composition profile information with an SVM classifier serves as an effective post-processing step for protein complex classification. The improvement relies on primary sequence information only, which is easy to obtain.
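As a rough illustration of the kind of sequence-only feature the abstract refers to, the sketch below computes a 20-dimensional amino acid composition vector for each subunit and averages it over a complex. The abstract does not give the authors' exact feature definition, so the averaging step, the function names, and the hydrophobic residue grouping are all assumptions made for this example. Vectors of this kind could then be passed to a principal component analysis and an SVM classifier.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: an illustrative hydrophobic grouping, not the paper's definition.
HYDROPHOBIC = set("AVLIMFWC")


def composition_profile(sequence):
    """Return the fraction of each standard amino acid in one sequence."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def complex_profile(subunit_sequences):
    """Average the subunit composition vectors of one candidate complex.

    Assumption: a plain unweighted mean over subunits.
    """
    if not subunit_sequences:
        raise ValueError("a complex needs at least one subunit sequence")
    profiles = [composition_profile(s) for s in subunit_sequences]
    return [sum(column) / len(profiles) for column in zip(*profiles)]


def hydrophobic_fraction(profile):
    """Sum the composition fractions of the residues in HYDROPHOBIC."""
    return sum(f for aa, f in zip(AMINO_ACIDS, profile) if aa in HYDROPHOBIC)


if __name__ == "__main__":
    # Hypothetical toy sequences, not real proteins.
    toy_complex = ["MKVLAAGIV", "MDEKRRSTQ"]
    features = complex_profile(toy_complex)
    print(len(features), round(hydrophobic_fraction(features), 3))
```

Each candidate complex becomes one fixed-length feature vector regardless of how many subunits it has, which is what lets a standard classifier be applied as a post-processing filter on predicted complexes.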

Keywords: Amino acid composition profile; Gene Ontology; Hydrophilic; Hydrophobic; Machine learning method; Physicochemical property; Protein complex; Protein–protein interaction; Sequence alignment.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Predictive Value of Tests
  • Saccharomyces cerevisiae / genetics*
  • Saccharomyces cerevisiae Proteins / classification*
  • Saccharomyces cerevisiae Proteins / genetics*
  • Sequence Analysis, Protein / methods*

Substances

  • Saccharomyces cerevisiae Proteins