Gram-positive and Gram-negative protein subcellular localization by incorporating evolutionary-based descriptors into Chou's general PseAAC

J Theor Biol. 2015 Jan 7:364:284-94. doi: 10.1016/j.jtbi.2014.09.029. Epub 2014 Sep 28.

Abstract

Protein subcellular localization is the task of predicting where in the cell a given protein functions. It is considered an important step towards protein function prediction and drug design. Recent studies have shown that using Gene Ontology (GO) for feature extraction can improve protein subcellular localization prediction performance. However, GO-based features alone have not solved the problem. At the same time, the impact of other feature sources, especially evolutionary-based features, has not been adequately explored for this task. In this study, we aim to extract discriminative evolutionary features to tackle this problem. To do this, we propose two segmentation-based feature extraction methods that explore local evolutionary information for Gram-positive and Gram-negative subcellular localization. We show that applying a Support Vector Machine (SVM) classifier to the extracted features improves Gram-positive and Gram-negative subcellular localization prediction accuracy by up to 6.4% over previous studies, including those that used GO for feature extraction.

Keywords: Evolutionary-based features; Segmented autocorrelation; Segmented distribution; Support Vector Machine (SVM).
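
The abstract names segmented autocorrelation as one of the two feature extraction methods but does not define it. The sketch below is one plausible reading, not the authors' formulation: it assumes the evolutionary information is a position-specific scoring matrix (one row per residue, one column per amino acid type), splits the rows into equal-length segments, and averages lagged products within each segment. The function names, the equal-length split, and the parameters `n_segments` and `max_lag` are all hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def split_segments(pssm: Matrix, n_segments: int) -> List[Matrix]:
    """Split the rows of a matrix into n_segments contiguous, near-equal parts."""
    length = len(pssm)
    bounds = [i * length // n_segments for i in range(n_segments + 1)]
    return [pssm[bounds[i]:bounds[i + 1]] for i in range(n_segments)]


def segmented_autocorrelation(pssm: Matrix, n_segments: int = 2,
                              max_lag: int = 2) -> List[float]:
    """Assumed definition: per segment, per column, per lag, the mean of
    the products of scores that are `lag` positions apart."""
    if not pssm:
        raise ValueError("pssm must contain at least one row")
    n_cols = len(pssm[0])
    features: List[float] = []
    for segment in split_segments(pssm, n_segments):
        for col in range(n_cols):
            for lag in range(1, max_lag + 1):
                pairs = len(segment) - lag
                if pairs <= 0:
                    # Segment too short for this lag; pad with zero.
                    features.append(0.0)
                    continue
                total = sum(segment[i][col] * segment[i + lag][col]
                            for i in range(pairs))
                features.append(total / pairs)
    return features


# Toy 4-residue, 2-column matrix (made-up values, not real PSSM scores).
toy = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]]
vector = segmented_autocorrelation(toy, n_segments=2, max_lag=1)
```

The resulting vector has a fixed length of `n_segments * n_cols * max_lag` regardless of protein length, which is what makes it usable as input to a classifier such as an SVM (for example, scikit-learn's `SVC`).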

MeSH terms

  • Algorithms
  • Cell Membrane / microbiology
  • Cell Wall / microbiology
  • Computational Biology
  • Cytoplasm / microbiology
  • Databases, Genetic
  • Electronic Data Processing
  • Gram-Negative Bacteria / physiology*
  • Gram-Positive Bacteria / physiology*
  • Models, Biological
  • Probability
  • Reproducibility of Results
  • Support Vector Machine