iProtGly-SS: Identifying protein glycation sites using sequence and structure based features

Proteins. 2018 Jul;86(7):777-789. doi: 10.1002/prot.25511. Epub 2018 May 2.

Abstract

Glycation is a chemical reaction in which a sugar molecule bonds to a protein without the help of enzymes. It is often a cause of many diseases, so knowledge about glycation is very important. In this paper, we present iProtGly-SS, a protein lysine glycation site identification method based on features extracted from sequence and secondary structural information. In our experiments, the best-performing combination of feature groups was Amino Acid Composition, Secondary Structure Motifs, and Polarity. We trained our model with a support vector machine classifier on an optimal set of features chosen by a group-based forward feature selection technique. On standard benchmark datasets, our method significantly outperforms existing methods for glycation prediction. A web server for iProtGly-SS is implemented and publicly available at http://brl.uiu.ac.bd/iprotgly-ss/.
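
As an illustration only, the two ideas named in the abstract (composition features computed around candidate lysines, and greedy selection over whole feature groups) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the window half-width `flank`, the `X` padding character, the function names, and the toy scoring function are all hypothetical. In a real pipeline the `score` callable would typically return a cross-validated metric of a support vector machine trained on the chosen groups.

```python
from typing import Callable, List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def lysine_windows(sequence: str, flank: int = 7) -> List[Tuple[int, str]]:
    """Return (0-based position, window) for every lysine (K).

    The window spans `flank` residues on each side; sequence ends are
    padded with 'X' (an assumed placeholder, not taken from the paper).
    """
    padded = "X" * flank + sequence + "X" * flank
    return [
        (i, padded[i : i + 2 * flank + 1])
        for i, residue in enumerate(sequence)
        if residue == "K"
    ]


def amino_acid_composition(window: str) -> List[float]:
    """Fraction of each standard residue in the window, ignoring padding."""
    residues = [r for r in window if r in AMINO_ACIDS]
    total = len(residues) or 1  # avoid division by zero on empty input
    return [residues.count(a) / total for a in AMINO_ACIDS]


def group_forward_selection(
    groups: Sequence[str],
    score: Callable[[Tuple[str, ...]], float],
) -> Tuple[Tuple[str, ...], float]:
    """Greedily add the feature group that most improves `score`.

    Stops as soon as no remaining group gives a strict improvement.
    """
    selected: Tuple[str, ...] = ()
    best = float("-inf")
    remaining = list(groups)
    while remaining:
        trial = {g: score(selected + (g,)) for g in remaining}
        candidate = max(trial, key=trial.get)
        if trial[candidate] <= best:
            break
        selected += (candidate,)
        best = trial[candidate]
        remaining.remove(candidate)
    return selected, best


# Toy scorer with invented numbers, used only to exercise the loop.
# These values are NOT results from the paper.
TOY_GAIN = {"AAC": 0.30, "SSM": 0.20, "Polarity": 0.10, "Noise": -0.05}


def toy_score(chosen: Tuple[str, ...]) -> float:
    return sum(TOY_GAIN[g] for g in chosen)


if __name__ == "__main__":
    for position, window in lysine_windows("MKAK", flank=2):
        print(position, window, amino_acid_composition(window))
    print(group_forward_selection(list(TOY_GAIN), toy_score))
```

Selecting whole groups rather than individual features keeps the search small (one candidate per group per round) and yields a result that is easy to report as a combination of named feature groups.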

Keywords: classification; evolutionary features; feature selection; protein glycation; structural features.

MeSH terms

  • Databases, Protein
  • Lysine / chemistry*
  • Protein Structure, Secondary
  • Sequence Analysis, Protein
  • Support Vector Machine

Substances

  • Lysine