GlycoMinestruct: a new bioinformatics tool for highly accurate mapping of the human N-linked and O-linked glycoproteomes by incorporating structural features

Sci Rep. 2016 Oct 6;6:34595. doi: 10.1038/srep34595.

Abstract

Glycosylation plays an important role in cell-cell adhesion, ligand binding and subcellular recognition. Current approaches for predicting protein glycosylation are primarily based on sequence-derived features, while little work has been done to systematically assess the importance of structural features for glycosylation prediction. Here, we propose a novel bioinformatics method called GlycoMinestruct (http://glycomine.erc.monash.edu/Lab/GlycoMine_Struct/) for improved prediction of human N- and O-linked glycosylation sites by combining sequence and structural features in an integrated computational framework with a two-step feature-selection strategy. Experiments indicated that GlycoMinestruct outperformed NGlycPred, the only predictor that incorporated both sequence and structural features, achieving AUC values of 0.941 and 0.922 for N- and O-linked glycosylation, respectively, on an independent test dataset. We applied GlycoMinestruct to screen the human structural proteome and obtained high-confidence predictions for N- and O-linked glycosylation sites. GlycoMinestruct can be used as a powerful tool to expedite the discovery of glycosylation events and substrates and to facilitate hypothesis-driven experimental studies.
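The abstract reports performance as AUC (area under the ROC curve). As a reading aid, the following minimal sketch shows one standard way to compute that metric: the fraction of (positive, negative) pairs in which the positive site receives the higher score, with ties counted as one half. The function name `auc` and the toy labels and scores are hypothetical illustrations; they are not data, code, or results from GlycoMinestruct.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 1 = glycosylated site, 0 = non-glycosylated site.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1]
toy_auc = auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 1.0 means every positive site is ranked above every negative site, while 0.5 corresponds to random ranking, which is the scale on which the reported 0.941 and 0.922 should be read.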

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology*
  • Glycoproteins* / chemistry
  • Glycoproteins* / genetics
  • Glycosylation
  • Humans
  • Proteome* / chemistry
  • Proteome* / genetics
  • Sequence Analysis, Protein / methods*
  • Software*

Substances

  • Glycoproteins
  • Proteome