Feature-based classification of amino acid substitutions outside conserved functional protein domains

ScientificWorldJournal. 2013 Nov 17:2013:948617. doi: 10.1155/2013/948617. eCollection 2013.

Abstract

There are more than 500 amino acid substitutions in each human genome, and bioinformatics tools are indispensable for determining their functional effects. We have developed a feature-based algorithm for the detection of mutations outside conserved functional domains (CFDs) and compared its classification efficacy with that of the most commonly used phylogeny-based tools, PolyPhen-2 and SIFT. The new algorithm combines the informational spectrum method (ISM), a feature-based technique, with statistical analysis. Our dataset contained neutral polymorphisms and mutations associated with myeloid malignancies from the epigenetic regulators ASXL1, DNMT3A, EZH2, and TET2. PolyPhen-2 and SIFT predicted the effects of amino acid substitutions outside CFDs with significantly lower accuracy than expected, and with especially low sensitivity. In contrast, only the ISM algorithm classified these sequences with statistical significance. It outperformed PolyPhen-2 and SIFT by 15% and 13%, respectively. These results suggest that feature-based methods such as ISM are more suitable than phylogeny-based tools for classifying amino acid substitutions outside CFDs.
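The evaluation terms used in the abstract (accuracy and sensitivity) can be made concrete with a minimal sketch. The function name `classification_metrics`, the label convention (True = disease-associated, False = neutral polymorphism), and the toy labels are illustrative assumptions; they are not the study's code or data.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    sensitivity: float  # fraction of disease-associated variants predicted as such
    specificity: float  # fraction of neutral variants predicted as neutral
    accuracy: float     # fraction of all variants classified correctly


def classification_metrics(y_true, y_pred):
    """Compute binary classification metrics.

    y_true, y_pred: equal-length sequences of booleans,
    where True means "disease-associated" (assumed convention).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")

    return Metrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        accuracy=(tp + tn) / len(pairs),
    )


# Toy example (made-up labels): a predictor that misses most
# disease-associated variants has low sensitivity even when its
# specificity looks acceptable.
truth = [True, True, True, True, False, False, False, False]
predicted = [True, False, False, False, False, False, False, True]
print(classification_metrics(truth, predicted))
# Metrics(sensitivity=0.25, specificity=0.75, accuracy=0.5)
```

With these toy labels the predictor is right half of the time overall but detects only one of four disease-associated variants, which is the kind of pattern the phrase "especially low sensitivity" describes.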

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Substitution*
  • Computational Biology / methods
  • DNA (Cytosine-5-)-Methyltransferases / genetics
  • DNA Methyltransferase 3A
  • DNA-Binding Proteins / genetics
  • Databases, Genetic
  • Dioxygenases
  • Enhancer of Zeste Homolog 2 Protein
  • Epigenesis, Genetic
  • Humans
  • Polycomb Repressive Complex 2 / genetics
  • Protein Interaction Domains and Motifs / genetics*
  • Proteins / chemistry*
  • Proteins / genetics*
  • Proteins / metabolism
  • Proto-Oncogene Proteins / genetics
  • ROC Curve
  • Repressor Proteins / genetics

Substances

  • ASXL1 protein, human
  • DNA-Binding Proteins
  • DNMT3A protein, human
  • Proteins
  • Proto-Oncogene Proteins
  • Repressor Proteins
  • Dioxygenases
  • TET2 protein, human
  • DNA (Cytosine-5-)-Methyltransferases
  • DNA Methyltransferase 3A
  • EZH2 protein, human
  • Enhancer of Zeste Homolog 2 Protein
  • Polycomb Repressive Complex 2