Achieving high accuracy prediction of minimotifs

PLoS One. 2012;7(9):e45589. doi: 10.1371/journal.pone.0045589. Epub 2012 Sep 27.

Abstract

The low complexity of minimotif patterns results in a high false-positive prediction rate, which hampers protein function prediction. A multi-filter algorithm, trained and tested with linear regression, support vector machine, and neural network models on a large dataset of verified minimotifs, vastly improves minimotif prediction accuracy while generating few false positives. A threshold chosen for the best accuracy gives an overall accuracy above 90%, while a stringent threshold chosen for the best specificity generates fewer than 1% false positives, or even none, and still yields more than 90% true positives for the linear regression and neural network models. With this accuracy, the minimotif multi-filter represents the state of the art in minimotif prediction and is expected to be very useful to biologists investigating protein function and how missense mutations cause disease.
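The two threshold choices described above can be illustrated with a minimal sketch. It is not the authors' multi-filter. It assumes only that some model has already assigned each candidate minimotif a numeric score and that each candidate carries a verified true/false label. The function names, the `max_fpr` parameter, and the toy scores are hypothetical and chosen for illustration; none of the numbers come from the paper.

```python
from typing import List, Optional, Tuple


def confusion(scores: List[float], labels: List[int], threshold: float) -> Tuple[int, int, int, int]:
    """Count (TP, FP, TN, FN), predicting positive when score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def best_accuracy_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, accuracy) for the observed score that maximizes accuracy.

    Ties are resolved in favor of the lowest threshold.
    """
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and of equal length")
    best_t, best_acc = scores[0], -1.0
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, t)
        acc = (tp + tn) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


def stringent_threshold(scores: List[float], labels: List[int],
                        max_fpr: float = 0.01) -> Optional[Tuple[float, float]]:
    """Return (threshold, sensitivity) for the lowest observed score whose
    false-positive rate is at most max_fpr, or None if no observed score qualifies.

    The false-positive rate can only fall as the threshold rises, so the first
    qualifying threshold in ascending order keeps the most true positives.
    """
    negatives = sum(1 for label in labels if not label)
    positives = len(labels) - negatives
    if negatives == 0 or positives == 0:
        raise ValueError("both classes must be present")
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(scores, labels, t)
        if fp / negatives <= max_fpr:
            return t, tp / positives
    return None


if __name__ == "__main__":
    # Toy data, invented for illustration only (1 = verified minimotif, 0 = false hit).
    toy_scores = [0.95, 0.90, 0.85, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
    toy_labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
    print("accuracy-optimal:", best_accuracy_threshold(toy_scores, toy_labels))
    print("zero false positives:", stringent_threshold(toy_scores, toy_labels, max_fpr=0.0))
```

The trade-off is the one the abstract reports: the accuracy-oriented threshold tolerates some false positives to maximize overall correctness, whereas the stringent threshold gives up some true positives to push false positives toward zero.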

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Amino Acid Motifs*
  • Computational Biology / methods
  • Internet
  • Models, Theoretical
  • Pattern Recognition, Automated / methods*
  • Protein Binding
  • Proteins / chemistry*
  • ROC Curve

Substances

  • Proteins