Spider Neurotoxins, Short Linear Cationic Peptides and Venom Protein Classification Improved by an Automated Competition between Exhaustive Profile HMM Classifiers

Toxins (Basel). 2017 Aug 8;9(8):245. doi: 10.3390/toxins9080245.

Abstract

Spider venoms are rich cocktails of bioactive peptides, proteins, and enzymes that have been intensively investigated over the years. To provide a better understanding of that richness, we propose a three-level family classification system for spider venom components. This classification is supported by an exhaustive set of 219 new profile hidden Markov models (HMMs) able to assign a given peptide to its precise peptide type, family, and group. The proposed classification has the advantages of being totally independent of variable spider taxonomic names and of being able to evolve easily. In addition to the new classifiers, we introduce and demonstrate the efficiency of hmmcompete, a new standalone tool that monitors HMM-based family classification and, after post-processing the result, reports the best classifier when multiple models produce significant scores for a given peptide query. The combined use of hmmcompete and the new spider venom component-specific classifiers demonstrated 96% sensitivity in properly classifying all known spider toxins from the UniProtKB database. These tools are timely given the important classification needs created by the increasing number of peptides and proteins generated by transcriptomic projects.
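
The idea of a competition between classifiers, in which several profile HMMs score the same peptide and only the best one is reported, can be illustrated with a minimal sketch. This is not the hmmcompete implementation: the abstract does not describe its post-processing, so the selection rule used here (highest bit score among hits under an E-value cutoff) is an assumption, as are the `Hit` record, the `best_classifier` function, the `max_evalue` default, and the peptide and family names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Hit:
    """One profile HMM match against one peptide (hypothetical record)."""
    query: str     # peptide identifier
    model: str     # name of the profile HMM that matched
    score: float   # bit score, higher is better
    evalue: float  # E-value, lower is better


def best_classifier(hits: Iterable[Hit], max_evalue: float = 1e-3) -> Dict[str, Hit]:
    """Keep, for each query, the significant hit with the highest bit score.

    Assumed rule for illustration only: a hit is significant when its
    E-value is at most max_evalue; ties keep the first hit seen.
    """
    best: Dict[str, Hit] = {}
    for hit in hits:
        if hit.evalue > max_evalue:
            continue  # not significant, so it does not compete
        current = best.get(hit.query)
        if current is None or hit.score > current.score:
            best[hit.query] = hit
    return best


# Invented example data: two models compete for pep1, none is significant for pep2.
hits = [
    Hit("pep1", "family_A", 42.0, 1e-10),
    Hit("pep1", "family_B", 55.5, 1e-14),
    Hit("pep2", "family_A", 8.0, 0.5),
]
winners = best_classifier(hits)
print({query: hit.model for query, hit in winners.items()})  # {'pep1': 'family_B'}
```

A peptide with no significant hit, such as `pep2` above, is simply absent from the result, which leaves it unclassified rather than forcing it into a family.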

Keywords: classification; hmmcompete; machine learning; profile HMM; spider; toxin.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Arthropod Proteins / classification*
  • Databases, Protein
  • Neurotoxins / classification*
  • Peptides / classification*
  • Proteomics
  • Spider Venoms / classification*
  • Spiders

Substances

  • Arthropod Proteins
  • Neurotoxins
  • Peptides
  • Spider Venoms