3D entropy and moments prediction of enzyme classes and experimental-theoretic study of peptide fingerprints in Leishmania parasites

Biochim Biophys Acta. 2009 Dec;1794(12):1784-94. doi: 10.1016/j.bbapap.2009.08.020. Epub 2009 Aug 28.

Abstract

The number of protein 3D structures without function annotation in the Protein Data Bank (PDB) has steadily increased. This has led in turn to a growing demand for theoretical models that can characterize these proteins quickly. In this work, we present a new and fast Markov chain model (MCM) to predict the enzyme classification (EC) number. We used both linear discriminant analysis (LDA) and artificial neural networks (ANN) in order to compare linear vs. non-linear classifiers. The LDA model found is very simple (three variables) and is nevertheless able to predict the first EC number with an overall accuracy of 79% for a data set of 4755 proteins (859 enzymes and 3896 non-enzymes) divided into training and external validation series. In addition, the best non-linear ANN model is notably more complex but has an overall accuracy of 98.85%. It is important to emphasize that this method may help us not only to predict new enzyme proteins but also to select peptide candidates, found in the peptide mass fingerprints (PMFs) of new proteins, that may improve enzyme activity. To illustrate the use of the model in this regard, we first report the 2D electrophoresis (2DE) and MALDI-TOF mass spectrometry characterization of the PMF of a new possible malate dehydrogenase sequence from Leishmania infantum. Next, we used the models to predict the contribution to a specific enzyme action of 30 peptides found in the PMF of the new protein. We implemented the present model as a server on the Bio-AIMS portal (http://miaja.tic.udc.es/Bio-AIMS/EnzClassPred.php). This free on-line tool is based on PHP/HTML/Python and MARCH-INSIDE routines. This combined strategy may be used to identify and predict peptides of prokaryote and eukaryote parasites and their hosts, as well as of other higher organisms, which may be of interest in drug development or target identification.
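
As a rough illustration of the kind of Markov chain entropy descriptor the abstract refers to, the following minimal Python sketch computes Shannon entropies of the k-step probability distributions of a toy chain. It is not the MARCH-INSIDE implementation: the function names, the uniform starting distribution, and the 3-residue contact matrix are all assumptions made for this example, and the abstract does not specify how the actual transition probabilities are defined.

```python
import math


def row_normalize(weights):
    """Turn a non-negative weight matrix into a row-stochastic matrix."""
    matrix = []
    for row in weights:
        total = sum(row)
        if total == 0:
            raise ValueError("each row needs at least one non-zero weight")
        matrix.append([w / total for w in row])
    return matrix


def step(p, matrix):
    """Advance a probability vector by one Markov step: p_next = p * matrix."""
    n = len(matrix)
    return [sum(p[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def shannon_entropy(p):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)


def entropy_descriptors(weights, k_max):
    """Entropies of the 0..k_max step distributions, starting from uniform.

    The uniform start is an assumption of this sketch, not of the paper.
    """
    matrix = row_normalize(weights)
    n = len(matrix)
    p = [1.0 / n] * n
    entropies = [shannon_entropy(p)]
    for _ in range(k_max):
        p = step(p, matrix)
        entropies.append(shannon_entropy(p))
    return entropies


# Hypothetical contact map for three residues (1 = in contact, self included).
toy_contacts = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
descriptors = entropy_descriptors(toy_contacts, 3)
```

In a workflow like the one described, a small number of such descriptors per protein would then be passed to a classifier such as LDA or an ANN; which descriptors the published three-variable model actually uses is not stated in the abstract.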

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computer Simulation
  • Discriminant Analysis
  • Electrophoresis, Gel, Two-Dimensional
  • Enzymes / chemistry*
  • Enzymes / classification*
  • Enzymes / isolation & purification
  • Leishmania infantum / chemistry
  • Leishmania infantum / enzymology*
  • Linear Models
  • Markov Chains
  • Models, Molecular
  • Neural Networks, Computer
  • Nonlinear Dynamics
  • Peptide Mapping
  • Protein Conformation
  • Protozoan Proteins / chemistry*
  • Protozoan Proteins / classification*
  • Protozoan Proteins / isolation & purification
  • Spectrometry, Mass, Matrix-Assisted Laser Desorption-Ionization
  • Thermodynamics

Substances

  • Enzymes
  • Protozoan Proteins