Rapid catalytic template searching as an enzyme function prediction procedure

PLoS One. 2013 May 10;8(5):e62535. doi: 10.1371/journal.pone.0062535. Print 2013.

Abstract

We present an enzyme function identification algorithm, Catalytic Site Identification (CatSId), based on the identification of catalytic residues. The method is optimized for highly accurate template identification across a diverse template library and is also efficient with regard to time and scalability of comparisons. The algorithm matches three-dimensional residue arrangements in a query protein to a library of manually annotated catalytic residues, the Catalytic Site Atlas (CSA). Two main processes are involved. The first process is a rapid protein-to-template matching algorithm that scales quadratically with target protein size and linearly with template size. The second process incorporates a number of physical descriptors, including binding site predictions, in a logistic scoring procedure to re-score matches found in the first process. This approach shows very good performance overall, with a receiver operating characteristic (ROC) area under the curve (AUC) of 0.971 for the training set evaluated. The procedure is able to process cofactors, ions, nonstandard residues, and point substitutions for residues and ions in a robust and integrated fashion. Sites with only two critical (catalytic) residues are challenging cases, resulting in AUCs of 0.9411 and 0.5413 for the training and test sets, respectively. Templates with more than two critical (catalytic) residues perform far better, with AUCs greater than 0.90 for both the training and test data. The procedure has considerable promise for larger scale searches.
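
The re-scoring and evaluation steps described in the abstract can be illustrated with a minimal sketch. It assumes each candidate match is summarized by two hypothetical descriptors (a template RMSD and a binding-site overlap fraction) and uses made-up weights and data; none of these are the descriptors, coefficients, or results of CatSId itself. The sketch only shows the general shape of a logistic score followed by an AUC calculation, here computed as the fraction of (true match, false match) pairs that are ranked correctly.

```python
import math

def logistic_score(features, weights, bias):
    """Map a descriptor vector to a probability-like score in (0, 1)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))

# Hypothetical candidate matches: (rmsd, site_overlap, is_true_match).
# These values are invented for illustration only.
matches = [
    (0.4, 0.9, 1),
    (0.6, 0.7, 1),
    (1.6, 0.3, 1),
    (1.8, 0.2, 0),
    (1.2, 0.8, 0),
    (1.5, 0.1, 0),
]

# Hypothetical coefficients: lower RMSD and higher overlap raise the score.
WEIGHTS = (-2.0, 3.0)
BIAS = 1.0

scores = [logistic_score((rmsd, overlap), WEIGHTS, BIAS)
          for rmsd, overlap, _ in matches]
labels = [label for _, _, label in matches]
auc = roc_auc(scores, labels)
```

In practice the coefficients would be fitted on a labeled training set, and the AUC would be reported separately for training and test data, as the abstract does.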

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms*
  • Binding Sites
  • Catalysis
  • Catalytic Domain
  • Computational Biology / methods*
  • Databases, Protein
  • Enzymes / chemistry*
  • Enzymes / metabolism*
  • Logistic Models
  • Models, Molecular
  • Protein Conformation
  • ROC Curve
  • Reproducibility of Results

Substances

  • Enzymes

Grants and funding

The authors gratefully acknowledge the Defense Threat Reduction Agency (DTRA) for supporting this work (grant B094679I). A portion of this work was also funded by LDRD 12-SI-004. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.