As the number of solved protein structures grows, computational techniques for automatic functional annotation of proteins and their classification into families are of high scientific interest. DoGSiteScorer automatically predicts pockets from the 3D structure of a protein and calculates global descriptors for them. Protein function predictors are built with a support vector machine (SVM) on three levels of increasing granularity, using descriptors of 26632 pockets from enzymes with known structure and enzyme classification. The SVM models generalize the available descriptor space for each enzyme class, subclass, and substrate-specific sub-subclass. In cross-validation studies, the correct main class is predicted with an accuracy of 68.2%, and accuracies for the six subclasses range from 62.8% to 80.9%. The substrate-specific recall rate for a kinase subset is 53.8%. Furthermore, application studies demonstrate that the method can predict the function of uncharacterized proteins and provide valuable information for the function prediction field.
Copyright © 2012 Wiley Periodicals, Inc.
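The abstract reports two kinds of figures: overall accuracy and class-specific recall. The following minimal sketch illustrates how the two differ. It assumes the usual definitions (accuracy as the fraction of all samples predicted correctly, recall as the fraction of samples of one class predicted correctly). The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of all samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the fraction of its samples that were predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical toy labels (illustrative class numbers, not data from the paper).
y_true = [1, 1, 1, 2, 2, 3, 3, 3, 3, 3]
y_pred = [1, 1, 2, 2, 2, 3, 3, 3, 1, 2]

overall = accuracy(y_true, y_pred)
recall = per_class_recall(y_true, y_pred)
print(f"accuracy: {overall:.3f}")
for label in sorted(recall):
    print(f"recall for class {label}: {recall[label]:.3f}")
```

In this toy case the overall accuracy is 0.7, while recall varies by class (about 0.667, 1.0, and 0.6), which is why a single accuracy value and class-specific recall rates can tell different stories about the same predictor.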