Structural knowledge about protein-protein interactions can provide insights into the basic processes underlying cell function. Recent progress in experimental and computational structural biology has led to rapid growth in the number of experimentally resolved structures and computationally determined near-native models of protein-protein interactions. However, determining whether a protein-protein interaction is physiological or an artifact of an experimental or computational method remains a challenging problem. In this work, we address two related problems. The first is distinguishing between experimentally obtained physiological and crystal-packing protein-protein interactions. The second is classifying docking models as near-native or inaccurate. We first defined a universal set of interface features and employed a support vector machine (SVM)-based approach to classify the interactions for both problems, with the accuracy, precision, and recall of the classifier for the first problem reaching 93%. To improve the classification, we next developed a semi-supervised learning approach for the second problem, using a transductive SVM (TSVM). We applied both classifiers to a commonly used protein docking benchmark of 124 complexes. Although we reached classification accuracies of 78.9% for the SVM classifier and 80.3% for the TSVM classifier, improving protein-docking methods by model re-ranking remains a challenging problem.
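For readers less familiar with the evaluation measures quoted above, the following minimal sketch shows how accuracy, precision, and recall are conventionally computed for a binary classifier. It is an illustration only, not the evaluation code used in this work: the function name `binary_metrics` is hypothetical, and the toy labels (1 for a physiological interaction, 0 for a crystal-packing one) are invented for the example and are not data from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for binary labels.

    Hypothetical helper for illustration; uses the standard
    confusion-matrix definitions of the three measures.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Toy labels, invented for illustration (1 = physiological, 0 = crystal packing).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

accuracy, precision, recall = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

The three measures can coincide, as they do in this toy example, when the numbers of false positives and false negatives are equal and the classes are balanced.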
Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.