Selecting high quality protein structures from diverse conformational ensembles

Ashwin Subramani; Peter A DiMaggio Jr; Christodoulos A Floudas

doi:10.1016/j.bpj.2009.06.046

Biophys J. 2009 Sep 16;97(6):1728-36. doi: 10.1016/j.bpj.2009.06.046.

Authors

Ashwin Subramani¹, Peter A DiMaggio Jr, Christodoulos A Floudas

Affiliation

¹ Department of Chemical Engineering, Princeton University, Princeton, New Jersey, USA.

Abstract

Protein structure prediction poses two major challenges: (1) generating a large ensemble of high-resolution structures for a given amino-acid sequence, and (2) identifying the structure closest to the native structure in a blind prediction. In this article, we address the second challenge by proposing what is, to our knowledge, a novel iterative clustering method, based on the traveling-salesman problem, to identify the structures of a protein in a given ensemble that are closest to the native structure. At each iteration, the method eliminates clusters of structures that are unlikely to share the native fold, based on a statistical analysis of cluster density and average spherical radius. The method, denoted ICON, has been tested on four data sets: (1) 1400 proteins with high-resolution decoys; (2) medium-to-low-resolution decoys from Decoys 'R' Us; (3) medium-to-low-resolution decoys from the first-principles approach ASTRO-FOLD; and (4) selected targets from CASP8. The extensive tests demonstrate that ICON can identify high-quality structures in each ensemble, regardless of the resolution of the conformers. Across a total of 1454 proteins, with an average of 1051 conformers per protein, the conformers selected by ICON are, on average, in the top 3.5% of the conformers in the ensemble.
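
To make the "top 3.5%" figure concrete, the following minimal sketch shows one way such a percentile could be computed for a single ensemble. It is an illustration only, not the ICON procedure: the abstract does not state which quality measure underlies the ranking, so the use of RMSD to the native structure, the function name `top_percent`, and the sample values are all assumptions.

```python
def top_percent(scores, selected_index):
    """Return the percentage of the ensemble that is at least as good as
    the selected conformer, assuming a lower score means a better structure.

    `scores` is a hypothetical list of per-conformer quality values
    (here assumed to be RMSD to the native structure, in angstroms).
    """
    selected = scores[selected_index]
    # Count conformers scoring no worse than the selected one (itself included).
    rank = sum(1 for s in scores if s <= selected)
    return 100.0 * rank / len(scores)


# Made-up RMSD values for a ten-conformer ensemble (illustrative only).
rmsd_to_native = [4.2, 1.8, 6.5, 2.9, 3.3, 7.1, 5.0, 2.1, 8.4, 3.9]

# The conformer at index 7 (2.1) is the second best of ten.
print(top_percent(rmsd_to_native, 7))  # prints 20.0
```

Under these assumptions, averaging this value over all proteins in a data set would give a summary statistic of the same kind as the one reported above.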

Publication types

Research Support, N.I.H., Extramural
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms
Computational Biology / methods*
Protein Conformation
Protein Folding
Proteins / chemistry*
Proteins / metabolism

Substances

Proteins
