Database algorithm for generating protein backbone and side-chain co-ordinates from a Cα trace: application to model building and detection of co-ordinate errors

J Mol Biol. 1991 Mar 5;218(1):183-94. doi: 10.1016/0022-2836(91)90883-8.

Abstract

The problem of constructing all-atom model co-ordinates of a protein from an outline of the polypeptide chain is encountered in protein structure determination by crystallography or nuclear magnetic resonance spectroscopy, in model building by homology and in protein design. Here, we present an automatic procedure for generating full protein co-ordinates (backbone and, optionally, side-chains) given the Cα trace and amino acid sequence. To construct backbones, a protein structure database is first scanned for fragments that locally fit the chain trace according to distance criteria. A best path algorithm then sifts through these segments and selects an optimal path with minimal mismatch at fragment joints. In blind tests, using fully known protein structures, backbones (Cα, C, N, O) can be reconstructed with a reliability of 0.4 to 0.6 Å root-mean-square position deviation and not more than 0 to 5% peptide flips. This accuracy is sufficient to identify possible errors in protein co-ordinate sets. To construct full co-ordinates, side-chains are added from a library of frequently occurring rotamers using a simple and fast Monte Carlo procedure with simulated annealing. In tests on X-ray structures determined at better than 2.5 Å resolution, the positions of side-chain atoms in the protein core (less than 20% relative accessibility) have an accuracy of 1.6 Å (r.m.s. deviation) and 70% of χ1 angles are within 30° of the X-ray structure. The computer program MaxSprout is available on request.
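
The best path step can be illustrated with a generic dynamic-programming sketch. This is not the MaxSprout implementation: the abstract does not give the scoring functions, so `fit_cost` (local misfit of a candidate fragment to the chain trace) and `joint_cost` (mismatch where two consecutive fragments meet) are hypothetical placeholders, as is the simplification of choosing exactly one candidate fragment per chain position.

```python
from typing import Callable, List, Sequence, Tuple


def best_path(
    fit_cost: Sequence[Sequence[float]],
    joint_cost: Callable[[int, int, int], float],
) -> Tuple[float, List[int]]:
    """Pick one candidate fragment per position, minimising the total cost.

    fit_cost[i][k]      : hypothetical local misfit of candidate k at position i
    joint_cost(i, j, k) : hypothetical mismatch when candidate j at position
                          i - 1 is followed by candidate k at position i
    Returns (total cost, chosen candidate index for each position).
    """
    n = len(fit_cost)
    if n == 0:
        return 0.0, []

    # best[k] = cheapest cost of any path that ends in candidate k so far.
    best = list(fit_cost[0])
    back: List[List[int]] = []  # back[i - 1][k] = best predecessor of k at i

    for i in range(1, n):
        new_best: List[float] = []
        pointers: List[int] = []
        for k, local in enumerate(fit_cost[i]):
            # Cheapest predecessor j for candidate k at position i.
            j_min = min(
                range(len(best)),
                key=lambda j: best[j] + joint_cost(i, j, k),
            )
            new_best.append(best[j_min] + joint_cost(i, j_min, k) + local)
            pointers.append(j_min)
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final candidate.
    k = min(range(len(best)), key=best.__getitem__)
    total = best[k]
    path = [k]
    for pointers in reversed(back):
        k = pointers[k]
        path.append(k)
    path.reverse()
    return total, path


# Toy example: two candidates per position, switching candidates is penalised.
toy_fit = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
toy_total, toy_path = best_path(toy_fit, lambda i, j, k: 0.0 if j == k else 5.0)
```

In the toy example the joint penalty outweighs the local misfit at the middle position, so the cheapest path stays on the same candidate throughout. The running time grows linearly with chain length and quadratically with the number of candidates per position.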

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Databases, Bibliographic*
  • Models, Molecular
  • Monte Carlo Method
  • Protein Conformation*
  • Proteins / chemistry*
  • X-Ray Diffraction / methods

Substances

  • Proteins