Methods are presented for locating stabilization center elements, residues that are expected to stabilize protein structures by preventing their decay through cooperative long-range interactions. Artificial neural network-based algorithms were developed to predict these residues from the primary structure of single proteins and from the amino acid sequences of homologous proteins. The prediction accuracy using single-sequence information alone is 65%, and incorporating evolutionary information in the form of multiple alignments and conservation scores raises the accuracy by 3%. The composition, relative accessibility, number and type of interactions, conservation, and X-ray thermal factor of the identified stabilization center residues differ not only from those of the whole data set but also from those of the remaining long-range interacting residues. The most frequent stabilization center residues are usually found at buried positions and have hydrophobic or aromatic side-chains, but some polar or charged residues also play an important role in stabilization. The stabilization centers differ significantly from the rest of the residues in composition and in the type of secondary structural elements they link. Structural and sequence conservation analyses showed that stabilization centers are more highly conserved across protein families. The relation of the proposed stabilization centers to folding nuclei is also discussed.
Copyright 1997 Academic Press Limited.