Protein secondary structure: entropy, correlations and prediction

Bioinformatics. 2004 Jul 10;20(10):1603-11. doi: 10.1093/bioinformatics/bth132. Epub 2004 Feb 26.

Abstract

Motivation: Is protein secondary structure primarily determined by local interactions between residues closely spaced along the amino acid backbone or by non-local tertiary interactions? To answer this question, we measure the entropy densities of primary and secondary structure sequences, and the local inter-sequence mutual information density.
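As a rough illustration of the quantities named above, the following minimal sketch estimates a finite-length entropy density and a local inter-sequence mutual information from paired sequences. It is not the authors' analysis code: the function names, the window parameter `half_width`, and the toy sequences are hypothetical, and it assumes a naive plug-in (frequency-count) estimator, which is known to be biased on small samples.

```python
from collections import Counter
from math import log2


def plugin_entropy(symbols):
    """Plug-in (frequency-count) Shannon entropy, in bits, of an iterable of hashable symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())


def block_entropy(seq, length):
    """Entropy H(L) of all overlapping blocks of the given length in seq."""
    if length == 0:
        return 0.0
    blocks = [seq[i:i + length] for i in range(len(seq) - length + 1)]
    return plugin_entropy(blocks)


def entropy_density(seq, length):
    """Finite-length estimate h(L) = H(L) - H(L-1), in bits per symbol."""
    return block_entropy(seq, length) - block_entropy(seq, length - 1)


def local_mutual_information(primary, secondary, half_width):
    """I(window; state): information that a window of 2*half_width + 1 amino acids
    carries about the secondary structure state at the window's center."""
    windows, states = [], []
    for i in range(half_width, len(primary) - half_width):
        windows.append(primary[i - half_width:i + half_width + 1])
        states.append(secondary[i])
    joint = list(zip(windows, states))
    return plugin_entropy(windows) + plugin_entropy(states) - plugin_entropy(joint)


if __name__ == "__main__":
    # Hypothetical toy data (not from the paper): one amino acid sequence and an
    # aligned three-state secondary structure string (H = helix, E = strand, C = coil).
    primary = "MKTAYIAKQRQISFVKSHFSRQ"
    secondary = "CCHHHHHHHHHHCCEEEEECCC"
    print("secondary h(2):", entropy_density(secondary, 2))
    print("I(window; state), half_width=0:", local_mutual_information(primary, secondary, 0))
```

On a sequence this short the plug-in estimate of mutual information is strongly inflated, so a real measurement would need a large non-redundant data set and a bias-corrected estimator.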

Results: We find that the important inter-sequence interactions are short-ranged, that correlations between neighboring amino acids are essentially uninformative, and that only one-fourth of the total information needed to determine the secondary structure is available from local inter-sequence correlations. These observations support the view that the majority of most proteins fold via a cooperative process where secondary and tertiary structure form concurrently. Moreover, existing single-sequence secondary structure prediction algorithms are almost optimal, and we should not expect a dramatic improvement in prediction accuracy.

Availability: Both the data sets and analysis code are freely available from our Web site at http://compbio.berkeley.edu/

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Computer Simulation
  • Entropy
  • Markov Chains
  • Models, Chemical*
  • Models, Molecular*
  • Models, Statistical
  • Molecular Sequence Data
  • Protein Structure, Secondary*
  • Proteins / analysis
  • Proteins / chemistry*
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Statistics as Topic

Substances

  • Proteins