Predicting solvent accessibility: higher accuracy using Bayesian statistics and optimized residue substitution classes

Proteins. 1996 May;25(1):38-47. doi: 10.1002/(SICI)1097-0134(199605)25:1<38::AID-PROT4>3.0.CO;2-G.

Abstract

We introduce a novel Bayesian probabilistic method for predicting the solvent accessibilities of amino acid residues in globular proteins. Using single sequence data, this method achieves prediction accuracies higher than previously published methods. Substantially improved predictions, comparable to the highest accuracies reported in the literature to date, are obtained by representing alignments of the example proteins and their homologs as strings of residue substitution classes, depending on the side chain types observed at each alignment position. These results demonstrate the applicability of this relatively simple Bayesian approach to structure prediction and illustrate the utility of the classification methodology previously developed to extract information from aligned sets of structurally related proteins.
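The abstract does not give the model's details, so the following is only a minimal sketch of the general idea of Bayesian prediction from a single sequence, not the authors' method. It assumes a naive Bayes classifier with two hypothetical states (B for buried, E for exposed), a sliding window of neighboring residues, add-one pseudocounts, and conditional independence between window positions. The names `windows`, `train`, `predict`, and the toy training data are invented for illustration, and sequences are assumed to use only the 20 standard one-letter codes.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids plus '-' for padding past chain ends
STATES = "BE"  # assumed two-state labels: B = buried, E = exposed


def windows(seq, half_width):
    """Yield, for each position, the residues at offsets -half_width..+half_width."""
    padded = "-" * half_width + seq + "-" * half_width
    for i in range(len(seq)):
        yield padded[i:i + 2 * half_width + 1]


def train(examples, half_width=1, pseudocount=1.0):
    """Estimate log P(state) and log P(residue at offset | state) from labelled chains."""
    state_counts = Counter()
    residue_counts = defaultdict(Counter)  # (state, offset) -> residue counts
    for seq, labels in examples:
        for win, state in zip(windows(seq, half_width), labels):
            state_counts[state] += 1
            for offset, residue in enumerate(win):
                residue_counts[(state, offset)][residue] += 1

    total = sum(state_counts.values())
    log_prior = {
        s: math.log((state_counts[s] + pseudocount) / (total + pseudocount * len(STATES)))
        for s in STATES
    }
    log_like = {}
    for s in STATES:
        for offset in range(2 * half_width + 1):
            counts = residue_counts[(s, offset)]
            denom = sum(counts.values()) + pseudocount * len(ALPHABET)
            for r in ALPHABET:
                log_like[(s, offset, r)] = math.log((counts[r] + pseudocount) / denom)
    return log_prior, log_like, half_width


def predict(seq, model):
    """Assign each residue the state with the highest posterior (Bayes' rule, log space)."""
    log_prior, log_like, half_width = model
    out = []
    for win in windows(seq, half_width):
        scores = {
            s: log_prior[s] + sum(log_like[(s, o, r)] for o, r in enumerate(win))
            for s in STATES
        }
        out.append(max(STATES, key=lambda s: scores[s]))
    return "".join(out)


# Toy data, invented for illustration: leucine buried, lysine exposed.
toy_examples = [("LKLK", "BEBE"), ("KLKL", "EBEB")]
model = train(toy_examples)
print(predict("LKLK", model))  # prints BEBE
```

Under the same assumptions, alignment information could be used by replacing the residue alphabet with one symbol per substitution class at each alignment position; the class definitions themselves are not given in the abstract and are not reproduced here.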

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Amino Acids / chemistry*
  • Bayes Theorem
  • Databases, Factual
  • Evolution, Molecular
  • Information Theory
  • Likelihood Functions
  • Molecular Sequence Data
  • Neural Networks, Computer
  • Protein Conformation*
  • Protein Folding
  • Protein Structure, Secondary
  • Proteins / chemistry*
  • Sequence Alignment
  • Solvents

Substances

  • Amino Acids
  • Proteins
  • Solvents