On the relationship between sequence and structure similarities in proteomics

Bioinformatics. 2007 Mar 15;23(6):717-23. doi: 10.1093/bioinformatics/btm006. Epub 2007 Jan 22.

Abstract

Motivation: The underlying assumption of many sequence-based comparative studies in proteomics is that different aspects of protein structure and therefore functionality may be linked to particular sequence motifs. This holds true if sequence similarity is sufficiently high, but in general the relationship between protein sequence and structure appears complex and is not well understood.

Results: Statistical analysis of multiple and pairwise structural alignments of protein structures within SCOP folds is performed. The results indicate that multiple conservation of residue identity is not common and that relationship between sequence and structure may be explained by a model based on the assumption that protein structure is tolerant to residue substitutions preserving hydropathic profile of the sequence. This model also explains the origin and specific value of the sequence similarity threshold, noticed in many previous studies, below which structural resemblance is not statistically expected.

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Computer Simulation
  • Models, Chemical*
  • Models, Molecular*
  • Molecular Sequence Data
  • Protein Conformation
  • Proteome / metabolism*
  • Proteome / ultrastructure*
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Statistics as Topic
  • Structure-Activity Relationship

Substances

  • Proteome