An integrated system for studying residue coevolution in proteins

Bioinformatics. 2008 Jan 15;24(2):290-2. doi: 10.1093/bioinformatics/btm584. Epub 2007 Dec 1.

Abstract

Residue coevolution has recently emerged as an important concept, especially in the context of protein structures. While a multitude of different functions for quantifying it have been proposed, not much is known about their relative strengths and weaknesses. Also, subtle algorithmic details have discouraged implementing and comparing them. We addressed this issue by developing an integrated online system that enables comparative analyses with a comprehensive set of commonly used scoring functions, including Statistical Coupling Analysis (SCA), Explicit Likelihood of Subset Covariation (ELSC), mutual information and correlation-based methods. A set of data preprocessing options is provided for improving the sensitivity and specificity of coevolution signal detection, including sequence weighting, residue grouping and the filtering of sequences, sites and site pairs. In total, more than 100 scoring variations are available. When a crystal structure is supplied, the system also provides facilities for studying the relationship between coevolution scores and inter-residue distances, which may help in understanding protein structures.
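
As an illustration of one of the scoring functions named above, the following is a minimal sketch of plain mutual information between two columns of a multiple sequence alignment. It is not the system's implementation: the function names, the toy alignment and the choice of base-2 logarithms are assumptions made for this example, and the preprocessing options described above (sequence weighting, residue grouping, filtering) are deliberately left out.

```python
from collections import Counter
from math import log2


def column(alignment, index):
    """Return the residues found at one site (column) of an alignment.

    `alignment` is assumed to be a list of equal-length strings.
    """
    return [seq[index] for seq in alignment]


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    Unweighted sketch: every sequence counts equally, and gap
    characters, if present, are treated like any other symbol.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_i)
    if n == 0:
        raise ValueError("columns must not be empty")

    count_i = Counter(col_i)                # residue counts at site i
    count_j = Counter(col_j)                # residue counts at site j
    count_ij = Counter(zip(col_i, col_j))   # joint residue-pair counts

    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        # p(a,b) / (p(a) * p(b)) written with raw counts
        mi += p_ab * log2(c * n / (count_i[a] * count_j[b]))
    return mi


# Hypothetical toy alignment: sites 0 and 1 vary together,
# site 2 is fully conserved, site 3 varies independently of site 0.
alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]

mi_01 = mutual_information(column(alignment, 0), column(alignment, 1))
mi_02 = mutual_information(column(alignment, 0), column(alignment, 2))
mi_03 = mutual_information(column(alignment, 0), column(alignment, 3))
```

In this toy case the perfectly covarying pair (sites 0 and 1) scores 1 bit, while the conserved and the independent pairs score 0. A fully conserved site can never score above zero under this function, which is one reason site filtering matters in practice.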

Availability: The system is available at http://coevolution.gersteinlab.org. The source code and JavaDoc API can also be downloaded from the web site.

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Computer Graphics
  • Evolution, Molecular*
  • Molecular Sequence Data
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / genetics*
  • Proteins / ultrastructure
  • Sequence Analysis, Protein / methods*
  • Software*
  • Systems Integration
  • User-Computer Interface*

Substances

  • Proteins