Eigenvalue analysis of amino acid substitution matrices reveals a sharp transition of the mode of sequence conservation in proteins

Bioinformatics. 2004 Nov 1;20(16):2504-8. doi: 10.1093/bioinformatics/bth297. Epub 2004 May 6.

Abstract

The pattern of amino acid substitutions and sequence conservation over many structure-based alignments of protein sequences was analyzed as a function of percentage sequence identity. The substitution statistics were converted into log-odds amino acid substitution matrices, to which eigenvalue decomposition was applied. The most important component of the substitution matrices exhibited a sharp transition at a sequence identity of 30-35%, which coincides with the twilight zone. Above the transition point, the dominant component is related to the mutability of amino acids and acts to disfavor any substitution, whereas below the transition point, the dominant component is related to the hydrophobicity of amino acids, and substitutions between residues of similar hydrophobic character are positively favored. Implications for protein evolution and sequence analysis are discussed.
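The log-odds conversion and eigenvalue decomposition described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not state the logarithm base, any scaling, or the eigensolver used, so base-2 logarithms and plain power iteration (which recovers only the component of largest absolute eigenvalue) are assumptions made here. The function names `log_odds_matrix` and `dominant_eigenpair` and the toy counts are hypothetical, and all pair counts are assumed to be positive and symmetric.

```python
import math


def log_odds_matrix(pair_counts):
    """Convert symmetric aligned-pair counts into a log-odds matrix.

    s_ij = log2(q_ij / (p_i * p_j)), where q_ij is the joint frequency of
    the aligned pair (i, j) and p_i is the marginal frequency of residue i.
    Assumes every count is positive (no pseudocounts are added).
    """
    n = len(pair_counts)
    total = sum(sum(row) for row in pair_counts)
    q = [[pair_counts[i][j] / total for j in range(n)] for i in range(n)]
    p = [sum(q[i]) for i in range(n)]
    return [[math.log2(q[i][j] / (p[i] * p[j])) for j in range(n)]
            for i in range(n)]


def dominant_eigenpair(matrix, iterations=500):
    """Power iteration on a symmetric matrix.

    Returns the eigenvalue of largest absolute value and a unit-length
    eigenvector. Assumes that eigenvalue is unique in magnitude and that
    the starting vector is not orthogonal to its eigenvector.
    """
    n = len(matrix)
    # Non-uniform start so it is not orthogonal to contrast-type vectors.
    v = [float(i + 1) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero matrix: nothing to iterate on
            break
        v = [x / norm for x in w]
    # Rayleigh quotient gives the signed eigenvalue for the unit vector v.
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * mv[i] for i in range(n))
    return eigenvalue, v


# Toy counts for a hypothetical three-letter alphabet (not real data).
toy_counts = [
    [50, 10, 2],
    [10, 40, 3],
    [2, 3, 30],
]
scores = log_odds_matrix(toy_counts)
value, vector = dominant_eigenpair(scores)
print("dominant eigenvalue:", value)
print("eigenvector:", vector)
```

In an analysis of the kind the abstract describes, a matrix like `scores` would be built separately for each sequence-identity bin, and the leading eigenvector would be compared across bins. A full decomposition would need a symmetric eigensolver rather than power iteration.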

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms*
  • Amino Acid Substitution*
  • Amino Acids / chemistry*
  • Conserved Sequence
  • Hydrophobic and Hydrophilic Interactions
  • Mutation
  • Numerical Analysis, Computer-Assisted*
  • Proteins / chemistry*
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid

Substances

  • Amino Acids
  • Proteins