Trends of amino acid usage in the proteins from the human genome

J Biomol Struct Dyn. 2007 Aug;25(1):55-9. doi: 10.1080/07391102.2007.10507155.

Abstract

Correspondence analysis of amino acid usage was applied to 14,815 complete proteins from the human genome. We found that three major factors influence the variability of the amino acid composition of these proteins, explaining 20.4%, 14.7%, and 9.9% of the total variability, respectively. The first trend is strongly correlated with the GC content of first and second codon positions and is also significantly correlated with the GC level of the corresponding flanking regions and introns. Therefore, the main force shaping amino acid usage among human proteins is the set of compositional constraints determined by the isochore in which each gene is embedded. The second trend correlates with the hydropathy of each protein and with the frequency of beta-strands. Finally, the third trend is strongly associated with the usage of Cys and the frequency of alpha-helices.
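
As an illustration of the kind of analysis described above, the following is a minimal sketch of correspondence analysis on a protein-by-amino-acid count matrix, using the standard SVD formulation. The abstract does not state which software or implementation was used, so the function names (aa_counts, correspondence_analysis), the toy sequences, and the choice of NumPy are assumptions made for illustration only. The fraction of total inertia carried by each axis corresponds to the "percentage of total variability" reported for each trend.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aa_counts(seqs):
    """Build a count matrix: one row per protein, one column per amino acid.

    Assumes every sequence is non-empty; non-standard letters are ignored.
    """
    index = {aa: j for j, aa in enumerate(AMINO_ACIDS)}
    counts = np.zeros((len(seqs), len(AMINO_ACIDS)))
    for i, seq in enumerate(seqs):
        for aa in seq:
            if aa in index:
                counts[i, index[aa]] += 1
    return counts


def correspondence_analysis(counts):
    """Return (explained, row_coords) for a non-negative count matrix.

    explained  : fraction of total inertia per axis, in decreasing order
    row_coords : principal coordinates of each row (protein) on each axis
    """
    # Drop amino acids that never occur, to avoid dividing by zero mass.
    counts = counts[:, counts.sum(axis=0) > 0]
    P = counts / counts.sum()          # correspondence matrix
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    expected = np.outer(r, c)          # values expected under independence
    S = (P - expected) / np.sqrt(expected)  # standardized residuals
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    inertia = s ** 2
    explained = inertia / inertia.sum()
    row_coords = (U * s) / np.sqrt(r)[:, None]
    return explained, row_coords


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from the study.
    toy = ["MKKLLEEAKK", "GGCCPPGSCC", "VIVIFTYWVI", "ACDEFGHIKLMNPQRSTVWY"]
    explained, coords = correspondence_analysis(aa_counts(toy))
    print("Share of inertia on the first three axes:", explained[:3])
```

With real data, the coordinates of each protein on the first axis could then be correlated with a per-gene property such as GC content at first and second codon positions, which is the kind of comparison the abstract reports.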

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids / analysis*
  • Animals
  • Base Composition*
  • Exons
  • Genome, Human*
  • Humans
  • Molecular Sequence Data
  • Protein Conformation
  • Proteins / chemistry
  • Proteins / genetics*
  • Sequence Analysis, Protein

Substances

  • Amino Acids
  • Proteins