Amino acid metabolism conflicts with protein diversity

Mol Biol Evol. 2014 Nov;31(11):2905-12. doi: 10.1093/molbev/msu228. Epub 2014 Aug 1.

Abstract

The 20 protein-coding amino acids are found in proteomes with different relative abundances. The most abundant amino acid, leucine, is nearly an order of magnitude more prevalent than the least abundant amino acid, cysteine. Amino acid metabolic costs differ similarly, constraining their incorporation into proteins. On the other hand, a diverse set of protein sequences is necessary to build functional proteomes. Here, we present a simple model for a cost-diversity trade-off postulating that natural proteomes minimize amino acid metabolic flux while maximizing sequence entropy. The model explains the relative abundances of amino acids across a diverse set of proteomes. We found that the data are remarkably well explained when the cost function accounts for amino acid chemical decay. More than 100 organisms reach comparable solutions to the trade-off by different combinations of proteome cost and sequence diversity. Quantifying the interplay between proteome size and entropy shows that proteomes can get optimally large and diverse.
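
The cost-diversity trade-off described above can be illustrated with a minimal sketch. The abstract does not give the functional form of the model, so this example assumes the simplest textbook version: maximizing Shannon entropy minus a weight `lam` times the mean per-residue cost, which yields frequencies proportional to exp(-lam * cost). The function names, the four-letter toy alphabet, the cost values, and `lam` are all hypothetical and are not taken from the paper.

```python
import math

def max_entropy_frequencies(costs, lam):
    """Frequencies maximizing H(p) - lam * sum(p_i * c_i).

    Assumed textbook form (not the paper's fitted model):
    p_i = exp(-lam * c_i) / Z.
    """
    weights = [math.exp(-lam * c) for c in costs]
    z = sum(weights)  # normalization constant
    return [w / z for w in weights]

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)

def mean_cost(p, costs):
    """Average cost per residue under frequencies p."""
    return sum(x * c for x, c in zip(p, costs))

# Hypothetical costs for a toy 4-letter alphabet (placeholder values,
# not measured amino acid costs).
toy_costs = [1.0, 2.0, 4.0, 8.0]

for lam in (0.0, 0.25, 1.0):
    p = max_entropy_frequencies(toy_costs, lam)
    print(f"lam={lam:.2f}  mean cost={mean_cost(p, toy_costs):.3f}  "
          f"entropy={entropy(p):.3f}")
```

With `lam = 0` the frequencies are uniform and entropy is maximal. Increasing `lam` shifts weight toward cheap letters, lowering both the mean cost and the entropy, which is the qualitative trade-off the abstract describes.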

Keywords: amino acid decay; amino acid metabolism; information theory; maximum entropy; proteomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adenosine Triphosphate / metabolism
  • Amino Acid Sequence
  • Amino Acids / chemistry
  • Amino Acids / genetics
  • Amino Acids / metabolism*
  • Entropy
  • Genome*
  • Genomic Structural Variation
  • Least-Squares Analysis
  • Models, Biological*
  • Molecular Sequence Data
  • Protein Biosynthesis / genetics*
  • Proteome / chemistry
  • Proteome / genetics
  • Proteome / metabolism*

Substances

  • Amino Acids
  • Proteome
  • Adenosine Triphosphate