Comparing expression level-dependent features in codon usage with protein abundance: an analysis of 'predictive proteomics'

Proteomics. 2004 Jan;4(1):46-58. doi: 10.1002/pmic.200300501.

Abstract

Synonymous codon usage is commonly used to estimate the expression levels of Escherichia coli genes and has also been used to predict highly expressed genes in a number of prokaryotic genomes. By comparing expression level-dependent features in codon usage with protein abundance data from two proteome studies of exponentially growing E. coli and Bacillus subtilis cells, we evaluate whether the implicit assumption of this approach can be confirmed with experimental data. Log-odds ratio scores are used to model differences in codon usage between highly expressed genes and the genomic average. Using these scores, the strength and significance of expression level-dependent features in codon usage were determined for the genes of the E. coli, B. subtilis and Haemophilus influenzae genomes. The comparison confirmed a relationship between codon usage features and protein abundance, although exceptions, possibly related to functional context, were found. For species with expression level-dependent features in their codon usage, the applied methodology could be used to improve in silico simulations of the outcome of two-dimensional gel electrophoresis experiments.
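
The log-odds ratio scoring mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the pseudocount, the base-2 logarithm, and the per-gene averaging are all assumptions made for illustration. As a simplification, codon frequencies are taken over all codons rather than within each synonymous family, which a faithful treatment of synonymous codon usage would require.

```python
import math
from collections import Counter


def codon_counts(seqs):
    """Count in-frame codons over a list of coding sequences (DNA strings)."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        # Ignore a trailing partial codon, if any.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            counts[seq[i:i + 3]] += 1
    return counts


def log_odds_table(high_seqs, genome_seqs, pseudocount=1.0):
    """Per-codon log2 ratio of frequency in a highly expressed reference
    set versus the genomic average. The pseudocount is an assumption that
    avoids division by zero for codons missing from one set."""
    high = codon_counts(high_seqs)
    genome = codon_counts(genome_seqs)
    codons = set(high) | set(genome)
    n_high = sum(high.values()) + pseudocount * len(codons)
    n_genome = sum(genome.values()) + pseudocount * len(codons)
    return {
        c: math.log2(((high[c] + pseudocount) / n_high)
                     / ((genome[c] + pseudocount) / n_genome))
        for c in codons
    }


def gene_score(seq, table):
    """Mean log-odds score over the codons of one gene. Positive values
    suggest codon usage resembling the highly expressed reference set."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    scored = [table[c] for c in codons if c in table]
    return sum(scored) / len(scored) if scored else 0.0


# Toy sequences, invented purely for illustration.
table = log_odds_table(high_seqs=["AAAAAAAAG"], genome_seqs=["AAAAAGAAG"])
score_like_high = gene_score("AAAAAA", table)
score_like_genome = gene_score("AAGAAG", table)
```

In this toy example, a gene built from the codon enriched in the reference set scores above zero and one built from the depleted codon scores below zero. Comparing such per-gene scores with measured protein abundance is the kind of analysis the abstract describes.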

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Bacillus subtilis / genetics
  • Bacillus subtilis / metabolism
  • Bacterial Proteins / genetics*
  • Bacterial Proteins / metabolism
  • Codon / genetics*
  • Codon / metabolism
  • Electrophoresis, Gel, Two-Dimensional
  • Escherichia coli / genetics
  • Escherichia coli / metabolism
  • Gene Expression Profiling
  • Humans
  • Proteomics*
  • RNA, Messenger / genetics*
  • RNA, Messenger / metabolism

Substances

  • Bacterial Proteins
  • Codon
  • RNA, Messenger