Codon choice in genes depends on flanking sequence information--implications for theoretical reverse translation

Nucleic Acids Res. 2008 Feb;36(3):e16. doi: 10.1093/nar/gkm1181. Epub 2008 Jan 18.

Abstract

Algorithms for theoretical reverse translation have direct applications in degenerate PCR. The conventional practice is to create several degenerate primers, each of which variably encodes the peptide region of interest. In the current work, we analyzed, for each codon, the flanking residues in proteins and determined their influence on codon choice. From this, we created a method for theoretical reverse translation that includes information from the flanking residues of the protein in question. Our method, named the neighbor correlation method (NCM), and its enhancement, the consensus-NCM (c-NCM), performed significantly better than the conventional codon-usage statistic method (CSM). Using NCM and c-NCM, we were able to increase the average sequence identity from 77% up to 81%. Furthermore, we revealed a significant increase in coverage at 80% identity, from < 20% (CSM) to > 75% (c-NCM). The algorithms, their applications and implications are discussed herein.
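The contrast between a codon-usage approach and a flanking-residue approach can be illustrated with a minimal Python sketch. This is not the published NCM or c-NCM algorithm, whose details are not given in the abstract. It assumes a simple stand-in in which the codon for each residue is chosen from counts conditioned on the immediately preceding and following amino acids, falling back to plain codon usage when that context was never observed. All function names, the terminus markers and the toy training genes are hypothetical and made up for illustration.

```python
from collections import Counter, defaultdict

START, END = "^", "$"  # hypothetical markers for the protein termini


def flanks(protein, i):
    """Return the residues immediately before and after position i."""
    prev_aa = protein[i - 1] if i > 0 else START
    next_aa = protein[i + 1] if i < len(protein) - 1 else END
    return prev_aa, next_aa


def train(genes):
    """Count codons per amino acid, overall and per flanking context.

    genes: iterable of (protein, coding_dna), with len(dna) == 3 * len(protein).
    """
    usage = defaultdict(Counter)    # amino acid -> codon counts
    context = defaultdict(Counter)  # (prev, aa, next) -> codon counts
    for protein, dna in genes:
        for i, aa in enumerate(protein):
            codon = dna[3 * i:3 * i + 3]
            prev_aa, next_aa = flanks(protein, i)
            usage[aa][codon] += 1
            context[(prev_aa, aa, next_aa)][codon] += 1
    return usage, context


def reverse_translate_usage(protein, usage):
    """Codon-usage baseline: most frequent codon for each amino acid."""
    return "".join(usage[aa].most_common(1)[0][0] for aa in protein)


def reverse_translate_flanking(protein, usage, context):
    """Assumed flanking-aware variant: prefer counts seen in the same context."""
    codons = []
    for i, aa in enumerate(protein):
        prev_aa, next_aa = flanks(protein, i)
        # Fall back to overall usage if this context was never observed.
        counts = context.get((prev_aa, aa, next_aa)) or usage[aa]
        codons.append(counts.most_common(1)[0][0])
    return "".join(codons)


def identity(predicted, actual):
    """Fraction of nucleotide positions that match."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


def coverage(identities, threshold=0.8):
    """Fraction of genes predicted at or above the identity threshold."""
    return sum(x >= threshold for x in identities) / len(identities)


# Made-up toy genes, not real data; the target is also a training gene.
TOY_GENES = [
    ("MAK", "ATGGCGAAA"),
    ("MAL", "ATGGCCCTG"),
    ("KAK", "AAAGCGAAA"),
    ("LAL", "CTGGCCCTG"),
    ("GAL", "GGCGCCCTG"),
]

usage, context = train(TOY_GENES)
target_protein, target_dna = "MAK", "ATGGCGAAA"
by_usage = reverse_translate_usage(target_protein, usage)
by_flank = reverse_translate_flanking(target_protein, usage, context)
print(by_usage, identity(by_usage, target_dna))
print(by_flank, identity(by_flank, target_dna))
```

In this toy set, alanine is most often encoded by GCC overall, so the usage baseline picks GCC, while the context of a preceding M and a following K favours GCG. The sketch raises an IndexError for amino acids absent from the training data, and a real evaluation would hold the target genes out of training.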

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Bacterial Proteins / chemistry
  • Bacterial Proteins / genetics*
  • Codon*
  • Escherichia coli K12 / genetics
  • Genes, Bacterial
  • Genome, Bacterial
  • Models, Genetic
  • Polymerase Chain Reaction
  • Probability
  • Protein Biosynthesis
  • Salmonella typhi / genetics
  • Sequence Analysis, Protein / methods*

Substances

  • Bacterial Proteins
  • Codon