Amino acid translation program for full-length cDNA sequences with frameshift errors

Y Fukunishi; Y Hayashizaki

doi:10.1152/physiolgenomics.2001.5.2.81


Physiol Genomics. 2001 Mar 8;5(2):81-7. doi: 10.1152/physiolgenomics.2001.5.2.81.

Authors

Y Fukunishi¹, Y Hayashizaki

Affiliation

¹ Laboratory for Genome Exploration Research Group, RIKEN Genomic Sciences Center, RIKEN Yokohama Institute, Yokohama City, Kanagawa 230-0045, Japan.

PMID: 11242592
DOI: 10.1152/physiolgenomics.2001.5.2.81

Abstract

Here we present an amino acid translation program designed to suggest the position of experimental frameshift errors and predict amino acid sequences for full-length cDNA sequences having phred scores. Our program generates artificial insertions into, or artificial deletions from, low-accuracy positions of the original sequence, thereby generating many candidate sequences. The validity of the most probable sequence (the likelihood that it represents the actual protein) is evaluated by using a score (V(a)) that is calculated in light of the Kozak consensus, preferred codon usage, and position of the initiation codon. To evaluate the software, we have used a database in which, out of 612 cDNA sequences, 524 (86%) carried 773 frameshift errors in the coding sequence. Our software detected and corrected 48% of the total frameshift errors in 62% of the total cDNA sequences with frameshift errors. The false positive rate of frameshift correction was 9%, and 91% of the suggested frameshifts were true.
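
The candidate-generation idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' program: the function names, the phred threshold of 20, the restriction to a single edit per candidate, and the use of the longest stop-terminated ORF as the ranking criterion are all assumptions made for illustration. In particular, the ORF length is only a stand-in for the V(a) score, whose formula (Kozak consensus, codon usage, initiation codon position) is not given in the abstract.

```python
from itertools import product

# Standard genetic code, listed in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def candidates(seq, phred, threshold=20):
    """Yield (label, sequence) for the original read and for every
    single-base deletion or insertion at a low-quality position.
    The threshold is a hypothetical choice, not taken from the paper."""
    yield ("original", seq)
    for i, q in enumerate(phred):
        if q < threshold:
            yield (f"del@{i}", seq[:i] + seq[i + 1:])
            # "N" marks an inserted base of unknown identity.
            yield (f"ins@{i}", seq[:i] + "N" + seq[i:])


def longest_orf(seq):
    """Longest protein that starts at M and ends at a stop codon,
    searched over the three forward reading frames."""
    best = ""
    for frame in range(3):
        protein = "".join(
            CODON_TABLE.get(seq[j:j + 3], "X")  # "X" for codons containing N
            for j in range(frame, len(seq) - 2, 3)
        )
        # Drop the last piece: it is not terminated by a stop codon.
        for segment in protein.split("*")[:-1]:
            start = segment.find("M")
            if start >= 0 and len(segment) - start > len(best):
                best = segment[start:]
    return best


def best_candidate(seq, phred, threshold=20):
    """Rank candidates by ORF length (stand-in score, not V(a)).
    Ties keep the earliest candidate, so the original read wins a tie."""
    scored = ((label, longest_orf(s)) for label, s in candidates(seq, phred, threshold))
    return max(scored, key=lambda item: len(item[1]))


# Toy read: a spurious G at index 9 breaks the frame of ATG GCT GCT AAA GGT GAA TTC CTG TAA.
read = "ATGGCTGCTGAAAGGTGAATTCCTGTAA"
phred = [40] * len(read)
phred[9] = 8  # the only low-accuracy position

print(longest_orf(read))           # MAAER
print(best_candidate(read, phred)) # ('del@9', 'MAAKGEFL')
```

In this toy read, deleting the low-quality base restores the full reading frame, so the deletion candidate outranks the unedited read. A realistic scorer would also need to penalize each suggested edit, since otherwise longer ORFs created by chance would inflate the false positive rate.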

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Base Composition
Base Sequence
Codon / genetics
Codon, Initiator / genetics
Computational Biology / methods*
Consensus Sequence / genetics
DNA, Complementary / genetics
Databases as Topic
Exons / genetics
False Positive Reactions
Frameshift Mutation / genetics*
Internet
Likelihood Functions
Monte Carlo Method
Mutagenesis, Insertional / genetics
Open Reading Frames / genetics*
Protein Biosynthesis / genetics
Research Design
Sensitivity and Specificity
Sequence Analysis, DNA / methods
Sequence Deletion / genetics
Software*

Substances

Codon
Codon, Initiator
DNA, Complementary