Recent advances in gene structure prediction

Curr Opin Struct Biol. 2004 Jun;14(3):264-72. doi: 10.1016/j.sbi.2004.05.007.

Abstract

De novo gene predictors are programs that predict the exon-intron structures of genes using the sequences of one or more genomes as their only input. In the past two years, dual-genome de novo predictors, which exploit local rates and patterns of mutation inferred from alignments between two genomes, have led to significant improvements in accuracy. Systems that exploit more than two genomes simultaneously have only recently begun to appear and are not yet competitive on practical tasks, but offer the greatest hope for near-term improvements. Dual-genome de novo prediction for compact eukaryotic genomes such as those of Arabidopsis thaliana and Caenorhabditis elegans is already quite accurate. Although mammalian gene prediction lags behind in accuracy, it is yielding ever more useful results. Coupled with significant improvements in pseudogene detection methods, which have eliminated many false positives, these advances have brought us to the point where de novo gene predictions are being used as hypotheses to drive experimental annotation via systematic RT-PCR and sequencing.
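
As a toy illustration of what a "pattern of mutation inferred from alignments between two genomes" can look like, the sketch below counts mismatches by codon position in a gapless pairwise alignment. In protein-coding sequence, substitutions tend to concentrate at the third codon position, so a strong three-periodic mismatch pattern is one hint that a region may be coding. This is a simplified sketch, not the method of any predictor discussed in the review; the function names and the example sequences are hypothetical and chosen only for illustration.

```python
def mismatches_by_codon_position(ref, other, frame=0):
    """Count mismatches at codon positions 1, 2, 3 of an aligned pair.

    `ref` and `other` are two aligned sequences of equal length
    (hypothetical inputs). `frame` is the offset of the first codon.
    Columns before the frame offset or containing a gap are skipped.
    """
    if len(ref) != len(other):
        raise ValueError("aligned sequences must have equal length")
    counts = [0, 0, 0]
    for i, (x, y) in enumerate(zip(ref, other)):
        if i < frame or "-" in (x, y):
            continue
        if x != y:
            counts[(i - frame) % 3] += 1
    return counts


def third_position_fraction(counts):
    """Fraction of all mismatches that fall on the third codon position."""
    total = sum(counts)
    return counts[2] / total if total else 0.0


# Made-up example: every difference is a synonymous third-position change.
ref_seq = "ATGGCTAAAGGT"
other_seq = "ATGGCCAAGGGA"
counts = mismatches_by_codon_position(ref_seq, other_seq)
print(counts, third_position_fraction(counts))
```

Scoring the same alignment in all three frames and comparing the third-position fractions is one simple way to see how the signal depends on the assumed reading frame; real dual-genome predictors fold this kind of evidence into probabilistic models rather than using raw counts.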

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.
  • Review

MeSH terms

  • Animals
  • Base Sequence
  • Exons
  • Genes*
  • Genomics / trends*
  • Humans
  • Introns
  • Models, Molecular*
  • Nucleic Acid Conformation
  • Proteins / chemistry
  • Proteins / genetics
  • Proteins / metabolism

Substances

  • Proteins