Finding genes in genome sequence

Methods Mol Biol. 2008:452:163-77. doi: 10.1007/978-1-60327-159-2_8.

Abstract

Gene-finding is concerned with the identification of stretches of DNA in a genomic sequence that encode biologically active products, such as proteins or functional non-coding RNAs. This is usually the first step in the analysis of any novel piece of genomic sequence, and an important one, because all downstream analyses depend on its results. This chapter focuses on the biological basis, computational approaches, and corresponding programs that are available for the automated identification of protein-coding genes. The state of the art in automated gene finding is described for prokaryotic and eukaryotic genomes, as well as for the novel, multi-species sequence data originating from environmental community studies.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Electronic Data Processing / methods*
  • Eukaryotic Cells
  • Genes*
  • Prokaryotic Cells
  • Proteins / genetics*
  • RNA, Untranslated / genetics*
  • Sequence Analysis, DNA / methods*

Substances

  • Proteins
  • RNA, Untranslated