Exploration and grading of possible genes from 183 bacterial strains by a common protocol to identification of new genes: Gene Trek in Prokaryote Space (GTPS)

DNA Res. 2006 Dec 31;13(6):245-54. doi: 10.1093/dnares/dsl014. Epub 2006 Dec 13.

Abstract

A large number of complete microorganism genomes have been sequenced, submitted to public databases, and then incorporated into our complete genome database, Genome Information Broker (GIB, http://gib.genes.nig.ac.jp/). However, researchers carrying out comparative genomics must be aware that some annotated protein-coding genes are not confirmed by homology or motif searches and that some reliable protein-coding genes are missing from the annotations. We therefore developed a protocol, Gene Trek in Prokaryote Space (GTPS), for finding possible protein-coding genes in bacterial genomes. GTPS assigns a degree of reliability to each predicted protein-coding gene. We first applied the protocol systematically to the complete genomes of all 123 bacterial species and strains that were publicly available as of July 2003, and then to those of the 183 species and strains available as of September 2004. We found a number of incorrectly annotated genes and several new ones in the genome data in question. We also found a way to estimate the total number of orthologous genes in the bacterial world.
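As a rough illustration of what assigning a degree of reliability to predicted genes can look like, the sketch below grades candidate open reading frames by whether they are supported by a homology hit, a motif hit, both, or neither. The abstract does not specify the GTPS grading criteria, so the grade labels (`A`, `B`, `C`), the `Candidate` record, and the `grade` and `summarize` functions are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate protein-coding ORF with two kinds of supporting evidence.

    The fields are illustrative assumptions, not the GTPS data model.
    """
    orf_id: str
    has_homology_hit: bool  # e.g. a significant similarity-search hit
    has_motif_hit: bool     # e.g. a match to a known protein motif


def grade(candidate: Candidate) -> str:
    """Return a hypothetical reliability grade for one candidate.

    A: supported by both homology and motif evidence
    B: supported by exactly one kind of evidence
    C: supported by neither (unconfirmed)
    """
    if candidate.has_homology_hit and candidate.has_motif_hit:
        return "A"
    if candidate.has_homology_hit or candidate.has_motif_hit:
        return "B"
    return "C"


def summarize(candidates: list[Candidate]) -> dict[str, int]:
    """Count how many candidates fall into each hypothetical grade."""
    counts = {"A": 0, "B": 0, "C": 0}
    for candidate in candidates:
        counts[grade(candidate)] += 1
    return counts


if __name__ == "__main__":
    # Made-up example records, used only to exercise the functions.
    examples = [
        Candidate("orf_001", True, True),
        Candidate("orf_002", True, False),
        Candidate("orf_003", False, False),
    ]
    for c in examples:
        print(c.orf_id, grade(c))
    print(summarize(examples))
```

Applying one grading function of this kind to every genome is what makes counts comparable across strains; annotated genes that fall in the lowest grade would be the ones to treat with caution in comparative analyses.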

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacteria / classification*
  • Bacteria / genetics
  • Computational Biology
  • DNA, Bacterial / genetics
  • Database Management Systems
  • Genes, Bacterial*
  • Genetics, Microbial*
  • Genome, Bacterial*
  • Open Reading Frames
  • Prokaryotic Cells

Substances

  • DNA, Bacterial