Genome Properties: a system for the investigation of prokaryotic genetic content for microbiology, genome annotation and comparative genomics

Bioinformatics. 2005 Feb 1;21(3):293-306. doi: 10.1093/bioinformatics/bti015. Epub 2004 Sep 3.

Abstract

Motivation: The presence or absence of metabolic pathways and structures provides a context that makes protein annotation far more reliable. Compiling such information across microbial genomes improves the functional classification of proteins and provides a valuable resource for comparative genomics.

Results: We have created a Genome Properties system to present key aspects of prokaryotic biology using standardized computational methods and controlled vocabularies. Properties reflect gene content, phenotype, phylogeny and computational analyses. The results of searches using hidden Markov models allow many properties to be deduced automatically, especially for families of proteins (equivalogs) conserved in function since their last common ancestor. Additional properties are derived from curation, published reports and other forms of evidence. The Genome Properties system was applied to 156 complete prokaryotic genomes, and is easily mined to find differences between species, correlations between metabolic features and families of uncharacterized proteins, or relationships among properties.
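
As an illustration only, the automatic deduction of properties from hidden Markov model hits, and the mining of differences between species, might be sketched as follows. This is a minimal sketch under stated assumptions, not the system's actual implementation: the property names, the HMM identifiers, the three-state outcome (YES, PARTIAL, NO) and all function names are hypothetical and do not come from the abstract.

```python
from typing import Dict, List, Set

# Hypothetical property definitions: each property lists the HMM families
# (e.g. equivalogs) assumed to be required as evidence for it.
PROPERTIES: Dict[str, Set[str]] = {
    "example_pathway_A": {"HMM_0001", "HMM_0002", "HMM_0003"},
    "example_structure_B": {"HMM_0010", "HMM_0011"},
}


def property_state(required: Set[str], hits: Set[str]) -> str:
    """Assumed rule: YES if every required family has a hit,
    PARTIAL if only some do, NO if none do."""
    found = required & hits
    if found == required:
        return "YES"
    if found:
        return "PARTIAL"
    return "NO"


def evaluate_genome(hits: Set[str]) -> Dict[str, str]:
    """Assign a state to every property from one genome's set of HMM hits."""
    return {name: property_state(req, hits) for name, req in PROPERTIES.items()}


def differing_properties(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """List the properties whose state differs between two genomes."""
    return sorted(name for name in a if a[name] != b[name])


# Made-up hit sets for two hypothetical genomes.
genome_x = evaluate_genome({"HMM_0001", "HMM_0002", "HMM_0003", "HMM_0010"})
genome_y = evaluate_genome({"HMM_0001"})
differences = differing_properties(genome_x, genome_y)
```

Under these assumptions, a comparison between two genomes reduces to comparing two small dictionaries of property states, which is the kind of query the abstract describes as easy to mine.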

Availability: Genome Properties can be found at http://www.tigr.org/Genome_Properties

Supplementary information: http://www.tigr.org/tigr-scripts/CMR2/genome_properties_references.spl.

Publication types

  • Evaluation Study
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Chromosome Mapping / methods*
  • Database Management Systems*
  • Databases, Genetic*
  • Documentation / methods
  • Gene Expression Profiling / methods
  • Gene Expression Regulation / physiology
  • Genomics / methods*
  • Information Storage and Retrieval / methods*
  • Microbiological Techniques / methods
  • Natural Language Processing
  • Prokaryotic Cells / physiology*
  • Proteome / metabolism
  • Signal Transduction / physiology
  • Software
  • User-Computer Interface*
  • Vocabulary, Controlled

Substances

  • Proteome