Exploring microbial genome sequences to identify protein families on the grid

IEEE Trans Inf Technol Biomed. 2007 Jul;11(4):435-42. doi: 10.1109/titb.2007.892913.

Abstract

The analysis of microbial genome sequences can identify protein families that provide potential drug targets for new antibiotics. With the rapid accumulation of newly sequenced genomes, this analysis has become both computationally intensive and data-intensive. This paper describes the development of Microbase, a Web-service-enabled, component-based architecture that supports the large-scale comparative analysis of complete microbial genome sequences and the subsequent identification of orthologues and protein families. The system is coordinated through Web-service-based notifications and integrates distributed computing resources with genomic databases to perform all-against-all comparisons across a large volume of genome sequences and to present the results in a computationally amenable format through a Web service interface. We demonstrate the use of the system in searching for orthologues and candidate protein families, which could ultimately lead to the identification of potential therapeutic targets.
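
To make the all-against-all comparison and orthologue search more concrete, the following is a minimal Python sketch, not the system's actual implementation. The abstract does not state which orthology criterion is used; the reciprocal-best-hit heuristic shown here is a common choice and is an assumption, as are all function names, the `(genome, protein_id)` tuple layout, and the idea that similarity scores arrive as simple `(query, subject, score)` records.

```python
from itertools import permutations


def genome_pairs(genomes):
    """Enumerate every ordered (query, subject) pair of distinct genomes.

    This is the job list an all-against-all comparison would have to cover.
    """
    return list(permutations(genomes, 2))


def best_hits(hits):
    """Return the best-scoring subject protein per (query protein, subject genome).

    `hits` is an iterable of (query, subject, score) records, where query and
    subject are hypothetical (genome, protein_id) tuples. Hits within the same
    genome are ignored; on a tied score the first record seen is kept.
    """
    best = {}
    for query, subject, score in hits:
        if query[0] == subject[0]:
            continue  # skip same-genome hits
        key = (query, subject[0])
        if key not in best or score > best[key][1]:
            best[key] = (subject, score)
    return {key: subject for key, (subject, _score) in best.items()}


def reciprocal_best_hits(hits):
    """Return candidate orthologue pairs as a set of two-element frozensets.

    Two proteins are paired only if each is the other's best hit in the
    other's genome (assumed criterion, see the lead-in).
    """
    best = best_hits(hits)
    pairs = set()
    for (query, _subject_genome), subject in best.items():
        if best.get((subject, query[0])) == query:
            pairs.add(frozenset((query, subject)))
    return pairs


# Illustrative, made-up similarity records for two genomes.
example_hits = [
    (("gA", "a1"), ("gB", "b1"), 250.0),
    (("gA", "a1"), ("gB", "b2"), 90.0),
    (("gB", "b1"), ("gA", "a1"), 245.0),
    (("gB", "b2"), ("gA", "a1"), 88.0),
    (("gA", "a2"), ("gB", "b2"), 60.0),
]
candidate_orthologues = reciprocal_best_hits(example_hits)
```

The number of ordered genome pairs grows quadratically with the number of genomes, which illustrates why the abstract describes the analysis as computationally intensive and why the comparisons are distributed across computing resources.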

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bacterial Proteins / classification*
  • Bacterial Proteins / genetics*
  • Chromosome Mapping / methods*
  • Databases, Protein
  • Genome, Bacterial / genetics*
  • Information Storage and Retrieval / methods*
  • Internet*
  • Multigene Family / genetics*

Substances

  • Bacterial Proteins