Using meta computing tools to facilitate large-scale analyses of biological databases

Pac Symp Biocomput. 2001:360-71. doi: 10.1142/9789814447362_0035.

Abstract

Given the high rate at which biological data are being collected and made public, it is essential to develop computational tools that can access and analyze these data efficiently. High-performance distributed computing resources can play a key role in enabling large-scale analyses of biological databases. We use a distributed computing environment, Legion, to enable large-scale computations on the Protein Data Bank (PDB). In particular, we employ the Feature program to scan all protein structures in the PDB in search of unrecognized potential cation binding sites. We evaluate the efficiency of Legion's parallel execution capabilities and analyze the initial biological implications of a site annotation scan of the entire PDB. We discuss four interesting proteins with unannotated, high-scoring candidate cation binding sites.

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Binding Sites
  • Cations / metabolism
  • Databases, Factual*
  • Models, Molecular
  • Protein Conformation
  • Proteins / chemistry*
  • Proteins / metabolism
  • Software*

Substances

  • Cations
  • Proteins