Squid - a simple bioinformatics grid

BMC Bioinformatics. 2005 Aug 3:6:197. doi: 10.1186/1471-2105-6-197.

Abstract

Background: BLAST is a widely used genetic research tool for analysis of similarity between nucleotide and protein sequences. This paper presents a software application entitled "Squid" that makes use of grid technology. The current version, as an example, is configured for BLAST applications, but adaptation for other computing intensive repetitive tasks can be easily accomplished in the open source version. This enables the allocation of remote resources to perform distributed computing, making large BLAST queries viable without the need of high-end computers.

Results: Most distributed computing / grid solutions have complex installation procedures requiring a computer specialist, or have limitations regarding operating systems. Squid is a multi-platform, open-source program designed to "keep things simple" while offering high-end computing power for large scale applications. Squid also has an efficient fault tolerance and crash recovery system against data loss, being able to re-route jobs upon node failure and recover even if the master machine fails. Our results show that a Squid application, working with N nodes and proper network resources, can process BLAST queries almost N times faster than if working with only one computer.

Conclusion: Squid offers high-end computing, even for the non-specialist, and is freely available at the project web site. Its open-source and binary Windows distributions contain detailed instructions and a "plug-n-play" instalation containing a pre-configured example.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Computational Biology*
  • Computer Systems
  • Databases, Protein*
  • Internet
  • Online Systems
  • Sequence Analysis, Protein / methods*
  • Software*
  • User-Computer Interface