Parallel computation and FASTA: confronting the problem of parallel database search for a fast sequence comparison algorithm

P L Miller; P M Nadkarni; N M Carriero

doi:10.1093/bioinformatics/7.1.71

Comput Appl Biosci. 1991 Jan;7(1):71-8. doi: 10.1093/bioinformatics/7.1.71.

Authors

P L Miller¹, P M Nadkarni, N M Carriero

Affiliation

¹ Department of Anesthesiology, Yale University School of Medicine, New Haven, CT 06510.

PMID: 2004277
DOI: 10.1093/bioinformatics/7.1.71

Abstract

We have parallelized the FASTA algorithm for biological sequence comparison using Linda, a machine-independent parallel programming language. The resulting parallel program runs on a variety of parallel machines. A straightforward parallelization strategy works well when the amount of computation to be done is relatively large. When the amount of computation is reduced, however, disk I/O becomes a bottleneck that may prevent additional speed-up as the number of processors is increased. The paper describes the parallelization of FASTA and uses FASTA to illustrate the I/O bottleneck that may arise when performing parallel database search with a fast sequence comparison algorithm. It also describes several program design strategies that can help with this problem, and discusses how the bottleneck exemplifies a general problem that may occur when parallelizing, or otherwise speeding up, a time-consuming computation.
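The I/O bottleneck described in the abstract can be illustrated with a simple timing model. The sketch below is not taken from the paper: it assumes that disk I/O is entirely serial, that comparison work divides evenly among workers, and that the two do not overlap. The function name `modeled_speedup` and all timing values are hypothetical, chosen only to contrast a compute-heavy search with a compute-light one.

```python
def modeled_speedup(compute_seconds, io_seconds, workers):
    """Speed-up over a single worker under an assumed model:
    I/O is serial (not divided), compute is split evenly across workers."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_time = io_seconds + compute_seconds
    parallel_time = io_seconds + compute_seconds / workers
    return serial_time / parallel_time


if __name__ == "__main__":
    # Hypothetical workloads: same I/O cost, very different compute cost.
    cases = [("heavy compute", 1000.0, 10.0), ("light compute", 50.0, 10.0)]
    for label, compute, io in cases:
        row = ", ".join(
            f"p={p}: {modeled_speedup(compute, io, p):.1f}x" for p in (1, 4, 16, 64)
        )
        print(f"{label}: {row}")
```

Under these assumptions the speed-up can never exceed (I/O time + compute time) / I/O time, so a faster comparison algorithm lowers the ceiling on what additional processors can deliver.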

Publication types

Research Support, U.S. Gov't, P.H.S.

MeSH terms

Algorithms*
Amino Acid Sequence
Computer Systems*
Mathematical Computing*
Molecular Sequence Data
Programming Languages
Software Design

Grants and funding

R01 LM05044/LM/NLM NIH HHS/United States