Accelerating pairwise statistical significance estimation for local alignment by harvesting GPU's power

BMC Bioinformatics. 2012 Apr 12;13 Suppl 5(Suppl 5):S3. doi: 10.1186/1471-2105-13-S5-S3.

Abstract

Background: Pairwise statistical significance has been recognized as an accurate means of identifying related sequences, a cornerstone procedure in numerous bioinformatics applications. However, estimating it is both computationally and data intensive, which poses a major challenge for performance and scalability.

Results: We present a GPU implementation that accelerates pairwise statistical significance estimation of local sequence alignment using standard substitution matrices. By carefully studying the algorithm's data access characteristics, we developed a tile-based scheme that produces contiguous data access in GPU global memory and sustains a large number of threads to achieve high GPU occupancy. We further extend the parallelization technique to estimate pairwise statistical significance using position-specific substitution matrices, an approach that has earlier demonstrated significantly better sequence comparison accuracy than using standard substitution matrices. The implementation is also extended to take advantage of dual GPUs. We observe end-to-end speedups of nearly 250× on a single Tesla C2050 GPU and nearly 370× on dual Tesla C2050 GPUs over the CPU implementation running on an Intel Core i7 920 processor.
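
To make the workload concrete, the following is a minimal CPU-side sketch of one common way to estimate the significance of a single pairwise local alignment: score the pair, rescore against many shuffled copies of one sequence, and fit an extreme value (Gumbel) distribution to the shuffled scores. It is an illustration under stated assumptions, not the implementation described here. The function names `sw_score` and `pairwise_p_value`, the match/mismatch/linear-gap scoring (a stand-in for a substitution matrix with affine gaps), and the method-of-moments Gumbel fit are all choices made for this example. Each shuffled alignment is independent of the others, which is the kind of repeated work a GPU can spread across many threads.

```python
import math
import random


def sw_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score (toy scoring, linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            if score > best:
                best = score
        prev = curr
    return best


def pairwise_p_value(a, b, n_shuffles=200, seed=0):
    """Return (observed score, estimated p-value) for the pair (a, b).

    Assumption for this sketch: the null distribution is built by
    shuffling b, and the Gumbel parameters are fitted by the method of
    moments rather than by maximum likelihood.
    """
    rng = random.Random(seed)
    observed = sw_score(a, b)

    # Each iteration is an independent alignment against a shuffled copy.
    chars = list(b)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(chars)
        null_scores.append(sw_score(a, chars))

    n = len(null_scores)
    mean = sum(null_scores) / n
    var = sum((s - mean) ** 2 for s in null_scores) / (n - 1)
    beta = math.sqrt(var) * math.sqrt(6.0) / math.pi

    if beta == 0.0:
        # Degenerate null sample: fall back to an empirical estimate.
        hits = sum(1 for s in null_scores if s >= observed)
        return observed, (hits + 1) / (n + 1)

    mu = mean - 0.5772156649015329 * beta  # Euler-Mascheroni constant
    z = (observed - mu) / beta
    # P(S >= observed) = 1 - exp(-exp(-z)), written with expm1 for accuracy.
    p_value = -math.expm1(-math.exp(-z))
    return observed, p_value


if __name__ == "__main__":
    query = "MKVLAAGIVGLLLAQPSWAHEERTYQDCLNMFGKP"
    subject = "MKVLAAGIVALLLAQPSWAHDERTYQDCLNMFGRP"
    score, p = pairwise_p_value(query, subject)
    print(f"score={score} p-value={p:.3g}")
```

With `n_shuffles` alignments per sequence pair and many pairs per database search, the total cost grows quickly, which is consistent with the computational burden noted in the Background.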

Conclusions: Harvesting the high performance of modern GPUs is a promising approach to accelerate pairwise statistical significance estimation for local sequence alignment.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Graphics / instrumentation*
  • Sequence Alignment / instrumentation
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / instrumentation
  • Sequence Analysis, Protein / methods*
  • Software