COACH: profile-profile alignment of protein families using hidden Markov models

Bioinformatics. 2004 May 22;20(8):1309-18. doi: 10.1093/bioinformatics/bth091. Epub 2004 Feb 12.

Abstract

Motivation: Alignments of two multiple-sequence alignments, or statistical models of such alignments (profiles), have important applications in computational biology. The increased amount of information in a profile versus a single sequence can lead to more accurate alignments and more sensitive homolog detection in database searches. Several profile-profile alignment methods have been proposed and have been shown to improve sensitivity and alignment quality compared with sequence-sequence methods (such as BLAST) and profile-sequence methods (e.g. PSI-BLAST). Here we present a new approach to profile-profile alignment we call Comparison of Alignments by Constructing Hidden Markov Models (HMMs) (COACH). COACH aligns two multiple sequence alignments by constructing a profile HMM from one alignment and aligning the other to that HMM.

Results: We compare the alignment accuracy of COACH with two recently published methods: Yona and Levitt's prof_sim and Sadreyev and Grishin's COMPASS. On two sets of reference alignments selected from the FSSP database, we find that COACH is able, on average, to produce alignments giving the best coverage or the fewest errors, depending on the chosen parameter settings.

Availability: COACH is freely available from www.drive5.com/lobster

Publication types

  • Comparative Study
  • Evaluation Study
  • Validation Study

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Gene Expression Profiling / methods*
  • Markov Chains
  • Models, Genetic*
  • Models, Statistical
  • Molecular Sequence Data
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Alignment / methods*
  • Sequence Analysis, Protein / methods*
  • Sequence Homology, Amino Acid
  • Software*