The use of structure information to increase alignment accuracy does not aid homologue detection with profile HMMs

Bioinformatics. 2002 Sep;18(9):1243-9. doi: 10.1093/bioinformatics/18.9.1243.

Abstract

Motivation: The best-quality multiple sequence alignments are generally considered to derive from structural superposition. However, no previous work has studied the relative performance of profile hidden Markov models (HMMs) derived from such alignments. Therefore, several alignment methods have been used to generate multiple sequence alignments from 348 structurally aligned families in the HOMSTRAD database. The performance of profile HMMs derived from the structural and sequence-based alignments has been assessed for homologue detection.

Results: The best alignment methods studied here correctly align nearly 80% of residues with respect to structure alignments. Alignment quality and model sensitivity are found to be dependent on average number, length, and identity of sequences in the alignment. The striking conclusion is that, although structural data may improve the quality of multiple sequence alignments, this does not add to the ability of the derived profile HMMs to find sequence homologues.
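
To illustrate what "correctly aligned residues with respect to structure alignments" can mean in practice, the sketch below computes a sum-of-pairs style accuracy: the fraction of residue pairs aligned in a reference (structural) alignment that are also aligned in a test (sequence-based) alignment. The abstract does not state which accuracy measure the authors used, so this metric, the function names `aligned_pairs` and `sum_of_pairs_accuracy`, and the toy sequences are assumptions chosen for illustration only.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings (hypothetical input
    format). Each residue is identified as (sequence_index, residue_index),
    where residue_index counts non-gap characters from the left.
    """
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all rows must have the same length")

    counters = [0] * len(alignment)  # next residue index for each sequence
    pairs = set()
    for col in range(n_cols):
        present = []
        for s, row in enumerate(alignment):
            if row[col] != gap:
                present.append((s, counters[s]))
                counters[s] += 1
        # Every pair of residues in this column counts as aligned.
        pairs.update(combinations(present, 2))
    return pairs


def sum_of_pairs_accuracy(test, reference):
    """Fraction of reference-aligned residue pairs reproduced by `test`."""
    ref = aligned_pairs(reference)
    if not ref:
        return 0.0
    return len(ref & aligned_pairs(test)) / len(ref)


# Toy example (made-up sequences, not data from the study).
reference = ["AC-DE", "ACGDE"]  # stands in for a structure-based alignment
test = ["ACD-E", "ACGDE"]       # stands in for a sequence-based alignment
print(sum_of_pairs_accuracy(test, reference))  # 0.75
```

In this toy case three of the four residue pairs aligned in the reference are recovered by the test alignment, giving 0.75. Both alignments must contain the same ungapped sequences in the same order for the comparison to be meaningful.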

Supplementary information: A list of HOMSTRAD families used in this study and the corresponding Pfam families is available at http://www.sanger.ac.uk/Users/sgj/alignments/map.html

Contact: [email protected]

Publication types

  • Comparative Study

MeSH terms

  • Amino Acid Sequence
  • Database Management Systems*
  • Databases, Genetic*
  • Evaluation Studies as Topic
  • Gene Expression Profiling / methods*
  • Information Storage and Retrieval / methods*
  • Internet
  • Markov Chains
  • Models, Genetic
  • Models, Statistical
  • Molecular Sequence Data
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Sequence Alignment / methods*
  • Sequence Alignment / standards
  • Sequence Homology*