Simultaneous sequence alignment and tree construction using hidden Markov models

Pac Symp Biocomput. 2003:180-91.

Abstract

We present a new algorithm (SATCHMO) that, given a set of protein sequences, simultaneously estimates a tree and generates a set of multiple sequence alignments. An alignment is constructed for each node in the tree. Each alignment predicts the structurally conserved elements of the sequences in its subtree, so alignments at different nodes differ in length and represent different amino acid preferences. Hidden Markov models (HMMs) are also generated for each node and are used to determine branching order, to align sequences and to predict structurally alignable regions. In experiments on the BAliBASE benchmark alignment database, SATCHMO is shown to perform comparably to ClustalW and the UCSC SAM HMM software. Results using SATCHMO to identify protein domains are demonstrated on potassium channels, with implications for the mechanism by which tumor necrosis factor alpha affects potassium current.
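
The idea of letting per-node models decide the branching order can be illustrated with a toy sketch. This is not the SATCHMO implementation: it assumes the sequences are already aligned to equal length with no gaps, and it replaces the HMMs with simple per-column frequency profiles. The names `build_profile`, `pair_score`, `build_tree`, the flat pseudocount, and the symmetric mean log-probability score are all hypothetical choices made for illustration only.

```python
import math
from itertools import combinations

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def build_profile(seqs, pseudocount=1.0):
    """Per-column amino acid probabilities (a stand-in for a node's HMM).

    Assumes all sequences have the same length and contain no gaps.
    """
    profile = []
    for col in range(len(seqs[0])):
        counts = {aa: pseudocount for aa in ALPHABET}
        for s in seqs:
            counts[s[col]] += 1.0
        total = sum(counts.values())
        profile.append({aa: c / total for aa, c in counts.items()})
    return profile


def log_score(profile, seq):
    """Log-probability of one sequence under a profile."""
    return sum(math.log(col[aa]) for col, aa in zip(profile, seq))


def pair_score(a, b):
    """Symmetric score: mean log-probability of each node's sequences
    under the other node's profile (an illustrative choice)."""
    ab = sum(log_score(a["profile"], s) for s in b["seqs"]) / len(b["seqs"])
    ba = sum(log_score(b["profile"], s) for s in a["seqs"]) / len(a["seqs"])
    return 0.5 * (ab + ba)


def build_tree(named_seqs):
    """Agglomerative tree building: repeatedly join the best-scoring pair
    of nodes and build a new profile for the joined node."""
    nodes = [
        {"tree": name, "seqs": [seq], "profile": build_profile([seq])}
        for name, seq in named_seqs.items()
    ]
    while len(nodes) > 1:
        i, j = max(
            combinations(range(len(nodes)), 2),
            key=lambda p: pair_score(nodes[p[0]], nodes[p[1]]),
        )
        a, b = nodes[i], nodes[j]
        seqs = a["seqs"] + b["seqs"]
        merged = {
            "tree": (a["tree"], b["tree"]),
            "seqs": seqs,
            "profile": build_profile(seqs),
        }
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]["tree"]


if __name__ == "__main__":
    # Made-up toy sequences, not data from the paper.
    example = {"s1": "ACDEF", "s2": "ACDEY", "s3": "WWKLM", "s4": "WWKHN"}
    print(build_tree(example))
```

The returned tree is a nested tuple of sequence names, with each internal tuple corresponding to a node that has its own profile. In the method described above, the node models are HMMs that also realign the sequences and predict which regions are structurally alignable, which this sketch does not attempt.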

Publication types

  • Comparative Study

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Animals
  • Computer Simulation
  • Computer Systems
  • Databases, Protein
  • Humans
  • Markov Chains*
  • Molecular Sequence Data
  • Protein Structure, Tertiary
  • Proteins / chemistry
  • Proteins / genetics
  • Sequence Alignment / statistics & numerical data*
  • Sequence Homology, Amino Acid

Substances

  • Proteins