It is often desirable to identify further homologs of a family of biological sequences in the ever-growing sequence databases. Profile hidden Markov models excel at capturing the common statistical features of a group of biological sequences, and these features can then be used to search a sequence database for new homologous sequences. Most general profile hidden Markov model methods, however, treat the evolutionary relationships among the sequences in a homologous group in an ad hoc manner. We introduce a method that incorporates phylogenetic information directly into hidden Markov models, and we demonstrate that the resulting model performs better than most current multiple-sequence-based methods at finding distant homologs.
Copyright 2003 Wiley-Liss, Inc.