Different versions of the Dayhoff rate matrix

Mol Biol Evol. 2005 Feb;22(2):193-9. doi: 10.1093/molbev/msi005. Epub 2004 Oct 13.

Abstract

Many phylogenetic inference methods are based on Markov models of sequence evolution. These are usually expressed in terms of a matrix (Q) of instantaneous rates of change, but some models of amino acid replacement, most notably the PAM model of Dayhoff and colleagues, were originally published only in terms of time-dependent probability matrices (P(t)). Previously published methods for deriving Q have used eigen-decomposition of an approximation to P(t). We show that the commonly used value of t is too large to ensure convergence of the estimates of elements of Q. We describe two simpler alternative methods for deriving Q from information such as that published by Dayhoff and colleagues. Neither of these methods requires approximation or eigen-decomposition. We identify the methods used to derive the different versions of the Dayhoff model in current software, perform a comparison of existing and new implementations, and, to facilitate agreement among scientists using supposedly identical models, recommend that one of the new methods be used as a standard.
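The relationship between Q and P(t) mentioned in the abstract, and the reason the choice of t matters when Q is recovered from an approximation, can be illustrated with a small sketch. It is not the procedure of the paper or of any published software: it assumes a hypothetical 3-state rate matrix (not Dayhoff data), computes P(t) = exp(Qt) with a truncated Taylor series, and then estimates Q with the simple first-order approximation (P(t) - I)/t. All function names are illustrative.

```python
# Toy 3-state rate matrix: hypothetical values, NOT Dayhoff data.
# Off-diagonal entries are non-negative and each row sums to zero.
Q = [
    [-0.3, 0.2, 0.1],
    [0.1, -0.4, 0.3],
    [0.2, 0.2, -0.4],
]


def identity(n):
    """Return the n x n identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(q, t, terms=30):
    """P(t) = exp(Q t) by a truncated Taylor series.

    Adequate for this small example; a real implementation would use a
    numerically robust matrix exponential.
    """
    n = len(q)
    qt = [[x * t for x in row] for row in q]
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        # term becomes (Q t)^k / k!
        term = [[x / k for x in row] for row in matmul(term, qt)]
        result = [[r + s for r, s in zip(res_row, term_row)]
                  for res_row, term_row in zip(result, term)]
    return result


def first_order_estimate(p, t):
    """Estimate Q as (P(t) - I) / t, assuming P(t) is close to I + Q t."""
    n = len(p)
    eye = identity(n)
    return [[(p[i][j] - eye[i][j]) / t for j in range(n)] for i in range(n)]


def max_abs_error(a, b):
    """Largest absolute element-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


if __name__ == "__main__":
    for t in (1.0, 0.1, 0.01):
        p = transition_matrix(Q, t)
        err = max_abs_error(first_order_estimate(p, t), Q)
        print(f"t = {t}: max abs error in estimated Q = {err:.6f}")
```

In this toy setting the error of the estimate shrinks as t decreases, which is the kind of dependence on t that the abstract refers to. The methods recommended in the paper avoid the approximation altogether.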

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Evolution, Molecular*
  • Humans
  • Markov Chains*
  • Models, Genetic*
  • Phylogeny*
  • Sequence Alignment / methods
  • Software Design*