Fast recovery of evolutionary trees with thousands of nodes

J Comput Biol. 2002;9(2):277-97. doi: 10.1089/10665270252935467.

Abstract

We present a novel distance-based algorithm for evolutionary tree reconstruction. Our algorithm reconstructs the topology of a tree with n leaves in O(n^2) time using O(n) working space. In the general Markov model of evolution, the algorithm recovers the topology with probability 1 - o(1) from sequences whose length is polynomial in n. Moreover, for almost all trees, our algorithm achieves the same success probability with sequences of polylogarithmic length. The theoretical results are supported by simulation experiments involving trees with 500, 1,895, and 3,135 leaves. The topologies of the trees are recovered with high success from 2,000 bp DNA sequences.
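
As an illustration of what "distance-based" means here, the sketch below estimates an evolutionary distance between two aligned DNA sequences. It is not the paper's method: the abstract does not say which distance estimator is used, and the Jukes-Cantor correction shown is a deliberately simple special case of the general Markov model, chosen only because it fits in a few lines. The function name jukes_cantor_distance is hypothetical.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Estimate substitutions per site between two aligned DNA sequences.

    Assumption: the Jukes-Cantor model (equal base frequencies, equal
    substitution rates), used here as a simplified stand-in for the
    general Markov model named in the abstract.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned, equal-length, and non-empty")

    # Fraction of sites at which the two sequences differ.
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    p = mismatches / len(seq_a)

    # The correction is undefined once p reaches 3/4 (saturation).
    if p >= 0.75:
        return math.inf

    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

A distance-based reconstruction algorithm takes such pairwise estimates for the n leaf sequences as its input. Longer sequences give more accurate estimates, which is why the success guarantees in the abstract are stated in terms of sequence length.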

MeSH terms

  • Algorithms*
  • Base Sequence
  • Computational Biology
  • DNA / genetics
  • Evolution, Molecular*
  • Markov Chains
  • Models, Genetic
  • Phylogeny

Substances

  • DNA