Does gene flow destroy phylogenetic signal? The performance of three methods for estimating species phylogenies in the presence of gene flow

Mol Phylogenet Evol. 2008 Dec;49(3):832-42. doi: 10.1016/j.ympev.2008.09.008. Epub 2008 Sep 21.

Abstract

Incomplete lineage sorting has been documented across a diverse set of taxa ranging from songbirds to conifers. Such patterns are expected theoretically for species characterized by certain life history characteristics (e.g., long generation times) and those influenced by certain historical demographic events (e.g., recent divergences). A number of methods to estimate the underlying species phylogeny from a set of gene trees have been proposed and shown to be effective when incomplete lineage sorting has occurred. The further effects of gene flow on those methods, however, remain to be investigated. Here, we focus on the performance of three methods of species tree inference, ESP-COAL, minimizing deep coalescence (MDC), and concatenation, when incomplete lineage sorting and gene flow jointly confound the relationship between gene and species trees. Performance was investigated using Monte Carlo coalescent simulations under four models (n-island, stepping stone, parapatric, and allopatric) and three magnitudes of gene flow (N(e)m = 0.01, 0.10, 1.00). Although results varied by the model and magnitude of gene flow, methods incorporating aspects of the coalescent process (ESP-COAL and MDC) performed well, with probabilities of identifying the correct species tree topology typically increasing to greater than 0.75 when five or more loci are sampled. The only exceptions to that pattern included gene flow at moderate to high magnitudes under the n-island and stepping stone models. Concatenation performed poorly relative to the other methods. We extend these results to a discussion of the importance of species and population phylogenies to the fields of molecular systematics and phylogeography using an empirical example from Rhododendron.
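
The abstract names minimizing deep coalescence (MDC) without describing its criterion. As an illustration only, the following minimal Python sketch computes the standard deep coalescence cost of one gene tree on one candidate species tree: for every non-root branch of the species tree, it counts the gene lineages that leave the top of that branch and adds the number in excess of one. The nested-tuple tree encoding, the function names, and the assumption that every species appears in the gene tree are choices made for this sketch and are not taken from the paper or its software.

```python
def leaves(tree):
    """Return the set of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def lineages_leaving(gene_tree, species_clade):
    """Count gene lineages exiting the top of a species-tree branch.

    This is the number of maximal gene-tree subtrees whose leaves all
    fall inside the set of species below that branch.
    """
    if leaves(gene_tree) <= species_clade:
        return 1
    if isinstance(gene_tree, str):
        return 0
    return sum(lineages_leaving(child, species_clade) for child in gene_tree)


def deep_coalescences(gene_tree, species_tree):
    """Sum the extra lineages over all non-root species-tree branches.

    Assumption for this sketch: gene-tree leaves are labeled with the
    species they were sampled from, and every species is sampled.
    """
    def walk(node, is_root):
        cost = 0
        if not is_root:
            cost = max(lineages_leaving(gene_tree, leaves(node)) - 1, 0)
        if not isinstance(node, str):
            cost += sum(walk(child, False) for child in node)
        return cost

    return walk(species_tree, True)


# Hypothetical three-species example: species tree ((A,B),C).
species = (("A", "B"), "C")
congruent_gene = (("A", "B"), "C")
discordant_gene = (("A", "C"), "B")

print(deep_coalescences(congruent_gene, species))   # 0
print(deep_coalescences(discordant_gene, species))  # 1
```

Under this criterion, an MDC-style search would prefer the candidate species tree with the smallest total cost summed across loci. The sketch scores a single gene tree against a single candidate and does not perform that search.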

Publication types

  • Comparative Study

MeSH terms

  • Computer Simulation
  • Evolution, Molecular*
  • Gene Flow*
  • Genetic Speciation
  • Genetics, Population
  • Geography
  • Likelihood Functions
  • Models, Genetic*
  • Models, Statistical
  • Monte Carlo Method
  • Phylogeny*
  • Rhododendron / classification
  • Rhododendron / genetics
  • Sequence Analysis, DNA