Background: The use of mitochondrial DNA data in phylogenetics is controversial, yet studies that combine mitochondrial and nuclear DNA data (mtDNA and nucDNA) to estimate phylogeny are common, especially in vertebrates. Surprisingly, the consequences of combining these data types are largely unexplored, and many fundamental questions remain unaddressed in the literature. For example, how much do trees from mtDNA and nucDNA differ? How are topological conflicts between these data types typically resolved in the combined-data tree? What determines whether a node will be resolved in favor of mtDNA or nucDNA, and are there any generalities that can be made regarding resolution of mtDNA-nucDNA conflicts in combined-data trees? Here, we address these and related questions using new and published nucDNA and mtDNA data for Plethodon salamanders and published data from 13 other vertebrate clades (including fish, frogs, lizards, birds, turtles, and mammals).
Results: We find widespread discordance between trees from mtDNA and nucDNA (30-70% of nodes disagree per clade), but this discordance is typically not strongly supported. Despite often having larger numbers of variable characters, mtDNA data do not typically dominate combined-data analyses, and combined-data trees often share more nodes with trees from nucDNA alone. There is no relationship between the proportion of nodes shared between combined-data and mtDNA trees and relative numbers of variable characters or levels of homoplasy in the mtDNA and nucDNA data sets. Congruence between trees from mtDNA and nucDNA is higher on branches that are longer and deeper in the combined-data tree, but whether a conflicting node will be resolved in favor of mtDNA or nucDNA is unrelated to branch length. Conflicts that are resolved in favor of nucDNA tend to occur at deeper nodes in the combined-data tree. In contrast to these overall trends, we find that Plethodon have an unusually large number of strongly supported conflicts between data types, which are generally resolved in favor of mtDNA in the combined-data tree (despite the large number of nuclear loci sampled).
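As an illustration of what "proportion of nodes shared" between two trees can mean, the following minimal sketch counts the clades that two rooted trees have in common. It is not the procedure used in this study, which the abstract does not specify; the nested-tuple tree encoding, the function names `clades` and `shared_fraction`, and the five-taxon example trees are all assumptions made for illustration.

```python
def clades(tree):
    """Return the non-trivial clades of a rooted tree as frozensets of taxon names.

    Assumed encoding (hypothetical): a leaf is a string, an internal node is a
    tuple of child subtrees.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)  # record the clade defined by this internal node
        return members

    all_taxa = leaves(tree)
    found.discard(all_taxa)  # the root clade is shared by every tree, so skip it
    return found


def shared_fraction(reference, other):
    """Fraction of the reference tree's clades that also occur in the other tree."""
    ref_clades = clades(reference)
    if not ref_clades:
        return 0.0
    return len(ref_clades & clades(other)) / len(ref_clades)


# Made-up example trees for five taxa (not data from the study).
mt_tree = ((("A", "B"), "C"), ("D", "E"))
nuc_tree = ((("A", "C"), "B"), ("D", "E"))
combined_tree = ((("A", "B"), "C"), ("D", "E"))

print(shared_fraction(combined_tree, mt_tree))
print(shared_fraction(combined_tree, nuc_tree))
```

In this toy case the combined-data tree shares all of its clades with the mtDNA tree but only two of three with the nucDNA tree, so the single conflicting node would count as resolved in favor of mtDNA.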
Conclusions: Overall, our results from 14 vertebrate clades show that combined-data analyses are not necessarily dominated by the more variable mtDNA data sets. However, given cases like Plethodon, there is also a need for routine checking of incongruence between mtDNA and nucDNA data and its impacts on combined-data analyses.