Towards Reliable Detection of Introgression in the Presence of Among-Species Rate Variation

Syst Biol. 2024 Oct 30;73(5):769-788. doi: 10.1093/sysbio/syae028.

Abstract

The role of interspecific hybridization has recently received increasing attention, especially in the context of diversification dynamics. Genomic research has now made it abundantly clear that both hybridization and introgression (the exchange of genetic material through hybridization and backcrossing) are far more common than previously thought. Besides cases of ongoing or recent genetic exchange between taxa, an increasing number of studies report "ancient introgression", referring to results of hybridization that took place in the distant past. However, it is not clear whether commonly used methods for the detection of introgression are applicable to such old systems, given that most of these methods were originally developed for analyses at the level of populations and recently diverged species affected by recent or ongoing genetic exchange. In particular, the assumption of constant evolutionary rates, which is implicit in many commonly used approaches, is more likely to be violated as evolutionary divergence increases. To test the limitations of introgression detection methods when applied to old systems, we simulated thousands of genomic datasets under a wide range of settings, with varying degrees of among-species rate variation and introgression. Using these simulated datasets, we showed that some commonly applied statistical methods, including the D-statistic and certain tests based on sets of local phylogenetic trees, can produce false-positive signals of introgression between divergent taxa that have different rates of evolution. These misleading signals are caused by homoplasies occurring at different rates in different lineages. To distinguish between the patterns caused by rate variation and genuine introgression, we developed a new test that is based on the expected clustering of introgressed sites along the genome and implemented this test in the program Dsuite.
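
For readers unfamiliar with the D-statistic mentioned above, the following is a minimal sketch of its standard site-pattern form, D = (nABBA - nBABA) / (nABBA + nBABA), for a four-taxon arrangement (P1, P2, P3, outgroup). It is an illustration only and not the Dsuite implementation; the function name `d_statistic` and the input layout (one tuple of four alleles per site) are assumptions made for this example. The sketch also omits significance testing (for example, by block jackknife).

```python
from typing import Iterable, Tuple


def d_statistic(sites: Iterable[Tuple[str, str, str, str]]) -> float:
    """Compute D from per-site alleles ordered as (P1, P2, P3, outgroup).

    The outgroup allele is treated as ancestral ("A"); any other allele
    shared by P3 and exactly one of P1/P2 is treated as derived ("B").
    """
    abba = 0
    baba = 0
    for p1, p2, p3, out in sites:
        # Sites where P3 carries the ancestral allele are uninformative.
        if p3 == out:
            continue
        if p1 == out and p2 == p3:
            abba += 1  # pattern A-B-B-A: P2 shares the derived allele with P3
        elif p2 == out and p1 == p3:
            baba += 1  # pattern B-A-B-A: P1 shares the derived allele with P3
    if abba + baba == 0:
        raise ValueError("no ABBA or BABA sites found")
    return (abba - baba) / (abba + baba)


# Hypothetical toy input: three ABBA sites, one BABA site, one uninformative site.
example_sites = (
    [("A", "G", "G", "A")] * 3
    + [("G", "A", "G", "A")]
    + [("A", "A", "G", "A")]
)
example_d = d_statistic(example_sites)
```

Without introgression and with equal rates, ABBA and BABA sites are expected in equal numbers, so D is expected to be near zero. As the abstract notes, homoplasies accumulating at different rates in P1 and P2 can unbalance these counts even when no introgression occurred, which is why a nonzero D alone can be misleading for divergent taxa.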

Keywords: D-statistic; branch lengths; hybridization; introgression; phylogenetic network; phylogenomics; rate variation; tree topology variation.

MeSH terms

  • Classification / methods
  • Computer Simulation*
  • Genetic Introgression
  • Genetic Variation
  • Hybridization, Genetic
  • Models, Genetic
  • Phylogeny