The distribution of admixture tract lengths has received considerable attention, in part because it can be used to infer the timing of past gene flow events between populations. It is commonly assumed that these lengths can be modeled as independently and identically distributed (iid) exponential random variables. This assumption is fundamental for many popular methods that analyze admixture using hidden Markov models. We compare the expected distribution of admixture tract lengths under a number of population-genetic models to the distribution predicted by the Wright-Fisher model with recombination. We show that under the latter model, the assumption of iid exponential tract lengths does not hold for recent or for ancient admixture events and that relying on this assumption can lead to false positives when inferring the number of admixture events. To further investigate the tract-length distribution, we develop a dyadic interval-based stochastic process for generating admixture tracts. This representation is useful for analyzing admixture tract-length distributions for populations with recent admixture, a scenario in which existing models perform poorly.
Keywords: admixture; admixture tracts; hidden Markov model; migrant tracts.
Copyright © 2014 by the Genetics Society of America.