Inference of multiple-wave admixtures by length distribution of ancestral tracks

Heredity (Edinb). 2018 Jul;121(1):52-63. doi: 10.1038/s41437-017-0041-2. Epub 2018 Jan 23.

Abstract

Ancestral tracks in admixed genomes are valuable for inferring population history. Although several methods have been developed to infer admixture history from ancestral tracks, they share the same flaw: each can infer admixture history only under a few specific models. Moreover, the inference may be biased or even unreliable if the assumed model deviates from the true history. To address this problem, we first proposed a general discrete admixture model that describes admixture histories with multiple ancestral populations and multiple admixture waves. We then derived the length distribution of ancestral tracks under this general discrete admixture model. We further developed a new method, MultiWaver, to explore multiple-wave admixture histories. The method automatically determines an optimal admixture model from the length distribution of ancestral tracks and estimates the corresponding parameters under that model. Specifically, we used a likelihood ratio test (LRT) to determine the number of admixture waves and implemented an expectation-maximization (EM) algorithm to estimate the parameters. Simulation studies validated the reliability and effectiveness of the method. Finally, the method performed well when applied to real data sets of African Americans and Mexicans, and it provided new insights into the admixture history of Uyghurs and Hazaras.
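The LRT-plus-EM idea can be illustrated with a minimal sketch. This is not the MultiWaver implementation. It assumes, for illustration only, that the track lengths from one ancestral population follow a mixture of exponential distributions with one component per admixture wave. The function names (`em_exponential_mixture`, `select_waves`), the initialisation scheme, and the chi-square critical value (5.991, i.e. 2 degrees of freedom at the 0.05 level) are all hypothetical choices. The chi-square approximation is known to be only rough for mixture models, and how a fitted rate maps to an admixture time depends on the model and is not shown here.

```python
import math


def log_likelihood(lengths, weights, rates):
    """Log-likelihood of track lengths under an exponential mixture."""
    return sum(
        math.log(sum(w * r * math.exp(-r * x) for w, r in zip(weights, rates)))
        for x in lengths
    )


def em_exponential_mixture(lengths, k, n_iter=300, tol=1e-8):
    """Fit a k-component exponential mixture by EM.

    Returns (weights, rates, loglik). Assumes all lengths are positive.
    """
    n = len(lengths)
    mean = sum(lengths) / n
    # Hypothetical deterministic start: rates spread geometrically around 1/mean.
    rates = [2.0 ** (j - (k - 1) / 2.0) / mean for j in range(k)]
    weights = [1.0 / k] * k
    prev = -math.inf
    for _ in range(n_iter):
        resp_sum = [0.0] * k  # sum of responsibilities per component
        resp_len = [0.0] * k  # responsibility-weighted sum of lengths
        ll = 0.0
        # E-step: posterior probability that each track came from each component.
        for x in lengths:
            dens = [w * r * math.exp(-r * x) for w, r in zip(weights, rates)]
            total = sum(dens)
            ll += math.log(total)
            for j in range(k):
                g = dens[j] / total
                resp_sum[j] += g
                resp_len[j] += g * x
        # M-step: closed-form updates for mixing weights and rates.
        weights = [s / n for s in resp_sum]
        rates = [s / max(l, 1e-300) for s, l in zip(resp_sum, resp_len)]
        if ll - prev < tol:
            break
        prev = ll
    return weights, rates, log_likelihood(lengths, weights, rates)


def select_waves(lengths, max_k=4, critical=5.991):
    """Add components while the likelihood ratio test rejects the smaller model."""
    k = 1
    best = em_exponential_mixture(lengths, k)
    while k < max_k:
        candidate = em_exponential_mixture(lengths, k + 1)
        statistic = 2.0 * (candidate[2] - best[2])
        if statistic <= critical:
            break
        k, best = k + 1, candidate
    return k, best
```

Calling `select_waves(lengths)` on a list of positive track lengths returns the selected number of components together with the fitted weights, rates, and log-likelihood. Each extra component adds two free parameters (one weight, one rate), which is why the sketch uses 2 degrees of freedom for the test.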

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computer Simulation
  • Data Analysis
  • Evolution, Molecular
  • Genetics, Population* / methods
  • Humans
  • Models, Genetic*
  • Population Density
  • Reproducibility of Results