Determining ancestry proportions in complex admixture scenarios in South Africa using a novel proxy ancestry selection method

PLoS One. 2013 Sep 16;8(9):e73971. doi: 10.1371/journal.pone.0073971. eCollection 2013.

Abstract

Admixed populations can make an important contribution to the discovery of disease susceptibility genes if the parental populations exhibit substantial variation in susceptibility. Admixture mapping has been used successfully, but is not designed to cope with populations that have more than two or three ancestral populations. The inference of admixture proportions and local ancestry and the imputation of missing genotypes in admixed populations are crucial in both understanding variation in disease and identifying novel disease loci. These inferences make use of reference populations, and accuracy depends on the choice of ancestral populations. Using an insufficient or inaccurate ancestral panel can result in erroneously inferred ancestry and affect the detection power of GWAS and meta-analysis when using imputation. Current algorithms are inadequate for multi-way admixed populations. To address these challenges we developed PROXYANC, an approach to select the best proxy ancestral populations. From the simulation of a multi-way admixed population we demonstrate the capability and accuracy of PROXYANC and illustrate the importance of the choice of ancestry in both estimating admixture proportions and imputing missing genotypes. We applied this approach to a complex, uniquely admixed South African population. Using genome-wide SNP data from over 764 individuals, we accurately estimate the genetic contributions from the best ancestral populations: isiXhosa [Formula: see text], ǂKhomani San [Formula: see text], European [Formula: see text], Indian [Formula: see text], and Chinese [Formula: see text]. We also demonstrate that the ancestral allele frequency differences correlate with increased linkage disequilibrium in the South African population, which originates from admixture events rather than population bottlenecks.
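The dependence of admixture estimates on the choice of proxy ancestral populations can be illustrated with a small simulation. The sketch below is not PROXYANC and does not use the study data. It assumes a simplified model in which the allele frequency of the admixed population at each SNP is a weighted average of the ancestral frequencies, and it recovers the weights by least squares. The function name `estimate_proportions`, the three-way admixture, the proportions 0.5/0.3/0.2, and the uniformly drawn allele frequencies are all hypothetical choices made for illustration.

```python
import numpy as np


def estimate_proportions(admixed_freq, ref_freqs, weight=1e3):
    """Least-squares admixture proportions (illustrative only, not PROXYANC).

    admixed_freq: shape (n_snps,), allele frequencies in the admixed population
    ref_freqs:    shape (n_snps, k), allele frequencies in k proxy populations
    """
    k = ref_freqs.shape[1]
    # Append a heavily weighted row so the solution approximately sums to one.
    design = np.vstack([ref_freqs, weight * np.ones((1, k))])
    target = np.concatenate([admixed_freq, [weight]])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    # Clip small negative values and renormalise to obtain proportions.
    coef = np.clip(coef, 0.0, None)
    return coef / coef.sum()


rng = np.random.default_rng(0)
n_snps = 5000

# Hypothetical three-way admixture with made-up proportions.
true_props = np.array([0.5, 0.3, 0.2])
ancestral = rng.uniform(0.05, 0.95, size=(n_snps, 3))
admixed = ancestral @ true_props

# Case 1: the reference panel contains the true ancestral populations.
est_good = estimate_proportions(admixed, ancestral)

# Case 2: the third reference is a poor proxy (unrelated allele frequencies).
poor_panel = ancestral.copy()
poor_panel[:, 2] = rng.uniform(0.05, 0.95, size=n_snps)
est_poor = estimate_proportions(admixed, poor_panel)

err_good = np.abs(est_good - true_props).max()
err_poor = np.abs(est_poor - true_props).max()
print("largest error with true ancestral panel:", err_good)
print("largest error with one poor proxy:", err_poor)
```

In this noise-free toy setting the correct panel reproduces the simulated proportions, while substituting a single unrelated reference population biases the estimates. Real analyses work from individual genotypes, finite reference samples and linked markers, so the sketch conveys only the qualitative point.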

Nomenclature: The collective term for people of mixed ancestry in southern Africa is "Coloured," and this is officially recognized in South Africa as a census term, and for self-classification. Whilst we acknowledge that some cultures may use this term in a derogatory manner, these connotations are not present in South Africa, and are certainly not intended here.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genetics, Population
  • Genome, Human / genetics*
  • Genotype
  • Humans
  • South Africa

Grants and funding

This project was funded by the MRC Centre for Molecular and Cellular Biology and the DST/NRF Centre of Excellence for Biomedical TB Research and supported by a Carnegie Corporation Grant and by the Department of Clinical Laboratory Sciences, University of Cape Town. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.