Testing for association in the presence of population stratification: a simulation study comparing the S-TDT, STRAT and the GC

Biom J. 2006 Jun;48(3):420-34. doi: 10.1002/bimj.200410214.

Abstract

A novel approach for association testing in the presence of population stratification has been introduced by Pritchard et al. (2000a) and Pritchard et al. (2000b). The structured association approach is a two-tiered procedure that first estimates the population structure and then, in the second step, tests the null hypothesis H0: 'no association within subpopulations'. A power comparison of the stratified test for association (STRAT) (Pritchard et al., 2000b) and the Transmission-Disequilibrium-Test (TDT) (Spielman and Ewens, 1993a) in a simulation framework showed superiority of STRAT if allele frequencies or associations between allele and disease differ strongly between subpopulations. In more homogeneous situations, the TDT had greater power than STRAT. However, the TDT, which is based on family trios, needs 50% more genotyping than STRAT, which uses population controls. The Sib-Transmission-Disequilibrium-Test (S-TDT) needs the same amount of genotyping as STRAT, since in its minimal configuration it relies on pairs of siblings. This raises the question of how the S-TDT (Spielman and Ewens, 1998a) performs compared to the population-based methods STRAT and Genomic Control (GC). In this paper, we present a simulation study accounting for two different models of population stratification, with different settings of allele frequencies and different risk models. The results showed that under a discrete as well as under an admixed population model, STRAT strongly outperformed the S-TDT and the GC when different alleles were associated with disease in different subpopulations. In contrast, the S-TDT had greater power than STRAT when the same allele was associated in both subpopulations. In that setting, the GC was sometimes even more powerful than the S-TDT, depending on the population model and the allele frequency differences. A general recommendation for the use of one of the tests can therefore not be given.
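As a rough illustration of two of the test families named in the abstract, the following Python sketch computes the classical trio-based TDT statistic (a McNemar-type chi-square on transmissions from heterozygous parents) and a genomic control correction using the commonly used median-based inflation estimate. It is not taken from the paper: the function names are hypothetical, the S-TDT and STRAT are not implemented, and the choice of the median estimator with a lower bound of 1 is an assumption about how GC would typically be applied.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def tdt_statistic(transmitted, untransmitted):
    """Trio-based TDT statistic (hypothetical helper).

    transmitted:   times heterozygous parents passed the candidate allele
    untransmitted: times heterozygous parents passed the other allele
    Approximately chi-square with 1 df under the null hypothesis.
    """
    total = transmitted + untransmitted
    if total == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - untransmitted) ** 2 / total


def gc_inflation(null_marker_stats):
    """Median-based inflation factor from unlinked null markers.

    Assumption: the factor is bounded below by 1, so that the
    correction never increases a test statistic.
    """
    return max(1.0, median(null_marker_stats) / CHI2_1DF_MEDIAN)


def gc_corrected(candidate_stat, null_marker_stats):
    """Candidate-marker chi-square divided by the inflation factor."""
    return candidate_stat / gc_inflation(null_marker_stats)
```

For example, with 60 transmissions and 40 non-transmissions of the candidate allele, `tdt_statistic(60, 40)` gives (60 - 40)^2 / 100 = 4.0. If the null markers had a median statistic twice the expected 0.4549, the same case-control chi-square of 4.0 would be corrected to about 2.0.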

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Data Interpretation, Statistical*
  • Genetic Markers / genetics
  • Genetic Predisposition to Disease / epidemiology*
  • Genetic Predisposition to Disease / genetics*
  • Genetics, Population*
  • Linkage Disequilibrium*
  • Models, Genetic
  • Models, Statistical
  • Reproducibility of Results
  • Risk Assessment / methods*
  • Risk Factors
  • Sensitivity and Specificity
  • Statistics as Topic

Substances

  • Genetic Markers