Errors and linkage disequilibrium interact multiplicatively when computing sample sizes for genetic case-control association studies

Pac Symp Biocomput. 2003:490-501. doi: 10.1142/9789812776303_0046.

Authors

D Gordon¹, M A Levenstien, S J Finch, J Ott

Affiliation

¹ Laboratory of Statistical Genetics, Rockefeller University, 1230 York Avenue, New York, NY 10021-6399, USA.

PMID: 12603052
DOI: 10.1142/9789812776303_0046

Abstract

Single nucleotide polymorphisms (SNPs) may be used in case-control designs to test for association between a SNP marker and a disease. Such designs may assume that the genotype data are reported without error. Our goal is to quantify the effects that errors have on sample size for case-control studies with haplotypes formed by a disease locus and a SNP marker locus in the presence of linkage disequilibrium (LD). We consider the effects of a recently published error model on 2 x 3 chi-square analysis. We study the joint relation of LD and errors with sample size for three specific genetic disease models and two settings each of marker allele frequencies (a total of six studies). The minimal sample size necessary for fixed asymptotic power is estimated as a fourth-degree polynomial in the variables S (error) and D' (LD measure) via backward stepwise regression. We find that increased error rates lower power. In all studies, we observe that LD and errors interact in a non-linear fashion. In particular, regression analysis shows that several higher-order interaction terms have coefficients significantly different from 0 in each study, with the fraction of variance explained greater than 0.9999. Finally, the increase in sample size necessary to maintain constant asymptotic power and level of significance as a function of S is smallest when D' = 1 (perfect LD). The increase grows monotonically as D' decreases to 0.5 for all studies.
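As a rough illustration of two quantities named in the abstract, the Python sketch below computes the LD measure D' from haplotype and allele frequencies and lists the candidate terms of a fourth-degree polynomial in S and D'. It assumes the standard Lewontin definition of D', which the abstract does not spell out. The function names `d_prime` and `polynomial_terms` and the example frequencies are hypothetical and chosen only for illustration. The error model and the regression fit are not reproduced here.

```python
from itertools import product


def d_prime(p_ab, p_a, p_b):
    """Lewontin's normalized LD coefficient D' for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    Assumption: this is the standard textbook definition, not taken from the paper.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        raise ValueError("D' is undefined when a locus is monomorphic")
    return d / d_max


def polynomial_terms(max_degree=4):
    """Exponent pairs (i, j) of all monomials S**i * D'**j with i + j <= max_degree.

    This is the full candidate set, including the intercept (0, 0), from which
    a backward stepwise procedure would remove terms.
    """
    return [(i, j)
            for i, j in product(range(max_degree + 1), repeat=2)
            if i + j <= max_degree]


# Hypothetical allele frequencies: 0.1 at the first locus, 0.3 at the second.
perfect_ld = d_prime(0.10, 0.1, 0.3)   # haplotype frequency at its maximum
partial_ld = d_prime(0.065, 0.1, 0.3)  # halfway between independence and maximum
terms = polynomial_terms()
```

With these hypothetical frequencies, `perfect_ld` is 1 and `partial_ld` is 0.5 up to floating-point rounding, matching the two ends of the D' range discussed in the abstract. A full fourth-degree polynomial in two variables has 15 candidate terms, which is why a term-selection procedure is a natural choice.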

Publication types

Research Support, U.S. Gov't, P.H.S.

MeSH terms

Case-Control Studies*
Computational Biology
Gene Frequency
Genotype
Haplotypes
Humans
Linkage Disequilibrium*
Models, Genetic
Polymorphism, Single Nucleotide*
Regression Analysis
Sample Size
