Background: The search for interaction effects is common in epidemiological studies, but the power of such studies is a major concern. This is a practical issue as many future studies will wish to investigate potential gene-gene and gene-environment interactions and therefore need to be planned on the basis of appropriate sample size calculations.
Methods: The underlying model considered in this paper is a simple linear regression relating a continuous outcome to a continuously distributed exposure variable.
Results: The slope of the regression line is taken to be dependent on genotype, and the ratio of the slopes for each genotype is considered as the interaction parameter. Sample size is affected by the allele frequency and whether the genetic model is dominant or recessive. It is also critically dependent upon the size of the association between exposure and outcome, and the strength of the interaction term. The link between these determinants is graphically displayed to allow sample size and power to be estimated. An example of the analysis of the association between physical activity and glucose intolerance demonstrates how information from previous studies can be used to determine the sample size required to examine gene-environment interactions.
Conclusions: The formulae allow computation of the sample size required to study the interaction between a continuous environmental exposure and a genetic factor on a continuous outcome variable, and should be of practical use in designing studies of appropriate power.
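
Illustration: The dependence of sample size on allele frequency, genetic model, exposure-outcome slope, and slope ratio can be sketched with a generic normal-approximation calculation. This is not the paper's own formula. It assumes two genotype groups defined under Hardy-Weinberg equilibrium, the same exposure variance and residual variance in both groups, and a two-sided test of the difference in slopes. The function names and the parameters `beta0` (slope in the reference genotype), `theta` (ratio of slopes), `sd_x`, and `sd_resid` are hypothetical and introduced only for this example.

```python
import math
from statistics import NormalDist


def carrier_fraction(allele_freq, model):
    """Fraction of subjects in the 'at-risk' genotype group.

    Assumes Hardy-Weinberg equilibrium (an assumption of this sketch).
    """
    if model == "dominant":
        # One or two copies of the variant allele
        return 1.0 - (1.0 - allele_freq) ** 2
    if model == "recessive":
        # Two copies of the variant allele
        return allele_freq ** 2
    raise ValueError("model must be 'dominant' or 'recessive'")


def total_sample_size(beta0, theta, allele_freq, model,
                      sd_x=1.0, sd_resid=1.0, alpha=0.05, power=0.8):
    """Approximate total N to detect a slope ratio theta between genotypes.

    Uses Var(b1 - b0) ~ sd_resid^2 / (N * p * (1 - p) * sd_x^2), where p is
    the carrier fraction, and a two-sided normal-approximation test.
    """
    p = carrier_fraction(allele_freq, model)
    slope_diff = beta0 * (theta - 1.0)  # difference in slopes implied by the ratio
    if slope_diff == 0.0:
        raise ValueError("no interaction to detect: slopes are equal")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0) + NormalDist().inv_cdf(power)
    n = (z * sd_resid / (sd_x * slope_diff)) ** 2 / (p * (1.0 - p))
    return math.ceil(n)


if __name__ == "__main__":
    # Compare genetic models at the same allele frequency and effect sizes
    for model in ("dominant", "recessive"):
        n = total_sample_size(beta0=0.3, theta=2.0, allele_freq=0.2, model=model)
        print(model, n)
```

Under these assumptions the required sample size grows as the carrier fraction moves away from one half, and shrinks as either the reference slope or the departure of the slope ratio from 1 increases, which matches the qualitative dependencies described in the Results.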