Sample size calculations for skewed distributions

BMC Med Res Methodol. 2015 Apr 2;15:28. doi: 10.1186/s12874-015-0023-0.

Abstract

Background: Sample size calculations should correspond to the intended method of analysis. Nevertheless, for non-normal distributions, they are often done on the basis of normal approximations, even when the data are to be analysed using generalized linear models (GLMs).

Methods: For the comparison of two means, we use GLM theory to derive sample size formulae, with particular cases being the negative binomial, Poisson, binomial, and gamma families. By simulation, we estimate the performance of normal approximations, which are special cases of our approach via the identity link, and of calculations based on common link functions such as the log. The negative binomial and gamma scenarios are motivated by examples in hookworm vaccine trials and insecticide-treated materials, respectively.
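To illustrate the idea of computing sample size on the link scale, the following is a minimal sketch and not necessarily the exact formulae derived in the paper. It assumes a standard delta-method approximation: the variance of the link-transformed sample mean is taken as V(mu) * g'(mu)^2 / n, where V is the family's variance function and g is the link. The function name `n_per_group`, the variance-function helpers, and the default alpha and power are all hypothetical choices made for this example.

```python
import math
from statistics import NormalDist


def n_per_group(mu1, mu2, var_fn, link="log", alpha=0.05, power=0.9):
    """Approximate per-group sample size for comparing two means.

    Assumption (not taken from the paper): a two-sided Wald-type test on the
    link scale, with the variance obtained by the delta method.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    if link == "log":
        # g(mu) = log(mu), so g'(mu)^2 = 1 / mu^2
        effect = math.log(mu1) - math.log(mu2)
        var_sum = var_fn(mu1) / mu1**2 + var_fn(mu2) / mu2**2
    elif link == "identity":
        # Normal approximation on the original scale
        effect = mu1 - mu2
        var_sum = var_fn(mu1) + var_fn(mu2)
    else:
        raise ValueError("link must be 'log' or 'identity'")
    return math.ceil(z**2 * var_sum / effect**2)


# Hypothetical variance functions V(mu) for three of the families named above.
def poisson_var(mu):
    return mu


def negbin_var(k):
    # Negative binomial with dispersion parameter k: V(mu) = mu + mu^2 / k
    return lambda mu: mu + mu**2 / k


def gamma_var(shape):
    # Gamma with shape parameter a: V(mu) = mu^2 / a
    return lambda mu: mu**2 / shape


if __name__ == "__main__":
    # Illustrative means and dispersion only; not values from the paper.
    for link in ("log", "identity"):
        print(link, n_per_group(10, 5, negbin_var(0.5), link=link))
```

Under these assumptions, the gamma family with a log link gives a sample size that depends only on the ratio of the means and the shape parameter, which is one reason the log scale is convenient for highly skewed outcomes.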

Results: Calculations on the link function (log) scale work well for the negative binomial and gamma scenarios examined and are often superior to the normal approximations. However, they have little advantage for the Poisson and binomial distributions.

Conclusions: The proposed method is suitable for sample size calculations for comparisons of means of highly skewed outcome variables.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Binomial Distribution*
  • Computer Simulation
  • Humans
  • Linear Models*
  • Models, Theoretical*
  • Sample Size