The use of classic nonparametric tests (cNPTs), such as the Kruskal-Wallis and Mann-Whitney U tests, for between-group comparisons of means and medians in the presence of unequal variance (i.e., heteroscedasticity) can markedly inflate the rate of falsely rejecting null hypotheses and reduce statistical power. Yet this practice remains prevalent in the scientific literature, including the nutrition and obesity literature. Some nutrition and obesity studies use a cNPT in the presence of heteroscedasticity, sometimes on the mistaken rationale that the test corrects for heteroscedasticity. Herein, we discuss misconceptions about using cNPTs in the presence of heteroscedasticity. We then discuss the assumptions, purposes, and limitations of 3 tests commonly used to test for mean differences between multiple groups: 2 parametric tests, Fisher's ANOVA and Welch's ANOVA, and 1 cNPT, the Kruskal-Wallis test. To document the impact of heteroscedasticity on the validity of these tests under conditions similar to those in nutrition and obesity research, we conducted simple simulations and assessed type I error rates (i.e., false positives, defined as incorrectly rejecting the null hypothesis). We demonstrate that type I error rates for Fisher's ANOVA, which does not account for heteroscedasticity, and for the Kruskal-Wallis test, which tests for differences in distributions rather than means, deviated from the expected significance level. The deviation from the expected type I error rate grew as the heterogeneity of variances increased, especially when sample sizes were imbalanced. We provide brief tutorial guidance for authors, editors, and reviewers to identify appropriate statistical tests when test assumptions are violated, with a particular focus on cNPTs.
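The kind of simulation described above can be illustrated with a minimal sketch, which is not the authors' code or their simulation settings. It draws 3 normally distributed groups with equal means, so the null hypothesis of equal means is true, and counts how often each test rejects at the nominal level. The function names, group sizes, standard deviations, and number of replications are all assumptions chosen for illustration. To stay within the Python standard library, the sketch is restricted to exactly 3 groups: the numerator degrees of freedom are then 2, for which the F upper-tail probability has the closed form (1 + 2F/df2)^(-df2/2) and the chi-square upper-tail probability is exp(-H/2). The Kruskal-Wallis statistic is computed without a tie correction, which is reasonable only for continuous data.

```python
import math
import random


def f_sf_df1_2(f, df2):
    """Upper-tail probability of F(2, df2); this closed form holds only for df1 == 2."""
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)


def _mean_var(g):
    """Sample mean and unbiased sample variance of one group."""
    m = sum(g) / len(g)
    return m, sum((x - m) ** 2 for x in g) / (len(g) - 1)


def fisher_anova_p(groups):
    """Classic one-way ANOVA p-value (assumes equal variances); 3 groups only."""
    k = len(groups)
    assert k == 3, "closed-form p-value requires exactly 3 groups"
    n_total = sum(len(g) for g in groups)
    stats = [_mean_var(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand) ** 2 for g, (m, _) in zip(groups, stats))
    ss_within = sum((len(g) - 1) * v for g, (_, v) in zip(groups, stats))
    df2 = n_total - k
    return f_sf_df1_2((ss_between / (k - 1)) / (ss_within / df2), df2)


def welch_anova_p(groups):
    """Welch's heteroscedasticity-robust one-way ANOVA p-value; 3 groups only."""
    k = len(groups)
    assert k == 3, "closed-form p-value requires exactly 3 groups"
    stats = [_mean_var(g) for g in groups]
    w = [len(g) / v for g, (_, v) in zip(groups, stats)]  # precision weights n_i / s_i^2
    w_sum = sum(w)
    mean_w = sum(wi * m for wi, (m, _) in zip(w, stats)) / w_sum
    a = sum(wi * (m - mean_w) ** 2 for wi, (m, _) in zip(w, stats)) / (k - 1)
    lam = sum((1 - wi / w_sum) ** 2 / (len(g) - 1) for wi, g in zip(w, groups))
    f = a / (1 + 2 * (k - 2) / (k ** 2 - 1) * lam)
    df2 = (k ** 2 - 1) / (3 * lam)  # fractional denominator degrees of freedom
    return f_sf_df1_2(f, df2)


def kruskal_wallis_p(groups):
    """Kruskal-Wallis p-value via the chi-square(2) approximation; assumes no ties."""
    assert len(groups) == 3, "closed-form p-value requires exactly 3 groups"
    labeled = sorted((x, i) for i, g in enumerate(groups) for x in g)
    rank_sums = [0.0] * len(groups)
    for rank, (_, i) in enumerate(labeled, start=1):
        rank_sums[i] += rank
    n = len(labeled)
    h = 12.0 / (n * (n + 1)) * sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    h -= 3 * (n + 1)
    return math.exp(-h / 2.0)


def type1_error_rates(sizes, sds, reps=2000, alpha=0.05, seed=1):
    """Share of replications in which each test rejects although all means are equal."""
    rng = random.Random(seed)
    tests = {"fisher": fisher_anova_p, "welch": welch_anova_p, "kruskal": kruskal_wallis_p}
    rejections = dict.fromkeys(tests, 0)
    for _ in range(reps):
        groups = [[rng.gauss(0.0, sd) for _ in range(n)] for n, sd in zip(sizes, sds)]
        for name, test in tests.items():
            rejections[name] += test(groups) < alpha
    return {name: count / reps for name, count in rejections.items()}


if __name__ == "__main__":
    # Hypothetical scenarios: balanced and homoscedastic vs. imbalanced and heteroscedastic.
    print(type1_error_rates(sizes=(20, 20, 20), sds=(1.0, 1.0, 1.0)))
    print(type1_error_rates(sizes=(10, 30, 30), sds=(4.0, 1.0, 1.0)))
```

In the second scenario the smallest group has the largest variance, a configuration in which the pooled-variance F test is known to reject too often. Comparing the two printed dictionaries against the nominal 0.05 level gives a rough sense of how each test behaves under these assumed conditions.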
Keywords: association; causation; heteroscedasticity; nonparametric tests; nutrition; obesity; research rigor; statistical methods.
© The Author(s) 2021. Published by Oxford University Press on behalf of the American Society for Nutrition.