Variance estimation in tests of clustered categorical data with informative cluster size

Stat Methods Med Res. 2020 Nov;29(11):3396-3408. doi: 10.1177/0962280220928572. Epub 2020 Jun 8.

Abstract

In the analysis of clustered data, inverse cluster size weighting has been shown to be resistant to the potentially biasing effects of informative cluster size, where the number of observations within a cluster is associated with the outcome variable of interest. The method of inverse cluster size reweighting has been implemented to establish clustered data analogues of common tests for independent data, but the method has yet to be extended to tests of categorical data. Many variance estimators have been implemented across established cluster-weighted tests, but the potential effects of differing methods on test performance have not previously been explored. Here, we develop cluster-weighted estimators of marginal proportions that remain unbiased under informativeness, and derive analogues of three popular tests for clustered categorical data: the one-sample proportion, goodness-of-fit, and independence chi-square tests. We construct these tests using several variance estimators and show substantial differences in the performance of cluster-weighted tests based on variance estimation technique, with variance estimators constructed under the null hypothesis maintaining size closest to nominal. We illustrate the proposed tests through an application to a data set of functional measures from patients with spinal cord injuries participating in a rehabilitation program.
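
The idea of inverse cluster size weighting for a marginal proportion can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, and both variance formulas are assumptions chosen for illustration, and the paper's own estimators may differ. Each observation is weighted by the reciprocal of its cluster size, so the estimate reduces to the average of the within-cluster proportions. The test statistic is then standardized with either a variance centred at the null value or an empirical variance centred at the estimate.

```python
from math import sqrt


def cluster_weighted_proportion(clusters):
    """Return the cluster-weighted proportion and the per-cluster proportions.

    `clusters` is a list of non-empty lists of 0/1 outcomes (hypothetical input
    format). Weighting each observation by 1 / (cluster size) gives every
    cluster a total weight of 1, so the estimate is the mean of cluster means.
    """
    cluster_means = [sum(c) / len(c) for c in clusters]
    return sum(cluster_means) / len(cluster_means), cluster_means


def one_sample_z(clusters, p0, null_variance=True):
    """Illustrative one-sample statistic for H0: marginal proportion = p0.

    Both variance formulas below are assumptions made for this sketch:
      null_variance=True  -> squared deviations taken around p0
      null_variance=False -> usual sample variance of the cluster means
    """
    p_hat, means = cluster_weighted_proportion(clusters)
    m = len(means)
    if null_variance:
        var = sum((x - p0) ** 2 for x in means) / m ** 2
    else:
        var = sum((x - p_hat) ** 2 for x in means) / (m * (m - 1))
    return (p_hat - p0) / sqrt(var)


# Toy data in which the largest cluster also has the highest success rate,
# i.e. cluster size is informative.
clusters = [[1, 1, 1, 1, 1, 1], [1, 0], [0], [0, 1]]

p_hat, _ = cluster_weighted_proportion(clusters)
pooled = sum(sum(c) for c in clusters) / sum(len(c) for c in clusters)
z_null = one_sample_z(clusters, p0=0.25, null_variance=True)
z_emp = one_sample_z(clusters, p0=0.25, null_variance=False)
```

In this toy data set the pooled proportion is pulled toward the large cluster, while the cluster-weighted estimate treats each cluster equally. The two variance choices give different statistics for the same data, which is the kind of difference the abstract reports as affecting test size.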

Keywords: Informative cluster size; categorical tests; clustered data; marginal analysis.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Bias
  • Chi-Square Distribution
  • Cluster Analysis
  • Humans
  • Research Design*