In correlated data settings, analysts typically choose between fitting conditional and marginal models. Because the parameters of the two model classes have distinct interpretations, the choice between them should be made on scientific grounds. For settings where interest lies in marginal (or population-averaged) parameters, the question of how best to estimate those parameters is a statistical one, and analysts have at their disposal two distinct modeling frameworks: generalized estimating equations (GEE) and marginalized multilevel models (MMMs). The two have been contrasted theoretically and in large-sample settings, but asymptotic theory provides no guarantees in the small-sample settings that are commonplace. In a comprehensive series of simulation studies, we shed light on the relative performance of GEE and MMMs in small-sample settings to help guide analysis decisions in practice. We find that GEE and MMMs exhibit similar small-sample bias when the correct correlation structure is adopted (ie, when the random effects distribution is correctly specified or moderately misspecified), but MMMs can be sensitive to misspecification of the correlation structure. When the number of clusters is small, MMMs only slightly underestimate standard errors (SEs) for within-cluster associations but can severely underestimate SEs for between-cluster associations. By contrast, while GEE severely underestimates SEs, the Mancl and DeRouen correction provides approximately valid inference.
Keywords: cluster correlated data; cluster randomized trials; generalized estimating equations; marginalized multilevel models; small sample bias.
© 2021 John Wiley & Sons Ltd.