Practical strategies for operationalizing optimal allocation in stratified cluster-based outcome-dependent sampling designs

Stat Med. 2023 Mar 30;42(7):917-935. doi: 10.1002/sim.9650. Epub 2023 Jan 17.

Abstract

Cluster-based outcome-dependent sampling (ODS) has the potential to yield efficiency gains when the outcome of interest is relatively rare and resource constraints allow only a certain number of clusters to be visited for data collection. Previous research has shown that when the intended analysis is inverse-probability weighted generalized estimating equations, and the number of clusters that can be sampled is fixed, optimal allocation of the (cluster-level) sample size across strata defined by auxiliary variables readily available at the design stage has the potential to increase efficiency in the estimation of the parameter(s) of interest. In such a setting, the optimal allocation formulae depend on quantities that are unknown in practice, currently making such designs difficult to implement. In this paper, we consider a two-wave adaptive sampling approach, in which data are collected from a first-wave sample and subsequently used to compute the optimal second-wave stratum-specific sample sizes. We consider two strategies for estimating the necessary components using the first-wave data: an inverse-probability weighting (IPW) approach and a multiple imputation (MI) approach. In a comprehensive simulation study, we show that the adaptive sampling approach performs well, and that the MI approach yields designs that are very nearly optimal, regardless of the covariate type. The IPW approach, on the other hand, yields mixed results. Finally, we illustrate the proposed adaptive sampling procedures with data on maternal characteristics and birth outcomes among women enrolled in the Safer Deliveries program in Zanzibar, Tanzania.
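The two-wave logic described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's allocation formulae for inverse-probability weighted generalized estimating equations, so the sketch substitutes classical Neyman allocation (clusters allocated in proportion to stratum size times a within-stratum standard deviation) purely as a stand-in. The function names, the inputs, and the assumption that the standard deviations were estimated from first-wave data (for example by IPW or MI) are all hypothetical.

```python
from math import floor


def neyman_allocation(stratum_sizes, stratum_sds, total):
    """Split `total` clusters across strata in proportion to N_h * S_h.

    Stand-in rule (classical Neyman allocation), NOT the paper's formula.
    Rounding uses the largest-remainder method so the sizes sum to `total`.
    """
    weights = [n * s for n, s in zip(stratum_sizes, stratum_sds)]
    weight_sum = sum(weights)
    if weight_sum <= 0:
        raise ValueError("at least one stratum needs a positive N_h * S_h")
    raw = [total * w / weight_sum for w in weights]
    alloc = [floor(r) for r in raw]
    leftover = total - sum(alloc)
    # Hand the leftover clusters to the strata with the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda h: raw[h] - alloc[h], reverse=True)
    for h in order[:leftover]:
        alloc[h] += 1
    return alloc


def second_wave_sizes(stratum_sizes, stratum_sds, wave1_sizes, total):
    """Return the number of additional clusters to sample per stratum in wave two.

    `stratum_sds` are assumed to be estimated from the first-wave data.
    Strata already sampled beyond their target receive no further clusters,
    and the remaining budget is re-allocated among the other strata.
    """
    budget = total - sum(wave1_sizes)
    if budget < 0:
        raise ValueError("first wave already exceeds the total budget")
    extra = [0] * len(stratum_sizes)
    active = list(range(len(stratum_sizes)))
    while True:
        # Clusters available to the active strata: new budget plus their wave-one clusters.
        pool = budget + sum(wave1_sizes[h] for h in active)
        target = neyman_allocation(
            [stratum_sizes[h] for h in active],
            [stratum_sds[h] for h in active],
            pool,
        )
        over = {h for h, t in zip(active, target) if t < wave1_sizes[h]}
        if not over:
            for h, t in zip(active, target):
                extra[h] = t - wave1_sizes[h]
            return extra
        # Freeze oversampled strata at their wave-one size and try again.
        active = [h for h in active if h not in over]


# Hypothetical inputs: three strata, 15 clusters in wave one, 30 overall.
print(second_wave_sizes([100, 60, 40], [0.5, 0.25, 0.25], [5, 5, 5], 30))
# → [14, 1, 0]
```

In this illustrative example the third stratum was already sampled beyond its target in wave one, so the remaining 15 clusters go to the first two strata. The sketch does not cap allocations at the number of clusters available in each stratum, which a real design would need to do.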

Keywords: adaptive sampling; generalized estimating equations; multilevel multiple imputation; optimal allocation; outcome-dependent sampling.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Computer Simulation
  • Data Collection
  • Female
  • Humans
  • Probability
  • Research Design*
  • Sample Size