Substudies of the Childhood Asthma Management Program (CAMP Research Group, 1999, 2000) seek to identify patient characteristics associated with asthma symptoms and lung function. To determine whether genetic measures are associated with trajectories of lung function, as measured by forced vital capacity (FVC), candidate loci were evaluated retrospectively in children from the primary cohort study. Given participant burden and constraints on financial resources, it is often desirable to target a sub-sample for ascertainment of costly measures. Methods that leverage the longitudinal outcome on the full cohort to selectively measure informative individuals have been promising, but their use has been restricted to analysis of the targeted sub-sample. In this paper we detail two multiple imputation analysis strategies that exploit outcome and partially observed covariate data on the non-sampled subjects, and we characterize alternative design and analysis combinations that could be used for future studies of pulmonary function and other outcomes. Associations with candidate predictors (e.g., IL10 cytokine polymorphisms) can be estimated with very high efficiency under targeted sampling designs compared to standard designs. Further, even though multiple imputation can dramatically improve estimation efficiency for covariates available on all subjects (e.g., gender and baseline age), we observed only modest efficiency gains for parameters associated with predictors that are exclusive to the targeted sample. Our results suggest that future studies of longitudinal trajectories can be conducted efficiently by using outcome-dependent designs and associated full cohort analysis.
Keywords: biased sampling; childhood asthma; conditional likelihood; epidemiological study design; forced vital capacity; linear mixed effects models; longitudinal data analysis; multiple imputation; outcome-dependent sampling; time-dependent covariates.