A parameter in a statistical model is identified if its value can be uniquely determined from the distribution of the observable data. We consider the context of an instrumental variable analysis with a binary outcome for estimating a causal risk ratio. The semiparametric generalized method of moments and structural mean model frameworks use estimating equations for parameter estimation. In this paper, we demonstrate that lack of identification can occur in either of these frameworks, especially if the instrument is weak. In particular, the estimating equations may have no solution or multiple solutions. We investigate the relationship between the strength of the instrument and the proportion of simulated data sets for which there is a unique solution of the estimating equations. We see that this proportion does not appear to depend greatly on the sample size, particularly for weak instruments (ρ² ≤ 0.01). Poor identification was observed in a considerable proportion of simulated data sets for instruments explaining up to 10% of the variance in the exposure with sample sizes up to 1 million. In an applied example considering the causal effect of body mass index (weight (kg)/height (m)²) on the probability of early menarche, estimates and standard errors from an automated optimization routine were misleading.
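The possibility of no solution or multiple solutions can be illustrated with a minimal sketch. It assumes a multiplicative structural mean model moment condition of the form g(β) = Σ (zᵢ − z̄) yᵢ exp(−β xᵢ) = 0; this form is not stated in the abstract and is an assumption here. The function names and the toy data are hypothetical. The sketch scans a grid of β values and counts sign changes of g, which gives a rough lower bound on the number of roots within the grid.

```python
import numpy as np

def msmm_moment(beta, z, x, y):
    """Assumed moment condition: sum of (z - mean(z)) * y * exp(-beta * x)."""
    z = np.asarray(z, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.sum((z - z.mean()) * y * np.exp(-beta * x))

def count_sign_changes(z, x, y, grid):
    """Count sign changes of the moment function along a grid of beta values.

    Each sign change brackets at least one root. Grid points where the
    function is exactly zero are dropped before counting.
    """
    g = np.array([msmm_moment(b, z, x, y) for b in grid])
    signs = np.sign(g)
    signs = signs[signs != 0]
    return int(np.sum(signs[:-1] * signs[1:] < 0))

# Hypothetical toy data: binary instrument z, exposure x, binary outcome y.
grid = np.linspace(-2.0, 2.0, 401)
z = [0, 0, 1, 1]
x = [0, 0, 0, 1]
y_one = [1, 1, 0, 1]   # moment function crosses zero once on the grid
y_none = [1, 0, 0, 0]  # moment function is constant and negative

print(count_sign_changes(z, x, y_one, grid))
print(count_sign_changes(z, x, y_none, grid))
```

For the first toy data set the moment function reduces to −1 + 0.5 exp(−β), which crosses zero once near β = −ln 2; for the second it is the constant −0.5, so no solution exists. An automated optimizer applied to the second case would still return some value, which is why checking the shape of the moment function over a range of β is a useful diagnostic.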
Keywords: Avon Longitudinal Study of Parents and Children; generalized method of moments; identifiability; identification; instrumental variables; semiparametric methods; structural mean model; weak instruments.
© The Author 2014. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health.