Lack of identification in semiparametric instrumental variable models with binary outcomes

Am J Epidemiol. 2014 Jul 1;180(1):111-9. doi: 10.1093/aje/kwu107. Epub 2014 May 23.

Abstract

A parameter in a statistical model is identified if its value can be uniquely determined from the distribution of the observable data. We consider the context of an instrumental variable analysis with a binary outcome for estimating a causal risk ratio. The semiparametric generalized method of moments and structural mean model frameworks use estimating equations for parameter estimation. In this paper, we demonstrate that lack of identification can occur in either of these frameworks, especially if the instrument is weak. In particular, the estimating equations may have no solution or multiple solutions. We investigate the relationship between the strength of the instrument and the proportion of simulated data sets for which there is a unique solution of the estimating equations. We see that this proportion does not appear to depend greatly on the sample size, particularly for weak instruments (ρ² ≤ 0.01). Poor identification was observed in a considerable proportion of simulated data sets for instruments explaining up to 10% of the variance in the exposure with sample sizes up to 1 million. In an applied example considering the causal effect of body mass index (weight (kg)/height (m)²) on the probability of early menarche, estimates and standard errors from an automated optimization routine were misleading.
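The claim that the estimating equations may have no solution or multiple solutions can be illustrated with a small sketch. The abstract does not state the equations, so the code below assumes the standard multiplicative structural mean model moment, (1/n) Σ (z_i − z̄) y_i exp(−β x_i) = 0, where β is the log causal risk ratio. The function names (`moment_terms`, `moment`, `find_roots`, `simulate`), the search interval, and the toy data-generating model are all hypothetical choices made for illustration, not the authors' implementation. Roots are located by sign changes on a grid, so roots outside the interval, or pairs of roots inside a single grid cell, would be missed.

```python
import math
import random


def moment_terms(z, x, y):
    """Return (z_i - zbar, x_i) for records with y_i == 1; others contribute zero."""
    zbar = sum(z) / len(z)
    return [(zi - zbar, xi) for zi, xi, yi in zip(z, x, y) if yi == 1]


def moment(beta, terms, n):
    """Assumed sample moment: (1/n) * sum_i (z_i - zbar) * y_i * exp(-beta * x_i)."""
    return sum(w * math.exp(-beta * xi) for w, xi in terms) / n


def find_roots(z, x, y, lo=-5.0, hi=5.0, steps=400):
    """Approximate roots of the moment in [lo, hi], found by grid sign changes."""
    terms = moment_terms(z, x, y)
    if not terms:
        # With no y == 1 records the moment is zero for every beta.
        raise ValueError("no records with y == 1")
    n = len(z)
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    vals = [moment(b, terms, n) for b in grid]
    roots = []
    for b0, b1, v0, v1 in zip(grid, grid[1:], vals, vals[1:]):
        if v0 == 0.0:
            roots.append(b0)
        elif v1 != 0.0 and (v0 < 0.0) != (v1 < 0.0):
            # Linear interpolation inside the bracketing grid cell.
            roots.append(b0 - v0 * (b1 - b0) / (v1 - v0))
    if vals[-1] == 0.0:
        roots.append(grid[-1])
    return roots


def simulate(n, alpha, beta, seed):
    """Toy data (an assumption): instrument z, confounder u, log-linear risk."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)
        ui = rng.gauss(0.0, 1.0)
        xi = alpha * zi + ui + rng.gauss(0.0, 1.0)  # alpha sets instrument strength
        p = min(1.0, math.exp(-2.0 + beta * xi + 0.3 * ui))
        z.append(zi)
        x.append(xi)
        y.append(1 if rng.random() < p else 0)
    return z, x, y


if __name__ == "__main__":
    # Proportion of simulated data sets with exactly one root in the interval.
    n_sets = 10
    unique = 0
    for seed in range(n_sets):
        z, x, y = simulate(1000, alpha=0.1, beta=0.2, seed=seed)
        if len(find_roots(z, x, y)) == 1:
            unique += 1
    print("proportion with a unique root:", unique / n_sets)
```

Varying `alpha` and `n` in the loop at the bottom gives a rough analogue of the simulation study described above. The proportions it prints depend entirely on the toy model and the search interval, so they should not be read as reproducing the paper's results.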

Keywords: Avon Longitudinal Study of Parents and Children; generalized method of moments; identifiability; identification; instrumental variables; semiparametric methods; structural mean model; weak instruments.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adolescent
  • Age Factors
  • Asthma / etiology
  • Body Mass Index
  • Causality
  • Child
  • Data Interpretation, Statistical
  • Female
  • Humans
  • Menarche
  • Models, Statistical*
  • Odds Ratio
  • Sample Size