Estimating treatment effects under untestable assumptions with nonignorable missing data

Stat Med. 2020 May 20;39(11):1658-1674. doi: 10.1002/sim.8504. Epub 2020 Feb 14.

Abstract

Nonignorable missing data poses key challenges for estimating treatment effects because the substantive model may not be identifiable without imposing further assumptions. For example, the Heckman selection model has been widely used for handling nonignorable missing data but requires the study to make correct assumptions about both the joint distribution of the missingness and the outcome, and the validity of the exclusion restriction. Recent studies have revisited how alternative selection model approaches, for example those estimated by multiple imputation (MI) and maximum likelihood, relate to Heckman-type approaches in addressing the first hurdle. However, the extent to which these different selection models rely on the exclusion restriction assumption with nonignorable missing data is unclear. Motivated by an interventional study (REFLUX) with nonignorable missing outcome data in half of the sample, this article critically examines the role of the exclusion restriction in Heckman, MI, and full-likelihood selection models when addressing nonignorability. We explore the implications of the different methodological choices concerning the exclusion restriction for relative bias and root-mean-squared error in estimating treatment effects. We find that the relative performance of the methods differs in practically important ways according to the relevance and strength of the exclusion restriction. The full-likelihood approach is less sensitive to alternative assumptions about the exclusion restriction than Heckman-type models and appears to be an appropriate method for handling nonignorable missing data. We illustrate the implications of method choice for inference in the REFLUX study, which evaluates the effect of laparoscopic surgery on long-term quality of life for patients with gastro-oesophageal reflux disease.
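
To make the problem concrete, the following is a minimal simulation sketch, not the authors' analysis and not based on REFLUX data. It assumes a simple selection-model setup: a randomised binary treatment, an outcome equation and a response (selection) equation whose errors are bivariate normal with correlation rho, and a variable z that affects response but not the outcome (a stand-in for an exclusion restriction). The function name, all coefficients, and the sample size are hypothetical choices for illustration. It compares the difference in mean outcomes using everyone (unavailable in practice) with the complete-case difference among responders only.

```python
import math
import random


def simulate(n=20000, tau=1.0, rho=0.8, seed=0):
    """Simulate a two-arm study whose outcome is nonignorably missing.

    All parameter values are illustrative assumptions, not estimates
    from any real study. Returns (full-data difference in means,
    complete-case difference in means).
    """
    rng = random.Random(seed)
    full = {0: [], 1: []}      # outcomes for all participants
    observed = {0: [], 1: []}  # outcomes for responders only

    for _ in range(n):
        t = rng.randint(0, 1)    # randomised treatment indicator
        z = rng.gauss(0.0, 1.0)  # affects response only (exclusion restriction)
        u = rng.gauss(0.0, 1.0)  # error in the response equation
        # Outcome error with unit variance and correlation rho with u:
        # this dependence is what makes the missingness nonignorable.
        e = rho * u + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)

        y = 0.5 + tau * t + e                # outcome equation
        responds = (1.0 * t + 1.0 * z + u) > 0  # response (selection) equation

        full[t].append(y)
        if responds:
            observed[t].append(y)

    def mean(values):
        return sum(values) / len(values)

    diff_full = mean(full[1]) - mean(full[0])
    diff_complete_case = mean(observed[1]) - mean(observed[0])
    return diff_full, diff_complete_case


if __name__ == "__main__":
    diff_full, diff_cc = simulate()
    print(f"true effect:            {1.0:.3f}")
    print(f"full-data estimate:     {diff_full:.3f}")
    print(f"complete-case estimate: {diff_cc:.3f}")
```

In this sketch the treated arm is more likely to respond, so responders in the control arm are more strongly selected on the correlated error, and the complete-case estimate falls below the true effect. Setting rho to 0 removes that dependence, in which case the complete-case difference should be close to the full-data one. The sketch only demonstrates the bias; it does not implement the Heckman, MI, or full-likelihood estimators compared in the article.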

Keywords: Heckman model; average treatment effects; full-information maximum likelihood; missing not at random; multiple imputation; selection models.

MeSH terms

  • Bias
  • Gastroesophageal Reflux*
  • Humans
  • Likelihood Functions
  • Models, Statistical
  • Quality of Life*