We consider identification and inference about mean functionals of observed covariates and an outcome variable subject to non-ignorable missingness. By leveraging a shadow variable, we establish a necessary and sufficient condition for identification of the mean functional even if the full data distribution is not identified. We further characterize a necessary condition for √n-estimability of the mean functional. This condition naturally strengthens the identifying condition, and it requires the existence of a function as a solution to a representer equation that connects the shadow variable to the mean functional. Solutions to the representer equation may not be unique, which presents substantial challenges for non-parametric estimation, and standard theories for non-parametric sieve estimators are not applicable here. We construct a consistent estimator of the solution set and then adapt the theory of extremum estimators to select from the estimated set a consistent estimator of an appropriately chosen solution. The estimator is asymptotically normal, locally efficient and attains the semi-parametric efficiency bound under certain regularity conditions. We illustrate the proposed approach via simulations and a real data application on home pricing.
Keywords: identification; model-free estimation; non-ignorable missingness; shadow variable.
© (RSS) Royal Statistical Society 2023. All rights reserved.