Estimation of regression quantiles in complex surveys with data missing at random: An application to birthweight determinants

Stat Methods Med Res. 2016 Aug;25(4):1393-421. doi: 10.1177/0962280213484401. Epub 2013 Apr 23.

Abstract

The estimation of population parameters using complex survey data requires careful statistical modelling to account for the design features. This is further complicated by unit and item nonresponse for which a number of methods have been developed in order to reduce estimation bias. In this paper, we address some issues that arise when the target of the inference (i.e. the analysis model or model of interest) is the conditional quantile of a continuous outcome. Survey design variables are duly included in the analysis and a bootstrap variance estimation approach is proposed. Missing data are multiply imputed by means of chained equations. In particular, imputation of continuous variables is based on their empirical distribution, conditional on all other variables in the analysis. This method preserves the distributional relationships in the data, including conditional skewness and kurtosis, and successfully handles bounded outcomes. Our motivating study concerns the analysis of birthweight determinants in a large UK-based cohort of children. A novel finding on the parental conflict theory is reported. R code implementing these procedures is provided.
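As a rough illustration of the ideas summarised above (this is not the R code the authors provide), the sketch below computes a design-weighted quantile as the minimiser of the weighted check loss and attaches a bootstrap standard error. The function names are hypothetical. The resampling is a plain with-replacement bootstrap of units, which is an assumption made for brevity: it ignores strata and clusters, which a real complex-survey bootstrap would need to respect. Covariates and multiple imputation are also left out.

```python
import random


def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def weighted_quantile(y, w, tau):
    """Smallest observed y whose cumulative weight share reaches tau.

    This value minimises sum_i w_i * rho_tau(y_i - q) over q, i.e. it is
    the intercept-only case of weighted quantile regression.
    """
    pairs = sorted(zip(y, w))
    total = sum(wi for _, wi in pairs)
    cum = 0.0
    for yi, wi in pairs:
        cum += wi
        if cum >= tau * total:
            return yi
    return pairs[-1][0]


def bootstrap_se(y, w, tau, n_boot=200, seed=0):
    """Naive bootstrap standard error of the weighted quantile.

    Assumption: units are resampled independently with replacement,
    carrying their weights along; strata and clusters are ignored.
    """
    rng = random.Random(seed)
    n = len(y)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            weighted_quantile([y[i] for i in idx], [w[i] for i in idx], tau)
        )
    mean = sum(estimates) / n_boot
    var = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return var ** 0.5


if __name__ == "__main__":
    # Made-up toy values, for illustration only.
    y = [2.9, 3.1, 3.4, 3.6, 4.0]
    w = [1.0, 2.0, 1.0, 3.0, 1.0]
    print(weighted_quantile(y, w, 0.5), bootstrap_se(y, w, 0.5))
```

With covariates, the same weighted check loss would be minimised over regression coefficients rather than a single value, and estimates from each imputed data set would then be pooled.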

Keywords: Khmaladze tests; chained equations; multiple imputation; paediatrics; weights.

MeSH terms

  • Adult
  • Birth Weight*
  • Data Interpretation, Statistical*
  • Female
  • Humans
  • Infant, Newborn
  • Male
  • Models, Statistical
  • Pregnancy
  • Regression Analysis
  • Surveys and Questionnaires*
  • United Kingdom