Nested case-control data analysis using weighted conditional logistic regression in The Environmental Determinants of Diabetes in the Young (TEDDY) study: A novel approach

Diabetes Metab Res Rev. 2020 Jan;36(1):e3204. doi: 10.1002/dmrr.3204. Epub 2019 Jul 31.

Abstract

Background: A nested case-control (NCC) design within a prospective cohort study can realize substantial benefits for biomarker studies. In this context, it is natural to take sample availability into account when selecting controls, so as to minimize data loss when implementing the design. However, doing so violates the random selection that the design requires and leads to biased analyses. Inverse probability weighting may improve the analysis, but the current approach using weighted Cox regression fails to maintain the benefits of the NCC design.

Methods: This paper introduces weighted conditional logistic regression. We illustrate the proposed analysis using data recently investigated in The Environmental Determinants of Diabetes in the Young (TEDDY). To limit potential data loss, the TEDDY NCC design was moderately selective in its choice of controls. A data-driven simulation study was performed to show the bias that arises when nonrandom control selection is ignored in the analysis, and how weighting corrects it.
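As a rough illustration of the idea, the sketch below places inverse probability weights inside a conditional (matched-set) likelihood for a single covariate. It assumes that each matched set contributes w_case·exp(βx_case) / Σ_j w_j·exp(βx_j), which is one common way to weight a conditional likelihood and is not necessarily the exact estimator used in the paper. The function name, the data layout, and the toy values are all hypothetical.

```python
import math


def weighted_cond_loglik(beta, matched_sets):
    """Weighted conditional log-likelihood for one covariate.

    matched_sets: list of matched sets; each set is a list of
    (x, weight, is_case) tuples containing exactly one case.
    Weights stand in for inverse selection probabilities
    (an assumption for illustration, not values from the paper).
    """
    total = 0.0
    for members in matched_sets:
        cases = [(x, w) for x, w, is_case in members if is_case]
        if len(cases) != 1:
            raise ValueError("each matched set needs exactly one case")
        x_case, w_case = cases[0]
        # Weighted sum over everyone in the matched set, case included.
        denom = sum(w * math.exp(beta * x) for x, w, _ in members)
        total += math.log(w_case) + beta * x_case - math.log(denom)
    return total


# Toy matched sets, invented purely for illustration.
toy_sets = [
    [(1.0, 1.0, True), (2.0, 2.0, False), (3.0, 1.5, False)],
    [(0.5, 1.0, True), (1.5, 1.0, False)],
]
print(weighted_cond_loglik(0.0, toy_sets))
```

With all weights equal to 1, the function reduces to the ordinary conditional logistic log-likelihood, so the weights only change the fit when control selection probabilities differ within a matched set. An estimate of β would be obtained by maximizing this function numerically.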

Results: In the TEDDY data analysis, the standard analysis using conditional logistic regression estimated the parameter as -0.015 (95% confidence interval: -0.023, -0.007). The biased estimate using Cox regression was -0.011 (-0.019, -0.003). Weighted Cox regression estimated -0.013 (-0.026, 0.0004). The proposed weighted conditional logistic regression estimated -0.020 (-0.033, -0.007), a stronger negative effect than that obtained from standard conditional logistic regression. The simulation study also showed that the standard estimate of β, which ignores the nonrandom control selection, tends to be greater than the true β (i.e., positive relative biases).

Conclusion: Weighted conditional logistic regression can enhance the analysis by offering flexibility in the selection of controls, while maintaining the matching.

Keywords: inverse probability weighting; nested case-control design; prospective cohort study; selection bias; weighted conditional logistic regression.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Adolescent
  • Case-Control Studies
  • Child
  • Child, Preschool
  • Computer Simulation
  • Diabetes Mellitus / epidemiology*
  • Diabetes Mellitus / physiopathology*
  • Environment*
  • Female
  • Follow-Up Studies
  • Humans
  • Infant
  • Male
  • Models, Statistical*
  • Patient Selection
  • Prognosis
  • Prospective Studies
  • Regression Analysis
  • Social Determinants of Health*