Semiparametric marginal regression for clustered competing risks data with missing cause of failure

Biostatistics. 2023 Jul 14;24(3):795-810. doi: 10.1093/biostatistics/kxac012.

Abstract

Clustered competing risks data are commonly encountered in multicenter studies. The analysis of such data is often complicated by informative cluster size (ICS), a situation where the outcomes under study are associated with the size of the cluster. In addition, the cause of failure is frequently incompletely observed in real-world settings. To the best of our knowledge, no methodology exists for population-averaged analysis of clustered competing risks data with ICS and missing causes of failure. To address this problem, we consider the semiparametric marginal proportional cause-specific hazards model and propose a maximum partial pseudolikelihood estimator under a missing at random assumption. To make this assumption more plausible in practice, we allow for auxiliary variables that may be related to the probability of missingness. The proposed method imposes no assumptions on the within-cluster dependence and allows for ICS. The asymptotic properties of the proposed estimators for both the regression coefficients and the infinite-dimensional parameters, such as the marginal cumulative incidence functions, are rigorously established. Simulation studies show that the proposed method performs well and that methods that ignore the within-cluster dependence and the ICS lead to invalid inferences. The proposed method is applied to competing risks data from a large multicenter HIV study in sub-Saharan Africa in which a substantial proportion of the causes of failure are missing.
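To illustrate why informative cluster size matters for a population-averaged summary, the following minimal Python sketch compares a naive pooled event proportion with one that gives every cluster equal total weight (inverse cluster size weighting, a common device in the ICS literature). It is not the estimator proposed in the paper, whose weighting scheme the abstract does not spell out; the toy data and the function names `pooled_mean` and `cluster_weighted_mean` are hypothetical.

```python
# Toy illustration of informative cluster size (ICS).
# Assumption: hypothetical data in which the largest cluster also has
# the highest event rate, so outcome and cluster size are associated.

def pooled_mean(clusters):
    """Naive estimate: every observation gets weight 1, so large clusters dominate."""
    total_events = sum(sum(c) for c in clusters)
    total_obs = sum(len(c) for c in clusters)
    return total_events / total_obs

def cluster_weighted_mean(clusters):
    """Each observation gets weight 1/(its cluster size), so each cluster counts once."""
    per_cluster = [sum(c) / len(c) for c in clusters]
    return sum(per_cluster) / len(per_cluster)

# 1 = event of interest observed, 0 = not observed (hypothetical values).
clusters = [
    [1, 1, 1, 0, 1, 1, 1, 0],  # large cluster: 6 of 8 events
    [0, 1],                    # small cluster: 1 of 2 events
    [0, 0],                    # small cluster: 0 of 2 events
]

naive = pooled_mean(clusters)               # 7/12
weighted = cluster_weighted_mean(clusters)  # (0.75 + 0.5 + 0.0) / 3

print(f"pooled: {naive:.3f}, cluster-weighted: {weighted:.3f}")
```

In this toy example the two summaries disagree because the large cluster has the highest event rate. This is the kind of discrepancy that arises when an analysis ignores ICS.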

Keywords: Clustered data; Competing risks; Informative cluster size; Missing cause of failure.

Publication types

  • Multicenter Study

MeSH terms

  • Computer Simulation
  • Humans
  • Incidence
  • Likelihood Functions
  • Models, Statistical*
  • Proportional Hazards Models