Martingale-residual-based greedy model averaging for high-dimensional current status data

Stat Med. 2024 Apr 30;43(9):1726-1742. doi: 10.1002/sim.10037. Epub 2024 Feb 21.

Abstract

Current status data are a type of failure time data that arise when the failure time of a study subject cannot be determined precisely but is known only to occur before or after a random monitoring time. Variable selection methods for failure time data have been discussed extensively in the literature. However, statistical inference for a model chosen by variable selection ignores the uncertainty introduced by the selection step. To enhance the prediction accuracy for risk quantities such as survival probability, we propose two optimal model averaging methods under semiparametric additive hazards models. Specifically, based on martingale residuals processes, a delete-one cross-validation (CV) process is defined, and two new CV functional criteria are derived for choosing model weights. Furthermore, we present a greedy algorithm for implementing the techniques, and we establish the asymptotic optimality of the proposed model averaging approaches, along with the convergence of the greedy averaging algorithms. A series of simulation experiments demonstrates the effectiveness and superiority of the proposed methods. Finally, a real-data example is provided as an illustration.
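To make the idea of greedily choosing model weights from a CV criterion concrete, here is a minimal sketch in Python. It is not the paper's procedure: the abstract does not give the CV functional criteria or the exact update rule, so the sketch assumes a plain sum-of-squares criterion on delete-one residuals and a step size of 1/step. The function name `greedy_weights` and the input `cv_resid` (one residual vector per candidate model) are hypothetical.

```python
def greedy_weights(cv_resid, n_iter=50):
    """Greedy model-averaging weights (illustrative sketch only).

    cv_resid: list of residual vectors, one per candidate model, each
              holding delete-one CV residuals for the same observations.
    Assumption: the criterion is the sum of squared averaged residuals,
    a stand-in for the paper's CV functional criteria.
    """
    n_models = len(cv_resid)
    n_obs = len(cv_resid[0])
    weights = [0.0] * n_models
    avg = [0.0] * n_obs  # residuals of the current weighted average

    for step in range(1, n_iter + 1):
        alpha = 1.0 / step  # assumed step size; step 1 picks the best single model
        best, best_loss, best_trial = 0, float("inf"), avg
        for m, resid in enumerate(cv_resid):
            # Residuals if a fraction alpha of the weight moves to model m
            trial = [(1 - alpha) * a + alpha * r for a, r in zip(avg, resid)]
            loss = sum(t * t for t in trial)
            if loss < best_loss:
                best, best_loss, best_trial = m, loss, trial
        # Shrink the existing weights and give alpha to the selected model
        weights = [(1 - alpha) * w for w in weights]
        weights[best] += alpha
        avg = best_trial
    return weights


# Two hypothetical candidate models whose CV residuals cancel each other
w = greedy_weights([[1.0, -1.0], [-1.0, 1.0]], n_iter=50)
print(w)
```

Each update is a convex combination, so the weights stay nonnegative and sum to one. Under this squared-error stand-in, the averaged criterion never exceeds that of the best single candidate model.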

Keywords: asymptotic optimality; current status data; greedy algorithm; martingale‐residuals process; prediction.

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Humans
  • Models, Statistical*
  • Probability
  • Proportional Hazards Models