Doubly structured sparsity for grouped multivariate responses with application to functional outcome score modeling

Stat Med. 2023 Jul 10;42(15):2619-2636. doi: 10.1002/sim.9740. Epub 2023 Apr 9.

Abstract

This work is motivated by the need to accurately model a vector of responses related to pediatric functional status using administrative health data from inpatient rehabilitation visits. The components of the responses have known and structured interrelationships. To make use of these relationships in modeling, we develop a two-pronged regularization approach to borrow information across the responses. The first component of our approach encourages joint selection of the effects of each variable across possibly overlapping groups of related responses, and the second component encourages shrinkage of effects towards each other for related responses. As the responses in our motivating study are not normally distributed, our approach does not rely on an assumption of multivariate normality of the responses. We show that, with an adaptive version of our penalty, our approach results in the same asymptotic distribution of estimates as if we had known in advance which variables have non-zero effects and which variables have the same effects across some outcomes. We demonstrate the performance of our method in extensive numerical studies and in an application to predicting the functional status of pediatric patients using administrative health data in a population of children with neurological injury or illness at a large children's hospital.
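The two penalty components described above can be illustrated with a minimal sketch. The abstract does not give the exact form of the penalty, so everything here is an assumption made for illustration: a group-lasso-style term over (possibly overlapping) groups of responses for each variable, plus a fused-lasso-style term on differences between effects for pairs of related responses. The function name `doubly_structured_penalty`, its arguments, and the unweighted form are hypothetical; the adaptive weights and the loss function mentioned in the abstract are omitted.

```python
import numpy as np

def doubly_structured_penalty(B, groups, related_pairs, lam_group, lam_fuse):
    """Illustrative penalty value (assumed form, not the paper's definition).

    B             : (p, q) array, rows = variables, columns = responses.
    groups        : list of lists of response indices; groups may overlap.
    related_pairs : list of (k, l) response index pairs treated as related.
    lam_group     : tuning parameter for the joint-selection term.
    lam_fuse      : tuning parameter for the fusion term.
    """
    B = np.asarray(B, dtype=float)

    # Joint selection: for each variable (row) and each response group,
    # penalize the Euclidean norm of that variable's effects within the group.
    group_term = sum(np.linalg.norm(B[:, g], axis=1).sum() for g in groups)

    # Fusion: for each variable, penalize the absolute difference between
    # its effects on each pair of related responses.
    fuse_term = sum(np.abs(B[:, k] - B[:, l]).sum() for k, l in related_pairs)

    return lam_group * group_term + lam_fuse * fuse_term


# Toy example: 2 variables, 3 responses, two overlapping response groups.
B = [[0.0, 0.0, 0.0],
     [1.0, 1.0, 2.0]]
groups = [[0, 1], [1, 2]]
related_pairs = [(0, 1)]
value = doubly_structured_penalty(B, groups, related_pairs, 1.0, 0.5)
print(value)
```

In this toy example the first variable has no effect on any response, so it contributes nothing to either term, and the second variable has equal effects on the one related pair, so the fusion term is zero. Under this assumed form, the group term is what drives whole sets of effects to zero together, while the fusion term is what pulls effects for related responses toward a common value.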

Keywords: fused lasso; hierarchical sparsity; high dimensional data; risk prediction; variable selection.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Child
  • Humans
  • Rehabilitation*
  • Routinely Collected Health Data*