Spike-and-slab prior distributions are used to perform variable selection in Bayesian regression-style problems with many candidate predictors. These priors are a mixture of two zero-centered distributions with differing variances, so that parameter estimates are shrunk more or less aggressively depending on whether the corresponding predictors are relevant to the outcome. The spike-and-slab lasso assigns mixtures of double exponential distributions as priors for the parameters. This framework was initially developed for linear models, later extended to generalized linear models, and shown to perform well in scenarios requiring sparse solutions. Standard formulations of generalized linear models cannot immediately accommodate categorical outcomes with more than two categories, i.e., multinomial outcomes, and require modifications to both model specification and parameter estimation. Such modifications are relatively straightforward in a classical setting but require additional theoretical and computational considerations in Bayesian settings, where they can depend on the choice of prior distributions for the parameters of interest. Whereas previous developments of the spike-and-slab lasso focused on continuous, count, and/or binary outcomes, we generalize the spike-and-slab lasso to accommodate multinomial outcomes, developing both the theoretical basis for the model and an expectation-maximization algorithm to fit it. To our knowledge, this is the first generalization of the spike-and-slab lasso to allow for multinomial outcomes.
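For concreteness, a minimal sketch of the standard spike-and-slab lasso prior on a single regression coefficient is given below; this follows the usual formulation of such priors, and the symbols $\lambda_0$, $\lambda_1$, $\gamma_j$, and $\theta$ are generic notation rather than quantities taken from this article:
\[
\beta_j \mid \gamma_j \;\sim\; (1-\gamma_j)\,\psi(\beta_j \mid \lambda_0) \;+\; \gamma_j\,\psi(\beta_j \mid \lambda_1),
\qquad
\psi(\beta \mid \lambda) \;=\; \frac{\lambda}{2}\, e^{-\lambda \lvert \beta \rvert},
\]
\[
\gamma_j \mid \theta \;\sim\; \mathrm{Bernoulli}(\theta),
\qquad
\lambda_0 \gg \lambda_1,
\]
where the spike component (large $\lambda_0$) shrinks coefficients of irrelevant predictors strongly toward zero, while the slab component (small $\lambda_1$) leaves coefficients of relevant predictors comparatively unpenalized, and $\gamma_j$ is a latent inclusion indicator.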
Keywords: Bayesian variable selection; elastic net; generalized linear models; multinomial outcomes; spike-and-slab.
© 2023 Informa UK Limited, trading as Taylor & Francis Group.