Elastic net grouping variable selection combined with partial least squares regression (EN-PLSR) for the analysis of strongly multi-collinear spectroscopic data

Appl Spectrosc. 2011 Apr;65(4):402-8. doi: 10.1366/10-06069.

Abstract

In this paper a novel wavelength region selection algorithm, called elastic net grouping variable selection combined with partial least squares regression (EN-PLSR), is proposed for multi-component spectral data analysis. The EN-PLSR algorithm can automatically select successive strongly correlated prediction variable groups related to the response variable using two steps. First, a portion of the correlated predictors are selected and divided into subgroups by means of the grouping effect of elastic net estimation. Then, a recursive leave-one-group-out strategy is employed to further shrink the variable groups in terms of the root mean square error of cross-validation (RMSECV) criterion. The performance of the algorithm with real near-infrared (NIR) spectroscopic data sets shows that the EN-PLSR algorithm is competitive with full-spectrum PLS and moving window partial least squares (MWPLS) regression methods and it is suitable for use with strongly correlated spectroscopic data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology / methods*
  • Data Interpretation, Statistical*
  • Databases, Factual
  • Gasoline
  • Least-Squares Analysis*
  • Reproducibility of Results
  • Zea mays / chemistry

Substances

  • Gasoline