A novel hybrid feature selection strategy in quantitative analysis of laser-induced breakdown spectroscopy

Anal Chim Acta. 2019 Nov 8:1080:35-42. doi: 10.1016/j.aca.2019.07.012. Epub 2019 Jul 9.

Abstract

Laser-induced breakdown spectroscopy (LIBS) is recognized as a valuable tool for quantitative elemental analysis, especially in combination with multivariate calibration methods. However, LIBS spectra are typically high-dimensional, which degrades the predictive accuracy of multivariate calibration models built on them. Feature selection, an important preprocessing step in data mining, can improve the performance of a multivariate calibration model by eliminating redundant and irrelevant features. In this study, a hybrid feature selection method, V-WSP-PSO, was proposed to improve the accuracy of LIBS analysis. The proposed method combines the advantages of the V-WSP filter method and the particle swarm optimization (PSO) wrapper method. The uncorrelated and redundant features were first eliminated by V-WSP to form a reduced input subset, and the retained features were then refined by PSO to find a small set of features with high predictive accuracy. To evaluate the performance of the proposed method, LIBS experiments were performed on 28 coal samples, and a nonlinear multivariate calibration method based on the kernel extreme learning machine (KELM) was used with the hybrid feature selection method to determine the calorific value of coal. A comparison with several other feature selection methods shows that V-WSP-PSO performs best in terms of both the number of selected features and predictive accuracy. Finally, 114 features were selected from the full spectrum (27620 features) by V-WSP-PSO; the best root mean square error of cross validation (RMSECV) and determination coefficient of cross validation (R²_CV) were 0.4013 MJ/kg and 0.9908, and the root mean square error of prediction (RMSEP) and determination coefficient of prediction (R²_P) were 0.3534 MJ/kg and 0.9894, respectively. Overall, the results demonstrate that the V-WSP-PSO method reduces redundant features and calculation time more efficiently while improving model performance, and that it is a good alternative for feature selection in multivariate calibration.
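
As a rough illustration of the two-stage idea described in the abstract, the Python sketch below pairs a greedy correlation-threshold filter (a simplified stand-in for V-WSP, not the authors' implementation) with a cross-validated KELM error that a wrapper such as PSO could use as its fitness function. The function names, the 0.95 threshold, the RBF kernel, and the values of `C` and `gamma` are all assumptions chosen for illustration; the PSO search itself is omitted.

```python
import numpy as np


def correlation_filter(X, threshold=0.95):
    """Greedy correlation-threshold filter (simplified, V-WSP-like; an assumption).

    Keeps a subset of columns whose pairwise |correlation| is below `threshold`.
    Constant columns are dropped first. Returns indices into the columns of X.
    """
    candidates = np.flatnonzero(X.std(axis=0) > 0)  # drop constant columns
    Xc = X[:, candidates]
    Z = (Xc - Xc.mean(axis=0)) / Xc.std(axis=0)     # standardize each column
    corr = np.abs(Z.T @ Z) / X.shape[0]             # dense |correlation| matrix
    remaining = np.ones(len(candidates), dtype=bool)
    selected = []
    current = 0  # seed variable: the first candidate (an arbitrary choice)
    while True:
        selected.append(current)
        remaining[current] = False
        # Eliminate every variable too strongly correlated with the current one.
        remaining &= corr[current] < threshold
        if not remaining.any():
            break
        # Move to the surviving variable closest (most correlated) to the current one.
        idx = np.flatnonzero(remaining)
        current = idx[np.argmax(corr[current, idx])]
    return candidates[np.array(selected)]


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    d2 = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kelm_cv_rmse(X, y, C=100.0, gamma=1.0, k=5, seed=0):
    """k-fold cross-validated RMSE of a KELM regressor (usable as wrapper fitness)."""
    n = len(y)
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    sq_err = 0.0
    for test in folds:
        train = np.setdiff1d(np.arange(n), test)
        K = rbf_kernel(X[train], X[train], gamma)
        # KELM output weights: beta = (K + I/C)^-1 y
        beta = np.linalg.solve(K + np.eye(len(train)) / C, y[train])
        pred = rbf_kernel(X[test], X[train], gamma) @ beta
        sq_err += ((pred - y[test]) ** 2).sum()
    return float(np.sqrt(sq_err / n))
```

A wrapper would call `kelm_cv_rmse(X[:, subset], y)` for each candidate subset drawn from the filtered indices and keep the subset with the lowest error. For a full spectrum of 27620 features the dense correlation matrix would be large, so a practical implementation would probably compute correlations row by row instead.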

Keywords: Hybrid feature selection; Kernel extreme learning machine; Laser-induced breakdown spectroscopy; Particle swarm optimization; V-WSP.