Harnessing unlabeled data: Enhanced rare earth component content prediction based on BiLSTM-Deep autoencoder

ISA Trans. 2025 Jan 2:S0019-0578(24)00612-8. doi: 10.1016/j.isatra.2024.12.027. Online ahead of print.

Abstract

Traditional data-driven models for predicting rare earth component content rely mainly on supervised learning, which is limited by scarce labeled data, time lag, and poor use of the large amount of unlabeled data available. This paper proposes a novel prediction approach in which a BiLSTM-Deep autoencoder enhances the traditional LSSVM algorithm, termed BiLSTM-DeepAE-LSSVM. The approach exploits the implicit information contained in the large volume of unlabeled data from the rare earth production process, thereby improving the traditional supervised prediction method and increasing the accuracy of component content predictions. First, a BiLSTM autoencoder is trained without supervision on the production process data to extract its inherent time series characteristics. Next, Boolean vectors are introduced in the Deep autoencoder training process to mask the input data, simulating scenarios with noise and missing data: the vector entries follow Bernoulli distributions, so randomly selected dimensions of the input vector are set to zero. The Deep autoencoder also extracts high-dimensional implicit features from the data. The conventional supervised prediction technique, least squares support vector machine (LSSVM), is then fused with the implicit features derived from the trained BiLSTM-Deep autoencoder to build a prediction model for rare earth component content. Finally, simulation verification using LaCe/PrNd extraction field data demonstrates that the proposed approach effectively uses the large quantities of unlabeled data from the rare earth extraction production process and improves prediction accuracy.
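
The two numerical steps named in the abstract, Bernoulli masking of the autoencoder input and LSSVM regression on extracted features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the masking probability, the kernel, or any hyperparameters, so `keep_prob`, the RBF kernel, `gamma`, and `reg` are assumptions, and the random `features` array is only a stand-in for the latent features that a trained BiLSTM-Deep autoencoder would produce. The LSSVM part uses the standard linear-system formulation for regression.

```python
import numpy as np


def bernoulli_mask(x, keep_prob, rng):
    """Zero random entries of x; each entry is kept with probability keep_prob."""
    # Boolean array whose entries are independent Bernoulli(keep_prob) draws.
    mask = rng.random(x.shape) < keep_prob
    return x * mask, mask


def rbf_kernel(a, b, gamma):
    """RBF kernel matrix between rows of a and rows of b (kernel choice is an assumption)."""
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def lssvm_fit(x, y, gamma, reg):
    """Solve the LSSVM regression system [[0, 1^T], [1, K + I/reg]] [b; alpha] = [0; y]."""
    n = x.shape[0]
    system = np.zeros((n + 1, n + 1))
    system[0, 1:] = 1.0
    system[1:, 0] = 1.0
    system[1:, 1:] = rbf_kernel(x, x, gamma) + np.eye(n) / reg
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(system, rhs)
    return solution[0], solution[1:]  # bias b, dual coefficients alpha


def lssvm_predict(x_train, alpha, b, x_new, gamma):
    """Predict with f(x) = sum_i alpha_i * k(x, x_i) + b."""
    return rbf_kernel(x_new, x_train, gamma) @ alpha + b


# Synthetic stand-in data (hypothetical, not from the paper).
rng = np.random.default_rng(0)
features = rng.normal(size=(40, 3))  # placeholder for autoencoder latent features
target = np.sin(features[:, 0]) + 0.1 * features[:, 1]

# Masked copy of the input, as would be fed to a denoising-style autoencoder.
masked, mask = bernoulli_mask(features, keep_prob=0.8, rng=rng)

# Supervised LSSVM regression on the (stand-in) features.
b, alpha = lssvm_fit(features, target, gamma=0.5, reg=10.0)
pred = lssvm_predict(features, alpha, b, features, gamma=0.5)
```

In a full pipeline along the lines the abstract describes, the autoencoder would be trained to reconstruct the unmasked input from `masked`, and the LSSVM would be fitted only on the labeled samples after they pass through the trained encoder.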

Keywords: BiLSTM-deep autoencoder; Fusion prediction; Prediction of rare earth component content; Time series characteristics; Unsupervised training.