An Oversampling Method of Unbalanced Data for Mechanical Fault Diagnosis Based on MeanRadius-SMOTE

Sensors (Basel). 2022 Jul 10;22(14):5166. doi: 10.3390/s22145166.

Abstract

With the development of machine learning, data-driven mechanical fault diagnosis methods have been widely used in the field of PHM. Due to the limitation of the amount of fault data, it is a difficult problem for fault diagnosis to solve the problem of unbalanced data sets. Under unbalanced data sets, faults with little historical data are always difficult to diagnose and lead to economic losses. In order to improve the prediction accuracy under unbalanced data sets, this paper proposes MeanRadius-SMOTE based on the traditional SMOTE oversampling algorithm, which effectively avoids the generation of useless samples and noise samples. This paper validates the effectiveness of the algorithm on three linear unbalanced data sets and four step unbalanced data sets. Experimental results show that MeanRadius-SMOTE outperforms SMOTE and LR-SMOTE in various evaluation indicators, as well as has better robustness against different imbalance rates. In addition, MeanRadius-SMOTE can take into account the prediction accuracy of the overall and minority class, which is of great significance for engineering applications.

Keywords: MeanRadius-SMOTE; mechanical fault diagnosis; minority class; unbalanced data set.