Article

A Convolutional Neural Network-Based Defect Recognition Method for Power Insulator

Nan Li, Dejun Zeng, Yun Zhao, Jiahao Wang and Bo Wang *

1 State Grid Hubei Electric Power Co., Ltd., Jingmen Power Supply Company, Jingmen 448000, China
2 Hubei Key Laboratory of Power Equipment & System Security for Integrated Energy, Wuhan University, Wuhan 430072, China
3 School of Electrical and Automation, Wuhan University, Wuhan 430072, China
* Author to whom correspondence should be addressed.
Processes 2024, 12(10), 2129; https://doi.org/10.3390/pr12102129
Submission received: 14 August 2024 / Revised: 9 September 2024 / Accepted: 12 September 2024 / Published: 30 September 2024
(This article belongs to the Special Issue Research on Intelligent Fault Diagnosis Based on Neural Network)

Abstract

As the scale of the power grid rapidly expands, its operation becomes increasingly complex, with higher demands on personnel proficiency, grid stability, equipment safety, and operational efficiency. In this study, a novel power insulator defect detection method based on convolutional neural networks (CNNs) is proposed. This method innovatively combines the feature extraction advantages of deep learning to build an efficient binary classification model capable of accurately detecting defects in power insulators in complex backgrounds. To avoid the impact of a small dataset on model performance, transfer learning was employed during model training to enhance the model’s generalization ability. A combination of Grid Search and Random Search was used for hyperparameter tuning, and the Early Stopping strategy was introduced to effectively prevent the model from overfitting to the training set, ensuring generalization performance on the validation set. Experimental results show that the proposed method achieves an average accuracy of 98.6%, a recall of 96.8%, and an F1 score of 97.7% on the test set. Compared to traditional Faster RCNN and PCA-SVM methods, the proposed CNN model significantly improves detection accuracy and computational efficiency in complex backgrounds, exhibiting superior recognition precision and model generalization ability for efficiently and accurately identifying defective insulators.

1. Introduction

As UAV technology advances and gains widespread adoption, visual image recognition facilitated by UAVs has emerged as the cornerstone of intelligent inspection methodologies [1,2]. Insulators, being a vital component of high-voltage transmission lines, are prone to defects that undermine transmission stability and pose safety hazards [3,4]. Consequently, regular inspections of insulators are imperative in the maintenance of power equipment. Current inspection protocols typically involve photographing or visually assessing insulators using drones or handheld filming devices operated by maintenance personnel. However, after inspection, manually reviewing these images to identify insulator faults is a time-consuming process prone to misdetections and oversights [5]. Thus, developing an intelligent method for identifying power insulator defects holds significant importance for ensuring power production safety.
Convolutional neural networks represent a formidable tool for feature extraction and classification tasks [6,7,8,9,10]. Research in [11] contributed an ultra-high-resolution optical remote sensing image dataset tailored for small target detection and introduced a CNN-driven algorithm to enhance the fidelity of small target identification. In [12], a nonparametric mobile pose recognition model was proposed for low-latency character action recognition, leveraging pose information. The author of [13] achieved a 91.86% accuracy in recognizing human actions from depth images and pose data, utilizing varied inputs. Ref. [14] combined depth sequences with sparse coding to mitigate noise and occlusion, introducing a depth map sampling-based human motion model for robust recognition. To mitigate overfitting, Ref. [15] trained CNNs of varying depths on grayscale images, integrating raw pixel data with oriented gradient histogram features. While [16] amassed features to boost accuracy, it encountered erroneous classifications due to feature similarities among actions with subtle visual differences. Depth-graph features, however, aid in classifying similar actions performed differently, disregarding inconsequential feature variations. In [17], a sophisticated CNN model incorporating spiking temporal neural coding through sequential convolutional and pooling layers was presented, albeit requiring numerous spikes and lengthy processing times per image. Image classification entails determining the presence or absence of a specific object within an image, hinging on feature extraction and classification. Traditional approaches often involve manually designing or learning features to globally describe images, followed by classification with support vector machines or shallow neural networks. This manual feature selection and subsequent classification, though fundamental, falls short of accuracy demands, particularly when confronted with large-scale problems where shallow neural networks struggle.
To address the challenges of recognition accuracy under environmental variability and high noise conditions, the author of [18] proposes a machine learning method that combines k-medoids clustering with a new damage indicator, employing an adaptive probabilistic approach based on extreme value theory to effectively tackle environmental changes and noise in bridge health monitoring, significantly improving early damage detection accuracy. The author of [19] introduces the sparse index and variation coefficient method to optimize wavelet threshold denoising techniques, demonstrating excellent performance in concrete strain monitoring data from the Hong Kong–Zhuhai–Macao Bridge tunnel, providing a reliable denoising parameter selection scheme for critical infrastructure health monitoring. Additionally, the author of [20] presents an adaptive thresholding and coordinate attention-based tree-inspired network for aero-engine bearing health monitoring under strong noise conditions, combining adaptive thresholding, coordinate attention modules, and a tree-structured decision layer, achieving remarkable results in feature extraction and fault localization, particularly excelling in hierarchical fault detection and diagnostic accuracy. Similarly, the author of [21] proposes a strategy to reduce temporal redundancy in sensor data, introducing an online confidence threshold adjustment mechanism that flexibly balances performance and cost without retraining the model. Based on this, recent research has also provided effective solutions for noise issues in sensor data. The author of [22] introduces a new self-supervised learning approach through a masked convolutional autoencoder, significantly enhancing feature extraction in noisy environments. Furthermore, the author of [23] proposes a multi-frequency channel attention framework, utilizing global average pooling to compress each channel’s features, effectively reducing computational complexity without compromising model performance. However, many current methods still face limitations when handling weakly labeled data. To address this, an attention-based human activity recognition method is proposed in [24], incorporating attention mechanisms within a CNN architecture to handle weakly labeled activity data, significantly improving model robustness and recognition accuracy. Despite these advances, achieving a low computational cost and high robustness under strong noise conditions remains an area for further exploration [25]. To this end, this paper proposes a novel model that integrates these technical approaches, aiming to enhance computational efficiency and robustness while addressing the challenges of strong noise interference and sensor data diversity.
Traditional insulator inspection heavily relies on manual processes like ground-based visual checks, instrument measurements, and physical pole climbing, which are not only time-consuming and labor-intensive but also financially burdensome. In contrast, deep learning-powered target detection techniques have gained widespread adoption for insulator defect recognition. Nevertheless, existing detection algorithms predominantly necessitate vast defect samples for model training, limiting their accuracy in scenarios with limited sample sizes. This paper addresses this challenge by introducing an automated insulator identification technology leveraging machine learning principles. Specifically, we propose a convolutional neural network (CNN)-based approach for identifying defects in power insulators. The contributions of this paper are as follows:
(1) A novel power insulator defect detection method based on CNN is proposed. Unlike traditional detection techniques, this study innovatively combines the feature extraction advantages of deep learning to build an efficient binary classification model that can accurately identify defects in power insulators under complex backgrounds.
(2) To avoid the performance limitations caused by small dataset sizes, transfer learning is employed during model training to enhance the generalization ability of the model. Additionally, to further improve the model’s performance in small-sample training, Mini-batch Stochastic Gradient Descent (SGD) is used, combined with L2 regularization and dropout techniques. Mini-batch SGD stabilizes the gradient descent process and reduces fluctuations during training, while L2 regularization and dropout prevent overfitting, enhancing the model’s robustness on small datasets.
(3) A combination of Grid Search and Random Search is utilized for hyperparameter tuning. By systematically searching the key hyperparameters—learning rate, batch size, and convolution kernel size—the optimal combination is identified based on validation set performance. An Early Stopping strategy is also introduced to effectively prevent overfitting on the training set and ensure the model’s generalization performance on the validation set.
The organization of this paper is as follows: Section 2 constructs an efficient binary classification model; Section 3 introduces the convolutional neural network-based defect recognition method for power insulators; Section 4 presents the case study and simulation verification; and Section 5 concludes the paper.

2. Binary Classification Model Construction

In this paper, a CNN model uses the features of an input image to recognize the image's category and thereby identify defective insulators [18,19].
For n input images, defined as $I_{\text{total}} = \{I_1, I_2, \ldots, I_n\}$, an image without a defect is denoted $I_r$ and an image with a defect is denoted $I_v$. For p data sources, the defect-free image set $I_r$ is written as:

$$I_r = \left\{ R_{i=1}^{k=1}, R_{i=2}^{k=1}, \ldots, R_{i=x}^{k=1}, R_{i=1}^{k=2}, \ldots, R_{i=x}^{k=2}, \ldots, R_{i=x}^{k=p} \right\}$$

where $R_{i=x}^{k=p}$ denotes the set of x defect-free images in the pth dataset.
For q data sources, the defective image set $I_v$ is written as:

$$I_v = \left\{ R_{i=1}^{j=1}, R_{i=2}^{j=1}, \ldots, R_{i=z}^{j=1}, R_{i=1}^{j=2}, \ldots, R_{i=z}^{j=2}, \ldots, R_{i=z}^{j=q} \right\}$$

where $R_{i=z}^{j=q}$ denotes the set of z defective images in the qth dataset.
The label of the corresponding class can be defined as:
$$Y = \{y_1, y_2, \ldots, y_m\}$$

where m denotes the total number of images in $I_{\text{total}}$. The proposed architecture addresses a binary classification problem: an image is recognized as $I_r$ when y = 0 and as $I_v$ when y = 1.
To adapt to the classification task in this paper, each output category is mapped to a value in [0, 1] such that all class probabilities sum to 1. The Softmax function can be expressed as:

$$\mathrm{Softmax}(z_i) = \frac{e^{z_i}}{\sum_{c=1}^{C} e^{z_c}}$$
This paper uses the ReLU activation function after each convolutional layer. ReLU makes the network sparser, reduces parameter interdependence, and alleviates overfitting. The ReLU function can be expressed as:

$$z = \mathrm{ReLU}\left(\sum_{i} w_i x_i + h\right)$$
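As a quick illustration of the two functions above, the following minimal numpy sketch implements a numerically stable Softmax and a single ReLU neuron; the array values are arbitrary examples.

```python
import numpy as np

def softmax(z):
    # Numerically stable Softmax: subtracting the max does not change the
    # result but avoids overflow in the exponentials.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def relu_neuron(x, w, h):
    # Single neuron from the ReLU equation above: ReLU(sum_i w_i * x_i + h).
    return np.maximum(0.0, np.dot(w, x) + h)

logits = np.array([2.0, 1.0, 0.1])
print(softmax(logits))        # class probabilities
print(softmax(logits).sum())  # 1.0, as required for a distribution
```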

3. Convolutional Neural Network-Based Defect Recognition Method for Power Insulators

The convolutional neural network (CNN) primarily comprises three fundamental layers: a convolutional layer, a pooling layer, and a dense layer, each contributing significantly to its widespread adoption in the realm of image recognition. CNNs harness robust convolutional and pooling operations to effectively extract salient image features and seamlessly fuse them, leading to excellent recognition efficiency and performance, as documented in various studies [26,27,28,29]. Within the convolutional layer, an array of distinct convolutional kernels is convolved with the input data; a bias is then applied and the result passed through an activation function to generate multiple feature maps. This computational flow, encapsulating the core principle of convolution, is visually represented in Figure 1.
$$x_j^l = f\left(\sum_{i \in M_j} x_i^{l-1} * W_{ij}^l + b_j^l\right)$$

where $x_j^l$ is the jth feature map of layer l, $M_j$ is the set of input feature maps, $W_{ij}^l$ and $b_j^l$ are the kernel weights and bias, and $f(\cdot)$ is the activation function.
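A naive numpy sketch of this forward pass may help make the equation concrete; it loops over every kernel and spatial position with "valid" padding. Framework implementations are vectorized, but they compute the same sums.

```python
import numpy as np

def conv_layer_forward(x_prev, W, b, f=lambda v: np.maximum(0.0, v)):
    """Naive forward pass for the convolution equation above.

    x_prev : (C_in, H, W)          feature maps of layer l-1
    W      : (C_out, C_in, kH, kW) convolution kernels
    b      : (C_out,)              per-feature-map bias
    Returns (C_out, H-kH+1, W-kW+1) feature maps of layer l.
    """
    C_out, C_in, kH, kW = W.shape
    _, H, Wd = x_prev.shape
    out = np.zeros((C_out, H - kH + 1, Wd - kW + 1))
    for j in range(C_out):                    # one output map per kernel
        for r in range(out.shape[1]):
            for c in range(out.shape[2]):
                patch = x_prev[:, r:r + kH, c:c + kW]
                out[j, r, c] = np.sum(patch * W[j]) + b[j]
    return f(out)                             # apply the activation f(.)
```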
Subsequent to the convolutional layer, a flattening layer is employed to transform all feature maps into a single, one-dimensional array. The pooling layer, which receives its input from the preceding convolutional layer, plays a pivotal role in enhancing the model’s robustness and reducing the parameter count, thereby mitigating the risk of overfitting. This layer encompasses two primary variants: maximum pooling and average pooling, both aimed at performing a secondary dimensionality reduction on the feature maps to further safeguard against overfitting during training. Notably, the pooling layer typically lacks learnable parameters, simplifying the backpropagation process by requiring only the derivation of input parameters, with no need for weight updates. In the network architecture designed in this paper, the pooling layer utilizes max-pooling, selecting the maximum value within an image region as the representative pooled value for that region. The functionality of the pooling layer can be mathematically expressed as follows:
$$P_i = \mathrm{down}\left(x_j^l\right) + \vartheta_i$$

where $P_i$ is the output of the ith pooling layer, $\mathrm{down}(\cdot)$ is the pooling function, and $\vartheta_i$ is the bias vector of the ith pooling layer.
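For illustration, a minimal numpy implementation of the max-pooling operation described above (non-overlapping 2 × 2 regions, each represented by its maximum) could look like this:

```python
import numpy as np

def max_pool(feature_map, size=2):
    # feature_map: (C, H, W). Crop to a multiple of `size`, then take the
    # maximum of each size x size region, halving both spatial dimensions.
    C, H, W = feature_map.shape
    x = feature_map[:, :H - H % size, :W - W % size]
    x = x.reshape(C, x.shape[1] // size, size, x.shape[2] // size, size)
    return x.max(axis=(2, 4))
```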
The ResNet architecture revolutionizes convolutional neural networks (CNNs) by transcending depth-related limitations. Its innovative residual structure facilitates an increase in network depth while maintaining or even enhancing model performance. Prior to ResNet, CNNs were primarily constructed by stacking convolutional and pooling layers. With increasing depth, the features extracted became progressively more sophisticated and abstract, accompanied by a surge in model parameters and an augmented capacity to fit the data. While deeper CNNs generally outperform shallower ones, their ever-growing depth also leads to a substantial increase in parameters, complicating the training process and rendering it more susceptible to issues such as gradient vanishing and explosion.
To address these challenges, ResNet introduces batch normalization and a groundbreaking residual module. The residual module enables the neural network to bypass neuron connections to subsequent layers, diminishing the strong interlayer dependencies. Consequently, even as the network depth increases, the model’s performance improves rather than deteriorates, as the residual connections facilitate the effective flow of information throughout the network.
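The residual idea is straightforward to express in code. The sketch below is a generic basic residual block in Keras, not the exact block of any specific ResNet variant; the 1 × 1 projection on the shortcut is the usual way to match channel counts when they differ.

```python
from tensorflow.keras import layers

def residual_block(x, filters, kernel_size=3):
    # Two conv + batch-norm stages plus an identity shortcut, so information
    # and gradients can bypass the stacked layers.
    shortcut = x
    y = layers.Conv2D(filters, kernel_size, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv2D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)
    if shortcut.shape[-1] != filters:   # match channels with a 1x1 conv
        shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
    y = layers.Add()([y, shortcut])     # the residual connection
    return layers.Activation("relu")(y)
```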
During the training of a convolutional neural network (CNN) model, the constantly evolving parameters of each layer result in fluctuations in the probability distribution of the inputs to subsequent layers. This fluctuation triggers saturation phenomena in the positive and negative domains of the nonlinear activation function, significantly slowing down the training process. To tackle this issue, batch normalization is employed. In each iteration of stochastic gradient descent, batch normalization operates on the smallest batch of samples, ensuring that the mean of each dimension in the output is set to 0 and the variance is normalized to 1. This strategy not only circumvents the saturation problem of the nonlinear activation function but also minimizes the influence of parameters and their initial values on gradient changes, thereby fostering a more efficient and stable training procedure. Assuming that the current hidden layer is normalized, since there are d inputs, its kth dimensional output is:
$$\hat{x}_k = \frac{x_k - \mu_\beta}{\sqrt{\delta_\beta^2 + \varepsilon}}$$

where $\mu_\beta$ and $\delta_\beta^2$ are the mean and variance of the mini-batch, and $\varepsilon$ is a small constant for numerical stability.
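A minimal numpy version of this per-dimension normalization (the learnable scale and shift parameters of full batch normalization are omitted for brevity) is:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # x: (batch, d). Normalize each of the d dimensions over the mini-batch
    # to zero mean and unit variance, as in the equation above.
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)
```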
The insulator recognition process based on CNN is as follows:
First, the image is preprocessed by rescaling the image into the input array, repositioning the image (rotating, flipping horizontally and vertically), and scaling. An input image of size (480, 480, 3) is obtained.
In the first module, smaller filters preserve the high-level features of the image, so 6 (3, 3) filters are used for 2D convolution operation using Leaky ReLU as an activation function. The distribution of the input batches may vary depending on the type of images contained in the different batches, thus causing problems with the convergence of the optimizer algorithm and leading to an unstable training process. For this reason, the feature maps are batch normalized to accelerate the training convergence and reduce the dependency on weight initialization.
Then, the feature maps of size (480, 480, 6) are batch normalized and passed to the second module, which extracts deeper image features. As deeper-level features are extracted, larger feature maps increase the computational cost, so an average pooling layer of size (2, 2) is used to reduce the dimension. The second module yields a feature map of size (240, 240, 12).
The third module adopts the same architecture as the second module, differing in that it uses three consecutive convolutional layers with 32 filters of size (3, 3) and Leaky ReLU as the activation function. Batch normalization and average pooling operations are then performed, and the feature map size becomes (120, 120, 24).
The feature map of size (120, 120, 24) is passed to the fourth module, which has four consecutive convolutional layers with 64 filters of size (3, 3). Batch normalization and average pooling operations are then performed, and the feature map size becomes (60, 60, 48).
The feature map of size (60, 60, 48) is passed to the fifth module. Building on the four preceding modules, the extracted deep image features can be used to classify images as containing defective insulators. The fifth module uses larger filters, with 64 filters performing the convolution operation; the feature map size then becomes (30, 30, 96).
The feature map of size (30, 30, 96) is passed to the sixth module. Then, the batch normalization and average pooling layer operations are performed, and the feature map size becomes (15, 15, 192).
In the seventh module, after the flatten layer, half of the input units are randomly set to zero using a dropout layer with a rate of 0.5, thus avoiding overfitting the training data. The eighth module uses 32 neurons followed by a dropout layer with a rate of 0.5; the ninth module uses 16 neurons followed by a dropout layer with a rate of 0.5; and the tenth module likewise uses 16 neurons followed by a dropout layer with a rate of 0.5.
The eleventh module uses a single neuron and an output layer with a sigmoid activation function to identify whether the input image contains a defective insulator. If the output value is less than 0.5, the recognition result is no defect; otherwise, it is a defect. The insulator recognition process based on CNN is shown in Figure 2.
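Since the stated layer counts and widths do not all line up exactly (compare Table 1), the following Keras sketch is only an approximate reconstruction of the eleven modules, following the feature-map sizes given in the text; the kernel sizes and the filter width of module 5 are assumptions.

```python
from tensorflow.keras import layers, models

def build_insulator_cnn(input_shape=(480, 480, 3)):
    """Approximate sketch of the eleven-module network described above."""
    inp = layers.Input(shape=input_shape)

    def conv_block(x, filters, n_convs, pool=True):
        # n_convs Conv2D + Leaky ReLU layers, then batch norm and
        # optional (2, 2) average pooling, as in modules 1-6.
        for _ in range(n_convs):
            x = layers.Conv2D(filters, (3, 3), padding="same")(x)
            x = layers.LeakyReLU()(x)
        x = layers.BatchNormalization()(x)
        if pool:
            x = layers.AveragePooling2D((2, 2))(x)
        return x

    x = conv_block(inp, 6, 1, pool=False)  # module 1: (480, 480, 6)
    x = conv_block(x, 12, 2)               # module 2: (240, 240, 12)
    x = conv_block(x, 24, 3)               # module 3: (120, 120, 24)
    x = conv_block(x, 48, 4)               # module 4: (60, 60, 48)
    x = conv_block(x, 96, 1)               # module 5: (30, 30, 96)
    x = conv_block(x, 192, 1)              # module 6: (15, 15, 192)

    x = layers.Flatten()(x)                # module 7: 15*15*192 = 43,200
    x = layers.Dropout(0.5)(x)
    for units in (32, 16, 16):             # modules 8-10
        x = layers.Dense(units)(x)
        x = layers.LeakyReLU()(x)
        x = layers.Dropout(0.5)(x)
    out = layers.Dense(1, activation="sigmoid")(x)  # module 11: defect if >= 0.5
    return models.Model(inp, out)
```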

4. Case Study

The Chinese Power Line Insulator Dataset (CPLID) is selected for experimental validation, and the images are divided into two categories: normal insulators and defective insulators [30]. CPLID contains a total of 1184 aerial images of composite insulators. In this study, 70% of the dataset was used for training, 15% for validation, and 15% for testing. The training set was utilized to learn the model parameters, the validation set was applied to monitor model performance and adjust hyperparameters, and the test set was reserved for the final evaluation of the model’s performance.
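A stratified 70/15/15 split such as the one described can be produced in two steps. In this sketch, `images` and `labels` are placeholder arrays standing in for the 1184 CPLID samples and their binary defect flags (tiny random images are used here so the snippet runs on its own).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the CPLID images and labels.
images = np.random.rand(1184, 64, 64, 3).astype("float32")
labels = np.random.randint(0, 2, size=1184)

# 70% train, then split the remaining 30% evenly into validation and test,
# keeping the class balance with stratification.
X_train, X_tmp, y_train, y_tmp = train_test_split(
    images, labels, test_size=0.30, stratify=labels, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(
    X_tmp, y_tmp, test_size=0.50, stratify=y_tmp, random_state=42)
```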
To prevent the performance from being impacted by the small dataset size, transfer learning was employed during the training process to enhance the generalization ability of the model. Pre-trained models on large datasets (such as ImageNet) were used, allowing the features learned in the pre-training process to be transferred to the new task and fine-tuned accordingly. Specifically, DenseNet, GoogLeNet, and ResNet models, which had been pre-trained on large image classification datasets like ImageNet, were selected as the base models and fine-tuned on the CPLID dataset. This approach allowed the model to benefit from the rich feature representations obtained from large datasets, leading to improved generalization performance on the smaller dataset. The steps of transfer learning were as follows:
Step 1: Loading the pre-trained model. The model pre-trained on large datasets such as ImageNet was used as the initial set of weights.
Step 2: Freezing the convolutional layers. Most of the convolutional layers in the model were frozen to retain the general image features learned during the pre-training phase.
Step 3: Fine-tuning the model. Only the final fully connected layers were trained to adapt to the specific task of power insulator defect detection.
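The three steps above map directly onto a few lines of Keras. This is a generic sketch: the 224 × 224 input size, the head width, and the choice of the ResNet50 variant are assumptions, not the paper's exact configuration.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

# Step 1: load ImageNet-pretrained weights without the classification head.
base = ResNet50(weights="imagenet", include_top=False,
                input_shape=(224, 224, 3))

# Step 2: freeze the convolutional backbone to keep the general features.
base.trainable = False

# Step 3: train only a new fully connected head for binary defect detection.
x = layers.GlobalAveragePooling2D()(base.output)
x = layers.Dense(256, activation="relu")(x)
out = layers.Dense(1, activation="sigmoid")(x)
model = models.Model(base.input, out)
model.compile(optimizer="sgd", loss="binary_crossentropy",
              metrics=["accuracy"])
```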
To further enhance the model’s performance in small-sample training, Mini-batch Stochastic Gradient Descent (SGD) was applied, combined with L2 regularization and dropout techniques. Mini-batch SGD was used to stabilize the gradient descent process and reduce fluctuations during training, while L2 regularization and dropout were employed to prevent overfitting, thus improving the model’s robustness on small datasets.
The loss function with L2 regularization can be expressed as:
$$\gamma(\upsilon) = \gamma_0(\upsilon) + \lambda \lVert \upsilon \rVert_2^2$$

where $\gamma_0(\upsilon)$ is the original loss function, $\lambda$ is the regularization coefficient, and $\lVert \upsilon \rVert_2^2$ is the L2 regularization term.
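In Keras, this penalty is attached per layer, and mini-batch SGD is configured through the optimizer plus the fit-time batch size; the coefficient and momentum values below are assumptions for illustration.

```python
from tensorflow.keras import layers, regularizers
from tensorflow.keras.optimizers import SGD

# Adds lambda * ||w||_2^2 to the loss for this layer's weights, matching
# the regularized loss above (lambda = 1e-4 is an assumed value).
dense = layers.Dense(32, kernel_regularizer=regularizers.l2(1e-4))

# Mini-batch SGD; the mini-batch size itself is passed later via
# model.fit(..., batch_size=64).
optimizer = SGD(learning_rate=0.01, momentum=0.9)
```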
The samples of intact insulators and damaged insulators are shown in Figure 3. Model specific parameters are shown in Table 1. The experimental environment configuration is shown in Table 2. To reduce overfitting, four types of data augmentation were randomly applied: rotation (ranging from 0 to 23 degrees), horizontal flip, width offset (0–20%), and height offset (0–20%).
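The four random augmentations can be expressed, for example, with Keras' ImageDataGenerator; the ranges below follow the text.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# The four random augmentations described above.
augmenter = ImageDataGenerator(
    rotation_range=23,        # rotation between 0 and 23 degrees
    horizontal_flip=True,     # random horizontal flip
    width_shift_range=0.20,   # width offset up to 20%
    height_shift_range=0.20,  # height offset up to 20%
)
```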
Grid Search and Random Search were combined for hyperparameter tuning [31,32]. Systematic searches of three key hyperparameters (learning rate, batch size, and kernel size) were conducted to identify the optimal combination on the validation set. To further improve search efficiency, Random Search was applied to adjust some hyperparameters, expanding the search space. After each adjustment, the best hyperparameter combination was selected based on performance metrics from the validation set, such as accuracy, recall, and F1 score. The hyperparameters searched are as follows (a sketch of the combined search appears after this list):
Learning Rate: The search range is {0.001, 0.005, 0.01, 0.05}.
Batch Size: The search range is {32, 64, 128}.
Kernel Size: The search range is {3 × 3, 5 × 5, 7 × 7}.
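A compact sketch of the combined procedure follows; `evaluate_on_validation` is a hypothetical helper that would train a model with the given hyperparameters and return a validation metric such as the F1 score (stubbed here so the snippet runs on its own).

```python
import itertools
import random

def evaluate_on_validation(lr, batch, kernel):
    # Hypothetical helper: train with these hyperparameters and return a
    # validation metric (e.g., F1 score). Stubbed for illustration.
    return random.random()

grid = {
    "lr":     [0.001, 0.005, 0.01, 0.05],
    "batch":  [32, 64, 128],
    "kernel": [3, 5, 7],
}

# Grid Search: exhaustively score every combination on the validation set.
results = {}
for lr, batch, kernel in itertools.product(grid["lr"], grid["batch"],
                                           grid["kernel"]):
    results[(lr, batch, kernel)] = evaluate_on_validation(lr, batch, kernel)

# Random Search: sample additional points to widen the search space.
for _ in range(10):
    combo = (random.uniform(0.001, 0.05),
             random.choice(grid["batch"]),
             random.choice(grid["kernel"]))
    results[combo] = evaluate_on_validation(*combo)

best = max(results, key=results.get)  # best-scoring combination
print("best (lr, batch, kernel):", best)
```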
The loss function is defined as $L(\theta; D_{\mathrm{train}})$, where $\theta$ represents the model parameters learned on the training set; the hyperparameter tuning itself aimed at minimizing the loss on the validation set.

$$\theta^* = \arg\min_{\theta} L(\theta; D_{\mathrm{train}})$$
Let the update rule of the model be:
$$\theta_{t+1} = \theta_t - \eta \nabla_\theta L(\theta_t; D_{\mathrm{train}})$$
where η is the learning rate.
Based on the accuracy and loss function performance on the validation set, the optimal hyperparameter combination was selected, with a learning rate of 0.01, batch size of 64, and kernel size of 3 × 3. This combination resulted in the best model performance on the validation set.
During the training process, the Early Stopping strategy was introduced. When the loss on the validation set no longer decreased, the training process was automatically terminated. This strategy effectively prevented the model from overfitting to the training set and ensured the generalization performance on the validation set. Training is stopped once the change in the validation loss $L(\theta; D_{\mathrm{val}})$ satisfies $\frac{1}{T}\sum_{t=1}^{T} \Delta L(\theta_t; D_{\mathrm{val}}) \geq 0$, i.e., once the validation loss has, on average, stopped decreasing over the last T epochs. By employing the Early Stopping strategy, the model was able to halt the training process at an earlier stage, avoiding unnecessary computational resource waste and ensuring good generalization ability.
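The stopping rule can be implemented as a short check over the recorded validation losses. In this sketch, `validate` is a hypothetical stand-in for computing the validation loss after an epoch, the epoch budget of 100 and the window T = 5 are assumptions, and the training pass itself is elided.

```python
import random

def should_stop(val_losses, T=5):
    # Stop when the mean change in validation loss over the last T epochs
    # is non-negative, i.e. the loss has stopped decreasing on average.
    if len(val_losses) <= T:
        return False
    deltas = [val_losses[-t] - val_losses[-t - 1] for t in range(1, T + 1)]
    return sum(deltas) / T >= 0.0

def validate():
    # Hypothetical stand-in for computing L(theta_t; D_val) after an epoch.
    return random.random()

val_losses = []
for epoch in range(100):              # epoch budget (assumption)
    # ... one training pass over D_train would go here ...
    val_losses.append(validate())
    if should_stop(val_losses):
        break
```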
The DenseNet, GoogLeNet, and ResNet models are each used to recognize images containing defective insulators, and their recognition performance is compared in terms of recognition rate and recognition speed; the optimal model is determined based on the evaluation results. The default hyperparameter settings of each model are shown in Table 3.
The evaluation results of the ability of the three neural network models to recognize the validation set are shown in Table 4.
From Table 4, it can be seen that the DenseNet model achieves mean precision, recall, and F1 values across classes of 96.34%, 78.13%, and 97.68%; the GoogLeNet model 96.37%, 79.22%, and 97.73%; and the ResNet model 96.53%, 86.89%, and 97.84%. Compared with the GoogLeNet and DenseNet models, the ResNet model shows the best recognition results, indicating that it better balances precision and recall in recognizing insulator defects.
The recognition speed is taken as the average of the recognition speed of the images in the test set, and the recognition speeds of the DenseNet model, GoogLeNet model, and ResNet model are shown in Table 5.
Table 5 shows that the recognition speeds of the ResNet and GoogLeNet models are similar, taking 10.38 ms and 10.14 ms per original image, respectively, while the DenseNet model is the slowest at 17.35 ms. Considering both recognition rate and recognition speed, the ResNet model is the optimal model for recognizing defective insulators, and it meets real-time recognition requirements.
DenseNet achieves efficient feature reuse through dense connections, which allows it to perform well in small-sample training. However, due to its large number of parameters, its training speed is relatively slow, and it consumes significant computational resources. GoogLeNet utilizes the Inception module to extract multi-scale features, improving its accuracy in image classification tasks, though its ability in handling deep and complex tasks is inferior to ResNet. In contrast, ResNet, through residual networks, maintains depth while achieving higher accuracy with lower computational costs, demonstrating better generalization performance on large-scale datasets and complex scenarios. Through the above comparative analysis, the ResNet model is adopted for the identification of power insulators. The convergence curve of the loss function is shown in Figure 4, and the accuracy curve is shown in Figure 5.
Observing Figure 4, the curve initially plummets rapidly during training, signifying that the learning rate employed in this study is suitably calibrated. As the gradient descent progresses, the convergence curve stabilizes, evidencing a steady convergence of the model throughout the training phase.
Figure 5 reveals that, initially, when the iteration count is less than 25, the loss function value for the training set surpasses that of the validation set, accompanied by a lower accuracy for the training set compared to the validation set. This phenomenon suggests that the model is underfitted, necessitating an extension of the training period. As the iteration count increases, however, both the training and validation sets’ accuracy curves gradually ascend and plateau, accompanied by a decrease in loss values and a corresponding increase in accuracy. This trend underscores the model’s robust generalization capabilities.
The model is tested based on the test set, and the test results are shown in Figure 6 and Table 6.
The detection results in Figure 6 show that the proposed method can accurately detect and recognize insulator defects at different angles under complex backgrounds, achieving excellent and stable recognition results. As can be seen from Table 6, the average precision, recall, and F1 values across classes are 99.64%, 86.78%, and 99.32%, respectively, demonstrating high recognition accuracy that meets practical engineering requirements.
Faster-RCNN, as a deep learning method, can effectively handle object detection tasks in complex backgrounds. In contrast, PCA-SVM, a traditional machine learning method, provides a reference for feature extraction and classification in small-sample scenarios. In the experiment, the performances of Faster-RCNN, PCA-SVM, and the method proposed in this paper were compared in the detection of defects in power insulators. Table 7 shows the accuracy comparison for the same parameters to verify the superiority of the method in this paper.
Table 7 shows that compared with the Faster-RCNN algorithm and PCA-SVM algorithm, the proposed method in this paper is more accurate, improving by 10.98% and 15.09%, respectively. It can accurately detect and recognize insulator defects at different angles under complex backgrounds.

5. Conclusions

In this paper, we present a CNN-based approach for power insulator defect recognition, utilizing the distinctive features of input images to accurately classify insulator defects. Our simulation outcomes demonstrate that the loss function’s convergence curve stabilizes progressively during training, indicating stable model convergence. Remarkable recognition performance is achieved, with average precision, recall, and F1 scores reaching 99.64%, 99.28%, and 99.32%, respectively. This high accuracy, coupled with our efforts to minimize the model size while maintaining precision and speed, enhances the efficiency of intelligent recognition monitoring devices.
While our current work showcases promising recognition outcomes, it is focused on a limited set of insulator defects and employs a basic optimization model tailored for straightforward image classification. The extension of our approach to handle a wider range of categories or finer-grained image classification remains unexplored. Future research will focus on collecting more diverse data, improving model architectures, introducing attention mechanisms, and further enhancing the model’s robustness through techniques such as transfer learning. This will enable its applicability to other types of industrial equipment detection tasks, particularly in environments with complex backgrounds and significant noise interference.

Author Contributions

Conceptualization, N.L., D.Z., Y.Z., J.W. and B.W.; methodology, N.L., D.Z., Y.Z., J.W. and B.W.; software, N.L., D.Z., Y.Z., J.W. and B.W.; writing—original draft preparation, N.L., D.Z., Y.Z., J.W. and B.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Science and Technology Project of State Grid Hubei Electric Power Company Limited (No. SGHBJMOOFCIS2310967).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

  1. Wu, X.; Li, X.; Zhang, T.; Wu, J. Risk and Emergency System Analysis in Power Engineering. Integr. Circuit Appl. 2023, 40, 216–217. [Google Scholar]
  2. Wu, D. Research on Deep Convolutional Networks in Power Safety Production Management. Electr. Technol. Econ. 2023, 2, 19–21+35. [Google Scholar]
  3. Wu, H.; Tang, Q. Analysis and Countermeasures of Adverse Factors for Efficient and Safe Operation of Xiangjiaba Hydropower Station. People’s Yangtze River 2022, 53, 148–151. [Google Scholar]
  4. Huang, J.; Wang, Z. Design of Power Safety Production Monitoring System for Real time Data Analysis. Energy Environ. Prot. 2022, 44, 256–261. [Google Scholar]
  5. Chang, L. Exploration of the Development of Electric Power Technology and Safety in Electric Power Production. Electr. Technol. Econ. 2023, 8, 210–212. [Google Scholar]
  6. Long, Y. A Face Recognition Algorithm Based on Intermediate Layers Connected by the CNN. J. Circuits Syst. Comput. 2022, 31, 2250107. [Google Scholar] [CrossRef]
  7. Ru, Y.; Sun, Y.; Zhu, W.; Li, Y. Recognition of Handwritten Capitalized Chinese Currency Amounts Based on CNN and Finite State Automata. Comput. Eng. 2021, 47, 304–312. [Google Scholar] [CrossRef]
  8. Yu, H.; Gong, Z.; Zhang, H.; Zhou, S.; Yu, Z. Research on substation equipment identification and defect detection technology based on Faster R-CNN algorithm. Electr. Meas. Instrum. 2024, 61, 153–159. [Google Scholar]
  9. Ye, J.; Tang, H.; Chen, J.; Que, J.; Zhai, M. Design of face recognition attendance system based on OpenCV and CNN. Inf. Comput. 2022, 34, 124–127. [Google Scholar]
  10. Liu, B.; Li, X. Three-channel wavelet filter bank identification based on CNN and fusion target. Comput. Appl. Softw. 2024, 41, 209–215+285. [Google Scholar]
  11. Liu, E.; Zhi, M. Fusion of CNN and Transformer for cross-age face recognition. J. Inn. Mong. Norm. Univ. 2024, 53, 53–60. [Google Scholar]
  12. Zhu, M.; Ye, Y. Ankle motion pattern recognition based on convolutional neural network. J. Luoyang Inst. Technol. (Nat. Sci. Ed.) 2023, 33, 59–63. [Google Scholar]
  13. Li, J.; Ban, X.; Yang, G.; Li, Y.; Wang, Y. Real-time human action recognition using depth motion maps and convolutional neural networks. Int. J. High Perform. Comput. Netw. 2019, 13, 312–320. [Google Scholar] [CrossRef]
  14. Gao, C.; Tang, C.; Tong, A.; Wang, W. Infrared human behavior recognition based on hybrid CNN and LSTM model. J. Hefei Coll. (Gen. Ed.) 2023, 40, 77–85. [Google Scholar]
  15. Liu, H.; Lin, H.; Chu, J.; Ye, J.; Lin, Q. GIS partial discharge mapping identification based on CWGAN-div and Mi-CNN. Zhejiang Electr. Power 2023, 42, 75–83. [Google Scholar]
  16. Wang, Q.; Tian, J.; Li, M.; Lu, M. Text Classification Based on CNN-BiGRU and Its Application in Telephone Comments Recognition. Int. J. Comput. Intell. Appl. 2023, 22, 2350021. [Google Scholar] [CrossRef]
  17. Zheng, Q.; Wang, R.; Tian, X.; Yu, Z.; Wang, H.; Elhanashi, A.; Saponara, S. A real-time transformer discharge pattern recognition method based on CNN-LSTM driven by few-shot learning. Electr. Power Syst. Res. 2023, 219, 109241. [Google Scholar] [CrossRef]
  18. Dong, B.; Yang, J.; Ma, Y.; Zhang, X. Medical monitoring model of internet of things based on the adaptive threshold difference algorithm. Int. J. Multimed. Ubiquitous Eng. 2016, 11, 75–82. [Google Scholar] [CrossRef]
  19. Jiang, X.; Lang, Q. An improved wavelet threshold denoising method for health monitoring data: A case study of the Hong Kong-Zhuhai-Macao Bridge immersed tunnel. Appl. Sci. 2022, 12, 6743. [Google Scholar] [CrossRef]
  20. Zhao, D.; Cai, W.; Cui, L. Adaptive thresholding and coordinate attention-based tree-inspired network for aero-engine bearing health monitoring under strong noise. Adv. Eng. Inform. 2024, 61, 102559. [Google Scholar] [CrossRef]
  21. Liang, J.; Zhang, L.; Han, C. A Collaborative Compression Scheme for Fast Activity Recognition on Mobile Devices via Global Compression Ratio Decision. IEEE Trans. Mob. Comput. 2024, 23, 3259–3273. [Google Scholar] [CrossRef]
  22. Bu, C.; Zhang, L.; Cui, H.; Yang, G.; Wu, H. Dynamic Inference via Localizing Semantic Intervals in Sensor Data for Budget-Tunable Activity Recognition. IEEE Trans. Ind. Inform. 2024, 20, 3801–3813. [Google Scholar] [CrossRef]
  23. Cheng, D.; Zhang, L.; Qin, L.; Wang, S.; Wu, H.; Song, A. MaskCAE: Masked Convolutional AutoEncoder via Sensor Data Reconstruction for Self-Supervised Human Activity Recognition. IEEE J. Biomed. Health Inform. 2024, 28, 2687–2698. [Google Scholar] [CrossRef] [PubMed]
  24. Xu, S.; Zhang, L.; Tang, Y.; Han, C.; Wu, H.; Song, A. Channel Attention for Sensor-Based Activity Recognition: Embedding Features into all Frequencies in DCT Domain. IEEE Trans. Knowl. Data Eng. 2023, 35, 12497–12512. [Google Scholar] [CrossRef]
  25. Wang, K.; He, J.; Zhang, L. Attention-Based Convolutional Neural Network for Weakly Labeled Human Activities’ Recognition with Wearable Sensors. IEEE Sens. J. 2019, 19, 7598–7604. [Google Scholar] [CrossRef]
  26. Shang, D.; Zhao, X.; Li, L.; Liu, W. Non-line-of-sight recognition method based on ICEEMDAN-CNN. Electron. Meas. Technol. 2023, 46, 61–67. [Google Scholar]
  27. Liu, T.; Duan, Y. Fusion of CNN and Transformer for robot indoor scene recognition. J. Electron. Meas. Instrum. 2023, 37, 223–229. [Google Scholar]
  28. Hu, J.; Shi, M.; Jiao, C.; Shen, Z. CNN-based acoustic pattern recognition method for flat wave reactors. Zhejiang Electr. Power 2023, 42, 88–94. [Google Scholar]
  29. Jiang, S.; Yao, K.; Chen, L.; Wang, Z.; Guo, F. Face recognition method of mask based on hybrid model of CNN and Transformer. Sens. Microsyst. 2023, 42, 144–148. [Google Scholar]
  30. Song, X.; Jiang, M.; Zhou, Y.; Ji, J.; Lu, X. Improved Faster R-CNN for device switch image recognition. Comput. Syst. Appl. 2022, 31, 211–224. [Google Scholar]
  31. Hao, L. Research on FaceNet algorithm face image recognition based on CNN. Intell. Comput. Appl. 2022, 12, 130–135+143. [Google Scholar]
  32. Zhang, L. Research on Remote Sensing Image Near-Shore Ship Target Detection and Recognition Based on Convolutional Neural Network. Ph.D. Thesis, Huazhong University of Science and Technology, Wuhan, China, 2022. [Google Scholar]
  33. Wang, Y.; Xu, B.; Zu, F. Video action recognition system based on HOF-CNN and HOG features. Comput. Simul. 2022, 39, 179–182+318. [Google Scholar]
  34. Tao, X.; Zhang, D.; Wang, Z.; Liu, X.; Zhang, H.; Xu, D. Detection of power line insulator defects using aerial images analyzed with convolutional neural networks. IEEE Trans. Syst. Man Cybern. Syst. 2018, 50, 1486–1498. [Google Scholar] [CrossRef]
Figure 1. Schematic diagram of convolution.
Figure 2. Insulator recognition process based on convolutional neural networks.
Figure 3. Typical insulator image samples.
Figure 4. Loss function variation curve.
Figure 5. Accuracy variation curve.
Figure 6. Insulator target recognition results.
Table 1. Model-specific parameters.

Layer | Type | Output Dimension
1 | Input (Convolution 2-D) | (480 × 480 × 1)
2 | Number of arguments | (480 × 480 × 6)
3 | Convolution 2-D | (480 × 480 × 12)
4 | Convolution 2-D | (480 × 480 × 12)
5 | Batch normalization | (480 × 480 × 12)
6 | Average pooling 2-D | (240 × 240 × 12)
7 | Convolution 2-D | (240 × 240 × 24)
8 | Convolution 2-D | (240 × 240 × 24)
9 | Convolution 2-D | (240 × 240 × 24)
10 | Batch normalization | (240 × 240 × 24)
11 | Average pooling 2-D | (120 × 120 × 24)
12 | Convolution 2-D | (120 × 120 × 48)
13 | Convolution 2-D | (120 × 120 × 48)
14 | Convolution 2-D | (120 × 120 × 48)
15 | Convolution 2-D | (120 × 120 × 48)
16 | Batch normalization | (120 × 120 × 48)
17 | Average pooling 2-D | (60 × 60 × 48)
18 | Convolution 2-D | (60 × 60 × 48)
19 | Batch normalization | (30 × 30 × 96)
20 | Maximum pooling 2-D | (30 × 30 × 96)
21 | Convolution 2-D | (30 × 30 × 96)
22 | Batch normalization | (15 × 15 × 192)
23 | Maximum pooling 2-D | (15 × 15 × 192)
24 | Flat layer | (43,200)
25 | Discard layer | (43,200)
26 | Fully connected layer | (6400)
27 | Leaky ReLU activation function | (6400)
28 | Discard layer | (3200)
29 | Fully connected layer | (1600)
30 | Leaky ReLU activation function | (32)
31 | Discard layer | (32)
32 | Fully connected layer | (32)
33 | Leaky ReLU activation function | (16)
34 | Discard layer | (16)
35 | Fully connected layer | (1)
Table 2. Experimental environment.

Hard and Soft Items | Details
CPU | Intel(R) Core(TM) i5-7300HQ CPU @ 2.50 GHz
MEM | 128 GB
Hard Disk | 500 GB
Operating System | Ubuntu 18.04
GPU Model | NVIDIA GeForce GTX 1050
Table 3. Hyperparameters used for training the neural networks.

Neural Network | Optimizer | Base Learning Rate | Learning Rate Policy | Batch Size | Training Epochs
DenseNet | SGD | 0.0008 | LambdaLR | 16 | 100
GoogLeNet | Adam | 0.0003 | StepLR | 16 | 100
ResNet | Adam | 0.0001 | StepLR | 16 | 100
Table 4. The evaluation results of the ability of the three neural network models with the validation set.

Neural Network | Category | Precision/% | Recall Rate/% | F1 Value [33]/%
DenseNet | normal insulator | 93.43 | 79.34 | 96.74
DenseNet | defective insulator | 99.24 | 76.91 | 98.62
DenseNet | Average value | 96.34 | 78.13 | 97.68
GoogLeNet | normal insulator | 93.43 | 78.43 | 96.79
GoogLeNet | defective insulator | 99.31 | 80.00 | 98.66
GoogLeNet | Average value | 96.37 | 79.22 | 97.73
ResNet | normal insulator | 93.72 | 88.63 | 96.96
ResNet | defective insulator | 99.34 | 85.15 | 98.72
ResNet | Average value | 96.53 | 86.89 | 97.84
Table 5. Recognition speed of different models.

Neural Network | Batch Size | Image Calculations | Recognition Speed/ms
DenseNet | 48 | 1000 | 17.35
GoogLeNet | 48 | 1000 | 10.14
ResNet | 48 | 1000 | 10.38
Table 6. Validation results for the test set.

Category | Precision/% | Recall Rate/% | F1 Value [34]/%
normal insulator | 99.89 | 87.49 | 99.87
defective insulator | 99.39 | 86.06 | 98.77
Average value | 99.64 | 86.78 | 99.32
Table 7. Algorithm comparison.

Algorithm | Accuracy
Faster-RCNN | 88.7%
PCA-SVM | 84.6%
Method proposed in this paper | 99.64%
