Unmanned aerial vehicles for human detection and recognition using neural-network model

Yawar Abbas; Naif Al Mudawi; Bayan Alabdullah; Touseef Sadiq; Asaad Algarni; Hameedur Rahman; Ahmad Jalal

doi:10.3389/fnbot.2024.1443678

Front Neurorobot. 2024 Dec 4:18:1443678. doi: 10.3389/fnbot.2024.1443678. eCollection 2024.

Authors

Yawar Abbas¹, Naif Al Mudawi², Bayan Alabdullah³, Touseef Sadiq⁴, Asaad Algarni⁵, Hameedur Rahman¹, Ahmad Jalal¹ ⁶

Affiliations

¹ Faculty of Computer Science and AI, Air University, Islamabad, Pakistan.
² Department of Computer Science, College of Computer Science and Information System, Najran University, Najran, Saudi Arabia.
³ Department of Information Systems, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia.
⁴ Department of Information and Communication Technology, Centre for Artificial Intelligence Research, University of Agder, Grimstad, Norway.
⁵ Department of Computer Sciences, Faculty of Computing and Information Technology, Northern Border University, Rafha, Saudi Arabia.
⁶ Department of Computer Science and Engineering, College of Informatics, Korea University, Seoul, Republic of Korea.

Abstract

Introduction: Recognizing human actions is crucial for allowing machines to understand human behavior, with applications spanning video-based surveillance systems, human-robot collaboration, sports analysis systems, and entertainment. The immense diversity in human movement and appearance poses a significant challenge in this field, especially when dealing with drone-recorded RGB videos. Factors such as dynamic backgrounds, motion blur, occlusions, varying capture angles, and exposure issues greatly complicate recognition tasks.

Methods: In this study, we propose a method that addresses these challenges in RGB videos captured by drones. Our approach begins by splitting the video into individual frames, followed by preprocessing steps applied to these RGB frames. The preprocessing aims to reduce computational cost, improve image quality, and enhance foreground objects while removing the background.
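The abstract does not specify how the preprocessing is implemented, so the following is only a minimal sketch of the general idea, assuming a grayscale conversion followed by simple differencing against a known background frame. The function names, the nested-list frame representation, and the threshold value are hypothetical choices for illustration, not the authors' implementation.

```python
# Illustrative sketch only: the source does not state which preprocessing
# operations are used. Frames are nested lists of (r, g, b) tuples.

def to_grayscale(frame):
    """Convert an RGB frame to grayscale using the ITU-R BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]


def foreground_mask(gray, background, threshold=30.0):
    """Return 1 where a pixel differs from the background by more than
    `threshold` (an assumed value), else 0."""
    return [[1 if abs(p - b) > threshold else 0
             for p, b in zip(row, bg_row)]
            for row, bg_row in zip(gray, background)]


def apply_mask(gray, mask):
    """Keep foreground intensities and zero out background pixels."""
    return [[p if m else 0.0 for p, m in zip(row, mask_row)]
            for row, mask_row in zip(gray, mask)]


# Toy 2x2 frame with one bright "foreground" pixel on a dark background.
frame = [[(0, 0, 0), (255, 255, 255)],
         [(10, 10, 10), (0, 0, 0)]]
background = [[0.0, 0.0], [0.0, 0.0]]

gray = to_grayscale(frame)
mask = foreground_mask(gray, background)
silhouette = apply_mask(gray, mask)
```

Working on a single grayscale channel instead of three color channels is one way such a step can cut computation, and the masked result resembles the grayscale silhouette that later stages of a pipeline like this would consume.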

Result: Preprocessing improves the visibility of foreground objects and suppresses background noise. Next, we employ the YOLOv9 detection algorithm to identify human bodies within the images. From the grayscale silhouette, we extract the human skeleton and identify 15 key points: the head, neck, and belly button, plus the left and right shoulders, elbows, wrists, hips, knees, and ankles. From these points, we extract specific positions and the angular and distance relationships between them, as well as 3D point clouds and fiducial points. Subsequently, we optimize these features using kernel discriminant analysis (KDA), followed by classification using a deep convolutional neural network (CNN). To validate our system, we conducted experiments on three benchmark datasets: UAV-Human, UCF, and Drone-Action.
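To make the angular and distance relationships between key points concrete, here is a minimal sketch assuming 2D keypoint coordinates. The keypoint names, the choice of all pairwise distances, and the three-point angle definition are illustrative assumptions; the abstract does not say which pairs or triples the authors actually use.

```python
import math
from itertools import combinations

# Hypothetical names for the 15 key points listed in the abstract.
KEYPOINTS = [
    "head", "neck", "belly_button",
    "l_shoulder", "r_shoulder", "l_elbow", "r_elbow",
    "l_wrist", "r_wrist", "l_hip", "r_hip",
    "l_knee", "r_knee", "l_ankle", "r_ankle",
]


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of keypoints.

    `points` maps a keypoint name to an (x, y) tuple. Keys of the result
    are name pairs in alphabetical order.
    """
    return {(a, b): math.dist(points[a], points[b])
            for a, b in combinations(sorted(points), 2)}


def joint_angle(a, b, c):
    """Angle in degrees at vertex b between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = math.hypot(*v1), math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate case: two of the points coincide
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


# Toy skeleton: keypoints placed along a line, then two moved for the demo.
points = {name: (float(i), 0.0) for i, name in enumerate(KEYPOINTS)}
points["head"] = (0.0, 0.0)
points["neck"] = (3.0, 4.0)

distances = pairwise_distances(points)
elbow_angle = joint_angle((0.0, 1.0), (0.0, 0.0), (1.0, 0.0))
```

With 15 key points there are 105 unordered pairs, so even this simple scheme yields a fairly large feature vector per frame, which is the kind of input a dimensionality-reduction step such as KDA would then compress before classification.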

Discussion: On these datasets, our proposed model achieved action recognition accuracies of 0.68, 0.75, and 0.83, respectively.

Keywords: convolutional neural networks (CNNs); decision-making processes; neural network; sequential data processing; unmanned aerial vehicles; unmanned aerial vehicles neural network.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This research was supported by the Deanship of Scientific Research at Najran University, under the Research Group Funding program grant code (NU/PG/SERC/13/30). The authors extend their appreciation to the Deanship of Scientific Research at Northern Border University, Arar, KSA for funding this research work through the project number "NBU-FFR-2024-231-08". This work was also supported by the Princess Nourah Bint Abdulrahman University Researchers Supporting Project number (PNURSP2024R440), Princess Nourah Bint Abdulrahman University, Riyadh, Saudi Arabia.