-
Dynamic Subgoal based Path Formation and Task Allocation: A NeuroFleets Approach to Scalable Swarm Robotics
Authors:
Robinroy Peter,
Lavanya Ratnabala,
Eugene Yugarajah Andrew Charles,
Dzmitry Tsetserukou
Abstract:
This paper addresses the challenges of exploration and navigation in unknown environments from the perspective of evolutionary swarm robotics. A key focus is on path formation, which is essential for enabling cooperative swarm robots to navigate effectively. We designed the task allocation and path formation process based on a finite state machine, ensuring systematic decision-making and efficient state transitions. The approach is decentralized, allowing each robot to make decisions independently based on local information, which enhances scalability and robustness. We present a novel subgoal-based path formation method that establishes paths between locations by leveraging visually connected subgoals. Simulation experiments conducted in the Argos simulator show that this method successfully forms paths in the majority of trials. However, inter-collision (traffic) among numerous robots during path formation can negatively impact performance. To address this issue, we propose a task allocation strategy that uses local communication protocols and light signal-based communication to manage robot deployment. This strategy assesses the distance between points and determines the optimal number of robots needed for the path formation task, thereby reducing unnecessary exploration and traffic congestion. The performance of both the subgoal-based path formation method and the task allocation strategy is evaluated by comparing the path length, time, and resource usage against the A* algorithm. Simulation results demonstrate the effectiveness of our approach, highlighting its scalability, robustness, and fault tolerance.
Submitted 1 September, 2024;
originally announced September 2024.
-
HyperSurf: Quadruped Robot Leg Capable of Surface Recognition with GRU and Real-to-Sim Transferring
Authors:
Sergei Satsevich,
Yaroslav Savotin,
Danil Belov,
Elizaveta Pestova,
Artem Erhov,
Batyr Khabibullin,
Artem Bazhenov,
Vyacheslav Kovalev,
Aleksey Fedoseev,
Dzmitry Tsetserukou
Abstract:
This paper introduces a system of data collection acceleration and real-to-sim transferring for surface recognition on a quadruped robot. The system features a mechanical single-leg setup capable of stepping on various easily interchangeable surfaces. Additionally, it incorporates a GRU-based Surface Recognition System, inspired by the system detailed in the Dog-Surf paper. This setup facilitates the expansion of dataset collection for model training, enabling data acquisition from hard-to-reach surfaces in laboratory conditions. Furthermore, it opens avenues for transferring surface properties from reality to simulation, thereby allowing the training of optimal gaits for legged robots in simulation environments using a pre-prepared library of digital twins of surfaces. Moreover, enhancements have been made to the GRU-based Surface Recognition System, allowing for the integration of data from both the quadruped robot and the single-leg setup. The dataset and code have been made publicly available.
Submitted 19 August, 2024; v1 submitted 22 July, 2024;
originally announced July 2024.
-
AirNeRF: 3D Reconstruction of Human with Drone and NeRF for Future Communication Systems
Authors:
Alexey Kotcov,
Maria Dronova,
Vladislav Cheremnykh,
Sausar Karaf,
Dzmitry Tsetserukou
Abstract:
In the rapidly evolving landscape of digital content creation, the demand for fast, convenient, and autonomous methods of crafting detailed 3D reconstructions of humans has grown significantly. Addressing this pressing need, our AirNeRF system presents an innovative pathway to the creation of a realistic 3D human avatar. Our approach leverages Neural Radiance Fields (NeRF) with an automated drone-based video capturing method. The acquired data provides a swift and precise way to create high-quality human body reconstructions following several stages of our system. The rigged mesh derived from our system proves to be an excellent foundation for free-view synthesis of dynamic humans, particularly well-suited for the immersive experiences within gaming and virtual reality.
Submitted 15 July, 2024;
originally announced July 2024.
-
OmniRace: 6D Hand Pose Estimation for Intuitive Guidance of Racing Drone
Authors:
Valerii Serpiva,
Aleksey Fedoseev,
Sausar Karaf,
Ali Alridha Abdulkarim,
Dzmitry Tsetserukou
Abstract:
This paper presents the OmniRace approach to controlling a racing drone with 6-degree-of-freedom (DoF) hand pose estimation and gesture recognition. To our knowledge, it is the first-ever technology that allows for low-level control of high-speed drones using gestures. OmniRace employs a gesture interface based on computer vision and a deep neural network to estimate a 6-DoF hand pose. The advanced machine learning algorithm robustly interprets human gestures, allowing users to control drone motion intuitively. Real-time control of a racing drone demonstrates the effectiveness of the system, validating its potential to revolutionize drone racing and other applications. Experimental results conducted in the Gazebo simulation environment revealed that OmniRace allows the users to complete the UAV race track significantly (by 25.1%) faster and to decrease the length of the test drone path (from 102.9 to 83.7 m). Users preferred the gesture interface for attractiveness (1.57 UEQ score), hedonic quality (1.56 UEQ score), and lower perceived temporal demand (32.0 score in NASA-TLX), while noting the high efficiency (0.75 UEQ score) and low physical demand (19.0 score in NASA-TLX) of the baseline remote controller. The deep neural network attains an average accuracy of 99.75% when applied to both normalized datasets and raw datasets. OmniRace can potentially change the way humans interact with and navigate racing drones in dynamic and complex environments. The source code is available at https://github.com/SerValera/OmniRace.git.
Submitted 16 July, 2024; v1 submitted 13 July, 2024;
originally announced July 2024.
-
MorphoMove: Bi-Modal Path Planner with MPC-based Path Follower for Multi-Limb Morphogenetic UAV
Authors:
Muhammad Ahsan Mustafa,
Yasheerah Yaqoot,
Mikhail Martynov,
Sausar Karaf,
Dzmitry Tsetserukou
Abstract:
This paper discusses developments for a multi-limb morphogenetic UAV, MorphoGear, that is capable of both aerial flight and ground locomotion. A hybrid path planning algorithm based on the A* strategy has been developed, enabling seamless transition between aerial and ground navigation modes, thereby enhancing the robot's mobility in complex environments. Moreover, precise path following is achieved during ground locomotion with a Model Predictive Control (MPC) architecture for its novel walking behaviour. Experimental validation was conducted in the Unity simulation environment utilizing Python scripts to compute control values. The algorithm's performance is validated by a Root Mean Squared Error (RMSE) of 0.91 cm and a maximum error of 1.85 cm. These developments highlight the adaptability of MorphoGear in navigation through cluttered environments, establishing it as a usable tool in autonomous exploration, both aerial and ground-based.
Submitted 21 August, 2024; v1 submitted 12 July, 2024;
originally announced July 2024.
-
GazeRace: Revolutionizing Remote Piloting with Eye-Gaze Control
Authors:
Issatay Tokmurziyev,
Valerii Serpiva,
Alexey Fedoseev,
Miguel Altamirano Cabrera,
Dzmitry Tsetserukou
Abstract:
This paper presents GazeRace, a novel system that leverages eye-tracking technology for intuitive drone control. Using the MediaPipe library, the system translates eye movements into precise drone commands, enabling effective remote piloting. In testing, GazeRace demonstrated an 18% reduction in drone trajectory length while maintaining competitive speed with traditional controls. The results suggest that this approach enhances control accuracy and reduces user frustration, offering a significant advancement in the field of human-computer interaction and drone navigation.
Submitted 21 August, 2024; v1 submitted 12 July, 2024;
originally announced July 2024.
-
TornadoDrone: Bio-inspired DRL-based Drone Landing on 6D Platform with Wind Force Disturbances
Authors:
Robinroy Peter,
Lavanya Ratnabala,
Demetros Aschu,
Aleksey Fedoseev,
Dzmitry Tsetserukou
Abstract:
Autonomous drone navigation faces a critical challenge in achieving accurate landings on dynamic platforms, especially under unpredictable conditions such as wind turbulence. Our research introduces TornadoDrone, a novel Deep Reinforcement Learning (DRL) model that adopts bio-inspired mechanisms to adapt to wind forces, mirroring the natural adaptability seen in birds. This model, unlike traditional approaches, derives its adaptability from indirect cues such as changes in position and velocity, rather than direct wind force measurements. TornadoDrone was rigorously trained in the gym-pybullet-drone simulator, which closely replicates the complexities of wind dynamics in the real world. Through extensive testing with Crazyflie 2.1 drones in both simulated and real windy conditions, TornadoDrone demonstrated a high performance in maintaining high-precision landing accuracy on moving platforms, surpassing conventional control methods such as PID controllers with Extended Kalman Filters. The study not only highlights the potential of DRL to tackle complex aerodynamic challenges but also paves the way for advanced autonomous systems that can adapt to environmental changes in real-time. The success of TornadoDrone signifies a leap forward in drone technology, particularly for critical applications such as surveillance and emergency response, where reliability and precision are paramount.
Submitted 25 June, 2024; v1 submitted 23 June, 2024;
originally announced June 2024.
-
MARLander: A Local Path Planning for Drone Swarms using Multiagent Deep Reinforcement Learning
Authors:
Demetros Aschu,
Robinroy Peter,
Sausar Karaf,
Aleksey Fedoseev,
Dzmitry Tsetserukou
Abstract:
Achieving safe and precise landings for a swarm of drones poses a significant challenge, primarily attributed to conventional control and planning methods. This paper presents the implementation of multi-agent deep reinforcement learning (MADRL) techniques for the precise landing of a drone swarm at relocated target locations. The system is trained in a realistic simulated environment with a maximum velocity of 3 m/s in training spaces of 4 x 4 x 4 m and deployed utilizing Crazyflie drones with a Vicon indoor localization system. The experimental results revealed that the proposed approach achieved a landing accuracy of 2.26 cm on stationary and 3.93 cm on moving platforms surpassing a baseline method used with a Proportional-integral-derivative (PID) controller with an Artificial Potential Field (APF). This research highlights drone landing technologies that eliminate the need for analytical centralized systems, potentially offering scalability and revolutionizing applications in logistics, safety, and rescue missions.
Submitted 6 June, 2024;
originally announced June 2024.
-
FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention
Authors:
Ziang Guo,
Zakhar Yagudin,
Selamawit Asfaw,
Artem Lykov,
Dzmitry Tsetserukou
Abstract:
Camera, LiDAR and radar are common perception sensors for autonomous driving tasks. Robust prediction of 3D object detection is optimally based on the fusion of these sensors. Exploiting their abilities wisely remains a challenge because each of these sensors has its own characteristics. In this paper, we propose FADet, a multi-sensor 3D detection network, which specifically studies the characteristics of different sensors based on our local featured attention modules. For camera images, we propose a dual-attention-based sub-module. For LiDAR point clouds, a triple-attention-based sub-module is utilized, while a mixed-attention-based sub-module is applied to features of radar points. With local featured attention sub-modules, our FADet has effective detection results in long-tail and complex scenes from camera, LiDAR and radar input. On the NuScenes validation dataset, FADet achieves state-of-the-art performance on LiDAR-camera object detection tasks with 71.8% NDS and 69.0% mAP and, at the same time, on radar-camera object detection tasks with 51.7% NDS and 40.3% mAP. Code will be released at https://github.com/ZionGo6/FADet.
Submitted 19 May, 2024;
originally announced May 2024.
-
VR-GPT: Visual Language Model for Intelligent Virtual Reality Applications
Authors:
Mikhail Konenkov,
Artem Lykov,
Daria Trinitatova,
Dzmitry Tsetserukou
Abstract:
The advent of immersive Virtual Reality applications has transformed various domains, yet their integration with advanced artificial intelligence technologies like Visual Language Models remains underexplored. This study introduces a pioneering approach utilizing VLMs within VR environments to enhance user interaction and task efficiency. Leveraging the Unity engine and a custom-developed VLM, our system facilitates real-time, intuitive user interactions through natural language processing, without relying on visual text instructions. The incorporation of speech-to-text and text-to-speech technologies allows for seamless communication between the user and the VLM, enabling the system to guide users through complex tasks effectively. Preliminary experimental results indicate that utilizing VLMs not only reduces task completion times but also improves user comfort and task engagement compared to traditional VR interaction methods.
Submitted 3 August, 2024; v1 submitted 19 May, 2024;
originally announced May 2024.
-
GrainGrasp: Dexterous Grasp Generation with Fine-grained Contact Guidance
Authors:
Fuqiang Zhao,
Dzmitry Tsetserukou,
Qian Liu
Abstract:
One goal of dexterous robotic grasping is to allow robots to handle objects with the same level of flexibility and adaptability as humans. However, it remains a challenging task to generate an optimal grasping strategy for dexterous hands, especially when it comes to delicate manipulation and accurate adjustment of the desired grasping poses for objects of varying shapes and sizes. In this paper, we propose a novel dexterous grasp generation scheme called GrainGrasp that provides fine-grained contact guidance for each fingertip. In particular, we employ a generative model to predict separate contact maps for each fingertip on the object point cloud, effectively capturing the specifics of finger-object interactions. In addition, we develop a new dexterous grasping optimization algorithm that solely relies on the point cloud as input, eliminating the necessity for complete mesh information of the object. By leveraging the contact maps of different fingertips, the proposed optimization algorithm can generate precise and determinable strategies for human-like object grasping. Experimental results confirm the efficiency of the proposed scheme.
Submitted 15 May, 2024; v1 submitted 15 May, 2024;
originally announced May 2024.
-
Bi-VLA: Vision-Language-Action Model-Based System for Bimanual Robotic Dexterous Manipulations
Authors:
Koffivi Fidèle Gbagbe,
Miguel Altamirano Cabrera,
Ali Alabbas,
Oussama Alyunes,
Artem Lykov,
Dzmitry Tsetserukou
Abstract:
This research introduces the Bi-VLA (Vision-Language-Action) model, a novel system designed for bimanual robotic dexterous manipulation that seamlessly integrates vision for scene understanding, language comprehension for translating human instructions into executable code, and physical action generation. We evaluated the system's functionality through a series of household tasks, including the preparation of a desired salad upon human request. Bi-VLA demonstrates the ability to interpret complex human instructions, perceive and understand the visual context of ingredients, and execute precise bimanual actions to prepare the requested salad. We assessed the system's performance in terms of accuracy, efficiency, and adaptability to different salad recipes and human preferences through a series of experiments. Our results show a 100% success rate in generating the correct executable code by the Language Module, a 96.06% success rate in detecting specific ingredients by the Vision Module, and an overall success rate of 83.4% in correctly executing user-requested tasks.
Submitted 19 August, 2024; v1 submitted 9 May, 2024;
originally announced May 2024.
-
Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes
Authors:
Ziang Guo,
Artem Lykov,
Zakhar Yagudin,
Mikhail Konenkov,
Dzmitry Tsetserukou
Abstract:
Recent research about Large Language Model based autonomous driving solutions shows a promising picture in planning and control fields. However, heavy computational resources and hallucinations of Large Language Models continue to hinder the tasks of predicting precise trajectories and instructing control signals. To address this problem, we propose Co-driver, a novel autonomous driving assistant system to empower autonomous vehicles with adjustable driving behaviors based on the understanding of road scenes. A pipeline involving the CARLA simulator and Robot Operating System 2 (ROS2) verifying the effectiveness of our system is presented, utilizing a single Nvidia 4090 24G GPU while exploiting the capacity of textual output of the Visual Language Model. Besides, we also contribute a dataset containing an image set and a corresponding prompt set for fine-tuning the Visual Language Model module of our system. In the real-world driving dataset, our system achieved 96.16% success rate in night scenes and 89.7% in gloomy scenes regarding reasonable predictions. Our Co-driver dataset will be released at https://github.com/ZionGo6/Co-driver.
Submitted 9 May, 2024;
originally announced May 2024.
-
FlockGPT: Guiding UAV Flocking with Linguistic Orchestration
Authors:
Artem Lykov,
Sausar Karaf,
Mikhail Martynov,
Valerii Serpiva,
Aleksey Fedoseev,
Mikhail Konenkov,
Dzmitry Tsetserukou
Abstract:
This article presents the world's first rapid drone flocking control using natural language through generative AI. The described approach enables the intuitive orchestration of a flock of any size to achieve the desired geometry. The key feature of the method is the development of a new interface based on Large Language Models to communicate with the user and to generate the target geometry descriptions. Users can interactively modify or provide comments during the construction of the flock geometry model. By combining flocking technology and defining the target surface using a signed distance function, smooth and adaptive movement of the drone swarm between target states is achieved.
Our user study on FlockGPT confirmed a high level of intuitive control over drone flocking by users. Subjects who had never previously controlled a swarm of drones were able to construct complex figures in just a few iterations and were able to accurately distinguish the formed swarm drone figures. The results revealed a high recognition rate for six different geometric patterns generated through the LLM-based interface and performed by a simulated drone flock (mean of 80% with a maximum of 93% for cube and tetrahedron patterns). Users commented on low temporal demand (19.2 score in NASA-TLX), high performance (26 score in NASA-TLX), attractiveness (1.94 UEQ score), and hedonic quality (1.81 UEQ score) of the developed system. The FlockGPT demo code repository can be found at: coming soon
Submitted 9 May, 2024;
originally announced May 2024.
-
Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning
Authors:
Artem Lykov,
Miguel Altamirano Cabrera,
Koffivi Fidèle Gbagbe,
Dzmitry Tsetserukou
Abstract:
This paper presents the development of a novel ethical reasoning framework for robots. "Robots Can Feel" is the first system for robots that utilizes a combination of logic and human-like emotion simulation to make decisions in morally complex situations akin to humans. The key feature of the approach is the management of the Emotion Weight Coefficient - a customizable parameter to assign the role of emotions in robot decision-making. The system aims to serve as a tool that can equip robots of any form and purpose with ethical behavior close to human standards. Besides the platform, the system is independent of the choice of the base model. During the evaluation, the system was tested on 8 top up-to-date LLMs (Large Language Models). This list included both commercial and open-source models developed by various companies and countries. The research demonstrated that regardless of the model choice, the Emotion Weight Coefficient influences the robot's decision similarly. According to ANOVA analysis, the use of different Emotion Weight Coefficients influenced the final decision in a range of situations, such as in a request for a dietary violation, F(4, 35) = 11.2, p = 0.0001, and in an animal compassion situation, F(4, 35) = 8.5441, p = 0.0001. A demonstration code repository is provided at: https://github.com/TemaLykov/robots_can_feel
Submitted 9 May, 2024;
originally announced May 2024.
-
MoveTouch: Robotic Motion Capturing System with Wearable Tactile Display to Achieve Safe HRI
Authors:
Ali Alabbas,
Miguel Altamirano Cabrera,
Mohamed Sayed,
Oussama Alyounes,
Qian Liu,
Dzmitry Tsetserukou
Abstract:
The collaborative robot market is flourishing as there is a trend towards simplification, modularity, and increased flexibility on the production line. But when humans and robots are collaborating in a shared environment, the safety of humans should be a priority. We introduce a novel wearable robotic system to enhance safety during Human-Robot Interaction (HRI). The proposed wearable robot is designed to hold a fiducial marker and maintain its visibility to a motion capture system, which, in turn, localizes the user's hand with good accuracy and low latency and provides vibrotactile feedback to the user's wrist. The vibrotactile feedback guides the user's hand movement during collaborative tasks in order to increase safety and enhance collaboration efficiency. A user study was conducted to assess the recognition and discriminability of ten designed vibration patterns applied to the upper (dorsal) and the down (volar) parts of the user's wrist. The results show that the pattern recognition rate on the volar side was higher, with an average of 75.64% among all users. Four patterns with a high recognition rate were chosen to be incorporated into our system. A second experiment was carried out to evaluate users' response to the chosen patterns in real-world collaborative tasks. Results show that all participants responded to the patterns correctly, and the average response time for the patterns was between 0.24 and 2.41 seconds.
Submitted 5 July, 2024; v1 submitted 8 May, 2024;
originally announced May 2024.
-
FlyNeRF: NeRF-Based Aerial Mapping for High-Quality 3D Scene Reconstruction
Authors:
Maria Dronova,
Vladislav Cheremnykh,
Alexey Kotcov,
Aleksey Fedoseev,
Dzmitry Tsetserukou
Abstract:
Current methods for 3D reconstruction and environmental mapping frequently face challenges in achieving high precision, highlighting the need for practical and effective solutions. In response to this issue, our study introduces FlyNeRF, a system integrating Neural Radiance Fields (NeRF) with drone-based data acquisition for high-quality 3D reconstruction. An unmanned aerial vehicle (UAV) captures images and corresponding spatial coordinates, and the obtained data is subsequently used for the initial NeRF-based 3D reconstruction of the environment. Further evaluation of the reconstruction render quality is accomplished by the image evaluation neural network developed within the scope of our system. According to the results of the image evaluation module, an autonomous algorithm determines the position for additional image capture, thereby improving the reconstruction quality. The neural network introduced for render quality assessment demonstrates an accuracy of 97%. Furthermore, our adaptive methodology enhances the overall reconstruction quality, resulting in an average improvement of 2.5 dB in Peak Signal-to-Noise Ratio (PSNR) for the 10% quantile. FlyNeRF demonstrates promising results, offering advancements in such fields as environmental monitoring, surveillance, and digital twins, where high-fidelity 3D reconstructions are crucial.
Submitted 19 April, 2024;
originally announced April 2024.
-
HawkDrive: A Transformer-driven Visual Perception System for Autonomous Driving in Night Scene
Authors:
Ziang Guo,
Stepan Perminov,
Mikhail Konenkov,
Dzmitry Tsetserukou
Abstract:
Many established vision perception systems for autonomous driving scenarios ignore the influence of light conditions, one of the key elements for driving safety. To address this problem, we present HawkDrive, a novel perception system with hardware and software solutions. The hardware uses stereo vision perception, which has been demonstrated to be a more reliable way of estimating depth information than monocular vision, and is paired with the edge computing device Nvidia Jetson Xavier AGX. Our software for low light enhancement, depth estimation, and semantic segmentation is a transformer-based neural network. Our software stack, which enables fast inference and noise reduction, is packaged into system modules in Robot Operating System 2 (ROS2). Our experimental results have shown that the proposed end-to-end system is effective in improving the depth estimation and semantic segmentation performance. Our dataset and code will be released at https://github.com/ZionGo6/HawkDrive.
Submitted 6 May, 2024; v1 submitted 6 April, 2024;
originally announced April 2024.
-
MorphoGear: An UAV with Multi-Limb Morphogenetic Gear for Rough-Terrain Locomotion
Authors:
Mikhail Martynov,
Zhanibek Darush,
Aleksey Fedoseev,
Dzmitry Tsetserukou
Abstract:
Robots able to run, fly, and grasp have a high potential to solve a wide scope of tasks and navigate in complex environments. Several mechatronic designs of such robots with adaptive morphologies are emerging. However, the tasks of landing on an uneven surface, traversing rough terrain, and manipulating objects still present significant challenges.
This paper introduces the design of a novel rotor UAV MorphoGear with morphogenetic gear and includes a description of the robot's mechanics, electronics, and control architecture, as well as walking behavior and an analysis of experimental results. MorphoGear is able to fly, walk on surfaces with several gaits, and grasp objects with four compatible robotic limbs. Robotic limbs with three degrees of freedom (DoFs) are used by this UAV as pedipulators when walking or flying and as manipulators when performing actions in the environment. We performed a locomotion analysis of the landing gear of the robot. Three types of robot gaits have been developed.
The experimental results revealed low crosstrack error of the most accurate gait (mean of 1.9 cm and max of 5.5 cm) and the ability of the drone to move with a 210 mm step length. Another type of robot gait also showed low crosstrack error (mean of 2.3 cm and max of 6.9 cm). The proposed MorphoGear system can potentially achieve a high scope of tasks in environmental surveying, delivery, and high-altitude operations.
Submitted 13 March, 2024;
originally announced March 2024.
-
Lander.AI: Adaptive Landing Behavior Agent for Expertise in 3D Dynamic Platform Landings
Authors:
Robinroy Peter,
Lavanya Ratnabala,
Demetros Aschu,
Aleksey Fedoseev,
Dzmitry Tsetserukou
Abstract:
Mastering autonomous drone landing on dynamic platforms presents formidable challenges due to unpredictable velocities and external disturbances caused by the wind, ground effect, turbines or propellers of the docking platform. This study introduces an advanced Deep Reinforcement Learning (DRL) agent, Lander:AI, designed to navigate and land on platforms in the presence of windy conditions, thereby enhancing drone autonomy and safety. Lander:AI is rigorously trained within the gym-pybullet-drone simulation, an environment that mirrors real-world complexities, including wind turbulence, to ensure the agent's robustness and adaptability.
The agent's capabilities were empirically validated with Crazyflie 2.1 drones across various test scenarios, encompassing both simulated environments and real-world conditions. The experimental results showcased Lander:AI's high-precision landing and its ability to adapt to moving platforms, even under wind-induced disturbances. Furthermore, the system performance was benchmarked against a baseline PID controller augmented with an Extended Kalman Filter, illustrating significant improvements in landing precision and error recovery. Lander:AI leverages bio-inspired learning to adapt to external forces as birds do, enhancing drone adaptability without knowing force magnitudes. This research not only advances drone landing technologies, essential for inspection and emergency applications, but also highlights the potential of DRL in addressing intricate aerodynamic challenges.
Submitted 12 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
DogSurf: Quadruped Robot Capable of GRU-based Surface Recognition for Blind Person Navigation
Authors:
Artem Bazhenov,
Vladimir Berman,
Sergei Satsevich,
Olga Shalopanova,
Miguel Altamirano Cabrera,
Artem Lykov,
Dzmitry Tsetserukou
Abstract:
This paper introduces DogSurf, a new approach to using quadruped robots to help visually impaired people navigate in the real world. The presented method allows the quadruped robot to detect slippery surfaces, and to use audio and haptic feedback to inform the user when to stop. A state-of-the-art GRU-based neural network architecture with a mean accuracy of 99.925% was proposed for the task of multiclass surface classification for quadruped robots. A dataset was collected on a Unitree Go1 Edu robot. The dataset and code have been posted to the public domain.
Submitted 5 February, 2024;
originally announced February 2024.
-
CognitiveOS: Large Multimodal Model based System to Endow Any Type of Robot with Generative AI
Authors:
Artem Lykov,
Mikhail Konenkov,
Koffivi Fidèle Gbagbe,
Mikhail Litvinov,
Denis Davletshin,
Aleksey Fedoseev,
Miguel Altamirano Cabrera,
Robinroy Peter,
Dzmitry Tsetserukou
Abstract:
This paper introduces CognitiveOS, the first operating system designed for cognitive robots capable of functioning across diverse robotic platforms. CognitiveOS is structured as a multi-agent system comprising modules built upon a transformer architecture, facilitating communication through an internal monologue format. These modules collectively empower the robot to tackle intricate real-world tasks. The paper delineates the operational principles of the system along with descriptions of its nine distinct modules. The modular design endows the system with distinctive advantages over traditional end-to-end methodologies, notably in terms of adaptability and scalability. The system's modules can be configured, modified, or deactivated depending on the task requirements, while new modules can be seamlessly integrated. This system serves as a foundational resource for researchers and developers in the cognitive robotics domain, alleviating the burden of constructing a cognitive robot system from scratch. Experimental findings demonstrate the system's advanced task comprehension and adaptability across varied tasks, robotic platforms, and module configurations, underscoring its potential for real-world applications. Moreover, in the category of Reasoning it outperformed CognitiveDog (by 15%) and RT2 (by 31%), achieving the highest rate to date of 77%. We provide a code repository and dataset for the replication of CognitiveOS: link will be provided in camera-ready submission.
Submitted 19 March, 2024; v1 submitted 29 January, 2024;
originally announced January 2024.
-
CognitiveDog: Large Multimodal Model Based System to Translate Vision and Language into Action of Quadruped Robot
Authors:
Artem Lykov,
Mikhail Litvinov,
Mikhail Konenkov,
Rinat Prochii,
Nikita Burtsev,
Ali Alridha Abdulkarim,
Artem Bazhenov,
Vladimir Berman,
Dzmitry Tsetserukou
Abstract:
This paper introduces CognitiveDog, a pioneering development of a quadruped robot with a Large Multi-modal Model (LMM) that is capable of not only communicating with humans verbally but also physically interacting with the environment through object manipulation. The system was realized on a Unitree Go1 robot-dog equipped with a custom gripper and demonstrated autonomous decision-making capabilities, independently determining the most appropriate actions and interactions with various objects to fulfill user-defined tasks. These tasks do not necessarily include direct instructions, challenging the robot to comprehend and execute them based on natural language input and environmental cues. The paper delves into the intricacies of this system, dataset characteristics, and the software architecture. Key to this development is the robot's proficiency in navigating space using Visual-SLAM, effectively manipulating and transporting objects, and providing insightful natural language commentary during task execution. Experimental results highlight the robot's advanced task comprehension and adaptability, underscoring its potential in real-world applications. The dataset used to fine-tune the robot-dog behavior generation model is provided at the following link: huggingface.co/datasets/ArtemLykov/CognitiveDog_dataset
Submitted 17 January, 2024;
originally announced January 2024.
-
LLM-MARS: Large Language Model for Behavior Tree Generation and NLP-enhanced Dialogue in Multi-Agent Robot Systems
Authors:
Artem Lykov,
Maria Dronova,
Nikolay Naglov,
Mikhail Litvinov,
Sergei Satsevich,
Artem Bazhenov,
Vladimir Berman,
Aleksei Shcherbak,
Dzmitry Tsetserukou
Abstract:
This paper introduces LLM-MARS, the first technology that utilizes a Large Language Model based Artificial Intelligence for Multi-Agent Robot Systems. LLM-MARS enables dynamic dialogues between humans and robots, allowing the latter to generate behavior based on operator commands and provide informative answers to questions about their actions. LLM-MARS is built on a transformer-based Large Language Model, fine-tuned from the Falcon 7B model. We employ a multimodal approach using LoRA adapters for different tasks. The first LoRA adapter was developed by fine-tuning the base model on examples of Behavior Trees and their corresponding commands. The second LoRA adapter was developed by fine-tuning on question-answering examples. Practical trials on a multi-agent system of two robots within the Eurobot 2023 game rules demonstrate promising results. The robots achieve an average task execution accuracy of 79.28% in compound commands. With commands containing up to two tasks, accuracy exceeded 90%. Evaluation confirms that the system's answers to operators' questions exhibit high accuracy, relevance, and informativeness. LLM-MARS and similar multi-agent robotic systems hold significant potential to revolutionize logistics, enabling autonomous exploration missions and advancing Industry 5.0.
Submitted 14 December, 2023;
originally announced December 2023.
-
Filasofia: A Framework for Streamlined Development of Real-Time Surgical Simulations
Authors:
Vladimir Poliakov,
Dzmitry Tsetserukou,
Emmanuel Vander Poorten
Abstract:
Virtual reality simulation has become a popular approach for training and assessing medical students. It offers diverse scenarios, realistic visuals, and quantitative performance metrics for objective evaluation. However, creating these simulations can be time-consuming and complex, even for experienced users. The SOFA framework is an open-source solution that efficiently simulates finite element (FE) models in real-time. Yet, some users find it challenging to navigate the software due to the numerous components required for a basic simulation and their variability. Additionally, SOFA has limited visual rendering capabilities, leading developers to integrate other software for high-quality visuals. To address these issues, we developed Filasofia, a dedicated framework that simplifies development, provides modern visualization, and allows fine-tuning using SOFA objects. Our experiments demonstrate that Filasofia outperforms conventional SOFA simulations, even with real-time subdivision. Our design approach aims to streamline development while offering flexibility for fine-tuning. Future work will focus on further simplification of the development process for users.
Submitted 24 November, 2023;
originally announced November 2023.
-
PolyMerge: A Novel Technique aimed at Dynamic HD Map Updates Leveraging Polylines
Authors:
Mohamed Sayed,
Stepan Perminov,
Dzmitry Tsetserukou
Abstract:
Currently, High-Definition (HD) maps are a prerequisite for the stable operation of autonomous vehicles. Such maps contain information about all static road objects for the vehicle to consider during navigation, such as road edges, road lanes, and crosswalks. To generate such an HD map, current approaches need to process pre-recorded environment data obtained from onboard sensors. However, recording such a dataset often requires a lot of time and effort. In addition, every time the actual road environment changes, a new dataset must be recorded to generate a relevant HD map.
This paper presents a novel approach that allows the HD map to be continuously generated or updated using onboard sensor data. Since there is no need to pre-record a dataset, updating the HD map can run in parallel with the main autonomous vehicle navigation pipeline.
The proposed approach utilizes the VectorMapNet framework to generate vector road object instances from a sensor data scan. The PolyMerge technique then merges new instances into previous ones, mitigating detection errors and thereby generating or updating the HD map.
The performance of the algorithm was confirmed by comparison with ground truth on the NuScenes dataset. Experimental results showed that the mean error for different levels of environment complexity was comparable to the VectorMapNet single instance error.
Submitted 31 October, 2023; v1 submitted 27 October, 2023;
originally announced October 2023.
-
TeslaCharge: Smart Robotic Charger Driven by Impedance Control and Human Haptic Patterns
Authors:
Oussama Alyounes,
Miguel Altamirano Cabrera,
Dzmitry Tsetserukou
Abstract:
The growing demand for electric vehicles requires the development of automated car charging methods. At the moment, the process of charging an electric car is completely manual and requires physical effort, which is not suitable for people with disabilities. Typically, research effort is focused on detecting the position and orientation of the socket, which has resulted in a relatively high accuracy of $\pm 5$ mm and $\pm 10^\circ$. However, this accuracy is not enough to complete the charging process. In this work, we focus on designing a novel methodology for robust robotic plug-in and plug-out based on human haptics, to overcome the error in the position and orientation of the socket. Participants were invited to perform the charging task, and their cognitive capabilities were recognized by measuring the applied forces along with the movement of the charger. Three controllers were designed based on impedance control to mimic the human patterns of charging an electric car. The recorded data from humans were used to calibrate the parameters of the impedance controllers: inertia $M_d$, damping $D_d$, and stiffness $K_d$. A robotic validation was performed, where the designed controllers were applied to the robot UR10. Using the proposed controllers and the human kinesthetic data, it was possible to successfully automate the operation of charging an electric car.
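To illustrate the role of the three parameters named in this abstract, the sketch below integrates the textbook one-degree-of-freedom target impedance $M_d \ddot{e} + D_d \dot{e} + K_d e = F_{ext}$, where $e$ is the deviation from the desired position. This is a generic form chosen as an assumption; the abstract does not specify the authors' three controllers, and the function name and all numeric values here are hypothetical.

```python
def simulate_impedance(m_d, d_d, k_d, f_ext, dt=0.001, steps=5000):
    """Integrate M_d*e'' + D_d*e' + K_d*e = f_ext for a 1-DoF position deviation e.

    m_d, d_d, k_d: desired inertia, damping, and stiffness (illustrative values).
    f_ext: constant external force applied to the end effector.
    Returns the deviation e after `steps` integration steps of length dt.
    """
    e, e_dot = 0.0, 0.0
    for _ in range(steps):
        # Acceleration that makes the system behave like the target impedance.
        e_ddot = (f_ext - d_d * e_dot - k_d * e) / m_d
        # Semi-implicit Euler: update velocity first, then position.
        e_dot += e_ddot * dt
        e += e_dot * dt
    return e


# Hypothetical, critically damped parameters: D_d = 2 * sqrt(K_d * M_d).
deviation = simulate_impedance(m_d=1.0, d_d=40.0, k_d=400.0, f_ext=10.0)
print(deviation)
```

Under a constant contact force the deviation settles at $F_{ext} / K_d$, so a lower stiffness lets the plug yield more to misalignment, while the damping determines how quickly oscillations die out.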
Submitted 18 October, 2023;
originally announced October 2023.
-
HaptiCharger: Robotic Charging of Electric Vehicles Based on Human Haptic Patterns
Authors:
Oussama Alyounes,
Miguel Altamirano Cabrera,
Dzmitry Tsetserukou
Abstract:
The growing demand for electric vehicles requires the development of automated car charging methods. At the moment, the process of charging an electric car is completely manual and requires physical effort, which is not suitable for people with disabilities. Typically, research effort on automating the charging task is focused on detecting the position and orientation of the socket, which has resulted in a relatively high accuracy of 5 mm and 10 degrees. However, this accuracy is not enough to complete the charging process. In this work, we focus on designing a novel methodology for robust robotic plug-in and plug-out based on human haptics to overcome the error in the orientation of the socket. Participants were invited to perform the charging task, and their cognitive capabilities were recognized by measuring the applied forces along with the movements of the charger. Eventually, an algorithm was developed based on the human's best strategies to be applied to a robotic arm.
Submitted 10 November, 2023; v1 submitted 13 October, 2023;
originally announced October 2023.
-
LocoNeRF: A NeRF-based Approach for Local Structure from Motion for Precise Localization
Authors:
Artem Nenashev,
Mikhail Kurenkov,
Andrei Potapov,
Iana Zhura,
Maksim Katerishich,
Dzmitry Tsetserukou
Abstract:
Visual localization is a critical task in mobile robotics, and researchers are continuously developing new approaches to enhance its efficiency. In this article, we propose a novel approach to improve the accuracy of visual localization using Structure from Motion (SfM) techniques. We highlight the limitations of global SfM, which suffers from high latency, and the challenges of local SfM, which requires large image databases for accurate reconstruction. To address these issues, we propose utilizing Neural Radiance Fields (NeRF), as opposed to image databases, to cut down on the space required for storage. We suggest that sampling reference images around the prior query position can lead to further improvements. We evaluate the accuracy of our proposed method against ground truth obtained using LIDAR and Advanced Lidar Odometry and Mapping in Real-time (A-LOAM), and compare its storage usage against local SfM with COLMAP in the conducted experiments. Our proposed method achieves an accuracy of 0.068 meters compared to the ground truth, which is slightly lower than the most advanced method COLMAP, which has an accuracy of 0.022 meters. However, the size of the database required for COLMAP is 400 megabytes, whereas the size of our NeRF model is only 160 megabytes. Finally, we perform an ablation study to assess the impact of using reference images from the NeRF reconstruction.
Submitted 8 October, 2023;
originally announced October 2023.
-
DNFOMP: Dynamic Neural Field Optimal Motion Planner for Navigation of Autonomous Robots in Cluttered Environment
Authors:
Maksim Katerishich,
Mikhail Kurenkov,
Sausar Karaf,
Artem Nenashev,
Dzmitry Tsetserukou
Abstract:
Motion planning in dynamically changing environments is one of the most complex challenges in autonomous driving. Safety is a crucial requirement, along with driving comfort and speed limits. While classical sampling-based, lattice-based, and optimization-based planning methods can generate smooth and short paths, they often do not consider the dynamics of the environment. Some techniques do consider it, but they rely on updating the environment on-the-go rather than explicitly accounting for the dynamics, which is not suitable for self-driving. To address this, we propose a novel method based on the Neural Field Optimal Motion Planner (NFOMP), which outperforms state-of-the-art approaches in terms of normalized curvature and the number of cusps. Our approach embeds previously known moving obstacles into the neural field collision model to account for the dynamics of the environment. We also introduce time profiling of the trajectory and non-linear velocity constraints by adding Lagrange multipliers to the trajectory loss function. We applied our method to solve the optimal motion planning problem in an urban environment using the BeamNG.tech driving simulator. An autonomous car drove the generated trajectories in three city scenarios while sharing the road with the obstacle vehicle. Our evaluation shows that the maximum acceleration the passenger can experience instantly is -7.5 m/s^2 and that 89.6% of the driving time is devoted to normal driving with accelerations below 3.5 m/s^2. The driving style is characterized by 46.0% and 31.4% of the driving time being devoted to the light rail transit style and the moderate driving style, respectively.
Submitted 7 August, 2023;
originally announced August 2023.
-
NeuroSwarm: Multi-Agent Neural 3D Scene Reconstruction and Segmentation with UAV for Optimal Navigation of Quadruped Robot
Authors:
Iana Zhura,
Denis Davletshin,
Nipun Dhananjaya Weerakkodi Mudalige,
Aleksey Fedoseev,
Robinroy Peter,
Dzmitry Tsetserukou
Abstract:
Quadruped robots have the distinct ability to adapt their body and step height to navigate through cluttered environments. Nonetheless, for these robots to utilize their full potential in real-world scenarios, they require awareness of their environment and obstacle geometry. We propose a novel multi-agent robotic system that incorporates cutting-edge technologies. The proposed solution features a 3D neural reconstruction algorithm that enables navigation of a quadruped robot in both static and semi-static environments. The prior areas of the environment are also segmented according to the quadruped robots' abilities to pass them. Moreover, we have developed an adaptive neural field optimal motion planner (ANFOMP) that considers both collision probability and obstacle height in 2D space. Our new navigation and mapping approach enables quadruped robots to adjust their height and behavior to navigate under arches and push through obstacles with smaller dimensions. The multi-agent mapping operation has proven to be highly accurate, with an obstacle reconstruction precision of 82%. Moreover, the quadruped robot can navigate with 3D obstacle information and the ANFOMP system, resulting in a 33.3% reduction in path length and a 70% reduction in navigation time.
Submitted 3 August, 2023;
originally announced August 2023.
-
AirTouch: Towards Safe Human-Robot Interaction Using Air Pressure Feedback and IR Mocap System
Authors:
Viktor Rakhmatulin,
Denis Grankin,
Mikhail Konenkov,
Sergei Davidenko,
Daria Trinitatova,
Oleg Sautenkov,
Dzmitry Tsetserukou
Abstract:
The growing use of robots in urban environments has raised concerns about potential safety hazards, especially in public spaces where humans and robots may interact. In this paper, we present a system for safe human-robot interaction that combines an infrared (IR) camera with a wearable marker and airflow potential field. IR cameras enable real-time detection and tracking of humans in challenging environments, while controlled airflow creates a physical barrier that guides humans away from dangerous proximity to robots without the need for wearable devices. A preliminary experiment was conducted to measure the accuracy of the perception of safety barriers rendered by controlled air pressure. In a second experiment, we evaluated our approach in an imitation scenario of an interaction between an inattentive person and an autonomous robotic system. Experimental results show that the proposed system significantly improves a participant's ability to maintain a safe distance from the operating robot compared to trials without the system.
Submitted 31 July, 2023;
originally announced August 2023.
-
MorphoLander: Reinforcement Learning Based Landing of a Group of Drones on the Adaptive Morphogenetic UAV
Authors:
Sausar Karaf,
Aleksey Fedoseev,
Mikhail Martynov,
Zhanibek Darush,
Aleksei Shcherbak,
Dzmitry Tsetserukou
Abstract:
This paper focuses on a novel robotic system, MorphoLander, representing a heterogeneous swarm of drones for exploring rough terrain environments. The morphogenetic leader drone is capable of landing on uneven terrain, traversing it, and maintaining a horizontal position to deploy smaller drones for extensive area exploration. After completing their tasks, these drones return and land back on the landing pads of MorphoGear. A reinforcement learning algorithm was developed for precise landing of drones on the leader robot, which either remains static during their mission or relocates to a new position. Several experiments were conducted to evaluate the performance of the developed landing algorithm under both even and uneven terrain conditions. The experiments revealed that the proposed system achieves a high landing accuracy of 0.5 cm when landing on the leader drone under even terrain conditions and 2.35 cm under uneven terrain conditions. MorphoLander has the potential to significantly enhance the efficiency of industrial inspections, seismic surveys, and rescue missions in highly cluttered and unstructured environments.
Submitted 28 July, 2023; v1 submitted 26 July, 2023;
originally announced July 2023.
-
ArUcoGlide: a Novel Wearable Robot for Position Tracking and Haptic Feedback to Increase Safety During Human-Robot Interaction
Authors:
Ali Alabbas,
Miguel Altamirano Cabrera,
Oussama Alyounes,
Dzmitry Tsetserukou
Abstract:
The current capabilities of robotic systems make human collaboration necessary to accomplish complex tasks effectively. In this work, we are introducing a framework to ensure safety in a human-robot collaborative environment. The system is composed of a wearable 2-DOF robot, a low-cost and easy-to-install tracking system, and a collision avoidance algorithm based on the Artificial Potential Field (APF). The wearable robot is designed to hold a fiducial marker and maintain its visibility to the tracking system, which, in turn, localizes the user's hand with good accuracy and low latency and provides haptic feedback to the user. The system is designed to enhance the performance of collaborative tasks while ensuring user safety. Three experiments were carried out to evaluate the performance of the proposed system. The first one evaluated the accuracy of the tracking system. The second experiment analyzed human-robot behavior during an imminent collision. The third experiment evaluated the system in a collaborative activity in a shared working environment. The results show that the implementation of the introduced system reduces the operation time by 16% and increases the average distance between the user's hand and the robot by 5 cm.
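For readers unfamiliar with the Artificial Potential Field idea mentioned in this abstract, the sketch below computes the classical repulsive force that pushes a controlled point away from an obstacle inside an influence radius. It uses the common textbook potential $U = \frac{1}{2}\eta\,(1/d - 1/d_0)^2$ as an assumption; the abstract does not give the authors' exact formulation, and the function name, gain, and radius are hypothetical.

```python
import math


def repulsive_force(point, obstacle, influence_radius=0.3, gain=1.0):
    """Return the APF repulsive force acting on `point` due to `obstacle`.

    point, obstacle: coordinate sequences of equal length (metres).
    influence_radius: distance d0 beyond which the obstacle has no effect.
    gain: scaling factor eta of the repulsive potential.
    """
    delta = [p - o for p, o in zip(point, obstacle)]
    dist = math.sqrt(sum(c * c for c in delta))
    # No force outside the influence radius. The direction is undefined at
    # zero distance, so this sketch returns zero there as a simplification.
    if dist >= influence_radius or dist == 0.0:
        return [0.0] * len(delta)
    # Negative gradient of U = 0.5 * gain * (1/d - 1/d0)^2 with respect to position.
    magnitude = gain * (1.0 / dist - 1.0 / influence_radius) / dist ** 2
    # Unit vector pointing from the obstacle towards the point, scaled by magnitude.
    return [magnitude * c / dist for c in delta]


# Hypothetical example: a point 10 cm from an obstacle along the x axis.
print(repulsive_force((0.1, 0.0, 0.0), (0.0, 0.0, 0.0)))
```

The force grows rapidly as the distance shrinks and vanishes at the influence radius, which is what makes such a field usable both for steering a robot away from a tracked hand and for scaling a feedback cue.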
Submitted 17 July, 2023;
originally announced July 2023.
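The ArUcoGlide abstract above names an Artificial Potential Field (APF) for collision avoidance but does not give its form. As a rough illustration only, the sketch below uses the textbook APF formulation (a linear attractive term toward a goal plus a repulsive term that acts inside an influence radius around each obstacle). The function name `apf_force`, the gains `k_att` and `k_rep`, and the `influence` radius are assumptions made for this example and are not taken from the paper.

```python
import math


def apf_force(position, goal, obstacles, k_att=1.0, k_rep=1.0, influence=0.5):
    """Return the 2D force (fx, fy) of a textbook artificial potential field.

    All parameter names and default values are hypothetical; the paper's own
    formulation may differ.
    """
    # Attractive term: pulls the point linearly toward the goal.
    fx = k_att * (goal[0] - position[0])
    fy = k_att * (goal[1] - position[1])

    for ox, oy in obstacles:
        dx = position[0] - ox
        dy = position[1] - oy
        dist = math.hypot(dx, dy)
        # Repulsive term: active only inside the influence radius.
        if 0.0 < dist < influence:
            magnitude = k_rep * (1.0 / dist - 1.0 / influence) / dist**2
            # Push along the unit vector pointing away from the obstacle.
            fx += magnitude * dx / dist
            fy += magnitude * dy / dist
    return fx, fy


if __name__ == "__main__":
    # A hand at the origin, a goal to the right, and a nearby obstacle.
    print(apf_force((0.0, 0.0), (1.0, 0.0), [(0.2, 0.1)]))
```

In a scheme of this kind, the resulting force vector would typically be converted into a velocity or position command at each control cycle; how ArUcoGlide does this is not stated in the abstract.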
-
GHACPP: Genetic-based Human-Aware Coverage Path Planning Algorithm for Autonomous Disinfection Robot
Authors:
Stepan Perminov,
Ivan Kalinov,
Dzmitry Tsetserukou
Abstract:
Numerous mobile robots with mounted Ultraviolet-C (UV-C) lamps were developed recently, yet they cannot work in the same space as humans without irradiating them with UV-C. This paper proposes a novel modular and scalable Human-Aware Genetic-based Coverage Path Planning algorithm (GHACPP) that aims to solve the problem of disinfecting unknown environments by UV-C irradiation while preventing human eyes and skin from being harmed.
The proposed genetic-based algorithm alternates between the stages of exploring a new area, generating parts of the resulting disinfection trajectory, called mini-trajectories, and updating the current state around the robot. The system performance in effectiveness and human safety is validated and compared with one of the latest state-of-the-art online coverage path planning algorithms called SimExCoverage-STC. The experimental results confirmed both the high level of safety for humans and the efficiency of the developed algorithm in terms of decrease of path length (by 37.1%), number (39.5%) and size (35.2%) of turns, and time (7.6%) to complete the disinfection task, with a small loss in the percentage of area covered (0.6%), in comparison with the state-of-the-art approach.
Submitted 17 July, 2023;
originally announced July 2023.
-
POA: Passable Obstacles Aware Path-planning Algorithm for Navigation of a Two-wheeled Robot in Highly Cluttered Environments
Authors:
Alexander Petrovsky,
Yomna Youssef,
Kirill Myasoedov,
Artem Timoshenko,
Vladimir Guneavoi,
Ivan Kalinov,
Dzmitry Tsetserukou
Abstract:
This paper focuses on the Passable Obstacles Aware (POA) planner, a novel navigation method for two-wheeled robots in a highly cluttered environment. The navigation algorithm detects and classifies objects to distinguish two types of obstacles: passable and unpassable. Our algorithm allows two-wheeled robots to find a path through passable obstacles. Such a solution helps the robot work in areas inaccessible to standard path planners and find optimal trajectories in scenarios with a high number of objects in the robot's vicinity. The POA planner can be embedded into other planning algorithms and enables them to build a path through obstacles. Our method decreases path length and the total travel time to the final destination by up to 43% and 39%, respectively, compared to standard path planners such as GVD, A*, and RRT*.
Submitted 16 July, 2023;
originally announced July 2023.
-
MorphoArms: Morphogenetic Teleoperation of Multimanual Robot
Authors:
Mikhail Martynov,
Zhanibek Darush,
Miguel Altamirano Cabrera,
Sausar Karaf,
Dzmitry Tsetserukou
Abstract:
Nowadays, there are few unmanned aerial vehicles (UAVs) capable of flying, walking and grasping. A drone with all these functionalities can significantly improve its performance in complex tasks such as monitoring and exploring different types of terrain, and rescue operations. This paper presents MorphoArms, a novel system that consists of a morphogenetic chassis and a hand gesture recognition teleoperation system. The mechanics, electronics, control architecture, and walking behavior of the morphogenetic chassis are described. This robot is capable of walking and grasping objects using four robotic limbs. Robotic limbs with four degrees-of-freedom are used as pedipulators when walking and as manipulators when performing actions in the environment. The robot control system is implemented using teleoperation, where commands are given by hand gestures. A motion capture system is used to track the user's hands and to recognize their gestures. The method of controlling the robot was experimentally tested in a study involving 10 users. The evaluation included three questionnaires (NASA TLX, SUS, and UEQ). The results showed that the proposed system was more user-friendly than 56% of the systems, and it was rated above average in terms of attractiveness, stimulation, and novelty.
Submitted 6 July, 2023;
originally announced July 2023.
-
LLM-BRAIn: AI-driven Fast Generation of Robot Behaviour Tree based on Large Language Model
Authors:
Artem Lykov,
Dzmitry Tsetserukou
Abstract:
This paper presents a novel approach to autonomous robot control, named LLM-BRAIn, that makes it possible to generate robot behavior from an operator's commands. LLM-BRAIn is a transformer-based Large Language Model (LLM) fine-tuned from the Stanford Alpaca 7B model to generate a robot behavior tree (BT) from a text description. We train LLM-BRAIn on 8.5k instruction-following demonstrations, generated in the style of self-instruct using text-davinci-003. The developed model accurately builds complex robot behavior while remaining small enough to be run on the robot's onboard microcomputer. The model gives structurally and logically correct BTs and can successfully manage instructions that were not present in the training set. The experiment did not reveal any significant subjective differences between BTs generated by LLM-BRAIn and those created by humans (on average, participants were able to correctly distinguish between LLM-BRAIn generated BTs and human-created BTs in only 4.53 out of 10 cases, indicating that their performance was close to random chance). The proposed approach can potentially be applied to mobile robotics, drone operation, robot manipulator systems, and Industry 4.0.
Submitted 30 May, 2023;
originally announced May 2023.
-
Hierarchical Whole-body Control of the cable-Suspended Aerial Manipulator endowed with Winch-based Actuation
Authors:
Yuri Sarkisov,
Andre Coelho,
Maihara Santos,
Min Jun Kim,
Dzmitry Tsetserukou,
Christian Ott,
Konstantin Kondak
Abstract:
During operation, aerial manipulation systems are affected by various disturbances. Among them is a gravitational torque caused by the weight of the robotic arm. Common propeller-based actuation is ineffective against such disturbances because of possible overheating and high power consumption. To overcome this issue, in this paper we propose a winch-based actuation for the crane-stationed cable-suspended aerial manipulator. Three winch-controlled suspension rigging cables produce a desired cable tension distribution to generate a wrench that reduces the effect of gravitational torque. In order to coordinate the robotic arm and the winch-based actuation, a model-based hierarchical whole-body controller is adapted. It resolves two tasks: keeping the robotic arm end-effector at the desired pose and shifting the system center of mass to the location with zero gravitational torque. The performance of the introduced actuation system as well as the control strategy is validated through experimental studies.
Submitted 25 May, 2023;
originally announced May 2023.
-
Hierarchical Visual Localization Based on Sparse Feature Pyramid for Adaptive Reduction of Keypoint Map Size
Authors:
Andrei Potapov,
Mikhail Kurenkov,
Pavel Karpyshev,
Evgeny Yudin,
Alena Savinykh,
Evgeny Kruzhkov,
Dzmitry Tsetserukou
Abstract:
Visual localization is a fundamental task for a wide range of applications in the field of robotics. Yet, it is still a complex problem with no universal solution, and the existing approaches are difficult to scale: most state-of-the-art solutions are unable to provide accurate localization without a significant amount of storage space. We propose a hierarchical, low-memory approach to localization based on keypoints with different descriptor lengths. It becomes possible with the use of the developed unsupervised neural network, which predicts a feature pyramid with different descriptor lengths for images. This structure allows applying coarse-to-fine paradigms for localization based on keypoint map, and varying the accuracy of localization by changing the type of the descriptors used in the pipeline. Our approach achieves comparable results in localization accuracy and a significant reduction in memory consumption (up to 16 times) among state-of-the-art methods.
Submitted 8 May, 2023;
originally announced May 2023.
-
SwipeBot: DNN-based Autonomous Robot Navigation among Movable Obstacles in Cluttered Environments
Authors:
Nikolay Zherdev,
Mikhail Kurenkov,
Kristina Belikova,
Dzmitry Tsetserukou
Abstract:
In this paper, we propose a novel approach to wheeled robot navigation through an environment with movable obstacles. A robot exploits knowledge about different obstacle classes and selects the minimally invasive action to perform to clear the path. We trained a convolutional neural network (CNN) so that the robot can classify an RGB-D image and decide whether to push a blocking object and which force to apply. After known objects are segmented, they are projected onto a cost map, and the robot calculates an optimal path to the goal. If the blocking objects are allowed to be moved, the robot drives through them while pushing them away. We implemented our algorithm in ROS, and an extensive set of simulations showed that the robot successfully overcomes the blocked regions. Our approach allows a robot to successfully build a path through regions where it would have become stuck with traditional path-planning techniques.
Submitted 8 May, 2023;
originally announced May 2023.
-
SwarmGear: Heterogeneous Swarm of Drones with Reconfigurable Leader Drone and Virtual Impedance Links for Multi-Robot Inspection
Authors:
Zhanibek Darush,
Mikhail Martynov,
Aleksey Fedoseev,
Aleksei Shcherbak,
Dzmitry Tsetserukou
Abstract:
The continuous monitoring by drone swarms remains a challenging problem due to the lack of power supply and the inability of drones to land on uneven surfaces. Heterogeneous swarms, including ground and aerial vehicles, can support longer inspections and carry a higher number of sensors on board. However, their capabilities are limited by the mobility of wheeled and legged robots in a cluttered environment.
In this paper, we propose a novel concept for autonomous inspection that we call SwarmGear. SwarmGear utilizes a heterogeneous swarm that investigates the environment in a leader-follower formation. The leader drone is able to land on rough terrain and traverse it by four compliant robotic legs, possessing both the functionalities of an aerial and mobile robot. To preserve the formation of the swarm during its motion, virtual impedance links were developed between the leader and the follower drones.
We evaluated experimentally the accuracy of the hybrid leader drone's ground locomotion. By changing the step parameters, the optimal step configuration was found. Two types of gaits were evaluated. The experiments revealed low crosstrack error (mean of 2 cm and max of 4.8 cm) and the ability of the leader drone to move with a 190 mm step length and a 3 degree standard yaw deviation. Four types of drone formations were considered. The best formation was used for experiments with SwarmGear, and it showed low overall crosstrack error for the swarm (mean 7.9 cm for the type 1 gait and 5.1 cm for the type 2 gait).
The proposed system can potentially improve the performance of autonomous swarms in cluttered and unstructured environments by allowing all agents of the swarm to switch between aerial and ground formations to overcome various obstacles and perform missions over a large area.
Submitted 6 April, 2023;
originally announced April 2023.
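The SwarmGear abstract above mentions virtual impedance links between the leader and follower drones without specifying the model. One common way to realize such a link is a second-order mass-spring-damper system; the sketch below assumes that form and integrates it with a semi-implicit Euler step. The function name `impedance_step` and the values for `mass`, `damping`, and `stiffness` are illustrative assumptions, not the authors' parameters.

```python
def impedance_step(offset, velocity, external_force, dt,
                   mass=1.0, damping=4.0, stiffness=4.0):
    """Advance a 1D virtual mass-spring-damper link by one time step.

    Assumed model (hypothetical, not from the paper):
        mass * x'' + damping * x' + stiffness * x = external_force
    where x is the follower's offset from its nominal formation slot.
    """
    acceleration = (external_force - damping * velocity - stiffness * offset) / mass
    # Semi-implicit Euler: update the velocity first, then the position.
    velocity = velocity + acceleration * dt
    offset = offset + velocity * dt
    return offset, velocity


if __name__ == "__main__":
    # A constant virtual force displaces the follower, which then settles.
    offset, velocity = 0.0, 0.0
    for _ in range(2000):
        offset, velocity = impedance_step(offset, velocity, 2.0, 0.01)
    print(offset)
```

With a constant force, a model of this kind settles at an offset of force divided by stiffness, so the follower yields smoothly to the leader's motion instead of tracking it rigidly.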
-
RL-Based Guidance in Outpatient Hysteroscopy Training: A Feasibility Study
Authors:
Vladimir Poliakov,
Kenan Niu,
Emmanuel Vander Poorten,
Dzmitry Tsetserukou
Abstract:
This work presents an RL-based agent for outpatient hysteroscopy training. Hysteroscopy is a gynecological procedure for examination of the uterine cavity. Recent advancements enabled performing this type of intervention in the outpatient setup without anaesthesia. While being beneficial to the patient, this approach introduces new challenges for clinicians, who should take additional measures to maintain the level of patient comfort and prevent tissue damage. Our prior work has presented a platform for hysteroscopic training with the focus on the passage of the cervical canal. With this work, we aim to extend the functionality of the platform by designing a subsystem that autonomously performs the task of the passage of the cervical canal. This feature can later be used as a virtual instructor to provide educational cues for trainees and assess their performance. The developed algorithm is based on the soft actor critic approach to smooth the learning curve of the agent and ensure uniform exploration of the workspace. The designed algorithm was tested against the performance of five clinicians. Overall, the algorithm demonstrated high efficiency and reliability, succeeding in 98% of trials and outperforming the expert group in three out of four measured metrics.
Submitted 26 November, 2022;
originally announced November 2022.
-
LiePoseNet: Heterogeneous Loss Function Based on Lie Group for Significant Speed-up of PoseNet Training Process
Authors:
Mikhail Kurenkov,
Ivan Kalinov,
Dzmitry Tsetserukou
Abstract:
Visual localization is an essential modern technology for robotics and computer vision. Popular approaches for solving this task are image-based methods. Nowadays, these methods have low accuracy and a long training time. The reasons are the lack of rigid-body and projective geometry awareness, landmark symmetry, and homogeneous error assumption. We propose a heterogeneous loss function based on concentrated Gaussian distribution with the Lie group to overcome these difficulties. Following our experiment, the proposed method allows us to speed up the training process significantly (from 300 to 10 epochs) with acceptable error values.
Submitted 15 November, 2022;
originally announced November 2022.
-
DeltaFinger: a 3-DoF Wearable Haptic Display Enabling High-Fidelity Force Vector Presentation at a User Finger
Authors:
Artem Lykov,
Aleksey Fedoseev,
Dzmitry Tsetserukou
Abstract:
This paper presents DeltaFinger, a novel haptic device designed to deliver the force of interaction with virtual objects by guiding the user's finger with a wearable delta mechanism. The developed interface is capable of delivering a 3D force vector to the tip of the user's index finger, allowing complex rendering of a virtual reality (VR) environment. The developed device is able to produce kinesthetic feedback of up to 1.8 N in vertical projection and 0.9 N in horizontal projection without restricting the freedom of motion of the remaining fingers. The experimental results showed sufficient precision in the perception of the force vector with DeltaFinger (mean force vector error of 0.6 rad). The proposed device can potentially be applied to VR communications, medicine, and navigation for people with vision problems.
Submitted 1 November, 2022;
originally announced November 2022.
-
DroneARchery: Human-Drone Interaction through Augmented Reality with Haptic Feedback and Multi-UAV Collision Avoidance Driven by Deep Reinforcement Learning
Authors:
Ekaterina Dorzhieva,
Ahmed Baza,
Ayush Gupta,
Aleksey Fedoseev,
Miguel Altamirano Cabrera,
Ekaterina Karmanova,
Dzmitry Tsetserukou
Abstract:
We propose a novel concept of augmented reality (AR) human-drone interaction driven by RL-based swarm behavior to achieve intuitive and immersive control of a swarm formation of unmanned aerial vehicles. The DroneARchery system developed by us allows the user to quickly deploy a swarm of drones, generating flight paths simulating archery. The haptic interface LinkGlide delivers a tactile stimulus of the bowstring tension to the forearm to increase the precision of aiming. The swarm of released drones dynamically avoids collisions between each other, the drone following the user, and external obstacles with behavior control based on deep reinforcement learning.
The developed concept was tested in the scenario with a human, where the user shoots from a virtual bow with a real drone to hit the target. The human operator observes the ballistic trajectory of the drone in an AR and achieves a realistic and highly recognizable experience of the bowstring tension through the haptic display.
The experimental results revealed that the system improves trajectory prediction accuracy by 63.3% through applying AR technology and conveying haptic feedback of the pulling force. DroneARchery users highlighted the naturalness (4.3 out of 5 on a Likert scale) and increased confidence (4.7 out of 5) when controlling the drone. We have designed the tactile patterns to present four sliding distances (tension) and three applied force levels (stiffness) of the haptic display. Users demonstrated the ability to distinguish tactile patterns produced by the haptic display representing varying bowstring tension (average recognition rate of 72.8%) and stiffness (average recognition rate of 94.2%).
The novelty of the research is the development of an AR-based approach for drone control that does not require special skills and training from the operator.
Submitted 14 October, 2022;
originally announced October 2022.
-
SwarMan: Anthropomorphic Swarm of Drones Avatar with Body Tracking and Deep Learning-Based Gesture Recognition
Authors:
Ahmed Baza,
Ayush Gupta,
Ekaterina Dorzhieva,
Aleksey Fedoseev,
Dzmitry Tsetserukou
Abstract:
Anthropomorphic robot avatars present a conceptually novel approach to remote affective communication, allowing people across the world a wider spectrum of emotional and social exchanges than traditional 2D and 3D image data. However, there are several limitations of current telepresence robots, such as the high weight, complexity of the system that prevents its fast deployment, and the limited workspace of the avatars mounted on either static or wheeled mobile platforms.
In this paper, we present a novel concept of telecommunication through a robot avatar based on an anthropomorphic swarm of drones: SwarMan. The developed system consists of nine nanocopters controlled remotely by the operator through a gesture recognition interface. SwarMan allows operators to communicate by directly following their motions and by recognizing one of the prerecorded emotional patterns, thus rendering the captured emotion as illumination on the drones. The LSTM MediaPipe network was trained on a collected dataset of 600 short videos with five emotional gestures. The achieved emotion recognition accuracy was 97% on the test dataset.
As communication through the swarm avatar significantly changes the visual appearance of the operator, we investigated the ability of the users to recognize and respond to emotions performed by the swarm of drones. The experimental results revealed a high consistency between the users in rating emotions. Additionally, users indicated low physical demand (2.25 on the Likert scale) and were satisfied with their performance (1.38 on the Likert scale) when communicating by the SwarMan interface.
Submitted 4 October, 2022;
originally announced October 2022.
-
DandelionTouch: High Fidelity Haptic Rendering of Soft Objects in VR by a Swarm of Drones
Authors:
Aleksey Fedoseev,
Ahmed Baza,
Ayush Gupta,
Ekaterina Dorzhieva,
Riya Neelesh Gujarathi,
Dzmitry Tsetserukou
Abstract:
To achieve high fidelity haptic rendering of soft objects in a high mobility virtual environment, we propose a novel haptic display, DandelionTouch. The tactile actuators are delivered to the fingertips of the user by a swarm of drones. Users of DandelionTouch are capable of experiencing tactile feedback in a large space that is not limited by the device's working area. Importantly, they will not experience muscle fatigue during long interactions with virtual objects. Hand tracking and a swarm control algorithm allow the user to guide the swarm with hand motions and to avoid collisions inside the formation.
Several topologies of the impedance connection between swarm units were investigated in this research. The experiment, in which drones performed a point following task on a square trajectory in real-time, revealed that drones connected in a Star topology performed the trajectory with low mean positional error (RMSE decreased by 20.6% in comparison with other impedance topologies and by 40.9% in comparison with potential field-based swarm control). The achieved velocities of the drones in all formations with impedance behavior were 28% higher than for the swarm controlled with the potential field algorithm.
Additionally, the perception of several vibrotactile patterns was evaluated in a user study with 7 participants. The study has shown that the proposed combination of temporal delay and frequency modulation allows users to successfully recognize the surface property and motion direction in VR simultaneously (mean recognition rate of 70%, maximum of 93%). DandelionTouch suggests a new type of haptic feedback in VR systems where no hand-held or wearable interface is required.
Submitted 22 September, 2022; v1 submitted 21 September, 2022;
originally announced September 2022.
-
HyperGuider: Virtual Reality Framework for Interactive Path Planning of Quadruped Robot in Cluttered and Multi-Terrain Environments
Authors:
Ildar Babataev,
Aleksey Fedoseev,
Nipun Weerakkodi,
Elena Nazarova,
Dzmitry Tsetserukou
Abstract:
Quadruped platforms have become an active topic of research due to their high mobility and traversability in rough terrain. However, it is highly challenging to determine whether a cluttered environment can be passed by the robot and how exactly its path should be calculated. Moreover, the calculated path may pass through areas with dynamic objects or environments that are dangerous for the robot or people around it. Therefore, we propose a novel conceptual approach of teaching quadruped robots navigation through user-guided path planning in virtual reality (VR). Our system contains both global and local path planners, allowing the robot to generate a path through iterations of learning. The VR interface allows the user to interact with the environment and to assist the quadruped robot in challenging scenarios. The results of comparison experiments show that cooperation between the human and path planning algorithms can increase the computational speed of the algorithm by 35.58% on average, with a non-critical increase in path length (6.66% on average) in the test scenario. Additionally, users described the VR interface as not physically demanding (2.3 out of 10) and rated their performance highly (7.1 out of 10). The ability to find a less optimal but safer path remains in demand for the task of navigating in a cluttered and unstructured environment.
Submitted 20 September, 2022;
originally announced September 2022.
-
HyperPalm: DNN-based hand gesture recognition interface for intelligent communication with quadruped robot in 3D space
Authors:
Elena Nazarova,
Ildar Babataev,
Nipun Weerakkodi,
Aleksey Fedoseev,
Dzmitry Tsetserukou
Abstract:
Nowadays, autonomous mobile robots support people in many areas where human presence is either redundant or too dangerous. They have successfully proven themselves in expeditions, the gas industry, mines, warehouses, etc. However, even legged robots may get stuck in rough terrain conditions, requiring human cognitive abilities to navigate the system. While gamepads and keyboards are convenient for wheeled robot control, a quadruped robot in 3D space can move along all linear coordinates and Euler angles, requiring at least 12 buttons for independent control of its DoF. Therefore, more convenient control interfaces are required.
In this paper we present HyperPalm: a novel gesture interface for intuitive human-robot interaction with quadruped robots. Without additional devices, the operator has full position and orientation control of the quadruped robot in 3D space through hand gesture recognition with only 5 gestures and 6 DoF hand motion.
The experimental results showed that the system classifies 5 static gestures with high accuracy (96.5%) and accurately predicts the 6D position of the hand in three-dimensional space. The root mean square deviation (RMSD) of the absolute linear deviation for the proposed approach is 11.7 mm, which is almost 50% lower than for the second tested approach; the RMSD of the absolute angular deviation is 2.6 degrees, which is almost 27% lower than for the second tested approach. Moreover, a user study was conducted to explore users' subjective experience of human-robot interaction through the proposed gesture interface. The participants evaluated their interaction with HyperPalm as intuitive (2.0), not causing frustration (2.63), and requiring low physical demand (2.0).
Submitted 20 September, 2022;
originally announced September 2022.
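The HyperPalm abstract above reports its tracking error as a root mean square deviation (RMSD). For readers unfamiliar with the metric, the sketch below shows the standard definition applied to paired scalar measurements. The function name `rmsd` and the sample values are made up for illustration and are not data from the paper.

```python
import math


def rmsd(predicted, reference):
    """Root mean square deviation between two equal-length sequences."""
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Hypothetical hand positions along one axis, in millimetres.
    estimated = [102.0, 98.5, 110.2, 95.0]
    ground_truth = [100.0, 100.0, 100.0, 100.0]
    print(rmsd(estimated, ground_truth))
```

Because the deviations are squared before averaging, a few large errors raise the RMSD more than many small ones, which is why it is often reported alongside or instead of a mean absolute error.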