Search | arXiv e-print repository

PlaceFormer: Transformer-based Visual Place Recognition using Multi-Scale Patch Selection and Fusion

Authors: Shyam Sundar Kannan, Byung-Cheol Min

Abstract: Visual place recognition is a challenging task in the field of computer vision, and autonomous robotics and vehicles, which aims to identify a location or a place from visual inputs. Contemporary methods in visual place recognition employ convolutional neural networks and utilize every region within the image for the place recognition task. However, the presence of dynamic and distracting elements in the image may impact the effectiveness of the place recognition process. Therefore, it is meaningful to focus on task-relevant regions of the image for improved recognition. In this paper, we present PlaceFormer, a novel transformer-based approach for visual place recognition. PlaceFormer employs patch tokens from the transformer to create global image descriptors, which are then used for image retrieval. To re-rank the retrieved images, PlaceFormer merges the patch tokens from the transformer to form multi-scale patches. Utilizing the transformer's self-attention mechanism, it selects patches that correspond to task-relevant areas in an image. These selected patches undergo geometric verification, generating similarity scores across different patch sizes. Subsequently, spatial scores from each patch size are fused to produce a final similarity score. This score is then used to re-rank the images initially retrieved using global image descriptors. Extensive experiments on benchmark datasets demonstrate that PlaceFormer outperforms several state-of-the-art methods in terms of accuracy and computational efficiency, requiring less time and memory.

Submitted 27 May, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

Comments: Accepted for publication in Robotics and Automation Letters

arXiv:2309.16031 [pdf, other]

DynaCon: Dynamic Robot Planner with Contextual Awareness via LLMs

Authors: Gyeongmin Kim, Taehyeon Kim, Shyam Sundar Kannan, Vishnunandan L. N. Venkatesh, Donghan Kim, Byung-Cheol Min

Abstract: Mobile robots often rely on pre-existing maps for effective path planning and navigation. However, when these maps are unavailable, particularly in unfamiliar environments, a different approach becomes essential. This paper introduces DynaCon, a novel system designed to provide mobile robots with contextual awareness and dynamic adaptability during navigation, eliminating the reliance on traditional maps. DynaCon integrates real-time feedback with an object server, prompt engineering, and navigation modules. By harnessing the capabilities of Large Language Models (LLMs), DynaCon not only understands patterns within given numeric series but also excels at categorizing objects into matched spaces. This facilitates a dynamic path planner imbued with contextual awareness. We validated the effectiveness of DynaCon through an experiment where a robot successfully navigated to its goal using reasoning. Source code and experiment videos for this work can be found at: https://sites.google.com/view/dynacon.

Submitted 27 September, 2023; originally announced September 2023.

Comments: Submitted to ICRA 2024

arXiv:2309.10062 [pdf, other]

SMART-LLM: Smart Multi-Agent Robot Task Planning using Large Language Models

Authors: Shyam Sundar Kannan, Vishnunandan L. N. Venkatesh, Byung-Cheol Min

Abstract: In this work, we introduce SMART-LLM, an innovative framework designed for embodied multi-robot task planning. SMART-LLM: Smart Multi-Agent Robot Task Planning using Large Language Models (LLMs), harnesses the power of LLMs to convert high-level task instructions provided as input into a multi-robot task plan. It accomplishes this by executing a series of stages, including task decomposition, coalition formation, and task allocation, all guided by programmatic LLM prompts within the few-shot prompting paradigm. We create a benchmark dataset designed for validating the multi-robot task planning problem, encompassing four distinct categories of high-level instructions that vary in task complexity. Our evaluation experiments span both simulation and real-world scenarios, demonstrating that the proposed model can achieve promising results for generating multi-robot task plans. The experimental videos, code, and datasets from the work can be found at https://sites.google.com/view/smart-llm/.

Submitted 22 March, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

Comments: Submitted to IROS 2024

arXiv:2303.04284 [pdf, other]

UPPLIED: UAV Path Planning for Inspection through Demonstration

Authors: Shyam Sundar Kannan, Vishnunandan L. N. Venkatesh, Revanth Krishna Senthilkumaran, Byung-Cheol Min

Abstract: In this paper, a new demonstration-based path-planning framework for the visual inspection of large structures using UAVs is proposed. We introduce UPPLIED: UAV Path PLanning for InspEction through Demonstration, which utilizes a demonstrated trajectory to generate a new trajectory to inspect other structures of the same kind. The demonstrated trajectory can inspect specific regions of the structure and the new trajectory generated by UPPLIED inspects similar regions in the other structure. The proposed method generates inspection points from the demonstrated trajectory and uses standardization to translate those inspection points to inspect the new structure. Finally, the position of these inspection points is optimized to refine their view. Numerous experiments were conducted with various structures and the proposed framework was able to generate inspection trajectories of various kinds for different structures based on the demonstration. The trajectories generated match with the demonstrated trajectory in geometry and at the same time inspect the regions inspected by the demonstration trajectory with minimum deviation. The experimental video of the work can be found at https://youtu.be/YqPx-cLkv04.

Submitted 24 July, 2023; v1 submitted 7 March, 2023; originally announced March 2023.

Comments: Accepted for publication in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2023), Detroit, Michigan, USA

arXiv:2303.00920 [pdf, other]

Beacon-based Distributed Structure Formation in Multi-agent Systems

Authors: Tamzidul Mina, Wonse Jo, Shyam S. Kannan, Byung-Cheol Min

Abstract: Autonomous shape and structure formation is an important problem in the domain of large-scale multi-agent systems. In this paper, we propose a 3D structure representation method and a distributed structure formation strategy where settled agents guide free moving agents to a prescribed location to settle in the structure. Agents at the structure formation frontier looking for neighbors to settle act as beacons, generating a surface gradient throughout the formed structure propagated by settled agents. Free-moving agents follow the surface gradient along the formed structure surface to the formation frontier, where they eventually reach the closest beacon and settle to continue the structure formation following a local bidding process. Agent behavior is governed by a finite state machine implementation, along with potential field-based motion control laws. We also discuss appropriate rules for recovering from stagnation points. Simulation experiments are presented to show planar and 3D structure formations with continuous and discontinuous boundary/surfaces, which validate the proposed strategy, followed by a scalability analysis.

Submitted 28 July, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Comments: 8 pages, 6 figures, accepted for publication in IROS 2023. A link to the simulation videos is provided under the Validation section

arXiv:2108.12616 [pdf, other]

A Predictive Application Offloading Algorithm Using Small Datasets for Cloud Robotics

Authors: Manoj Penmetcha, Shyam Sundar Kannan, Byung-Cheol Min

Abstract: Many robotic applications that are critical for robot performance require immediate feedback, hence execution time is a critical concern. Furthermore, it is common that robots come with a fixed quantity of hardware resources; if an application requires more computational resources than the robot can accommodate, its onboard execution might be extended to a degree that degrades the robot performance. Cloud computing, on the other hand, features on-demand computational resources; by enabling robots to leverage those resources, application execution time can be reduced. The key to enabling robot use of cloud computing is designing an efficient offloading algorithm that makes optimum use of the robot onboard capabilities and also forms a quick consensus on when to offload without any prior knowledge or information about the application. In this paper, we propose a predictive algorithm to anticipate the time needed to execute an application for a given application data input size with the help of a small number of previous observations. To validate the algorithm, we train it on the previous N observations, which include independent (input data size) and dependent (execution time) variables. To understand how algorithm performance varies in terms of prediction accuracy and error, we tested various N values using linear regression and a mobile robot path planning application. From our experiments and analysis, we determined the algorithm to have acceptable error and prediction accuracy when N>40.

Submitted 28 August, 2021; originally announced August 2021.
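
Illustration (not part of the listing): the prediction step described in the abstract above, fitting a linear regression on the previous N observations of input data size and execution time, might be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name `ExecutionTimePredictor`, its methods, the sliding-window behavior, and the sample values are all hypothetical, and the paper's actual offloading decision rule is not reproduced here.

```python
from collections import deque


class ExecutionTimePredictor:
    """Hypothetical sketch: ordinary least squares over the last N observations."""

    def __init__(self, window):
        # Keep only the most recent `window` (input_size, exec_time) pairs.
        # The sliding window is an assumption; the abstract only says the
        # algorithm is trained on the previous N observations.
        self.observations = deque(maxlen=window)

    def observe(self, input_size, exec_time):
        """Record one completed run: independent and dependent variable."""
        self.observations.append((float(input_size), float(exec_time)))

    def predict(self, input_size):
        """Predict execution time for a new input size via simple linear regression."""
        n = len(self.observations)
        if n < 2:
            raise ValueError("need at least two observations to fit a line")
        mean_x = sum(x for x, _ in self.observations) / n
        mean_y = sum(y for _, y in self.observations) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in self.observations)
        if sxx == 0.0:
            # All input sizes identical: fall back to the mean execution time.
            return mean_y
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in self.observations)
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        return intercept + slope * float(input_size)


if __name__ == "__main__":
    # Synthetic, made-up data purely for demonstration.
    predictor = ExecutionTimePredictor(window=50)
    for size in range(1, 61):
        predictor.observe(size, 0.5 * size + 2.0)
    print(predictor.predict(100))
```

A caller could compare the predicted onboard time against an estimate of cloud execution plus transfer time to decide whether to offload; that comparison is an assumption here and is left out of the sketch.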

arXiv:2108.03045 [pdf, other]

External Human-Machine Interface on Delivery Robots: Expression of Navigation Intent of the Robot

Authors: Shyam Sundar Kannan, Ahreum Lee, Byung-Cheol Min

Abstract: External Human-Machine Interfaces (eHMI) are widely used on robots and autonomous vehicles to convey the machine's intent to humans. Delivery robots are becoming common, and they share the sidewalk with pedestrians. Current research has explored the design of eHMI and its effectiveness for social robots and autonomous vehicles, but the use of eHMIs on delivery robots remains unexplored. There is a knowledge gap on the effective use of eHMIs on delivery robots for indicating the robot's navigational intent to pedestrians. An online survey with 152 participants was conducted to investigate the comprehensibility of the display and light-based eHMIs that convey the delivery robot's navigational intent under common navigation scenarios. Results show that display is preferred over lights in conveying the intent. The preferred type of content to be displayed varies according to the scenarios. Additionally, light is preferred as an auxiliary eHMI to present redundant information. The findings of this study can contribute to the development of future designs of eHMI on delivery robots.

Submitted 6 August, 2021; originally announced August 2021.

Comments: Accepted at 30th IEEE International Conference on Robot and Human Interactive Communication (RO-MAN 2021)

arXiv:2105.01799 [pdf, other]

Towards End-to-End Deep Learning for Autonomous Racing: On Data Collection and a Unified Architecture for Steering and Throttle Prediction

Authors: Shakti N. Wadekar, Benjamin J. Schwartz, Shyam S. Kannan, Manuel Mar, Rohan Kumar Manna, Vishnu Chellapandi, Daniel J. Gonzalez, Aly El Gamal

Abstract: Deep Neural Networks (DNNs) which are trained end-to-end have been successfully applied to solve complex problems that we have not been able to solve in past decades. Autonomous driving is one of the most complex problems which is yet to be completely solved and autonomous racing adds more complexity and exciting challenges to this problem. Towards the challenge of applying end-to-end learning to autonomous racing, this paper shows results on two aspects: (1) Analyzing the relationship between the driving data used for training and the maximum speed at which the DNN can be successfully applied for predicting steering angle, (2) Neural network architecture and training methodology for learning steering and throttle without any feedback or recurrent connections.

Submitted 4 May, 2021; originally announced May 2021.

Comments: 6 pages, 10 figures, 3 tables

arXiv:2104.05503 [pdf, other]

Autonomous Drone Delivery to Your Door and Yard

Authors: Shyam Sundar Kannan, Byung-Cheol Min

Abstract: In this work, we present a system that enables delivery drones to autonomously navigate and deliver packages at various locations around a house according to the desire of the recipient and without the need for any external markers as currently used. This development is motivated by recent advancements in deep learning that can potentially supplant the specialized markers presently used by delivery drones for identifying sites at which to deliver packages. The proposed system is more natural in that it takes instruction on where to deliver the package as input, similar to the instructions provided to human couriers. First, we propose a semantic image segmentation-based descending location estimator that enables the drone to find a safe spot around the house at which it can descend from higher altitudes. Following this, we propose a strategy for visually routing the drone from the descent location to a specific site at which it is to deliver the package, such as the front door. We extensively evaluate this approach in a simulated environment and demonstrate that with our system, a delivery drone can deliver a package to the front door and also to other specified locations around a house. Relative to a frontier exploration-based strategy, drones using the proposed system found and reached the front doors of the 20 test houses 161% faster.

Submitted 10 May, 2022; v1 submitted 12 April, 2021; originally announced April 2021.

Comments: Accepted for publication in International Conference on Unmanned Aircraft Systems (ICUAS) 2022

arXiv:2007.13897 [pdf, other]

doi 10.1109/ACCESS.2020.3017659

Adaptive Workload Allocation for Multi-human Multi-robot Teams for Independent and Homogeneous Tasks

Authors: Tamzidul Mina, Shyam Sundar Kannan, Wonse Jo, Byung-Cheol Min

Abstract: Multi-human multi-robot (MH-MR) systems have the ability to combine the potential advantages of robotic systems with those of having humans in the loop. Robotic systems contribute precision performance and long operation on repetitive tasks without tiring, while humans in the loop improve situational awareness and enhance decision-making abilities. A system's ability to adapt allocated workload to changing conditions and the performance of each individual (human and robot) during the mission is vital to maintaining overall system performance. Previous works from literature including market-based and optimization approaches have attempted to address the task/workload allocation problem with focus on maximizing the system output without regarding individual agent conditions, lacking in real-time processing and have mostly focused exclusively on multi-robot systems. Given the variety of possible combinations of teams (autonomous robots and human-operated robots: any number of human operators operating any number of robots at a time) and the operational scale of MH-MR systems, development of a generalized framework of workload allocation has been a particularly challenging task. In this paper, we present such a framework for independent homogeneous missions, capable of adaptively allocating the system workload in relation to health conditions and work performances of human-operated and autonomous robots in real-time. The framework consists of removable modular function blocks ensuring its applicability to different MH-MR scenarios. A new workload transition function block ensures smooth transition without the workload change having adverse effects on individual agents. The effectiveness and scalability of the system's workload adaptability is validated by experiments applying the proposed framework in an MH-MR patrolling scenario with changing human and robot conditions, and failing robots.

Submitted 27 July, 2020; originally announced July 2020.

Comments: 14 pages, 13 figures, submitted to IEEE ACCESS. For associated file, see https://youtu.be/-WY49FPbNWg

arXiv:2006.05102 [pdf, other]

ROSbag-based Multimodal Affective Dataset for Emotional and Cognitive States

Authors: Wonse Jo, Shyam Sundar Kannan, Go-Eum Cha, Ahreum Lee, Byung-Cheol Min

Abstract: This paper introduces a new ROSbag-based multimodal affective dataset for emotional and cognitive states generated using Robot Operating System (ROS). We utilized images and sounds from the International Affective Pictures System (IAPS) and the International Affective Digitized Sounds (IADS) to stimulate targeted emotions (happiness, sadness, anger, fear, surprise, disgust, and neutral), and a dual N-back game to stimulate different levels of cognitive workload. 30 human subjects participated in the user study; their physiological data was collected using the latest commercial wearable sensors, behavioral data was collected using hardware devices such as cameras, and subjective assessments were carried out through questionnaires. All data was stored in single ROSbag files rather than in conventional comma-separated values (CSV) files. This not only ensures synchronization of signals and videos in a data set, but also allows researchers to easily analyze and verify their algorithms by connecting directly to this dataset through ROS. The generated affective dataset consists of 1,602 ROSbag files, and the size of the dataset is about 787 GB. The dataset is made publicly available. We expect that our dataset can be a great resource for many researchers in the fields of affective computing, HCI, and HRI.

Submitted 20 October, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

Comments: Accepted for publication in SMC2020, TORONTO, CANADA

arXiv:2006.03805 [pdf, other]

Investigating the Effect of Deictic Movements of a Multi-robot

Authors: Ahreum Lee, Wonse Jo, Shyam Sundar Kannan, Byung-Cheol Min

Abstract: Multi-robot systems are made up of a team of multiple robots, which provides the advantage of performing complex tasks with high efficiency, flexibility, and robustness. Although research on human-robot interaction is ongoing as robots become more readily available and easier to use, the study of interactions between a human and multiple robots represents a relatively new field of research. In particular, how multi-robots could be used for everyday users has not been extensively explored. Additionally, the impact of the characteristics of multiple robots on human perception and cognition in human multi-robot interaction should be further explored. In this paper, we specifically focus on the benefits of physical affordances generated by the movements of multi-robots, and investigate the effects of deictic movements of multi-robots on information retrieval by conducting a delayed free recall task.

Submitted 6 June, 2020; originally announced June 2020.

Comments: 13 pages, 9 figures

arXiv:2006.03784 [pdf, other]

A ROS-based Framework for Monitoring Human and Robot Conditions in a Human-Multi-robot Team

Authors: Wonse Jo, Shyam Sundar Kannan, Go-Eum Cha, Ahreum Lee, Byung-Cheol Min

Abstract: This paper presents a framework for monitoring human and robot conditions in human multi-robot interactions. The proposed framework consists of four modules: 1) human and robot conditions monitoring interface, 2) synchronization time filter, 3) data feature extraction interface, and 4) condition monitoring interface. The framework is based on Robot Operating System (ROS), and it supports physiological and behavioral sensors and devices and robot systems, as well as custom programs. Furthermore, it allows synchronizing the monitoring conditions and sharing them simultaneously. In order to validate the proposed framework, we present experiment results and analysis obtained from a user study in which 30 human subjects participated and from simulated robot experiments.

Submitted 6 June, 2020; originally announced June 2020.

Comments: 7 pages, 9 figures

arXiv:1912.02927 [pdf, other]

Smart Cloud: Scalable Cloud Robotic Architecture for Web-powered Multi-Robot Applications

Authors: Manoj Penmetcha, Shyam Sundar Kannan, Byung-Cheol Min

Abstract: Robots have inherently limited onboard processing, storage, and power capabilities. Cloud computing resources have the potential to provide significant advantages for robots in many applications. However, to make use of these resources, frameworks must be developed that facilitate robot interactions with cloud services. In this paper, we propose a cloud-based architecture called Smart Cloud that intends to overcome the physical limitations of single- or multi-robot systems through massively parallel computation, provided on demand by cloud services. Smart Cloud is implemented on Amazon Web Services (AWS) and available for robots running on the Robot Operating System (ROS) and on non-ROS systems. Smart Cloud features a first-of-its-kind architecture that incorporates JavaScript-based libraries to run various robotic applications related to machine learning and other methods. This paper presents the architecture and its performance in terms of CPU usage and latency, and finally validates it for navigation and machine learning applications.

Submitted 15 September, 2020; v1 submitted 5 December, 2019; originally announced December 2019.

arXiv:1812.05489 [pdf, other]

Material Mapping in Unknown Environments using Tapping Sound

Authors: Shyam Sundar Kannan, Wonse Jo, Ramviyas Parasuraman, Byung-Cheol Min

Abstract: In this paper, we propose an autonomous exploration and a tapping mechanism-based material mapping system for a mobile robot in unknown environments. The goal of the proposed system is to integrate simultaneous localization and mapping (SLAM) modules and sound-based material classification to enable a mobile robot to explore an unknown environment autonomously and at the same time identify the various objects and materials in the environment. This creates a material map that localizes the various materials in the environment which has potential applications for search and rescue scenarios. A tapping mechanism and tapping audio signal processing based on machine learning techniques are exploited for a robot to identify the objects and materials. We demonstrate the proposed system through experiments using a mobile robot platform installed with Velodyne LiDAR, a linear solenoid, and microphones in an exploration-like scenario with various materials. Experiment results demonstrate that the proposed system can create useful material maps in unknown environments.

Submitted 3 August, 2020; v1 submitted 13 December, 2018; originally announced December 2018.

Comments: Accepted for publication in IROS 2020, Las Vegas, NV, USA

Showing 1–15 of 15 results for author: Kannan, S S