Search | arXiv e-print repository

Teledrive: An Embodied AI based Telepresence System

Authors: Snehasis Banerjee, Sayan Paul, Ruddradev Roychoudhury, Abhijan Bhattacharya, Chayan Sarkar, Ashis Sau, Pradip Pramanick, Brojeshwar Bhowmick

Abstract: This article presents Teledrive, a telepresence robotic system with embodied AI features that empowers an operator to navigate the telerobot in any unknown remote place with minimal human intervention. We conceive Teledrive in the context of democratizing remote care-giving for elderly citizens as well as for isolated patients, affected by contagious diseases. In particular, this paper focuses on the problem of navigating to a rough target area (like bedroom or kitchen) rather than pre-specified point destinations. This ushers in a unique AreaGoal based navigation feature, which has not been explored in depth in the contemporary solutions. Further, we describe an edge computing-based software system built on a WebRTC-based communication framework to realize the aforementioned scheme through an easy-to-use speech-based human-robot interaction. Moreover, to enhance the ease of operation for the remote caregiver, we incorporate a person following feature, whereby a robot follows a person on the move in its premises as directed by the operator. Moreover, the system presented is loosely coupled with specific robot hardware, unlike the existing solutions. We have evaluated the efficacy of the proposed system through baseline experiments, user study, and real-life deployment.

Submitted 1 June, 2024; originally announced June 2024.

Comments: Accepted in Journal of Intelligent Robotic System

Journal ref: Journal of Intelligent Robotic System 2024

arXiv:2210.11543 [pdf, other]

Object Goal Navigation Based on Semantics and RGB Ego View

Authors: Snehasis Banerjee, Brojeshwar Bhowmick, Ruddra Dev Roychoudhury

Abstract: This paper presents an architecture and methodology to empower a service robot to navigate an indoor environment with semantic decision making, given RGB ego view. This method leverages the knowledge of robot's actuation capability and that of scenes, objects and their relations -- represented in a semantic form. The robot navigates based on GeoSem map - a relational combination of geometric and semantic map. The goal given to the robot is to find an object in an unknown environment with no navigational map and only egocentric RGB camera perception. The approach is tested both on a simulation environment and real life indoor settings. The presented approach was found to outperform human users in gamified evaluations with respect to average completion time.

Submitted 20 October, 2022; originally announced October 2022.

Comments: IROS 2022 AI&R Workshop

Journal ref: IROS 2022 AI&R Workshop

arXiv:2208.13031 [pdf, other]

Spatial Relation Graph and Graph Convolutional Network for Object Goal Navigation

Authors: D. A. Sasi Kiran, Kritika Anand, Chaitanya Kharyal, Gulshan Kumar, Nandiraju Gireesh, Snehasis Banerjee, Ruddra dev Roychoudhury, Mohan Sridharan, Brojeshwar Bhowmick, Madhava Krishna

Abstract: This paper describes a framework for the object-goal navigation task, which requires a robot to find and move to the closest instance of a target object class from a random starting position. The framework uses a history of robot trajectories to learn a Spatial Relational Graph (SRG) and Graph Convolutional Network (GCN)-based embeddings for the likelihood of proximity of different semantically-labeled regions and the occurrence of different object classes in these regions. To locate a target object instance during evaluation, the robot uses Bayesian inference and the SRG to estimate the visible regions, and uses the learned GCN embeddings to rank visible regions and select the region to explore next.

Submitted 27 August, 2022; originally announced August 2022.

Comments: CASE 2022 paper

arXiv:2207.14205 [pdf, other]

DoRO: Disambiguation of referred object for embodied agents

Authors: Pradip Pramanick, Chayan Sarkar, Sayan Paul, Ruddra dev Roychoudhury, Brojeshwar Bhowmick

Abstract: Robotic task instructions often involve a referred object that the robot must locate (ground) within the environment. While task intent understanding is an essential part of natural language understanding, less effort is made to resolve ambiguity that may arise while grounding the task. Existing works use vision-based task grounding and ambiguity detection, suitable for a fixed view and a static robot. However, the problem magnifies for a mobile robot, where the ideal view is not known beforehand. Moreover, a single view may not be sufficient to locate all the object instances in the given area, which leads to inaccurate ambiguity detection. Human intervention is helpful only if the robot can convey the kind of ambiguity it is facing. In this article, we present DoRO (Disambiguation of Referred Object), a system that can help an embodied agent to disambiguate the referred object by raising a suitable query whenever required. Given an area where the intended object is, DoRO finds all the instances of the object by aggregating observations from multiple views while exploring & scanning the area. It then raises a suitable query using the information from the grounded object instances. Experiments conducted with the AI2Thor simulator show that DoRO not only detects the ambiguity more accurately but also raises verbose queries with more accurate information from the visual-language grounding.

Submitted 28 July, 2022; originally announced July 2022.

Comments: Accepted in IEEE Robotics & Automation Letters (RA-L)

arXiv:2203.02959 [pdf]

A Perspective on Robotic Telepresence and Teleoperation using Cognition: Are we there yet?

Authors: Hrishav Bakul Barua, Ashis Sau, Ruddra dev Roychoudhury

Abstract: Telepresence and teleoperation robotics have attracted a great amount of attention in the last 10 years. With the Artificial Intelligence (AI) revolution already being started, we can see a wide range of robotic applications being realized. Intelligent robotic systems are being deployed both in industrial and domestic environments. Telepresence is the idea of being present in a remote location virtually or via robotic avatars. Similarly, the idea of operating a robot from a remote location for various tasks is called teleoperation. These technologies find significant application in health care, education, surveillance, disaster recovery, and corporate/government sectors. But questions still remain about their maturity, security and safety levels. We also need to think about enhancing the user experience and trust in such technologies going into the next generation of computing.

Submitted 6 March, 2022; originally announced March 2022.

MSC Class: Artificial intelligence; Computer vision; Robotics; Machine learning; Deep learning

ACM Class: I.2; I.2.9; I.2.10

arXiv:2108.06478 [pdf, other]

Sharing Cognition: Human Gesture and Natural Language Grounding Based Planning and Navigation for Indoor Robots

Authors: Gourav Kumar, Soumyadip Maity, Ruddra dev Roychoudhury, Brojeshwar Bhowmick

Abstract: Cooperation among humans makes it easy to execute tasks and navigate seamlessly even in unknown scenarios. With our individual knowledge and collective cognition skills, we can reason about and perform well in unforeseen situations and environments. To achieve a similar potential for a robot navigating among humans and interacting with them, it is crucial for it to acquire the ability for easy, efficient and natural ways of communication and cognition sharing with humans. In this work, we aim to exploit human gestures, which are known to be the most prominent modality of communication after speech. We demonstrate how the incorporation of gestures for communicating spatial understanding can be achieved in a very simple yet effective way using a robot having the vision and listening capability. This shows a big advantage over using only Vision and Language-based Navigation, Language Grounding or Human-Robot Interaction in a task requiring the development of cognition and indoor navigation. We adapt the state-of-the-art modules of Language Grounding and Human-Robot Interaction to demonstrate a novel system pipeline in real-world environments on a Telepresence robot for performing a set of challenging tasks. To the best of our knowledge, this is the first pipeline to couple the fields of HRI and language grounding in an indoor environment to demonstrate autonomous navigation.

Submitted 14 August, 2021; originally announced August 2021.

arXiv:2104.12032 [pdf]

The Design of the User Interfaces for Privacy Enhancements for Android

Authors: Jason I. Hong, Yuvraj Agarwal, Matt Fredrikson, Mike Czapik, Shawn Hanna, Swarup Sahoo, Judy Chun, Won-Woo Chung, Aniruddh Iyer, Ally Liu, Shen Lu, Rituparna Roychoudhury, Qian Wang, Shan Wang, Siqi Wang, Vida Zhang, Jessica Zhao, Yuan Jiang, Haojian Jin, Sam Kim, Evelyn Kuo, Tianshi Li, Jinping Liu, Yile Liu, Robert Zhang

Abstract: We present the design and design rationale for the user interfaces for Privacy Enhancements for Android (PE for Android). These UIs are built around two core ideas, namely that developers should explicitly declare the purpose of why sensitive data is being used, and these permission-purpose pairs should be split by first party and third party uses. We also present a taxonomy of purposes and ways of how these ideas can be deployed in the existing Android ecosystem.

Submitted 24 April, 2021; originally announced April 2021.

Comments: 58 pages, 21 figures, 3 tables

arXiv:2007.12990 [pdf, other]

Demo: Edge-centric Telepresence Avatar Robot for Geographically Distributed Environment

Authors: Ashis Sau, Ruddra Dev Roychoudhury, Hrishav Bakul Barua, Chayan Sarkar, Sayan Paul, Brojeshwar Bhowmick, Arpan Pal, Balamuralidhar P

Abstract: Using a robotic platform for telepresence applications has gained paramount importance in this decade. Scenarios such as remote meetings, group discussions, and presentations/talks in seminars and conferences get much attention in this regard. Though there exist some robotic platforms for such telepresence applications, they lack efficacy in communication and interaction between the remote person and the avatar robot deployed in another geographic location. Also, such existing systems are often cloud-centric, which adds to their network overhead woes. In this demo, we develop and test a framework that brings the best of both cloud and edge-centric systems together along with a newly designed communication protocol. Our solution adds to the improvement of the existing systems in terms of robustness and efficacy in communication for a geographically distributed environment.

Submitted 25 July, 2020; originally announced July 2020.

Showing 1–8 of 8 results for author: Roychoudhury, R