-
"One Soy Latte for Daniel": Visual and Movement Communication of Intention from a Robot Waiter to a Group of Customers
Authors:
Seung Chan Hong,
Leimin Tian,
Akansel Cosgun,
Dana Kulić
Abstract:
Service robots are increasingly employed in the hospitality industry for delivering food orders in restaurants. However, in current practice the robot often arrives at a fixed location for each table when delivering orders to different patrons in the same dining group, thus requiring a human staff member or the customers themselves to identify and retrieve each order. This study investigates how t…
▽ More
Service robots are increasingly employed in the hospitality industry for delivering food orders in restaurants. However, in current practice the robot often arrives at a fixed location for each table when delivering orders to different patrons in the same dining group, thus requiring a human staff member or the customers themselves to identify and retrieve each order. This study investigates how to improve the robot's service behaviours to facilitate clear intention communication to a group of users, thus achieving accurate delivery and positive user experiences. Specifically, we conduct user studies (N=30) with a Temi service robot as a representative delivery robot currently adopted in restaurants. We investigated two factors in the robot's intent communication, namely visualisation and movement trajectories, and their influence on the objective and subjective interaction outcomes. A robot personalising its movement trajectory and stopping location in addition to displaying a visualisation of the order yields more accurate intent communication and successful order delivery, as well as more positive user perception towards the robot and its service. Our results also showed that individuals in a group have different interaction experiences.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
Robots Have Been Seen and Not Heard: Effects of Consequential Sounds on Human-Perception of Robots
Authors:
Aimee Allen,
Tom Drummond,
Dana Kulic
Abstract:
Many people expect robots to move fairly quietly, or make pleasant "beep boop" sounds or jingles similar to what they have observed in videos of robots. Unfortunately, this expectation of quietness does not match reality, as robots make machine sounds, known as 'consequential sounds', as they move and operate. As robots become more prevalent within society, understanding the sounds produced by rob…
▽ More
Many people expect robots to move fairly quietly, or make pleasant "beep boop" sounds or jingles similar to what they have observed in videos of robots. Unfortunately, this expectation of quietness does not match reality, as robots make machine sounds, known as 'consequential sounds', as they move and operate. As robots become more prevalent within society, understanding the sounds produced by robots and how these sounds are perceived by people is becoming increasingly important for positive human robot interactions (HRI). This paper investigates how people respond to the consequential sounds of robots, specifically how robots make a participant feel, how much they like the robot, would be distracted by the robot, and a person's desire to colocate with robots. Participants were shown 5 videos of different robots and asked their opinions on the robots and the sounds they made. This was compared with a control condition of completely silent videos. The results in this paper demonstrate with data from 182 participants (858 trials) that consequential sounds produced by robots have a significant negative effect on human perceptions of robots. Firstly there were increased negative 'associated affects' of the participants, such as making them feel more uncomfortable or agitated around the robot. Secondly, the presence of consequential sounds correlated with participants feeling more distracted and less able to focus. Thirdly participants reported being less likely to want to colocate in a shared environment with robots.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Learning to Communicate Functional States with Nonverbal Expressions for Improved Human-Robot Collaboration
Authors:
Liam Roy,
Dana Kulic,
Elizabeth Croft
Abstract:
Collaborative robots must effectively communicate their internal state to humans to enable a smooth interaction. Nonverbal communication is widely used to communicate information during human-robot interaction, however, such methods may also be misunderstood, leading to communication errors. In this work, we explore modulating the acoustic parameter values (pitch bend, beats per minute, beats per…
▽ More
Collaborative robots must effectively communicate their internal state to humans to enable a smooth interaction. Nonverbal communication is widely used to communicate information during human-robot interaction, however, such methods may also be misunderstood, leading to communication errors. In this work, we explore modulating the acoustic parameter values (pitch bend, beats per minute, beats per loop) of nonverbal auditory expressions to convey functional robot states (accomplished, progressing, stuck). We propose a reinforcement learning (RL) algorithm based on noisy human feedback to produce accurately interpreted nonverbal auditory expressions. The proposed approach was evaluated through a user study with 24 participants. The results demonstrate that: 1. Our proposed RL-based approach is able to learn suitable acoustic parameter values which improve the users' ability to correctly identify the state of the robot. 2. Algorithm initialization informed by previous user data can be used to significantly speed up the learning process. 3. The method used for algorithm initialization strongly influences whether participants converge to similar sounds for each robot state. 4. Modulation of pitch bend has the largest influence on user association between sounds and robotic states.
△ Less
Submitted 30 April, 2024;
originally announced April 2024.
-
Alternative Interfaces for Human-initiated Natural Language Communication and Robot-initiated Haptic Feedback: Towards Better Situational Awareness in Human-Robot Collaboration
Authors:
Callum Bennie,
Bridget Casey,
Cecile Paris,
Dana Kulic,
Brendan Tidd,
Nicholas Lawrance,
Alex Pitt,
Fletcher Talbot,
Jason Williams,
David Howard,
Pavan Sikka,
Hashini Senaratne
Abstract:
This article presents an implementation of a natural-language speech interface and a haptic feedback interface that enables a human supervisor to provide guidance to, request information, and receive status updates from a Spot robot. We provide insights gained during preliminary user testing of the interface in a realistic robot exploration scenario.
This article presents an implementation of a natural-language speech interface and a haptic feedback interface that enables a human supervisor to provide guidance to, request information, and receive status updates from a Spot robot. We provide insights gained during preliminary user testing of the interface in a realistic robot exploration scenario.
△ Less
Submitted 24 January, 2024;
originally announced January 2024.
-
How Can Everyday Users Efficiently Teach Robots by Demonstrations?
Authors:
Maram Sakr,
Zhikai Zhang,
Benjamin Li,
Haomiao Zhang,
H. F. Machiel Van der Loos,
Dana Kulic,
Elizabeth Croft
Abstract:
Learning from Demonstration (LfD) is a framework that allows lay users to easily program robots. However, the efficiency of robot learning and the robot's ability to generalize to task variations hinges upon the quality and quantity of the provided demonstrations. Our objective is to guide human teachers to furnish more effective demonstrations, thus facilitating efficient robot learning. To achie…
▽ More
Learning from Demonstration (LfD) is a framework that allows lay users to easily program robots. However, the efficiency of robot learning and the robot's ability to generalize to task variations hinges upon the quality and quantity of the provided demonstrations. Our objective is to guide human teachers to furnish more effective demonstrations, thus facilitating efficient robot learning. To achieve this, we propose to use a measure of uncertainty, namely task-related information entropy, as a criterion for suggesting informative demonstration examples to human teachers to improve their teaching skills. In a conducted experiment (N=24), an augmented reality (AR)-based guidance system was employed to train novice users to produce additional demonstrations from areas with the highest entropy within the workspace. These novice users were trained for a few trials to teach the robot a generalizable task using a limited number of demonstrations. Subsequently, the users' performance after training was assessed first on the same task (retention) and then on a novel task (transfer) without guidance. The results indicated a substantial improvement in robot learning efficiency from the teacher's demonstrations, with an improvement of up to 198% observed on the novel task. Furthermore, the proposed approach was compared to a state-of-the-art heuristic rule and found to improve robot learning efficiency by 210% compared to the heuristic rule.
△ Less
Submitted 19 October, 2023;
originally announced October 2023.
-
Auditory cueing strategy for stride length and cadence modification: a feasibility study with healthy adults
Authors:
Tina LY Wu,
Anna Murphy,
Chao Chen,
Dana Kulic
Abstract:
People with Parkinson's Disease experience gait impairments that significantly impact their quality of life. Visual, auditory, and tactile cues can alleviate gait impairments, but they can become less effective due to the progressive nature of the disease and changes in people's motor capability. In this study, we develop a human-in-the-loop (HIL) framework that monitors two key gait parameters, s…
▽ More
People with Parkinson's Disease experience gait impairments that significantly impact their quality of life. Visual, auditory, and tactile cues can alleviate gait impairments, but they can become less effective due to the progressive nature of the disease and changes in people's motor capability. In this study, we develop a human-in-the-loop (HIL) framework that monitors two key gait parameters, stride length and cadence, and continuously learns a person-specific model of how the parameters change in response to the feedback. The model is then used in an optimization algorithm to improve the gait parameters. This feasibility study examines whether auditory cues can be used to influence stride length in people without gait impairments. The results demonstrate the benefits of the HIL framework in maintaining people's stride length in the presence of a secondary task.
△ Less
Submitted 14 August, 2023;
originally announced August 2023.
-
Robotic Vision for Human-Robot Interaction and Collaboration: A Survey and Systematic Review
Authors:
Nicole Robinson,
Brendan Tidd,
Dylan Campbell,
Dana Kulić,
Peter Corke
Abstract:
Robotic vision for human-robot interaction and collaboration is a critical process for robots to collect and interpret detailed information related to human actions, goals, and preferences, enabling robots to provide more useful services to people. This survey and systematic review presents a comprehensive analysis on robotic vision in human-robot interaction and collaboration over the last 10 yea…
▽ More
Robotic vision for human-robot interaction and collaboration is a critical process for robots to collect and interpret detailed information related to human actions, goals, and preferences, enabling robots to provide more useful services to people. This survey and systematic review presents a comprehensive analysis on robotic vision in human-robot interaction and collaboration over the last 10 years. From a detailed search of 3850 articles, systematic extraction and evaluation was used to identify and explore 310 papers in depth. These papers described robots with some level of autonomy using robotic vision for locomotion, manipulation and/or visual communication to collaborate or interact with people. This paper provides an in-depth analysis of current trends, common domains, methods and procedures, technical processes, data sets and models, experimental testing, sample populations, performance metrics and future challenges. This manuscript found that robotic vision was often used in action and gesture recognition, robot movement in human spaces, object handover and collaborative actions, social communication and learning from demonstration. Few high-impact and novel techniques from the computer vision field had been translated into human-robot interaction and collaboration. Overall, notable advancements have been made on how to develop and deploy robots to assist people.
△ Less
Submitted 28 July, 2023;
originally announced July 2023.
-
Learning to Assist and Communicate with Novice Drone Pilots for Expert Level Performance
Authors:
Kal Backman,
Dana Kulić,
Hoam Chung
Abstract:
Multi-task missions for unmanned aerial vehicles (UAVs) involving inspection and landing tasks are challenging for novice pilots due to the difficulties associated with depth perception and the control interface. We propose a shared autonomy system, alongside supplementary information displays, to assist pilots to successfully complete multi-task missions without any pilot training. Our approach c…
▽ More
Multi-task missions for unmanned aerial vehicles (UAVs) involving inspection and landing tasks are challenging for novice pilots due to the difficulties associated with depth perception and the control interface. We propose a shared autonomy system, alongside supplementary information displays, to assist pilots to successfully complete multi-task missions without any pilot training. Our approach comprises of three modules: (1) a perception module that encodes visual information onto a latent representation, (2) a policy module that augments pilot's actions, and (3) an information augmentation module that provides additional information to the pilot. The policy module is trained in simulation with simulated users and transferred to the real world without modification in a user study (n=29), alongside supplementary information schemes including learnt red/green light feedback cues and an augmented reality display. The pilot's intent is unknown to the policy module and is inferred from the pilot's input and UAV's states. The assistant increased task success rate for the landing and inspection tasks from [16.67% & 54.29%] respectively to [95.59% & 96.22%]. With the assistant, inexperienced pilots achieved similar performance to experienced pilots. Red/green light feedback cues reduced the required time by 19.53% and trajectory length by 17.86% for the inspection task, where participants rated it as their preferred condition due to the intuitive interface and providing reassurance. This work demonstrates that simple user models can train shared autonomy systems in simulation, and transfer to physical tasks to estimate user intent and provide effective assistance and information to the pilot.
△ Less
Submitted 15 June, 2023;
originally announced June 2023.
-
Towards vision-based dual arm robotic fruit harvesting
Authors:
Ege Gursoy,
Benjamin Navarro,
Akansel Cosgun,
Dana Kulić,
Andrea Cherubini
Abstract:
Interest in agricultural robotics has increased considerably in recent years due to benefits such as improvement in productivity and labor reduction. However, current problems associated with unstructured environments make the development of robotic harvesters challenging. Most research in agricultural robotics focuses on single arm manipulation. Here, we propose a dual-arm approach. We present a…
▽ More
Interest in agricultural robotics has increased considerably in recent years due to benefits such as improvement in productivity and labor reduction. However, current problems associated with unstructured environments make the development of robotic harvesters challenging. Most research in agricultural robotics focuses on single arm manipulation. Here, we propose a dual-arm approach. We present a dual-arm fruit harvesting robot equipped with a RGB-D camera, cutting and collecting tools. We exploit the cooperative task description to maximize the capabilities of the dual-arm robot. We designed a Hierarchical Quadratic Programming based control strategy to fulfill the set of hard constrains related to the robot and environment: robot joint limits, robot self-collisions, robot-fruit and robot-tree collisions. We combine deep learning and standard image processing algorithms to detect and track fruits as well as the tree trunk in the scene. We validate our perception methods on real-world RGB-D images and our control method on simulated experiments.
△ Less
Submitted 16 June, 2023; v1 submitted 14 June, 2023;
originally announced June 2023.
-
A Benchmark for Cycling Close Pass Near Miss Event Detection from Video Streams
Authors:
Mingjie Li,
Tharindu Rathnayake,
Ben Beck,
Lingheng Meng,
Zijue Chen,
Akansel Cosgun,
Xiaojun Chang,
Dana Kulić
Abstract:
Cycling is a healthy and sustainable mode of transport. However, interactions with motor vehicles remain a key barrier to increased cycling participation. The ability to detect potentially dangerous interactions from on-bike sensing could provide important information to riders and policy makers. Thus, automated detection of conflict between cyclists and drivers has attracted researchers from both…
▽ More
Cycling is a healthy and sustainable mode of transport. However, interactions with motor vehicles remain a key barrier to increased cycling participation. The ability to detect potentially dangerous interactions from on-bike sensing could provide important information to riders and policy makers. Thus, automated detection of conflict between cyclists and drivers has attracted researchers from both computer vision and road safety communities. In this paper, we introduce a novel benchmark, called Cyc-CP, towards cycling close pass near miss event detection from video streams. We first divide this task into scene-level and instance-level problems. Scene-level detection asks an algorithm to predict whether there is a close pass near miss event in the input video clip. Instance-level detection aims to detect which vehicle in the scene gives rise to a close pass near miss. We propose two benchmark models based on deep learning techniques for these two problems. For training and testing those models, we construct a synthetic dataset and also collect a real-world dataset. Our models can achieve 88.13% and 84.60% accuracy on the real-world dataset, respectively. We envision this benchmark as a test-bed to accelerate cycling close pass near miss detection and facilitate interaction between the fields of road safety, intelligent transportation systems and artificial intelligence. Both the benchmark datasets and detection models will be available at https://github.com/SustainableMobility/cyc-cp to facilitate experimental reproducibility and encourage more in-depth research in the field.
△ Less
Submitted 24 April, 2023;
originally announced April 2023.
-
Rotating Objects via In-Hand Pivoting using Vision, Force and Touch
Authors:
Shiyu Xu,
Tianyuan Liu,
Michael Wong,
Dana Kulić,
Akansel Cosgun
Abstract:
We propose a robotic manipulation system that can pivot objects on a surface using vision, wrist force and tactile sensing. We aim to control the rotation of an object around the grip point of a parallel gripper by allowing rotational slip, while maintaining a desired wrist force profile. Our approach runs an end-effector position controller and a gripper width controller concurrently in a closed…
▽ More
We propose a robotic manipulation system that can pivot objects on a surface using vision, wrist force and tactile sensing. We aim to control the rotation of an object around the grip point of a parallel gripper by allowing rotational slip, while maintaining a desired wrist force profile. Our approach runs an end-effector position controller and a gripper width controller concurrently in a closed loop. The position controller maintains a desired force using vision and wrist force. The gripper controller uses tactile sensing to keep the grip firm enough to prevent translational slip, but loose enough to induce rotational slip. Our sensor-based control approach relies on matching a desired force profile derived from object dimensions and weight and vision-based monitoring of the object pose. The gripper controller uses tactile sensors to detect and prevent translational slip by tightening the grip when needed. Experimental results where the robot was tasked with rotating cuboid objects 90 degrees show that the multi-modal pivoting approach was able to rotate the objects without causing lift or slip, and was more energy-efficient compared to using a single sensor modality and to pick-and-place. While our work demonstrated the benefit of multi-modal sensing for the pivoting task, further work is needed to generalize our approach to any given object.
△ Less
Submitted 5 October, 2023; v1 submitted 20 March, 2023;
originally announced March 2023.
-
Crafting with a Robot Assistant: Use Social Cues to Inform Adaptive Handovers in Human-Robot Collaboration
Authors:
Leimin Tian,
Kerry He,
Shiyu Xu,
Akansel Cosgun,
Dana Kulić
Abstract:
We study human-robot handovers in a naturalistic collaboration scenario, where a mobile manipulator robot assists a person during a crafting session by providing and retrieving objects used for wooden piece assembly (functional activities) and painting (creative activities). We collect quantitative and qualitative data from 20 participants in a Wizard-of-Oz study, generating the Functional And Cre…
▽ More
We study human-robot handovers in a naturalistic collaboration scenario, where a mobile manipulator robot assists a person during a crafting session by providing and retrieving objects used for wooden piece assembly (functional activities) and painting (creative activities). We collect quantitative and qualitative data from 20 participants in a Wizard-of-Oz study, generating the Functional And Creative Tasks Human-Robot Collaboration dataset (the FACT HRC dataset), available to the research community. This work illustrates how social cues and task context inform the temporal-spatial coordination in human-robot handovers, and how human-robot collaboration is shaped by and in turn influences people's functional and creative activities.
△ Less
Submitted 4 April, 2023; v1 submitted 7 January, 2023;
originally announced January 2023.
-
Human-Robot Team Performance Compared to Full Robot Autonomy in 16 Real-World Search and Rescue Missions: Adaptation of the DARPA Subterranean Challenge
Authors:
Nicole Robinson,
Jason Williams,
David Howard,
Brendan Tidd,
Fletcher Talbot,
Brett Wood,
Alex Pitt,
Navinda Kottege,
Dana Kulić
Abstract:
Human operators in human-robot teams are commonly perceived to be critical for mission success. To explore the direct and perceived impact of operator input on task success and team performance, 16 real-world missions (10 hrs) were conducted based on the DARPA Subterranean Challenge. These missions were to deploy a heterogeneous team of robots for a search task to locate and identify artifacts suc…
▽ More
Human operators in human-robot teams are commonly perceived to be critical for mission success. To explore the direct and perceived impact of operator input on task success and team performance, 16 real-world missions (10 hrs) were conducted based on the DARPA Subterranean Challenge. These missions were to deploy a heterogeneous team of robots for a search task to locate and identify artifacts such as climbing rope, drills and mannequins representing human survivors. Two conditions were evaluated: human operators that could control the robot team with state-of-the-art autonomy (Human-Robot Team) compared to autonomous missions without human operator input (Robot-Autonomy). Human-Robot Teams were often in directed autonomy mode (70% of mission time), found more items, traversed more distance, covered more unique ground, and had a higher time between safety-related events. Human-Robot Teams were faster at finding the first artifact, but slower to respond to information from the robot team. In routine conditions, scores were comparable for artifacts, distance, and coverage. Reasons for intervention included creating waypoints to prioritise high-yield areas, and to navigate through error-prone spaces. After observing robot autonomy, operators reported increases in robot competency and trust, but that robot behaviour was not always transparent and understandable, even after high mission performance.
△ Less
Submitted 11 December, 2022;
originally announced December 2022.
-
Adapting Neural Models with Sequential Monte Carlo Dropout
Authors:
Pamela Carreno-Medrano,
Dana Kulić,
Michael Burke
Abstract:
The ability to adapt to changing environments and settings is essential for robots acting in dynamic and unstructured environments or working alongside humans with varied abilities or preferences. This work introduces an extremely simple and effective approach to adapting neural models in response to changing settings. We first train a standard network using dropout, which is analogous to learning…
▽ More
The ability to adapt to changing environments and settings is essential for robots acting in dynamic and unstructured environments or working alongside humans with varied abilities or preferences. This work introduces an extremely simple and effective approach to adapting neural models in response to changing settings. We first train a standard network using dropout, which is analogous to learning an ensemble of predictive models or distribution over predictions. At run-time, we use a particle filter to maintain a distribution over dropout masks to adapt the neural model to changing settings in an online manner. Experimental results show improved performance in control problems requiring both online and look-ahead prediction, and showcase the interpretability of the inferred masks in a human behaviour modelling task for drone teleoperation.
△ Less
Submitted 27 October, 2022;
originally announced October 2022.
-
In-Hand Gravitational Pivoting Using Tactile Sensing
Authors:
Jason Toskov,
Rhys Newbury,
Mustafa Mukadam,
Dana Kulić,
Akansel Cosgun
Abstract:
We study gravitational pivoting, a constrained version of in-hand manipulation, where we aim to control the rotation of an object around the grip point of a parallel gripper. To achieve this, instead of controlling the gripper to avoid slip, we embrace slip to allow the object to rotate in-hand. We collect two real-world datasets, a static tracking dataset and a controller-in-the loop dataset, bot…
▽ More
We study gravitational pivoting, a constrained version of in-hand manipulation, where we aim to control the rotation of an object around the grip point of a parallel gripper. To achieve this, instead of controlling the gripper to avoid slip, we embrace slip to allow the object to rotate in-hand. We collect two real-world datasets, a static tracking dataset and a controller-in-the loop dataset, both annotated with object angle and angular velocity labels. Both datasets contain force-based tactile information on ten different household objects. We train an LSTM model to predict the angular position and velocity of the held object from purely tactile data. We integrate this model with a controller that opens and closes the gripper allowing the object to rotate to desired relative angles. We conduct real-world experiments where the robot is tasked to achieve a relative target angle. We show that our approach outperforms a sliding-window based MLP in a zero-shot generalization setting with unseen objects. Furthermore, we show a 16.6% improvement in performance when the LSTM model is fine-tuned on a small set of data collected with both the LSTM model and the controller in-the-loop. Code and videos are available at https://rhys-newbury.github.io/projects/pivoting/
△ Less
Submitted 10 October, 2022;
originally announced October 2022.
-
Partial Observability during DRL for Robot Control
Authors:
Lingheng Meng,
Rob Gorbet,
Dana Kulić
Abstract:
Deep Reinforcement Learning (DRL) has made tremendous advances in both simulated and real-world robot control tasks in recent years. Nevertheless, applying DRL to novel robot control tasks is still challenging, especially when researchers have to design the action and observation space and the reward function. In this paper, we investigate partial observability as a potential failure source of app…
▽ More
Deep Reinforcement Learning (DRL) has made tremendous advances in both simulated and real-world robot control tasks in recent years. Nevertheless, applying DRL to novel robot control tasks is still challenging, especially when researchers have to design the action and observation space and the reward function. In this paper, we investigate partial observability as a potential failure source of applying DRL to robot control tasks, which can occur when researchers are not confident whether the observation space fully represents the underlying state. We compare the performance of three common DRL algorithms, TD3, SAC and PPO under various partial observability conditions. We find that TD3 and SAC become easily stuck in local optima and underperform PPO. We propose multi-step versions of the vanilla TD3 and SAC to improve robustness to partial observability based on one-step bootstrapping.
△ Less
Submitted 11 September, 2022;
originally announced September 2022.
-
Varying Joint Patterns and Compensatory Strategies Can Lead to the Same Functional Gait Outcomes: A Case Study
Authors:
T. Bacek,
M. Sun,
H. Liu,
Z. Chen,
D. Kulic,
D. Oetomo,
Y. Tan
Abstract:
This paper analyses joint-space walking mechanisms and redundancies in delivering functional gait outcomes. Multiple biomechanical measures are analysed for two healthy male adults who participated in a multi-factorial study and walked during three sessions. Both participants employed varying intra- and inter-personal compensatory strategies (e.g., vaulting, hip hiking) across walking conditions a…
▽ More
This paper analyses joint-space walking mechanisms and redundancies in delivering functional gait outcomes. Multiple biomechanical measures are analysed for two healthy male adults who participated in a multi-factorial study and walked during three sessions. Both participants employed varying intra- and inter-personal compensatory strategies (e.g., vaulting, hip hiking) across walking conditions and exhibited notable gait pattern alterations while keeping task-space (functional) gait parameters invariant. They also preferred various levels of asymmetric step length but kept their symmetric step time consistent and cadence-invariant during free walking. The results demonstrate the importance of an individualised approach and the need for a paradigm shift from functional (task-space) to joint-space gait analysis in attending to (a)typical gaits and delivering human-centred human-robot interaction.
△ Less
Submitted 26 June, 2022;
originally announced June 2022.
-
Quantifying Demonstration Quality for Robot Learning and Generalization
Authors:
Maram Sakr,
Zexi Jesse Li,
H. F. Machiel Van der Loos,
Dana Kulic,
Elizabeth A. Croft
Abstract:
Learning from Demonstration (LfD) seeks to democratize robotics by enabling diverse end-users to teach robots to perform a task by providing demonstrations. However, most LfD techniques assume users provide optimal demonstrations. This is not always the case in real applications where users are likely to provide demonstrations of varying quality, that may change with expertise and other factors. D…
▽ More
Learning from Demonstration (LfD) seeks to democratize robotics by enabling diverse end-users to teach robots to perform a task by providing demonstrations. However, most LfD techniques assume users provide optimal demonstrations. This is not always the case in real applications where users are likely to provide demonstrations of varying quality, that may change with expertise and other factors. Demonstration quality plays a crucial role in robot learning and generalization. Hence, it is important to quantify the quality of the provided demonstrations before using them for robot learning. In this paper, we propose quantifying the quality of the demonstrations based on how well they perform in the learned task. We hypothesize that task performance can give an indication of the generalization performance on similar tasks. The proposed approach is validated in a user study (N = 27). Users with different robotics expertise levels were recruited to teach a PR2 robot a generic task (pressing a button) under different task constraints. They taught the robot in two sessions on two different days to capture their teaching behaviour across sessions. The task performance was utilized to classify the provided demonstrations into high-quality and low-quality sets. The results show a significant Pearson correlation coefficient (R = 0.85, p < 0.0001) between the task performance and generalization performance across all participants. We also found that users clustered into two groups: Users who provided high-quality demonstrations from the first session, assigned to the fast-adapters group, and users who provided low-quality demonstrations in the first session and then improved with practice, assigned to the slow-adapters group. These results highlight the importance of quantifying demonstration quality, which can be indicative of the adaptation level of the user to the task.
△ Less
Submitted 25 March, 2022;
originally announced March 2022.
-
Natural Language Communication with a Teachable Agent
Authors:
Rachel Love,
Edith Law,
Philip R. Cohen,
Dana Kulić
Abstract:
Conversational teachable agents offer a promising platform to support learning, both in the classroom and in remote settings. In this context, the agent takes the role of the novice, while the student takes on the role of teacher. This framing is significant for its ability to elicit the Protégé effect in the student-teacher, a pedagogical phenomenon known to increase engagement in the teaching ta…
▽ More
Conversational teachable agents offer a promising platform to support learning, both in the classroom and in remote settings. In this context, the agent takes the role of the novice, while the student takes on the role of teacher. This framing is significant for its ability to elicit the Protégé effect in the student-teacher, a pedagogical phenomenon known to increase engagement in the teaching task, and also improve cognitive outcomes. In prior work, teachable agents often take a passive role in the learning interaction, and there are few studies in which the agent and student engage in natural language dialogue during the teaching task. This work investigates the effect of teaching modality when interacting with a virtual agent, via the web-based teaching platform, the Curiosity Notebook. A method of teaching the agent by selecting sentences from source material is compared to a method paraphrasing the source material and typing text input to teach. A user study has been conducted to measure the effect teaching modality on the learning outcomes and engagement of the participants. The results indicate that teaching via paraphrasing and text input has a positive effect on learning outcomes for the material covered, and also on aspects of affective engagement. Furthermore, increased paraphrasing effort, as measured by the similarity between the source material and the material the teacher conveyed to the robot, improves learning outcomes for participants.
△ Less
Submitted 16 March, 2022;
originally announced March 2022.
-
On-The-Go Robot-to-Human Handovers with a Mobile Manipulator
Authors:
Kerry He,
Pradeepsundar Simini,
Wesley Chan,
Dana Kulić,
Elizabeth Croft,
Akansel Cosgun
Abstract:
Existing approaches to direct robot-to-human handovers are typically implemented on fixed-base robot arms, or on mobile manipulators that come to a full stop before performing the handover. We propose "on-the-go" handovers which permit a moving mobile manipulator to hand over an object to a human without stopping. The on-the-go handover motion is generated with a reactive controller that allows si…
▽ More
Existing approaches to direct robot-to-human handovers are typically implemented on fixed-base robot arms, or on mobile manipulators that come to a full stop before performing the handover. We propose "on-the-go" handovers which permit a moving mobile manipulator to hand over an object to a human without stopping. The on-the-go handover motion is generated with a reactive controller that allows simultaneous control of the base and the arm. In a user study, human receivers subjectively assessed on-the-go handovers to be more efficient, predictable, natural, better timed and safer than handovers that implemented a "stop-and-deliver" behavior.
△ Less
Submitted 16 March, 2022;
originally announced March 2022.
-
Visibility Maximization Controller for Robotic Manipulation
Authors:
Kerry He,
Rhys Newbury,
Tin Tran,
Jesse Haviland,
Ben Burgess-Limerick,
Dana Kulić,
Peter Corke,
Akansel Cosgun
Abstract:
Occlusions caused by a robot's own body is a common problem for closed-loop control methods employed in eye-to-hand camera setups. We propose an optimization-based reactive controller that minimizes self-occlusions while achieving a desired goal pose. The approach allows coordinated control between the robot's base, arm and head by encoding the line-of-sight visibility to the target as a soft cons…
▽ More
Occlusions caused by a robot's own body is a common problem for closed-loop control methods employed in eye-to-hand camera setups. We propose an optimization-based reactive controller that minimizes self-occlusions while achieving a desired goal pose. The approach allows coordinated control between the robot's base, arm and head by encoding the line-of-sight visibility to the target as a soft constraint along with other task-related constraints, and solving for feasible joint and base velocities. The generalizability of the approach is demonstrated in simulated and real-world experiments, on robots with fixed or mobile bases, with moving or fixed objects, and multiple objects. The experiments revealed a trade-off between occlusion rates and other task metrics. While a planning-based baseline achieved lower occlusion rates than the proposed controller, it came at the expense of highly inefficient paths and a significant drop in the task success. On the other hand, the proposed controller is shown to improve visibility to the line target object(s) without sacrificing too much from the task success and efficiency. Videos and code can be found at: rhys-newbury.github.io/projects/vmc/.
△ Less
Submitted 25 February, 2022;
originally announced February 2022.
-
Reinforcement Learning for Shared Autonomy Drone Landings
Authors:
Kal Backman,
Dana Kulić,
Hoam Chung
Abstract:
Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs), due to the complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface and additional disturbances from the ground effect. Therefore we propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe la…
▽ More
Novice pilots find it difficult to operate and land unmanned aerial vehicles (UAVs), due to the complex UAV dynamics, challenges in depth perception, lack of expertise with the control interface and additional disturbances from the ground effect. Therefore we propose a shared autonomy approach to assist pilots in safely landing a UAV under conditions where depth perception is difficult and safe landing zones are limited. Our approach comprises of two modules: a perception module that encodes information onto a compressed latent representation using two RGB-D cameras and a policy module that is trained with the reinforcement learning algorithm TD3 to discern the pilot's intent and to provide control inputs that augment the user's input to safely land the UAV. The policy module is trained in simulation using a population of simulated users. Simulated users are sampled from a parametric model with four parameters, which model a pilot's tendency to conform to the assistant, proficiency, aggressiveness and speed. We conduct a user study (n = 28) where human participants were tasked with landing a physical UAV on one of several platforms under challenging viewing conditions. The assistant, trained with only simulated user data, improved task success rate from 51.4% to 98.2% despite being unaware of the human participants' goal or the structure of the environment a priori. With the proposed assistant, regardless of prior piloting experience, participants performed with a proficiency greater than the most experienced unassisted participants.
△ Less
Submitted 21 October, 2023; v1 submitted 6 February, 2022;
originally announced February 2022.
-
Identifying Functions and Behaviours of Social Robots during Learning Activities: Teachers' Perspective
Authors:
Jessy Ceha,
Edith Law,
Dana Kulić,
Pierre-Yves Oudeyer,
Didier Roy
Abstract:
With advances in artificial intelligence, research is increasingly exploring the potential functions that social robots can play in education. As teachers are a critical stakeholder in the use and application of educational technologies, we conducted a study to understand teachers' perspectives on how a social robot could support a variety of learning activities in the classroom. Through interview…
▽ More
With advances in artificial intelligence, research is increasingly exploring the potential functions that social robots can play in education. As teachers are a critical stakeholder in the use and application of educational technologies, we conducted a study to understand teachers' perspectives on how a social robot could support a variety of learning activities in the classroom. Through interviews, robot puppeteering, and group brainstorming sessions with five elementary and middle school teachers from a local school in Canada, we take a socio-technical perspective to conceptualize possible robot functions and behaviours, and the effects they may have on the current way learning activities are designed, planned, and executed. Overall, the teachers responded positively to the idea of introducing a social robot as a technological tool for learning activities, envisioning differences in usage for teacher-robot and student-robot interactions. Further, Engeström's Activity System Model -- a framework for analyzing human needs, tasks, and outcomes -- illustrated a number of tensions associated with learning activities in the classroom. We discuss the fine-grained robot functions and behaviours conceived by teachers, and how they address the current tensions -- providing suggestions for improving the design of social robots for learning activities.
△ Less
Submitted 30 October, 2021;
originally announced November 2021.
-
A Proposed Set of Communicative Gestures for Human Robot Interaction and an RGB Image-based Gesture Recognizer Implemented in ROS
Authors:
Jia Chuan A. Tan,
Wesley P. Chan,
Nicole L. Robinson,
Elizabeth A. Croft,
Dana Kulic
Abstract:
We propose a set of communicative gestures and develop a gesture recognition system with the aim of facilitating more intuitive Human-Robot Interaction (HRI) through gestures. First, we propose a set of commands commonly used for human-robot interaction. Next, an online user study with 190 participants was performed to investigate if there was an agreed set of gestures that people intuitively use…
▽ More
We propose a set of communicative gestures and develop a gesture recognition system with the aim of facilitating more intuitive Human-Robot Interaction (HRI) through gestures. First, we propose a set of commands commonly used for human-robot interaction. Next, an online user study with 190 participants was performed to investigate if there was an agreed set of gestures that people intuitively use to communicate the given commands to robots when no guidance or training were given. As we found large variations among the gestures exist between participants, we then proposed a set of gestures for the proposed commands to be used as a common foundation for robot interaction. We collected ~7500 video demonstrations of the proposed gestures and trained a gesture recognition model, adapting 3D Convolutional Neural Networks (CNN) as the classifier, with a final accuracy of 84.1% (sigma=2.4). The resulting model was capable of training successfully with a relatively small amount of training data. We integrated the gesture recognition model into the ROS framework and report details for a demonstrated use case, where a person commands a robot to perform a pick and place task using the proposed set. This integrated ROS gesture recognition system is made available for use, and built with the intention to allow for new adaptations depending on robot model and use case scenarios, for novel user applications.
△ Less
Submitted 20 September, 2021;
originally announced September 2021.
-
Memory-based Deep Reinforcement Learning for POMDPs
Authors:
Lingheng Meng,
Rob Gorbet,
Dana Kulić
Abstract:
A promising characteristic of Deep Reinforcement Learning (DRL) is its capability to learn optimal policy in an end-to-end manner without relying on feature engineering. However, most approaches assume a fully observable state space, i.e. fully observable Markov Decision Processes (MDPs). In real-world robotics, this assumption is unpractical, because of issues such as sensor sensitivity limitatio…
▽ More
A promising characteristic of Deep Reinforcement Learning (DRL) is its capability to learn optimal policy in an end-to-end manner without relying on feature engineering. However, most approaches assume a fully observable state space, i.e. fully observable Markov Decision Processes (MDPs). In real-world robotics, this assumption is unpractical, because of issues such as sensor sensitivity limitations and sensor noise, and the lack of knowledge about whether the observation design is complete or not. These scenarios lead to Partially Observable MDPs (POMDPs). In this paper, we propose Long-Short-Term-Memory-based Twin Delayed Deep Deterministic Policy Gradient (LSTM-TD3) by introducing a memory component to TD3, and compare its performance with other DRL algorithms in both MDPs and POMDPs. Our results demonstrate the significant advantages of the memory component in addressing POMDPs, including the ability to handle missing and noisy observation data.
△ Less
Submitted 13 September, 2021; v1 submitted 24 February, 2021;
originally announced February 2021.
-
Aligning Robot's Behaviours and Users' Perceptions Through Participatory Prototyping
Authors:
Pamela Carreno-Medrano,
Leimin Tian,
Aimee Allen,
Shanti Sumartojo,
Michael Mintrom,
Enrique Coronado,
Gentiane Venture,
Elizabeth Croft,
Dana Kulic
Abstract:
Robots are increasingly being deployed in public spaces. However, the general population rarely has the opportunity to nominate what they would prefer or expect a robot to do in these contexts. Since most people have little or no experience interacting with a robot, it is not surprising that robots deployed in the real world may fail to gain acceptance or engage their intended users. To address th…
▽ More
Robots are increasingly being deployed in public spaces. However, the general population rarely has the opportunity to nominate what they would prefer or expect a robot to do in these contexts. Since most people have little or no experience interacting with a robot, it is not surprising that robots deployed in the real world may fail to gain acceptance or engage their intended users. To address this issue, we examine users' understanding of robots in public spaces and their expectations of appropriate uses of robots in these spaces. Furthermore, we investigate how these perceptions and expectations change as users engage and interact with a robot. To support this goal, we conducted a participatory design workshop in which participants were actively involved in the prototyping and testing of a robot's behaviours in simulation and on the physical robot. Our work highlights how social and interaction contexts influence users' perception of robots in public spaces and how users' design and understanding of what are appropriate robot behaviors shifts as they observe the enactment of their designs.
△ Less
Submitted 10 January, 2021;
originally announced January 2021.
-
Human-in-the-loop Auditory Cueing Strategy for Gait Modification
Authors:
Tina LY Wu,
Anna Murphy,
Chao Chen,
Dana Kulic
Abstract:
External feedback in the form of visual, auditory and tactile cues has been used to assist patients to overcome mobility challenges. However, these cues can become less effective over time. There is limited research on adapting cues to account for inter and intra-personal variations in cue responsiveness. We propose a cue-provision framework that consists of a gait performance monitoring algorithm…
▽ More
External feedback in the form of visual, auditory and tactile cues has been used to assist patients to overcome mobility challenges. However, these cues can become less effective over time. There is limited research on adapting cues to account for inter and intra-personal variations in cue responsiveness. We propose a cue-provision framework that consists of a gait performance monitoring algorithm and an adaptive cueing strategy to improve gait performance. The proposed approach learns a model of the person's response to cues using Gaussian Process regression. The model is then used within an on-line optimization algorithm to generate cues to improve gait performance. We conduct a study with healthy participants to evaluate the ability of the adaptive cueing strategy to influence human gait, and compare its effectiveness to two other cueing approaches: the standard fixed cue approach and a proportional cue approach. The results show that adaptive cueing is more effective in changing the person's gait state once the response model is learned compared to the other methods.
△ Less
Submitted 1 March, 2021; v1 submitted 26 November, 2020;
originally announced November 2020.
-
Learning to Assist Drone Landings
Authors:
Kal Backman,
Dana Kulić,
Hoam Chung
Abstract:
Unmanned aerial vehicles (UAVs) are often used for navigating dangerous terrains, however they are difficult to pilot. Due to complex input-output mapping schemes, limited perception, the complex system dynamics and the need to maintain a safe operation distance, novice pilots experience difficulties in performing safe landings in obstacle filled environments. In this work we propose a shared auto…
▽ More
Unmanned aerial vehicles (UAVs) are often used for navigating dangerous terrains, however they are difficult to pilot. Due to complex input-output mapping schemes, limited perception, the complex system dynamics and the need to maintain a safe operation distance, novice pilots experience difficulties in performing safe landings in obstacle filled environments. In this work we propose a shared autonomy approach that assists novice pilots to perform safe landings on one of several elevated platforms at a proficiency equal to or greater than experienced pilots. Our approach consists of two modules, a perceptual module and a policy module. The perceptual module compresses high dimensionality RGB-D images into a latent vector trained with a cross-modal variational auto-encoder. The policy module provides assistive control inputs trained with the reinforcement algorithm TD3. We conduct a user study (n=33) where participants land a simulated drone with and without the use of the assistant. Despite the goal platform not being known to the assistant, participants of all skill levels were able to outperform experienced participants while assisted in the task.
△ Less
Submitted 28 February, 2021; v1 submitted 26 November, 2020;
originally announced November 2020.
-
Joint Estimation of Expertise and Reward Preferences From Human Demonstrations
Authors:
Pamela Carreno-Medrano,
Stephen L. Smith,
Dana Kulic
Abstract:
When a robot learns from human examples, most approaches assume that the human partner provides examples of optimal behavior. However, there are applications in which the robot learns from non-expert humans. We argue that the robot should learn not only about the human's objectives, but also about their expertise level. The robot could then leverage this joint information to reduce or increase the…
▽ More
When a robot learns from human examples, most approaches assume that the human partner provides examples of optimal behavior. However, there are applications in which the robot learns from non-expert humans. We argue that the robot should learn not only about the human's objectives, but also about their expertise level. The robot could then leverage this joint information to reduce or increase the frequency at which it provides assistance to its human's partner or be more cautious when learning new skills from novice users. Similarly, by taking into account the human's expertise, the robot would also be able of inferring a human's true objectives even when the human's fails to properly demonstrate these objectives due to a lack of expertise. In this paper, we propose to jointly infer the expertise level and objective function of a human given observations of their (possibly) non-optimal demonstrations. Two inference approaches are proposed. In the first approach, inference is done over a finite, discrete set of possible objective functions and expertise levels. In the second approach, the robot optimizes over the space of all possible hypotheses and finds the objective function and expertise level that best explain the observed human behavior. We demonstrate our proposed approaches both in simulation and with real user data.
△ Less
Submitted 8 November, 2020;
originally announced November 2020.
-
Decentralized Multi-Agent Pursuit using Deep Reinforcement Learning
Authors:
Cristino de Souza Jr,
Rhys Newbury,
Akansel Cosgun,
Pedro Castillo,
Boris Vidolov,
Dana Kulic
Abstract:
Pursuit-evasion is the problem of capturing mobile targets with one or more pursuers. We use deep reinforcement learning for pursuing an omni-directional target with multiple, homogeneous agents that are subject to unicycle kinematic constraints. We use shared experience to train a policy for a given number of pursuers that is executed independently by each agent at run-time. The training benefits…
▽ More
Pursuit-evasion is the problem of capturing mobile targets with one or more pursuers. We use deep reinforcement learning for pursuing an omni-directional target with multiple, homogeneous agents that are subject to unicycle kinematic constraints. We use shared experience to train a policy for a given number of pursuers that is executed independently by each agent at run-time. The training benefits from curriculum learning, a sweeping-angle ordering to locally represent neighboring agents and encouraging good formations with reward structure that combines individual and group rewards. Simulated experiments with a reactive evader and up to eight pursuers show that our learning-based approach, with non-holonomic agents, performs on par with classical algorithms with omni-directional agents, and outperforms their non-holonomic adaptations. The learned policy is successfully transferred to the real world in a proof-of-concept demonstration with three motion-constrained pursuer drones.
△ Less
Submitted 9 August, 2021; v1 submitted 16 October, 2020;
originally announced October 2020.
-
Strawberry Detection using Mixed Training on Simulated and Real Data
Authors:
Sunny Goondram,
Akansel Cosgun,
Dana Kulic
Abstract:
This paper demonstrates how simulated images can be useful for object detection tasks in the agricultural sector, where labeled data can be scarce and costly to collect. We consider training on mixed datasets with real and simulated data for strawberry detection in real images. Our results show that using the real dataset augmented by the simulated dataset resulted in slightly higher accuracy.
This paper demonstrates how simulated images can be useful for object detection tasks in the agricultural sector, where labeled data can be scarce and costly to collect. We consider training on mixed datasets with real and simulated data for strawberry detection in real images. Our results show that using the real dataset augmented by the simulated dataset resulted in slightly higher accuracy.
△ Less
Submitted 24 August, 2020;
originally announced August 2020.
-
Fast Approximate Multi-output Gaussian Processes
Authors:
Vladimir Joukov,
Dana Kulić
Abstract:
Gaussian processes regression models are an appealing machine learning method as they learn expressive non-linear models from exemplar data with minimal parameter tuning and estimate both the mean and covariance of unseen points. However, exponential computational complexity growth with the number of training samples has been a long standing challenge. During training, one has to compute and inver…
▽ More
Gaussian processes regression models are an appealing machine learning method as they learn expressive non-linear models from exemplar data with minimal parameter tuning and estimate both the mean and covariance of unseen points. However, exponential computational complexity growth with the number of training samples has been a long standing challenge. During training, one has to compute and invert an $N \times N$ kernel matrix at every iteration. Regression requires computation of an $m \times N$ kernel where $N$ and $m$ are the number of training and test points respectively. In this work we show how approximating the covariance kernel using eigenvalues and functions leads to an approximate Gaussian process with significant reduction in training and regression complexity. Training with the proposed approach requires computing only a $N \times n$ eigenfunction matrix and a $n \times n$ inverse where $n$ is a selected number of eigenvalues. Furthermore, regression now only requires an $m \times n$ matrix. Finally, in a special case the hyperparameter optimization is completely independent form the number of training samples. The proposed method can regress over multiple outputs, estimate the derivative of the regressor of any order, and learn the correlations between them. The computational complexity reduction, regression capabilities, and multioutput correlation learning are demonstrated in simulation examples.
△ Less
Submitted 22 August, 2020;
originally announced August 2020.
-
Learning from Sparse Demonstrations
Authors:
Wanxin Jin,
Todd D. Murphey,
Dana Kulić,
Neta Ezer,
Shaoshuai Mou
Abstract:
This paper develops the method of Continuous Pontryagin Differentiable Programming (Continuous PDP), which enables a robot to learn an objective function from a few sparsely demonstrated keyframes. The keyframes, labeled with some time stamps, are the desired task-space outputs, which a robot is expected to follow sequentially. The time stamps of the keyframes can be different from the time of the…
▽ More
This paper develops the method of Continuous Pontryagin Differentiable Programming (Continuous PDP), which enables a robot to learn an objective function from a few sparsely demonstrated keyframes. The keyframes, labeled with some time stamps, are the desired task-space outputs, which a robot is expected to follow sequentially. The time stamps of the keyframes can be different from the time of the robot's actual execution. The method jointly finds an objective function and a time-warping function such that the robot's resulting trajectory sequentially follows the keyframes with minimal discrepancy loss. The Continuous PDP minimizes the discrepancy loss using projected gradient descent, by efficiently solving the gradient of the robot trajectory with respect to the unknown parameters. The method is first evaluated on a simulated robot arm and then applied to a 6-DoF quadrotor to learn an objective function for motion planning in unmodeled environments. The results show the efficiency of the method, its ability to handle time misalignment between keyframes and robot execution, and the generalization of objective learning into unseen motion conditions.
△ Less
Submitted 8 August, 2022; v1 submitted 5 August, 2020;
originally announced August 2020.
-
Object Handovers: a Review for Robotics
Authors:
Valerio Ortenzi,
Akansel Cosgun,
Tommaso Pardi,
Wesley Chan,
Elizabeth Croft,
Dana Kulic
Abstract:
This article surveys the literature on human-robot object handovers. A handover is a collaborative joint action where an agent, the giver, gives an object to another agent, the receiver. The physical exchange starts when the receiver first contacts the object held by the giver and ends when the giver fully releases the object to the receiver. However, important cognitive and physical processes beg…
▽ More
This article surveys the literature on human-robot object handovers. A handover is a collaborative joint action where an agent, the giver, gives an object to another agent, the receiver. The physical exchange starts when the receiver first contacts the object held by the giver and ends when the giver fully releases the object to the receiver. However, important cognitive and physical processes begin before the physical exchange, including initiating implicit agreement with respect to the location and timing of the exchange. From this perspective, we structure our review into the two main phases delimited by the aforementioned events: 1) a pre-handover phase, and 2) the physical exchange. We focus our analysis on the two actors (giver and receiver) and report the state of the art of robotic givers (robot-to-human handovers) and the robotic receivers (human-to-robot handovers). We report a comprehensive list of qualitative and quantitative metrics commonly used to assess the interaction. While focusing our review on the cognitive level (e.g., prediction, perception, motion planning, learning) and the physical level (e.g., motion, grasping, grip release) of the handover, we briefly discuss also the concepts of safety, social context, and ergonomics. We compare the behaviours displayed during human-to-human handovers to the state of the art of robotic assistants, and identify the major areas of improvement for robotic assistants to reach performance comparable to human interactions. Finally, we propose a minimal set of metrics that should be used in order to enable a fair comparison among the approaches.
△ Less
Submitted 7 January, 2022; v1 submitted 25 July, 2020;
originally announced July 2020.
-
The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning
Authors:
Lingheng Meng,
Rob Gorbet,
Dana Kulić
Abstract:
Multi-step (also called n-step) methods in reinforcement learning (RL) have been shown to be more efficient than the 1-step method due to faster propagation of the reward signal, both theoretically and empirically, in tasks exploiting tabular representation of the value-function. Recently, research in Deep Reinforcement Learning (DRL) also shows that multi-step methods improve learning speed and f…
▽ More
Multi-step (also called n-step) methods in reinforcement learning (RL) have been shown to be more efficient than the 1-step method due to faster propagation of the reward signal, both theoretically and empirically, in tasks exploiting tabular representation of the value-function. Recently, research in Deep Reinforcement Learning (DRL) also shows that multi-step methods improve learning speed and final performance in applications where the value-function and policy are represented with deep neural networks. However, there is a lack of understanding about what is actually contributing to the boost of performance. In this work, we analyze the effect of multi-step methods on alleviating the overestimation problem in DRL, where multi-step experiences are sampled from a replay buffer. Specifically building on top of Deep Deterministic Policy Gradient (DDPG), we propose Multi-step DDPG (MDDPG), where different step sizes are manually set, and its variant called Mixed Multi-step DDPG (MMDDPG) where an average over different multi-step backups is used as update target of Q-value function. Empirically, we show that both MDDPG and MMDDPG are significantly less affected by the overestimation problem than DDPG with 1-step backup, which consequently results in better final performance and learning speed. We also discuss the advantages and disadvantages of different ways to do multi-step expansion in order to reduce approximation error, and expose the tradeoff between overestimation and underestimation that underlies offline multi-step methods. Finally, we compare the computational resource needs of Twin Delayed Deep Deterministic Policy Gradient (TD3), a state-of-art algorithm proposed to address overestimation in actor-critic methods, and our proposed methods, since they show comparable final performance and learning speed.
△ Less
Submitted 22 June, 2020;
originally announced June 2020.
-
Affective Movement Generation using Laban Effort and Shape and Hidden Markov Models
Authors:
Ali Samadani,
Rob Gorbet,
Dana Kulic
Abstract:
Body movements are an important communication medium through which affective states can be discerned. Movements that convey affect can also give machines life-like attributes and help to create a more engaging human-machine interaction. This paper presents an approach for automatic affective movement generation that makes use of two movement abstractions: 1) Laban movement analysis (LMA), and 2) h…
▽ More
Body movements are an important communication medium through which affective states can be discerned. Movements that convey affect can also give machines life-like attributes and help to create a more engaging human-machine interaction. This paper presents an approach for automatic affective movement generation that makes use of two movement abstractions: 1) Laban movement analysis (LMA), and 2) hidden Markov modeling. The LMA provides a systematic tool for an abstract representation of the kinematic and expressive characteristics of movements. Given a desired motion path on which a target emotion is to be overlaid, the proposed approach searches a labeled dataset in the LMA Effort and Shape space for similar movements to the desired motion path that convey the target emotion. An HMM abstraction of the identified movements is obtained and used with the desired motion path to generate a novel movement that is a modulated version of the desired motion path that conveys the target emotion. The extent of modulation can be varied, trading-off between kinematic and affective constraints in the generated movement. The proposed approach is tested using a full-body movement dataset. The efficacy of the proposed approach in generating movements with recognizable target emotions is assessed using a validated automatic recognition model and a user study. The target emotions were correctly recognized from the generated movements at a rate of 72% using the recognition model. Furthermore, participants in the user study were able to correctly perceive the target emotions from a sample of generated movements, although some cases of confusion were also observed.
△ Less
Submitted 10 June, 2020;
originally announced June 2020.
-
Active Preference Learning using Maximum Regret
Authors:
Nils Wilde,
Dana Kulic,
Stephen L. Smith
Abstract:
We study active preference learning as a framework for intuitively specifying the behaviour of autonomous robots. In active preference learning, a user chooses the preferred behaviour from a set of alternatives, from which the robot learns the user's preferences, modeled as a parameterized cost function. Previous approaches present users with alternatives that minimize the uncertainty over the par…
▽ More
We study active preference learning as a framework for intuitively specifying the behaviour of autonomous robots. In active preference learning, a user chooses the preferred behaviour from a set of alternatives, from which the robot learns the user's preferences, modeled as a parameterized cost function. Previous approaches present users with alternatives that minimize the uncertainty over the parameters of the cost function. However, different parameters might lead to the same optimal behaviour; as a consequence the solution space is more structured than the parameter space. We exploit this by proposing a query selection that greedily reduces the maximum error ratio over the solution space. In simulations we demonstrate that the proposed approach outperforms other state of the art techniques in both learning efficiency and ease of queries for the user. Finally, we show that evaluating the learning based on the similarities of solutions instead of the similarities of weights allows for better predictions for different scenarios.
△ Less
Submitted 28 September, 2020; v1 submitted 8 May, 2020;
originally announced May 2020.
-
Supportive Actions for Manipulation in Human-Robot Coworker Teams
Authors:
Shray Bansal,
Rhys Newbury,
Wesley Chan,
Akansel Cosgun,
Aimee Allen,
Dana Kulić,
Tom Drummond,
Charles Isbell
Abstract:
The increasing presence of robots alongside humans, such as in human-robot teams in manufacturing, gives rise to research questions about the kind of behaviors people prefer in their robot counterparts. We term actions that support interaction by reducing future interference with others as supportive robot actions and investigate their utility in a co-located manipulation scenario. We compare two…
▽ More
The increasing presence of robots alongside humans, such as in human-robot teams in manufacturing, gives rise to research questions about the kind of behaviors people prefer in their robot counterparts. We term actions that support interaction by reducing future interference with others as supportive robot actions and investigate their utility in a co-located manipulation scenario. We compare two robot modes in a shared table pick-and-place task: (1) Task-oriented: the robot only takes actions to further its own task objective and (2) Supportive: the robot sometimes prefers supportive actions to task-oriented ones when they reduce future goal-conflicts. Our experiments in simulation, using a simplified human model, reveal that supportive actions reduce the interference between agents, especially in more difficult tasks, but also cause the robot to take longer to complete the task. We implemented these modes on a physical robot in a user study where a human and a robot perform object placement on a shared table. Our results show that a supportive robot was perceived as a more favorable coworker by the human and also reduced interference with the human in the more difficult of two scenarios. However, it also took longer to complete the task highlighting an interesting trade-off between task-efficiency and human-preference that needs to be considered before designing robot behavior for close-proximity manipulation scenarios.
△ Less
Submitted 2 May, 2020;
originally announced May 2020.
-
Improving User Specifications for Robot Behavior through Active Preference Learning: Framework and Evaluation
Authors:
Nils Wilde,
Alexandru Blidaru,
Stephen L. Smith,
Dana Kulić
Abstract:
An important challenge in human-robot interaction (HRI) is enabling non-expert users to specify complex tasks for autonomous robots. Recently, active preference learning has been applied in HRI to interactively shape a robot's behavior. We study a framework where users specify constraints on allowable robot movements on a graphical interface, yielding a robot task specification. However, users may…
▽ More
An important challenge in human-robot interaction (HRI) is enabling non-expert users to specify complex tasks for autonomous robots. Recently, active preference learning has been applied in HRI to interactively shape a robot's behavior. We study a framework where users specify constraints on allowable robot movements on a graphical interface, yielding a robot task specification. However, users may not be able to accurately assess the impact of such constraints on the performance of a robot. Thus, we revise the specification by iteratively presenting users with alternative solutions where some constraints might be violated, and learn about the importance of the constraints from the users' choices between these alternatives. We demonstrate our framework in a user study with a material transport task in an industrial facility. We show that nearly all users accept alternative solutions and thus obtain a revised specification through the learning process, and that the revision leads to a substantial improvement in robot performance. Further, the learning process reduces the variances between the specifications from different users and, thus, makes the specifications more similar. As a result, the users whose initial specifications had the largest impact on performance benefit the most from the interactive learning.
△ Less
Submitted 18 March, 2020; v1 submitted 24 July, 2019;
originally announced July 2019.
-
Learning to Engage with Interactive Systems: A Field Study on Deep Reinforcement Learning in a Public Museum
Authors:
Lingheng Meng,
Daiwei Lin,
Adam Francey,
Rob Gorbet,
Philip Beesley,
Dana Kulić
Abstract:
Physical agents that can autonomously generate engaging, life-like behaviour will lead to more responsive and interesting robots and other autonomous systems. Although many advances have been made for one-to-one interactions in well controlled settings, future physical agents should be capable of interacting with humans in natural settings, including group interaction. In order to generate engagin…
▽ More
Physical agents that can autonomously generate engaging, life-like behaviour will lead to more responsive and interesting robots and other autonomous systems. Although many advances have been made for one-to-one interactions in well controlled settings, future physical agents should be capable of interacting with humans in natural settings, including group interaction. In order to generate engaging behaviours, the autonomous system must first be able to estimate its human partners' engagement level. In this paper, we propose an approach for estimating engagement during group interaction by simultaneously taking into account active and passive interaction, i.e. occupancy, and use the measure as the reward signal within a reinforcement learning framework to learn engaging interactive behaviours. The proposed approach is implemented in an interactive sculptural system in a museum setting. We compare the learning system to a baseline using pre-scripted interactive behaviours. Analysis based on sensory data and survey data shows that adaptable behaviours within an expert-designed action space can achieve higher engagement and likeability.
△ Less
Submitted 24 June, 2020; v1 submitted 14 April, 2019;
originally announced April 2019.
-
Bayesian Active Learning for Collaborative Task Specification Using Equivalence Regions
Authors:
Nils Wilde,
Dana Kulic,
Stephen L. Smith
Abstract:
Specifying complex task behaviours while ensuring good robot performance may be difficult for untrained users. We study a framework for users to specify rules for acceptable behaviour in a shared environment such as industrial facilities. As non-expert users might have little intuition about how their specification impacts the robot's performance, we design a learning system that interacts with th…
▽ More
Specifying complex task behaviours while ensuring good robot performance may be difficult for untrained users. We study a framework for users to specify rules for acceptable behaviour in a shared environment such as industrial facilities. As non-expert users might have little intuition about how their specification impacts the robot's performance, we design a learning system that interacts with the user to find an optimal solution. Using active preference learning, we iteratively show alternative paths that the robot could take on an interface. From the user feedback ranking the alternatives, we learn about the weights that users place on each part of their specification. We extend the user model from our previous work to a discrete Bayesian learning model and introduce a greedy algorithm for proposing alternative that operates on the notion of equivalence regions of user weights. We prove that with this algorithm the revision active learning process converges on the user-optimal path. In simulations on realistic industrial environments, we demonstrate the convergence and robustness of our approach.
△ Less
Submitted 27 January, 2019;
originally announced January 2019.
-
A Multi-layer Gaussian Process for Motor Symptom Estimation in People with Parkinson's Disease
Authors:
Muriel Lang,
Franz M. J. Pfister,
Jakob Fröhner,
Kian Abedinpour,
Daniel Pichler,
Urban Fietzek,
Terry T. Um,
Dana Kulić,
Satoshi Endo,
Sandra Hirche
Abstract:
The assessment of Parkinson's disease (PD) poses a significant challenge as it is influenced by various factors which lead to a complex and fluctuating symptom manifestation. Thus, a frequent and objective PD assessment is highly valuable for effective health management of people with Parkinson's disease (PwP). Here, we propose a method for monitoring PwP by stochastically modeling the relationshi…
▽ More
The assessment of Parkinson's disease (PD) poses a significant challenge as it is influenced by various factors which lead to a complex and fluctuating symptom manifestation. Thus, a frequent and objective PD assessment is highly valuable for effective health management of people with Parkinson's disease (PwP). Here, we propose a method for monitoring PwP by stochastically modeling the relationships between their wrist movements during unscripted daily activities and corresponding annotations about clinical displays of movement abnormalities. We approach the estimation of PD motor signs by independently modeling and hierarchically stacking Gaussian process models for three classes of commonly observed movement abnormalities in PwP including tremor, (non-tremulous) bradykinesia, and (non-tremulous) dyskinesia. We use clinically adopted severity measures as annotations for training the models, thus allowing our multi-layer Gaussian process prediction models to estimate not only their presence but also their severities. The experimental validation of our approach demonstrates strong agreement of the model predictions with these PD annotations. Our results show the proposed method produces promising results in objective monitoring of movement abnormalities of PD in the presence of arbitrary and unknown voluntary motions, and makes an important step towards continuous monitoring of PD in the home environment.
△ Less
Submitted 27 September, 2018; v1 submitted 31 August, 2018;
originally announced August 2018.
-
Parkinson's Disease Assessment from a Wrist-Worn Wearable Sensor in Free-Living Conditions: Deep Ensemble Learning and Visualization
Authors:
Terry Taewoong Um,
Franz Michael Josef Pfister,
Daniel Christian Pichler,
Satoshi Endo,
Muriel Lang,
Sandra Hirche,
Urban Fietzek,
Dana Kulić
Abstract:
Parkinson's Disease (PD) is characterized by disorders in motor function such as freezing of gait, rest tremor, rigidity, and slowed and hyposcaled movements. Medication with dopaminergic medication may alleviate those motor symptoms, however, side-effects may include uncontrolled movements, known as dyskinesia. In this paper, an automatic PD motor-state assessment in free-living conditions is pro…
▽ More
Parkinson's Disease (PD) is characterized by disorders in motor function such as freezing of gait, rest tremor, rigidity, and slowed and hyposcaled movements. Medication with dopaminergic medication may alleviate those motor symptoms, however, side-effects may include uncontrolled movements, known as dyskinesia. In this paper, an automatic PD motor-state assessment in free-living conditions is proposed using an accelerometer in a wrist-worn wearable sensor. In particular, an ensemble of convolutional neural networks (CNNs) is applied to capture the large variability of daily-living activities and overcome the dissimilarity between training and test patients due to the inter-patient variability. In addition, class activation map (CAM), a visualization technique for CNNs, is applied for providing an interpretation of the results.
△ Less
Submitted 8 August, 2018;
originally announced August 2018.
-
Stable Gaussian Process based Tracking Control of Euler-Lagrange Systems
Authors:
Thomas Beckers,
Dana Kulić,
Sandra Hirche
Abstract:
Perfect tracking control for real-world Euler-Lagrange systems is challenging due to uncertainties in the system model and external disturbances. The magnitude of the tracking error can be reduced either by increasing the feedback gains or improving the model of the system. The latter is clearly preferable as it allows to maintain good tracking performance at low feedback gains. However, accurate…
▽ More
Perfect tracking control for real-world Euler-Lagrange systems is challenging due to uncertainties in the system model and external disturbances. The magnitude of the tracking error can be reduced either by increasing the feedback gains or improving the model of the system. The latter is clearly preferable as it allows to maintain good tracking performance at low feedback gains. However, accurate models are often difficult to obtain. In this article, we address the problem of stable high-performance tracking control for unknown Euler-Lagrange systems. In particular, we employ Gaussian Process regression to obtain a data-driven model that is used for the feed-forward compensation of unknown dynamics of the system. The model fidelity is used to adapt the feedback gains allowing low feedback gains in state space regions of high model confidence. The proposed control law guarantees a globally bounded tracking error with a specific probability. Simulation studies demonstrate the superiority over state of the art tracking control approaches.
△ Less
Submitted 12 February, 2019; v1 submitted 19 June, 2018;
originally announced June 2018.
-
Inverse Optimal Control from Incomplete Trajectory Observations
Authors:
Wanxin Jin,
Dana Kulić,
Shaoshuai Mou,
Sandra Hirche
Abstract:
This article develops a methodology that enables learning an objective function of an optimal control system from incomplete trajectory observations. The objective function is assumed to be a weighted sum of features (or basis functions) with unknown weights, and the observed data is a segment of a trajectory of system states and inputs. The proposed technique introduces the concept of the recover…
▽ More
This article develops a methodology that enables learning an objective function of an optimal control system from incomplete trajectory observations. The objective function is assumed to be a weighted sum of features (or basis functions) with unknown weights, and the observed data is a segment of a trajectory of system states and inputs. The proposed technique introduces the concept of the recovery matrix to establish the relationship between any available segment of the trajectory and the weights of given candidate features. The rank of the recovery matrix indicates whether a subset of relevant features can be found among the candidate features and the corresponding weights can be learned from the segment data. The recovery matrix can be obtained iteratively and its rank non-decreasing property shows that additional observations may contribute to the objective learning. Based on the recovery matrix, a method for using incomplete trajectory observations to learn the weights of selected features is established, and an incremental inverse optimal control algorithm is developed by automatically finding the minimal required observation. The effectiveness of the proposed method is demonstrated on a linear quadratic regulator system and a simulated robot manipulator.
△ Less
Submitted 21 January, 2021; v1 submitted 20 March, 2018;
originally announced March 2018.
-
Data Augmentation of Wearable Sensor Data for Parkinson's Disease Monitoring using Convolutional Neural Networks
Authors:
Terry Taewoong Um,
Franz Michael Josef Pfister,
Daniel Pichler,
Satoshi Endo,
Muriel Lang,
Sandra Hirche,
Urban Fietzek,
Dana Kulić
Abstract:
While convolutional neural networks (CNNs) have been successfully applied to many challenging classification applications, they typically require large datasets for training. When the availability of labeled data is limited, data augmentation is a critical preprocessing step for CNNs. However, data augmentation for wearable sensor data has not been deeply investigated yet.
In this paper, various…
▽ More
While convolutional neural networks (CNNs) have been successfully applied to many challenging classification applications, they typically require large datasets for training. When the availability of labeled data is limited, data augmentation is a critical preprocessing step for CNNs. However, data augmentation for wearable sensor data has not been deeply investigated yet.
In this paper, various data augmentation methods for wearable sensor data are proposed. The proposed methods and CNNs are applied to the classification of the motor state of Parkinson's Disease patients, which is challenging due to small dataset size, noisy labels, and large intra-class variability. Appropriate augmentation improves the classification performance from 77.54\% to 86.88\%.
△ Less
Submitted 8 November, 2017; v1 submitted 1 June, 2017;
originally announced June 2017.
-
Exercise Motion Classification from Large-Scale Wearable Sensor Data Using Convolutional Neural Networks
Authors:
Terry Taewoong Um,
Vahid Babakeshizadeh,
Dana Kulić
Abstract:
The ability to accurately identify human activities is essential for developing automatic rehabilitation and sports training systems. In this paper, large-scale exercise motion data obtained from a forearm-worn wearable sensor are classified with a convolutional neural network (CNN). Time-series data consisting of accelerometer and orientation measurements are formatted as images, allowing the CNN…
▽ More
The ability to accurately identify human activities is essential for developing automatic rehabilitation and sports training systems. In this paper, large-scale exercise motion data obtained from a forearm-worn wearable sensor are classified with a convolutional neural network (CNN). Time-series data consisting of accelerometer and orientation measurements are formatted as images, allowing the CNN to automatically extract discriminative features. A comparative study on the effects of image formatting and different CNN architectures is also presented. The best performing configuration classifies 50 gym exercises with 92.1% accuracy.
△ Less
Submitted 22 July, 2017; v1 submitted 22 October, 2016;
originally announced October 2016.
-
Spline Path Following for Redundant Mechanical Systems
Authors:
Rajan Gill,
Dana Kulić,
Christopher Nielsen
Abstract:
Path following controllers make the output of a control system approach and traverse a pre-specified path with no apriori time parametrization. In this paper we present a method for path following control design applicable to framed curves generated by splines in the workspace of kinematically redundant mechanical systems. The class of admissible paths includes self-intersecting curves. Kinematic…
▽ More
Path following controllers make the output of a control system approach and traverse a pre-specified path with no apriori time parametrization. In this paper we present a method for path following control design applicable to framed curves generated by splines in the workspace of kinematically redundant mechanical systems. The class of admissible paths includes self-intersecting curves. Kinematic redundancies are resolved by designing controllers that solve a suitably defined constrained quadratic optimization problem. By employing partial feedback linearization, the proposed path following controllers have a clear physical meaning. The approach is experimentally verified on a 4-degree-of-freedom (4-DOF) manipulator with a combination of revolute and linear actuated links and significant model uncertainty.
△ Less
Submitted 26 April, 2015;
originally announced April 2015.
-
Detecting Falls with X-Factor Hidden Markov Models
Authors:
Shehroz S. Khan,
Michelle E. Karg,
Dana Kulic,
Jesse Hoey
Abstract:
Identification of falls while performing normal activities of daily living (ADL) is important to ensure personal safety and well-being. However, falling is a short term activity that occurs infrequently. This poses a challenge to traditional classification algorithms, because there may be very little training data for falls (or none at all). This paper proposes an approach for the identification o…
▽ More
Identification of falls while performing normal activities of daily living (ADL) is important to ensure personal safety and well-being. However, falling is a short term activity that occurs infrequently. This poses a challenge to traditional classification algorithms, because there may be very little training data for falls (or none at all). This paper proposes an approach for the identification of falls using a wearable device in the absence of training data for falls but with plentiful data for normal ADL. We propose three `X-Factor' Hidden Markov Model (XHMMs) approaches. The XHMMs model unseen falls using "inflated" output covariances (observation models). To estimate the inflated covariances, we propose a novel cross validation method to remove "outliers" from the normal ADL that serve as proxies for the unseen falls and allow learning the XHMMs using only normal activities. We tested the proposed XHMM approaches on two activity recognition datasets and show high detection rates for falls in the absence of fall-specific training data. We show that the traditional method of choosing a threshold based on maximum of negative of log-likelihood to identify unseen falls is ill-posed for this problem. We also show that supervised classification methods perform poorly when very limited fall data are available during the training phase.
△ Less
Submitted 20 January, 2017; v1 submitted 8 April, 2015;
originally announced April 2015.