-
MARLIN: A Cloud Integrated Robotic Solution to Support Intralogistics in Retail
Authors:
Dennis Mronga,
Andreas Bresser,
Fabian Maas,
Adrian Danzglock,
Simon Stelter,
Alina Hawkin,
Hoang Giang Nguyen,
Michael Beetz,
Frank Kirchner
Abstract:
In this paper, we present the service robot MARLIN and its integration with the K4R platform, a cloud system for complex AI applications in retail. At its core, this platform contains so-called semantic digital twins, a semantically annotated representation of the retail store. MARLIN continuously exchanges data with the K4R platform, improving the robot's capabilities in perception, autonomous navigation, and task planning. We exploit these capabilities in a retail intralogistics scenario, specifically by assisting store employees in stocking shelves. We demonstrate that MARLIN is able to update the digital representation of the retail store by detecting and classifying obstacles, autonomously planning and executing replenishment missions, adapting to unforeseen changes in the environment, and interacting with store employees. Experiments are conducted in simulation, in a laboratory environment, and in a real store. We also describe and evaluate a novel algorithm for autonomous navigation of articulated tractor-trailer systems. The algorithm outperforms the manufacturer's proprietary navigation approach and improves MARLIN's navigation capabilities in confined spaces.
Submitted 2 July, 2024;
originally announced July 2024.
-
Human-AI Interaction in Industrial Robotics: Design and Empirical Evaluation of a User Interface for Explainable AI-Based Robot Program Optimization
Authors:
Benjamin Alt,
Johannes Zahn,
Claudius Kienle,
Julia Dvorak,
Marvin May,
Darko Katic,
Rainer Jäkel,
Tobias Kopp,
Michael Beetz,
Gisela Lanza
Abstract:
While recent advances in deep learning have demonstrated its transformative potential, its adoption for real-world manufacturing applications remains limited. We present an Explanation User Interface (XUI) for a state-of-the-art deep learning-based robot program optimizer which provides both naive and expert users with different user experiences depending on their skill level, as well as Explainable AI (XAI) features to facilitate the application of deep learning methods in real-world applications. To evaluate the impact of the XUI on task performance, user satisfaction and cognitive load, we present the results of a preliminary user survey and propose a study design for a large-scale follow-up study.
Submitted 30 April, 2024;
originally announced April 2024.
-
BANSAI: Towards Bridging the AI Adoption Gap in Industrial Robotics with Neurosymbolic Programming
Authors:
Benjamin Alt,
Julia Dvorak,
Darko Katic,
Rainer Jäkel,
Michael Beetz,
Gisela Lanza
Abstract:
Over the past decade, deep learning has helped solve manipulation problems across all domains of robotics. At the same time, industrial robots continue to be programmed overwhelmingly using traditional program representations and interfaces. This paper undertakes an analysis of this "AI adoption gap" from an industry practitioner's perspective. In response, we propose the BANSAI approach (Bridging the AI Adoption Gap via Neurosymbolic AI). It systematically leverages principles of neurosymbolic AI to establish data-driven, subsymbolic program synthesis and optimization in modern industrial robot programming workflows. BANSAI conceptually unites several lines of prior research and proposes a path toward practical, real-world validation.
Submitted 21 April, 2024;
originally announced April 2024.
-
Cloud-based Digital Twin for Cognitive Robotics
Authors:
Arthur Niedźwiecki,
Sascha Jongebloed,
Yanxiang Zhan,
Michaela Kümpel,
Jörn Syrbe,
Michael Beetz
Abstract:
The paper presents a novel cloud-based digital twin learning platform for teaching and training concepts of cognitive robotics. Instead of forcing interested learners or students to install a new operating system and bulky, fragile software onto their personal laptops just to solve tutorials or coding assignments of a single lecture on robotics, it would be beneficial to avoid technical setups and dive directly into the content of cognitive robotics. To achieve this, the authors utilize containerization technologies and Kubernetes to deploy and operate containerized applications, including robotics simulation environments and software collections based on the Robot Operating System (ROS). The web-based Integrated Development Environment JupyterLab is integrated with RvizWeb and XPRA to provide real-time visualization of sensor data and robot behavior in a user-friendly environment for interacting with robotics software. The paper also discusses the application of the platform in teaching Knowledge Representation, Reasoning, Acquisition and Retrieval, and Task-Executives. The authors conclude that the proposed platform is a valuable tool for education and research in cognitive robotics, and that it has the potential to democratize access to these fields. The platform has already been successfully employed in various academic courses, demonstrating its effectiveness in fostering knowledge and skill development.
Submitted 19 April, 2024;
originally announced April 2024.
-
Large Language Model-informed ECG Dual Attention Network for Heart Failure Risk Prediction
Authors:
Chen Chen,
Lei Li,
Marcel Beetz,
Abhirup Banerjee,
Ramneek Gupta,
Vicente Grau
Abstract:
Heart failure (HF) poses a significant public health challenge, with a rising global mortality rate. Early detection and prevention of HF could significantly reduce its impact. We introduce a novel methodology for predicting HF risk using 12-lead electrocardiograms (ECGs). We present a novel, lightweight dual-attention ECG network designed to capture complex ECG features essential for early HF risk prediction, despite the notable imbalance between low and high-risk groups. This network incorporates a cross-lead attention module and twelve lead-specific temporal attention modules, focusing on cross-lead interactions and each lead's local dynamics. To further alleviate model overfitting, we leverage a large language model (LLM) with a public ECG-Report dataset for pretraining on an ECG-report alignment task. The network is then fine-tuned for HF risk prediction using two specific cohorts from the UK Biobank study, focusing on patients with hypertension (UKB-HYP) and those who have had a myocardial infarction (UKB-MI). The results reveal that LLM-informed pre-training substantially enhances HF risk prediction in these cohorts. The dual-attention design improves not only interpretability but also predictive accuracy, outperforming existing competitive methods with C-index scores of 0.6349 for UKB-HYP and 0.5805 for UKB-MI. This demonstrates our method's potential in advancing HF risk assessment with complex clinical ECG data.
Submitted 22 March, 2024; v1 submitted 15 March, 2024;
originally announced March 2024.
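For readers unfamiliar with the C-index reported in the abstract above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' evaluation code; the function name, the tie handling (tied risk scores count as half-concordant), and the toy data are assumptions made for illustration. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  -- observed follow-up time per subject
    events -- 1 if the event was observed, 0 if the subject was censored
    risks  -- predicted risk score per subject (higher = earlier event expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's observed time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # assumed convention for tied scores
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: four subjects, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.3, 0.5, 0.1]
print(concordance_index(toy_times, toy_events, toy_risks))
```

The quadratic loop is chosen for readability; it is adequate for small cohorts but would be slow on large ones.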
-
RoboGrind: Intuitive and Interactive Surface Treatment with Industrial Robots
Authors:
Benjamin Alt,
Florian Stöckl,
Silvan Müller,
Christopher Braun,
Julian Raible,
Saad Alhasan,
Oliver Rettig,
Lukas Ringle,
Darko Katic,
Rainer Jäkel,
Michael Beetz,
Marcus Strand,
Marco F. Huber
Abstract:
Surface treatment tasks such as grinding, sanding or polishing are a vital step of the value chain in many industries, but are notoriously challenging to automate. We present RoboGrind, an integrated system for the intuitive, interactive automation of surface treatment tasks with industrial robots. It combines a sophisticated 3D perception pipeline for surface scanning and automatic defect identification, an interactive voice-controlled wizard system for the AI-assisted bootstrapping and parameterization of robot programs, and an automatic planning and execution pipeline for force-controlled robotic surface treatment. RoboGrind is evaluated both under laboratory and real-world conditions in the context of refabricating fiberglass wind turbine blades.
Submitted 27 February, 2024; v1 submitted 26 February, 2024;
originally announced February 2024.
-
Anatomical basis of human sex differences in ECG identified by automated torso-cardiac three-dimensional reconstruction
Authors:
Hannah J. Smith,
Blanca Rodriguez,
Yuling Sang,
Marcel Beetz,
Robin Choudhury,
Vicente Grau,
Abhirup Banerjee
Abstract:
Background and Aims: The electrocardiogram (ECG) is routinely used for diagnosis and risk stratification following myocardial infarction (MI), though its interpretation is confounded by anatomical variability and sex differences. Women have a higher incidence of missed MI diagnosis and poorer outcomes following infarction. Sex differences in ECG biomarkers and torso-ventricular anatomy have not been well characterised, largely due to the absence of high-throughput torso reconstruction methods.
Methods: This work presents quantification of sex differences in ECG versus anatomical biomarkers in healthy and post-MI subjects, enabled by a novel, end-to-end automated pipeline for torso-ventricular anatomical reconstruction from clinically standard cardiac magnetic resonance imaging. Personalised 3D torso-ventricular reconstructions were generated for 425 post-MI subjects and 1051 healthy controls from the UK Biobank. Regression models were created relating the extracted torso-ventricular and ECG parameters.
Results: Half the sex difference in QRS durations is explained by smaller ventricles in women both in healthy ($3.4 \pm 1.3$ms of $6.0 \pm 1.5$ms) and post-MI ($4.5 \pm 1.4$ms of $8.3 \pm 2.5$ms) subjects. Lower baseline STj amplitude in women is also associated with smaller ventricles, and more superior and posterior cardiac position. Post-MI T wave amplitude and R axis deviations are more strongly associated with a more posterior and horizontal cardiac position in women rather than electrophysiology as in men.
Conclusion: A novel computational pipeline enables the three-dimensional reconstruction of 1476 torso-cardiac geometries of healthy and post-myocardial infarction subjects, quantification of sex and BMI-related differences and association with ECG biomarkers. Any ECG-based tool should be reviewed considering anatomical sex differences to avoid sex-biased outcomes.
Submitted 17 July, 2024; v1 submitted 21 December, 2023;
originally announced December 2023.
-
Translating Universal Scene Descriptions into Knowledge Graphs for Robotic Environment
Authors:
Giang Hoang Nguyen,
Daniel Bessler,
Simon Stelter,
Mihai Pomarlan,
Michael Beetz
Abstract:
Robots performing human-scale manipulation tasks require an extensive amount of knowledge about their surroundings in order to perform their actions competently and human-like. In this work, we investigate the use of virtual reality technology as an implementation for robot environment modeling, and present a technique for translating scene graphs into knowledge bases. To this end, we take advantage of the Universal Scene Description (USD) format which is an emerging standard for the authoring, visualization and simulation of complex environments. We investigate the conversion of USD-based environment models into Knowledge Graph (KG) representations that facilitate semantic querying and integration with additional knowledge sources.
Submitted 27 October, 2023; v1 submitted 25 October, 2023;
originally announced October 2023.
-
Integrating Transformations in Probabilistic Circuits
Authors:
Tom Schierenbeck,
Vladimir Vutov,
Thorsten Dickhaus,
Michael Beetz
Abstract:
This study addresses the predictive limitation of probabilistic circuits and introduces transformations as a remedy to overcome it. We demonstrate this limitation in robotic scenarios. We argue that independent component analysis is a sound tool for preserving the independence properties of probabilistic circuits. Our approach is an extension of joint probability trees, which are model-free deterministic circuits. We demonstrate that the proposed approach achieves higher likelihoods while using fewer parameters than joint probability trees on seven benchmark data sets as well as on real robot data. Furthermore, we discuss how to integrate transformations into tree-based learning routines. Finally, we argue that exact inference with transformed quantile parameterized distributions is not tractable. However, our approach allows for efficient sampling and approximate inference.
Submitted 6 October, 2023;
originally announced October 2023.
-
Towards a Neuronally Consistent Ontology for Robotic Agents
Authors:
Florian Ahrens,
Mihai Pomarlan,
Daniel Beßler,
Thorsten Fehr,
Michael Beetz,
Manfred Herrmann
Abstract:
The Collaborative Research Center for Everyday Activity Science & Engineering (CRC EASE) aims to enable robots to perform environmental interaction tasks with close to human capacity. It therefore employs a shared ontology to model the activity of both kinds of agents, empowering robots to learn from human experiences. To properly describe these human experiences, the ontology will strongly benefit from incorporating characteristics of neuronal information processing which are not accessible from a behavioral perspective alone. We, therefore, propose the analysis of human neuroimaging data for evaluation and validation of concepts and events defined in the ontology model underlying most of the CRC projects. In an exploratory analysis, we employed an Independent Component Analysis (ICA) on functional Magnetic Resonance Imaging (fMRI) data from participants who were presented with the same complex video stimuli of activities as robotic and human agents in different environments and contexts. We then correlated the activity patterns of brain networks represented by derived components with timings of annotated event categories as defined by the ontology model. The present results demonstrate a subset of common networks with stable correlations and specificity towards particular event classes and groups, associated with environmental and contextual factors. These neuronal characteristics will open up avenues for adapting the ontology model to be more consistent with human information processing.
Submitted 26 September, 2023;
originally announced September 2023.
-
Multi-objective point cloud autoencoders for explainable myocardial infarction prediction
Authors:
Marcel Beetz,
Abhirup Banerjee,
Vicente Grau
Abstract:
Myocardial infarction (MI) is one of the most common causes of death in the world. Image-based biomarkers commonly used in the clinic, such as ejection fraction, fail to capture more complex patterns in the heart's 3D anatomy and thus limit diagnostic accuracy. In this work, we present the multi-objective point cloud autoencoder as a novel geometric deep learning approach for explainable infarction prediction, based on multi-class 3D point cloud representations of cardiac anatomy and function. Its architecture consists of multiple task-specific branches connected by a low-dimensional latent space to allow for effective multi-objective learning of both reconstruction and MI prediction, while capturing pathology-specific 3D shape information in an interpretable latent space. Furthermore, its hierarchical branch design with point cloud-based deep learning operations enables efficient multi-scale feature learning directly on high-resolution anatomy point clouds. In our experiments on a large UK Biobank dataset, the multi-objective point cloud autoencoder is able to accurately reconstruct multi-temporal 3D shapes with Chamfer distances between predicted and input anatomies below the underlying images' pixel resolution. Our method outperforms multiple machine learning and deep learning benchmarks for the task of incident MI prediction by 19% in terms of Area Under the Receiver Operating Characteristic curve. In addition, its task-specific compact latent space exhibits easily separable control and MI clusters with clinically plausible associations between subject encodings and corresponding 3D shapes, thus demonstrating the explainability of the prediction.
Submitted 20 July, 2023;
originally announced July 2023.
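The Chamfer distance used in the abstract above to compare predicted and input anatomies can be sketched as follows. This is one common symmetric variant (the mean nearest-neighbour Euclidean distance in each direction, summed); the listing does not state which variant the authors use, so the function name, the choice of variant, and the toy point sets are assumptions for illustration only.

```python
import math


def chamfer_distance(points_a, points_b):
    """Symmetric Chamfer distance between two non-empty 3D point sets.

    Assumed variant: average nearest-neighbour Euclidean distance from
    A to B plus the same average from B to A.
    """
    def one_way(src, dst):
        # For every source point, find the closest destination point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(points_a, points_b) + one_way(points_b, points_a)


# Hypothetical toy point clouds with two points each.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)]
print(chamfer_distance(predicted, reference))
```

The brute-force nearest-neighbour search is quadratic in the number of points; for high-resolution point clouds a spatial index such as a k-d tree would normally replace the inner loop.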
-
Modeling 3D cardiac contraction and relaxation with point cloud deformation networks
Authors:
Marcel Beetz,
Abhirup Banerjee,
Vicente Grau
Abstract:
Global single-valued biomarkers of cardiac function typically used in clinical practice, such as ejection fraction, provide limited insight into the true 3D cardiac deformation process and hence limit the understanding of both healthy and pathological cardiac mechanics. In this work, we propose the Point Cloud Deformation Network (PCD-Net) as a novel geometric deep learning approach to model 3D cardiac contraction and relaxation between the extreme ends of the cardiac cycle. It integrates recent advances in point cloud-based deep learning into an encoder-decoder structure in order to enable efficient multi-scale feature learning directly on multi-class 3D point cloud representations of the cardiac anatomy. We evaluate our approach on a large dataset of over 10,000 cases from the UK Biobank study and find average Chamfer distances between the predicted and ground truth anatomies below the pixel resolution of the underlying image acquisition. Furthermore, we observe similar clinical metrics between predicted and ground truth populations and show that the PCD-Net can successfully capture subpopulation-specific differences between normal subjects and myocardial infarction (MI) patients. We then demonstrate that the learned 3D deformation patterns outperform multiple clinical benchmarks by 13% and 7% in terms of area under the receiver operating characteristic curve for the tasks of prevalent MI detection and incident MI prediction, respectively, and by 7% in terms of Harrell's concordance index for MI survival analysis.
Submitted 20 July, 2023;
originally announced July 2023.
-
Multi-class point cloud completion networks for 3D cardiac anatomy reconstruction from cine magnetic resonance images
Authors:
Marcel Beetz,
Abhirup Banerjee,
Julius Ossenberg-Engels,
Vicente Grau
Abstract:
Cine magnetic resonance imaging (MRI) is the current gold standard for the assessment of cardiac anatomy and function. However, it typically only acquires a set of two-dimensional (2D) slices of the underlying three-dimensional (3D) anatomy of the heart, thus limiting the understanding and analysis of both healthy and pathological cardiac morphology and physiology. In this paper, we propose a novel fully automatic surface reconstruction pipeline capable of reconstructing multi-class 3D cardiac anatomy meshes from raw cine MRI acquisitions. Its key component is a multi-class point cloud completion network (PCCN) capable of correcting both the sparsity and misalignment issues of the 3D reconstruction task in a unified model. We first evaluate the PCCN on a large synthetic dataset of biventricular anatomies and observe Chamfer distances between reconstructed and gold standard anatomies below or similar to the underlying image resolution for multiple levels of slice misalignment. Furthermore, we find a reduction in reconstruction error compared to a benchmark 3D U-Net by 32% and 24% in terms of Hausdorff distance and mean surface distance, respectively. We then apply the PCCN as part of our automated reconstruction pipeline to 1000 subjects from the UK Biobank study in a cross-domain transfer setting and demonstrate its ability to reconstruct accurate and topologically plausible biventricular heart meshes with clinical metrics comparable to the previous literature. Finally, we investigate the robustness of our proposed approach and observe its capacity to successfully handle multiple common outlier conditions.
Submitted 18 July, 2023; v1 submitted 17 July, 2023;
originally announced July 2023.
-
3D Shape-Based Myocardial Infarction Prediction Using Point Cloud Classification Networks
Authors:
Marcel Beetz,
Yilong Yang,
Abhirup Banerjee,
Lei Li,
Vicente Grau
Abstract:
Myocardial infarction (MI) is one of the most prevalent cardiovascular diseases with associated clinical decision-making typically based on single-valued imaging biomarkers. However, such metrics only approximate the complex 3D structure and physiology of the heart and hence hinder a better understanding and prediction of MI outcomes. In this work, we investigate the utility of complete 3D cardiac shapes in the form of point clouds for an improved detection of MI events. To this end, we propose a fully automatic multi-step pipeline consisting of a 3D cardiac surface reconstruction step followed by a point cloud classification network. Our method utilizes recent advances in geometric deep learning on point clouds to enable direct and efficient multi-scale learning on high-resolution surface models of the cardiac anatomy. We evaluate our approach on 1068 UK Biobank subjects for the tasks of prevalent MI detection and incident MI prediction and find improvements of ~13% and ~5% respectively over clinical benchmarks. Furthermore, we analyze the role of each ventricle and cardiac phase for 3D shape-based MI detection and conduct a visual analysis of the morphological and physiological patterns typically associated with MI outcomes.
Submitted 14 July, 2023;
originally announced July 2023.
-
Towards Enabling Cardiac Digital Twins of Myocardial Infarction Using Deep Computational Models for Inverse Inference
Authors:
Lei Li,
Julia Camps,
Zhinuo Wang,
Abhirup Banerjee,
Marcel Beetz,
Blanca Rodriguez,
Vicente Grau
Abstract:
Cardiac digital twins (CDTs) have the potential to offer individualized evaluation of cardiac function in a non-invasive manner, making them a promising approach for personalized diagnosis and treatment planning of myocardial infarction (MI). The inference of accurate myocardial tissue properties is crucial in creating a reliable CDT of MI. In this work, we investigate the feasibility of inferring myocardial tissue properties from the electrocardiogram (ECG) within a CDT platform. The platform integrates multi-modal data, such as cardiac MRI and ECG, to enhance the accuracy and reliability of the inferred tissue properties. We perform a sensitivity analysis based on computer simulations, systematically exploring the effects of infarct location, size, degree of transmurality, and electrical activity alteration on the simulated QRS complex of ECG, to establish the limits of the approach. We subsequently present a novel deep computational model, comprising a dual-branch variational autoencoder and an inference model, to infer infarct location and distribution from the simulated QRS. The proposed model achieves mean Dice scores of $0.457 \pm 0.317$ and $0.302 \pm 0.273$ for the inference of left ventricle scars and border zone, respectively. The sensitivity analysis enhances our understanding of the complex relationship between infarct characteristics and electrophysiological features. The in silico experimental results show that the model can effectively capture the relationship for the inverse inference, with promising potential for clinical application in the future. The code will be released publicly once the manuscript is accepted for publication.
Submitted 14 February, 2024; v1 submitted 10 July, 2023;
originally announced July 2023.
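The Dice scores quoted in the abstract above measure the overlap between a predicted and a reference region. The snippet below is a minimal sketch on flattened binary masks, not the authors' implementation; the function name, the convention for two empty masks, and the toy masks are assumptions for illustration.

```python
def dice_score(pred_mask, truth_mask):
    """Dice coefficient of two equal-length binary masks (flattened)."""
    if len(pred_mask) != len(truth_mask):
        raise ValueError("masks must have the same length")
    # Count positions labelled positive in both masks.
    intersection = sum(1 for p, t in zip(pred_mask, truth_mask) if p and t)
    total = sum(1 for p in pred_mask if p) + sum(1 for t in truth_mask if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks: three positives each, two of them shared.
predicted_scar = [0, 1, 1, 1, 0, 0]
reference_scar = [0, 0, 1, 1, 1, 0]
print(dice_score(predicted_scar, reference_scar))
```

A score of 1.0 means the two regions coincide exactly and 0.0 means they do not overlap at all.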
-
Knowledge-Driven Robot Program Synthesis from Human VR Demonstrations
Authors:
Benjamin Alt,
Franklin Kenghagho Kenfack,
Andrei Haidu,
Darko Katic,
Rainer Jäkel,
Michael Beetz
Abstract:
Aging societies, labor shortages and increasing wage costs call for assistance robots capable of autonomously performing a wide array of real-world tasks. Such open-ended robotic manipulation requires not only powerful knowledge representations and reasoning (KR&R) algorithms, but also methods for humans to instruct robots what tasks to perform and how to perform them. In this paper, we present a system for automatically generating executable robot control programs from human task demonstrations in virtual reality (VR). We leverage common-sense knowledge and game engine-based physics to semantically interpret human VR demonstrations, as well as an expressive and general task representation and automatic path planning and code generation, embedded into a state-of-the-art cognitive architecture. We demonstrate our approach in the context of force-sensitive fetch-and-place for a robotic shopping assistant. The source code is available at https://github.com/ease-crc/vr-program-synthesis.
Submitted 3 July, 2023; v1 submitted 5 June, 2023;
originally announced June 2023.
-
From Interactive to Co-Constructive Task Learning
Authors:
Anna-Lisa Vollmer,
Daniel Leidner,
Michael Beetz,
Britta Wrede
Abstract:
Humans have developed the capability to teach relevant aspects of new or adapted tasks to a social peer with very few task demonstrations by making use of scaffolding strategies that leverage prior knowledge and, importantly, prior joint experience to yield a joint understanding and a joint execution of the required steps to solve the task. This process has been discovered and analyzed in parent-infant interaction and constitutes a "co-construction", as it allows both the teacher and the learner to jointly contribute to the task. We propose to focus research in robot interactive learning on this co-construction process to enable robots to learn from non-expert users in everyday situations. In the following, we review current proposals for interactive task learning and discuss their main contributions with respect to the interaction they entail. We then discuss our notion of co-construction and summarize research insights from adult-child and human-robot interactions to elucidate its nature in more detail. From this overview, we finally derive research desiderata that span the dimensions of architecture, representation, interaction, and explainability.
Submitted 24 May, 2023;
originally announced May 2023.
-
The CRAM Cognitive Architecture for Robot Manipulation in Everyday Activities
Authors:
Michael Beetz,
Gayane Kazhoyan,
David Vernon
Abstract:
This paper presents a hybrid robot cognitive architecture, CRAM, that enables robot agents to accomplish everyday manipulation tasks. It addresses five key challenges that arise when carrying out everyday activities. These include (i) the underdetermined nature of task specification, (ii) the generation of context-specific behavior, (iii) the ability to make decisions based on knowledge, experience, and prediction, (iv) the ability to reason at the levels of motions and sensor data, and (v) the ability to explain actions and the consequences of these actions. We explore the computational foundations of the CRAM cognitive model: the self-programmability entailed by physical symbol systems, the CRAM plan language, generalized action plans and implicit-to-explicit manipulation, generative models, digital twin knowledge representation & reasoning, and narrative-enabled episodic memories. We describe the structure of the cognitive architecture and explain the process by which CRAM transforms generalized action plans into parameterized motion plans. It does this using knowledge and reasoning to identify the parameter values that maximize the likelihood of successfully accomplishing the action. We demonstrate the ability of a CRAM-controlled robot to carry out everyday activities in a kitchen environment. Finally, we consider future extensions that focus on achieving greater flexibility through transformational learning and metacognition.
Submitted 27 April, 2023;
originally announced April 2023.
-
Joint Probability Trees
Authors:
Daniel Nyga,
Mareike Picklum,
Tom Schierenbeck,
Michael Beetz
Abstract:
We introduce Joint Probability Trees (JPT), a novel approach that makes learning of and reasoning about joint probability distributions tractable for practical applications. JPTs support both symbolic and subsymbolic variables in a single hybrid model, and they do not rely on prior knowledge about variable dependencies or families of distributions. JPT representations build on tree structures that partition the problem space into relevant subregions that are elicited from the training data instead of postulating a rigid dependency model prior to learning. Learning and reasoning scale linearly in JPTs, and the tree structure allows white-box reasoning about any posterior probability $P(Q|E)$, such that interpretable explanations can be provided for any inference result. Our experiments showcase the practical applicability of JPTs in high-dimensional heterogeneous probability spaces with millions of training samples, making it a promising alternative to classic probabilistic graphical models.
Submitted 14 February, 2023;
originally announced February 2023.
-
Biomedical image analysis competitions: The state of current participation practice
Authors:
Matthias Eisenmann,
Annika Reinke,
Vivienn Weru,
Minu Dietlinde Tizabi,
Fabian Isensee,
Tim J. Adler,
Patrick Godau,
Veronika Cheplygina,
Michal Kozubek,
Sharib Ali,
Anubha Gupta,
Jan Kybic,
Alison Noble,
Carlos Ortiz de Solórzano,
Samiksha Pachade,
Caroline Petitjean,
Daniel Sage,
Donglai Wei,
Elizabeth Wilden,
Deepak Alapatt,
Vincent Andrearczyk,
Ujjwal Baid,
Spyridon Bakas,
Niranjan Balu,
Sophia Bano
, et al. (331 additional authors not shown)
Abstract:
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants and only 50% of the participants performed ensembling based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Submitted 12 September, 2023; v1 submitted 16 December, 2022;
originally announced December 2022.
-
Deep Computational Model for the Inference of Ventricular Activation Properties
Authors:
Lei Li,
Julia Camps,
Abhirup Banerjee,
Marcel Beetz,
Blanca Rodriguez,
Vicente Grau
Abstract:
Patient-specific cardiac computational models are essential for the efficient realization of precision medicine and in-silico clinical trials using digital twins. Cardiac digital twins can provide non-invasive characterizations of cardiac functions for individual patients, and therefore are promising for the patient-specific diagnosis and therapy stratification. However, current workflows for both the anatomical and functional twinning phases, referring to the inference of model anatomy and parameter from clinical data, are not sufficiently efficient, robust, and accurate. In this work, we propose a deep learning based patient-specific computational model, which can fuse both anatomical and electrophysiological information for the inference of ventricular activation properties, i.e., conduction velocities and root nodes. The activation properties can provide a quantitative assessment of cardiac electrophysiological function for the guidance of interventional procedures. We employ the Eikonal model to generate simulated electrocardiogram (ECG) with ground truth properties to train the inference model, where specific patient information has also been considered. For evaluation, we test the model on the simulated data and obtain generally promising results with fast computational time.
Submitted 8 August, 2022;
originally announced August 2022.
-
Heuristic-free Optimization of Force-Controlled Robot Search Strategies in Stochastic Environments
Authors:
Benjamin Alt,
Darko Katic,
Rainer Jäkel,
Michael Beetz
Abstract:
In both industrial and service domains, a central benefit of the use of robots is their ability to quickly and reliably execute repetitive tasks. However, even relatively simple peg-in-hole tasks are typically subject to stochastic variations, requiring search motions to find relevant features such as holes. While search improves robustness, it comes at the cost of increased runtime: More exhaustive search will maximize the probability of successfully executing a given task, but will significantly delay any downstream tasks. This trade-off is typically resolved by human experts according to simple heuristics, which are rarely optimal. This paper introduces an automatic, data-driven and heuristic-free approach to optimize robot search strategies. By training a neural model of the search strategy on a large set of simulated stochastic environments, conditioning it on few real-world examples and inverting the model, we can infer search strategies which adapt to the time-variant characteristics of the underlying probability distributions, while requiring very few real-world measurements. We evaluate our approach on two different industrial robots in the context of spiral and probe search for THT electronics assembly.
Submitted 15 July, 2022;
originally announced July 2022.
-
Empirical Estimates on Hand Manipulation are Recoverable: A Step Towards Individualized and Explainable Robotic Support in Everyday Activities
Authors:
Alexander Wich,
Holger Schultheis,
Michael Beetz
Abstract:
A key challenge for robotic systems is to figure out the behavior of another agent. The capability to draw correct inferences is crucial to derive human behavior from examples.
Processing correct inferences is especially challenging when (confounding) factors are not controlled experimentally (observational evidence). For this reason, robots that rely on inferences that are correlational risk a biased interpretation of the evidence.
We propose equipping robots with the necessary tools to conduct observational studies on people. Specifically, we propose and explore the feasibility of structural causal models with non-parametric estimators to derive empirical estimates on hand behavior in the context of object manipulation in a virtual kitchen scenario. In particular, we focus on inferences under (the weaker) conditions of partial confounding (the model covering only some factors) and confront estimators with hundreds of samples instead of the typical order of thousands. Studying these conditions explores the boundaries of the approach and its viability.
Despite the challenging conditions, the estimates inferred from the validation data are correct. Moreover, these estimates are stable against three refutation strategies where four estimators are in agreement. Furthermore, the causal quantity for two individuals reveals the sensibility of the approach to detect positive and negative effects.
The validity, stability and explainability of the approach are encouraging and serve as the foundation for further research.
Submitted 27 January, 2022;
originally announced January 2022.
-
Prolog as a Querying Language for MongoDB
Authors:
Daniel Beßler,
Sascha Jongebloed,
Michael Beetz
Abstract:
Today's database systems have been shown to be capable of supporting AI applications that demand a lot of data processing. To this end, these systems incorporate powerful querying languages that go far beyond the mere retrieval of data, and provide sophisticated facilities for data processing as well. In the case of SQL, the language has even been demonstrated to be Turing-complete in some implementations. In the area of NoSQL databases, a widely adopted one nowadays is the MongoDB database. Queries in MongoDB databases are represented as sequential stages within an aggregation pipeline where each stage defines a transformation of the input data, and passes the transformed data to the next stage. But aggregation queries tend to become rather large for more complex computational problems, lack organization into re-usable pieces, and are thus hard to debug and maintain. We propose a new database querying language called Mongolog which is syntactically a subset of the Prolog language, and we define its operational semantics through translations into aggregation pipelines. To this end, we make use of and extend the formal framework of the MQuery language which characterizes the aggregation framework set-theoretically.
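The stage-by-stage model that the abstract describes can be illustrated without a database. The following is a minimal sketch in plain Python, not Mongolog and not the MongoDB driver: `match_stage`, `project_stage`, and `run_pipeline` are hypothetical helpers that only mimic the shape of `$match` and `$project` stages, and the sample documents are invented for illustration.

```python
from typing import Any, Callable, Dict, Iterable, List

Doc = Dict[str, Any]
Stage = Callable[[Iterable[Doc]], Iterable[Doc]]


def match_stage(field: str, value: Any) -> Stage:
    """Hypothetical stand-in for $match: keep documents where field == value."""
    def stage(docs: Iterable[Doc]) -> Iterable[Doc]:
        return (d for d in docs if d.get(field) == value)
    return stage


def project_stage(*fields: str) -> Stage:
    """Hypothetical stand-in for $project: keep only the named fields."""
    def stage(docs: Iterable[Doc]) -> Iterable[Doc]:
        return ({f: d[f] for f in fields if f in d} for d in docs)
    return stage


def run_pipeline(docs: Iterable[Doc], stages: List[Stage]) -> List[Doc]:
    """Feed the output of each stage into the next, in order."""
    for stage in stages:
        docs = stage(docs)
    return list(docs)


# Invented sample data (assumption, not taken from the paper).
docs = [
    {"name": "milk", "shelf": 3, "stock": 0},
    {"name": "tea", "shelf": 3, "stock": 12},
    {"name": "salt", "shelf": 5, "stock": 0},
]

empty_on_shelf_3 = run_pipeline(
    docs,
    [match_stage("stock", 0), match_stage("shelf", 3), project_stage("name")],
)
```

Even in this toy form, a query is an ordered list of anonymous transformations, which hints at why larger pipelines become hard to organize into reusable pieces.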
Submitted 4 October, 2021;
originally announced October 2021.
-
Robot Program Parameter Inference via Differentiable Shadow Program Inversion
Authors:
Benjamin Alt,
Darko Katic,
Rainer Jäkel,
Asil Kaan Bozcuoglu,
Michael Beetz
Abstract:
Challenging manipulation tasks can be solved effectively by combining individual robot skills, which must be parameterized for the concrete physical environment and task at hand. This is time-consuming and difficult for human programmers, particularly for force-controlled skills. To this end, we present Shadow Program Inversion (SPI), a novel approach to infer optimal skill parameters directly from data. SPI leverages unsupervised learning to train an auxiliary differentiable program representation ("shadow program") and realizes parameter inference via gradient-based model inversion. Our method enables the use of efficient first-order optimizers to infer optimal parameters for originally non-differentiable skills, including many skill variants currently used in production. SPI zero-shot generalizes across task objectives, meaning that shadow programs do not need to be retrained to infer parameters for different task variants. We evaluate our methods on three different robots and skill frameworks in industrial and household scenarios. Code and examples are available at https://innolab.artiminds.com/icra2021.
Submitted 14 July, 2022; v1 submitted 26 March, 2021;
originally announced March 2021.
-
Kineverse: A Symbolic Articulation Model Framework for Model-Agnostic Mobile Manipulation
Authors:
Adrian Röfer,
Georg Bartels,
Wolfram Burgard,
Abhinav Valada,
Michael Beetz
Abstract:
Service robots in the future need to execute abstract instructions such as "fetch the milk from the fridge". To translate such instructions into actionable plans, robots require in-depth background knowledge. With regards to interactions with doors and drawers, robots require articulation models that they can use for state estimation and motion planning. Existing frameworks model articulated connections as abstract concepts such as prismatic, or revolute, but do not provide a parameterized model of these connections for computation. In this paper, we introduce a novel framework that uses symbolic mathematical expressions to model articulated structures -- robots and objects alike -- in a unified and extensible manner. We provide a theoretical description of this framework, and the operations that are supported by its models, and introduce an architecture to exchange our models in robotic applications, making them as flexible as any other environmental observation. To demonstrate the utility of our approach, we employ our practical implementation Kineverse for solving common robotics tasks from state estimation and mobile manipulation, and use it further in real-world mobile robot manipulation.
Submitted 16 February, 2022; v1 submitted 9 December, 2020;
originally announced December 2020.
-
URoboSim -- An Episodic Simulation Framework for Prospective Reasoning in Robotic Agents
Authors:
Michael Neumann,
Sebastian Koralewski,
Michael Beetz
Abstract:
Anticipating what might happen as a result of an action is an essential ability humans have in order to perform tasks effectively. On the other hand, robots' capabilities in this regard are quite lacking. While machine learning is used to increase the ability of prospection, it is still limited for novel situations. A possibility to improve the prospection ability of robots is through simulation of imagined motions and the physical results of these actions. Therefore, we present URoboSim, a robot simulator that allows robots to perform tasks as mental simulation before performing these tasks in reality. We show the capabilities of URoboSim in the form of mental simulations, generating data for machine learning, and use as a belief state for a real robot.
Submitted 8 December, 2020;
originally announced December 2020.
-
Automated acquisition of structured, semantic models of manipulation activities from human VR demonstration
Authors:
Andrei Haidu,
Michael Beetz
Abstract:
In this paper we present a system capable of collecting and annotating human-performed, robot-understandable everyday activities from virtual environments. The human movements are mapped in the simulated world using off-the-shelf virtual reality devices with full-body and eye-tracking capabilities. All the interactions in the virtual world are physically simulated, thus movements and their effects are closely relatable to the real world. During the activity execution, a subsymbolic data logger is recording the environment and the human gaze on a per-frame basis, enabling offline scene reproduction and replays. Coupled with the physics engine, online monitors (symbolic data loggers) are parsing (using various grammars) and recording events, actions, and their effects in the simulated world.
Submitted 27 November, 2020;
originally announced November 2020.
-
Foundations of the Socio-physical Model of Activities (SOMA) for Autonomous Robotic Agents
Authors:
Daniel Beßler,
Robert Porzel,
Mihai Pomarlan,
Abhijit Vyas,
Sebastian Höffner,
Michael Beetz,
Rainer Malaka,
John Bateman
Abstract:
In this paper, we present foundations of the Socio-physical Model of Activities (SOMA). SOMA represents both the physical as well as the social context of everyday activities. Such tasks seem to be trivial for humans, however, they pose severe problems for artificial agents. For starters, a natural language command requesting something will leave many pieces of information necessary for performing the task unspecified. Humans can solve such problems fast as we reduce the search space by recourse to prior knowledge such as a connected collection of plans that describe how certain goals can be achieved at various levels of abstraction. Rather than enumerating fine-grained physical contexts SOMA sets out to include socially constructed knowledge about the functions of actions to achieve a variety of goals or the roles objects can play in a given situation. As the human cognition system is capable of generalizing experiences into abstract knowledge pieces applicable to novel situations, we argue that both physical and social context need be modeled to tackle these challenges in a general manner. This is represented by the link between the physical and social context in SOMA where relationships are established between occurrences and generalizations of them, which has been demonstrated in several use cases that validate SOMA.
Submitted 24 November, 2020;
originally announced November 2020.
-
Imagination-enabled Robot Perception
Authors:
Patrick Mania,
Franklin Kenghagho Kenfack,
Michael Neumann,
Michael Beetz
Abstract:
Many of today's robot perception systems aim at accomplishing perception tasks that are too simplistic and too hard. They are too simplistic because they do not require the perception systems to provide all the information needed to accomplish manipulation tasks. Typically the perception results do not include information about the part structure of objects, articulation mechanisms and other attributes needed for adapting manipulation behavior. On the other hand, the perception problems stated are also too hard because -- unlike humans -- the perception systems cannot leverage the expectations about what they will see to their full potential. Therefore, we investigate a variation of robot perception tasks suitable for robots accomplishing everyday manipulation tasks, such as household robots or a robot in a retail store. In such settings it is reasonable to assume that robots know most objects and have detailed models of them.
We propose a perception system that maintains its beliefs about its environment as a scene graph with physics simulation and visual rendering. When detecting objects, the perception system retrieves the model of the object and places it at the corresponding place in a VR-based environment model. The physics simulation ensures that object detections that are physically not possible are rejected and scenes can be rendered to generate expectations at the image level. The result is a perception system that can provide useful information for manipulation tasks.
Submitted 6 July, 2021; v1 submitted 23 November, 2020;
originally announced November 2020.
-
The Robot Household Marathon Experiment
Authors:
Gayane Kazhoyan,
Simon Stelter,
Franklin Kenghagho Kenfack,
Sebastian Koralewski,
Michael Beetz
Abstract:
In this paper, we present an experiment, designed to investigate and evaluate the scalability and the robustness aspects of mobile manipulation. The experiment involves performing variations of mobile pick and place actions and opening/closing environment containers in a human household. The robot is expected to act completely autonomously for extended periods of time. We discuss the scientific challenges raised by the experiment as well as present our robotic system that can address these challenges and successfully perform all the tasks of the experiment. We present empirical results and the lessons learned as well as discuss where we hit limitations.
Submitted 19 November, 2020;
originally announced November 2020.
-
Manipulation Planning and Control for Shelf Replenishment
Authors:
Marco Costanzo,
Simon Stelter,
Ciro Natale,
Salvatore Pirozzi,
Georg Bartels,
Alexis Maldonado,
Michael Beetz
Abstract:
Manipulation planning and control are relevant building blocks of a robotic system and their tight integration is a key factor to improve robot autonomy and allows robots to perform manipulation tasks of increasing complexity, such as those needed in the in-store logistics domain. Supermarkets contain a large variety of objects to be placed on the shelf layers with specific constraints; doing this with a robot is a challenge and requires high dexterity. However, an integration of reactive grasping control and motion planning can allow robots to perform such tasks even with grippers with limited dexterity. The main contribution of the paper is a novel method for planning manipulation tasks to be executed using a reactive control layer that provides more control modalities, i.e., slipping avoidance and controlled sliding. Experiments with a new force/tactile sensor equipping the gripper of a mobile manipulator show that the approach allows the robot to successfully perform manipulation tasks unfeasible with a standard fixed grasp.
Submitted 12 March, 2020; v1 submitted 23 December, 2019;
originally announced December 2019.
-
RoboSherlock: Cognition-enabled Robot Perception for Everyday Manipulation Tasks
Authors:
Ferenc Bálint-Benczédi,
Jan-Hendrik Worch,
Daniel Nyga,
Nico Blodow,
Patrick Mania,
Zoltán-Csaba Márton,
Michael Beetz
Abstract:
A pressing question when designing intelligent autonomous systems is how to integrate the various subsystems concerned with complementary tasks. More specifically, robotic vision must provide task-relevant information about the environment and the objects in it to various planning related modules. In most implementations of the traditional Perception-Cognition-Action paradigm these tasks are treated as quasi-independent modules that function as black boxes for each other. It is our view that perception can benefit tremendously from a tight collaboration with cognition. We present RoboSherlock, a knowledge-enabled cognitive perception system for mobile robots performing human-scale everyday manipulation tasks. In RoboSherlock, perception and interpretation of realistic scenes is formulated as an unstructured information management (UIM) problem. The application of the UIM principle supports the implementation of perception systems that can answer task-relevant queries about objects in a scene, boost object recognition performance by combining the strengths of multiple perception algorithms, support knowledge-enabled reasoning about objects and enable automatic and knowledge-driven generation of processing pipelines. We demonstrate the potential of the proposed framework through feasibility studies of systems for real-world scene perception that have been built on top of the framework.
Submitted 22 November, 2019;
originally announced November 2019.
-
Amortized Object and Scene Perception for Long-term Robot Manipulation
Authors:
Ferenc Balint-Benczedi,
Michael Beetz
Abstract:
Mobile robots, performing long-term manipulation activities in human environments, have to perceive a wide variety of objects possessing very different visual characteristics and need to reliably keep track of these throughout the execution of a task. In order to be efficient, robot perception capabilities need to go beyond what is currently perceivable and should be able to answer queries about both current and past scenes. In this paper we investigate a perception system for long-term robot manipulation that keeps track of the changing environment and builds a representation of the perceived world. Specifically we introduce an amortized component that spreads perception tasks throughout the execution cycle. The resulting query driven perception system asynchronously integrates results from logged images into a symbolic and numeric (what we call sub-symbolic) representation that forms the perceptual belief state of the robot.
Submitted 28 March, 2019;
originally announced March 2019.
-
The Liver Tumor Segmentation Benchmark (LiTS)
Authors:
Patrick Bilic,
Patrick Christ,
Hongwei Bran Li,
Eugene Vorontsov,
Avi Ben-Cohen,
Georgios Kaissis,
Adi Szeskin,
Colin Jacobs,
Gabriel Efrain Humpire Mamani,
Gabriel Chartrand,
Fabian Lohöfer,
Julian Walter Holch,
Wieland Sommer,
Felix Hofmann,
Alexandre Hostettler,
Naama Lev-Cohain,
Michal Drozdzal,
Michal Marianne Amitai,
Refael Vivantik,
Jacob Sosna,
Ivan Ezhov,
Anjany Sekuboyina,
Fernando Navarro,
Florian Kofler,
Johannes C. Paetzold,
et al. (84 additional authors not shown)
Abstract:
In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), which was organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2017 and the International Conferences on Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2017 and 2018. The image dataset is diverse and contains primary and secondary tumors with varied sizes and appearances and with various lesion-to-background levels (hyper-/hypo-dense), created in collaboration with seven hospitals and research institutions. Seventy-five submitted liver and liver tumor segmentation algorithms were trained on a set of 131 computed tomography (CT) volumes and were tested on 70 unseen test images acquired from different patients. We found that no single algorithm performed best for both liver and liver tumors in the three events. The best liver segmentation algorithm achieved a Dice score of 0.963, whereas, for tumor segmentation, the best algorithms achieved Dice scores of 0.674 (ISBI 2017), 0.702 (MICCAI 2017), and 0.739 (MICCAI 2018). Retrospectively, we performed additional analysis on liver tumor detection and revealed that not all top-performing segmentation algorithms worked well for tumor detection. The best liver tumor detection method achieved a lesion-wise recall of 0.458 (ISBI 2017), 0.515 (MICCAI 2017), and 0.554 (MICCAI 2018), indicating the need for further research. LiTS remains an active benchmark and resource for research, e.g., contributing the liver-related segmentation tasks in http://medicaldecathlon.com/. In addition, both data and online evaluation are accessible via www.lits-challenge.com.
Submitted 25 November, 2022; v1 submitted 13 January, 2019;
originally announced January 2019.
-
Towards Plan Transformations for Real-World Pick and Place Tasks
Authors:
Gayane Kazhoyan,
Arthur Niedzwiecki,
Michael Beetz
Abstract:
In this paper, we investigate the possibility of applying plan transformations to general manipulation plans in order to specialize them to the specific situation at hand. We present a framework for optimizing execution and achieving higher performance by autonomously transforming the robot's behavior at runtime. We show that plans employed by robotic agents in real-world environments can be transformed, despite their control structures being very complex due to the specifics of acting in the real world. The evaluation is carried out on a plan of a PR2 robot performing pick and place tasks, to which we apply three example transformations, as well as on a large number of experiments in a fast plan projection environment.
Submitted 19 December, 2018;
originally announced December 2018.
-
Specializing Underdetermined Action Descriptions Through Plan Projection
Authors:
Gayane Kazhoyan,
Michael Beetz
Abstract:
Plan execution on real robots in realistic environments is underdetermined and often leads to failures. The choice of action parameterization is crucial for task success. By thinking ahead of time with the fast plan projection mechanism proposed in this paper, a general plan can be specialized towards the environment and task at hand by choosing action parameterizations that are predicted to lead to successful execution. For finding causal relationships between action parameterizations and task success, we provide the robot with means for plan introspection and propose a systematic and hierarchical plan structure to support that. We evaluate our approach by showing how a PR2 robot, when equipped with the proposed system, is able to choose action parameterizations that increase task execution success rates and overall performance of fetch and deliver actions in a real-world setting.
Submitted 19 December, 2018;
originally announced December 2018.
-
Adapting Everyday Manipulation Skills to Varied Scenarios
Authors:
Pawel Gajewski,
Paulo Ferreira,
Georg Bartels,
Chaozheng Wang,
Frank Guerin,
Bipin Indurkhya,
Michael Beetz,
Bartlomiej Sniezynski
Abstract:
We address the problem of executing tool-using manipulation skills in scenarios where the objects to be used may vary. We assume that point clouds of the tool and target object can be obtained, but no interpretation or further knowledge about these objects is provided. The system must interpret the point clouds and decide how to use the tool to complete a manipulation task with a target object; this means it must adjust motion trajectories appropriately to complete the task. We tackle three everyday manipulations: scraping material from a tool into a container, cutting, and scooping from a container. Our solution encodes these manipulation skills in a generic way, with parameters that can be filled in at run-time via queries to a robot perception module; the perception module abstracts the functional parts for the tool and extracts key parameters that are needed for the task. The approach is evaluated in simulation and with selected examples on a PR2 robot.
Submitted 4 March, 2019; v1 submitted 7 March, 2018;
originally announced March 2018.
-
Knowledge-Enabled Robotic Agents for Shelf Replenishment in Cluttered Retail Environments
Authors:
Jan Winkler,
Ferenc Balint-Benczedi,
Thiemo Wiedemeyer,
Michael Beetz,
Narunas Vaskevicius,
Christian A. Mueller,
Tobias Fromm,
Andreas Birk
Abstract:
Autonomous robots in unstructured and dynamically changing retail environments have to master complex perception, knowledge processing, and manipulation tasks. To enable them to act competently, we propose a framework based on three core components: (i) a knowledge-enabled perception system, capable of combining diverse information sources to cope with occlusions and stacked objects with a variety of textures and shapes, (ii) knowledge processing methods that produce strategies for tidying up supermarket racks, and (iii) the necessary manipulation skills in confined spaces to arrange objects in semi-accessible rack shelves. We demonstrate our framework in a simulated environment as well as on a real shopping rack using a PR2 robot. Typical supermarket products are detected and rearranged in the retail rack, tidying up items that were found to be misplaced.
Submitted 13 May, 2016;
originally announced May 2016.
-
Reasoning about Unmodelled Concepts - Incorporating Class Taxonomies in Probabilistic Relational Models
Authors:
Daniel Nyga,
Michael Beetz
Abstract:
A key problem in the application of first-order probabilistic methods is the enormous size of graphical models they imply. The size results from the possible worlds that can be generated by a domain of objects and relations. One of the reasons for this explosion is that so far the approaches do not sufficiently exploit the structure and similarity of possible worlds in order to encode the models more compactly. We propose fuzzy inference in Markov logic networks, which enables the use of taxonomic knowledge as a source of imposing structure onto possible worlds. We show that by exploiting this structure, probability distributions can be represented more compactly and that the reasoning systems become capable of reasoning about concepts not contained in the probabilistic knowledge base.
Submitted 21 April, 2015;
originally announced April 2015.
-
Learning and Reasoning with Action-Related Places for Robust Mobile Manipulation
Authors:
Freek Stulp,
Andreas Fedrizzi,
Lorenz Mösenlechner,
Michael Beetz
Abstract:
We propose the concept of Action-Related Place (ARPlace) as a powerful and flexible representation of task-related place in the context of mobile manipulation. ARPlace represents robot base locations not as a single position, but rather as a collection of positions, each with an associated probability that the manipulation action will succeed when located there. ARPlaces are generated using a predictive model that is acquired through experience-based learning, and take into account the uncertainty the robot has about its own location and the location of the object to be manipulated.
When executing the task, rather than choosing one specific goal position based only on the initial knowledge about the task context, the robot instantiates an ARPlace, and bases its decisions on this ARPlace, which is updated as new information about the task becomes available. To show the advantages of this least-commitment approach, we present a transformational planner that reasons about ARPlaces in order to optimize symbolic plans. Our empirical evaluation demonstrates that using ARPlaces leads to more robust and efficient mobile manipulation in the face of state estimation uncertainty on our simulated robot.
Submitted 18 January, 2014;
originally announced January 2014.
-
Probabilistic Hybrid Action Models for Predicting Concurrent Percept-driven Robot Behavior
Authors:
M. Beetz,
H. Grosskreutz
Abstract:
This article develops Probabilistic Hybrid Action Models (PHAMs), a realistic causal model for predicting the behavior generated by modern percept-driven robot plans. PHAMs represent aspects of robot behavior that cannot be represented by most action models used in AI planning: the temporal structure of continuous control processes, their non-deterministic effects, several modes of their interferences, and the achievement of triggering conditions in closed-loop robot plans.
The main contributions of this article are: (1) PHAMs, a model of concurrent percept-driven behavior, its formalization, and proofs that the model generates probably, qualitatively accurate predictions; and (2) a resource-efficient inference method for PHAMs based on sampling projections from probabilistic action models and state descriptions. We show how PHAMs can be applied to planning the course of action of an autonomous robot office courier based on analytical and experimental results.
Submitted 27 September, 2011;
originally announced September 2011.