Search | arXiv e-print repository

Jacta: A Versatile Planner for Learning Dexterous and Whole-body Manipulation

Authors: Jan Brüdigam, Ali-Adeeb Abbas, Maks Sorokin, Kuan Fang, Brandon Hung, Maya Guru, Stefan Sosnowski, Jiuguang Wang, Sandra Hirche, Simon Le Cleac'h

Abstract: Robotic manipulation is challenging due to discontinuous dynamics, as well as high-dimensional state and action spaces. Data-driven approaches that succeed in manipulation tasks require large amounts of data and expert demonstrations, typically from humans. Existing manipulation planners are restricted to specific systems and often depend on specialized algorithms for using demonstrations. Therefore, we introduce a flexible motion planner tailored to dexterous and whole-body manipulation tasks. Our planner creates readily usable demonstrations for reinforcement learning algorithms, eliminating the need for additional training pipeline complexities. With this approach, we can efficiently learn policies for complex manipulation tasks, where traditional reinforcement learning alone makes little progress. Furthermore, we demonstrate that learned policies are transferable to real robotic systems for solving complex dexterous manipulation tasks.

Submitted 2 August, 2024; originally announced August 2024.

arXiv:2407.20156 [pdf, other]

Autonomous and Teleoperation Control of a Drawing Robot Avatar

Authors: Lingyun Chen, Abdeldjallil Naceri, Abdalla Swikir, Sandra Hirche, Sami Haddadin

Abstract: A drawing robot avatar is a robotic system that allows for telepresence-based drawing, enabling users to remotely control a robotic arm and create drawings in real time from a remote location. The proposed control framework aims to improve bimanual robot telepresence quality by reducing the user workload and required prior knowledge through the automation of secondary or auxiliary tasks. The introduced novel method calculates the near-optimal Cartesian end-effector pose in terms of visual feedback quality for the attached eye-to-hand camera while taking motion constraints into account. The effectiveness is demonstrated by conducting user studies of drawing reference shapes using the implemented robot avatar compared to stationary and teleoperated camera pose conditions. Our results demonstrate that the proposed control framework offers improved visual feedback quality and drawing performance.

Submitted 29 July, 2024; originally announced July 2024.

Comments: Accepted to ICRA 2024

arXiv:2407.16407 [pdf, other]

Data-Driven Optimal Feedback Laws via Kernel Mean Embeddings

Authors: Petar Bevanda, Nicolas Hoischen, Stefan Sosnowski, Sandra Hirche, Boris Houska

Abstract: This paper proposes a fully data-driven approach for optimal control of nonlinear control-affine systems represented by a stochastic diffusion. The focus is on the scenario where both the nonlinear dynamics and stage cost functions are unknown, while only the control penalty function and constraints are provided. Leveraging the theory of reproducing kernel Hilbert spaces, we introduce novel kernel mean embeddings (KMEs) to identify the Markov transition operators associated with controlled diffusion processes. The KME learning approach seamlessly integrates with modern convex operator-theoretic Hamilton-Jacobi-Bellman recursions. Thus, unlike traditional dynamic programming methods, our approach exploits the "kernel trick" to break the curse of dimensionality. We demonstrate the effectiveness of our method through numerical examples, highlighting its ability to solve a large class of nonlinear optimal control problems.

Submitted 23 July, 2024; originally announced July 2024.

Comments: author-submitted electronic preprint version: 16 pages, 3 figures, 4 tables

arXiv:2405.08756 [pdf, other]

Stable Inverse Reinforcement Learning: Policies from Control Lyapunov Landscapes

Authors: Samuel Tesfazgi, Leonhard Sprandl, Armin Lederer, Sandra Hirche

Abstract: Learning from expert demonstrations to flexibly program an autonomous system with complex behaviors or to predict an agent's behavior is a powerful tool, especially in collaborative control settings. A common method to solve this problem is inverse reinforcement learning (IRL), where the observed agent, e.g., a human demonstrator, is assumed to behave according to the optimization of an intrinsic cost function that reflects its intent and informs its control actions. While the framework is expressive, it is also computationally demanding and generally lacks convergence guarantees. We therefore propose a novel, stability-certified IRL approach by reformulating the cost function inference problem to learning control Lyapunov functions (CLFs) from demonstration data. By additionally exploiting closed-form expressions for associated control policies, we are able to efficiently search the space of CLFs by observing the attractor landscape of the induced dynamics. For the construction of the inverse optimal CLFs, we use Sum of Squares techniques and formulate a convex optimization problem. We present a theoretical analysis of the optimality properties provided by the CLF and evaluate our approach using both simulated and real-world data.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.08711 [pdf, other]

Data-driven Force Observer for Human-Robot Interaction with Series Elastic Actuators using Gaussian Processes

Authors: Samuel Tesfazgi, Markus Keßler, Emilio Trigili, Armin Lederer, Sandra Hirche

Abstract: Ensuring safety and adapting to the user's behavior are of paramount importance in physical human-robot interaction. Thus, incorporating elastic actuators in the robot's mechanical design has become popular, since it offers intrinsic compliance and additionally provides a coarse estimate for the interaction force by measuring the deformation of the elastic components. While observer-based methods have been shown to improve these estimates, they rely on accurate models of the system, which are challenging to obtain in complex operating environments. In this work, we overcome this issue by learning the unknown dynamics components using Gaussian process (GP) regression. By employing the learned model in a Bayesian filtering framework, we improve the estimation accuracy and additionally obtain an observer that explicitly considers local model uncertainty in the confidence measure of the state estimate. Furthermore, we derive guaranteed estimation error bounds, thus facilitating the use in safety-critical applications. We demonstrate the effectiveness of the proposed approach experimentally in a human-exoskeleton interaction scenario.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.07312 [pdf, other]

Nonparametric Control-Koopman Operator Learning: Flexible and Scalable Models for Prediction and Control

Authors: Petar Bevanda, Bas Driessen, Lucian Cristian Iacob, Roland Toth, Stefan Sosnowski, Sandra Hirche

Abstract: Linearity of Koopman operators and simplicity of their estimators coupled with model-reduction capabilities have led to their great popularity in applications for learning dynamical systems. While nonparametric Koopman operator learning in infinite-dimensional reproducing kernel Hilbert spaces is well understood for autonomous systems, its control system analogues are largely unexplored. Addressing systems with control inputs in a principled manner is crucial for fully data-driven learning of controllers, especially since existing approaches commonly resort to representational heuristics or parametric models of limited expressiveness and scalability. We address the aforementioned challenge by proposing a universal framework via control-affine reproducing kernels that enables direct estimation of a single operator even for control systems. The proposed approach, called control-Koopman operator regression (cKOR), is thus completely analogous to Koopman operator regression of the autonomous case. First in the literature, we present a nonparametric framework for learning Koopman operator representations of nonlinear control-affine systems that does not suffer from the curse of control input dimensionality. This allows for reformulating the infinite-dimensional learning problem in a finite-dimensional space based solely on data without a priori loss of precision due to a restriction to a finite span of functions or inputs as in other approaches. For enabling applications to large-scale control systems, we also enhance the scalability of control-Koopman operator estimators by leveraging random projections (sketching). The efficacy of our novel cKOR approach is demonstrated on both forecasting and control tasks.

Submitted 12 May, 2024; originally announced May 2024.

arXiv:2404.11760 [pdf, other]

Predictive Model Development to Identify Failed Healing in Patients after Non-Union Fracture Surgery

Authors: Cedric Donié, Marie K. Reumann, Tony Hartung, Benedikt J. Braun, Tina Histing, Satoshi Endo, Sandra Hirche

Abstract: Bone non-union is among the most severe complications associated with trauma surgery, occurring in 10-30% of cases after long bone fractures. Treating non-unions requires a high level of surgical expertise and often involves multiple revision surgeries, sometimes even leading to amputation. Thus, more accurate prognosis is crucial for patient well-being. Recent advances in machine learning (ML) hold promise for developing models to predict non-union healing, even when working with smaller datasets, a commonly encountered challenge in clinical domains. To demonstrate the effectiveness of ML in identifying candidates at risk of failed non-union healing, we applied three ML models (logistic regression, support vector machine, and XGBoost) to the clinical dataset TRUFFLE, which includes 797 patients with long bone non-union. The models provided prediction results with 70% sensitivity and specificities of 66% (XGBoost), 49% (support vector machine), and 43% (logistic regression). These findings offer valuable clinical insights because they enable early identification of patients at risk of failed non-union healing after the initial surgical revision treatment protocol.

Submitted 17 April, 2024; originally announced April 2024.

Comments: To be presented at the 46th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC 2024)

ACM Class: J.3; I.5.4

arXiv:2404.02988 [pdf, other]

Risk-averse Learning with Non-Stationary Distributions

Authors: Siyi Wang, Zifan Wang, Xinlei Yi, Michael M. Zavlanos, Karl H. Johansson, Sandra Hirche

Abstract: Considering non-stationary environments in online optimization enables a decision-maker to effectively adapt to changes and improve its performance over time. In such cases, it is favorable to adopt a strategy that minimizes the negative impact of change to avoid potentially risky situations. In this paper, we investigate risk-averse online optimization where the distribution of the random cost changes over time. We minimize a risk-averse objective function using the Conditional Value at Risk (CVaR) as the risk measure. Due to the difficulty in obtaining the exact CVaR gradient, we employ a zeroth-order optimization approach that queries the cost function values multiple times at each iteration and estimates the CVaR gradient using the sampled values. To facilitate the regret analysis, we use a variation metric based on Wasserstein distance to capture time-varying distributions. Given that the distribution variation is sub-linear in the total number of episodes, we show that our designed learning algorithm achieves sub-linear dynamic regret with high probability for both convex and strongly convex functions. Moreover, theoretical results suggest that increasing the number of samples leads to a reduction in the dynamic regret bounds until the sampling number reaches a specific limit. Finally, we provide numerical experiments of dynamic pricing in a parking lot to illustrate the efficacy of the designed algorithm.

Submitted 3 April, 2024; originally announced April 2024.
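
Note on the abstract above: the idea of estimating CVaR from repeatedly sampled cost values can be illustrated with a minimal sketch. This is not the paper's estimator or algorithm; it assumes the common convention that CVaR at level alpha is the average of the worst (1 - alpha) fraction of costs, and the function name `empirical_cvar` is hypothetical.

```python
import math


def empirical_cvar(costs, alpha):
    """Sample-based CVaR sketch (assumed convention, not the paper's method).

    Returns the mean of the largest ceil((1 - alpha) * n) sampled costs,
    i.e. the average cost over the worst (1 - alpha) tail.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    if not costs:
        raise ValueError("at least one cost sample is required")
    # Sort from worst (largest cost) to best.
    ordered = sorted(costs, reverse=True)
    # Number of samples in the tail; always keep at least one.
    k = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    tail = ordered[:k]
    return sum(tail) / k


if __name__ == "__main__":
    samples = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    # Average of the two largest costs.
    print(empirical_cvar(samples, 0.8))
```

With alpha = 0 the estimate reduces to the plain sample mean, so the risk level interpolates between risk-neutral and worst-case behavior.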

arXiv:2403.11932 [pdf, ps, other]

Consistency of Value of Information: Effects of Packet Loss and Time Delay in Networked Control Systems Tasks

Authors: Touraj Soleymani, John S. Baras, Siyi Wang, Sandra Hirche, Karl H. Johansson

Abstract: In this chapter, we study the consistency of the value of information, a semantic metric that claims to determine the right piece of information in networked control systems tasks, in a lossy and delayed communication regime. Our analysis begins with a focus on state estimation, and subsequently extends to feedback control. To that end, we make a causal tradeoff between the packet rate and the mean square error. Associated with this tradeoff, we demonstrate the existence of an optimal policy profile, comprising a symmetric threshold scheduling policy based on the value of information for the encoder and a non-Gaussian linear estimation policy for the decoder. Our structural results assert that the scheduling policy is expressible in terms of $3d-1$ variables related to the source and the channel, where $d$ is the time delay, and that the estimation policy incorporates no residual related to signaling. We then construct an optimal control policy by exploiting the separation principle.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.11927 [pdf, ps, other]

Foundations of Value of Information: A Semantic Metric for Networked Control Systems Tasks

Authors: Touraj Soleymani, John S. Baras, Sandra Hirche, Karl H. Johansson

Abstract: In this chapter, we present our recent invention, i.e., the notion of the value of information, a semantic metric that is fundamental for networked control systems tasks. We begin our analysis by formulating a causal tradeoff between the packet rate and the regulation cost, with an encoder and a decoder as two distributed decision makers, and show that the valuation of information is conceivable and quantifiable grounded on this tradeoff. More precisely, we characterize an equilibrium, and quantify the value of information there as the variation in a value function with respect to a piece of sensory measurement that can be communicated from the encoder to the decoder at each time. We prove that, in feedback control of a dynamical process over a noiseless channel, the value of information is a function of the discrepancy between the state estimates at the encoder and the decoder, and that a data packet containing a sensory measurement at each time should be exchanged only if the value of information at that time is nonnegative. Finally, we prove that the characterized equilibrium is in fact globally optimal.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2402.03174 [pdf, ps, other]

Decentralized Event-Triggered Online Learning for Safe Consensus of Multi-Agent Systems with Gaussian Process Regression

Authors: Xiaobing Dai, Zewen Yang, Mengtian Xu, Fangzhou Liu, Georges Hattab, Sandra Hirche

Abstract: Consensus control in multi-agent systems has received significant attention and practical implementation across various domains. However, managing consensus control under unknown dynamics remains a significant challenge for control design due to system uncertainties and environmental disturbances. This paper presents a novel learning-based distributed control law, augmented by auxiliary dynamics. Gaussian processes are harnessed to compensate for the unknown components of the multi-agent system. For continuous enhancement in predictive performance of the Gaussian process model, a data-efficient online learning strategy with a decentralized event-triggered mechanism is proposed. Furthermore, the control performance of the proposed approach is ensured via the Lyapunov theory, based on a probabilistic guarantee for prediction error bounds. To demonstrate the efficacy of the proposed learning-based controller, a comparative analysis is conducted, contrasting it with both conventional distributed control laws and offline learning methodologies.

Submitted 5 February, 2024; originally announced February 2024.

arXiv:2402.03048 [pdf, other]

Cooperative Learning with Gaussian Processes for Euler-Lagrange Systems Tracking Control under Switching Topologies

Authors: Zewen Yang, Songbo Dong, Armin Lederer, Xiaobing Dai, Siyu Chen, Stefan Sosnowski, Georges Hattab, Sandra Hirche

Abstract: This work presents an innovative learning-based approach to tackle the tracking control problem of Euler-Lagrange multi-agent systems with partially unknown dynamics operating under switching communication topologies. The approach leverages a correlation-aware cooperative algorithm framework built upon Gaussian process regression, which adeptly captures inter-agent correlations for uncertainty predictions. A standout feature is its exceptional efficiency in deriving the aggregation weights achieved by circumventing the computationally intensive posterior variance calculations. Through Lyapunov stability analysis, the distributed control law ensures bounded tracking errors with high probability. Simulation experiments validate the protocol's efficacy in effectively managing complex scenarios, establishing it as a promising solution for robust tracking control in multi-agent systems characterized by uncertain dynamics and dynamic communication structures.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 8 pages

arXiv:2402.03014 [pdf, other]

Whom to Trust? Elective Learning for Distributed Gaussian Process Regression

Authors: Zewen Yang, Xiaobing Dai, Akshat Dubey, Sandra Hirche, Georges Hattab

Abstract: This paper introduces an innovative approach to enhance distributed cooperative learning using Gaussian process (GP) regression in multi-agent systems (MASs). The key contribution of this work is the development of an elective learning algorithm, namely prior-aware elective distributed GP (Pri-GP), which empowers agents with the capability to selectively request predictions from neighboring agents based on their trustworthiness. The proposed Pri-GP effectively improves individual prediction accuracy, especially in cases where the prior knowledge of an agent is incorrect. Moreover, it eliminates the need for computationally intensive variance calculations for determining aggregation weights in distributed GP. Furthermore, we establish a prediction error bound within the Pri-GP framework, ensuring the reliability of predictions, which is regarded as a crucial property in safety-critical MAS applications.

Submitted 5 February, 2024; originally announced February 2024.

Comments: 9 pages, conference preprint

arXiv:2311.02133 [pdf, other]

Safe Online Dynamics Learning with Initially Unknown Models and Infeasible Safety Certificates

Authors: Alexandre Capone, Ryan Cosner, Aaron Ames, Sandra Hirche

Abstract: Safety-critical control tasks with high levels of uncertainty are becoming increasingly common. Typically, techniques that guarantee safety during learning and control utilize constraint-based safety certificates, which can be leveraged to compute safe control inputs. However, excessive model uncertainty can render robust safety certification methods infeasible, meaning no control input satisfies the constraints imposed by the safety certificate. This paper considers a learning-based setting with a robust safety certificate based on a control barrier function (CBF) second-order cone program. If the control barrier function certificate is feasible, our approach leverages it to guarantee safety. Otherwise, our method explores the system dynamics to collect data and recover the feasibility of the control barrier function constraint. To this end, we employ a method inspired by well-established tools from Bayesian optimization. We show that if the sampling frequency is high enough, we recover the feasibility of the robust CBF certificate, guaranteeing safety. Our approach requires no prior model and corresponds, to the best of our knowledge, to the first algorithm that guarantees safety in settings with occasionally infeasible safety certificates without requiring a backup non-learning-based controller.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2310.02942 [pdf, other]

Online Constraint Tightening in Stochastic Model Predictive Control: A Regression Approach

Authors: Alexandre Capone, Tim Brüdigam, Sandra Hirche

Abstract: Solving chance-constrained stochastic optimal control problems is a significant challenge in control. This is because no analytical solutions exist except for a handful of special cases. A common and computationally efficient approach for tackling chance-constrained stochastic optimal control problems consists of reformulating the chance constraints as hard constraints with a constraint-tightening parameter. However, in such approaches, the choice of constraint-tightening parameter remains challenging, and guarantees can mostly be obtained assuming that the process noise distribution is known a priori. Moreover, the chance constraints are often not tightly satisfied, leading to unnecessarily high costs. This work proposes a data-driven approach for learning the constraint-tightening parameters online during control. To this end, we reformulate the choice of constraint-tightening parameter for the closed-loop as a binary regression problem. We then leverage a highly expressive Gaussian process (GP) model for binary regression to approximate the smallest constraint-tightening parameters that satisfy the chance constraints. By tuning the algorithm parameters appropriately, we show that the resulting constraint-tightening parameters satisfy the chance constraints up to an arbitrarily small margin with high probability. Our approach yields constraint-tightening parameters that tightly satisfy the chance constraints in numerical experiments, resulting in a lower average cost than three other state-of-the-art approaches.

Submitted 4 October, 2023; originally announced October 2023.

Comments: Submitted to Transactions on Automatic Control

arXiv:2307.04415 [pdf, other]

Episodic Gaussian Process-Based Learning Control with Vanishing Tracking Errors

Authors: Armin Lederer, Jonas Umlauft, Sandra Hirche

Abstract: Due to the increasing complexity of technical systems, accurate first-principles models can often not be obtained. Supervised machine learning can mitigate this issue by inferring models from measurement data. Gaussian process regression is particularly well suited for this purpose due to its high data-efficiency and its explicit uncertainty representation, which allows the derivation of prediction error bounds. These error bounds have been exploited to show tracking accuracy guarantees for a variety of control approaches, but their direct dependency on the training data is generally unclear. We address this issue by deriving a Bayesian prediction error bound for GP regression, which we show to decay with the growth of a novel, kernel-based measure of data density. Based on the prediction error bound, we prove time-varying tracking accuracy guarantees for learned GP models used as feedback compensation of unknown nonlinearities, and show that vanishing tracking error is achieved with increasing data density. This enables us to develop an episodic approach for learning Gaussian process models, such that an arbitrary tracking accuracy can be guaranteed. The effectiveness of the derived theory is demonstrated in several simulations.

Submitted 10 July, 2023; originally announced July 2023.

arXiv:2305.16215 [pdf, other]

Koopman Kernel Regression

Authors: Petar Bevanda, Max Beier, Armin Lederer, Stefan Sosnowski, Eyke Hüllermeier, Sandra Hirche

Abstract: Many machine learning approaches for decision making, such as reinforcement learning, rely on simulators or predictive models to forecast the time-evolution of quantities of interest, e.g., the state of an agent or the reward of a policy. Forecasts of such complex phenomena are commonly described by highly nonlinear dynamical systems, making their use in optimization-based decision-making challenging. Koopman operator theory offers a beneficial paradigm for addressing this problem by characterizing forecasts via linear time-invariant (LTI) ODEs, turning multi-step forecasts into sparse matrix multiplication. Though there exists a variety of learning approaches, they usually lack crucial learning-theoretic guarantees, making the behavior of the obtained models with increasing data and dimensionality unclear. We address the aforementioned issue by deriving a universal Koopman-invariant reproducing kernel Hilbert space (RKHS) that solely spans transformations into LTI dynamical systems. The resulting Koopman Kernel Regression (KKR) framework enables the use of statistical learning tools from function approximation for novel convergence results and generalization error bounds under weaker assumptions than existing work. Our experiments demonstrate superior forecasting performance compared to Koopman operator and sequential data predictors in RKHS.

Submitted 16 January, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

Comments: Accepted to the thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023)

arXiv:2305.08169 [pdf, ps, other]

Can Learning Deteriorate Control? Analyzing Computational Delays in Gaussian Process-Based Event-Triggered Online Learning

Authors: Xiaobing Dai, Armin Lederer, Zewen Yang, Sandra Hirche

Abstract: When the dynamics of systems are unknown, supervised machine learning techniques are commonly employed to infer models from data. Gaussian process (GP) regression is a particularly popular learning method for this purpose due to the existence of prediction error bounds. Moreover, GP models can be efficiently updated online, such that event-triggered online learning strategies can be pursued to ensure specified tracking accuracies. However, existing trigger conditions must be able to be evaluated at arbitrary times, which cannot be achieved in practice due to non-negligible computation times. Therefore, we first derive a delay-aware tracking error bound, which reveals an accuracy-delay trade-off. Based on this result, we propose a novel event trigger for GP-based online learning with computational delays, which we show to offer advantages over offline trained GP models for sufficiently small computation times. Finally, we demonstrate the effectiveness of the proposed event trigger for online learning in simulations.

Submitted 14 May, 2023; originally announced May 2023.

arXiv:2304.11265 [pdf, other]

Time Series Classification for Detecting Parkinson's Disease from Wrist Motions

Authors: Cedric Donié, Neha Das, Satoshi Endo, Sandra Hirche

Abstract: Parkinson's disease (PD) is a neurodegenerative condition characterized by frequently changing motor symptoms, necessitating continuous symptom monitoring for more targeted treatment. Classical time series classification and deep learning techniques have demonstrated limited efficacy in monitoring PD symptoms using wearable accelerometer data due to complex PD movement patterns and the small size of available datasets. We investigate InceptionTime and RandOm Convolutional KErnel Transform (ROCKET) as they are promising for PD symptom monitoring, with InceptionTime's high learning capacity being well-suited to modeling complex movement patterns while ROCKET is suited to small datasets. With random search methodology, we identify the highest-scoring InceptionTime architecture and compare its performance to ROCKET with a ridge classifier and a multi-layer perceptron (MLP) on wrist motion data from PD patients. Our findings indicate that all approaches are suitable for estimating tremor severity and bradykinesia presence but encounter challenges in detecting dyskinesia. ROCKET demonstrates superior performance in identifying dyskinesia, whereas InceptionTime exhibits slightly better performance in tremor and bradykinesia detection. Notably, both methods outperform the multi-layer perceptron. In conclusion, InceptionTime exhibits the capability to classify complex wrist motion time series and holds the greatest potential for continuous symptom monitoring in PD.

Submitted 20 May, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

Comments: The source code is available under https://github.com/cedricdonie/tsc-for-wrist-motion-pd-detection

ACM Class: I.5; J.2; J.3

arXiv:2304.05723 [pdf, ps, other]

Distributed Coverage Control of Constrained Constant-Speed Unicycle Multi-Agent Systems

Authors: Qingchen Liu, Zengjie Zhang, Nhan Khanh Le, Jiahu Qin, Fangzhou Liu, Sandra Hirche

Abstract: This paper proposes a novel distributed coverage controller for a multi-agent system with constant-speed unicycle robots (CSUR). The work is motivated by the limitation of the conventional method that does not ensure the satisfaction of hard state- and input-dependent constraints and leads to feasibility issues for multi-CSUR systems. In this paper, we solve these problems by designing a novel coverage cost function and a saturated gradient-search-based control law. Invariant set theory and Lyapunov-based techniques are used to prove the state-dependent confinement and the convergence of the system state to the optimal coverage configuration, respectively. The controller is implemented in a distributed manner based on a novel communication standard among the agents. A series of simulation case studies are conducted to validate the effectiveness of the proposed coverage controller in different initial conditions and with control parameters. A comparison study in simulation reveals the advantage of the proposed method in terms of avoiding infeasibility. The experiment study verifies the applicability of the method to real robots with uncertainties. The development procedure of the method from theoretical analysis to experimental validation provides a novel framework for multi-agent system coordinate control with complex agent dynamics.

Submitted 14 March, 2024; v1 submitted 12 April, 2023; originally announced April 2023.

arXiv:2303.17963 [pdf, other]

doi 10.23919/ECC64448.2024.10590972

Learning-Based Optimal Control with Performance Guarantees for Unknown Systems with Latent States

Authors: Robert Lefringhausen, Supitsana Srithasan, Armin Lederer, Sandra Hirche

Abstract: As control engineering methods are applied to increasingly complex systems, data-driven approaches for system identification appear as a promising alternative to physics-based modeling. While the Bayesian approaches prevalent for safety-critical applications usually rely on the availability of state measurements, the states of a complex system are often not directly measurable. It may then be necessary to jointly estimate the dynamics and the latent state, making the quantification of uncertainties and the design of controllers with formal performance guarantees considerably more challenging. This paper proposes a novel method for the computation of an optimal input trajectory for unknown nonlinear systems with latent states based on a combination of particle Markov chain Monte Carlo methods and scenario theory. Probabilistic performance guarantees are derived for the resulting input trajectory, and an approach to validate the performance of arbitrary control laws is presented. The effectiveness of the proposed method is demonstrated in a numerical simulation.

Submitted 6 August, 2024; v1 submitted 31 March, 2023; originally announced March 2023.

Comments: Accepted version submitted to the 2024 European Control Conference (ECC)

Journal ref: 2024 European Control Conference (ECC), pp. 90-97

arXiv:2302.11961 [pdf, other]

Sharp Calibrated Gaussian Processes

Authors: Alexandre Capone, Geoff Pleiss, Sandra Hirche

Abstract: While Gaussian processes are a mainstay for various engineering and scientific applications, the uncertainty estimates don't satisfy frequentist guarantees and can be miscalibrated in practice. State-of-the-art approaches for designing calibrated models rely on inflating the Gaussian process posterior variance, which yields confidence intervals that are potentially too coarse. To remedy this, we present a calibration approach that generates predictive quantiles using a computation inspired by the vanilla Gaussian process posterior variance but using a different set of hyperparameters chosen to satisfy an empirical calibration constraint. This results in a calibration approach that is considerably more flexible than existing approaches, which we optimize to yield tight predictive quantiles. Our approach is shown to yield a calibrated model under reasonable assumptions. Furthermore, it outperforms existing approaches in sharpness when employed for calibrated regression.

Submitted 16 November, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

arXiv:2302.05979 [pdf, other]

doi 10.1007/s11044-023-09949-x

Variational Integrators and Graph-Based Solvers for Multibody Dynamics in Maximal Coordinates

Authors: Jan Brüdigam, Stefan Sosnowski, Zachary Manchester, Sandra Hirche

Abstract: Multibody dynamics simulators are an important tool in many fields, including learning and control for robotics. However, many existing dynamics simulators suffer from inaccuracies when dealing with constrained mechanical systems due to unsuitable integrators with bad energy behavior and problematic constraint violations, for example for contact interactions. Variational integrators are numerical discretization methods that can reduce physical inaccuracies when simulating mechanical systems, and formulating the dynamics in maximal coordinates allows for easy and numerically robust incorporation of constraints such as kinematic loops or contacts. Therefore, this article derives a variational integrator for mechanical systems with equality and inequality constraints in maximal coordinates. Additionally, efficient graph-based sparsity-exploiting algorithms for solving the integrator are provided and implemented as an open-source simulator. The evaluation of the simulator shows improved physical accuracy due to the variational integrator and the advantages of the sparse solvers. Comparisons to minimal-coordinate algorithms show improved numerical robustness and application examples of a walking robot and an exoskeleton with explicit constraints demonstrate the necessity and capabilities of maximal coordinates.

Submitted 5 November, 2023; v1 submitted 12 February, 2023; originally announced February 2023.

arXiv:2212.00478 [pdf, ps, other]

Safe Learning-Based Control of Elastic Joint Robots via Control Barrier Functions

Authors: Armin Lederer, Azra Begzadić, Neha Das, Sandra Hirche

Abstract: Ensuring safety is of paramount importance in physical human-robot interaction applications. This requires both adherence to safety constraints defined on the system state, as well as guaranteeing compliant behavior of the robot. If the underlying dynamical system is known exactly, the former can be addressed with the help of control barrier functions. The incorporation of elastic actuators in the robot's mechanical design can address the latter requirement. However, this elasticity can increase the complexity of the resulting system, leading to unmodeled dynamics, such that control barrier functions cannot directly ensure safety. In this paper, we mitigate this issue by learning the unknown dynamics using Gaussian process regression. By employing the model in a feedback linearizing control law, the safety conditions resulting from control barrier functions can be robustified to take into account model errors, while remaining feasible. In order to enforce them on-line, we formulate the derived safety conditions in the form of a second-order cone program. We demonstrate our proposed approach with simulations on a two-degree-of-freedom planar robot with elastic joints.

Submitted 14 April, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

arXiv:2209.06936 [pdf, other]

doi 10.1109/LRA.2023.3322899

Vision-Based Uncertainty-Aware Motion Planning based on Probabilistic Semantic Segmentation

Authors: Ralf Römer, Armin Lederer, Samuel Tesfazgi, Sandra Hirche

Abstract: For safe operation, a robot must be able to avoid collisions in uncertain environments. Existing approaches for motion planning under uncertainties often assume parametric obstacle representations and Gaussian uncertainty, which can be inaccurate. While visual perception can deliver a more accurate representation of the environment, its use for safe motion planning is limited by the inherent miscalibration of neural networks and the challenge of obtaining adequate datasets. To address these limitations, we propose to employ ensembles of deep semantic segmentation networks trained with massively augmented datasets to ensure reliable probabilistic occupancy information. To avoid conservatism during motion planning, we directly employ the probabilistic perception in a scenario-based path planning approach. A velocity scheduling scheme is applied to the path to ensure a safe motion despite tracking inaccuracies. We demonstrate the effectiveness of the massive data augmentation in combination with deep ensembles and the proposed scenario-based planning approach in comparisons to state-of-the-art methods and validate our framework in an experiment with a human hand as an obstacle.

Submitted 1 December, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

Journal ref: IEEE Robotics and Automation Letters, vol. 8, no. 11, pp. 7825-7832, 2023

arXiv:2207.01337 [pdf, other]

Safe Reinforcement Learning via Confidence-Based Filters

Authors: Sebastian Curi, Armin Lederer, Sandra Hirche, Andreas Krause

Abstract: Ensuring safety is a crucial challenge when deploying reinforcement learning (RL) to real-world systems. We develop confidence-based safety filters, a control-theoretic approach for certifying state safety constraints for nominal policies learned via standard RL techniques, based on probabilistic dynamics models. Our approach is based on a reformulation of state constraints in terms of cost functions, reducing safety verification to a standard RL task. By exploiting the concept of hallucinating inputs, we extend this formulation to determine a "backup" policy that is safe for the unknown system with high probability. Finally, the nominal policy is minimally adjusted at every time step during a roll-out towards the backup policy, such that safe recovery can be guaranteed afterwards. We provide formal safety guarantees, and empirically demonstrate the effectiveness of our approach.

Submitted 4 July, 2022; originally announced July 2022.

arXiv:2206.13966 [pdf, other]

Dext-Gen: Dexterous Grasping in Sparse Reward Environments with Full Orientation Control

Authors: Martin Schuck, Jan Brüdigam, Alexandre Capone, Stefan Sosnowski, Sandra Hirche

Abstract: Reinforcement learning is a promising method for robotic grasping as it can learn effective reaching and grasping policies in difficult scenarios. However, achieving human-like manipulation capabilities with sophisticated robotic hands is challenging because of the problem's high dimensionality. Although remedies such as reward shaping or expert demonstrations can be employed to overcome this issue, they often lead to oversimplified and biased policies. We present Dext-Gen, a reinforcement learning framework for Dexterous Grasping in sparse reward ENvironments that is applicable to a variety of grippers and learns unbiased and intricate policies. Full orientation control of the gripper and object is achieved through smooth orientation representation. Our approach has reasonable training durations and provides the option to include desired prior knowledge. The effectiveness and adaptability of the framework to different scenarios is demonstrated in simulated experiments.

Submitted 28 June, 2022; originally announced June 2022.

arXiv:2206.12272 [pdf, ps, other]

doi 10.1109/CDC51059.2022.9993123

Physically Consistent Learning of Conservative Lagrangian Systems with Gaussian Processes

Authors: Giulio Evangelisti, Sandra Hirche

Abstract: This paper proposes a physically consistent Gaussian Process (GP) enabling the identification of uncertain Lagrangian systems. The function space is tailored according to the energy components of the Lagrangian and the differential equation structure, analytically guaranteeing physical and mathematical properties such as energy conservation and quadratic form. The novel formulation of Cholesky decomposed matrix kernels allows the probabilistic preservation of positive definiteness. Only differential input-to-output measurements of the function map are required while Gaussian noise is permitted in torques, velocities, and accelerations. We demonstrate the effectiveness of the approach in numerical simulation.

Submitted 3 February, 2023; v1 submitted 24 June, 2022; originally announced June 2022.

Comments: Accepted version of paper published by IEEE in 2022 IEEE 61st Conference on Decision and Control (CDC). Final published paper can be found at https://doi.org/10.1109/CDC51059.2022.9993123

arXiv:2202.11491 [pdf, other]

Networked Online Learning for Control of Safety-Critical Resource-Constrained Systems based on Gaussian Processes

Authors: Armin Lederer, Mingmin Zhang, Samuel Tesfazgi, Sandra Hirche

Abstract: Safety-critical technical systems operating in unknown environments require the ability to quickly adapt their behavior, which can be achieved in control by inferring a model online from the data stream generated during operation. Gaussian process-based learning is particularly well suited for safety-critical applications as it ensures bounded prediction errors. While there exist computationally efficient approximations for online inference, these approaches lack guarantees for the prediction error and have high memory requirements, and are therefore not applicable to safety-critical systems with tight memory constraints. In this work, we propose a novel networked online learning approach based on Gaussian process regression, which addresses the issue of limited local resources by employing remote data management in the cloud. Our approach formally guarantees a bounded tracking error with high probability, which is exploited to identify the most relevant data to achieve a certain control performance. We further propose an effective data transmission scheme between the local system and the cloud taking bandwidth limitations and time delay of the transmission channel into account. The effectiveness of the proposed method is successfully demonstrated in a simulation.

Submitted 23 February, 2022; originally announced February 2022.

arXiv:2201.11640 [pdf, ps, other]

Towards Data-driven LQR with Koopmanizing Flows

Authors: Petar Bevanda, Max Beier, Shahab Heshmati-Alamdari, Stefan Sosnowski, Sandra Hirche

Abstract: We propose a novel framework for learning linear time-invariant (LTI) models for a class of continuous-time non-autonomous nonlinear dynamics based on a representation of Koopman operators. In general, the operator is infinite-dimensional but, crucially, linear. To utilize it for efficient LTI control design, we learn a finite representation of the Koopman operator that is linear in controls while concurrently learning meaningful lifting coordinates. For the latter, we rely on Koopmanizing Flows - a diffeomorphism-based representation of Koopman operators and extend it to systems with linear control entry. With such a learned model, we can replace the nonlinear optimal control problem with quadratic cost by that of a linear quadratic regulator (LQR), facilitating efficacious optimal control for nonlinear systems. The superior control performance of the proposed method is demonstrated on simulation examples.

Submitted 23 May, 2022; v1 submitted 27 January, 2022; originally announced January 2022.

Comments: Final version, accepted for presentation at the 6th IFAC Conference on Intelligent Control and Automation Sciences (ICONS), 2022. arXiv admin note: text overlap with arXiv:2112.04085

arXiv:2112.05451 [pdf, other]

Structure-Preserving Learning Using Gaussian Processes and Variational Integrators

Authors: Jan Brüdigam, Martin Schuck, Alexandre Capone, Stefan Sosnowski, Sandra Hirche

Abstract: Gaussian process regression is increasingly applied for learning unknown dynamical systems. In particular, the implicit quantification of the uncertainty of the learned model makes it a promising approach for safety-critical applications. When using Gaussian process regression to learn unknown systems, a commonly considered approach consists of learning the residual dynamics after applying some generic discretization technique, which might however disregard properties of the underlying physical system. Variational integrators are a less common yet promising approach to discretization, as they retain physical properties of the underlying system, such as energy conservation and satisfaction of explicit kinematic constraints. In this work, we present a novel structure-preserving learning-based modelling approach that combines a variational integrator for the nominal dynamics of a mechanical system and learning residual dynamics with Gaussian process regression. We extend our approach to systems with known kinematic constraints and provide formal bounds on the prediction uncertainty. The simulative evaluation of the proposed method shows desirable energy conservation properties in accordance with general theoretical results and demonstrates exact constraint satisfaction for constrained dynamical systems.

Submitted 17 April, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

Journal ref: Learning for Dynamics and Control Conference 2022

arXiv:2112.04085 [pdf, other]

Diffeomorphically Learning Stable Koopman Operators

Authors: Petar Bevanda, Max Beier, Sebastian Kerz, Armin Lederer, Stefan Sosnowski, Sandra Hirche

Abstract: System representations inspired by the infinite-dimensional Koopman operator (generator) are increasingly considered for predictive modeling. Due to the operator's linearity, a range of nonlinear systems admit linear predictor representations - allowing for simplified prediction, analysis and control. However, finding meaningful finite-dimensional representations for prediction is difficult as it involves determining features that are both Koopman-invariant (evolve linearly under the dynamics) as well as relevant (spanning the original state) - a generally unsupervised problem. In this work, we present Koopmanizing Flows - a novel continuous-time framework for supervised learning of linear predictors for a class of nonlinear dynamics. In our model construction a latent diffeomorphically related linear system unfolds into a linear predictor through the composition with a monomial basis. The lifting, its linear dynamics and state reconstruction are learned simultaneously, while an unconstrained parameterization of Hurwitz matrices ensures asymptotic stability regardless of the operator approximation accuracy. The superior efficacy of Koopmanizing Flows is demonstrated in comparison to a state-of-the-art method on the well-known LASA handwriting benchmark.

Submitted 30 May, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

Comments: Revised version submitted to IEEE Control Systems Letters (L-CSS) with substantially revised exposition, evaluation and proof of Lemma 2 (previously Lemma 8)

arXiv:2111.03617 [pdf, ps, other]

Adaptive Low-Pass Filtering using Sliding Window Gaussian Processes

Authors: Alejandro J. Ordóñez-Conejo, Armin Lederer, Sandra Hirche

Abstract: When signals are measured through physical sensors, they are perturbed by noise. To reduce noise, low-pass filters are commonly employed in order to attenuate high frequency components in the incoming signal, regardless of whether they come from noise or the actual signal. Therefore, low-pass filters must be carefully tuned in order to avoid significant deterioration of the signal. This tuning requires prior knowledge about the signal, which is often not available in applications such as reinforcement learning or learning-based control. In order to overcome this limitation, we propose an adaptive low-pass filter based on Gaussian process regression. By considering a constant window of previous observations, updates and predictions fast enough for real-world filtering applications can be realized. Moreover, the online optimization of hyperparameters leads to an adaptation of the low-pass behavior, such that no prior tuning is necessary. We show that the estimation error of the proposed method is uniformly bounded, and demonstrate the flexibility and efficiency of the approach in several simulations.

Submitted 5 November, 2021; originally announced November 2021.

arXiv:2110.07786 [pdf, other]

Learning the Koopman Eigendecomposition: A Diffeomorphic Approach

Authors: Petar Bevanda, Johannes Kirmayr, Stefan Sosnowski, Sandra Hirche

Abstract: We present a novel data-driven approach for learning linear representations of a class of stable nonlinear systems using Koopman eigenfunctions. By learning the conjugacy map between a nonlinear system and its Jacobian linearization through a Normalizing Flow one can guarantee the learned function is a diffeomorphism. Using this diffeomorphism, we construct eigenfunctions of the nonlinear system via the spectral equivalence of conjugate systems - allowing the construction of linear predictors for nonlinear systems. The universality of the diffeomorphism learner leads to the universal approximation of the nonlinear system's Koopman eigenfunctions. The developed method is also safe as it guarantees the model is asymptotically stable regardless of the representation accuracy. To the best of our knowledge, this is the first work to close the gap between the operator, system and learning theories. The efficacy of our approach is shown through simulation examples.

Submitted 30 May, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: Accepted for presentation at the 2022 American Control Conference (ACC)

arXiv:2110.00481 [pdf, other]

Personalized Rehabilitation Robotics based on Online Learning Control

Authors: Samuel Tesfazgi, Armin Lederer, Johannes F. Kunz, Alejandro J. Ordóñez-Conejo, Sandra Hirche

Abstract: The use of rehabilitation robotics in clinical applications is gaining importance, due to therapeutic benefits and the ability to alleviate labor-intensive work. However, their practical utility is dependent on the deployment of appropriate control algorithms, which adapt the level of task-assistance according to each individual patient's need. Generally, the required personalization is achieved through manual tuning by clinicians, which is cumbersome and error-prone. In this work we propose a novel online learning control architecture, which is able to personalize the control force at run time to each individual user. To this end, we deploy Gaussian process-based online learning with previously unseen prediction and update rates. Finally, we evaluate our method in an experimental user study, where the learning controller is shown to provide personalized control, while also obtaining safe interaction forces.

Submitted 15 September, 2022; v1 submitted 1 October, 2021; originally announced October 2021.

arXiv:2109.07262 [pdf, other]

Linear-Time Contact and Friction Dynamics in Maximal Coordinates using Variational Integrators

Authors: Jan Brüdigam, Jana Janeva, Stefan Sosnowski, Sandra Hirche

Abstract: Simulation of contact and friction dynamics is an important basis for control- and learning-based algorithms. However, the numerical difficulties of contact interactions pose a challenge for robust and efficient simulators. A maximal-coordinate representation of the dynamics enables efficient solving algorithms, but current methods in maximal coordinates require constraint stabilization schemes. Therefore, we propose an interior-point algorithm for the numerically robust treatment of rigid-body dynamics with contact interactions in maximal coordinates. Additionally, we discretize the dynamics with a variational integrator to prevent constraint drift. Our algorithm achieves linear-time complexity both in the number of contact points and the number of bodies, which is shown theoretically and demonstrated with an implementation. Furthermore, we simulate two robotic systems to highlight the applicability of the proposed algorithm.

Submitted 15 September, 2021; originally announced September 2021.

arXiv:2109.02606 [pdf, other]

Gaussian Process Uniform Error Bounds with Unknown Hyperparameters for Safety-Critical Applications

Authors: Alexandre Capone, Armin Lederer, Sandra Hirche

Abstract: Gaussian processes have become a promising tool for various safety-critical settings, since the posterior variance can be used to directly estimate the model error and quantify risk. However, state-of-the-art techniques for safety-critical settings hinge on the assumption that the kernel hyperparameters are known, which does not apply in general. To mitigate this, we introduce robust Gaussian process uniform error bounds in settings with unknown hyperparameters. Our approach computes a confidence region in the space of hyperparameters, which enables us to obtain a probabilistic upper bound for the model error of a Gaussian process with arbitrary hyperparameters. We do not require a priori knowledge of any bounds for the hyperparameters, which is an assumption commonly found in related work. Instead, we are able to derive bounds from data in an intuitive fashion. We additionally employ the proposed technique to derive performance guarantees for a class of learning-based control problems. Experiments show that the bound performs significantly better than vanilla and fully Bayesian Gaussian processes.

Submitted 20 July, 2022; v1 submitted 6 September, 2021; originally announced September 2021.

arXiv:2107.14580 [pdf, other]

Distributed Event- and Self-Triggered Coverage Control with Speed Constrained Unicycle Robots

Authors: Yuni Zhou, Lingxuan Kong, Stefan Sosnowski, Qingchen Liu, Sandra Hirche

Abstract: Voronoi coverage control is a particular problem of importance in the area of multi-robot systems, which considers a network of multiple autonomous robots, tasked with optimally covering a large area. This is a common task for fleets of fixed-wing Unmanned Aerial Vehicles (UAVs), which are described in this work by a unicycle model with constant forward-speed constraints. We develop event-based control/communication algorithms to relax the resource requirements on wireless communication and control actuators, an important feature for battery-driven or otherwise energy-constrained systems. To overcome the drawback that the event-triggered algorithm requires continuous measurement of system states, we propose a self-triggered algorithm to estimate the next triggering time. Hardware experiments illustrate the theoretical results.

Submitted 30 July, 2021; originally announced July 2021.

arXiv:2106.10662 [pdf, other]

FedXGBoost: Privacy-Preserving XGBoost for Federated Learning

Authors: Nhan Khanh Le, Yang Liu, Quang Minh Nguyen, Qingchen Liu, Fangzhou Liu, Quanwei Cai, Sandra Hirche

Abstract: Federated learning is a distributed machine learning framework that enables collaborative training across multiple parties while ensuring data privacy. Practical adaptation of XGBoost, the state-of-the-art tree boosting framework, to federated learning remains limited due to the high cost incurred by conventional privacy-preserving methods. To address the problem, we propose two variants of federated XGBoost with privacy guarantees: FedXGBoost-SMM and FedXGBoost-LDP. Our first protocol, FedXGBoost-SMM, deploys an enhanced secure matrix multiplication method to preserve privacy with lossless accuracy and lower overhead than encryption-based techniques. Developed independently, the second protocol, FedXGBoost-LDP, is heuristically designed with noise perturbation for local differential privacy, and empirically evaluated on real-world and synthetic datasets.

Submitted 12 August, 2021; v1 submitted 20 June, 2021; originally announced June 2021.

arXiv:2105.12236 [pdf, other]

Gaussian Process-based Stochastic Model Predictive Control for Overtaking in Autonomous Racing

Authors: Tim Brüdigam, Alexandre Capone, Sandra Hirche, Dirk Wollherr, Marion Leibold

Abstract: A fundamental aspect of racing is overtaking other race cars. Whereas previous research on autonomous racing has mainly focused on lap-time optimization, here, we propose a method to plan overtaking maneuvers in autonomous racing. A Gaussian process is used to learn the behavior of the leading vehicle. Based on the outputs of the Gaussian process, a stochastic Model Predictive Control algorithm plans optimistic trajectories, such that the controlled autonomous race car is able to overtake the leading vehicle. The proposed method is tested in a simple simulation scenario.

Submitted 25 May, 2021; originally announced May 2021.

Comments: This work has been accepted to the ICRA 2021 workshop 'Opportunities and Challenges with Autonomous Racing'

arXiv:2104.04483 [pdf, other]

Inverse Reinforcement Learning: A Control Lyapunov Approach

Authors: Samuel Tesfazgi, Armin Lederer, Sandra Hirche

Abstract: Inferring the intent of an intelligent agent from demonstrations and subsequently predicting its behavior is a critical task in many collaborative settings. A common approach to solving this problem is the framework of inverse reinforcement learning (IRL), where the observed agent, e.g., a human demonstrator, is assumed to behave according to an intrinsic cost function that reflects its intent and informs its control actions. In this work, we reformulate the IRL inference problem to learning control Lyapunov functions (CLF) from demonstrations by exploiting the inverse optimality property, which states that every CLF is also a meaningful value function. Moreover, the derived CLF formulation directly guarantees stability of inferred control policies. We show the flexibility of our proposed method by learning from goal-directed movement demonstrations in a continuous environment.

Submitted 4 October, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

Comments: This work has been accepted for presentation at, and publication in the proceedings of, the 2021 IEEE Conference on Decision and Control (CDC)

arXiv:2104.04342 [pdf, ps, other]

doi 10.1109/cdc45484.2021.9683772

Distributed Bayesian Online Learning for Cooperative Manipulation

Authors: Pablo Budde gen. Dohmann, Armin Lederer, Marcel Dißemond, Sandra Hirche

Abstract: For tasks where the dynamics of multiple agents are physically coupled, e.g., in cooperative manipulation, the coordination between the individual agents becomes crucial, which requires exact knowledge of the interaction dynamics. This problem is typically addressed using centralized estimators, which can negatively impact the flexibility and robustness of the overall system. To overcome this shortcoming, we propose a novel distributed learning framework for the exemplary task of cooperative manipulation using Bayesian principles. Using only local state information, each agent obtains an estimate of the object dynamics and grasp kinematics. These local estimates are combined using dynamic average consensus. Due to the strong probabilistic foundation of the method, each estimate of the object dynamics and grasp kinematics is accompanied by a measure of uncertainty, which makes it possible to guarantee a bounded prediction error with high probability. Moreover, the Bayesian principles directly allow iterative learning with constant complexity, such that the proposed learning method can be used online in real-time applications. The effectiveness of the approach is demonstrated in a simulated cooperative manipulation task.

Submitted 28 June, 2022; v1 submitted 9 April, 2021; originally announced April 2021.

arXiv:2101.05328 [pdf, other]

Uniform Error and Posterior Variance Bounds for Gaussian Process Regression with Application to Safe Control

Authors: Armin Lederer, Jonas Umlauft, Sandra Hirche

Abstract: In application areas where data generation is expensive, Gaussian processes are a preferred supervised learning model due to their high data-efficiency. Particularly in model-based control, Gaussian processes allow the derivation of performance guarantees using probabilistic model error bounds. To make these approaches applicable in practice, two open challenges must be solved: (i) Existing error bounds rely on prior knowledge, which might not be available for many real-world tasks. (ii) The relationship between training data and the posterior variance, which mainly drives the error bound, is not well understood and prevents the asymptotic analysis. This article addresses these issues by presenting a novel uniform error bound using Lipschitz continuity and an analysis of the posterior variance function for a large class of kernels. Additionally, we show how these results can be used to guarantee safe control of an unknown dynamical system and provide numerical illustration examples.

Submitted 13 January, 2021; originally announced January 2021.

arXiv:2011.10596 [pdf, ps, other]

The Impact of Data on the Stability of Learning-Based Control - Extended Version

Authors: Armin Lederer, Alexandre Capone, Thomas Beckers, Jonas Umlauft, Sandra Hirche

Abstract: Despite the existence of formal guarantees for learning-based control approaches, the relationship between data and control performance is still poorly understood. In this paper, we propose a Lyapunov-based measure for quantifying the impact of data on the certifiable control performance. By modeling unknown system dynamics through Gaussian processes, we can determine the interrelation between model uncertainty and satisfaction of stability conditions. This allows us to directly assess the impact of data on the provable stationary control performance, and thereby the value of the data for the closed-loop system performance. Our approach is applicable to a wide variety of unknown nonlinear systems that are to be controlled by a generic learning-based control law, and the results obtained in numerical simulations indicate the efficacy of the proposed measure.

Submitted 30 July, 2021; v1 submitted 20 November, 2020; originally announced November 2020.

arXiv:2011.08683 [pdf, ps, other]

Fisher Information of a Family of Generalized Normal Distributions

Authors: Precious Ugo Abara, Sandra Hirche

Abstract: In this brief note we compute the Fisher information of a family of generalized normal distributions. Fisher information is usually defined for regular distributions, i.e. continuously differentiable (log) density functions whose support does not depend on the family parameter $θ$. Although the uniform distribution in $[-θ, + θ]$ does not satisfy the regularity requirements, as a special case of our result, we will obtain the Fisher information for this family.

Submitted 17 November, 2020; originally announced November 2020.

Comments: 3 pages, 1 figure

arXiv:2010.02613 [pdf, other]

Deep Learning based Uncertainty Decomposition for Real-time Control

Authors: Neha Das, Jonas Umlauft, Armin Lederer, Thomas Beckers, Sandra Hirche

Abstract: Data-driven control in unknown environments requires a clear understanding of the involved uncertainties for ensuring safety and efficient exploration. While aleatoric uncertainty that arises from measurement noise can often be explicitly modeled given a parametric description, it can be harder to model epistemic uncertainty, which describes the presence or absence of training data. The latter can be particularly useful for implementing exploratory control strategies when system dynamics are unknown. We propose a novel method for detecting the absence of training data using deep learning, which gives a continuous-valued scalar output between $0$ (indicating low uncertainty) and $1$ (indicating high uncertainty). We utilize this detector as a proxy for epistemic uncertainty and show its advantages over existing approaches on synthetic and real-world datasets. Our approach can be directly combined with aleatoric uncertainty estimates and allows for uncertainty estimation in real-time, as the inference is sample-free, unlike existing approaches for uncertainty modeling. We further demonstrate the practicality of this uncertainty estimate in deploying online data-efficient control on a simulated quadcopter acted upon by an unknown disturbance model.

Submitted 12 July, 2023; v1 submitted 6 October, 2020; originally announced October 2020.

Comments: Accepted at IFAC World Congress 2023

arXiv:2009.06689 [pdf, other]

Online learning-based trajectory tracking for underactuated vehicles with uncertain dynamics

Authors: Thomas Beckers, Leonardo Colombo, Sandra Hirche, George J. Pappas

Abstract: Underactuated vehicles have gained much attention in recent years due to the increasing number of aerial and underwater vehicles as well as nanosatellites. Trajectory tracking control of these vehicles is a substantial aspect for an increasing range of application domains. However, external disturbances and parts of the internal dynamics are often unknown or very time-consuming to model. To overcome this issue, we present a tracking control law for underactuated rigid-body dynamics using an online learning-based oracle for the prediction of the unknown dynamics. We show that Gaussian process models are of particular interest for the role of the oracle. The presented approach guarantees a bounded tracking error with high probability where the bound is explicitly given. A numerical example highlights the effectiveness of the proposed control law.

Submitted 14 September, 2021; v1 submitted 14 September, 2020; originally announced September 2020.

arXiv:2007.12377 [pdf, ps, other]

Anticipating the Long-Term Effect of Online Learning in Control

Authors: Alexandre Capone, Sandra Hirche

Abstract: Control schemes that learn using measurement data collected online are increasingly promising for the control of complex and uncertain systems. However, in most approaches of this kind, learning is viewed as a side effect that passively improves control performance, e.g., by updating a model of the system dynamics. Determining how improvements in control performance due to learning can be actively exploited in the control synthesis is still an open research question. In this paper, we present AntLer, a design algorithm for learning-based control laws that anticipates learning, i.e., that takes the impact of future learning in uncertain dynamic settings explicitly into account. AntLer expresses system uncertainty using a non-parametric probabilistic model. Given a cost function that measures control performance, AntLer chooses the control parameters such that the expected cost of the closed-loop system is minimized approximately. We show that AntLer approximates an optimal solution arbitrarily accurately with probability one. Furthermore, we apply AntLer to a nonlinear system, which yields better results compared to the case where learning is not anticipated.

Submitted 24 July, 2020; originally announced July 2020.

arXiv:2006.14551 [pdf, other]

Prediction with Approximated Gaussian Process Dynamical Models

Authors: Thomas Beckers, Sandra Hirche

Abstract: The modeling and simulation of dynamical systems is a necessary step for many control approaches. Using classical, parameter-based techniques for modeling of modern systems, e.g., soft robotics or human-robot interaction, is often challenging or even infeasible due to the complexity of the system dynamics. In contrast, data-driven approaches need only a minimum of prior knowledge and scale with the complexity of the system. In particular, Gaussian process dynamical models (GPDMs) provide very promising results for the modeling of complex dynamics. However, the control properties of these GP models are just sparsely researched, which leads to a "blackbox" treatment in modeling and control scenarios. In addition, the sampling of GPDMs for prediction purposes respecting their non-parametric nature results in non-Markovian dynamics, making the theoretical analysis challenging. In this article, we present approximated GPDMs which are Markov and analyze their control theoretical properties. Among others, the approximated error is analyzed and conditions for boundedness of the trajectories are provided. The outcomes are illustrated with numerical examples that show the power of the approximated models while the computational time is significantly reduced.

Submitted 30 November, 2021; v1 submitted 25 June, 2020; originally announced June 2020.

Comments: This article has been accepted for publication by IEEE

arXiv:2006.09446 [pdf, ps, other]

Real-Time Regression with Dividing Local Gaussian Processes

Authors: Armin Lederer, Alejandro Jose Ordonez Conejo, Korbinian Maier, Wenxin Xiao, Jonas Umlauft, Sandra Hirche

Abstract: The increased demand for online prediction and the growing availability of large data sets drive the need for computationally efficient models. While exact Gaussian process regression shows various favorable theoretical properties (uncertainty estimate, unlimited expressive power), the poor scaling with respect to the training set size prohibits its application in big data regimes in real-time. Therefore, this paper proposes dividing local Gaussian processes, which are a novel, computationally efficient modeling approach based on Gaussian process regression. Due to an iterative, data-driven division of the input space, they achieve a sublinear computational complexity in the total number of training points in practice, while providing excellent predictive distributions. A numerical evaluation on real-world data sets shows their advantages over other state-of-the-art methods in terms of accuracy as well as prediction and update speed.

Submitted 30 July, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

Showing 1–50 of 69 results for author: Hirche, S