Showing 1–47 of 47 results for author: Burdick, J W

  1. arXiv:2405.15100  [pdf, other]

    cs.RO

    Mobile Robot Sensory Coverage in 2-D Environments: An Optimization Approach with Efficiency Bounds

    Authors: E. Fourney, J. W. Burdick, E. D. Rimon

    Abstract: This paper considers three related mobile robot multi-target sensory coverage and inspection planning problems in 2-D environments. In the first problem, a mobile robot must find the shortest path to observe multiple targets with a limited range sensor in an obstacle free environment. In the second problem, the mobile robot must efficiently observe multiple targets while taking advantage of multi-…

    Submitted 23 May, 2024; originally announced May 2024.

  2. arXiv:2403.18972  [pdf, other]

    cs.RO eess.SY

    Risk-Aware Robotics: Tail Risk Measures in Planning, Control, and Verification

    Authors: Prithvi Akella, Anushri Dixit, Mohamadreza Ahmadi, Lars Lindemann, Margaret P. Chapman, George J. Pappas, Aaron D. Ames, Joel W. Burdick

    Abstract: The need for a systematic approach to risk assessment has increased in recent years due to the ubiquity of autonomous systems that alter our day-to-day experiences and their need for safety, e.g., for self-driving vehicles, mobile service robots, and bipedal robots. These systems are expected to function safely in unpredictable environments and interact seamlessly with humans, whose behavior is no…

    Submitted 27 March, 2024; originally announced March 2024.

  3. Rollover Prevention for Mobile Robots with Control Barrier Functions: Differentiator-Based Adaptation and Projection-to-State Safety

    Authors: Ersin Das, Aaron D. Ames, Joel W. Burdick

    Abstract: This paper develops rollover prevention guarantees for mobile robots using control barrier function (CBF) theory, and demonstrates the method experimentally. We consider a safety measure based on a zero moment point condition through the lens of CBFs. However, these conditions depend on time-varying and noisy parameters. To address this issue, we present a differentiator-based safety-critical cont…

    Submitted 15 June, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  4. arXiv:2403.03215  [pdf, other]

    cs.RO

    A Safety-Critical Framework for UGVs in Complex Environments: A Data-Driven Discrepancy-Aware Approach

    Authors: Skylar X. Wei, Lu Gan, Joel W. Burdick

    Abstract: This work presents a novel data-driven multi-layered planning and control framework for the safe navigation of a class of unmanned ground vehicles (UGVs) in the presence of unknown stationary obstacles and additive modeling uncertainties. The foundation of this framework is a novel robust model predictive planner, designed to generate optimal collision-free trajectories given an occupancy grid map…

    Submitted 5 March, 2024; originally announced March 2024.

  5. arXiv:2401.01881  [pdf, other]

    eess.SY cs.RO

    Robust Control Barrier Functions using Uncertainty Estimation with Application to Mobile Robots

    Authors: Ersin Das, Joel W. Burdick

    Abstract: Model uncertainty poses a significant challenge to the implementation of safety-critical control systems. With this as motivation, this paper proposes a safe control design approach that guarantees the robustness of nonlinear feedback systems in the presence of matched or unmatched unmodelled system dynamics and external disturbances. Our approach couples control barrier functions (CBFs) with a ne…

    Submitted 3 January, 2024; originally announced January 2024.

  6. arXiv:2310.05865  [pdf, other]

    cs.RO eess.SY

    A Learning-Based Framework for Safe Human-Robot Collaboration with Multiple Backup Control Barrier Functions

    Authors: Neil C. Janwani, Ersin Daş, Thomas Touma, Skylar X. Wei, Tamas G. Molnar, Joel W. Burdick

    Abstract: Ensuring robot safety in complex environments is a difficult task due to actuation limits, such as torque bounds. This paper presents a safety-critical control framework that leverages learning-based switching between multiple backup controllers to formally guarantee safety under bounded control inputs while satisfying driver intention. By leveraging backup controllers designed to uphold safety an…

    Submitted 7 March, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted to the International Conference on Robotics and Automation 2024

  7. arXiv:2309.08766  [pdf, other]

    cs.RO

    The Fractal Hand-II: Reviving a Classic Mechanism for Contemporary Grasping Challenges

    Authors: Malcolm G. A. Tisdale, Joel W. Burdick

    Abstract: This paper, and its companion, propose a new fractal robotic gripper, drawing inspiration from the century-old Fractal Vise. The unusual synergistic properties allow it to passively conform to diverse objects using only one actuator. Designed to be easily integrated with prevailing parallel jaw grippers, it alleviates the complexities tied to perception and grasp planning, especially when dealing…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: This paper is prepared for ICRA 2024

  8. arXiv:2303.03658  [pdf, ps, other]

    cs.RO

    An Active Learning Based Robot Kinematic Calibration Framework Using Gaussian Processes

    Authors: Ersin Daş, Joel W. Burdick

    Abstract: Future NASA lander missions to icy moons will require completely automated, accurate, and data efficient calibration methods for the robot manipulator arms that sample icy terrains in the lander's vicinity. To support this need, this paper presents a Gaussian Process (GP) approach to the classical manipulator kinematic calibration process. Instead of identifying a corrected set of Denavit-Hartenbe…

    Submitted 7 March, 2023; originally announced March 2023.

  9. STEP: Stochastic Traversability Evaluation and Planning for Risk-Aware Off-road Navigation; Results from the DARPA Subterranean Challenge

    Authors: Anushri Dixit, David D. Fan, Kyohei Otsu, Sharmita Dey, Ali-Akbar Agha-Mohammadi, Joel W. Burdick

    Abstract: Although autonomy has gained widespread usage in structured and controlled environments, robotic autonomy in unknown and off-road terrain remains a difficult problem. Extreme, off-road, and unstructured environments such as undeveloped wilderness, caves, rubble, and other post-disaster sites pose unique and challenging problems for autonomous navigation. Based on our participation in the DARPA Sub…

    Submitted 2 March, 2023; originally announced March 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2103.02828

    Journal ref: Field Robotics, 4, 2024, 182-210

  10. arXiv:2302.13687  [pdf, other]

    cs.RO math.OC

    FRoGGeR: Fast Robust Grasp Generation via the Min-Weight Metric

    Authors: Albert H. Li, Preston Culbertson, Joel W. Burdick, Aaron D. Ames

    Abstract: Many approaches to grasp synthesis optimize analytic quality metrics that measure grasp robustness based on finger placements and local surface geometry. However, generating feasible dexterous grasps by optimizing these metrics is slow, often taking minutes. To address this issue, this paper presents FRoGGeR: a method that quickly generates robust precision grasps using the min-weight metric, a no…

    Submitted 24 July, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: Accepted at IROS 2023. The arXiv version contains the appendix, which does not appear in the conference version

  11. arXiv:2212.06253  [pdf, other]

    eess.SY cs.IT cs.LG cs.RO

    Learning Disturbances Online for Risk-Aware Control: Risk-Aware Flight with Less Than One Minute of Data

    Authors: Prithvi Akella, Skylar X. Wei, Joel W. Burdick, Aaron D. Ames

    Abstract: Recent advances in safety-critical risk-aware control are predicated on a priori knowledge of the disturbances a system might face. This paper proposes a method to efficiently learn these disturbances online, in a risk-aware context. First, we introduce the concept of a Surface-at-Risk, a risk measure for stochastic processes that extends Value-at-Risk -- a commonly utilized risk measure in the ris…

    Submitted 12 December, 2022; originally announced December 2022.

  12. arXiv:2212.00278  [pdf, other]

    cs.RO eess.SY

    Adaptive Conformal Prediction for Motion Planning among Dynamic Agents

    Authors: Anushri Dixit, Lars Lindemann, Skylar Wei, Matthew Cleaveland, George J. Pappas, Joel W. Burdick

    Abstract: This paper proposes an algorithm for motion planning among dynamic agents using adaptive conformal prediction. We consider a deterministic control system and use trajectory predictors to predict the dynamic agents' future motion, which is assumed to follow an unknown distribution. We then leverage ideas from adaptive conformal prediction to dynamically quantify prediction uncertainty from an onlin…

    Submitted 30 November, 2022; originally announced December 2022.

  13. arXiv:2204.09833  [pdf, other]

    cs.AI eess.SY

    Sample-Based Bounds for Coherent Risk Measures: Applications to Policy Synthesis and Verification

    Authors: Prithvi Akella, Anushri Dixit, Mohamadreza Ahmadi, Joel W. Burdick, Aaron D. Ames

    Abstract: The dramatic increase of autonomous systems subject to variable environments has given rise to the pressing need to consider risk in both the synthesis and verification of policies for these systems. This paper aims to address a few problems regarding risk-aware verification and policy synthesis, by first developing a sample-based method to bound the risk measure evaluation of a random variable wh…

    Submitted 20 April, 2022; originally announced April 2022.

  14. arXiv:2204.09596  [pdf, other]

    eess.SY cs.RO math.OC

    Risk-Averse Receding Horizon Motion Planning for Obstacle Avoidance using Coherent Risk Measures

    Authors: Anushri Dixit, Mohamadreza Ahmadi, Joel W. Burdick

    Abstract: This paper studies the problem of risk-averse receding horizon motion planning for agents with uncertain dynamics, in the presence of stochastic, dynamic obstacles. We propose a model predictive control (MPC) scheme that formulates the obstacle avoidance constraint using coherent risk measures. To handle disturbances, or process noise, in the state dynamics, the state constraints are tightened in…

    Submitted 28 September, 2023; v1 submitted 20 April, 2022; originally announced April 2022.

    Comments: Accepted to Artificial Intelligence Journal, Special Issue on Risk-aware Autonomous Systems: Theory and Practice. arXiv admin note: text overlap with arXiv:2011.11211

    Journal ref: Artificial Intelligence, 325, 2023, 104018

  15. arXiv:2203.14913  [pdf, other]

    cs.RO

    Moving Obstacle Avoidance: a Data-Driven Risk-Aware Approach

    Authors: Skylar X. Wei, Anushri Dixit, Shashank Tomar, Joel W. Burdick

    Abstract: This paper proposes a new structured method for a moving agent to predict the paths of dynamically moving obstacles and avoid them using a risk-aware model predictive control (MPC) scheme. Given noisy measurements of the a priori unknown obstacle trajectory, a bootstrapping technique predicts a set of obstacle trajectories. The bootstrapped predictions are incorporated in the MPC optimization usin…

    Submitted 25 March, 2022; originally announced March 2022.

    Comments: This is prepared for IEEE Control Systems Letters (L-CSS) 2022

  16. arXiv:2203.12062  [pdf, other]

    eess.SY cs.RO math.OC

    Distributionally Robust Model Predictive Control with Total Variation Distance

    Authors: Anushri Dixit, Mohamadreza Ahmadi, Joel W. Burdick

    Abstract: This paper studies the problem of distributionally robust model predictive control (MPC) using total variation distance ambiguity sets. For a discrete-time linear system with additive disturbances, we provide a conditional value-at-risk reformulation of the MPC optimization problem that is distributionally robust in the expected cost and chance constraints. The distributionally robust chance const…

    Submitted 24 June, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

    Comments: Accepted to LCSS

  17. arXiv:2110.10341  [pdf, other]

    cs.RO

    Quadrotor Trajectory Tracking with Learned Dynamics: Joint Koopman-based Learning of System Models and Function Dictionaries

    Authors: Carl Folkestad, Skylar X. Wei, Joel W. Burdick

    Abstract: Nonlinear dynamical effects are crucial to the operation of many agile robotic systems. Koopman-based model learning methods can capture these nonlinear dynamical system effects in higher dimensional lifted bilinear models that are amenable to optimal control. However, standard methods that lift the system state using a fixed function dictionary before model learning result in high dimensional mod…

    Submitted 19 October, 2021; originally announced October 2021.

    Comments: arXiv admin note: text overlap with arXiv:2105.08036

  18. arXiv:2105.08036  [pdf, other]

    cs.RO eess.SY

    Koopman NMPC: Koopman-based Learning and Nonlinear Model Predictive Control of Control-affine Systems

    Authors: Carl Folkestad, Joel W. Burdick

    Abstract: Koopman-based learning methods can potentially be practical and powerful tools for dynamical robotic systems. However, common methods to construct Koopman representations seek to learn lifted linear models that cannot capture nonlinear actuation effects inherent in many robotic systems. This paper presents a learning and control methodology that is a first step towards overcoming this limitation.…

    Submitted 17 May, 2021; originally announced May 2021.

  19. arXiv:2103.14727  [pdf, other]

    eess.SY cs.AI math.OC math.PR

    Risk-Averse Stochastic Shortest Path Planning

    Authors: Mohamadreza Ahmadi, Anushri Dixit, Joel W. Burdick, Aaron D. Ames

    Abstract: We consider the stochastic shortest path planning problem in MDPs, i.e., the problem of designing policies that ensure reaching a goal state from a given initial state with minimum accrued cost. In order to account for rare but important realizations of the system, we consider a nested dynamic coherent risk total cost functional rather than the conventional risk-neutral total expected cost. Under…

    Submitted 26 March, 2021; originally announced March 2021.

  20. arXiv:2103.03388  [pdf, other]

    cs.RO cs.MA eess.SY

    Limits of Probabilistic Safety Guarantees when Considering Human Uncertainty

    Authors: Richard Cheng, Richard M. Murray, Joel W. Burdick

    Abstract: When autonomous robots interact with humans, such as during autonomous driving, explicit safety guarantees are crucial in order to avoid potentially life-threatening accidents. Many data-driven methods have explored learning probabilistic bounds over human agents' trajectories (i.e. confidence tubes that contain trajectories with probability $δ$), which can then be used to guarantee safety with pr…

    Submitted 24 March, 2021; v1 submitted 4 March, 2021; originally announced March 2021.

    Comments: ICRA 2021

  21. arXiv:2102.09119  [pdf, ps, other]

    cs.RO cs.AI cs.CV

    Learning Invariant Representation of Tasks for Robust Surgical State Estimation

    Authors: Yidan Qin, Max Allan, Yisong Yue, Joel W. Burdick, Mahdi Azizian

    Abstract: Surgical state estimators in robot-assisted surgery (RAS) - especially those trained via learning techniques - rely heavily on datasets that capture surgeon actions in laboratory or real-world surgical tasks. Real-world RAS datasets are costly to acquire, are obtained from multiple surgeons who may use different surgical strategies, and are recorded under uncontrolled conditions in highly complex…

    Submitted 17 February, 2021; originally announced February 2021.

    Comments: Accepted to IEEE Robotics & Automation Letters

  22. arXiv:2011.11211  [pdf, other]

    eess.SY cs.RO math.OC

    Risk-Sensitive Motion Planning using Entropic Value-at-Risk

    Authors: Anushri Dixit, Mohamadreza Ahmadi, Joel W. Burdick

    Abstract: We consider the problem of risk-sensitive motion planning in the presence of randomly moving obstacles. To this end, we adopt a model predictive control (MPC) scheme and pose the obstacle avoidance constraint in the MPC problem as a distributionally robust constraint with a KL divergence ambiguity set. This constraint is the dual representation of the Entropic Value-at-Risk (EVaR). Building upon t…

    Submitted 10 April, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

    Comments: Accepted to 2021 European Control Conference (ECC)

    Journal ref: European Control Conference (ECC) 2021

  23. ROIAL: Region of Interest Active Learning for Characterizing Exoskeleton Gait Preference Landscapes

    Authors: Kejun Li, Maegan Tucker, Erdem Bıyık, Ellen Novoseller, Joel W. Burdick, Yanan Sui, Dorsa Sadigh, Yisong Yue, Aaron D. Ames

    Abstract: Characterizing what types of exoskeleton gaits are comfortable for users, and understanding the science of walking more generally, require recovering a user's utility landscape. Learning these landscapes is challenging, as walking trajectories are defined by numerous gait parameters, data collection from human trials is expensive, and user safety and comfort must be ensured. This work proposes the…

    Submitted 30 March, 2021; v1 submitted 9 November, 2020; originally announced November 2020.

    Comments: 6 pages + 1 page of references; 7 figures; To Appear at ICRA 2021

  24. arXiv:2009.11937  [pdf, ps, other]

    cs.CV cs.LG cs.RO

    daVinciNet: Joint Prediction of Motion and Surgical State in Robot-Assisted Surgery

    Authors: Yidan Qin, Seyedshams Feyzabadi, Max Allan, Joel W. Burdick, Mahdi Azizian

    Abstract: This paper presents a technique to concurrently and jointly predict the future trajectories of surgical instruments and the future state(s) of surgical subtasks in robot-assisted surgeries (RAS) using multiple input sources. Such predictions are a necessary first step towards shared control and supervised autonomy of surgical subtasks. Minute-long surgical subtasks, such as suturing or ultrasound…

    Submitted 24 September, 2020; originally announced September 2020.

    Comments: Accepted to IROS 2020

  25. arXiv:2004.05273  [pdf, other]

    cs.RO cs.LG cs.MA eess.SY math.OC

    Safe Multi-Agent Interaction through Robust Control Barrier Functions with Learned Uncertainties

    Authors: Richard Cheng, Mohammad Javad Khojasteh, Aaron D. Ames, Joel W. Burdick

    Abstract: Robots operating in real world settings must navigate and maintain safety while interacting with many heterogeneous agents and obstacles. Multi-Agent Control Barrier Functions (CBF) have emerged as a computationally efficient tool to guarantee safety in multi-agent environments, but they assume perfect knowledge of both the robot dynamics and other agents' dynamics. While knowledge of the robot's…

    Submitted 22 September, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

    Journal ref: 59th IEEE Conference on Decision and Control (CDC 2020)

  26. arXiv:2003.09267  [pdf, other]

    eess.SY cs.MA cs.RO math.OC

    Barrier Functions for Multiagent-POMDPs with DTL Specifications

    Authors: Mohamadreza Ahmadi, Andrew Singletary, Joel W. Burdick, Aaron D. Ames

    Abstract: Multi-agent partially observable Markov decision processes (MPOMDPs) provide a framework to represent heterogeneous autonomous agents subject to uncertainty and partial observation. In this paper, given a nominal policy provided by a human operator or a conventional planning method, we propose a technique based on barrier functions to design a minimally interfering safety-shield ensuring satisfact…

    Submitted 18 March, 2020; originally announced March 2020.

    Comments: arXiv admin note: text overlap with arXiv:1903.07823

  27. arXiv:2003.06495  [pdf, other]

    cs.RO cs.HC cs.LG

    Human Preference-Based Learning for High-dimensional Optimization of Exoskeleton Walking Gaits

    Authors: Maegan Tucker, Myra Cheng, Ellen Novoseller, Richard Cheng, Yisong Yue, Joel W. Burdick, Aaron D. Ames

    Abstract: Optimizing lower-body exoskeleton walking gaits for user comfort requires understanding users' preferences over a high-dimensional gait parameter space. However, existing preference-based learning methods have only explored low-dimensional domains due to computational limitations. To learn user preferences in high dimensions, this work presents LineCoSpar, a human-in-the-loop preference-based fram…

    Submitted 8 August, 2020; v1 submitted 13 March, 2020; originally announced March 2020.

    Comments: 8 pages, 9 figures, 2 tables, to appear at IROS 2020

  28. arXiv:2002.02921  [pdf, other]

    cs.CV cs.LG cs.RO eess.IV

    Temporal Segmentation of Surgical Sub-tasks through Deep Learning with Multiple Data Sources

    Authors: Yidan Qin, Sahba Aghajani Pedram, Seyedshams Feyzabadi, Max Allan, A. Jonathan McLeod, Joel W. Burdick, Mahdi Azizian

    Abstract: Many tasks in robot-assisted surgeries (RAS) can be represented by finite-state machines (FSMs), where each state represents either an action (such as picking up a needle) or an observation (such as bleeding). A crucial step towards the automation of such surgical tasks is the temporal perception of the current surgical scene, which requires a real-time estimation of the states in the FSMs. The ob…

    Submitted 7 February, 2020; originally announced February 2020.

    Comments: Accepted to ICRA 2020

  29. arXiv:2001.07679  [pdf, other]

    cs.AI cs.FL cs.RO eess.SY math.OC

    Stochastic Finite State Control of POMDPs with LTL Specifications

    Authors: Mohamadreza Ahmadi, Rangoli Sharan, Joel W. Burdick

    Abstract: Partially observable Markov decision processes (POMDPs) provide a modeling framework for autonomous decision making under uncertainty and imperfect sensing, e.g. robot manipulation and self-driving cars. However, optimal control of POMDPs is notoriously intractable. This paper considers the quantitative problem of synthesizing sub-optimal stochastic finite state controllers (sFSCs) for POMDPs such…

    Submitted 21 January, 2020; originally announced January 2020.

  30. arXiv:1909.10209  [pdf, other]

    cs.RO

    Energy-Efficient Motion Planning for Multi-Modal Hybrid Locomotion

    Authors: H. J. Terry Suh, Xiaobin Xiong, Andrew Singletary, Aaron D. Ames, Joel W. Burdick

    Abstract: Hybrid locomotion, which combines multiple modalities of locomotion within a single robot, enables robots to carry out complex tasks in diverse environments. This paper presents a novel method for planning multi-modal locomotion trajectories using approximate dynamic programming. We formulate this problem as a shortest-path search through a state-space graph, where the edge cost is assigned as opt…

    Submitted 4 August, 2020; v1 submitted 23 September, 2019; originally announced September 2019.

    Comments: Accepted to International Conference on Intelligent Robots and Systems (IROS) 2020

  31. arXiv:1908.01289  [pdf, other]

    cs.LG cs.AI stat.ML

    Dueling Posterior Sampling for Preference-Based Reinforcement Learning

    Authors: Ellen R. Novoseller, Yibing Wei, Yanan Sui, Yisong Yue, Joel W. Burdick

    Abstract: In preference-based reinforcement learning (RL), an agent interacts with the environment while receiving preferences instead of absolute feedback. While there is increasing research activity in preference-based RL, the design of formal frameworks that admit tractable theoretical analysis remains an open challenge. Building upon ideas from preference-based bandit learning and posterior sampling in…

    Submitted 29 June, 2020; v1 submitted 4 August, 2019; originally announced August 2019.

    Comments: To appear in Conference on Uncertainty in Artificial Intelligence (UAI), 2020. 9 pages before references and appendix; 51 pages total; 7 figures; 4 tables. This replacement incorporates reviewer comments, and in comparison to version 1, extends the theoretical and empirical analyses and adds mathematical detail. Code: https://github.com/ernovoseller/DuelingPosteriorSampling

  32. arXiv:1905.05380  [pdf, other]

    cs.LG eess.SY stat.ML

    Control Regularization for Reduced Variance Reinforcement Learning

    Authors: Richard Cheng, Abhinav Verma, Gabor Orosz, Swarat Chaudhuri, Yisong Yue, Joel W. Burdick

    Abstract: Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of t…

    Submitted 13 May, 2019; originally announced May 2019.

    Comments: Appearing in ICML 2019

  33. arXiv:1903.08792  [pdf, other]

    cs.LG eess.SY stat.ML

    End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

    Authors: Richard Cheng, Gabor Orosz, Richard M. Murray, Joel W. Burdick

    Abstract: Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2)…

    Submitted 20 March, 2019; originally announced March 2019.

    Comments: Published in AAAI 2019

  34. arXiv:1903.07823  [pdf, other]

    cs.RO

    Safe Policy Synthesis in Multi-Agent POMDPs via Discrete-Time Barrier Functions

    Authors: Mohamadreza Ahmadi, Andrew Singletary, Joel W. Burdick, Aaron D. Ames

    Abstract: A multi-agent partially observable Markov decision process (MPOMDP) is a modeling paradigm used for high-level planning of heterogeneous autonomous agents subject to uncertainty and partial observation. Despite their modeling efficiency, MPOMDPs have not received significant attention in safety-critical settings. In this paper, we use barrier functions to design policies for MPOMDPs that ensure sa…

    Submitted 12 September, 2019; v1 submitted 19 March, 2019; originally announced March 2019.

    Comments: 8 pages and 4 figures

  35. arXiv:1806.07555  [pdf, other]

    cs.LG stat.ML

    Stagewise Safe Bayesian Optimization with Gaussian Processes

    Authors: Yanan Sui, Vincent Zhuang, Joel W. Burdick, Yisong Yue

    Abstract: Enforcing safety is a key aspect of many problems pertaining to sequential decision making under uncertainty, which require the decisions made at every step to be both informative of the optimal decision and also safe. For example, we value both efficacy and comfort in medical therapy, and efficiency and safety in robotic control. We consider this problem of optimizing an unknown utility function…

    Submitted 26 January, 2020; v1 submitted 20 June, 2018; originally announced June 2018.

    Comments: International Conference on Machine Learning (ICML) 2018

  36. arXiv:1711.07894  [pdf, other]

    stat.ML cs.AI q-bio.NC

    Quantifying Performance of Bipedal Standing with Multi-channel EMG

    Authors: Yanan Sui, Kun ho Kim, Joel W. Burdick

    Abstract: Spinal cord stimulation has enabled humans with motor complete spinal cord injury (SCI) to independently stand and recover some lost autonomic function. Quantifying the quality of bipedal standing under spinal stimulation is important for spinal rehabilitation therapies and for new strategies that seek to combine spinal stimulation and rehabilitative robots (such as exoskeletons) in real time feed…

    Submitted 21 November, 2017; originally announced November 2017.

    Journal ref: IROS 2017

  37. arXiv:1710.03592  [pdf, ps, other]

    cs.AI

    Meta Inverse Reinforcement Learning via Maximum Reward Sharing for Human Motion Analysis

    Authors: Kun Li, Joel W. Burdick

    Abstract: This work handles the inverse reinforcement learning (IRL) problem where only a small number of demonstrations are available from a demonstrator for each high-dimensional task, insufficient to estimate an accurate reward function. Observing that each demonstrator has an inherent reward for each state and the task-specific behaviors mainly depend on a small number of key states, we propose a meta I…

    Submitted 12 October, 2017; v1 submitted 7 October, 2017; originally announced October 2017.

    Comments: arXiv admin note: text overlap with arXiv:1707.09394

  38. arXiv:1708.07738  [pdf, ps, other]

    cs.LG cs.RO

    A Function Approximation Method for Model-based High-Dimensional Inverse Reinforcement Learning

    Authors: Kun Li, Joel W. Burdick

    Abstract: This work handles the inverse reinforcement learning problem in high-dimensional state spaces, which relies on an efficient solution of model-based high-dimensional reinforcement learning problems. To solve the computationally expensive reinforcement learning problems, we propose a function approximation method to ensure that the Bellman Optimality Equation always holds, and then estimate a funct…

    Submitted 23 August, 2017; originally announced August 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1707.09394

  39. arXiv:1707.09394  [pdf, ps, other]

    cs.LG cs.RO

    Inverse Reinforcement Learning in Large State Spaces via Function Approximation

    Authors: Kun Li, Joel W. Burdick

    Abstract: This paper introduces a new method for inverse reinforcement learning in large-scale and high-dimensional state spaces. To avoid solving the computationally expensive reinforcement learning problems in reward learning, we propose a function approximation method to ensure that the Bellman Optimality Equation always holds, and then estimate a function to maximize the likelihood of the observed motio…

    Submitted 13 August, 2017; v1 submitted 28 July, 2017; originally announced July 2017.

    Comments: Experiment updated

  40. arXiv:1707.09393  [pdf, ps, other]

    cs.RO

    Online Inverse Reinforcement Learning via Bellman Gradient Iteration

    Authors: Kun Li, Joel W. Burdick

    Abstract: This paper develops an online inverse reinforcement learning algorithm aimed at efficiently recovering a reward function from ongoing observations of an agent's actions. To reduce the computation time and storage space in reward estimation, this work assumes that each observed action implies a change of the Q-value distribution, and relates the change to the reward function via the gradient of Q-v…

    Submitted 28 July, 2017; originally announced July 2017.

    Comments: The code and video are available at https://github.com/mestoking/BellmanGradientIteration/ . arXiv admin note: substantial text overlap with arXiv:1707.07767

  41. arXiv:1707.07767  [pdf, ps, other]

    cs.LG cs.RO

    Bellman Gradient Iteration for Inverse Reinforcement Learning

    Authors: Kun Li, Yanan Sui, Joel W. Burdick

    Abstract: This paper develops an inverse reinforcement learning algorithm aimed at recovering a reward function from the observed actions of an agent. We introduce a strategy to flexibly handle different types of actions with two approximations of the Bellman Optimality Equation, and a Bellman Gradient Iteration method to compute the gradient of the Q-value with respect to the reward function. These methods…

    Submitted 24 July, 2017; originally announced July 2017.

  42. arXiv:1707.07139  [pdf, ps, other]

    cs.RO cs.CV

    Clinical Patient Tracking in the Presence of Transient and Permanent Occlusions via Geodesic Feature

    Authors: Kun Li, Joel W. Burdick

    Abstract: This paper develops a method to use RGB-D cameras to track the motions of a human spinal cord injury patient undergoing spinal stimulation and physical rehabilitation. Because clinicians must remain close to the patient during training sessions, the patient is usually under permanent and transient occlusions due to the training equipment and the movements of the attending clinicians. These occlusi…

    Submitted 24 July, 2017; v1 submitted 22 July, 2017; originally announced July 2017.

  43. arXiv:1707.02375  [pdf, ps, other]

    cs.LG

    Correlational Dueling Bandits with Application to Clinical Treatment in Large Decision Spaces

    Authors: Yanan Sui, Yisong Yue, Joel W. Burdick

    Abstract: We consider sequential decision making under uncertainty, where the goal is to optimize over a large decision space using noisy comparative feedback. This problem can be formulated as a $K$-armed Dueling Bandits problem where $K$ is the total number of decisions. When $K$ is very large, existing dueling bandits algorithms suffer huge cumulative regret before converging on the optimal arm. This pap…

    Submitted 7 July, 2017; originally announced July 2017.

  44. arXiv:1705.00253  [pdf, ps, other]

    cs.LG

    Multi-dueling Bandits with Dependent Arms

    Authors: Yanan Sui, Vincent Zhuang, Joel W. Burdick, Yisong Yue

    Abstract: The dueling bandits problem is an online learning framework for learning from pairwise preference feedback, and is particularly well-suited for modeling settings that elicit subjective or implicit human feedback. In this paper, we study the problem of multi-dueling bandits with dependent arms, which extends the original dueling bandits setting by simultaneously dueling multiple arms as well as mod…

    Submitted 29 April, 2017; originally announced May 2017.

  45. arXiv:1410.2792  [pdf, other]

    cs.RO eess.SY

    Convex Model Predictive Control for Vehicular Systems

    Authors: Tiffany A. Huang, Matanya B. Horowitz, Joel W. Burdick

    Abstract: In this work, we present a method to perform Model Predictive Control (MPC) over systems whose state is an element of $SO(n)$ for $n=2,3$. This is done without charts or any local linearization, and instead is performed by operating over the orbitope of rotation matrices. This results in a novel MPC scheme without the drawbacks associated with conventional linearization techniques. Instead, second…

    Submitted 10 October, 2014; originally announced October 2014.

  46. arXiv:1409.5993  [pdf, other]

    cs.RO

    Optimal Navigation Functions for Nonlinear Stochastic Systems

    Authors: Matanya B. Horowitz, Joel W. Burdick

    Abstract: This paper presents a new methodology to craft navigation functions for nonlinear systems with stochastic uncertainty. The method relies on the transformation of the Hamilton-Jacobi-Bellman (HJB) equation into a linear partial differential equation. This approach allows for optimality criteria to be incorporated into the navigation function, and generalizes several existing results in navigation f…

    Submitted 21 September, 2014; originally announced September 2014.

    Comments: Accepted to IROS 2014. 8 Pages

  47. arXiv:1401.3700  [pdf, other]

    cs.CV

    Convex Relaxations of SE(2) and SE(3) for Visual Pose Estimation

    Authors: Matanya B. Horowitz, Nikolai Matni, Joel W. Burdick

    Abstract: This paper proposes a new method for rigid body pose estimation based on spectrahedral representations of the tautological orbitopes of $SE(2)$ and $SE(3)$. The approach can use dense point cloud data from stereo vision or an RGB-D sensor (such as the Microsoft Kinect), as well as visual appearance data. The method is a convex relaxation of the classical pose estimation problem, and is based on ex…

    Submitted 6 April, 2014; v1 submitted 15 January, 2014; originally announced January 2014.

    Comments: ICRA 2014 Preprint