Showing 1–10 of 10 results for author: Hasanbeig, M

  1. arXiv:2102.12855  [pdf, other]

    cs.LG cs.AI cs.FL cs.LO

    Modular Deep Reinforcement Learning for Continuous Motion Planning with Temporal Logic

    Authors: Mingyu Cai, Mohammadhosein Hasanbeig, Shaoping Xiao, Alessandro Abate, Zhen Kan

    Abstract: This paper investigates the motion planning of autonomous dynamical systems modeled by Markov decision processes (MDPs) with unknown transition probabilities over continuous state and action spaces. Linear temporal logic (LTL) is used to specify high-level tasks over an infinite horizon, which can be converted into a limit deterministic generalized Büchi automaton (LDGBA) with several accepting sets.…

    Submitted 23 January, 2022; v1 submitted 23 February, 2021; originally announced February 2021.

    Comments: arXiv admin note: text overlap with arXiv:2010.06797

    Journal ref: IEEE Robotics and Automation Letters, 2021

  2. arXiv:2101.08153  [pdf, other]

    cs.AI

    Shielding Atari Games with Bounded Prescience

    Authors: Mirco Giacobbe, Mohammadhosein Hasanbeig, Daniel Kroening, Hjalmar Wijk

    Abstract: Deep reinforcement learning (DRL) is applied in safety-critical domains such as robotics and autonomous driving. It achieves superhuman abilities in many tasks; however, whether DRL agents can be shown to act safely is an open problem. Atari games are a simple yet challenging exemplar for evaluating the safety of DRL agents and feature a diverse portfolio of game mechanics. The safety of neural age…

    Submitted 22 January, 2021; v1 submitted 20 January, 2021; originally announced January 2021.

    Comments: To appear at AAMAS 2021

  3. arXiv:2007.02527  [pdf, other]

    cs.AI

    Jump Operator Planning: Goal-Conditioned Policy Ensembles and Zero-Shot Transfer

    Authors: Thomas J. Ringstrom, Mohammadhosein Hasanbeig, Alessandro Abate

    Abstract: In Hierarchical Control, compositionality, abstraction, and task-transfer are crucial for designing versatile algorithms which can solve a variety of problems with maximal representational reuse. We propose a novel hierarchical and compositional framework called Jump-Operator Dynamic Programming for quickly computing solutions within a super-exponential space of sequential sub-goal tasks with orde…

    Submitted 6 July, 2020; originally announced July 2020.

  4. arXiv:2002.12156  [pdf, other]

    cs.LG cs.AI cs.LO eess.SY stat.ML

    Cautious Reinforcement Learning with Logical Constraints

    Authors: Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening

    Abstract: This paper presents the concept of an adaptive safe padding that forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process. Policies are synthesised to satisfy a goal, expressed as a temporal logic formula, with maximal probability. Enforcing the RL agent to stay safe during learning might limit the exploration; however, we show that…

    Submitted 21 March, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: Accepted to AAMAS 2020. arXiv admin note: text overlap with arXiv:1902.00778

  5. arXiv:1911.10244  [pdf, other]

    cs.LG cs.AI cs.LO stat.ML

    DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning

    Authors: Mohammadhosein Hasanbeig, Natasha Yogananda Jeppu, Alessandro Abate, Tom Melham, Daniel Kroening

    Abstract: This paper proposes DeepSynth, a method for effective training of deep Reinforcement Learning (RL) agents when the reward is sparse and non-Markovian, but at the same time progress towards the reward requires achieving an unknown sequence of high-level objectives. Our method employs a novel algorithm for synthesis of compact automata to uncover this sequential structure automatically. We synthesis…

    Submitted 6 March, 2021; v1 submitted 22 November, 2019; originally announced November 2019.

    Comments: Extended version of AAAI 2021 paper

  6. arXiv:1909.11591  [pdf, other]

    cs.LG cs.AI cs.LO eess.SY stat.ML

    Modular Deep Reinforcement Learning with Temporal Logic Specifications

    Authors: Lim Zun Yuan, Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening

    Abstract: We propose an actor-critic, model-free, and online Reinforcement Learning (RL) framework for continuous-state continuous-action Markov Decision Processes (MDPs) when the reward is highly sparse but encompasses a high-level temporal structure. We represent this temporal structure by a finite-state machine and construct an on-the-fly synchronised product with the MDP and the finite machine. The temp…

    Submitted 22 November, 2019; v1 submitted 23 September, 2019; originally announced September 2019.

    Comments: arXiv admin note: text overlap with arXiv:1902.00778

  7. arXiv:1909.05304  [pdf, other]

    cs.LO cs.LG eess.SY stat.ML

    Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees

    Authors: Mohammadhosein Hasanbeig, Yiannis Kantaros, Alessandro Abate, Daniel Kroening, George J. Pappas, Insup Lee

    Abstract: Reinforcement Learning (RL) has emerged as an efficient method of choice for solving complex sequential decision making problems in automatic control, computer science, economics, and biology. In this paper we present a model-free RL algorithm to synthesize control policies that maximize the probability of satisfying high-level control objectives given as Linear Temporal Logic (LTL) formulas. Unce…

    Submitted 11 September, 2019; originally announced September 2019.

  8. arXiv:1809.07823  [pdf, other]

    cs.LG cs.FL cs.LO stat.ML

    Logically-Constrained Neural Fitted Q-Iteration

    Authors: Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening

    Abstract: We propose a method for efficient training of Q-functions for continuous-state Markov Decision Processes (MDPs) such that the traces of the resulting policies satisfy a given Linear Temporal Logic (LTL) property. LTL, a modal logic, can express a wide range of time-dependent logical properties (including "safety") that are quite similar to patterns in natural language. We convert the LTL property…

    Submitted 14 March, 2019; v1 submitted 20 September, 2018; originally announced September 2018.

    Comments: AAMAS 2019

  9. arXiv:1802.02277  [pdf, other]

    cs.LG cs.MA

    From Game-theoretic Multi-agent Log Linear Learning to Reinforcement Learning

    Authors: Mohammadhosein Hasanbeig, Lacra Pavel

    Abstract: The main focus of this paper is on enhancement of two types of game-theoretic learning algorithms: log-linear learning and reinforcement learning. The standard analysis of log-linear learning needs a highly structured environment, i.e. strong assumptions about the game from an implementation perspective. In this paper, we introduce a variant of log-linear learning that provides asymptotic guarante…

    Submitted 18 September, 2018; v1 submitted 6 February, 2018; originally announced February 2018.

  10. arXiv:1801.08099  [pdf, other]

    cs.LG cs.LO

    Logically-Constrained Reinforcement Learning

    Authors: Mohammadhosein Hasanbeig, Alessandro Abate, Daniel Kroening

    Abstract: We present the first model-free Reinforcement Learning (RL) algorithm to synthesise policies for an unknown Markov Decision Process (MDP), such that a linear time property is satisfied. The given temporal property is converted into a Limit Deterministic Büchi Automaton (LDBA) and a robust reward function is defined over the state-action pairs of the MDP according to the resulting LDBA. With this r…

    Submitted 16 February, 2019; v1 submitted 24 January, 2018; originally announced January 2018.