Search | arXiv e-print repository

Open Problems in Technical AI Governance

Authors: Anka Reuel, Ben Bucknall, Stephen Casper, Tim Fist, Lisa Soder, Onni Aarne, Lewis Hammond, Lujain Ibrahim, Alan Chan, Peter Wills, Markus Anderljung, Ben Garfinkel, Lennart Heim, Andrew Trask, Gabriel Mukobi, Rylan Schaeffer, Mauricio Baker, Sara Hooker, Irene Solaiman, Alexandra Sasha Luccioni, Nitarshan Rajkumar, Nicolas Moës, Jeffrey Ladish, Neel Guha, Jessica Newman, et al. (6 additional authors not shown)

Abstract: AI progress is creating a growing range of risks and opportunities, but it is often unclear how they should be navigated. In many cases, the barriers and uncertainties faced are at least partly technical. Technical AI governance, referring to technical analysis and tools for supporting the effective governance of AI, seeks to address such challenges. It can help to (a) identify areas where intervention is needed, (b) identify and assess the efficacy of potential governance actions, and (c) enhance governance options by designing mechanisms for enforcement, incentivization, or compliance. In this paper, we explain what technical AI governance is, why it is important, and present a taxonomy and incomplete catalog of its open problems. This paper is intended as a resource for technical researchers or research funders looking to contribute to AI governance.

Submitted 20 July, 2024; originally announced July 2024.

Comments: Ben Bucknall and Anka Reuel contributed equally and share the first author position

arXiv:2406.12137 [pdf, other]

IDs for AI Systems

Authors: Alan Chan, Noam Kolt, Peter Wills, Usman Anwar, Christian Schroeder de Witt, Nitarshan Rajkumar, Lewis Hammond, David Krueger, Lennart Heim, Markus Anderljung

Abstract: AI systems are increasingly pervasive, yet information needed to decide whether and how to engage with them may not exist or be accessible. A user may not be able to verify whether a system has certain safety certifications. An investigator may not know whom to investigate when a system causes an incident. It may not be clear whom to contact to shut down a malfunctioning system. Across a number of domains, IDs address analogous problems by identifying particular entities (e.g., a particular Boeing 747) and providing information about other entities of the same class (e.g., some or all Boeing 747s). We propose a framework in which IDs are ascribed to instances of AI systems (e.g., a particular chat session with Claude 3), and associated information is accessible to parties seeking to interact with that system. We characterize IDs for AI systems, provide concrete examples where IDs could be useful, argue that there could be significant demand for IDs from key actors, analyze how those actors could incentivize ID adoption, explore a potential implementation of our framework for deployers of AI systems, and highlight limitations and risks. IDs seem most warranted in settings where AI systems could have a large impact upon the world, such as in making financial transactions or contacting real humans. With further study, IDs could help to manage a world where AI systems pervade society.

Submitted 18 July, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

Comments: Work-in-progress

arXiv:2404.09932 [pdf, other]

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Authors: Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, et al. (13 additional authors not shown)

Abstract: This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose $200+$ concrete research questions.

Submitted 15 April, 2024; originally announced April 2024.

arXiv:2402.15821 [pdf, other]

doi 10.24963/ijcai.2024/26

Cooperation and Control in Delegation Games

Authors: Oliver Sourbut, Lewis Hammond, Harriet Wood

Abstract: Many settings of interest involving humans and machines -- from virtual personal assistants to autonomous vehicles -- can naturally be modelled as principals (humans) delegating to agents (machines), which then interact with each other on their principals' behalf. We refer to these multi-principal, multi-agent scenarios as delegation games. In such games, there are two important failure modes: problems of control (where an agent fails to act in line with their principal's preferences) and problems of cooperation (where the agents fail to work well together). In this paper we formalise and analyse these problems, further breaking them down into issues of alignment (do the players have similar preferences?) and capabilities (how competent are the players at satisfying those preferences?). We show -- theoretically and empirically -- how these measures determine the principals' welfare, how they can be estimated using limited observations, and thus how they might be used to help us design more aligned and cooperative AI systems.

Submitted 5 August, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

Comments: Published at IJCAI 2024

arXiv:2402.07510 [pdf, other]

Secret Collusion among Generative AI Agents

Authors: Sumeet Ramesh Motwani, Mikhail Baranchuk, Martin Strohmeier, Vijay Bolina, Philip H. S. Torr, Lewis Hammond, Christian Schroeder de Witt

Abstract: Recent capability increases in large language models (LLMs) open up applications in which groups of communicating generative AI agents solve joint tasks. This poses privacy and security challenges concerning the unauthorised sharing of information, or other unwanted forms of agent coordination. Modern steganographic techniques could render such dynamics hard to detect. In this paper, we comprehensively formalise the problem of secret collusion in systems of generative AI agents by drawing on relevant concepts from both AI and security literature. We study incentives for the use of steganography, and propose a variety of mitigation measures. Our investigations result in a model evaluation framework that systematically tests capabilities required for various forms of secret collusion. We provide extensive empirical results across a range of contemporary LLMs. While the steganographic capabilities of current models remain limited, GPT-4 displays a capability jump suggesting the need for continuous monitoring of steganographic frontier model capabilities. We conclude by laying out a comprehensive research program to mitigate future risks of collusion between generative AI models.

Submitted 28 August, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

arXiv:2401.13138 [pdf, other]

Visibility into AI Agents

Authors: Alan Chan, Carson Ezell, Max Kaufmann, Kevin Wei, Lewis Hammond, Herbie Bradley, Emma Bluemke, Nitarshan Rajkumar, David Krueger, Noam Kolt, Lennart Heim, Markus Anderljung

Abstract: Increased delegation of commercial, scientific, governmental, and personal activities to AI agents -- systems capable of pursuing complex goals with limited supervision -- may exacerbate existing societal risks and introduce new risks. Understanding and mitigating these risks involves critically evaluating existing governance structures, revising and adapting these structures where needed, and ensuring accountability of key stakeholders. Information about where, why, how, and by whom certain AI agents are used, which we refer to as visibility, is critical to these objectives. In this paper, we assess three categories of measures to increase visibility into AI agents: agent identifiers, real-time monitoring, and activity logging. For each, we outline potential implementations that vary in intrusiveness and informativeness. We analyze how the measures apply across a spectrum of centralized through decentralized deployment contexts, accounting for various actors in the supply chain including hardware and software service providers. Finally, we discuss the implications of our measures for privacy and concentration of power. Further work into understanding the measures and mitigating their negative impacts can help to build a foundation for the governance of AI agents.

Submitted 17 May, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

Comments: Accepted to ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT 2024)

arXiv:2310.08901 [pdf, other]

Welfare Diplomacy: Benchmarking Language Model Cooperation

Authors: Gabriel Mukobi, Hannah Erlebach, Niklas Lauffer, Lewis Hammond, Alan Chan, Jesse Clifton

Abstract: The growing capabilities and increasingly widespread deployment of AI systems necessitate robust benchmarks for measuring their cooperative capabilities. Unfortunately, most multi-agent benchmarks are either zero-sum or purely cooperative, providing limited opportunities for such measurements. We introduce a general-sum variant of the zero-sum board game Diplomacy -- called Welfare Diplomacy -- in which players must balance investing in military conquest and domestic welfare. We argue that Welfare Diplomacy facilitates both a clearer assessment of and stronger training incentives for cooperative capabilities. Our contributions are: (1) proposing the Welfare Diplomacy rules and implementing them via an open-source Diplomacy engine; (2) constructing baseline agents using zero-shot prompted language models; and (3) conducting experiments where we find that baselines using state-of-the-art models attain high social welfare but are exploitable. Our work aims to promote societal safety by aiding researchers in developing and assessing multi-agent AI systems. Code to evaluate Welfare Diplomacy and reproduce our experiments is available at https://github.com/mukobi/welfare-diplomacy.

Submitted 13 October, 2023; originally announced October 2023.

arXiv:2309.13174 [pdf, other]

Robust self-propulsion in sand using simply controlled vibrating cubes

Authors: Bangyuan Liu, Tianyu Wang, Velin Kojouharov, Frank L. Hammond III, Daniel I. Goldman

Abstract: Much of the Earth and many surfaces of extraterrestrial bodies are composed of in-cohesive particle matter. Locomoting on granular terrain is challenging for common robotic devices, either wheeled or legged. In this work, we discover a robust alternative locomotion mechanism on granular media -- generating movement via self-vibration. To demonstrate the effectiveness of this locomotion mechanism, we develop a cube-shaped robot with an embedded vibratory motor and conduct systematic experiments on diverse granular terrains of various particle properties. We investigate how locomotion changes as a function of vibration frequency/intensity on granular terrains. Compared to hard surfaces, we find such a vibratory locomotion mechanism enables the robot to move faster and more stably on granular surfaces, facilitated by the interaction between the body and surrounding granules. The simplicity in structural design and controls of this robotic system indicates that vibratory locomotion can be a valuable alternative way to produce robust locomotion on granular terrains. We further demonstrate that such cube-shaped robots can be used as modular units for morphologically structured vibratory robots with capabilities of maneuverable forward and turning motions, showing potential practical scenarios for robotic systems.

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2309.04655 [pdf]

Intelligent upper-limb exoskeleton integrated with soft wearable bioelectronics and deep-learning for human intention-driven strength augmentation based on sensory feedback

Authors: Jinwoo Lee, Kangkyu Kwon, Ira Soltis, Jared Matthews, Yoonjae Lee, Hojoong Kim, Lissette Romero, Nathan Zavanelli, Youngjin Kwon, Shinjae Kwon, Jimin Lee, Yewon Na, Sung Hoon Lee, Ki Jun Yu, Minoru Shinohara, Frank L. Hammond, Woon-Hong Yeo

Abstract: The age- and stroke-associated decline in musculoskeletal strength degrades the ability to perform daily human tasks using the upper extremities. Although there are a few examples of exoskeletons, they need manual operations due to the absence of sensor feedback and intention prediction of movements. Here, we introduce an intelligent upper-limb exoskeleton system that uses cloud-based deep learning to predict human intention for strength augmentation. The embedded soft wearable sensors provide sensory feedback by collecting real-time muscle signals, which are simultaneously computed to determine the user's intended movement. The cloud-based deep learning predicts four upper-limb joint motions with an average accuracy of 96.2% at a 200-250 millisecond response rate, suggesting that the exoskeleton operates just by human intention. In addition, an array of soft pneumatics assists the intended movements by providing 897 newtons of force and 78.7 millimeters of displacement at maximum. Collectively, the intent-driven exoskeleton can augment human strength by 5.15 times on average compared to the unassisted exoskeleton. This report demonstrates an exoskeleton robot that augments the upper-limb joint movements by human intention based on machine-learning cloud computing and sensory feedback.

Submitted 26 January, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: 15 pages, 6 figures, 1 table, published in npj Flexible Electronics

MSC Class: 68T40 (Primary) 92C55; 68T99 (Secondary)

arXiv:2307.05059 [pdf, ps, other]

doi 10.4204/EPTCS.379.17

On Imperfect Recall in Multi-Agent Influence Diagrams

Authors: James Fox, Matt MacDermott, Lewis Hammond, Paul Harrenstein, Alessandro Abate, Michael Wooldridge

Abstract: Multi-agent influence diagrams (MAIDs) are a popular game-theoretic model based on Bayesian networks. In some settings, MAIDs offer significant advantages over extensive-form game representations. Previous work on MAIDs has assumed that agents employ behavioural policies, which set independent conditional probability distributions over actions for each of their decisions. In settings with imperfect recall, however, a Nash equilibrium in behavioural policies may not exist. We overcome this by showing how to solve MAIDs with forgetful and absent-minded agents using mixed policies and two types of correlated equilibrium. We also analyse the computational complexity of key decision problems in MAIDs, and explore tractable cases. Finally, we describe applications of MAIDs to Markov games and team situations, where imperfect recall is often unavoidable.

Submitted 11 July, 2023; originally announced July 2023.

Comments: In Proceedings TARK 2023, arXiv:2307.04005

Journal ref: EPTCS 379, 2023, pp. 201-220

arXiv:2304.09370 [pdf, other]

Integrating Reconfigurable Foot Design, Multi-modal Contact Sensing, and Terrain Classification for Bipedal Locomotion

Authors: Ted Tyler, Vaibhav Malhotra, Adam Montague, Zhigen Zhao, Frank L. Hammond III, Ye Zhao

Abstract: The ability of bipedal robots to adapt to diverse and unstructured terrain conditions is crucial for their deployment in real-world environments. To this end, we present a novel, bio-inspired robot foot design with stabilizing tarsal segments and a multifarious sensor suite involving acoustic, capacitive, tactile, temperature, and acceleration sensors. A real-time signal processing and terrain classification system is developed and evaluated. The sensed terrain information is used to control actuated segments of the foot, leading to improved ground contact and stability. The proposed framework highlights the potential of the sensor-integrated adaptive foot for intelligent and adaptive locomotion.

Submitted 18 April, 2023; originally announced April 2023.

Comments: 7 pages, 6 figures

arXiv:2301.02324 [pdf, other]

doi 10.1016/j.artint.2023.103919

Reasoning about Causality in Games

Authors: Lewis Hammond, James Fox, Tom Everitt, Ryan Carey, Alessandro Abate, Michael Wooldridge

Abstract: Causal reasoning and game-theoretic reasoning are fundamental topics in artificial intelligence, among many other disciplines: this paper is concerned with their intersection. Despite their importance, a formal framework that supports both these forms of reasoning has, until now, been lacking. We offer a solution in the form of (structural) causal games, which can be seen as extending Pearl's causal hierarchy to the game-theoretic domain, or as extending Koller and Milch's multi-agent influence diagrams to the causal domain. We then consider three key questions: i) How can the (causal) dependencies in games - either between variables, or between strategies - be modelled in a uniform, principled manner? ii) How may causal queries be computed in causal games, and what assumptions does this require? iii) How do causal games compare to existing formalisms? To address question i), we introduce mechanised games, which encode dependencies between agents' decision rules and the distributions governing the game. In response to question ii), we present definitions of predictions, interventions, and counterfactuals, and discuss the assumptions required for each. Regarding question iii), we describe correspondences between causal games and other formalisms, and explain how causal games can be used to answer queries that other causal or game-theoretic models do not support. Finally, we highlight possible applications of causal games, aided by an extensive open-source Python library.

Submitted 17 April, 2023; v1 submitted 5 January, 2023; originally announced January 2023.

Comments: Published in Artificial Intelligence (2023)

arXiv:2212.13769 [pdf, other]

doi 10.24963/ijcai.2022/476

Lexicographic Multi-Objective Reinforcement Learning

Authors: Joar Skalse, Lewis Hammond, Charlie Griffin, Alessandro Abate

Abstract: In this work we introduce reinforcement learning techniques for solving lexicographic multi-objective problems. These are problems that involve multiple reward signals, and where the goal is to learn a policy that maximises the first reward signal, and subject to this constraint also maximises the second reward signal, and so on. We present a family of both action-value and policy gradient algorithms that can be used to solve such problems, and prove that they converge to policies that are lexicographically optimal. We evaluate the scalability and performance of these algorithms empirically, demonstrating their practical applicability. As a more specific application, we show how our algorithms can be used to impose safety constraints on the behaviour of an agent, and compare their performance in this context with that of other constrained reinforcement learning algorithms.

Submitted 28 December, 2022; originally announced December 2022.

Journal ref: IJCAI 2022; Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence. Main Track, Pages 3430-3436

arXiv:2212.09234 [pdf, other]

Real-Time Deformable-Contact-Aware Model Predictive Control for Force-Modulated Manipulation

Authors: Lasitha Wijayarathne, Ziyi Zhou, Ye Zhao, Frank L. Hammond III

Abstract: Force modulation of robotic manipulators has been extensively studied for several decades. However, it is not yet commonly used in safety-critical applications due to a lack of accurate interaction contact modeling and weak performance guarantees - a large proportion of them concerning the modulation of interaction forces. This study presents a high-level framework for simultaneous trajectory optimization and force control of the interaction between a manipulator and soft environments, which is prone to external disturbances. Sliding friction and normal contact force are taken into account. The dynamics of the soft contact model and the manipulator are simultaneously incorporated in a trajectory optimizer to generate desired motion and force profiles. A constrained optimization framework based on the Alternating Direction Method of Multipliers (ADMM) has been employed to efficiently generate real-time optimal control inputs and high-dimensional state trajectories in a Model Predictive Control fashion. Experimental validation of the model performance is conducted on a soft substrate with known material properties using a Cartesian space force control mode. Results show a comparison of ground truth and real-time model-based contact force and motion tracking for multiple Cartesian motions in the valid range of the friction model. It is shown that a contact model-based motion planner can compensate for frictional forces and motion disturbances and improve the overall motion and force tracking accuracy. The proposed high-level planner has the potential to facilitate the automation of medical tasks involving the manipulation of compliant, delicate, and deformable tissues.

Submitted 9 June, 2023; v1 submitted 18 December, 2022; originally announced December 2022.

Comments: arXiv admin note: text overlap with arXiv:2004.09734

arXiv:2209.15320 [pdf, other]

Bounded Robustness in Reinforcement Learning via Lexicographic Objectives

Authors: Daniel Jarne Ornia, Licio Romao, Lewis Hammond, Manuel Mazo Jr., Alessandro Abate

Abstract: Policy robustness in Reinforcement Learning may not be desirable at any cost: the alterations caused by robustness requirements from otherwise optimal policies should be explainable, quantifiable and formally verifiable. In this work we study how policies can be maximally robust to arbitrary observational noise by analysing how they are altered by this noise through a stochastic linear operator interpretation of the disturbances, and establish connections between robustness and properties of the noise kernel and of the underlying MDPs. Then, we construct sufficient conditions for policy robustness, and propose a robustness-inducing scheme, applicable to any policy gradient algorithm, that formally trades off expected policy utility for robustness through lexicographic optimisation, while preserving convergence and sub-optimality in the policy synthesis.

Submitted 11 December, 2023; v1 submitted 30 September, 2022; originally announced September 2022.

arXiv:2107.09119 [pdf, ps, other]

Rational Verification for Probabilistic Systems

Authors: Julian Gutierrez, Lewis Hammond, Anthony W. Lin, Muhammad Najib, Michael Wooldridge

Abstract: Rational verification is the problem of determining which temporal logic properties will hold in a multi-agent system, under the assumption that agents in the system act rationally, by choosing strategies that collectively form a game-theoretic equilibrium. Previous work in this area has largely focussed on deterministic systems. In this paper, we develop the theory and algorithms for rational verification in probabilistic systems. We focus on concurrent stochastic games (CSGs), which can be used to model uncertainty and randomness in complex multi-agent environments. We study the rational verification problem for both non-cooperative games and cooperative games in the qualitative probabilistic setting. In the former case, we consider LTL properties satisfied by the Nash equilibria of the game and in the latter case LTL properties satisfied by the core. In both cases, we show that the problem is 2EXPTIME-complete, thus not harder than the much simpler verification problem of model checking LTL properties of systems modelled as Markov decision processes (MDPs).

Submitted 26 July, 2021; v1 submitted 19 July, 2021; originally announced July 2021.

Comments: 18th International Conference on Principles of Knowledge Representation and Reasoning (KR 2021)

arXiv:2102.05008 [pdf, other]

Equilibrium Refinements for Multi-Agent Influence Diagrams: Theory and Practice

Authors: Lewis Hammond, James Fox, Tom Everitt, Alessandro Abate, Michael Wooldridge

Abstract: Multi-agent influence diagrams (MAIDs) are a popular form of graphical model that, for certain classes of games, have been shown to offer key complexity and explainability advantages over traditional extensive form game (EFG) representations. In this paper, we extend previous work on MAIDs by introducing the concept of a MAID subgame, as well as subgame perfect and trembling hand perfect equilibrium refinements. We then prove several equivalence results between MAIDs and EFGs. Finally, we describe an open source implementation for reasoning about MAIDs and computing their equilibria.

Submitted 9 February, 2021; originally announced February 2021.

Comments: Accepted to the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-21)

arXiv:2102.00582 [pdf, other]

Multi-Agent Reinforcement Learning with Temporal Logic Specifications

Authors: Lewis Hammond, Alessandro Abate, Julian Gutierrez, Michael Wooldridge

Abstract: In this paper, we study the problem of learning to satisfy temporal logic specifications with a group of agents in an unknown environment, which may exhibit probabilistic behaviour. From a learning perspective these specifications provide a rich formal language with which to capture tasks or objectives, while from a logic and automated verification perspective the introduction of learning capabilities allows for practical applications in large, stochastic, unknown environments. The existing work in this area is, however, limited. Of the frameworks that consider full linear temporal logic or have correctness guarantees, all methods thus far consider only the case of a single temporal logic specification and a single agent. In order to overcome this limitation, we develop the first multi-agent reinforcement learning technique for temporal logic specifications, which is also novel in its ability to handle multiple specifications. We provide correctness and convergence guarantees for our main algorithm - ALMANAC (Automaton/Logic Multi-Agent Natural Actor-Critic) - even when using function approximation. Alongside our theoretical results, we further demonstrate the applicability of our technique via a set of preliminary experiments.

Submitted 9 February, 2021; v1 submitted 31 January, 2021; originally announced February 2021.

Comments: Accepted to the 20th International Conference on Autonomous Agents and Multiagent Systems (AAMAS-21)

arXiv:2005.00739 [pdf, other]

Design-Informed Kinematic Control for Improved Dexterous Teleoperation of a Bilateral Manipulator System

Authors: Lasitha Wijayarathne, Juan Vallejo, Anthony Barnum, Zachary Cloutier, Frank L. Hammond III

Abstract: This paper explores the possibility of improving bilateral robot manipulation task performance through optimizing the robot morphology and configuration of the system through motion. To optimize the design for different scenarios, we select a set of tasks that represent the variability in small scale manipulation (e.g. pick and place, tasks involving positioning and orientation) and track the motion to obtain a reproducible trajectory. Kinematic data is captured through an electromagnetic (EM) tracker system while a human subject performs the tasks. Then, the data is pre-processed and used to optimize the morphology of each symmetric robot arm of the bilateral system. Once optimized, a kinematic control scheme is used to generate a motion with dexterous configurations. The dexterity is evaluated along the trajectories with standard dexterity metrics. Results show a 10% improvement in dexterous maneuverability with the optimized arm design and optimal base configuration.

Submitted 2 May, 2020; originally announced May 2020.

Comments: 7 pages. Submitted to RO-MAN 2020

arXiv:2004.09734 [pdf, other]

Simultaneous Trajectory Optimization and Force Control with Soft Contact Mechanics

Authors: Lasitha Wijayarathne, Qie Sima, Ziyi Zhou, Ye Zhao, Frank L. Hammond III

Abstract: Force modulation of robotic manipulators has been extensively studied for several decades but is not yet commonly used in safety-critical applications due to a lack of accurate interaction contact modeling and weak performance guarantees - a large proportion of them concerning the modulation of interaction forces. This study presents a high-level framework for simultaneous trajectory optimization and force control of the interaction between manipulator and soft environments. Sliding friction and normal contact force are taken into account. The dynamics of the soft contact model and the manipulator dynamics are simultaneously incorporated in the trajectory optimizer to generate desired motion and force profiles. A constraint optimization framework based on Differential Dynamic Programming and the Alternating Direction Method of Multipliers has been employed to generate optimal control input and high-dimensional state trajectories. Experimental validation of the model performance is conducted on a soft substrate with known material properties using Cartesian space force control mode. Results show a comparison of ground truth and predicted model-based contact force states for a few Cartesian motions and the validity range of the friction model. Potential applications include high-level task planning of medical tasks involving manipulation of compliant, delicate, and deformable tissues.

Submitted 5 August, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

Comments: 7 pages, Submitted to IROS 2020 (Accepted for publication)

arXiv:2004.09729 [pdf, other]

Identification of Compliant Contact Parameters and Admittance Force Modulation on a Non-stationary Compliant Surface

Authors: Lasitha Wijayarathne, Frank L. Hammond III

Abstract: Although autonomous control of robotic manipulators has been studied for several decades, they are not commonly used in safety-critical applications due to lack of safety and performance guarantees - many of them concerning the modulation of interaction forces. This paper presents a mechanical probing strategy for estimating the environmental impedance parameters of compliant environments, independent of a manipulator's controller design and configuration. The parameter estimates are used in a position-based adaptive force controller to enable control of interaction forces in compliant, stationary, and non-stationary environments. This approach is targeted for applications where the workspace is constrained and non-stationary, and where force control is critical to task success. These applications include surgical tasks involving manipulation of compliant, delicate, moving tissues. Results show fast parameter estimation and successful force modulation that compensates for motion.

Submitted 20 April, 2020; originally announced April 2020.

Comments: Accepted for publication at ICRA 2020

arXiv:1810.03736 [pdf, other]

doi 10.1007/s10618-020-00726-4

Learning Tractable Probabilistic Models for Moral Responsibility and Blame

Authors: Lewis Hammond, Vaishak Belle

Abstract: Moral responsibility is a major concern in autonomous systems, with applications ranging from self-driving cars to kidney exchanges. Although there have been recent attempts to formalise responsibility and blame, among similar notions, the problem of learning within these formalisms has been unaddressed. From the viewpoint of such systems, the urgent questions are: (a) How can models of moral scenarios and blameworthiness be extracted and learnt automatically from data? (b) How can judgements be computed effectively and efficiently, given the split-second decision points faced by some systems? By building on constrained tractable probabilistic learning, we propose and implement a hybrid (between data-driven and rule-based methods) learning framework for inducing models of such scenarios automatically from data and reasoning tractably from them. We report on experiments that compare our system with human judgement in three illustrative domains: lung cancer staging, teamwork management, and trolley problems.

Submitted 30 January, 2021; v1 submitted 8 October, 2018; originally announced October 2018.

Comments: Published in Data Mining and Knowledge Discovery (2021)
