Scaling Opponent Shaping to High Dimensional Games
Authors:
Akbir Khan,
Timon Willi,
Newton Kwan,
Andrea Tacchetti,
Chris Lu,
Edward Grefenstette,
Tim Rocktäschel,
Jakob Foerster
Abstract:
In multi-agent settings with mixed incentives, methods developed for zero-sum games have been shown to lead to detrimental outcomes. To address this issue, opponent shaping (OS) methods explicitly learn to influence the learning dynamics of co-players and empirically lead to improved individual and collective outcomes. However, OS methods have only been evaluated in low-dimensional environments due to the challenges associated with estimating higher-order derivatives or scaling model-free meta-learning. Alternative methods that scale to more complex settings either converge to undesirable solutions or rely on unrealistic assumptions about the environment or co-players. In this paper, we successfully scale an OS-based approach to general-sum games with temporally-extended actions and long time horizons for the first time. After analysing the representations of the meta-state and history used by previous algorithms, we propose a simplified version called Shaper. We show empirically that Shaper leads to improved individual and collective outcomes in a range of challenging settings from the literature. We further formalize a technique previously implicit in the literature, and analyse its contribution to opponent shaping. We show empirically that this technique is helpful for the functioning of prior methods in certain environments. Lastly, we show that previous environments, such as the CoinGame, are inadequate for analysing temporally-extended general-sum interactions.
Submitted 10 February, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
Constructions for Nonadaptive Tropical Group Testing
Authors:
Nicholas Kwan,
Lele Wang
Abstract:
PCR testing is an invaluable diagnostic tool that has recently seen widespread use during the COVID-19 pandemic. A recent work by Wang, Gabrys and Vardy proposed tropical codes as a model for group PCR testing. For a known but arbitrary number of infected persons, a sufficient condition on the underlying block design of a zero-error tropical code, called double disjunction, is proposed. Despite this, the parameters for which doubly disjunct block designs are known to exist are very limited. In this paper, we define probabilistic tropical codes and consider random block designs that are doubly disjunct with high probability. We also provide a deterministic construction for a doubly disjunct block design given a disjunct block design. We show that for certain choices of parameters, our probabilistic construction has vanishing error. Our constructions, combined with existing methods, give us three different ways to construct tropical codes. We compare the number of tests required by each, as well as bounds on the error.
Submitted 8 May, 2023;
originally announced May 2023.