
Showing 1–13 of 13 results for author: Laterre, A

  1. arXiv:2402.07963

    cs.AI cs.LG

    SPO: Sequential Monte Carlo Policy Optimisation

    Authors: Matthew V Macfarlane, Edan Toledo, Donal Byrne, Paul Duckworth, Alexandre Laterre

    Abstract: Leveraging planning during learning and decision-making is central to the long-term development of intelligent agents. Recent works have successfully combined tree-based search methods and self-play learning mechanisms to this end. However, these methods typically face scaling challenges due to the sequential nature of their search. While practical engineering solutions can partly overcome this, t…

    Submitted 7 July, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

    Comments: 31 pages, 5 main figures

  2. arXiv:2311.13569

    cs.LG cs.AI

    Combinatorial Optimization with Policy Adaptation using Latent Space Search

    Authors: Felix Chalumeau, Shikha Surana, Clement Bonnet, Nathan Grinsztajn, Arnu Pretorius, Alexandre Laterre, Thomas D. Barrett

    Abstract: Combinatorial Optimization underpins many real-world applications and yet, designing performant algorithms to solve these complex, typically NP-hard, problems remains a significant research challenge. Reinforcement Learning (RL) provides a versatile framework for designing heuristics across a broad spectrum of problem domains. However, despite notable progress, RL has not yet supplanted industrial…

    Submitted 28 May, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Fix typo in formula and add a reference

  3. arXiv:2306.09884

    cs.LG cs.AI

    Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX

    Authors: Clément Bonnet, Daniel Luo, Donal Byrne, Shikha Surana, Sasha Abramowitz, Paul Duckworth, Vincent Coyette, Laurence I. Midgley, Elshadai Tegegn, Tristan Kalloniatis, Omayma Mahjoub, Matthew Macfarlane, Andries P. Smit, Nathan Grinsztajn, Raphael Boige, Cemlyn N. Waters, Mohamed A. Mimouni, Ulrich A. Mbou Sob, Ruan de Kock, Siddarth Singh, Daniel Furelos-Blanco, Victor Le, Arnu Pretorius, Alexandre Laterre

    Abstract: Open-source reinforcement learning (RL) environments have played a crucial role in driving progress in the development of AI algorithms. In modern RL research, there is a need for simulated environments that are performant, scalable, and modular to enable their utilization in a wider range of potential real-world applications. Therefore, we present Jumanji, a suite of diverse RL environments speci…

    Submitted 15 March, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: 9 pages + 21 pages of appendices and references. Published at ICLR 2024

  4. arXiv:2211.10550

    cs.LG cs.AI

    Debiasing Meta-Gradient Reinforcement Learning by Learning the Outer Value Function

    Authors: Clément Bonnet, Laurence Midgley, Alexandre Laterre

    Abstract: Meta-gradient Reinforcement Learning (RL) allows agents to self-tune their hyper-parameters in an online fashion during training. In this paper, we identify a bias in the meta-gradient of current meta-gradient RL approaches. This bias comes from using the critic that is trained using the meta-learned discount factor for the advantage estimation in the outer objective which requires a different dis…

    Submitted 18 November, 2022; originally announced November 2022.

    Comments: Published at the 6th Workshop on Meta-Learning at NeurIPS 2022, New Orleans

  5. arXiv:2205.14345

    cs.LG

    Reinforcement Learning for Branch-and-Bound Optimisation using Retrospective Trajectories

    Authors: Christopher W. F. Parsonson, Alexandre Laterre, Thomas D. Barrett

    Abstract: Combinatorial optimisation problems framed as mixed integer linear programmes (MILPs) are ubiquitous across a range of real-world applications. The canonical branch-and-bound algorithm seeks to exactly solve MILPs by constructing a search tree of increasingly constrained sub-problems. In practice, its solving time performance is dependent on heuristics, such as the choice of the next variable to c…

    Submitted 5 December, 2022; v1 submitted 28 May, 2022; originally announced May 2022.

    Comments: Accepted to AAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence

    Journal ref: AAAI'23: Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

  6. arXiv:2205.14105

    cs.LG cs.AI

    Learning to Solve Combinatorial Graph Partitioning Problems via Efficient Exploration

    Authors: Thomas D. Barrett, Christopher W. F. Parsonson, Alexandre Laterre

    Abstract: From logistics to the natural sciences, combinatorial optimisation on graphs underpins numerous real-world applications. Reinforcement learning (RL) has shown particular promise in this setting as it can adapt to specific problem structures and does not require pre-solved instances for these, often NP-hard, problems. However, state-of-the-art (SOTA) approaches typically suffer from severe scalabil…

    Submitted 27 May, 2022; originally announced May 2022.

  7. arXiv:2111.00206

    cs.LG cs.AI

    One Step at a Time: Pros and Cons of Multi-Step Meta-Gradient Reinforcement Learning

    Authors: Clément Bonnet, Paul Caron, Thomas Barrett, Ian Davies, Alexandre Laterre

    Abstract: Self-tuning algorithms that adapt the learning process online encourage more effective and robust learning. Among all the methods available, meta-gradients have emerged as a promising approach. They leverage the differentiability of the learning rule with respect to some hyper-parameters to adapt them in an online fashion. Although meta-gradients can be accumulated over multiple learning steps to…

    Submitted 30 October, 2021; originally announced November 2021.

    Comments: 14 pages, 6 figures, 2 tables

  8. arXiv:2012.01736

    q-bio.BM cs.LG

    Designing a Prospective COVID-19 Therapeutic with Reinforcement Learning

    Authors: Marcin J. Skwark, Nicolás López Carranza, Thomas Pierrot, Joe Phillips, Slim Said, Alexandre Laterre, Amine Kerkeni, Uğur Şahin, Karim Beguir

    Abstract: The SARS-CoV-2 pandemic has created a global race for a cure. One approach focuses on designing a novel variant of the human angiotensin-converting enzyme 2 (ACE2) that binds more tightly to the SARS-CoV-2 spike protein and diverts it from human cells. Here we formulate a novel protein design framework as a reinforcement learning problem. We generate new designs efficiently through the combination…

    Submitted 3 December, 2020; originally announced December 2020.

  9. arXiv:2011.14379

    cs.LG

    Offline Reinforcement Learning Hands-On

    Authors: Louis Monier, Jakub Kmec, Alexandre Laterre, Thomas Pierrot, Valentin Courgeau, Olivier Sigaud, Karim Beguir

    Abstract: Offline Reinforcement Learning (RL) aims to turn large datasets into powerful decision-making engines without any online interactions with the environment. This great promise has motivated a large amount of research that hopes to replicate the success RL has experienced in simulation settings. This work ambitions to reflect upon these efforts from a practitioner viewpoint. We start by discussing t…

    Submitted 29 November, 2020; originally announced November 2020.

    Comments: Accepted at NeurIPS 2020 Offline Reinforcement Learning Workshop. First two authors contributed equally. Authors three and four advised equally

  10. arXiv:2010.07777

    cs.LG cs.GT cs.MA

    A game-theoretic analysis of networked system control for common-pool resource management using multi-agent reinforcement learning

    Authors: Arnu Pretorius, Scott Cameron, Elan van Biljon, Tom Makkink, Shahil Mawjee, Jeremy du Plessis, Jonathan Shock, Alexandre Laterre, Karim Beguir

    Abstract: Multi-agent reinforcement learning has recently shown great promise as an approach to networked system control. Arguably, one of the most difficult and important tasks for which large scale networked system control is applicable is common-pool resource management. Crucial common-pool resources include arable land, fresh water, wetlands, wildlife, fish stock, forests and the atmosphere, of which pr…

    Submitted 15 October, 2020; originally announced October 2020.

    Comments: 17 pages, 16 Figures, to appear in Advances of Neural Information Processing Systems (NeurIPS) conference, 2020

  11. arXiv:2007.13363

    cs.AI

    Learning Compositional Neural Programs for Continuous Control

    Authors: Thomas Pierrot, Nicolas Perrin, Feryal Behbahani, Alexandre Laterre, Olivier Sigaud, Karim Beguir, Nando de Freitas

    Abstract: We propose a novel solution to challenging sparse-reward, continuous control problems that require hierarchical planning at multiple levels of abstraction. Our solution, dubbed AlphaNPI-X, involves three separate stages of learning. First, we use off-policy reinforcement learning algorithms with experience replay to learn a set of atomic goal-conditioned policies, which can be easily repurposed fo…

    Submitted 13 April, 2021; v1 submitted 27 July, 2020; originally announced July 2020.

  12. arXiv:1905.12941

    cs.AI

    Learning Compositional Neural Programs with Recursive Tree Search and Planning

    Authors: Thomas Pierrot, Guillaume Ligner, Scott Reed, Olivier Sigaud, Nicolas Perrin, Alexandre Laterre, David Kas, Karim Beguir, Nando de Freitas

    Abstract: We propose a novel reinforcement learning algorithm, AlphaNPI, that incorporates the strengths of Neural Programmer-Interpreters (NPI) and AlphaZero. NPI contributes structural biases in the form of modularity, hierarchy and recursion, which are helpful to reduce sample complexity, improve generalization and increase interpretability. AlphaZero contributes powerful neural network guided search alg…

    Submitted 13 April, 2021; v1 submitted 30 May, 2019; originally announced May 2019.

  13. arXiv:1807.01672

    cs.LG cs.AI stat.ML

    Ranked Reward: Enabling Self-Play Reinforcement Learning for Combinatorial Optimization

    Authors: Alexandre Laterre, Yunguan Fu, Mohamed Khalil Jabri, Alain-Sam Cohen, David Kas, Karl Hajjar, Torbjorn S. Dahl, Amine Kerkeni, Karim Beguir

    Abstract: Adversarial self-play in two-player games has delivered impressive results when used with reinforcement learning algorithms that combine deep neural networks and tree search. Algorithms like AlphaZero and Expert Iteration learn tabula-rasa, producing highly informative training data on the fly. However, the self-play training strategy is not directly applicable to single-player games. Recently, se…

    Submitted 6 December, 2018; v1 submitted 4 July, 2018; originally announced July 2018.

    Journal ref: Presented at the Thirty-second Conference on Neural Information Processing Systems (NeurIPS 2018), Deep Reinforcement Learning Workshop, Montreal, Canada, December 3-8, 2018