Showing 1–14 of 14 results for author: Perrin, S

Searching in archive cs.
  1. arXiv:2408.07785

    cs.CV cs.DL

    NeuroPapyri: A Deep Attention Embedding Network for Handwritten Papyri Retrieval

    Authors: Giuseppe De Gregorio, Simon Perrin, Rodrigo C. G. Pena, Isabelle Marthot-Santaniello, Harold Mouchère

    Abstract: The intersection of computer vision and machine learning has emerged as a promising avenue for advancing historical research, facilitating a more profound exploration of our past. However, the application of machine learning approaches in historical palaeography is often met with criticism due to their perceived "black box" nature. In response to this challenge, we introduce NeuroPapyri, an inno…

    Submitted 14 August, 2024; originally announced August 2024.

  2. arXiv:2408.00118

    cs.CL cs.AI

    Gemma 2: Improving Open Language Models at a Practical Size

    Authors: Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, Johan Ferret, Peter Liu, Pouya Tafti, Abe Friesen, Michelle Casbon, Sabela Ramos, Ravin Kumar, Charline Le Lan, Sammy Jerome, Anton Tsitsulin, Nino Vieillard, Piotr Stanczyk, Sertan Girgin, Nikola Momchev, Matt Hoffman, et al. (172 additional authors not shown)

    Abstract: In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters. In this new version, we apply several known technical modifications to the Transformer architecture, such as interleaving local-global attentions (Beltagy et al., 2020a) and group-query attention (Ainslie et al., 2023). We al…

    Submitted 2 August, 2024; v1 submitted 31 July, 2024; originally announced August 2024.

  3. arXiv:2407.14622

    cs.LG cs.AI cs.CL

    BOND: Aligning LLMs with Best-of-N Distillation

    Authors: Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot, Johan Ferret, Nino Vieillard, Alexandre Ramé, Bobak Shariari, Sarah Perrin, Abe Friesen, Geoffrey Cideron, Sertan Girgin, Piotr Stanczyk, Andrea Michi, Danila Sinopalnikov, Sabela Ramos, Amélie Héliou, Aliaksei Severyn, Matt Hoffman, Nikola Momchev, Olivier Bachem

    Abstract: Reinforcement learning from human feedback (RLHF) is a key driver of quality and safety in state-of-the-art large language models. Yet, a surprisingly simple and strong inference-time strategy is Best-of-N sampling that selects the best generation among N candidates. In this paper, we propose Best-of-N Distillation (BOND), a novel RLHF algorithm that seeks to emulate Best-of-N but without its sign…

    Submitted 19 July, 2024; originally announced July 2024.

  4. arXiv:2402.03928

    cs.GT cs.MA

    Approximating the Core via Iterative Coalition Sampling

    Authors: Ian Gemp, Marc Lanctot, Luke Marris, Yiran Mao, Edgar Duéñez-Guzmán, Sarah Perrin, Andras Gyorgy, Romuald Elie, Georgios Piliouras, Michael Kaisers, Daniel Hennes, Kalesha Bullard, Kate Larson, Yoram Bachrach

    Abstract: The core is a central solution concept in cooperative game theory, defined as the set of feasible allocations or payments such that no subset of agents has incentive to break away and form their own subgroup or coalition. However, it has long been known that the core (and approximations, such as the least-core) are hard to compute. This limits our ability to analyze cooperative games in general, a…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: Published in AAMAS 2024

  5. arXiv:2208.10138

    cs.GT stat.ML

    Learning Correlated Equilibria in Mean-Field Games

    Authors: Paul Muller, Romuald Elie, Mark Rowland, Mathieu Lauriere, Julien Perolat, Sarah Perrin, Matthieu Geist, Georgios Piliouras, Olivier Pietquin, Karl Tuyls

    Abstract: The designs of many large-scale systems today, from traffic routing environments to smart grids, rely on game-theoretic equilibrium concepts. However, as the size of an $N$-player game typically grows exponentially with $N$, standard game theoretic analysis becomes effectively infeasible beyond a low number of players. Recent approaches have gone around this limitation by instead considering Mean-…

    Submitted 22 August, 2022; originally announced August 2022.

  6. arXiv:2205.12944

    cs.LG cs.AI cs.GT math.OC

    Learning in Mean Field Games: A Survey

    Authors: Mathieu Laurière, Sarah Perrin, Julien Pérolat, Sertan Girgin, Paul Muller, Romuald Élie, Matthieu Geist, Olivier Pietquin

    Abstract: Non-cooperative and cooperative games with a very large number of players have many applications but remain generally intractable when the number of players increases. Introduced by Lasry and Lions, and Huang, Caines and Malhamé, Mean Field Games (MFGs) rely on a mean-field approximation to allow the number of players to grow to infinity. Traditional methods for solving these games generally rely…

    Submitted 26 July, 2024; v1 submitted 25 May, 2022; originally announced May 2022.

  7. arXiv:2203.11973

    cs.LG math.OC stat.ML

    Scalable Deep Reinforcement Learning Algorithms for Mean Field Games

    Authors: Mathieu Laurière, Sarah Perrin, Sertan Girgin, Paul Muller, Ayush Jain, Theophile Cabannes, Georgios Piliouras, Julien Pérolat, Romuald Élie, Olivier Pietquin, Matthieu Geist

    Abstract: Mean Field Games (MFGs) have been introduced to efficiently approximate games with very large populations of strategic agents. Recently, the question of learning equilibria in MFGs has gained momentum, particularly using model-free reinforcement learning (RL) methods. One limiting factor to further scale up using RL is that existing algorithms to solve MFGs require the mixing of approximated quant…

    Submitted 17 June, 2022; v1 submitted 22 March, 2022; originally announced March 2022.

  8. arXiv:2110.11943

    math.DS cs.MA cs.NI eess.SY math.OC

    Solving N-player dynamic routing games with congestion: a mean field approach

    Authors: Theophile Cabannes, Mathieu Lauriere, Julien Perolat, Raphael Marinier, Sertan Girgin, Sarah Perrin, Olivier Pietquin, Alexandre M. Bayen, Eric Goubault, Romuald Elie

    Abstract: The recent emergence of navigational tools has changed traffic patterns and has now enabled new types of congestion-aware routing control like dynamic road pricing. Using the fundamental diagram of traffic flows - applied in macroscopic and mesoscopic traffic modeling - the article introduces a new N-player dynamic routing game with explicit congestion dynamics. The model is well-posed and can rep…

    Submitted 27 October, 2021; v1 submitted 22 October, 2021; originally announced October 2021.

  9. arXiv:2109.09717

    cs.LG cs.GT cs.MA math.OC

    Generalization in Mean Field Games by Learning Master Policies

    Authors: Sarah Perrin, Mathieu Laurière, Julien Pérolat, Romuald Élie, Matthieu Geist, Olivier Pietquin

    Abstract: Mean Field Games (MFGs) can potentially scale multi-agent systems to extremely large populations of agents. Yet, most of the literature assumes a single initial distribution for the agents, which limits the practical applications of MFGs. Machine Learning has the potential to solve a wider diversity of MFG problems thanks to generalization capacities. We study how to leverage these generalization…

    Submitted 20 September, 2021; originally announced September 2021.

  10. arXiv:2106.03787

    cs.LG cs.MA

    Concave Utility Reinforcement Learning: the Mean-Field Game Viewpoint

    Authors: Matthieu Geist, Julien Pérolat, Mathieu Laurière, Romuald Elie, Sarah Perrin, Olivier Bachem, Rémi Munos, Olivier Pietquin

    Abstract: Concave Utility Reinforcement Learning (CURL) extends RL from linear to concave utilities in the occupancy measure induced by the agent's policy. This encompasses not only RL but also imitation learning and exploration, among others. Yet, this more general paradigm invalidates the classical Bellman equations, and calls for new algorithms. Mean-field Games (MFGs) are a continuous approximation of m…

    Submitted 16 February, 2022; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: AAMAS 2022

  11. arXiv:2105.07933

    cs.MA cs.AI

    Mean Field Games Flock! The Reinforcement Learning Way

    Authors: Sarah Perrin, Mathieu Laurière, Julien Pérolat, Matthieu Geist, Romuald Élie, Olivier Pietquin

    Abstract: We present a method enabling a large number of agents to learn how to flock, which is a natural behavior observed in large populations of animals. This problem has drawn a lot of interest but requires many structural assumptions and is tractable only in small dimensions. We phrase this problem as a Mean Field Game (MFG), where each individual chooses its acceleration depending on the population be…

    Submitted 17 May, 2021; originally announced May 2021.

  12. arXiv:2103.00623

    cs.AI

    Scaling up Mean Field Games with Online Mirror Descent

    Authors: Julien Perolat, Sarah Perrin, Romuald Elie, Mathieu Laurière, Georgios Piliouras, Matthieu Geist, Karl Tuyls, Olivier Pietquin

    Abstract: We address scaling up equilibrium computation in Mean Field Games (MFGs) using Online Mirror Descent (OMD). We show that continuous-time OMD provably converges to a Nash equilibrium under a natural and well-motivated set of monotonicity assumptions. This theoretical result nicely extends to multi-population games and to settings involving common noise. A thorough experimental investigation on vari…

    Submitted 28 February, 2021; originally announced March 2021.

  13. arXiv:2007.03458

    math.OC cs.AI

    Fictitious Play for Mean Field Games: Continuous Time Analysis and Applications

    Authors: Sarah Perrin, Julien Perolat, Mathieu Laurière, Matthieu Geist, Romuald Elie, Olivier Pietquin

    Abstract: In this paper, we deepen the analysis of continuous time Fictitious Play learning algorithm to the consideration of various finite state Mean Field Game settings (finite horizon, $γ$-discounted), allowing in particular for the introduction of an additional common noise. We first present a theoretical convergence analysis of the continuous time Fictitious Play process and prove that the induced e…

    Submitted 26 October, 2020; v1 submitted 5 July, 2020; originally announced July 2020.

  14. Decision method choice in a human posture recognition context

    Authors: Stéphane Perrin, Eric Benoit, Didier Coquin

    Abstract: Human posture recognition provides a dynamic field that has produced many methods. Using fuzzy subsets based data fusion methods to aggregate the results given by different types of recognition processes is a convenient way to improve recognition methods. Nevertheless, choosing a defuzzification method to implement the decision is a crucial point of this approach. The goal of this paper is to pre…

    Submitted 11 July, 2018; originally announced July 2018.

    Journal ref: Human-Computer Systems Interaction. Backgrounds and Applications 4, 4, 2018