Showing 1–8 of 8 results for author: Micheli, V

  1. arXiv:2406.19320

    cs.LG cs.AI cs.CV

    Efficient World Models with Context-Aware Tokenization

    Authors: Vincent Micheli, Eloi Alonso, François Fleuret

    Abstract: Scaling up deep Reinforcement Learning (RL) methods presents a significant challenge. Following developments in generative modelling, model-based RL positions itself as a strong contender. Recent advances in sequence modelling have led to effective transformer-based world models, albeit at the price of heavy computations due to the long sequences of tokens required to accurately simulate environme…

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  2. arXiv:2405.12399

    cs.LG cs.AI cs.CV

    Diffusion for World Modeling: Visual Details Matter in Atari

    Authors: Eloi Alonso, Adam Jelley, Vincent Micheli, Anssi Kanervisto, Amos Storkey, Tim Pearce, François Fleuret

    Abstract: World models constitute a promising approach for training reinforcement learning agents in a safe and sample-efficient manner. Recent world models predominantly operate on sequences of discrete latent variables to model environment dynamics. However, this compression into a compact discrete representation may ignore visual details that are important for reinforcement learning. Concurrently, diffus…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: 25 pages, 11 figures, 10 tables

  3. arXiv:2209.00588

    cs.LG cs.AI cs.CV

    Transformers are Sample-Efficient World Models

    Authors: Vincent Micheli, Eloi Alonso, François Fleuret

    Abstract: Deep reinforcement learning agents are notoriously sample inefficient, which considerably limits their application to real-world problems. Recently, many model-based methods have been designed to address this issue, with learning in the imagination of a world model being one of the most prominent approaches. However, while virtually unlimited interaction with a simulated environment sounds appeali…

    Submitted 1 March, 2023; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: ICLR 2023 (notable top 5%)

  4. arXiv:2202.10583

    cs.LG cs.AI

    MineRL Diamond 2021 Competition: Overview, Results, and Lessons Learned

    Authors: Anssi Kanervisto, Stephanie Milani, Karolis Ramanauskas, Nicholay Topin, Zichuan Lin, Junyou Li, Jianing Shi, Deheng Ye, Qiang Fu, Wei Yang, Weijun Hong, Zhongyue Huang, Haicheng Chen, Guangjun Zeng, Yue Lin, Vincent Micheli, Eloi Alonso, François Fleuret, Alexander Nikulin, Yury Belousov, Oleg Svidchenko, Aleksei Shpilman

    Abstract: Reinforcement learning competitions advance the field by providing appropriate scope and support to develop solutions toward a specific problem. To promote the development of more broadly applicable methods, organizers need to enforce the use of general techniques, the use of sample-efficient methods, and the reproducibility of the results. While beneficial for the research community, these restri…

    Submitted 17 February, 2022; originally announced February 2022.

    Comments: Under review for PMLR volume on NeurIPS 2021 competitions

  5. arXiv:2104.07972

    cs.CL cs.LG

    Language Models are Few-Shot Butlers

    Authors: Vincent Micheli, François Fleuret

    Abstract: Pretrained language models demonstrate strong performance in most NLP tasks when fine-tuned on small task-specific datasets. Hence, these autoregressive models constitute ideal agents to operate in text-based environments where language understanding and generative capabilities are essential. Nonetheless, collecting expert demonstrations in such environments is a time-consuming endeavour. We intro…

    Submitted 20 September, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: EMNLP 2021

  6. arXiv:2104.06045

    cs.CL cs.LG

    Structural analysis of an all-purpose question answering model

    Authors: Vincent Micheli, Quentin Heinrich, François Fleuret, Wacim Belblidia

    Abstract: Attention is a key component of the now ubiquitous pre-trained language models. By learning to focus on relevant pieces of information, these Transformer-based architectures have proven capable of tackling several tasks at once and sometimes even surpass their single-task counterparts. To better understand this phenomenon, we conduct a structural analysis of a new all-purpose question answering mo…

    Submitted 13 April, 2021; originally announced April 2021.

  7. arXiv:2010.03813

    cs.CL cs.LG

    On the importance of pre-training data volume for compact language models

    Authors: Vincent Micheli, Martin d'Hoffschmidt, François Fleuret

    Abstract: Recent advances in language modeling have led to computationally intensive and resource-demanding state-of-the-art models. In an effort towards sustainable practices, we study the impact of pre-training data volume on compact language models. Multiple BERT-based models are trained on gradually increasing amounts of French text. Through fine-tuning on the French Question Answering Dataset (FQuAD),…

    Submitted 9 October, 2020; v1 submitted 8 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020; typo corrected

  8. arXiv:2002.03240

    cs.LG stat.ML

    Multi-task Reinforcement Learning with a Planning Quasi-Metric

    Authors: Vincent Micheli, Karthigan Sinnathamby, François Fleuret

    Abstract: We introduce a new reinforcement learning approach combining a planning quasi-metric (PQM) that estimates the number of steps required to go from any state to another, with task-specific "aimers" that compute a target state to reach a given goal. This decomposition allows the sharing across tasks of a task-agnostic model of the quasi-metric that captures the environment's dynamics and can be learn…

    Submitted 5 December, 2020; v1 submitted 8 February, 2020; originally announced February 2020.

    Comments: Deep RL Workshop, NeurIPS 2020