Showing 1–11 of 11 results for author: Dalibard, V

  1. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2306.11706  [pdf, other]

    cs.RO cs.LG

    RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation

    Authors: Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauza, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Żołna, Scott Reed, Sergio Gómez Colmenarejo, Jon Scholz, et al. (14 additional authors not shown)

    Abstract: The ability to leverage heterogeneous robotic experience from different robots and tasks to quickly master novel skills and embodiments has the potential to transform robot learning. Inspired by recent advances in foundation models for vision and language, we propose a multi-embodiment, multi-task generalist agent for robotic manipulation. This agent, named RoboCat, is a visual goal-conditioned de…

    Submitted 22 December, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Transactions on Machine Learning Research (12/2023)

  3. arXiv:2304.03995  [pdf, other]

    cs.NE cs.LG

    Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization

    Authors: Robert Tjarko Lange, Tom Schaul, Yutian Chen, Chris Lu, Tom Zahavy, Valentin Dalibard, Sebastian Flennerhag

    Abstract: Genetic algorithms constitute a family of black-box optimization algorithms, which take inspiration from the principles of biological evolution. While they provide a general-purpose tool for optimization, their particular instantiations can be heuristic and motivated by loose biological intuition. In this work we explore a fundamentally different approach: Given a sufficiently flexible parametriza…

    Submitted 8 April, 2023; originally announced April 2023.

    Comments: 14 pages, 31 figures

  4. arXiv:2110.01765  [pdf, other]

    cs.LG cs.AI cs.NE

    Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping

    Authors: James Martens, Andy Ballard, Guillaume Desjardins, Grzegorz Swirszcz, Valentin Dalibard, Jascha Sohl-Dickstein, Samuel S. Schoenholz

    Abstract: Using an extended and formalized version of the Q/C map analysis of Poole et al. (2016), along with Neural Tangent Kernel theory, we identify the main pathologies present in deep networks that prevent them from training fast and generalizing to unseen data, and show how these can be avoided by carefully controlling the "shape" of the network's initialization-time kernel function. We then develop a…

    Submitted 4 October, 2021; originally announced October 2021.

  5. arXiv:2109.13800  [pdf, other]

    cs.NE stat.ML

    Faster Improvement Rate Population Based Training

    Authors: Valentin Dalibard, Max Jaderberg

    Abstract: The successful training of neural networks typically involves careful and time consuming hyperparameter tuning. Population Based Training (PBT) has recently been proposed to automate this process. PBT trains a population of neural networks concurrently, frequently mutating their hyperparameters throughout their training. However, the decision mechanisms of PBT are greedy and favour short-term impr…

    Submitted 28 September, 2021; originally announced September 2021.

    Comments: 9 pages, 5 figures

  6. arXiv:2107.12808  [pdf, other]

    cs.LG cs.AI cs.MA

    Open-Ended Learning Leads to Generally Capable Agents

    Authors: Open Ended Learning Team, Adam Stooke, Anuj Mahajan, Catarina Barros, Charlie Deck, Jakob Bauer, Jakub Sygnowski, Maja Trebacz, Max Jaderberg, Michael Mathieu, Nat McAleese, Nathalie Bradley-Schmieg, Nathaniel Wong, Nicolas Porcel, Roberta Raileanu, Steph Hughes-Fitt, Valentin Dalibard, Wojciech Marian Czarnecki

    Abstract: In this work we create agents that can perform well beyond a single, individual task, that exhibit much wider generalisation of behaviour to a massive, rich space of challenges. We define a universe of tasks within an environment domain and demonstrate the ability to train agents that are generally capable across this vast space and beyond. The environment is natively multi-agent, spanning the con…

    Submitted 31 July, 2021; v1 submitted 27 July, 2021; originally announced July 2021.

  7. arXiv:2006.15223  [pdf, other]

    cs.AI cs.LG

    Perception-Prediction-Reaction Agents for Deep Reinforcement Learning

    Authors: Adam Stooke, Valentin Dalibard, Siddhant M. Jayakumar, Wojciech M. Czarnecki, Max Jaderberg

    Abstract: We introduce a new recurrent agent architecture and associated auxiliary losses which improve reinforcement learning in partially observable tasks requiring long-term memory. We employ a temporal hierarchy, using a slow-ticking recurrent core to allow information to flow more easily over long time spans, and three fast-ticking recurrent cores with connections designed to create an information asym…

    Submitted 26 June, 2020; originally announced June 2020.

  8. arXiv:1902.01894  [pdf, other]

    cs.AI cs.DC cs.LG cs.NE

    A Generalized Framework for Population Based Training

    Authors: Ang Li, Aleksandra Spyra, Sagi Perel, Valentin Dalibard, Max Jaderberg, Chenjie Gu, David Budden, Tim Harley, Pramod Gupta

    Abstract: Population Based Training (PBT) is a recent approach that jointly optimizes neural network weights and hyperparameters which periodically copies weights of the best performers and mutates hyperparameters during training. Previous PBT implementations have been synchronized glass-box systems. We propose a general, black-box PBT framework that distributes many asynchronous "trials" (a small number of…

    Submitted 5 February, 2019; originally announced February 2019.

    Comments: 9 pages

  9. arXiv:1711.09846  [pdf, other]

    cs.LG cs.NE

    Population Based Training of Neural Networks

    Authors: Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, Chrisantha Fernando, Koray Kavukcuoglu

    Abstract: Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm. In this work we present Population Based Training (PBT), a simple asynchronous optimisation algorithm which effectively utilises a fixed computational budget…

    Submitted 28 November, 2017; v1 submitted 27 November, 2017; originally announced November 2017.

  10. arXiv:1612.00383  [pdf, other]

    stat.ML cs.LG

    Tuning the Scheduling of Distributed Stochastic Gradient Descent with Bayesian Optimization

    Authors: Valentin Dalibard, Michael Schaarschmidt, Eiko Yoneki

    Abstract: We present an optimizer which uses Bayesian optimization to tune the system parameters of distributed stochastic gradient descent (SGD). Given a specific context, our goal is to quickly find efficient configurations which appropriately balance the load between the available machines to minimize the average SGD iteration time. Our experiments consider setups with over thirty parameters. Traditional…

    Submitted 1 December, 2016; originally announced December 2016.

  11. arXiv:1610.09903  [pdf, other]

    cs.LG

    Learning Runtime Parameters in Computer Systems with Delayed Experience Injection

    Authors: Michael Schaarschmidt, Felix Gessert, Valentin Dalibard, Eiko Yoneki

    Abstract: Learning effective configurations in computer systems without hand-crafting models for every parameter is a long-standing problem. This paper investigates the use of deep reinforcement learning for runtime parameters of cloud databases under latency constraints. Cloud services serve up to thousands of concurrent requests per second and can adjust critical parameters by leveraging performance metri…

    Submitted 31 October, 2016; originally announced October 2016.

    Comments: Deep Reinforcement Learning Workshop, NIPS 2016

    ACM Class: I.2.6; H.2.4