Showing 1–20 of 20 results for author: Matthey, L

Searching in archive cs.
  1. arXiv:2404.10179  [pdf, other]

    cs.RO cs.AI cs.HC cs.LG

    Scaling Instructable Agents Across Many Simulated Worlds

    Authors: SIMA Team, Maria Abi Raad, Arun Ahuja, Catarina Barros, Frederic Besse, Andrew Bolt, Adrian Bolton, Bethanie Brownfield, Gavin Buttimore, Max Cant, Sarah Chakera, Stephanie C. Y. Chan, Jeff Clune, Adrian Collister, Vikki Copeman, Alex Cullum, Ishita Dasgupta, Dario de Cesare, Julia Di Trapani, Yani Donchev, Emma Dunleavy, Martin Engelcke, Ryan Faulkner, Frankie Garcia, Charles Gbadamosi, et al. (68 additional authors not shown)

    Abstract: Building embodied AI systems that can follow arbitrary language instructions in any 3D environment is a key challenge for creating general AI. Accomplishing this goal requires learning to ground language in perception and embodied actions, in order to accomplish complex tasks. The Scalable, Instructable, Multiworld Agent (SIMA) project tackles this by training agents to follow free-form instructio…

    Submitted 17 April, 2024; v1 submitted 13 March, 2024; originally announced April 2024.

  2. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  3. arXiv:2311.17901  [pdf, other]

    cs.CV cs.AI cs.LG

    SODA: Bottleneck Diffusion Models for Representation Learning

    Authors: Drew A. Hudson, Daniel Zoran, Mateusz Malinowski, Andrew K. Lampinen, Andrew Jaegle, James L. McClelland, Loic Matthey, Felix Hill, Alexander Lerchner

    Abstract: We introduce SODA, a self-supervised diffusion model, designed for representation learning. The model incorporates an image encoder, which distills a source view into a compact representation, that, in turn, guides the generation of related novel views. We show that by imposing a tight bottleneck between the encoder and a denoising decoder, and leveraging novel view synthesis as a self-supervised…

    Submitted 29 November, 2023; originally announced November 2023.

  4. arXiv:2311.17851  [pdf, other]

    cs.CV

    Leveraging VLM-Based Pipelines to Annotate 3D Objects

    Authors: Rishabh Kabra, Loic Matthey, Alexander Lerchner, Niloy J. Mitra

    Abstract: Pretrained vision language models (VLMs) present an opportunity to caption unlabeled 3D objects at scale. The leading approach to summarize VLM descriptions from different views of an object (Luo et al., 2023) relies on a language model (GPT4) to produce the final output. This text-based aggregation is susceptible to hallucinations as it merges potentially contradictory descriptions. We propose an…

    Submitted 17 June, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

  5. arXiv:2310.15940  [pdf, other]

    cs.AI cs.LG

    Combining Behaviors with the Successor Features Keyboard

    Authors: Wilka Carvalho, Andre Saraiva, Angelos Filos, Andrew Kyle Lampinen, Loic Matthey, Richard L. Lewis, Honglak Lee, Satinder Singh, Danilo J. Rezende, Daniel Zoran

    Abstract: The Option Keyboard (OK) was recently proposed as a method for transferring behavioral knowledge across tasks. OK transfers knowledge by adaptively combining subsets of known behaviors using Successor Features (SFs) and Generalized Policy Improvement (GPI). However, it relies on hand-designed state-features and task encodings which are cumbersome to design for every new environment. In this work,…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  6. arXiv:2107.11153  [pdf, other]

    cs.LG cs.AI stat.ML

    Constellation: Learning relational abstractions over objects for compositional imagination

    Authors: James C. R. Whittington, Rishabh Kabra, Loic Matthey, Christopher P. Burgess, Alexander Lerchner

    Abstract: Learning structured representations of visual scenes is currently a major bottleneck to bridging perception with reasoning. While there has been exciting progress with slot-based models, which learn to segment scenes into sets of objects, learning configurational properties of entire groups of objects is still under-explored. To address this problem, we introduce Constellation, a network that lear…

    Submitted 23 July, 2021; originally announced July 2021.

  7. arXiv:2106.03849  [pdf, other]

    cs.CV cs.LG

    SIMONe: View-Invariant, Temporally-Abstracted Object Representations via Unsupervised Video Decomposition

    Authors: Rishabh Kabra, Daniel Zoran, Goker Erdogan, Loic Matthey, Antonia Creswell, Matthew Botvinick, Alexander Lerchner, Christopher P. Burgess

    Abstract: To help agents reason about scenes in terms of their building blocks, we wish to extract the compositional structure of any given scene (in particular, the configuration and characteristics of objects comprising the scene). This problem is especially difficult when scene structure needs to be inferred while also estimating the agent's location/viewpoint, as the two variables jointly give rise to t…

    Submitted 6 December, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: Animated figures are available at https://sites.google.com/view/simone-scene-understanding/

  8. arXiv:2102.02926  [pdf, other]

    cs.LG cs.AI

    Alchemy: A benchmark and analysis toolkit for meta-reinforcement learning agents

    Authors: Jane X. Wang, Michael King, Nicolas Porcel, Zeb Kurth-Nelson, Tina Zhu, Charlie Deck, Peter Choy, Mary Cassin, Malcolm Reynolds, Francis Song, Gavin Buttimore, David P. Reichert, Neil Rabinowitz, Loic Matthey, Demis Hassabis, Alexander Lerchner, Matthew Botvinick

    Abstract: There has been rapidly growing interest in meta-learning as a method for increasing the flexibility and sample efficiency of reinforcement learning. One problem in this area of research, however, has been a scarcity of adequate benchmark tasks. In general, the structure underlying past benchmarks has either been too simple to be inherently interesting, or too ill-defined to support principled anal…

    Submitted 20 October, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: Published in Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 2021

  9. arXiv:2007.08973  [pdf, other]

    cs.CV cs.AI cs.LG

    AlignNet: Unsupervised Entity Alignment

    Authors: Antonia Creswell, Kyriacos Nikiforou, Oriol Vinyals, Andre Saraiva, Rishabh Kabra, Loic Matthey, Chris Burgess, Malcolm Reynolds, Richard Tanburn, Marta Garnelo, Murray Shanahan

    Abstract: Recently developed deep learning models are able to learn to segment scenes into component objects without supervision. This opens many new and exciting avenues of research, allowing agents to take objects (or entities) as inputs, rather than pixels. Unfortunately, while these models provide excellent segmentation of a single frame, they do not keep track of how objects segmented at one time-step…

    Submitted 21 July, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

  10. arXiv:1905.12614  [pdf, other]

    cs.LG stat.ML

    Unsupervised Model Selection for Variational Disentangled Representation Learning

    Authors: Sunny Duan, Loic Matthey, Andre Saraiva, Nicholas Watters, Christopher P. Burgess, Alexander Lerchner, Irina Higgins

    Abstract: Disentangled representations have recently been shown to improve fairness, data efficiency and generalisation in simple supervised and reinforcement learning tasks. To extend the benefits of disentangled representations to more complex domains and practical applications, it is important to enable hyperparameter tuning and model selection of existing unsupervised approaches without requiring access…

    Submitted 14 February, 2020; v1 submitted 29 May, 2019; originally announced May 2019.

  11. arXiv:1905.09275  [pdf, other]

    cs.LG cs.AI

    COBRA: Data-Efficient Model-Based RL through Unsupervised Object Discovery and Curiosity-Driven Exploration

    Authors: Nicholas Watters, Loic Matthey, Matko Bosnjak, Christopher P. Burgess, Alexander Lerchner

    Abstract: Data efficiency and robustness to task-irrelevant perturbations are long-standing challenges for deep reinforcement learning algorithms. Here we introduce a modular approach to addressing these challenges in a continuous control environment, without using hand-crafted or supervised information. Our Curious Object-Based seaRch Agent (COBRA) uses task-free intrinsically motivated exploration and uns…

    Submitted 14 August, 2019; v1 submitted 22 May, 2019; originally announced May 2019.

  12. arXiv:1903.00450  [pdf, other]

    cs.LG cs.CV stat.ML

    Multi-Object Representation Learning with Iterative Variational Inference

    Authors: Klaus Greff, Raphaël Lopez Kaufman, Rishabh Kabra, Nick Watters, Chris Burgess, Daniel Zoran, Loic Matthey, Matthew Botvinick, Alexander Lerchner

    Abstract: Human perception is structured around objects which form the basis for our higher-level cognition and impressive systematic generalization abilities. Yet most work on representation learning focuses on feature learning without even considering multiple objects, or treats segmentation as an (often supervised) preprocessing step. Instead, we argue for the importance of learning to segment and repres…

    Submitted 27 July, 2020; v1 submitted 1 March, 2019; originally announced March 2019.

    Journal ref: ICML 2019 (PMLR 97:2424-2433)

  13. arXiv:1901.11390  [pdf, other]

    cs.CV cs.LG stat.ML

    MONet: Unsupervised Scene Decomposition and Representation

    Authors: Christopher P. Burgess, Loic Matthey, Nicholas Watters, Rishabh Kabra, Irina Higgins, Matt Botvinick, Alexander Lerchner

    Abstract: The ability to decompose scenes in terms of abstract building blocks is crucial for general intelligence. Where those basic building blocks share meaningful properties, interactions and other regularities across scenes, such decompositions can simplify reasoning and facilitate imagination of novel scenarios. In particular, representing perceptual observations in terms of entities should improve da…

    Submitted 22 January, 2019; originally announced January 2019.

  14. arXiv:1901.07017  [pdf, other]

    cs.LG cs.CV stat.ML

    Spatial Broadcast Decoder: A Simple Architecture for Learning Disentangled Representations in VAEs

    Authors: Nicholas Watters, Loic Matthey, Christopher P. Burgess, Alexander Lerchner

    Abstract: We present a simple neural rendering architecture that helps variational autoencoders (VAEs) learn disentangled representations. Instead of the deconvolutional network typically used in the decoder of VAEs, we tile (broadcast) the latent vector across space, concatenate fixed X- and Y-"coordinate" channels, and apply a fully convolutional network with 1x1 stride. This provides an architectural pri…

    Submitted 14 August, 2019; v1 submitted 21 January, 2019; originally announced January 2019.

  15. arXiv:1812.02230  [pdf, other]

    cs.LG stat.ML

    Towards a Definition of Disentangled Representations

    Authors: Irina Higgins, David Amos, David Pfau, Sebastien Racaniere, Loic Matthey, Danilo Rezende, Alexander Lerchner

    Abstract: How can intelligent agents solve a diverse set of tasks in a data-efficient manner? The disentangled representation learning approach posits that such an agent would benefit from separating out (disentangling) the underlying structure of the world into disjoint parts of its representation. However, there is no generally agreed-upon definition of disentangling, not least because it is unclear how t…

    Submitted 5 December, 2018; originally announced December 2018.

  16. arXiv:1808.06508  [pdf, other]

    cs.LG stat.ML

    Life-Long Disentangled Representation Learning with Cross-Domain Latent Homologies

    Authors: Alessandro Achille, Tom Eccles, Loic Matthey, Christopher P. Burgess, Nick Watters, Alexander Lerchner, Irina Higgins

    Abstract: Intelligent behaviour in the real-world requires the ability to acquire new knowledge from an ongoing sequence of experiences while preserving and reusing past knowledge. We propose a novel algorithm for unsupervised representation learning from piece-wise stationary visual data: Variational Autoencoder with Shared Embeddings (VASE). Based on the Minimum Description Length principle, VASE automati…

    Submitted 20 August, 2018; originally announced August 2018.

  17. arXiv:1804.03599  [pdf, other]

    stat.ML cs.AI cs.LG

    Understanding disentangling in $β$-VAE

    Authors: Christopher P. Burgess, Irina Higgins, Arka Pal, Loic Matthey, Nick Watters, Guillaume Desjardins, Alexander Lerchner

    Abstract: We present new intuitions and theoretical assessments of the emergence of disentangled representation in variational autoencoders. Taking a rate-distortion theory perspective, we show the circumstances under which representations aligned with the underlying generative factors of variation of data emerge when optimising the modified ELBO bound in $β$-VAE, as training progresses. From these insights…

    Submitted 10 April, 2018; originally announced April 2018.

    Comments: Presented at the 2017 NIPS Workshop on Learning Disentangled Representations

  18. arXiv:1707.08475  [pdf, other]

    stat.ML cs.AI cs.LG

    DARLA: Improving Zero-Shot Transfer in Reinforcement Learning

    Authors: Irina Higgins, Arka Pal, Andrei A. Rusu, Loic Matthey, Christopher P Burgess, Alexander Pritzel, Matthew Botvinick, Charles Blundell, Alexander Lerchner

    Abstract: Domain adaptation is an important open problem in deep reinforcement learning (RL). In many scenarios of interest data is hard to obtain, so agents may learn a source policy in a setting where data is readily available, with the hope that it generalises well to the target domain. We propose a new multi-stage RL agent, DARLA (DisentAngled Representation Learning Agent), which learns to see before l…

    Submitted 6 June, 2018; v1 submitted 26 July, 2017; originally announced July 2017.

    Comments: ICML 2017

  19. arXiv:1707.03389  [pdf, other]

    stat.ML cs.LG

    SCAN: Learning Hierarchical Compositional Visual Concepts

    Authors: Irina Higgins, Nicolas Sonnerat, Loic Matthey, Arka Pal, Christopher P Burgess, Matko Bosnjak, Murray Shanahan, Matthew Botvinick, Demis Hassabis, Alexander Lerchner

    Abstract: The seemingly infinite diversity of the natural world arises from a relatively small set of coherent rules, such as the laws of physics or chemistry. We conjecture that these rules give rise to regularities that can be discovered through primarily unsupervised experiences and represented as abstract concepts. If such representations are compositional and hierarchical, they can be recombined into a…

    Submitted 6 June, 2018; v1 submitted 11 July, 2017; originally announced July 2017.

  20. arXiv:1606.05579  [pdf, other]

    stat.ML cs.LG q-bio.NC

    Early Visual Concept Learning with Unsupervised Deep Learning

    Authors: Irina Higgins, Loic Matthey, Xavier Glorot, Arka Pal, Benigno Uria, Charles Blundell, Shakir Mohamed, Alexander Lerchner

    Abstract: Automated discovery of early visual concepts from raw image data is a major open challenge in AI research. Addressing this problem, we propose an unsupervised approach for learning disentangled representations of the underlying factors of variation. We draw inspiration from neuroscience, and show how this can be achieved in an unsupervised generative model by applying the same learning pressures a…

    Submitted 20 September, 2016; v1 submitted 17 June, 2016; originally announced June 2016.
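
The abstract of entry 14 (Spatial Broadcast Decoder) describes tiling the latent vector across space and concatenating fixed X- and Y-coordinate channels before a convolutional network is applied. A minimal NumPy sketch of only that tiling-and-concatenation step follows. The function name `spatial_broadcast`, the channels-last layout, and the coordinate range of [-1, 1] are assumptions made for illustration and are not taken from the listing; the convolutional network itself is omitted.

```python
import numpy as np


def spatial_broadcast(z, height, width):
    """Tile a latent vector over a height x width grid and append X/Y channels.

    Hypothetical helper: name, channels-last layout, and the [-1, 1]
    coordinate range are illustrative assumptions.
    """
    z = np.asarray(z, dtype=np.float32)
    k = z.shape[0]

    # Broadcast the latent vector to every spatial position: (k,) -> (H, W, k).
    tiled = np.broadcast_to(z, (height, width, k))

    # Fixed coordinate channels, one value per row (Y) and per column (X).
    ys = np.linspace(-1.0, 1.0, height, dtype=np.float32)
    xs = np.linspace(-1.0, 1.0, width, dtype=np.float32)
    y_grid, x_grid = np.meshgrid(ys, xs, indexing="ij")

    # Concatenate along the channel axis: result has k + 2 channels.
    return np.concatenate(
        [tiled, x_grid[..., None], y_grid[..., None]], axis=-1
    )


features = spatial_broadcast([0.5, -1.0, 2.0], height=4, width=6)
print(features.shape)
```

Every spatial position carries the same latent values, so the coordinate channels are the only position-dependent input; a convolutional network applied to this tensor would be the remaining part of the decoder described in the abstract.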