Showing 1–14 of 14 results for author: Kushman, N

  1. arXiv:2312.11805

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  2. arXiv:2211.14275

    cs.LG cs.AI cs.CL

    Solving math word problems with process- and outcome-based feedback

    Authors: Jonathan Uesato, Nate Kushman, Ramana Kumar, Francis Song, Noah Siegel, Lisa Wang, Antonia Creswell, Geoffrey Irving, Irina Higgins

    Abstract: Recent work has shown that asking language models to generate reasoning steps improves performance on many reasoning tasks. When moving beyond prompting, this raises the question of how we should supervise such models: outcome-based approaches which supervise the final result, or process-based approaches which supervise the reasoning process itself? Differences between these approaches might natur…

    Submitted 25 November, 2022; originally announced November 2022.

  3. arXiv:2203.07814

    cs.PL cs.AI cs.LG

    Competition-Level Code Generation with AlphaCode

    Authors: Yujia Li, David Choi, Junyoung Chung, Nate Kushman, Julian Schrittwieser, Rémi Leblond, Tom Eccles, James Keeling, Felix Gimeno, Agustin Dal Lago, Thomas Hubert, Peter Choy, Cyprien de Masson d'Autume, Igor Babuschkin, Xinyun Chen, Po-Sen Huang, Johannes Welbl, Sven Gowal, Alexey Cherepanov, James Molloy, Daniel J. Mankowitz, Esme Sutherland Robson, Pushmeet Kohli, Nando de Freitas, Koray Kavukcuoglu, et al. (1 additional author not shown)

    Abstract: Programming is a powerful and ubiquitous problem-solving tool. Developing systems that can assist programmers or even generate programs independently could make programming more productive and accessible, yet so far incorporating innovations in AI has proven challenging. Recent large-scale language models have demonstrated an impressive ability to generate code, and are now able to complete simple…

    Submitted 8 February, 2022; originally announced March 2022.

    Comments: 74 pages

  4. arXiv:2007.12411

    cs.LG cs.CV stat.ML

    Interpreting Spatially Infinite Generative Models

    Authors: Chaochao Lu, Richard E. Turner, Yingzhen Li, Nate Kushman

    Abstract: Traditional deep generative models of images and other spatial modalities can only generate fixed sized outputs. The generated images have exactly the same resolution as the training images, which is dictated by the number of layers in the underlying neural network. Recent work has shown, however, that feeding spatial noise vectors into a fully convolutional neural network enables both generation…

    Submitted 24 July, 2020; originally announced July 2020.

    Comments: ICML 2020 workshop on Human Interpretability in Machine Learning (WHI 2020)

  5. arXiv:2002.12674

    cs.CV cs.LG

    Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data

    Authors: Sebastian Lunz, Yingzhen Li, Andrew Fitzgibbon, Nate Kushman

    Abstract: Recent work has shown the ability to learn generative models for 3D shapes from only unstructured 2D images. However, training such models requires differentiating through the rasterization step of the rendering process, therefore past work has focused on developing bespoke rendering models which smooth over this non-differentiable process in various ways. Such models are thus unable to take advan…

    Submitted 28 February, 2020; originally announced February 2020.

    Comments: 8 pages paper, 3 pages references, 18 pages appendix

  6. arXiv:2002.07017

    cs.LG stat.ML

    Learning Robust Representations via Multi-View Information Bottleneck

    Authors: Marco Federici, Anjan Dutta, Patrick Forré, Nate Kushman, Zeynep Akata

    Abstract: The information bottleneck principle provides an information-theoretic method for representation learning, by training an encoder to retain all information which is relevant for predicting the label while minimizing the amount of other, excess information in the representation. The original formulation, however, requires labeled data to identify the superfluous information. In this work, we extend…

    Submitted 18 February, 2020; v1 submitted 17 February, 2020; originally announced February 2020.

  7. arXiv:1806.00400

    stat.ML cs.LG

    Inverting Supervised Representations with Autoregressive Neural Density Models

    Authors: Charlie Nash, Nate Kushman, Christopher K. I. Williams

    Abstract: We present a method for feature interpretation that makes use of recent advances in autoregressive density estimation models to invert model representations. We train generative inversion models to express a distribution over input features conditioned on intermediate model representations. Insights into the invariances learned by supervised models can be gained by viewing samples from these inver…

    Submitted 2 January, 2019; v1 submitted 1 June, 2018; originally announced June 2018.

    Comments: Accepted for publication by AISTATS 2019

  8. arXiv:1805.07894

    cs.LG cs.AI cs.CR cs.CV stat.ML

    Constructing Unrestricted Adversarial Examples with Generative Models

    Authors: Yang Song, Rui Shu, Nate Kushman, Stefano Ermon

    Abstract: Adversarial examples are typically constructed by perturbing an existing data point within a small matrix norm, and current defense methods are focused on guarding against this type of attack. In this paper, we propose unrestricted adversarial examples, a new threat model where the attackers are not restricted to small norm-bounded perturbations. Different from perturbation-based attacks, we propo…

    Submitted 2 December, 2018; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: Neural Information Processing Systems (NeurIPS 2018)

  9. arXiv:1710.10766

    cs.LG

    PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples

    Authors: Yang Song, Taesup Kim, Sebastian Nowozin, Stefano Ermon, Nate Kushman

    Abstract: Adversarial perturbations of normal images are usually imperceptible to humans, but they can seriously confuse state-of-the-art machine learning models. What makes them so special in the eyes of image classifiers? In this paper, we show empirically that adversarial examples mainly lie in the low probability regions of the training distribution, regardless of attack types and targeted models. Using…

    Submitted 21 May, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

    Comments: ICLR 2018

  10. arXiv:1612.00817

    cs.LG cs.AI cs.NE

    Summary - TerpreT: A Probabilistic Programming Language for Program Induction

    Authors: Alexander L. Gaunt, Marc Brockschmidt, Rishabh Singh, Nate Kushman, Pushmeet Kohli, Jonathan Taylor, Daniel Tarlow

    Abstract: We study machine learning formulations of inductive program synthesis; that is, given input-output examples, synthesize source code that maps inputs to corresponding outputs. Our key contribution is TerpreT, a domain-specific language for expressing program synthesis problems. A TerpreT model is composed of a specification of a program representation and an interpreter that describes how programs…

    Submitted 2 December, 2016; originally announced December 2016.

    Comments: 7 pages, 2 figures, 4 tables, in 1st Workshop on Neural Abstract Machines & Program Induction (NAMPI) @ NIPS 2016

  11. arXiv:1611.07078

    cs.AI cs.LG stat.ML

    A Deep Learning Approach for Joint Video Frame and Reward Prediction in Atari Games

    Authors: Felix Leibfried, Nate Kushman, Katja Hofmann

    Abstract: Reinforcement learning is concerned with identifying reward-maximizing behaviour policies in environments that are initially unknown. State-of-the-art reinforcement learning approaches, such as deep Q-networks, are model-free and learn to act effectively across a wide range of environments such as Atari games, but require huge amounts of data. Model-based techniques are more data-efficient, but ne…

    Submitted 17 August, 2017; v1 submitted 21 November, 2016; originally announced November 2016.

    Comments: Presented at the ICML 2017 Workshop on Principled Approaches to Deep Learning, Sydney, Australia, 2017

  12. arXiv:1611.02109

    cs.LG

    Differentiable Programs with Neural Libraries

    Authors: Alexander L. Gaunt, Marc Brockschmidt, Nate Kushman, Daniel Tarlow

    Abstract: We develop a framework for combining differentiable programming languages with neural networks. Using this framework we create end-to-end trainable systems that learn to write interpretable algorithms with perceptual components. We explore the benefits of inductive biases for strong generalization and modularity that come from the program-like structure of our models. In particular, modularity all…

    Submitted 2 March, 2017; v1 submitted 7 November, 2016; originally announced November 2016.

  13. arXiv:1608.04428

    cs.LG cs.AI cs.NE

    TerpreT: A Probabilistic Programming Language for Program Induction

    Authors: Alexander L. Gaunt, Marc Brockschmidt, Rishabh Singh, Nate Kushman, Pushmeet Kohli, Jonathan Taylor, Daniel Tarlow

    Abstract: We study machine learning formulations of inductive program synthesis; given input-output examples, we try to synthesize source code that maps inputs to corresponding outputs. Our aims are to develop new machine learning approaches based on neural networks and graphical models, and to understand the capabilities of machine learning techniques relative to traditional alternatives, such as those bas…

    Submitted 15 August, 2016; originally announced August 2016.

    Comments: 50 pages, 20 figures, 4 tables

  14. arXiv:1608.03000

    cs.CL cs.AI

    Neural Generation of Regular Expressions from Natural Language with Minimal Domain Knowledge

    Authors: Nicholas Locascio, Karthik Narasimhan, Eduardo DeLeon, Nate Kushman, Regina Barzilay

    Abstract: This paper explores the task of translating natural language queries into regular expressions which embody their meaning. In contrast to prior work, the proposed neural model does not utilize domain-specific crafting, learning to translate directly from a parallel corpus. To fully explore the potential of neural models, we propose a methodology for collecting a large corpus of regular expression,…

    Submitted 9 August, 2016; originally announced August 2016.

    Comments: To be published in EMNLP 2016