Showing 1–29 of 29 results for author: Reed, S

  1. arXiv:2402.15391  [pdf, other]

    cs.LG cs.AI cs.CV

    Genie: Generative Interactive Environments

    Authors: Jake Bruce, Michael Dennis, Ashley Edwards, Jack Parker-Holder, Yuge Shi, Edward Hughes, Matthew Lai, Aditi Mavalankar, Richie Steigerwald, Chris Apps, Yusuf Aytar, Sarah Bechtle, Feryal Behbahani, Stephanie Chan, Nicolas Heess, Lucy Gonzalez, Simon Osindero, Sherjil Ozair, Scott Reed, Jingwei Zhang, Konrad Zolna, Jeff Clune, Nando de Freitas, Satinder Singh, Tim Rocktäschel

    Abstract: We introduce Genie, the first generative interactive environment trained in an unsupervised manner from unlabelled Internet videos. The model can be prompted to generate an endless variety of action-controllable virtual worlds described through text, synthetic images, photographs, and even sketches. At 11B parameters, Genie can be considered a foundation world model. It is comprised of a spatiotem…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: https://sites.google.com/corp/view/genie-2024/

  2. arXiv:2306.11706  [pdf, other]

    cs.RO cs.LG

    RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation

    Authors: Konstantinos Bousmalis, Giulia Vezzani, Dushyant Rao, Coline Devin, Alex X. Lee, Maria Bauza, Todor Davchev, Yuxiang Zhou, Agrim Gupta, Akhil Raju, Antoine Laurens, Claudio Fantacci, Valentin Dalibard, Martina Zambelli, Murilo Martins, Rugile Pevceviciute, Michiel Blokzijl, Misha Denil, Nathan Batchelor, Thomas Lampe, Emilio Parisotto, Konrad Żołna, Scott Reed, Sergio Gómez Colmenarejo, Jon Scholz, et al. (14 additional authors not shown)

    Abstract: The ability to leverage heterogeneous robotic experience from different robots and tasks to quickly master novel skills and embodiments has the potential to transform robot learning. Inspired by recent advances in foundation models for vision and language, we propose a multi-embodiment, multi-task generalist agent for robotic manipulation. This agent, named RoboCat, is a visual goal-conditioned de…

    Submitted 22 December, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Transactions on Machine Learning Research (12/2023)

  3. arXiv:2208.02670  [pdf]

    stat.ML cs.LG

    Development and Validation of ML-DQA -- a Machine Learning Data Quality Assurance Framework for Healthcare

    Authors: Mark Sendak, Gaurav Sirdeshmukh, Timothy Ochoa, Hayley Premo, Linda Tang, Kira Niederhoffer, Sarah Reed, Kaivalya Deshpande, Emily Sterrett, Melissa Bauer, Laurie Snyder, Afreen Shariff, David Whellan, Jeffrey Riggio, David Gaieski, Kristin Corey, Megan Richards, Michael Gao, Marshall Nichols, Bradley Heintze, William Knechtle, William Ratliff, Suresh Balu

    Abstract: The approaches by which the machine learning and clinical research communities utilize real world data (RWD), including data captured in the electronic health record (EHR), vary dramatically. While clinical researchers cautiously use RWD for clinical investigations, ML for healthcare teams consume public datasets with minimal scrutiny to develop new algorithms. This study bridges this gap by devel…

    Submitted 4 August, 2022; originally announced August 2022.

    Comments: Presented at 2022 Machine Learning in Health Care Conference

  4. arXiv:2205.06175  [pdf, other]

    cs.AI cs.CL cs.LG cs.RO

    A Generalist Agent

    Authors: Scott Reed, Konrad Zolna, Emilio Parisotto, Sergio Gomez Colmenarejo, Alexander Novikov, Gabriel Barth-Maron, Mai Gimenez, Yury Sulsky, Jackie Kay, Jost Tobias Springenberg, Tom Eccles, Jake Bruce, Ali Razavi, Ashley Edwards, Nicolas Heess, Yutian Chen, Raia Hadsell, Oriol Vinyals, Mahyar Bordbar, Nando de Freitas

    Abstract: Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm and much more, dec…

    Submitted 11 November, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: Published at TMLR, 42 pages

    Journal ref: Transactions on Machine Learning Research, 11/2022, https://openreview.net/forum?id=1ikK0kHjvj

  5. arXiv:2110.10819  [pdf, other]

    cs.LG cs.AI

    Shaking the foundations: delusions in sequence models for interaction and control

    Authors: Pedro A. Ortega, Markus Kunesch, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Joel Veness, Jonas Buchli, Jonas Degrave, Bilal Piot, Julien Perolat, Tom Everitt, Corentin Tallec, Emilio Parisotto, Tom Erez, Yutian Chen, Scott Reed, Marcus Hutter, Nando de Freitas, Shane Legg

    Abstract: The recent phenomenal success of language models has reinvigorated machine learning research, and large sequence models such as transformers are being applied to a variety of domains. One important problem class that has remained relatively elusive however is purposeful adaptive behavior. Currently there is a common perception that sequence models "lack the understanding of the cause and effect of…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: DeepMind Tech Report, 16 pages, 4 figures

  6. arXiv:2012.06899  [pdf, other]

    cs.LG cs.AI cs.RO

    Semi-supervised reward learning for offline reinforcement learning

    Authors: Ksenia Konyushkova, Konrad Zolna, Yusuf Aytar, Alexander Novikov, Scott Reed, Serkan Cabi, Nando de Freitas

    Abstract: In offline reinforcement learning (RL) agents are trained using a logged dataset. It appears to be the most natural route to attack real-life applications because in domains such as healthcare and robotics interactions with the environment are either expensive or unethical. Training agents usually requires reward functions, but unfortunately, rewards are seldom available in practice and their engi…

    Submitted 12 December, 2020; originally announced December 2020.

    Comments: Accepted to Offline Reinforcement Learning Workshop at Neural Information Processing Systems (2020)

  7. arXiv:2011.13885  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Offline Learning from Demonstrations and Unlabeled Experience

    Authors: Konrad Zolna, Alexander Novikov, Ksenia Konyushkova, Caglar Gulcehre, Ziyu Wang, Yusuf Aytar, Misha Denil, Nando de Freitas, Scott Reed

    Abstract: Behavior cloning (BC) is often practical for robot learning because it allows a policy to be trained offline without rewards, by supervised learning on expert demonstrations. However, BC does not effectively leverage what we will refer to as unlabeled experience: data of mixed and unknown quality without reward annotations. This unlabeled data can be generated by a variety of sources such as human…

    Submitted 27 November, 2020; originally announced November 2020.

    Comments: Accepted to Offline Reinforcement Learning Workshop at Neural Information Processing Systems (2020)

  8. arXiv:2006.15134  [pdf, other]

    cs.LG cs.AI stat.ML

    Critic Regularized Regression

    Authors: Ziyu Wang, Alexander Novikov, Konrad Zolna, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Josh Merel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

    Abstract: Offline reinforcement learning (RL), also known as batch RL, offers the prospect of policy optimization from large pre-recorded datasets without online environment interaction. It addresses challenges with regard to the cost of data collection and safety, both of which are particularly pertinent to real-world applications of RL. Unfortunately, most off-policy algorithms perform poorly when learnin…

    Submitted 22 September, 2021; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: 24 pages; presented at NeurIPS 2020

  9. arXiv:1910.01077  [pdf, other]

    cs.LG cs.AI cs.RO stat.ML

    Task-Relevant Adversarial Imitation Learning

    Authors: Konrad Zolna, Scott Reed, Alexander Novikov, Sergio Gomez Colmenarejo, David Budden, Serkan Cabi, Misha Denil, Nando de Freitas, Ziyu Wang

    Abstract: We show that a critical vulnerability in adversarial imitation is the tendency of discriminator networks to learn spurious associations between visual features and expert labels. When the discriminator focuses on task-irrelevant features, it does not provide an informative reward signal, leading to poor task performance. We analyze this problem in detail and propose a solution that outperforms sta…

    Submitted 12 November, 2020; v1 submitted 2 October, 2019; originally announced October 2019.

    Comments: Accepted to CoRL 2020 (see presentation here: https://youtu.be/ZgQvFGuEgFU )

  10. arXiv:1909.12200  [pdf, other]

    cs.RO cs.LG

    Scaling data-driven robotics with reward sketching and batch reinforcement learning

    Authors: Serkan Cabi, Sergio Gómez Colmenarejo, Alexander Novikov, Ksenia Konyushkova, Scott Reed, Rae Jeong, Konrad Zolna, Yusuf Aytar, David Budden, Mel Vecerik, Oleg Sushkov, David Barker, Jonathan Scholz, Misha Denil, Nando de Freitas, Ziyu Wang

    Abstract: We present a framework for data-driven robotics that makes use of a large dataset of recorded robot experience and scales to several tasks using learned reward functions. We show how to apply this framework to accomplish three different object manipulation tasks on a real robot platform. Given demonstrations of a task together with task-agnostic recorded experience, we use a special form of human…

    Submitted 4 June, 2020; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: Project website: https://sites.google.com/view/data-driven-robotics/

    Journal ref: Robotics: Science and Systems Conference 2020

  11. arXiv:1905.12941  [pdf, other]

    cs.AI

    Learning Compositional Neural Programs with Recursive Tree Search and Planning

    Authors: Thomas Pierrot, Guillaume Ligner, Scott Reed, Olivier Sigaud, Nicolas Perrin, Alexandre Laterre, David Kas, Karim Beguir, Nando de Freitas

    Abstract: We propose a novel reinforcement learning algorithm, AlphaNPI, that incorporates the strengths of Neural Programmer-Interpreters (NPI) and AlphaZero. NPI contributes structural biases in the form of modularity, hierarchy and recursion, which are helpful to reduce sample complexity, improve generalization and increase interpretability. AlphaZero contributes powerful neural network guided search alg…

    Submitted 13 April, 2021; v1 submitted 30 May, 2019; originally announced May 2019.

  12. arXiv:1810.05017  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL

    Authors: Tom Le Paine, Sergio Gómez Colmenarejo, Ziyu Wang, Scott Reed, Yusuf Aytar, Tobias Pfaff, Matt W. Hoffman, Gabriel Barth-Maron, Serkan Cabi, David Budden, Nando de Freitas

    Abstract: Humans are experts at high-fidelity imitation -- closely mimicking a demonstration, often in one attempt. Humans use this ability to quickly solve a task instance, and to bootstrap learning of new tasks. Achieving these abilities in autonomous agents is an open problem. In this paper, we introduce an off-policy RL algorithm (MetaMimic) to narrow this gap. MetaMimic can learn both (i) policies for…

    Submitted 11 October, 2018; originally announced October 2018.

  13. arXiv:1809.10460  [pdf, other]

    cs.LG cs.SD stat.ML

    Sample Efficient Adaptive Text-to-Speech

    Authors: Yutian Chen, Yannis Assael, Brendan Shillingford, David Budden, Scott Reed, Heiga Zen, Quan Wang, Luis C. Cobo, Andrew Trask, Ben Laurie, Caglar Gulcehre, Aäron van den Oord, Oriol Vinyals, Nando de Freitas

    Abstract: We present a meta-learning approach for adaptive text-to-speech (TTS) with few data. During training, we learn a multi-speaker model using a shared conditional WaveNet core and independent learned embeddings for each speaker. The aim of training is not to produce a neural network with fixed weights, which is then deployed as a TTS system. Instead, the aim is to produce a network that requires few…

    Submitted 16 January, 2019; v1 submitted 27 September, 2018; originally announced September 2018.

    Comments: Accepted by ICLR 2019

  14. arXiv:1808.00508  [pdf, other]

    cs.NE

    Neural Arithmetic Logic Units

    Authors: Andrew Trask, Felix Hill, Scott Reed, Jack Rae, Chris Dyer, Phil Blunsom

    Abstract: Neural networks can learn to represent and manipulate numerical information, but they seldom generalize well outside of the range of numerical values encountered during training. To encourage more systematic numerical extrapolation, we propose an architecture that represents numerical quantities as linear activations which are manipulated using primitive arithmetic operators, controlled by learned…

    Submitted 1 August, 2018; originally announced August 2018.

  15. arXiv:1712.10215  [pdf, other]

    cs.CV

    ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans

    Authors: Angela Dai, Daniel Ritchie, Martin Bokeloh, Scott Reed, Jürgen Sturm, Matthias Nießner

    Abstract: We introduce ScanComplete, a novel data-driven approach for taking an incomplete 3D scan of a scene as input and predicting a complete 3D model along with per-voxel semantic labels. The key contribution of our method is its ability to handle large scenes with varying spatial extent, managing the cubic growth in data size as scene size increases. To this end, we devise a fully-convolutional generat…

    Submitted 27 March, 2018; v1 submitted 29 December, 2017; originally announced December 2017.

    Comments: Video: https://youtu.be/5s5s8iH0NF8

  16. arXiv:1710.10304  [pdf, other]

    cs.NE cs.CV

    Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions

    Authors: Scott Reed, Yutian Chen, Thomas Paine, Aäron van den Oord, S. M. Ali Eslami, Danilo Rezende, Oriol Vinyals, Nando de Freitas

    Abstract: Deep autoregressive models have shown state-of-the-art performance in density estimation for natural images on large-scale datasets such as ImageNet. However, such models require many thousands of gradient-based weight updates and unique image examples for training. Ideally, the models would rapidly learn visual concepts from only a handful of examples, similar to the manner in which humans learns…

    Submitted 28 February, 2018; v1 submitted 27 October, 2017; originally announced October 2017.

  17. arXiv:1707.02747  [pdf, other]

    cs.LG

    Robust Imitation of Diverse Behaviors

    Authors: Ziyu Wang, Josh Merel, Scott Reed, Greg Wayne, Nando de Freitas, Nicolas Heess

    Abstract: Deep generative models have recently shown great promise in imitation learning for motor control. Given enough data, even supervised approaches can do one-shot imitation learning; however, they are vulnerable to cascading failures when the agent trajectory diverges from the demonstrations. Compared to purely supervised methods, Generative Adversarial Imitation Learning (GAIL) can learn more robust…

    Submitted 14 July, 2017; v1 submitted 10 July, 2017; originally announced July 2017.

  18. arXiv:1703.03664  [pdf, other]

    cs.CV cs.NE

    Parallel Multiscale Autoregressive Density Estimation

    Authors: Scott Reed, Aäron van den Oord, Nal Kalchbrenner, Sergio Gómez Colmenarejo, Ziyu Wang, Dan Belov, Nando de Freitas

    Abstract: PixelCNN achieves state-of-the-art results in density estimation for natural images. Although training is fast, inference is costly, requiring one network evaluation per pixel; O(N) for N pixels. This can be sped up by caching activations, but still involves generating each pixel sequentially. In this work, we propose a parallelized PixelCNN that allows more efficient inference by modeling certain…

    Submitted 10 March, 2017; originally announced March 2017.

  19. arXiv:1610.02454  [pdf, other]

    cs.CV cs.NE

    Learning What and Where to Draw

    Authors: Scott Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee

    Abstract: Generative Adversarial Networks (GANs) have recently demonstrated the capability to synthesize compelling real-world images, such as room interiors, album covers, manga, faces, birds, and flowers. While existing models can synthesize images based on global constraints such as a class label or caption, they do not provide control over pose or object location. We propose a new model, the Generative…

    Submitted 7 October, 2016; originally announced October 2016.

    Comments: In NIPS 2016

  20. arXiv:1605.05396  [pdf, other]

    cs.NE cs.CV

    Generative Adversarial Text to Image Synthesis

    Authors: Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, Honglak Lee

    Abstract: Automatic synthesis of realistic images from text would be interesting and useful, but current AI systems are still far from this goal. However, in recent years generic and powerful recurrent neural network architectures have been developed to learn discriminative text feature representations. Meanwhile, deep convolutional generative adversarial networks (GANs) have begun to generate highly compel…

    Submitted 5 June, 2016; v1 submitted 17 May, 2016; originally announced May 2016.

    Comments: ICML 2016

  21. arXiv:1605.05395  [pdf, other]

    cs.CV

    Learning Deep Representations of Fine-grained Visual Descriptions

    Authors: Scott Reed, Zeynep Akata, Bernt Schiele, Honglak Lee

    Abstract: State-of-the-art methods for zero-shot visual recognition formulate learning as a joint embedding problem of images and side information. In these formulations the current best complement to visual features are attributes: manually encoded vectors describing shared characteristics among categories. Despite good performance, attributes have limitations: (1) finer-grained recognition requires commen…

    Submitted 17 May, 2016; originally announced May 2016.

    Comments: CVPR 2016

  22. arXiv:1601.00706  [pdf, other]

    cs.LG cs.AI cs.CV

    Weakly-supervised Disentangling with Recurrent Transformations for 3D View Synthesis

    Authors: Jimei Yang, Scott Reed, Ming-Hsuan Yang, Honglak Lee

    Abstract: An important problem for both graphics and vision is to synthesize novel views of a 3D object from a single image. This is particularly challenging due to the partial observability inherent in projecting a 3D object onto the image space, and the ill-posedness of inferring object shape and pose. However, we can train a neural network to address the problem if we restrict our attention to specific o…

    Submitted 4 January, 2016; originally announced January 2016.

    Comments: This was published in NIPS 2015 conference

  23. SSD: Single Shot MultiBox Detector

    Authors: Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg

    Abstract: We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box…

    Submitted 29 December, 2016; v1 submitted 7 December, 2015; originally announced December 2015.

    Comments: ECCV 2016

  24. arXiv:1511.06279  [pdf, other]

    cs.LG cs.NE

    Neural Programmer-Interpreters

    Authors: Scott Reed, Nando de Freitas

    Abstract: We propose the neural programmer-interpreter (NPI): a recurrent and compositional neural network that learns to represent and execute programs. NPI has three learnable components: a task-agnostic recurrent core, a persistent key-value program memory, and domain-specific encoders that enable a single NPI to operate in multiple perceptually diverse environments with distinct affordances. By learning…

    Submitted 29 February, 2016; v1 submitted 19 November, 2015; originally announced November 2015.

    Comments: ICLR 2016 conference submission

  25. arXiv:1412.6596  [pdf, other]

    cs.CV cs.LG cs.NE

    Training Deep Neural Networks on Noisy Labels with Bootstrapping

    Authors: Scott Reed, Honglak Lee, Dragomir Anguelov, Christian Szegedy, Dumitru Erhan, Andrew Rabinovich

    Abstract: Current state-of-the-art deep learning systems for visual object recognition and detection use purely supervised training with regularization such as dropout to avoid overfitting. The performance depends critically on the amount of labeled examples, and in current practice the labels are assumed to be unambiguous and accurate. However, this assumption often does not hold; e.g. in recognition, clas…

    Submitted 15 April, 2015; v1 submitted 19 December, 2014; originally announced December 2014.

  26. arXiv:1412.1441  [pdf, other]

    cs.CV

    Scalable, High-Quality Object Detection

    Authors: Christian Szegedy, Scott Reed, Dumitru Erhan, Dragomir Anguelov, Sergey Ioffe

    Abstract: Current high-quality object detection approaches use the scheme of salience-based object proposal methods followed by post-classification using deep convolutional features. This spurred recent research in improving object proposal methods. However, domain agnostic proposal generation has the principal drawback that the proposals come unranked or with very weak ranking, making it hard to trade-off…

    Submitted 8 December, 2015; v1 submitted 3 December, 2014; originally announced December 2014.

  27. Evaluation of Output Embeddings for Fine-Grained Image Classification

    Authors: Zeynep Akata, Scott Reed, Daniel Walter, Honglak Lee, Bernt Schiele

    Abstract: Image classification has advanced significantly in recent years with the availability of large-scale image sets. However, fine-grained classification remains a major challenge due to the annotation cost of large numbers of fine-grained categories. This project shows that compelling classification performance can be achieved on such categories even without labeled training data. Given image and cla…

    Submitted 28 August, 2015; v1 submitted 30 September, 2014; originally announced September 2014.

    Comments: @inproceedings {ARWLS15, title = {Evaluation of Output Embeddings for Fine-Grained Image Classification}, booktitle = {IEEE Computer Vision and Pattern Recognition}, year = {2015}, author = {Zeynep Akata and Scott Reed and Daniel Walter and Honglak Lee and Bernt Schiele} }

  28. arXiv:1409.4842  [pdf, other]

    cs.CV

    Going Deeper with Convolutions

    Authors: Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich

    Abstract: We propose a deep convolutional neural network architecture codenamed "Inception", which was responsible for setting the new state of the art for classification and detection in the ImageNet Large-Scale Visual Recognition Challenge 2014 (ILSVRC 2014). The main hallmark of this architecture is the improved utilization of the computing resources inside the network. This was achieved by a carefully c…

    Submitted 16 September, 2014; originally announced September 2014.

  29. arXiv:1405.5741  [pdf]

    cs.CY

    Bitcoin Cooperative Proof-of-Stake

    Authors: Stephen L. Reed

    Abstract: A hard-fork reconfiguration of the peer to peer Bitcoin network is described that substitutes tamper-evident logs and proof-of-stake consensus for proof-of-work consensus. The block creation rewards and transaction fees are reallocated to establish and staff a secure financial data network capable of handling the world's transactions with subsecond response time. The new system pays dividends to s…

    Submitted 22 May, 2014; originally announced May 2014.

    Comments: 16 pages, 2 figures