Showing 1–17 of 17 results for author: Hoffman, M W

  1. arXiv:2408.09840  [pdf, other]

    cs.LG math.NA physics.comp-ph

    Machine Learning with Physics Knowledge for Prediction: A Survey

    Authors: Joe Watson, Chen Song, Oliver Weeger, Theo Gruner, An T. Le, Kay Hansel, Ahmed Hendawy, Oleg Arenz, Will Trojak, Miles Cranmer, Carlo D'Eramo, Fabian Bülow, Tanmay Goyal, Jan Peters, Martin W. Hoffman

    Abstract: This survey examines the broad suite of methods and models for combining machine learning with physics knowledge for prediction and forecast, with a focus on partial differential equations. These methods have attracted significant interest due to their potential impact on advancing scientific research and industrial practices by improving predictive models with small- or large-scale datasets and e…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 56 pages, 8 figures, 2 tables

  2. arXiv:2305.03870  [pdf, other]

    cs.LG

    Knowledge Transfer from Teachers to Learners in Growing-Batch Reinforcement Learning

    Authors: Patrick Emedom-Nnamdi, Abram L. Friesen, Bobak Shahriari, Nando de Freitas, Matt W. Hoffman

    Abstract: Standard approaches to sequential decision-making exploit an agent's ability to continually interact with its environment and improve its control policy. However, due to safety, ethical, and practicality constraints, this type of trial-and-error experimentation is often infeasible in many real-world domains such as healthcare and robotics. Instead, control policies in these domains are typically t…

    Submitted 9 May, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

    Comments: Reincarnating Reinforcement Learning Workshop at ICLR 2023

  3. arXiv:2006.00979  [pdf, other]

    cs.LG cs.AI

    Acme: A Research Framework for Distributed Reinforcement Learning

    Authors: Matthew W. Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Nikola Momchev, Danila Sinopalnikov, Piotr Stańczyk, Sabela Ramos, Anton Raichuk, Damien Vincent, Léonard Hussenot, Robert Dadashi, Gabriel Dulac-Arnold, Manu Orsini, Alexis Jacq, Johan Ferret, Nino Vieillard, Seyed Kamyar Seyed Ghasemipour, Sertan Girgin, Olivier Pietquin, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang, et al. (14 additional authors not shown)

    Abstract: Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce publishe…

    Submitted 20 September, 2022; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: This work presents a second version of the paper which coincides with an increase in modularity, additional emphasis on offline, imitation and learning from demonstrations algorithms, as well as various new agents implemented as part of Acme

  4. arXiv:1909.05557  [pdf, other]

    cs.LG cs.AI stat.ML

    Modular Meta-Learning with Shrinkage

    Authors: Yutian Chen, Abram L. Friesen, Feryal Behbahani, Arnaud Doucet, David Budden, Matthew W. Hoffman, Nando de Freitas

    Abstract: Many real-world problems, including multi-speaker text-to-speech synthesis, can greatly benefit from the ability to meta-learn large models with only a few task-specific components. Updating only these task-specific modules then allows the model to be adapted to low-data tasks for as many steps as necessary without risking overfitting. Unfortunately, existing meta-learning methods either do not sc…

    Submitted 22 October, 2020; v1 submitted 12 September, 2019; originally announced September 2019.

    Comments: Accepted by NeurIPS 2020

  5. arXiv:1810.05017  [pdf, other]

    cs.LG cs.AI cs.CV cs.RO

    One-Shot High-Fidelity Imitation: Training Large-Scale Deep Nets with RL

    Authors: Tom Le Paine, Sergio Gómez Colmenarejo, Ziyu Wang, Scott Reed, Yusuf Aytar, Tobias Pfaff, Matt W. Hoffman, Gabriel Barth-Maron, Serkan Cabi, David Budden, Nando de Freitas

    Abstract: Humans are experts at high-fidelity imitation -- closely mimicking a demonstration, often in one attempt. Humans use this ability to quickly solve a task instance, and to bootstrap learning of new tasks. Achieving these abilities in autonomous agents is an open problem. In this paper, we introduce an off-policy RL algorithm (MetaMimic) to narrow this gap. MetaMimic can learn both (i) policies for…

    Submitted 11 October, 2018; originally announced October 2018.

  6. arXiv:1807.05162  [pdf, other]

    cs.CV cs.LG

    Large-Scale Visual Speech Recognition

    Authors: Brendan Shillingford, Yannis Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Ben Coppin, Ben Laurie, Andrew Senior, Nando de Freitas

    Abstract: This work presents a scalable solution to open-vocabulary visual speech recognition. To achieve this, we constructed the largest existing visual speech recognition dataset, consisting of pairs of text and video clips of faces speaking (3,886 hours of video). In tandem, we designed and trained an integrated lipreading system, consisting of a video processing pipeline that maps raw video to stable v…

    Submitted 1 October, 2018; v1 submitted 13 July, 2018; originally announced July 2018.

  7. arXiv:1804.08617  [pdf, other]

    cs.LG cs.AI stat.ML

    Distributed Distributional Deterministic Policy Gradients

    Authors: Gabriel Barth-Maron, Matthew W. Hoffman, David Budden, Will Dabney, Dan Horgan, Dhruva TB, Alistair Muldal, Nicolas Heess, Timothy Lillicrap

    Abstract: This work adopts the very successful distributional perspective on reinforcement learning and adapts it to the continuous control setting. We combine this within a distributed framework for off-policy learning in order to develop what we call the Distributed Distributional Deep Deterministic Policy Gradient algorithm, D4PG. We also combine this technique with a number of additional, simple improve…

    Submitted 23 April, 2018; originally announced April 2018.

  8. arXiv:1707.03300  [pdf, other]

    cs.AI

    The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously

    Authors: Serkan Cabi, Sergio Gómez Colmenarejo, Matthew W. Hoffman, Misha Denil, Ziyu Wang, Nando de Freitas

    Abstract: This paper introduces the Intentional Unintentional (IU) agent. This agent endows the deep deterministic policy gradients (DDPG) agent for continuous control with the ability to solve several tasks simultaneously. Learning to solve many tasks simultaneously has been a long-standing, core goal of artificial intelligence, inspired by infant development and motivated by the desire to build flexible r…

    Submitted 11 July, 2017; originally announced July 2017.

  9. arXiv:1703.04813  [pdf, other]

    cs.LG cs.NE stat.ML

    Learned Optimizers that Scale and Generalize

    Authors: Olga Wichrowska, Niru Maheswaranathan, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Nando de Freitas, Jascha Sohl-Dickstein

    Abstract: Learning to learn has emerged as an important direction for achieving artificial intelligence. Two of the primary barriers to its adoption are an inability to scale to larger problems and a limited ability to generalize to new tasks. We introduce a learned gradient descent optimizer that generalizes well to new tasks, and which has significantly reduced memory and computation overhead. We achieve…

    Submitted 7 September, 2017; v1 submitted 14 March, 2017; originally announced March 2017.

    Comments: Final ICML paper after reviewer suggestions

  10. arXiv:1611.03824  [pdf, other]

    stat.ML cs.LG

    Learning to Learn without Gradient Descent by Gradient Descent

    Authors: Yutian Chen, Matthew W. Hoffman, Sergio Gomez Colmenarejo, Misha Denil, Timothy P. Lillicrap, Matt Botvinick, Nando de Freitas

    Abstract: We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter t…

    Submitted 12 June, 2017; v1 submitted 11 November, 2016; originally announced November 2016.

    Comments: Accepted by ICML 2017. Previous version "Learning to Learn for Global Optimization of Black Box Functions" was published in the Deep Reinforcement Learning Workshop, NIPS 2016

  11. arXiv:1606.04474  [pdf, other]

    cs.NE cs.LG

    Learning to learn by gradient descent by gradient descent

    Authors: Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul, Brendan Shillingford, Nando de Freitas

    Abstract: The move from hand-designed features to learned features in machine learning has been wildly successful. In spite of this, optimization algorithms are still designed by hand. In this paper we show how the design of an optimization algorithm can be cast as a learning problem, allowing the algorithm to learn to exploit structure in the problems of interest in an automatic way. Our learned algorithms…

    Submitted 30 November, 2016; v1 submitted 14 June, 2016; originally announced June 2016.

  12. arXiv:1511.09422  [pdf, other]

    stat.ML

    A General Framework for Constrained Bayesian Optimization using Information-based Search

    Authors: José Miguel Hernández-Lobato, Michael A. Gelbart, Ryan P. Adams, Matthew W. Hoffman, Zoubin Ghahramani

    Abstract: We present an information-theoretic framework for solving global black-box optimization problems that also have black-box constraints. Of particular interest to us is to efficiently solve problems with decoupled constraints, in which subsets of the objective and constraint functions may be evaluated independently. For example, when the objective is evaluated on a CPU and the constraints are evalua…

    Submitted 4 September, 2016; v1 submitted 30 November, 2015; originally announced November 2015.

  13. arXiv:1502.05312  [pdf, other]

    stat.ML

    Predictive Entropy Search for Bayesian Optimization with Unknown Constraints

    Authors: José Miguel Hernández-Lobato, Michael A. Gelbart, Matthew W. Hoffman, Ryan P. Adams, Zoubin Ghahramani

    Abstract: Unknown constraints arise in many types of expensive black-box optimization problems. Several methods have been proposed recently for performing Bayesian optimization with constraints, based on the expected improvement (EI) heuristic. However, EI can lead to pathologies when used with constraints. For example, in the case of decoupled constraints---i.e., when one can independently evaluate the obj…

    Submitted 15 July, 2015; v1 submitted 18 February, 2015; originally announced February 2015.

  14. arXiv:1406.4625  [pdf, other]

    stat.ML cs.LG

    An Entropy Search Portfolio for Bayesian Optimization

    Authors: Bobak Shahriari, Ziyu Wang, Matthew W. Hoffman, Alexandre Bouchard-Côté, Nando de Freitas

    Abstract: Bayesian optimization is a sample-efficient method for black-box global optimization. However, the performance of a Bayesian optimization method very much depends on its exploration strategy, i.e. the choice of acquisition function, and it is not clear a priori which choice will result in superior performance. While portfolio methods provide an effective, principled way of combining a collection…

    Submitted 4 March, 2015; v1 submitted 18 June, 2014; originally announced June 2014.

    Comments: 10 pages, 5 figures

  15. arXiv:1406.2541  [pdf, other]

    stat.ML cs.LG

    Predictive Entropy Search for Efficient Global Optimization of Black-box Functions

    Authors: José Miguel Hernández-Lobato, Matthew W. Hoffman, Zoubin Ghahramani

    Abstract: We propose a novel information-theoretic approach for Bayesian optimization called Predictive Entropy Search (PES). At each iteration, PES selects the next evaluation point that maximizes the expected information gained with respect to the global maximum. PES codifies this intractable acquisition function in terms of the expected reduction in the differential entropy of the predictive distribution…

    Submitted 10 June, 2014; originally announced June 2014.

  16. arXiv:1303.6746  [pdf, other]

    stat.ML cs.LG

    Exploiting correlation and budget constraints in Bayesian multi-armed bandit optimization

    Authors: Matthew W. Hoffman, Bobak Shahriari, Nando de Freitas

    Abstract: We address the problem of finding the maximizer of a nonlinear smooth function, that can only be evaluated point-wise, subject to constraints on the number of permitted function evaluations. This problem is also known as fixed-budget best arm identification in the multi-armed bandit literature. We introduce a Bayesian approach for this problem and show that it empirically outperforms both the exis…

    Submitted 11 November, 2013; v1 submitted 27 March, 2013; originally announced March 2013.

  17. arXiv:1009.5419  [pdf, other]

    cs.LG

    Portfolio Allocation for Bayesian Optimization

    Authors: Eric Brochu, Matthew W. Hoffman, Nando de Freitas

    Abstract: Bayesian optimization with Gaussian processes has become an increasingly popular tool in the machine learning community. It is efficient and can be used when very little is known about the objective function, making it popular in expensive black-box optimization scenarios. It uses Bayesian methods to sample the objective efficiently using an acquisition function which incorporates the model's esti…

    Submitted 7 March, 2011; v1 submitted 27 September, 2010; originally announced September 2010.

    Comments: This revision contains an updated performance bound and other minor text changes

    ACM Class: G.1.6; G.3; I.2.6