Skip to main content

Showing 1–9 of 9 results for author: Henderson, S

Searching in archive cs. Search in all archives.
.
  1. arXiv:2308.03526  [pdf, other

    cs.LG cs.AI

    AlphaStar Unplugged: Large-Scale Offline Reinforcement Learning

    Authors: Michaël Mathieu, Sherjil Ozair, Srivatsan Srinivasan, Caglar Gulcehre, Shangtong Zhang, Ray Jiang, Tom Le Paine, Richard Powell, Konrad Żołna, Julian Schrittwieser, David Choi, Petko Georgiev, Daniel Toyama, Aja Huang, Roman Ring, Igor Babuschkin, Timo Ewalds, Mahyar Bordbar, Sarah Henderson, Sergio Gómez Colmenarejo, Aäron van den Oord, Wojciech Marian Czarnecki, Nando de Freitas, Oriol Vinyals

    Abstract: StarCraft II is one of the most challenging simulated reinforcement learning environments; it is partially observable, stochastic, multi-agent, and mastering StarCraft II requires strategic planning over long time horizons with real-time low-level execution. It also has an active professional competitive scene. StarCraft II is uniquely suited for advancing offline RL algorithms, both because of it… ▽ More

    Submitted 7 August, 2023; originally announced August 2023.

    Comments: 32 pages, 13 figures, previous version published as a NeurIPS 2021 workshop: https://openreview.net/forum?id=Np8Pumfoty

  2. arXiv:2212.10433  [pdf, other

    cs.DS cs.AI cs.LG

    Scheduling with Predictions

    Authors: Woo-Hyung Cho, Shane Henderson, David Shmoys

    Abstract: There is significant interest in deploying machine learning algorithms for diagnostic radiology, as modern learning techniques have made it possible to detect abnormalities in medical images within minutes. While machine-assisted diagnoses cannot yet reliably replace human reviews of images by a radiologist, they could inform prioritization rules for determining the order by which to review patien… ▽ More

    Submitted 20 December, 2022; originally announced December 2022.

  3. arXiv:2112.11446  [pdf, other

    cs.CL cs.AI

    Scaling Language Models: Methods, Analysis & Insights from Training Gopher

    Authors: Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor , et al. (55 additional authors not shown)

    Abstract: Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world. In this paper, we present an analysis of Transformer-based language model performance across a wide range of model scales -- from models with tens of millions of parameters up to a 280 billion parameter model called Gop… ▽ More

    Submitted 21 January, 2022; v1 submitted 8 December, 2021; originally announced December 2021.

    Comments: 120 pages

  4. arXiv:2006.00979  [pdf, other

    cs.LG cs.AI

    Acme: A Research Framework for Distributed Reinforcement Learning

    Authors: Matthew W. Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Nikola Momchev, Danila Sinopalnikov, Piotr Stańczyk, Sabela Ramos, Anton Raichuk, Damien Vincent, Léonard Hussenot, Robert Dadashi, Gabriel Dulac-Arnold, Manu Orsini, Alexis Jacq, Johan Ferret, Nino Vieillard, Seyed Kamyar Seyed Ghasemipour, Sertan Girgin, Olivier Pietquin, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang , et al. (14 additional authors not shown)

    Abstract: Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce publishe… ▽ More

    Submitted 20 September, 2022; v1 submitted 1 June, 2020; originally announced June 2020.

    Comments: This work presents a second version of the paper which coincides with an increase in modularity, additional emphasis on offline, imitation and learning from demonstrations algorithms, as well as various new agents implemented as part of Acme

  5. arXiv:1702.07694  [pdf, other

    stat.ML cs.IT cs.LG

    Bayes-Optimal Entropy Pursuit for Active Choice-Based Preference Learning

    Authors: Stephen N. Pallone, Peter I. Frazier, Shane G. Henderson

    Abstract: We analyze the problem of learning a single user's preferences in an active learning setting, sequentially and adaptively querying the user over a finite time horizon. Learning is conducted via choice-based queries, where the user selects her preferred option among a small subset of offered alternatives. These queries have been shown to be a robust and efficient way to learn an individual's prefer… ▽ More

    Submitted 24 February, 2017; originally announced February 2017.

  6. arXiv:1611.09304  [pdf, other

    math.OC cs.DM

    Minimizing Multimodular Functions and Allocating Capacity in Bike-Sharing Systems

    Authors: Daniel Freund, Shane G. Henderson, David B. Shmoys

    Abstract: The growing popularity of bike-sharing systems around the world has motivated recent attention to models and algorithms for their effective operation. Most of this literature focuses on their daily operation for managing asymmetric demand. In this work, we consider the more strategic question of how to (re-)allocate dock-capacity in such systems. We develop mathematical formulations for variations… ▽ More

    Submitted 14 March, 2022; v1 submitted 28 November, 2016; originally announced November 2016.

  7. arXiv:1506.04986  [pdf, ps, other

    cs.DC math.OC

    Efficient Ranking and Selection in Parallel Computing Environments

    Authors: Eric C. Ni, Dragos F. Ciocan, Shane G. Henderson, Susan R. Hunter

    Abstract: The goal of ranking and selection (R&S) procedures is to identify the best stochastic system from among a finite set of competing alternatives. Such procedures require constructing estimates of each system's performance, which can be obtained simultaneously by running multiple independent replications on a parallel computing platform. However, nontrivial statistical and implementation issues arise… ▽ More

    Submitted 16 June, 2015; originally announced June 2015.

  8. arXiv:1411.0275  [pdf, ps, other

    cs.DL

    On the Shoulders of Giants: The Growing Impact of Older Articles

    Authors: Alex Verstak, Anurag Acharya, Helder Suzuki, Sean Henderson, Mikhail Iakhiaev, Cliff Chiung Yu Lin, Namit Shetty

    Abstract: In this paper, we examine the evolution of the impact of older scholarly articles. We attempt to answer four questions. First, how often are older articles cited and how has this changed over time. Second, how does the impact of older articles vary across different research fields. Third, is the change in the impact of older articles accelerating or slowing down. Fourth, are these trends different… ▽ More

    Submitted 2 November, 2014; originally announced November 2014.

  9. arXiv:1410.2217  [pdf, ps, other

    cs.DL

    Rise of the Rest: The Growing Impact of Non-Elite Journals

    Authors: Anurag Acharya, Alex Verstak, Helder Suzuki, Sean Henderson, Mikhail Iakhiaev, Cliff Chiung Yu Lin, Namit Shetty

    Abstract: In this paper, we examine the evolution of the impact of non-elite journals. We attempt to answer two questions. First, what fraction of the top-cited articles are published in non-elite journals and how has this changed over time. Second, what fraction of the total citations are to non-elite journals and how has this changed over time. We studied citations to articles published in 1995-2013. We… ▽ More

    Submitted 8 October, 2014; originally announced October 2014.