-
Acme: A Research Framework for Distributed Reinforcement Learning
Authors:
Matthew W. Hoffman,
Bobak Shahriari,
John Aslanides,
Gabriel Barth-Maron,
Nikola Momchev,
Danila Sinopalnikov,
Piotr Stańczyk,
Sabela Ramos,
Anton Raichuk,
Damien Vincent,
Léonard Hussenot,
Robert Dadashi,
Gabriel Dulac-Arnold,
Manu Orsini,
Alexis Jacq,
Johan Ferret,
Nino Vieillard,
Seyed Kamyar Seyed Ghasemipour,
Sertan Girgin,
Olivier Pietquin,
Feryal Behbahani,
Tamara Norman,
Abbas Abdolmaleki,
Albin Cassirer,
Fan Yang,
et al. (14 additional authors not shown)
Abstract:
Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained and increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce published RL algorithms. To address these concerns, this work describes Acme, a framework for constructing novel RL algorithms that is specifically designed to enable agents that are built using simple, modular components that can be used at various scales of execution. While the primary goal of Acme is to provide a framework for algorithm development, a secondary goal is to provide simple reference implementations of important or state-of-the-art algorithms. These implementations serve both as a validation of our design decisions and as an important contribution to reproducibility in RL research. In this work we describe the major design decisions made within Acme and give further details as to how its components can be used to implement various algorithms. Our experiments provide baselines for a number of common and state-of-the-art algorithms and show how these algorithms can be scaled up for much larger and more complex environments. This highlights one of the primary advantages of Acme, namely that it can be used to implement large, distributed RL algorithms that can run at massive scales while still maintaining the inherent readability of that implementation.
This work presents a second version of the paper which coincides with an increase in modularity, additional emphasis on offline, imitation and learning from demonstrations algorithms, as well as various new agents implemented as part of Acme.
Submitted 20 September, 2022; v1 submitted 1 June, 2020;
originally announced June 2020.
-
A comprehensive study of GRB 070125, a most energetic gamma ray burst
Authors:
Poonam Chandra,
S. Bradley Cenko,
Dale Frail,
Roger Chevalier,
Jean-Pierre Macquart,
Shri Kulkarni,
Douglas C.-J. Bock,
Frank Bertoldi,
Mansi Kasliwal,
Derek B. Fox,
Paul A. Price,
Edo Berger,
Alicia Soderberg,
Fiona A. Harrison,
Avishay Gal-Yam,
Eran Ofek,
Arne Rau,
Brian P. Schmidt,
P. Brian Cameron,
Lennox L. Cowie,
Antoinette Cowie,
Michael Dopita,
Bruce Peterson,
Bryan E. Penprase
Abstract:
We present a comprehensive multiwavelength analysis of the bright, long-duration gamma-ray burst GRB 070125, comprising observations in $γ$-ray, X-ray, optical, millimeter and centimeter wavebands. Simultaneous fits to the optical and X-ray light curves favor a break on day 3.78, which we interpret as the jet break from a collimated outflow. Independent fits to optical and X-ray bands give similar results in the optical bands but shift the jet break to around day 10 in the X-ray light curve. We show that for the physical parameters derived for GRB 070125, inverse Compton scattering effects are important throughout the afterglow evolution. While inverse Compton scattering does not affect radio and optical bands, it may be a promising candidate to delay the jet break in the X-ray band. Radio light curves show rapid flux variations, which are interpreted as due to interstellar scintillation, and are used to derive an upper limit of $2.4 \times 10^{17}$ cm on the radius of the fireball in the lateral expansion phase of the jet. Radio light curves and spectra suggest a high synchrotron self-absorption frequency indicative of the afterglow shock wave moving in a dense medium. Our broadband modeling favors a constant density profile for the circumburst medium over a wind-like profile ($R^{-2}$). However, keeping in mind the uncertainty of the parameters, it is difficult to unambiguously distinguish between the two density profiles. Our broadband fits suggest that GRB 070125 is a burst with high radiative efficiency ($> 60\%$).
Submitted 16 September, 2008; v1 submitted 19 February, 2008;
originally announced February 2008.
-
The microwave power handling of a FIB generated weak link in a YBCO film
Authors:
A. Cowie,
L. F. Cohen,
M. W. Denhoff
Abstract:
We have measured the power-dependent microwave properties of a weak link in a YBa2Cu3O7 thin film formed by writing a line of damage using a focused ion beam. The measurement was made using a parallel plate resonator at 5.5 GHz with the weak link written across the width of one of the plates. The ion-induced damage was characterized using a TRIM computer simulation, and the dc properties of similar weak links were measured. Using a 200 eV Si ion dose of $2 \times 10^{13}$ cm$^{-2}$, the Tc of the damaged region was reduced by 5.5 K and the normal resistivity was doubled. Surprisingly, the microwave measurements did not show any Josephson junction characteristics. Rather, the ion-damaged region exhibited a greatly increased microwave resistivity that was constant as a function of microwave power up to rf fields of 20 mT at 21 K.
Submitted 15 June, 1999; v1 submitted 8 February, 1999;
originally announced February 1999.