-
A Deep Learning Approach for Characterizing Major Galaxy Mergers
Authors:
Skanda Koppula,
Victor Bapst,
Marc Huertas-Company,
Sam Blackwell,
Agnieszka Grabska-Barwinska,
Sander Dieleman,
Andrea Huber,
Natasha Antropova,
Mikolaj Binkowski,
Hannah Openshaw,
Adria Recasens,
Fernando Caro,
Avishai Dekel,
Yohan Dubois,
Jesus Vega Ferrero,
David C. Koo,
Joel R. Primack,
Trevor Back
Abstract:
Fine-grained estimation of galaxy merger stages from observations is a key problem useful for validation of our current theoretical understanding of galaxy formation. To this end, we demonstrate a CNN-based regression model that is able to predict, for the first time, using a single image, the merger stage relative to the first perigee passage with a median error of 38.3 million years (Myrs) over a period of 400 Myrs. This model uses no specific dynamical modeling and learns only from simulated merger events. We show that our model provides reasonable estimates on real observations, approximately matching prior estimates provided by detailed dynamical modeling. We provide a preliminary interpretability analysis of our models, and demonstrate first steps toward calibrated uncertainty estimation.
Submitted 9 February, 2021;
originally announced February 2021.
-
Combining Q-Learning and Search with Amortized Value Estimates
Authors:
Jessica B. Hamrick,
Victor Bapst,
Alvaro Sanchez-Gonzalez,
Tobias Pfaff,
Theophane Weber,
Lars Buesing,
Peter W. Battaglia
Abstract:
We introduce "Search with Amortized Value Estimates" (SAVE), an approach for combining model-free Q-learning with model-based Monte-Carlo Tree Search (MCTS). In SAVE, a learned prior over state-action values is used to guide MCTS, which estimates an improved set of state-action values. The new Q-estimates are then used in combination with real experience to update the prior. This effectively amortizes the value computation performed by MCTS, resulting in a cooperative relationship between model-free learning and model-based search. SAVE can be implemented on top of any Q-learning agent with access to a model, which we demonstrate by incorporating it into agents that perform challenging physical reasoning tasks and Atari. SAVE consistently achieves higher rewards with fewer training steps, and---in contrast to typical model-based search approaches---yields strong performance with very small search budgets. By combining real experience with information computed during search, SAVE demonstrates that it is possible to improve on both the performance of model-free learning and the computational cost of planning.
Submitted 10 January, 2020; v1 submitted 5 December, 2019;
originally announced December 2019.
-
Object-oriented state editing for HRL
Authors:
Victor Bapst,
Alvaro Sanchez-Gonzalez,
Omar Shams,
Kimberly Stachenfeld,
Peter W. Battaglia,
Satinder Singh,
Jessica B. Hamrick
Abstract:
We introduce agents that use object-oriented reasoning to consider alternate states of the world in order to more quickly find solutions to problems. Specifically, a hierarchical controller directs a low-level agent to behave as if objects in the scene were added, deleted, or modified. The actions taken by the controller are defined over a graph-based representation of the scene, with actions corresponding to adding, deleting, or editing the nodes of a graph. We present preliminary results on three environments, demonstrating that our approach can achieve similar levels of reward as non-hierarchical agents, but with better data efficiency.
Submitted 31 October, 2019;
originally announced October 2019.
-
Hamiltonian Graph Networks with ODE Integrators
Authors:
Alvaro Sanchez-Gonzalez,
Victor Bapst,
Kyle Cranmer,
Peter Battaglia
Abstract:
We introduce an approach for imposing physically informed inductive biases in learned simulation models. We combine graph networks with a differentiable ordinary differential equation integrator as a mechanism for predicting future states, and a Hamiltonian as an internal representation. We find that our approach outperforms baselines without these biases in terms of predictive accuracy, energy accuracy, and zero-shot generalization to time-step sizes and integrator orders not experienced during training. This advances the state-of-the-art of learned simulation, and in principle is applicable beyond physical domains.
Submitted 27 September, 2019;
originally announced September 2019.
-
Structured agents for physical construction
Authors:
Victor Bapst,
Alvaro Sanchez-Gonzalez,
Carl Doersch,
Kimberly L. Stachenfeld,
Pushmeet Kohli,
Peter W. Battaglia,
Jessica B. Hamrick
Abstract:
Physical construction---the ability to compose objects, subject to physical dynamics, to serve some function---is fundamental to human intelligence. We introduce a suite of challenging physical construction tasks inspired by how children play with blocks, such as matching a target configuration, stacking blocks to connect objects together, and creating shelter-like structures over target objects. We examine how a range of deep reinforcement learning agents fare on these challenges, and introduce several new approaches which provide superior performance. Our results show that agents which use structured representations (e.g., objects and scene graphs) and structured policies (e.g., object-centric actions) outperform those which use less structured representations, and generalize better beyond their training when asked to reason about larger scenes. Model-based agents which use Monte-Carlo Tree Search also outperform strictly model-free agents in our most challenging construction problems. We conclude that approaches which combine structured representations and reasoning with powerful learning are a key path toward agents that possess rich intuitive physics, scene understanding, and planning.
Submitted 13 May, 2019; v1 submitted 5 April, 2019;
originally announced April 2019.
-
Relational Deep Reinforcement Learning
Authors:
Vinicius Zambaldi,
David Raposo,
Adam Santoro,
Victor Bapst,
Yujia Li,
Igor Babuschkin,
Karl Tuyls,
David Reichert,
Timothy Lillicrap,
Edward Lockhart,
Murray Shanahan,
Victoria Langston,
Razvan Pascanu,
Matthew Botvinick,
Oriol Vinyals,
Peter Battaglia
Abstract:
We introduce an approach for deep reinforcement learning (RL) that improves upon the efficiency, generalization capacity, and interpretability of conventional approaches through structured perception and relational reasoning. It uses self-attention to iteratively reason about the relations between entities in a scene and to guide a model-free policy. Our results show that in a novel navigation and planning task called Box-World, our agent finds interpretable solutions that improve upon baselines in terms of sample complexity, ability to generalize to more complex scenes than experienced during training, and overall performance. In the StarCraft II Learning Environment, our agent achieves state-of-the-art performance on six mini-games -- surpassing human grandmaster performance on four. By considering architectural inductive biases, our work opens new directions for overcoming important, but stubborn, challenges in deep RL.
Submitted 28 June, 2018; v1 submitted 5 June, 2018;
originally announced June 2018.
-
Relational inductive biases, deep learning, and graph networks
Authors:
Peter W. Battaglia,
Jessica B. Hamrick,
Victor Bapst,
Alvaro Sanchez-Gonzalez,
Vinicius Zambaldi,
Mateusz Malinowski,
Andrea Tacchetti,
David Raposo,
Adam Santoro,
Ryan Faulkner,
Caglar Gulcehre,
Francis Song,
Andrew Ballard,
Justin Gilmer,
George Dahl,
Ashish Vaswani,
Kelsey Allen,
Charles Nash,
Victoria Langston,
Chris Dyer,
Nicolas Heess,
Daan Wierstra,
Pushmeet Kohli,
Matt Botvinick,
Oriol Vinyals
, et al. (2 additional authors not shown)
Abstract:
Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under much different pressures, remain out of reach for current approaches. In particular, generalizing beyond one's experiences--a hallmark of human intelligence from infancy--remains a formidable challenge for modern AI.
The following is part position paper, part review, and part unification. We argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective. Just as biology uses nature and nurture cooperatively, we reject the false choice between "hand-engineering" and "end-to-end" learning, and instead advocate for an approach which benefits from their complementary strengths. We explore how using relational inductive biases within deep learning architectures can facilitate learning about entities, relations, and rules for composing them. We present a new building block for the AI toolkit with a strong relational inductive bias--the graph network--which generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors. We discuss how graph networks can support relational reasoning and combinatorial generalization, laying the foundation for more sophisticated, interpretable, and flexible patterns of reasoning. As a companion to this paper, we have released an open-source software library for building graph networks, with demonstrations of how to use them in practice.
Submitted 17 October, 2018; v1 submitted 4 June, 2018;
originally announced June 2018.
-
Relational inductive bias for physical construction in humans and machines
Authors:
Jessica B. Hamrick,
Kelsey R. Allen,
Victor Bapst,
Tina Zhu,
Kevin R. McKee,
Joshua B. Tenenbaum,
Peter W. Battaglia
Abstract:
While current deep learning systems excel at tasks such as object classification, language processing, and gameplay, few can construct or modify a complex system such as a tower of blocks. We hypothesize that what these systems lack is a "relational inductive bias": a capacity for reasoning about inter-object relations and making choices over a structured description of a scene. To test this hypothesis, we focus on a task that involves gluing pairs of blocks together to stabilize a tower, and quantify how well humans perform. We then introduce a deep reinforcement learning agent which uses object- and relation-centric scene and policy representations and apply it to the task. Our results show that these structured representations allow the agent to outperform both humans and more naive approaches, suggesting that relational inductive bias is an important component in solving structured reasoning problems and for building more intelligent, flexible machines.
Submitted 4 June, 2018;
originally announced June 2018.
-
Hyperbolic Attention Networks
Authors:
Caglar Gulcehre,
Misha Denil,
Mateusz Malinowski,
Ali Razavi,
Razvan Pascanu,
Karl Moritz Hermann,
Peter Battaglia,
Victor Bapst,
David Raposo,
Adam Santoro,
Nando de Freitas
Abstract:
We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure. A few recent approaches have successfully demonstrated the benefits of imposing hyperbolic geometry on the parameters of shallow networks. We extend this line of work by imposing hyperbolic geometry on the activations of neural networks. This allows us to exploit hyperbolic geometry to reason about embeddings produced by deep networks. We achieve this by re-expressing the ubiquitous mechanism of soft attention in terms of operations defined for hyperboloid and Klein models. Our method shows improvements in terms of generalization on neural machine translation, learning on graphs and visual question answering tasks while keeping the neural representations compact.
Submitted 24 May, 2018;
originally announced May 2018.
-
Distral: Robust Multitask Reinforcement Learning
Authors:
Yee Whye Teh,
Victor Bapst,
Wojciech Marian Czarnecki,
John Quan,
James Kirkpatrick,
Raia Hadsell,
Nicolas Heess,
Razvan Pascanu
Abstract:
Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. One direction for improving data efficiency is multitask learning with shared neural network parameters, where efficiency may be improved through transfer across related tasks. In practice, however, this is not usually observed, because gradients from different tasks can interfere negatively, making learning unstable and sometimes even less data efficient. Another issue is the different reward schemes between tasks, which can easily lead to one task dominating the learning of a shared model. We propose a new approach for joint training of multiple tasks, which we refer to as Distral (Distill & transfer learning). Instead of sharing parameters between the different workers, we propose to share a "distilled" policy that captures common behaviour across tasks. Each worker is trained to solve its own task while constrained to stay close to the shared policy, while the shared policy is trained by distillation to be the centroid of all task policies. Both aspects of the learning process are derived by optimizing a joint objective function. We show that our approach supports efficient transfer on complex 3D environments, outperforming several related methods. Moreover, the proposed learning process is more robust and more stable---attributes that are critical in deep reinforcement learning.
Submitted 13 July, 2017;
originally announced July 2017.
-
Sample Efficient Actor-Critic with Experience Replay
Authors:
Ziyu Wang,
Victor Bapst,
Nicolas Heess,
Volodymyr Mnih,
Remi Munos,
Koray Kavukcuoglu,
Nando de Freitas
Abstract:
This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.
Submitted 10 July, 2017; v1 submitted 3 November, 2016;
originally announced November 2016.
-
The condensation phase transition in the regular $k$-SAT model
Authors:
Victor Bapst,
Amin Coja-Oghlan
Abstract:
Much of the recent work on random constraint satisfaction problems has been inspired by ingenious but non-rigorous approaches from physics. The physics predictions typically come in the form of distributional fixed point problems that are intended to mimic Belief Propagation, a message passing algorithm, applied to the random CSP. In this paper we propose a novel method for harnessing Belief Propagation directly to obtain a rigorous proof of such a prediction, namely the existence and location of a condensation phase transition in the random regular $k$-SAT model.
Submitted 7 October, 2015; v1 submitted 13 July, 2015;
originally announced July 2015.
-
Harnessing the Bethe free energy
Authors:
Victor Bapst,
Amin Coja-Oghlan
Abstract:
A wide class of problems in combinatorics, computer science and physics can be described along the following lines. There are a large number of variables ranging over a finite domain that interact through constraints that each bind a few variables and either encourage or discourage certain value combinations. Examples include the $k$-SAT problem or the Ising model. Such models naturally induce a Gibbs measure on the set of assignments, which is characterised by its partition function. The present paper deals with the partition function of problems where the interactions between variables and constraints are induced by a sparse random (hyper)graph. According to physics predictions, a generic recipe called the "replica symmetric cavity method" yields the correct value of the partition function if the underlying model enjoys certain properties [Krzakala et al., PNAS 2007]. Guided by this conjecture, we prove general sufficient conditions for the success of the cavity method. The proofs are based on a "regularity lemma" for probability measures on sets of the form $Ω^n$ for a finite $Ω$ and a large $n$ that may be of independent interest.
Submitted 4 October, 2015; v1 submitted 15 April, 2015;
originally announced April 2015.
-
Planting colourings silently
Authors:
Victor Bapst,
Amin Coja-Oghlan,
Charilaos Efthymiou
Abstract:
Let $k\geq3$ be a fixed integer and let $Z_k(G)$ be the number of $k$-colourings of the graph $G$. For certain values of the average degree, the random variable $Z_k(G(n,m))$ is known to be concentrated in the sense that $\frac1n(\ln Z_k(G(n,m))-\ln E[Z_k(G(n,m))])$ converges to $0$ in probability [Achlioptas and Coja-Oghlan: FOCS 2008]. In the present paper we prove a significantly stronger concentration result. Namely, we show that for a wide range of average degrees, $\frac1ω(\ln Z_k(G(n,m))-\ln E[Z_k(G(n,m))])$ converges to $0$ in probability for any diverging function $ω=ω(n)\to\infty$. For $k$ exceeding a certain constant $k_0$ this result covers all average degrees up to the so-called condensation phase transition, and this is best possible. As an application, we show that the experiment of choosing a $k$-colouring of the random graph $G(n,m)$ uniformly at random is contiguous with respect to the so-called "planted model".
Submitted 3 November, 2014;
originally announced November 2014.
-
A positive temperature phase transition in random hypergraph 2-coloring
Authors:
Victor Bapst,
Amin Coja-Oghlan,
Felicia Raßmann
Abstract:
Diluted mean-field models are graphical models in which the geometry of interactions is determined by a sparse random graph or hypergraph. Based on a nonrigorous but analytic approach called the "cavity method", physicists have predicted that in many diluted mean-field models a phase transition occurs as the inverse temperature grows from $0$ to $\infty$ [Proc. National Academy of Sciences 104 (2007) 10318-10323]. In this paper, we establish the existence and asymptotic location of this so-called condensation phase transition in the random hypergraph $2$-coloring problem.
Submitted 22 June, 2016; v1 submitted 8 October, 2014;
originally announced October 2014.
-
The condensation phase transition in random graph coloring
Authors:
Victor Bapst,
Amin Coja-Oghlan,
Samuel Hetterich,
Felicia Rassmann,
Dan Vilenchik
Abstract:
Based on a non-rigorous formalism called the "cavity method", physicists have put forward intriguing predictions on phase transitions in discrete structures. One of the most remarkable ones is that in problems such as random $k$-SAT or random graph $k$-coloring, very shortly before the threshold for the existence of solutions there occurs another phase transition called "condensation" [Krzakala et al., PNAS 2007]. The existence of this phase transition appears to be intimately related to the difficulty of proving precise results on, e.g., the $k$-colorability threshold as well as to the performance of message passing algorithms. In random graph $k$-coloring, there is a precise conjecture as to the location of the condensation phase transition in terms of a distributional fixed point problem. In this paper we prove this conjecture for $k$ exceeding a certain constant $k_0$.
Submitted 19 April, 2014;
originally announced April 2014.
-
The effect of quantum fluctuations on the coloring of random graphs
Authors:
Victor Bapst,
Guilhem Semerjian,
Francesco Zamponi
Abstract:
We present a study of the coloring problem (antiferromagnetic Potts model) of random regular graphs, submitted to quantum fluctuations induced by a transverse field, using the quantum cavity method and quantum Monte-Carlo simulations. We determine the order of the quantum phase transition encountered at low temperature as a function of the transverse field and discuss the structure of the quantum spin glass phase. In particular, we conclude that the quantum adiabatic algorithm would fail to solve efficiently typical instances of these problems because of avoided level crossings within the quantum spin glass phase, caused by a competition between energetic and entropic effects.
Submitted 27 February, 2013;
originally announced February 2013.
-
The Quantum Adiabatic Algorithm applied to random optimization problems: the quantum spin glass perspective
Authors:
Victor Bapst,
Laura Foini,
Florent Krzakala,
Guilhem Semerjian,
Francesco Zamponi
Abstract:
Among various algorithms designed to exploit the specific properties of quantum computers with respect to classical ones, the quantum adiabatic algorithm is a versatile proposition to find the minimal value of an arbitrary cost function (ground state energy). Random optimization problems provide a natural testbed to compare its efficiency with that of classical algorithms. These problems correspond to mean field spin glasses that have been extensively studied in the classical case. This paper reviews recent analytical works that extended these studies to incorporate the effect of quantum fluctuations, and presents also some original results in this direction.
Submitted 13 November, 2012; v1 submitted 2 October, 2012;
originally announced October 2012.