Search | arXiv e-print repository

SE(3) Equivariant Augmented Coupling Flows

Authors: Laurence I. Midgley, Vincent Stimper, Javier Antorán, Emile Mathieu, Bernhard Schölkopf, José Miguel Hernández-Lobato

Abstract: Coupling normalizing flows allow for fast sampling and density evaluation, making them the tool of choice for probabilistic modeling of physical systems. However, the standard coupling architecture precludes endowing flows that operate on the Cartesian coordinates of atoms with the SE(3) and permutation invariances of physical systems. This work proposes a coupling flow that preserves SE(3) and pe… ▽ More Coupling normalizing flows allow for fast sampling and density evaluation, making them the tool of choice for probabilistic modeling of physical systems. However, the standard coupling architecture precludes endowing flows that operate on the Cartesian coordinates of atoms with the SE(3) and permutation invariances of physical systems. This work proposes a coupling flow that preserves SE(3) and permutation equivariance by performing coordinate splits along additional augmented dimensions. At each layer, the flow maps atoms' positions into learned SE(3) invariant bases, where we apply standard flow transformations, such as monotonic rational-quadratic splines, before returning to the original basis. Crucially, our flow preserves fast sampling and density evaluation, and may be used to produce unbiased estimates of expectations with respect to the target distribution via importance sampling. When trained on the DW4, LJ13, and QM9-positional datasets, our flow is competitive with equivariant continuous normalizing flows and diffusion models, while allowing sampling more than an order of magnitude faster. Moreover, to the best of our knowledge, we are the first to learn the full Boltzmann distribution of alanine dipeptide by only modeling the Cartesian positions of its atoms. Lastly, we demonstrate that our flow can be trained to approximately sample from the Boltzmann distribution of the DW4 and LJ13 particle systems using only their energy functions. △ Less

Submitted 5 March, 2024; v1 submitted 20 August, 2023; originally announced August 2023.

arXiv:2306.09884 [pdf, other]

Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX

Authors: Clément Bonnet, Daniel Luo, Donal Byrne, Shikha Surana, Sasha Abramowitz, Paul Duckworth, Vincent Coyette, Laurence I. Midgley, Elshadai Tegegn, Tristan Kalloniatis, Omayma Mahjoub, Matthew Macfarlane, Andries P. Smit, Nathan Grinsztajn, Raphael Boige, Cemlyn N. Waters, Mohamed A. Mimouni, Ulrich A. Mbou Sob, Ruan de Kock, Siddarth Singh, Daniel Furelos-Blanco, Victor Le, Arnu Pretorius, Alexandre Laterre

Abstract: Open-source reinforcement learning (RL) environments have played a crucial role in driving progress in the development of AI algorithms. In modern RL research, there is a need for simulated environments that are performant, scalable, and modular to enable their utilization in a wider range of potential real-world applications. Therefore, we present Jumanji, a suite of diverse RL environments speci… ▽ More Open-source reinforcement learning (RL) environments have played a crucial role in driving progress in the development of AI algorithms. In modern RL research, there is a need for simulated environments that are performant, scalable, and modular to enable their utilization in a wider range of potential real-world applications. Therefore, we present Jumanji, a suite of diverse RL environments specifically designed to be fast, flexible, and scalable. Jumanji provides a suite of environments focusing on combinatorial problems frequently encountered in industry, as well as challenging general decision-making tasks. By leveraging the efficiency of JAX and hardware accelerators like GPUs and TPUs, Jumanji enables rapid iteration of research ideas and large-scale experimentation, ultimately empowering more capable agents. Unlike existing RL environment suites, Jumanji is highly customizable, allowing users to tailor the initial state distribution and problem complexity to their needs. Furthermore, we provide actor-critic baselines for each environment, accompanied by preliminary findings on scaling and generalization scenarios. Jumanji aims to set a new standard for speed, adaptability, and scalability of RL environments. △ Less

Submitted 15 March, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

Comments: 9 pages + 21 pages of appendices and references. Published at ICLR 2024

arXiv:2211.04327 [pdf, other]

Synthesis of separation processes with reinforcement learning

Authors: Stephan C. P. A. van Kalmthout, Laurence I. Midgley, Meik B. Franke

Abstract: This paper shows the implementation of reinforcement learning (RL) in commercial flowsheet simulator software (Aspen Plus V12) for designing and optimising a distillation sequence. The aim of the SAC agent was to separate a hydrocarbon mixture in its individual components by utilising distillation. While doing so it tries to maximise the profit produced by the distillation sequence. All actions of… ▽ More This paper shows the implementation of reinforcement learning (RL) in commercial flowsheet simulator software (Aspen Plus V12) for designing and optimising a distillation sequence. The aim of the SAC agent was to separate a hydrocarbon mixture in its individual components by utilising distillation. While doing so it tries to maximise the profit produced by the distillation sequence. All actions of the agent were set by the SAC agent in Python and communicated in Aspen Plus via an API. Here the distillation column was simulated by use of the build-in RADFRAC column. With this a connection was established for data transfer between Python and Aspen and the agent succeeded to show learning behaviour, while increasing profit. Although results were generated, the use of Aspen was slow (190 hours) and Aspen was found unsuitable for parallelisation. This makes that Aspen is incompatible for solving RL problems. Code and thesis are available at https://github.com/lollcat/Aspen-RL △ Less

Submitted 3 November, 2022; originally announced November 2022.

arXiv:2208.01893 [pdf, other]

Flow Annealed Importance Sampling Bootstrap

Authors: Laurence Illing Midgley, Vincent Stimper, Gregor N. C. Simm, Bernhard Schölkopf, José Miguel Hernández-Lobato

Abstract: Normalizing flows are tractable density models that can approximate complicated target distributions, e.g. Boltzmann distributions of physical systems. However, current methods for training flows either suffer from mode-seeking behavior, use samples from the target generated beforehand by expensive MCMC methods, or use stochastic losses that have high variance. To avoid these problems, we augment… ▽ More Normalizing flows are tractable density models that can approximate complicated target distributions, e.g. Boltzmann distributions of physical systems. However, current methods for training flows either suffer from mode-seeking behavior, use samples from the target generated beforehand by expensive MCMC methods, or use stochastic losses that have high variance. To avoid these problems, we augment flows with annealed importance sampling (AIS) and minimize the mass-covering $α$-divergence with $α=2$, which minimizes importance weight variance. Our method, Flow AIS Bootstrap (FAB), uses AIS to generate samples in regions where the flow is a poor approximation of the target, facilitating the discovery of new modes. We apply FAB to multimodal targets and show that we can approximate them very accurately where previous methods fail. To the best of our knowledge, we are the first to learn the Boltzmann distribution of the alanine dipeptide molecule using only the unnormalized target density, without access to samples generated via Molecular Dynamics (MD) simulations: FAB produces better results than training via maximum likelihood on MD samples while using 100 times fewer target evaluations. After reweighting the samples, we obtain unbiased histograms of dihedral angles that are almost identical to the ground truth. △ Less

Submitted 7 March, 2023; v1 submitted 3 August, 2022; originally announced August 2022.

arXiv:2111.11510 [pdf, other]

Bootstrap Your Flow

Authors: Laurence Illing Midgley, Vincent Stimper, Gregor N. C. Simm, José Miguel Hernández-Lobato

Abstract: Normalizing flows are flexible, parameterized distributions that can be used to approximate expectations from intractable distributions via importance sampling. However, current flow-based approaches are limited on challenging targets where they either suffer from mode seeking behaviour or high variance in the training loss, or rely on samples from the target distribution, which may not be availab… ▽ More Normalizing flows are flexible, parameterized distributions that can be used to approximate expectations from intractable distributions via importance sampling. However, current flow-based approaches are limited on challenging targets where they either suffer from mode seeking behaviour or high variance in the training loss, or rely on samples from the target distribution, which may not be available. To address these challenges, we combine flows with annealed importance sampling (AIS), while using the $α$-divergence as our objective, in a novel training procedure, FAB (Flow AIS Bootstrap). Thereby, the flow and AIS improve each other in a bootstrapping manner. We demonstrate that FAB can be used to produce accurate approximations to complex target distributions, including Boltzmann distributions, in problems where previous flow-based methods fail. △ Less

Submitted 14 March, 2022; v1 submitted 22 November, 2021; originally announced November 2021.

arXiv:2009.13265 [pdf, other]

Deep Reinforcement Learning for Process Synthesis

Authors: Laurence Illing Midgley

Abstract: This paper demonstrates the application of reinforcement learning (RL) to process synthesis by presenting Distillation Gym, a set of RL environments in which an RL agent is tasked with designing a distillation train, given a user defined multi-component feed stream. Distillation Gym interfaces with a process simulator (COCO and ChemSep) to simulate the environment. A demonstration of two distillat… ▽ More This paper demonstrates the application of reinforcement learning (RL) to process synthesis by presenting Distillation Gym, a set of RL environments in which an RL agent is tasked with designing a distillation train, given a user defined multi-component feed stream. Distillation Gym interfaces with a process simulator (COCO and ChemSep) to simulate the environment. A demonstration of two distillation problem examples are discussed in this paper (a Benzene, Toluene, P-xylene separation problem and a hydrocarbon separation problem), in which a deep RL agent is successfully able to learn within Distillation Gym to produce reasonable designs. Finally, this paper proposes the creation of Chemical Engineering Gym, an all-purpose reinforcement learning software toolkit for chemical engineering process synthesis. △ Less

Submitted 23 September, 2020; originally announced September 2020.

Showing 1–6 of 6 results for author: Midgley, L I