
arXiv:2405.20971 [pdf, other]

Amortizing intractable inference in diffusion models for vision, language, and control

Authors: Siddarth Venkatraman, Moksh Jain, Luca Scimeca, Minsu Kim, Marcin Sendera, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yoshua Bengio, Glen Berseth, Nikolay Malkin

Abstract: Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem. This paper studies amortized sampling of the posterior over data, $\mathbf{x}\sim p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or likelihood function $r(\mathbf{x})$. We state and prove the asymptotic correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior, a problem that existing methods solve only approximately or in restricted cases. Relative trajectory balance arises from the generative flow network perspective on diffusion models, which allows the use of deep reinforcement learning techniques to improve mode coverage. Experiments illustrate the broad potential of unbiased inference of arbitrary posteriors under diffusion priors: in vision (classifier guidance), language (infilling under a discrete diffusion LLM), and multimodal data (text-to-image generation). Beyond generative modeling, we apply relative trajectory balance to the problem of continuous control with a score-based behavior prior, achieving state-of-the-art results on benchmarks in offline reinforcement learning.

Submitted 31 May, 2024; originally announced May 2024.

Comments: Code: https://github.com/GFNOrg/diffusion-finetuning

arXiv:2405.20313 [pdf, other]

Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation

Authors: Guillaume Huguet, James Vuckovic, Kilian Fatras, Eric Thibodeau-Laufer, Pablo Lemos, Riashat Islam, Cheng-Hao Liu, Jarrid Rector-Brooks, Tara Akhound-Sadegh, Michael Bronstein, Alexander Tong, Avishek Joey Bose

Abstract: Proteins are essential for almost all biological processes and derive their diverse functions from complex 3D structures, which are in turn determined by their amino acid sequences. In this paper, we exploit the rich biological inductive bias of amino acid sequences and introduce FoldFlow-2, a novel sequence-conditioned SE(3)-equivariant flow matching model for protein structure generation. FoldFlow-2 presents substantial new architectural features over the previous FoldFlow family of models including a protein large language model to encode sequence, a new multi-modal fusion trunk that combines structure and sequence representations, and a geometric transformer based decoder. To increase diversity and novelty of generated samples -- crucial for de-novo drug design -- we train FoldFlow-2 at scale on a new dataset that is an order of magnitude larger than PDB datasets of prior works, containing both known proteins in PDB and high-quality synthetic structures achieved through filtering. We further demonstrate the ability to align FoldFlow-2 to arbitrary rewards, e.g. increasing secondary structures diversity, by introducing a Reinforced Finetuning (ReFT) objective. We empirically observe that FoldFlow-2 outperforms previous state-of-the-art protein structure-based generative models, improving over RFDiffusion in terms of unconditional generation across all metrics including designability, diversity, and novelty across all protein lengths, as well as exhibiting generalization on the task of equilibrium conformation sampling. Finally, we demonstrate that a fine-tuned FoldFlow-2 makes progress on challenging conditional design tasks such as designing scaffolds for the VHH nanobody.

Submitted 30 May, 2024; originally announced May 2024.

Comments: preprint

arXiv:2402.06121 [pdf, other]

Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

Authors: Tara Akhound-Sadegh, Jarrid Rector-Brooks, Avishek Joey Bose, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong

Abstract: Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient -- and no data samples -- to train a diffusion-based sampler. Specifically, iDEM alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our stochastic matching objective to further improve the sampler. iDEM is scalable to high dimensions as the inner matching objective, is simulation-free, and requires no MCMC samples. Moreover, by leveraging the fast mode mixing behavior of diffusion, iDEM smooths out the energy landscape enabling efficient exploration and learning of an amortized sampler. We evaluate iDEM on a suite of tasks ranging from standard synthetic energy functions to invariant $n$-body particle systems. We show that the proposed approach achieves state-of-the-art performance on all metrics and trains $2-5\times$ faster, which allows it to be the first method to train using energy on the challenging $55$-particle Lennard-Jones system.

Submitted 26 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

Comments: Published at ICML 2024. Code for iDEM is available at https://github.com/jarridrb/dem

arXiv:2402.05137 [pdf, other]

doi 10.33232/001c.120559

LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology

Authors: Matthew Ho, Deaglan J. Bartlett, Nicolas Chartier, Carolina Cuesta-Lazaro, Simon Ding, Axel Lapel, Pablo Lemos, Christopher C. Lovell, T. Lucas Makinen, Chirag Modi, Viraj Pandya, Shivam Pandey, Lucia A. Perez, Benjamin Wandelt, Greg L. Bryan

Abstract: This paper presents the Learning the Universe Implicit Likelihood Inference (LtU-ILI) pipeline, a codebase for rapid, user-friendly, and cutting-edge machine learning (ML) inference in astrophysics and cosmology. The pipeline includes software for implementing various neural architectures, training schemata, priors, and density estimators in a manner easily adaptable to any research workflow. It includes comprehensive validation metrics to assess posterior estimate coverage, enhancing the reliability of inferred results. Additionally, the pipeline is easily parallelizable and is designed for efficient exploration of modeling hyperparameters. To demonstrate its capabilities, we present real applications across a range of astrophysics and cosmology problems, such as: estimating galaxy cluster masses from X-ray photometry; inferring cosmology from matter power spectra and halo point clouds; characterizing progenitors in gravitational wave signals; capturing physical dust parameters from galaxy colors and luminosities; and establishing properties of semi-analytic models of galaxy formation. We also include exhaustive benchmarking and comparisons of all implemented methods as well as discussions about the challenges and pitfalls of ML inference in astronomical sciences. All code and examples are made publicly available at https://github.com/maho3/ltu-ili.

Submitted 2 July, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

Comments: 22 pages, 10 figures, accepted in the Open Journal of Astrophysics. Code available at https://github.com/maho3/ltu-ili

Journal ref: 2024 OJA, Vol. 7

arXiv:2402.05098 [pdf, other]

Improved off-policy training of diffusion samplers

Authors: Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin

Abstract: We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space with the use of a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at https://github.com/GFNOrg/gfn-diffusion as a base for future work on diffusion models for amortized inference.

Submitted 26 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

Comments: 24 pages; changed title from v2; code: https://github.com/GFNOrg/gfn-diffusion

arXiv:2402.04355 [pdf, other]

PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation

Authors: Pablo Lemos, Sammy Sharief, Nikolay Malkin, Laurence Perreault-Levasseur, Yashar Hezaveh

Abstract: We propose a comprehensive sample-based method for assessing the quality of generative models. The proposed approach enables the estimation of the probability that two sets of samples are drawn from the same distribution, providing a statistically rigorous method for assessing the performance of a single generative model or the comparison of multiple competing models trained on the same dataset. This comparison can be conducted by dividing the space into non-overlapping regions and comparing the number of data samples in each region. The method only requires samples from the generative model and the test data. It is capable of functioning directly on high-dimensional data, obviating the need for dimensionality reduction. Significantly, the proposed method does not depend on assumptions regarding the density of the true distribution, and it does not rely on training or fitting any auxiliary models. Instead, it focuses on approximating the integral of the density (probability mass) across various sub-regions within the data space.

Submitted 6 February, 2024; originally announced February 2024.

Comments: 14 pages, 13 figures

arXiv:2312.03911 [pdf, other]

Improving Gradient-guided Nested Sampling for Posterior Inference

Authors: Pablo Lemos, Nikolay Malkin, Will Handley, Yoshua Bengio, Yashar Hezaveh, Laurence Perreault-Levasseur

Abstract: We present a performant, general-purpose gradient-guided nested sampling algorithm, ${\tt GGNS}$, combining the state of the art in differentiable programming, Hamiltonian slice sampling, clustering, mode separation, dynamic nested sampling, and parallelization. This unique combination allows ${\tt GGNS}$ to scale well with dimensionality and perform competitively on a variety of synthetic and real-world problems. We also show the potential of combining nested sampling with generative flow networks to obtain large amounts of high-quality samples from the posterior distribution. This combination leads to faster mode discovery and more accurate estimates of the partition function.

Submitted 6 December, 2023; originally announced December 2023.

Comments: 10 pages, 5 figures. Code available at https://github.com/Pablo-Lemos/GGNS

arXiv:2311.18012 [pdf, other]

Bayesian Imaging for Radio Interferometry with Score-Based Priors

Authors: Noe Dia, M. J. Yantovski-Barth, Alexandre Adam, Micah Bowles, Pablo Lemos, Anna M. M. Scaife, Yashar Hezaveh, Laurence Perreault-Levasseur

Abstract: The inverse imaging task in radio interferometry is a key limiting factor to retrieving Bayesian uncertainties in radio astronomy in a computationally effective manner. We use a score-based prior derived from optical images of galaxies to recover images of protoplanetary disks from the DSHARP survey. We demonstrate that our method produces plausible posterior samples despite the misspecified galaxy prior. We show that our approach produces results which are competitive with existing radio interferometry imaging algorithms.

Submitted 29 November, 2023; originally announced November 2023.

Comments: 10+4 pages, 6 figures, Machine Learning and the Physical Sciences Workshop, NeurIPS 2023

arXiv:2310.15256 [pdf, other]

SimBIG: Field-level Simulation-Based Inference of Galaxy Clustering

Authors: Pablo Lemos, Liam Parker, ChangHoon Hahn, Shirley Ho, Michael Eickenberg, Jiamin Hou, Elena Massara, Chirag Modi, Azadeh Moradinezhad Dizgah, Bruno Regaldo-Saint Blancard, David Spergel

Abstract: We present the first simulation-based inference (SBI) of cosmological parameters from field-level analysis of galaxy clustering. Standard galaxy clustering analyses rely on analyzing summary statistics, such as the power spectrum, $P_\ell$, with analytic models based on perturbation theory. Consequently, they do not fully exploit the non-linear and non-Gaussian features of the galaxy distribution. To address these limitations, we use the SimBIG forward modelling framework to perform SBI using normalizing flows. We apply SimBIG to a subset of the BOSS CMASS galaxy sample using a convolutional neural network with stochastic weight averaging to perform massive data compression of the galaxy field. We infer constraints on $\Omega_m = 0.267^{+0.033}_{-0.029}$ and $\sigma_8=0.762^{+0.036}_{-0.035}$. While our constraints on $\Omega_m$ are in-line with standard $P_\ell$ analyses, those on $\sigma_8$ are $2.65\times$ tighter. Our analysis also provides constraints on the Hubble constant $H_0=64.5 \pm 3.8 \ {\rm km / s / Mpc}$ from galaxy clustering alone. This higher constraining power comes from additional non-Gaussian cosmological information, inaccessible with $P_\ell$. We demonstrate the robustness of our analysis by showcasing our ability to infer unbiased cosmological constraints from a series of test simulations that are constructed using different forward models than the one used in our training dataset. This work not only presents competitive cosmological constraints but also introduces novel methods for leveraging additional cosmological information in upcoming galaxy surveys like DESI, PFS, and Euclid.

Submitted 23 October, 2023; originally announced October 2023.

Comments: 14 pages, 4 figures. A previous version of the paper was published in the ICML 2023 Workshop on Machine Learning for Astrophysics

arXiv:2310.14782 [pdf, other]

Towards equilibrium molecular conformation generation with GFlowNets

Authors: Alexandra Volokhova, Michał Koziarski, Alex Hernández-García, Cheng-Hao Liu, Santiago Miret, Pablo Lemos, Luca Thiede, Zichao Yan, Alán Aspuru-Guzik, Yoshua Bengio

Abstract: Sampling diverse, thermodynamically feasible molecular conformations plays a crucial role in predicting properties of a molecule. In this paper we propose to use GFlowNet for sampling conformations of small molecules from the Boltzmann distribution, as determined by the molecule's energy. The proposed approach can be used in combination with energy estimation methods of different fidelity and discovers a diverse set of low-energy conformations for highly flexible drug-like molecules. We demonstrate that GFlowNet can reproduce molecular potential energy surfaces by sampling proportionally to the Boltzmann distribution.

Submitted 20 October, 2023; originally announced October 2023.

arXiv:2302.03026 [pdf, other]

Sampling-Based Accuracy Testing of Posterior Estimators for General Inference

Authors: Pablo Lemos, Adam Coogan, Yashar Hezaveh, Laurence Perreault-Levasseur

Abstract: Parameter inference, i.e. inferring the posterior distribution of the parameters of a statistical model given some data, is a central problem to many scientific disciplines. Generative models can be used as an alternative to Markov Chain Monte Carlo methods for conducting posterior inference, both in likelihood-based and simulation-based problems. However, assessing the accuracy of posteriors encoded in generative models is not straightforward. In this paper, we introduce 'Tests of Accuracy with Random Points' (TARP) coverage testing as a method to estimate coverage probabilities of generative posterior estimators. Our method differs from previously-existing coverage-based methods, which require posterior evaluations. We prove that our approach is necessary and sufficient to show that a posterior estimator is accurate. We demonstrate the method on a variety of synthetic examples, and show that TARP can be used to test the results of posterior inference analyses in high-dimensional spaces. We also show that our method can detect inaccurate inferences in cases where existing methods fail.

Submitted 2 June, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

Comments: 15 pages, Accepted at ICML 2023

arXiv:2301.12594 [pdf, other]

A theory of continuous generative flow networks

Authors: Salem Lahlou, Tristan Deleu, Pablo Lemos, Dinghuai Zhang, Alexandra Volokhova, Alex Hernández-García, Léna Néhale Ezzine, Yoshua Bengio, Nikolay Malkin

Abstract: Generative flow networks (GFlowNets) are amortized variational inference algorithms that are trained to sample from unnormalized target distributions over compositional objects. A key limitation of GFlowNets until this time has been that they are restricted to discrete spaces. We present a theory for generalized GFlowNets, which encompasses both existing discrete GFlowNets and ones with continuous or hybrid state spaces, and perform experiments with two goals in mind. First, we illustrate critical points of the theory and the importance of various assumptions. Second, we empirically demonstrate how observations about discrete GFlowNets transfer to the continuous case and show strong results compared to non-GFlowNet baselines on several previously studied tasks. This work greatly widens the perspectives for the application of GFlowNets in probabilistic inference and various modeling settings.

Submitted 25 May, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

Comments: ICML 2023; 32 pages; code: https://github.com/saleml/continuous-gfn

arXiv:2207.08435 [pdf, other]

doi 10.1088/2632-2153/acbb53

Robust Simulation-Based Inference in Cosmology with Bayesian Neural Networks

Authors: Pablo Lemos, Miles Cranmer, Muntazir Abidi, ChangHoon Hahn, Michael Eickenberg, Elena Massara, David Yallup, Shirley Ho

Abstract: Simulation-based inference (SBI) is rapidly establishing itself as a standard machine learning technique for analyzing data in cosmological surveys. Despite continual improvements to the quality of density estimation by learned models, applications of such techniques to real data are entirely reliant on the generalization power of neural networks far outside the training distribution, which is mostly unconstrained. Due to the imperfections in scientist-created simulations, and the large computational expense of generating all possible parameter combinations, SBI methods in cosmology are vulnerable to such generalization issues. Here, we discuss the effects of both issues, and show how using a Bayesian neural network framework for training SBI can mitigate biases, and result in more reliable inference outside the training set. We introduce cosmoSWAG, the first application of Stochastic Weight Averaging to cosmology, and apply it to SBI trained for inference on the cosmic microwave background.

Submitted 2 March, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

Comments: 5 pages, 3 figures. Preliminary version accepted at the ML4Astro Machine Learning for Astrophysics Workshop at the Thirty-ninth International Conference on Machine Learning (ICML 2022). Final version published at Machine Learning: Science and Technology

Journal ref: Mach. Learn.: Sci. Technol. 4 01LT01 (2023)

arXiv:2205.12841 [pdf, other]

Marginal Post Processing of Bayesian Inference Products with Normalizing Flows and Kernel Density Estimators

Authors: Harry T. J. Bevins, William J. Handley, Pablo Lemos, Peter H. Sims, Eloy de Lera Acedo, Anastasia Fialkov, Justin Alsing

Abstract: Bayesian analysis has become an indispensable tool across many different cosmological fields including the study of gravitational waves, the Cosmic Microwave Background and the 21-cm signal from the Cosmic Dawn among other phenomena. The method provides a way to fit complex models to data describing key cosmological and astrophysical signals and a whole host of contaminating signals and instrumental effects modelled with 'nuisance parameters'. In this paper, we summarise a method that uses Masked Autoregressive Flows and Kernel Density Estimators to learn marginal posterior densities corresponding to core science parameters. We find that the marginal or 'nuisance-free' posteriors and the associated likelihoods have an abundance of applications including: the calculation of previously intractable marginal Kullback-Leibler divergences and marginal Bayesian Model Dimensionalities, likelihood emulation and prior emulation. We demonstrate each application using toy examples, examples from the field of 21-cm cosmology and samples from the Dark Energy Survey. We discuss how marginal summary statistics like the Kullback-Leibler divergences and Bayesian Model Dimensionalities can be used to examine the constraining power of different experiments and how we can perform efficient joint analysis by taking advantage of marginal prior and likelihood emulators. We package our multipurpose code up in the pip-installable code margarine for use in the wider scientific community.

Submitted 18 December, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

Comments: Accepted for MNRAS

arXiv:2205.11151 [pdf, other]

Split personalities in Bayesian Neural Networks: the case for full marginalisation

Authors: David Yallup, Will Handley, Mike Hobson, Anthony Lasenby, Pablo Lemos

Abstract: The true posterior distribution of a Bayesian neural network is massively multimodal. Whilst most of these modes are functionally equivalent, we demonstrate that there remains a level of real multimodality that manifests in even the simplest neural network setups. It is only by fully marginalising over all posterior modes, using appropriate Bayesian sampling tools, that we can capture the split personalities of the network. The ability of a network trained in this manner to reason between multiple candidate solutions dramatically improves the generalisability of the model, a feature we contend is not consistently captured by alternative approaches to the training of Bayesian neural networks. We provide a concise minimal example of this, which can provide lessons and a future path forward for correctly utilising the explainability and interpretability of Bayesian neural networks.

Submitted 23 May, 2022; originally announced May 2022.

Comments: 10 pages, 5 figures

arXiv:2204.03405 [pdf, other]

Recommended Guidelines for Effective MOOCs based on a Multiple-Case Study

Authors: Eduardo Guerra, Fabio Kon, Paulo Lemos

Abstract: Massive Open Online Courseware (MOOCs) appeared in 2008 and grew considerably in the past decade, now reaching millions of students and professionals all over the world. MOOCs do not replace other educational forms. Instead, they complement them by offering a powerful educational tool that can reach students that, otherwise, would not have access to that information. Nevertheless, designing and implementing a successful MOOC is not straightforward. Simply recording traditional classes is an approach that does not work, since the conditions in which a MOOC student learns are very different from the conventional classroom. In particular, dropout rates in MOOCs are, normally, at least an order of magnitude higher than in conventional courses. In this paper, we analyze data from 7 successful MOOCs that have attracted over 150,000 students in the past years. The analysis led to the proposal of a set of guidelines to help instructors in designing more effective MOOCs. These results contribute to the existing body of knowledge in the field, bring new insights, and pose new questions for future research.

Submitted 7 April, 2022; originally announced April 2022.

Report number: Technical Report RT-MAC-2021-02

arXiv:2202.02306 [pdf, other]

Rediscovering orbital mechanics with machine learning

Authors: Pablo Lemos, Niall Jeffrey, Miles Cranmer, Shirley Ho, Peter Battaglia

Abstract: We present an approach for using machine learning to automatically discover the governing equations and hidden properties of real physical systems from observations. We train a "graph neural network" to simulate the dynamics of our solar system's Sun, planets, and large moons from 30 years of trajectory data. We then use symbolic regression to discover an analytical expression for the force law implicitly learned by the neural network, which our results showed is equivalent to Newton's law of gravitation. The key assumptions that were required were translational and rotational equivariance, and Newton's second and third laws of motion. Our approach correctly discovered the form of the symbolic force law. Furthermore, our approach did not require any assumptions about the masses of planets and moons or physical constants. They, too, were accurately inferred through our methods. Though, of course, the classical law of gravitation has been known since Isaac Newton, our result serves as a validation that our method can discover unknown laws and hidden properties from observed data. More broadly this work represents a key step toward realizing the potential of machine learning for accelerating scientific discovery.

Submitted 4 February, 2022; originally announced February 2022.

Comments: 12 pages, 6 figures, under review

arXiv:2009.12691 [pdf, other]

A Multi-Agent System for Solving the Dynamic Capacitated Vehicle Routing Problem with Stochastic Customers using Trajectory Data Mining

Authors: Juan Camilo Fonseca-Galindo, Gabriela de Castro Surita, José Maia Neto, Cristiano Leite de Castro, André Paim Lemos

Abstract: The worldwide growth of e-commerce has created new challenges for logistics companies, one of which is being able to deliver products quickly and at low cost, which reflects directly in the way of sorting packages, needing to eliminate steps such as storage and batch creation. Our work presents a multi-agent system that uses trajectory data mining techniques to extract territorial patterns and use them in the dynamic creation of last-mile routes. The problem can be modeled as a Dynamic Capacitated Vehicle Routing Problem (VRP) with Stochastic Customer, being therefore NP-HARD, what makes its implementation unfeasible for many packages. The work's main contribution is to solve this problem only depending on the Warehouse system configurations and not on the number of packages processed, which is appropriate for Big Data scenarios commonly present in the delivery of e-commerce products. Computational experiments were conducted for single and multi depot instances. Due to its probabilistic nature, the proposed approach presented slightly lower performances when compared to the static VRP algorithm. However, the operational gains that our solution provides making it very attractive for situations in which the routes must be set dynamically.

Submitted 26 September, 2020; originally announced September 2020.

arXiv:1805.11472 [pdf, other]

Comparison of 1D and 3D Models for the Estimation of Fractional Flow Reserve

Authors: P. J. Blanco, C. A. Bulant, L. O. Müller, G. D. Maso Talou, C. Guedes Bezerra, P. L. Lemos, R. A. Feijóo

Abstract: In this work we propose to validate the predictive capabilities of one-dimensional (1D) blood flow models with full three-dimensional (3D) models in the context of patient-specific coronary hemodynamics in hyperemic conditions. Such conditions mimic the state of coronary circulation during the acquisition of the Fractional Flow Reserve (FFR) index. Demonstrating that 1D models accurately reproduce FFR estimates obtained with 3D models has implications in the approach to computationally estimate FFR. To this end, a sample of 20 patients was employed from which 29 3D geometries of arterial trees were constructed, 9 obtained from coronary computed tomography angiography (CCTA) and 20 from intra-vascular ultrasound (IVUS). For each 3D arterial model, a 1D counterpart was generated. The same outflow and inlet pressure boundary conditions were applied to both (3D and 1D) models. In the 1D setting, pressure losses at stenoses and bifurcations were accounted for through specific lumped models. Comparisons between 1D models ($\text{FFR}_{\text{1D}}$) and 3D models ($\text{FFR}_{\text{3D}}$) were performed in terms of predicted $\text{FFR}$ value. Compared to $\text{FFR}_{\text{3D}}$, $\text{FFR}_{\text{1D}}$ resulted with a difference of 0.00$\pm$0.03 and overall predictive capability AUC, Acc, Spe, Sen, PPV and NPV of 0.97, 0.98, 0.90, 0.99, 0.82, and 0.99, with an FFR threshold of 0.8. We conclude that inexpensive $\text{FFR}_{\text{1D}}$ simulations can be reliably used as a surrogate of demanding $\text{FFR}_{\text{3D}}$ computations.

Submitted 29 May, 2018; originally announced May 2018.

Comments: 11 pages, 2 figures, 2 tables, submitted to Scientific Reports

Showing 1–19 of 19 results for author: Lemos, P