
Showing 1–19 of 19 results for author: Lemos, P

Searching in archive cs.
  1. arXiv:2405.20971  [pdf, other]

    cs.LG cs.CV

    Amortizing intractable inference in diffusion models for vision, language, and control

    Authors: Siddarth Venkatraman, Moksh Jain, Luca Scimeca, Minsu Kim, Marcin Sendera, Mohsin Hasan, Luke Rowe, Sarthak Mittal, Pablo Lemos, Emmanuel Bengio, Alexandre Adam, Jarrid Rector-Brooks, Yoshua Bengio, Glen Berseth, Nikolay Malkin

    Abstract: Diffusion models have emerged as effective distribution estimators in vision, language, and reinforcement learning, but their use as priors in downstream tasks poses an intractable posterior inference problem. This paper studies amortized sampling of the posterior over data, $\mathbf{x}\sim p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$, in a model that consists of a diffusion generat…

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: Code: https://github.com/GFNOrg/diffusion-finetuning

  2. arXiv:2405.20313  [pdf, other]

    cs.LG q-bio.BM

    Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation

    Authors: Guillaume Huguet, James Vuckovic, Kilian Fatras, Eric Thibodeau-Laufer, Pablo Lemos, Riashat Islam, Cheng-Hao Liu, Jarrid Rector-Brooks, Tara Akhound-Sadegh, Michael Bronstein, Alexander Tong, Avishek Joey Bose

    Abstract: Proteins are essential for almost all biological processes and derive their diverse functions from complex 3D structures, which are in turn determined by their amino acid sequences. In this paper, we exploit the rich biological inductive bias of amino acid sequences and introduce FoldFlow-2, a novel sequence-conditioned SE(3)-equivariant flow matching model for protein structure generation. FoldFl…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: preprint

  3. arXiv:2402.06121  [pdf, other]

    cs.LG stat.ML

    Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

    Authors: Tara Akhound-Sadegh, Jarrid Rector-Brooks, Avishek Joey Bose, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong

    Abstract: Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient -- and…

    Submitted 26 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Published at ICML 2024. Code for iDEM is available at https://github.com/jarridrb/dem

  4. arXiv:2402.05137  [pdf, other]

    astro-ph.IM astro-ph.CO astro-ph.GA cs.LG

    LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology

    Authors: Matthew Ho, Deaglan J. Bartlett, Nicolas Chartier, Carolina Cuesta-Lazaro, Simon Ding, Axel Lapel, Pablo Lemos, Christopher C. Lovell, T. Lucas Makinen, Chirag Modi, Viraj Pandya, Shivam Pandey, Lucia A. Perez, Benjamin Wandelt, Greg L. Bryan

    Abstract: This paper presents the Learning the Universe Implicit Likelihood Inference (LtU-ILI) pipeline, a codebase for rapid, user-friendly, and cutting-edge machine learning (ML) inference in astrophysics and cosmology. The pipeline includes software for implementing various neural architectures, training schemata, priors, and density estimators in a manner easily adaptable to any research workflow. It i…

    Submitted 2 July, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: 22 pages, 10 figures, accepted in the Open Journal of Astrophysics. Code available at https://github.com/maho3/ltu-ili

    Journal ref: 2024 OJA, Vol. 7

  5. arXiv:2402.05098  [pdf, other]

    cs.LG stat.ML

    Improved off-policy training of diffusion samplers

    Authors: Marcin Sendera, Minsu Kim, Sarthak Mittal, Pablo Lemos, Luca Scimeca, Jarrid Rector-Brooks, Alexandre Adam, Yoshua Bengio, Nikolay Malkin

    Abstract: We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while bringing into…

    Submitted 26 May, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: 24 pages; changed title from v2; code: https://github.com/GFNOrg/gfn-diffusion

  6. arXiv:2402.04355  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    PQMass: Probabilistic Assessment of the Quality of Generative Models using Probability Mass Estimation

    Authors: Pablo Lemos, Sammy Sharief, Nikolay Malkin, Laurence Perreault-Levasseur, Yashar Hezaveh

    Abstract: We propose a comprehensive sample-based method for assessing the quality of generative models. The proposed approach enables the estimation of the probability that two sets of samples are drawn from the same distribution, providing a statistically rigorous method for assessing the performance of a single generative model or the comparison of multiple competing models trained on the same dataset. T…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: 14 pages, 13 figures

  7. arXiv:2312.03911  [pdf, other]

    cs.LG stat.CO stat.ME stat.ML

    Improving Gradient-guided Nested Sampling for Posterior Inference

    Authors: Pablo Lemos, Nikolay Malkin, Will Handley, Yoshua Bengio, Yashar Hezaveh, Laurence Perreault-Levasseur

    Abstract: We present a performant, general-purpose gradient-guided nested sampling algorithm, ${\tt GGNS}$, combining the state of the art in differentiable programming, Hamiltonian slice sampling, clustering, mode separation, dynamic nested sampling, and parallelization. This unique combination allows ${\tt GGNS}$ to scale well with dimensionality and perform competitively on a variety of synthetic and rea…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 10 pages, 5 figures. Code available at https://github.com/Pablo-Lemos/GGNS

  8. arXiv:2311.18012  [pdf, other]

    astro-ph.IM cs.CV

    Bayesian Imaging for Radio Interferometry with Score-Based Priors

    Authors: Noe Dia, M. J. Yantovski-Barth, Alexandre Adam, Micah Bowles, Pablo Lemos, Anna M. M. Scaife, Yashar Hezaveh, Laurence Perreault-Levasseur

    Abstract: The inverse imaging task in radio interferometry is a key limiting factor to retrieving Bayesian uncertainties in radio astronomy in a computationally effective manner. We use a score-based prior derived from optical images of galaxies to recover images of protoplanetary disks from the DSHARP survey. We demonstrate that our method produces plausible posterior samples despite the misspecified galax…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: 10+4 pages, 6 figures, Machine Learning and the Physical Sciences Workshop, NeurIPS 2023

  9. arXiv:2310.15256  [pdf, other]

    astro-ph.CO cs.LG

    SimBIG: Field-level Simulation-Based Inference of Galaxy Clustering

    Authors: Pablo Lemos, Liam Parker, ChangHoon Hahn, Shirley Ho, Michael Eickenberg, Jiamin Hou, Elena Massara, Chirag Modi, Azadeh Moradinezhad Dizgah, Bruno Regaldo-Saint Blancard, David Spergel

    Abstract: We present the first simulation-based inference (SBI) of cosmological parameters from field-level analysis of galaxy clustering. Standard galaxy clustering analyses rely on analyzing summary statistics, such as the power spectrum, $P_\ell$, with analytic models based on perturbation theory. Consequently, they do not fully exploit the non-linear and non-Gaussian features of the galaxy distribution.…

    Submitted 23 October, 2023; originally announced October 2023.

    Comments: 14 pages, 4 figures. A previous version of the paper was published in the ICML 2023 Workshop on Machine Learning for Astrophysics

  10. arXiv:2310.14782  [pdf, other]

    cs.LG cs.AI

    Towards equilibrium molecular conformation generation with GFlowNets

    Authors: Alexandra Volokhova, Michał Koziarski, Alex Hernández-García, Cheng-Hao Liu, Santiago Miret, Pablo Lemos, Luca Thiede, Zichao Yan, Alán Aspuru-Guzik, Yoshua Bengio

    Abstract: Sampling diverse, thermodynamically feasible molecular conformations plays a crucial role in predicting properties of a molecule. In this paper we propose to use GFlowNet for sampling conformations of small molecules from the Boltzmann distribution, as determined by the molecule's energy. The proposed approach can be used in combination with energy estimation methods of different fidelity and disc…

    Submitted 20 October, 2023; originally announced October 2023.

  11. arXiv:2302.03026  [pdf, other]

    stat.ML astro-ph.IM cs.LG stat.ME

    Sampling-Based Accuracy Testing of Posterior Estimators for General Inference

    Authors: Pablo Lemos, Adam Coogan, Yashar Hezaveh, Laurence Perreault-Levasseur

    Abstract: Parameter inference, i.e. inferring the posterior distribution of the parameters of a statistical model given some data, is a central problem to many scientific disciplines. Generative models can be used as an alternative to Markov Chain Monte Carlo methods for conducting posterior inference, both in likelihood-based and simulation-based problems. However, assessing the accuracy of posteriors enco…

    Submitted 2 June, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: 15 pages, Accepted at ICML 2023

  12. arXiv:2301.12594  [pdf, other]

    cs.LG stat.ML

    A theory of continuous generative flow networks

    Authors: Salem Lahlou, Tristan Deleu, Pablo Lemos, Dinghuai Zhang, Alexandra Volokhova, Alex Hernández-García, Léna Néhale Ezzine, Yoshua Bengio, Nikolay Malkin

    Abstract: Generative flow networks (GFlowNets) are amortized variational inference algorithms that are trained to sample from unnormalized target distributions over compositional objects. A key limitation of GFlowNets until this time has been that they are restricted to discrete spaces. We present a theory for generalized GFlowNets, which encompasses both existing discrete GFlowNets and ones with continuous…

    Submitted 25 May, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

    Comments: ICML 2023; 32 pages; code: https://github.com/saleml/continuous-gfn

  13. arXiv:2207.08435  [pdf, other]

    astro-ph.CO cs.LG

    Robust Simulation-Based Inference in Cosmology with Bayesian Neural Networks

    Authors: Pablo Lemos, Miles Cranmer, Muntazir Abidi, ChangHoon Hahn, Michael Eickenberg, Elena Massara, David Yallup, Shirley Ho

    Abstract: Simulation-based inference (SBI) is rapidly establishing itself as a standard machine learning technique for analyzing data in cosmological surveys. Despite continual improvements to the quality of density estimation by learned models, applications of such techniques to real data are entirely reliant on the generalization power of neural networks far outside the training distribution, which is mos…

    Submitted 2 March, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: 5 pages, 3 figures. Preliminary version accepted at the ML4Astro Machine Learning for Astrophysics Workshop at the Thirty-ninth International Conference on Machine Learning (ICML 2022). Final version published at Machine Learning: Science and Technology

    Journal ref: Mach. Learn.: Sci. Technol. 4 01LT01 (2023)

  14. arXiv:2205.12841  [pdf, other]

    astro-ph.IM astro-ph.CO cs.LG

    Marginal Post Processing of Bayesian Inference Products with Normalizing Flows and Kernel Density Estimators

    Authors: Harry T. J. Bevins, William J. Handley, Pablo Lemos, Peter H. Sims, Eloy de Lera Acedo, Anastasia Fialkov, Justin Alsing

    Abstract: Bayesian analysis has become an indispensable tool across many different cosmological fields including the study of gravitational waves, the Cosmic Microwave Background and the 21-cm signal from the Cosmic Dawn among other phenomena. The method provides a way to fit complex models to data describing key cosmological and astrophysical signals and a whole host of contaminating signals and instrument…

    Submitted 18 December, 2023; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted for MNRAS

  15. arXiv:2205.11151  [pdf, other]

    stat.ML cs.LG

    Split personalities in Bayesian Neural Networks: the case for full marginalisation

    Authors: David Yallup, Will Handley, Mike Hobson, Anthony Lasenby, Pablo Lemos

    Abstract: The true posterior distribution of a Bayesian neural network is massively multimodal. Whilst most of these modes are functionally equivalent, we demonstrate that there remains a level of real multimodality that manifests in even the simplest neural network setups. It is only by fully marginalising over all posterior modes, using appropriate Bayesian sampling tools, that we can capture the split pe…

    Submitted 23 May, 2022; originally announced May 2022.

    Comments: 10 pages, 5 figures

  16. arXiv:2204.03405  [pdf, other]

    cs.CY

    Recommended Guidelines for Effective MOOCs based on a Multiple-Case Study

    Authors: Eduardo Guerra, Fabio Kon, Paulo Lemos

    Abstract: Massive Open Online Courseware (MOOCs) appeared in 2008 and grew considerably in the past decade, now reaching millions of students and professionals all over the world. MOOCs do not replace other educational forms. Instead, they complement them by offering a powerful educational tool that can reach students that, otherwise, would not have access to that information. Nevertheless, designing and im…

    Submitted 7 April, 2022; originally announced April 2022.

    Report number: Technical Report RT-MAC-2021-02

  17. arXiv:2202.02306  [pdf, other]

    astro-ph.EP astro-ph.IM cs.LG

    Rediscovering orbital mechanics with machine learning

    Authors: Pablo Lemos, Niall Jeffrey, Miles Cranmer, Shirley Ho, Peter Battaglia

    Abstract: We present an approach for using machine learning to automatically discover the governing equations and hidden properties of real physical systems from observations. We train a "graph neural network" to simulate the dynamics of our solar system's Sun, planets, and large moons from 30 years of trajectory data. We then use symbolic regression to discover an analytical expression for the force law im…

    Submitted 4 February, 2022; originally announced February 2022.

    Comments: 12 pages, 6 figures, under review

  18. arXiv:2009.12691  [pdf, other]

    cs.AI

    A Multi-Agent System for Solving the Dynamic Capacitated Vehicle Routing Problem with Stochastic Customers using Trajectory Data Mining

    Authors: Juan Camilo Fonseca-Galindo, Gabriela de Castro Surita, José Maia Neto, Cristiano Leite de Castro, André Paim Lemos

    Abstract: The worldwide growth of e-commerce has created new challenges for logistics companies, one of which is being able to deliver products quickly and at low cost, which reflects directly in the way of sorting packages, needing to eliminate steps such as storage and batch creation. Our work presents a multi-agent system that uses trajectory data mining techniques to extract territorial patterns and use…

    Submitted 26 September, 2020; originally announced September 2020.

  19. arXiv:1805.11472  [pdf, other]

    physics.med-ph cs.CE

    Comparison of 1D and 3D Models for the Estimation of Fractional Flow Reserve

    Authors: P. J. Blanco, C. A. Bulant, L. O. Müller, G. D. Maso Talou, C. Guedes Bezerra, P. L. Lemos, R. A. Feijóo

    Abstract: In this work we propose to validate the predictive capabilities of one-dimensional (1D) blood flow models with full three-dimensional (3D) models in the context of patient-specific coronary hemodynamics in hyperemic conditions. Such conditions mimic the state of coronary circulation during the acquisition of the Fractional Flow Reserve (FFR) index. Demonstrating that 1D models accurately reproduce…

    Submitted 29 May, 2018; originally announced May 2018.

    Comments: 11 pages, 2 figures, 2 tables, submitted to Scientific Reports