Showing 1–45 of 45 results for author: Fortuin, V

Searching in archive cs.
  1. arXiv:2407.13711  [pdf, other]

    cs.LG cs.AI

    FSP-Laplace: Function-Space Priors for the Laplace Approximation in Bayesian Deep Learning

    Authors: Tristan Cinquin, Marvin Pförtner, Vincent Fortuin, Philipp Hennig, Robert Bamler

    Abstract: Laplace approximations are popular techniques for endowing deep networks with epistemic uncertainty estimates as they can be applied without altering the predictions of the neural network, and they scale to large models and datasets. While the choice of prior strongly affects the resulting posterior distribution, computational tractability and lack of interpretability of weight space typically lim…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.13429  [pdf, other]

    cs.LG cs.AI

    Towards Dynamic Feature Acquisition on Medical Time Series by Maximizing Conditional Mutual Information

    Authors: Fedor Sergeev, Paola Malsot, Gunnar Rätsch, Vincent Fortuin

    Abstract: Knowing which features of a multivariate time series to measure and when is a key task in medicine, wearables, and robotics. Better acquisition policies can reduce costs while maintaining or even improving the performance of downstream predictors. Inspired by the maximization of conditional mutual information, we propose an approach to train acquirers end-to-end using only the downstream loss. We…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Presented at the ICML 2024 Next Generation of Sequence Modeling Architectures (NGSM) Workshop

  3. arXiv:2406.06459  [pdf, other]

    cs.LG

    How Useful is Intermittent, Asynchronous Expert Feedback for Bayesian Optimization?

    Authors: Agustinus Kristiadi, Felix Strieth-Kalthoff, Sriram Ganapathi Subramanian, Vincent Fortuin, Pascal Poupart, Geoff Pleiss

    Abstract: Bayesian optimization (BO) is an integral part of automated scientific discovery -- the so-called self-driving lab -- where human inputs are ideally minimal or at least non-blocking. However, scientists often have strong intuition, and thus human feedback is still useful. Nevertheless, prior works in enhancing BO with expert feedback, such as by incorporating it in an offline or online but blockin…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: AABI 2024. Code: https://github.com/wiseodd/bo-async-feedback

  4. arXiv:2405.03425  [pdf, other]

    cs.CL

    Gaussian Stochastic Weight Averaging for Bayesian Low-Rank Adaptation of Large Language Models

    Authors: Emre Onal, Klemens Flöge, Emma Caldwell, Arsen Sheverdin, Vincent Fortuin

    Abstract: Fine-tuned Large Language Models (LLMs) often suffer from overconfidence and poor calibration, particularly when fine-tuned on small datasets. To address these challenges, we propose a simple combination of Low-Rank Adaptation (LoRA) with Gaussian Stochastic Weight Averaging (SWAG), facilitating approximate Bayesian inference in LLMs. Through extensive testing across several Natural Language Proce…

    Submitted 20 July, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

    Comments: 14 pages, 1 figure, 2 tables

  5. arXiv:2403.00025  [pdf, ps, other]

    cs.LG cs.AI

    On the Challenges and Opportunities in Generative AI

    Authors: Laura Manduchi, Kushagra Pandey, Robert Bamler, Ryan Cotterell, Sina Däubener, Sophie Fellenz, Asja Fischer, Thomas Gärtner, Matthias Kirchler, Marius Kloft, Yingzhen Li, Christoph Lippert, Gerard de Melo, Eric Nalisnick, Björn Ommer, Rajesh Ranganath, Maja Rudolph, Karen Ullrich, Guy Van den Broeck, Julia E Vogt, Yixin Wang, Florian Wenzel, Frank Wood, Stephan Mandt, Vincent Fortuin

    Abstract: The field of deep generative modeling has grown rapidly and consistently over the years. With the availability of massive amounts of training data coupled with advances in scalable unsupervised learning paradigms, recent large-scale generative models show tremendous promise in synthesizing high-resolution images and text, as well as structured data such as videos and molecules. However, we argue t…

    Submitted 28 February, 2024; originally announced March 2024.

  6. arXiv:2402.15978  [pdf, other]

    cs.LG stat.ML

    Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks Using the Marginal Likelihood

    Authors: Rayen Dhahri, Alexander Immer, Bertrand Charpentier, Stephan Günnemann, Vincent Fortuin

    Abstract: Neural network sparsification is a promising avenue to save computational time and memory costs, especially in an age where many successful AI models are becoming too large to naïvely deploy on consumer hardware. While much work has focused on different weight pruning criteria, the overall sparsifiability of the network, i.e., its capacity to be pruned without quality loss, has often been overlook…

    Submitted 24 February, 2024; originally announced February 2024.

  7. arXiv:2402.00809  [pdf, other]

    cs.LG stat.ML

    Position: Bayesian Deep Learning is Needed in the Age of Large-Scale AI

    Authors: Theodore Papamarkou, Maria Skoularidou, Konstantina Palla, Laurence Aitchison, Julyan Arbel, David Dunson, Maurizio Filippone, Vincent Fortuin, Philipp Hennig, José Miguel Hernández-Lobato, Aliaksandr Hubin, Alexander Immer, Theofanis Karaletsos, Mohammad Emtiyaz Khan, Agustinus Kristiadi, Yingzhen Li, Stephan Mandt, Christopher Nemeth, Michael A. Osborne, Tim G. J. Rudner, David Rügamer, Yee Whye Teh, Max Welling, Andrew Gordon Wilson, Ruqi Zhang

    Abstract: In the current landscape of deep learning research, there is a predominant emphasis on achieving high predictive accuracy in supervised tasks involving large image and language datasets. However, a broader perspective reveals a multitude of overlooked metrics, tasks, and data types, such as uncertainty, active and continual learning, and scientific data, that demand attention. Bayesian deep learni…

    Submitted 6 August, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  8. arXiv:2312.00232  [pdf, other]

    cs.LG cs.AI stat.ML

    Uncertainty in Graph Contrastive Learning with Bayesian Neural Networks

    Authors: Alexander Möllers, Alexander Immer, Elvin Isufi, Vincent Fortuin

    Abstract: Graph contrastive learning has shown great promise when labeled data is scarce, but large unlabeled datasets are available. However, it often does not take uncertainty estimation into account. We show that a variational Bayesian neural network approach can be used to improve not only the uncertainty estimates but also the downstream performance on semi-supervised node-classification tasks. Moreove…

    Submitted 30 November, 2023; originally announced December 2023.

  9. arXiv:2310.20053  [pdf, other]

    stat.ML cs.LG

    Estimating optimal PAC-Bayes bounds with Hamiltonian Monte Carlo

    Authors: Szilvia Ujváry, Gergely Flamich, Vincent Fortuin, José Miguel Hernández Lobato

    Abstract: An important yet underexplored question in the PAC-Bayes literature is how much tightness we lose by restricting the posterior family to factorized Gaussian distributions when optimizing a PAC-Bayes bound. We investigate this issue by estimating data-independent PAC-Bayes bounds using the optimal posteriors, comparing them to bounds obtained using MFVI. Concretely, we (1) sample from the optimal G…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Mathematics of Modern Machine Learning Workshop at NeurIPS 2023

    ACM Class: G.3

  10. arXiv:2309.16314  [pdf, other]

    stat.ML cs.LG math.ST stat.CO

    A Primer on Bayesian Neural Networks: Review and Debates

    Authors: Julyan Arbel, Konstantinos Pitas, Mariia Vladimirova, Vincent Fortuin

    Abstract: Neural networks have achieved remarkable performance across various problem domains, but their widespread applicability is hindered by inherent limitations such as overconfidence in predictions, lack of interpretability, and vulnerability to adversarial attacks. To address these challenges, Bayesian neural networks (BNNs) have emerged as a compelling extension of conventional neural networks, inte…

    Submitted 28 September, 2023; originally announced September 2023.

    Comments: 65 pages

  11. arXiv:2309.07364  [pdf, other]

    cs.LG cs.AI eess.SP

    Hodge-Aware Contrastive Learning

    Authors: Alexander Möllers, Alexander Immer, Vincent Fortuin, Elvin Isufi

    Abstract: Simplicial complexes prove effective in modeling data with multiway dependencies, such as data defined along the edges of networks or within other higher-order structures. Their spectrum can be decomposed into three interpretable subspaces via the Hodge decomposition, resulting foundational in numerous applications. We leverage this decomposition to develop a contrastive self-supervised learning a…

    Submitted 13 September, 2023; originally announced September 2023.

    Comments: 4 pages, 2 figures

  12. arXiv:2306.16717  [pdf, other]

    stat.ML cs.LG

    Understanding Pathologies of Deep Heteroskedastic Regression

    Authors: Eliot Wong-Toi, Alex Boyd, Vincent Fortuin, Stephan Mandt

    Abstract: Deep, overparameterized regression models are notorious for their tendency to overfit. This problem is exacerbated in heteroskedastic models, which predict both mean and residual noise for each data point. At one extreme, these models fit all training data perfectly, eliminating residual noise entirely; at the other, they overfit the residual noise while predicting a constant, uninformative mean.…

    Submitted 13 February, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: 20 pages, 8 figures

  13. arXiv:2305.16905  [pdf, other]

    stat.ML cs.LG

    Improving Neural Additive Models with Bayesian Principles

    Authors: Kouroche Bouchiat, Alexander Immer, Hugo Yèche, Gunnar Rätsch, Vincent Fortuin

    Abstract: Neural additive models (NAMs) enhance the transparency of deep neural networks by handling input features in separate additive sub-networks. However, they lack inherent mechanisms that provide calibrated uncertainties and enable selection of relevant features and interactions. Approaching NAMs from a Bayesian perspective, we augment them in three primary ways, namely by a) providing credible inter…

    Submitted 29 May, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 41st International Conference on Machine Learning (ICML 2024)

  14. arXiv:2304.08309  [pdf, other]

    cs.LG stat.ML

    Promises and Pitfalls of the Linearized Laplace in Bayesian Optimization

    Authors: Agustinus Kristiadi, Alexander Immer, Runa Eschenhagen, Vincent Fortuin

    Abstract: The linearized-Laplace approximation (LLA) has been shown to be effective and efficient in constructing Bayesian neural networks. It is theoretically compelling since it can be seen as a Gaussian process posterior with the mean function given by the neural network's maximum-a-posteriori predictive function and the covariance function induced by the empirical neural tangent kernel. However, while i…

    Submitted 10 July, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

    Comments: AABI 2023

  15. arXiv:2304.01762  [pdf, other]

    cs.LG cs.AI stat.ML

    Incorporating Unlabelled Data into Bayesian Neural Networks

    Authors: Mrinank Sharma, Tom Rainforth, Yee Whye Teh, Vincent Fortuin

    Abstract: Conventional Bayesian Neural Networks (BNNs) are unable to leverage unlabelled data to improve their predictions. To overcome this limitation, we introduce Self-Supervised Bayesian Neural Networks, which use unlabelled data to learn models with suitable prior predictive distributions. This is achieved by leveraging contrastive pretraining techniques and optimising a variational lower bound. We the…

    Submitted 30 August, 2024; v1 submitted 4 April, 2023; originally announced April 2023.

    Comments: Published in the Transactions on Machine Learning Research

  16. arXiv:2211.07206  [pdf, other]

    stat.ML cs.LG

    Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior: From Theory to Practice

    Authors: Jonas Rothfuss, Martin Josifoski, Vincent Fortuin, Andreas Krause

    Abstract: Meta-Learning aims to speed up the learning process on new tasks by acquiring useful inductive biases from datasets of related learning tasks. While, in practice, the number of related tasks available is often small, most of the existing approaches assume an abundance of tasks; making them unrealistic and prone to overfitting. A central question in the meta-learning literature is how to regularize…

    Submitted 22 December, 2023; v1 submitted 14 November, 2022; originally announced November 2022.

    Comments: JMLR, 62 pages, text overlap with arXiv:2002.05551

    Journal ref: Journal of Machine Learning Research (24), 2023, 1-62

  17. arXiv:2202.10638  [pdf, other]

    stat.ML cs.LG

    Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations

    Authors: Alexander Immer, Tycho F. A. van der Ouderaa, Gunnar Rätsch, Vincent Fortuin, Mark van der Wilk

    Abstract: Data augmentation is commonly applied to improve performance of deep learning by enforcing the knowledge that certain transformations on the input preserve the output. Currently, the data augmentation parameters are chosen by human effort and costly cross-validation, which makes it cumbersome to apply to new datasets. We develop a convenient gradient-based method for selecting the data augmentatio…

    Submitted 13 October, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

    Comments: NeurIPS 2022

  18. arXiv:2110.08388  [pdf, other]

    cs.CL

    Probing as Quantifying Inductive Bias

    Authors: Alexander Immer, Lucas Torroba Hennigen, Vincent Fortuin, Ryan Cotterell

    Abstract: Pre-trained contextual representations have led to dramatic performance improvements on a range of downstream tasks. Such performance improvements have motivated researchers to quantify and understand the linguistic information encoded in these representations. In general, researchers quantify the amount of linguistic information through probing, an endeavor which consists of training a supervised…

    Submitted 24 March, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ACL 2022

  19. arXiv:2110.04020  [pdf, other]

    cs.LG stat.ML

    Pathologies in priors and inference for Bayesian transformers

    Authors: Tristan Cinquin, Alexander Immer, Max Horn, Vincent Fortuin

    Abstract: In recent years, the transformer has established itself as a workhorse in many applications ranging from natural language processing to reinforcement learning. Similarly, Bayesian deep learning has become the gold-standard for uncertainty estimation in safety-critical applications, where robustness and calibration are crucial. Surprisingly, no successful attempts to improve transformer models in t…

    Submitted 15 October, 2021; v1 submitted 8 October, 2021; originally announced October 2021.

  20. arXiv:2110.03360  [pdf, other]

    cs.LG cs.CV stat.ML

    Sparse MoEs meet Efficient Ensembles

    Authors: James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton

    Abstract: Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models. We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixture of experts (sparse MoEs). First, we show that the two approaches have complementary features whose combinatio…

    Submitted 9 July, 2023; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: 59 pages, 26 figures, 36 tables. Accepted at TMLR

  21. arXiv:2110.02609  [pdf, other]

    stat.ML cs.LG

    Deep Classifiers with Label Noise Modeling and Distance Awareness

    Authors: Vincent Fortuin, Mark Collier, Florian Wenzel, James Allingham, Jeremiah Liu, Dustin Tran, Balaji Lakshminarayanan, Jesse Berent, Rodolphe Jenatton, Effrosyni Kokiopoulou

    Abstract: Uncertainty estimation in deep learning has recently emerged as a crucial area of interest to advance reliability and robustness in safety-critical applications. While there have been many proposed methods that either focus on distance-aware model uncertainties for out-of-distribution detection or on input-dependent label uncertainties for in-distribution calibration, both of these types of uncert…

    Submitted 8 August, 2022; v1 submitted 6 October, 2021; originally announced October 2021.

    Comments: Published in TMLR

  22. arXiv:2107.10731  [pdf, other]

    cs.LG stat.CO stat.ML

    Neural Variational Gradient Descent

    Authors: Lauro Langosco di Langosco, Vincent Fortuin, Heiko Strathmann

    Abstract: Particle-based approximate Bayesian inference approaches such as Stein Variational Gradient Descent (SVGD) combine the flexibility and convergence guarantees of sampling methods with the computational benefits of variational inference. In practice, SVGD relies on the choice of an appropriate kernel function, which impacts its ability to model the target distribution -- a challenging problem with o…

    Submitted 29 July, 2021; v1 submitted 22 July, 2021; originally announced July 2021.

  23. arXiv:2107.09301  [pdf, other]

    stat.ML cs.LG

    A Bayesian Approach to Invariant Deep Neural Networks

    Authors: Nikolaos Mourdoukoutas, Marco Federici, Georges Pantalos, Mark van der Wilk, Vincent Fortuin

    Abstract: We propose a novel Bayesian neural network architecture that can learn invariances from data alone by inferring a posterior distribution over different weight-sharing schemes. We show that our model outperforms other non-invariant architectures, when trained on datasets that contain specific invariances. The same holds true when no data augmentation is performed.

    Submitted 2 November, 2021; v1 submitted 20 July, 2021; originally announced July 2021.

    Comments: 8 pages, 3 figures, To be published in ICML UDL 2021

  24. arXiv:2106.11642  [pdf, other]

    cs.LG stat.ML

    Repulsive Deep Ensembles are Bayesian

    Authors: Francesco D'Angelo, Vincent Fortuin

    Abstract: Deep ensembles have recently gained popularity in the deep learning community for their conceptual simplicity and efficiency. However, maintaining functional diversity between ensemble members that are independently trained with gradient descent is challenging. This can lead to pathologies when adding more ensemble members, such as a saturation of the ensemble performance, which converges to the p…

    Submitted 28 March, 2023; v1 submitted 22 June, 2021; originally announced June 2021.

  25. arXiv:2106.10760  [pdf, other]

    cs.LG stat.ML

    On Stein Variational Neural Network Ensembles

    Authors: Francesco D'Angelo, Vincent Fortuin, Florian Wenzel

    Abstract: Ensembles of deep neural networks have achieved great success recently, but they do not offer a proper Bayesian justification. Moreover, while they allow for averaging of predictions over several hypotheses, they do not provide any guarantees for their diversity, leading to redundant solutions in function space. In contrast, particle-based inference methods, such as Stein variational gradient desc…

    Submitted 22 June, 2021; v1 submitted 20 June, 2021; originally announced June 2021.

  26. arXiv:2106.05586  [pdf, other]

    stat.ML cs.LG

    Data augmentation in Bayesian neural networks and the cold posterior effect

    Authors: Seth Nabarro, Stoil Ganev, Adrià Garriga-Alonso, Vincent Fortuin, Mark van der Wilk, Laurence Aitchison

    Abstract: Bayesian neural networks that incorporate data augmentation implicitly use a ``randomly perturbed log-likelihood [which] does not have a clean interpretation as a valid likelihood function'' (Izmailov et al. 2021). Here, we provide several approaches to developing principled Bayesian neural networks incorporating data augmentation. We introduce a ``finite orbit'' setting which allows likelihoods t…

    Submitted 9 December, 2021; v1 submitted 10 June, 2021; originally announced June 2021.

  27. BNNpriors: A library for Bayesian neural network inference with different prior distributions

    Authors: Vincent Fortuin, Adrià Garriga-Alonso, Mark van der Wilk, Laurence Aitchison

    Abstract: Bayesian neural networks have shown great promise in many applications where calibrated uncertainty estimates are crucial and can often also lead to a higher predictive performance. However, it remains challenging to choose a good prior distribution over their weights. While isotropic Gaussian priors are often chosen in practice due to their simplicity, they do not reflect our true prior beliefs w…

    Submitted 14 May, 2021; originally announced May 2021.

    Comments: Accepted for publication at Software Impacts

  28. arXiv:2105.06868  [pdf, ps, other]

    stat.ML cs.LG

    Priors in Bayesian Deep Learning: A Review

    Authors: Vincent Fortuin

    Abstract: While the choice of prior is one of the most critical parts of the Bayesian inference workflow, recent Bayesian deep learning models have often fallen back on vague priors, such as standard Gaussians. In this review, we highlight the importance of prior choices for Bayesian deep learning and present an overview of different priors that have been proposed for (deep) Gaussian processes, variational…

    Submitted 18 March, 2022; v1 submitted 14 May, 2021; originally announced May 2021.

  29. arXiv:2104.04975  [pdf, other]

    stat.ML cs.LG

    Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning

    Authors: Alexander Immer, Matthias Bauer, Vincent Fortuin, Gunnar Rätsch, Mohammad Emtiyaz Khan

    Abstract: Marginal-likelihood based model-selection, even though promising, is rarely used in deep learning due to estimation difficulties. Instead, most approaches rely on validation data, which may not be readily available. In this work, we present a scalable marginal-likelihood estimation method to select both hyperparameters and network architectures, based on the training data alone. Some hyperparamete…

    Submitted 15 June, 2021; v1 submitted 11 April, 2021; originally announced April 2021.

    Comments: ICML 2021

  30. arXiv:2102.06571  [pdf, other]

    stat.ML cs.LG

    Bayesian Neural Network Priors Revisited

    Authors: Vincent Fortuin, Adrià Garriga-Alonso, Sebastian W. Ober, Florian Wenzel, Gunnar Rätsch, Richard E. Turner, Mark van der Wilk, Laurence Aitchison

    Abstract: Isotropic Gaussian priors are the de facto standard for modern Bayesian neural network inference. However, it is unclear whether these priors accurately reflect our true beliefs about the weight distributions or give optimal performance. To find better priors, we study summary statistics of neural network weights in networks trained using stochastic gradient descent (SGD). We find that convolution…

    Submitted 16 March, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: Accepted at ICLR 2022

  31. arXiv:2102.05507  [pdf, other]

    stat.ML cs.LG

    On Disentanglement in Gaussian Process Variational Autoencoders

    Authors: Simon Bing, Vincent Fortuin, Gunnar Rätsch

    Abstract: Complex multivariate time series arise in many fields, ranging from computer vision to robotics or medicine. Often we are interested in the independent underlying factors that give rise to the high-dimensional data we are observing. While many models have been introduced to learn such disentangled representations, only few attempt to explicitly exploit the structure of sequential data. We investig…

    Submitted 10 February, 2021; originally announced February 2021.

  32. arXiv:2102.01691  [pdf, ps, other]

    stat.ML cs.LG

    Exact Langevin Dynamics with Stochastic Gradients

    Authors: Adrià Garriga-Alonso, Vincent Fortuin

    Abstract: Stochastic gradient Markov Chain Monte Carlo algorithms are popular samplers for approximate inference, but they are generally biased. We show that many recent versions of these methods (e.g. Chen et al. (2014)) cannot be corrected using Metropolis-Hastings rejection sampling, because their acceptance probability is always zero. We can fix this by employing a sampler with realizable backwards traj…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: 13 pages, 2 figures. Accepted to the 3rd Symposium on Advances in Approximate Bayesian Inference (AABI 2021)

  33. arXiv:2101.09815  [pdf, other]

    cs.LG

    Annealed Stein Variational Gradient Descent

    Authors: Francesco D'Angelo, Vincent Fortuin

    Abstract: Particle based optimization algorithms have recently been developed as sampling methods that iteratively update a set of particles to approximate a target distribution. In particular Stein variational gradient descent has gained attention in the approximate inference literature for its flexibility and accuracy. We empirically explore the ability of this method to sample from multi-modal distributi…

    Submitted 18 March, 2021; v1 submitted 24 January, 2021; originally announced January 2021.

  34. arXiv:2011.07255  [pdf, other]

    stat.ML cs.CV cs.LG

    Factorized Gaussian Process Variational Autoencoders

    Authors: Metod Jazbec, Michael Pearce, Vincent Fortuin

    Abstract: Variational autoencoders often assume isotropic Gaussian priors and mean-field posteriors, hence do not exploit structure in scenarios where we may expect similarity or consistency across latent variables. Gaussian process variational autoencoders alleviate this problem through the use of a latent Gaussian process, but lead to a cubic inference time complexity. We propose a more scalable extension…

    Submitted 14 November, 2020; originally announced November 2020.

  35. arXiv:2010.13472  [pdf, other]

    stat.ML cs.LG

    Scalable Gaussian Process Variational Autoencoders

    Authors: Metod Jazbec, Matthew Ashman, Vincent Fortuin, Michael Pearce, Stephan Mandt, Gunnar Rätsch

    Abstract: Conventional variational autoencoders fail in modeling correlations between data points due to their use of factorized priors. Amortized Gaussian process inference through GP-VAEs has led to significant improvements in this regard, but is still inhibited by the intrinsic complexity of exact GP inference. We improve the scalability of these methods through principled sparse inference approaches. We…

    Submitted 24 February, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

    Comments: Published at AISTATS 2021

  36. arXiv:2010.10177  [pdf, other]

    stat.ML cs.LG cs.NE

    Sparse Gaussian Process Variational Autoencoders

    Authors: Matthew Ashman, Jonathan So, Will Tebbutt, Vincent Fortuin, Michael Pearce, Richard E. Turner

    Abstract: Large, multi-dimensional spatio-temporal datasets are omnipresent in modern science and engineering. An effective framework for handling such data are Gaussian process deep generative models (GP-DGMs), which employ GP priors over the latent variables of DGMs. Existing approaches for performing inference in GP-DGMs do not support sparse GP approximations based on inducing points, which are essentia…

    Submitted 23 October, 2020; v1 submitted 20 October, 2020; originally announced October 2020.

    Comments: 19 pages, 6 figures

  37. arXiv:2002.05551  [pdf, other]

    stat.ML cs.LG

    PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees

    Authors: Jonas Rothfuss, Vincent Fortuin, Martin Josifoski, Andreas Krause

    Abstract: Meta-learning can successfully acquire useful inductive biases from data. Yet, its generalization properties to unseen learning tasks are poorly understood. Particularly if the number of meta-training tasks is small, this raises concerns about overfitting. We provide a theoretical analysis using the PAC-Bayesian framework and derive novel generalization bounds for meta-learning. Using these bounds…

    Submitted 18 June, 2021; v1 submitted 13 February, 2020; originally announced February 2020.

    Comments: International Conference on Machine Learning (ICML) 2021

    MSC Class: 68Q32

  38. arXiv:1910.07763  [pdf, other]

    cs.LG stat.ML

    Mixture-of-Experts Variational Autoencoder for Clustering and Generating from Similarity-Based Representations on Single Cell Data

    Authors: Andreas Kopf, Vincent Fortuin, Vignesh Ram Somnath, Manfred Claassen

    Abstract: Clustering high-dimensional data, such as images or biological measurements, is a long-standing problem and has been studied extensively. Recently, Deep Clustering has gained popularity due to its flexibility in fitting the specific peculiarities of complex data. Here we introduce the Mixture-of-Experts Similarity Variational Autoencoder (MoE-Sim-VAE), a novel generative clustering model. The model c…

    Submitted 18 December, 2020; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: Submitted to PLOS Computational Biology

  39. arXiv:1910.01590  [pdf, other]

    cs.LG stat.ML

    DPSOM: Deep Probabilistic Clustering with Self-Organizing Maps

    Authors: Laura Manduchi, Matthias Hüser, Julia Vogt, Gunnar Rätsch, Vincent Fortuin

    Abstract: Generating interpretable visualizations from complex data is a common problem in many applications. Two key ingredients for tackling this issue are clustering and representation learning. However, current methods do not yet successfully combine the strengths of these two approaches. Existing representation learning models which rely on latent topological structure such as self-organising maps, exh…

    Submitted 9 June, 2020; v1 submitted 3 October, 2019; originally announced October 2019.

  40. arXiv:1909.13146  [pdf, other]

    q-bio.GN cs.LG stat.ML

    META$^\mathbf{2}$: Memory-efficient taxonomic classification and abundance estimation for metagenomics with deep learning

    Authors: Andreas Georgiou, Vincent Fortuin, Harun Mustafa, Gunnar Rätsch

    Abstract: Metagenomic studies have increasingly utilized sequencing technologies in order to analyze DNA fragments found in environmental samples. One important step in this analysis is the taxonomic classification of the DNA fragments. Conventional read classification methods require large databases and vast amounts of memory to run, with recent deep learning methods suffering from very large model sizes. W…

    Submitted 10 February, 2020; v1 submitted 28 September, 2019; originally announced September 2019.

  41. MGP-AttTCN: An Interpretable Machine Learning Model for the Prediction of Sepsis

    Authors: Margherita Rosnati, Vincent Fortuin

    Abstract: With a mortality rate of 5.4 million lives worldwide every year and a healthcare cost of more than 16 billion dollars in the USA alone, sepsis is one of the leading causes of hospital mortality and an increasing concern in the ageing western world. Recently, medical and technological advances have helped re-define the illness criteria of this disease, which is otherwise poorly understood by the me…

    Submitted 11 May, 2021; v1 submitted 27 September, 2019; originally announced September 2019.

    Comments: Published at PLOS ONE

  42. arXiv:1907.04155  [pdf, other]

    stat.ML cs.LG

    GP-VAE: Deep Probabilistic Time Series Imputation

    Authors: Vincent Fortuin, Dmitry Baranchuk, Gunnar Rätsch, Stephan Mandt

    Abstract: Multivariate time series with missing values are common in areas such as healthcare and finance, and have grown in number and complexity over the years. This raises the question whether deep learning methodologies can outperform classical data imputation methods in this domain. However, naive applications of deep learning fall short in giving reliable confidence estimates and lack interpretability…

    Submitted 20 February, 2020; v1 submitted 9 July, 2019; originally announced July 2019.

    Comments: Accepted for publication at the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020)

  43. arXiv:1901.08098  [pdf, other]

    stat.ML cs.LG

    Meta-Learning Mean Functions for Gaussian Processes

    Authors: Vincent Fortuin, Heiko Strathmann, Gunnar Rätsch

    Abstract: When fitting Bayesian machine learning models on scarce data, the main challenge is to obtain suitable prior knowledge and encode it into the model. Recent advances in meta-learning offer powerful methods for extracting such prior knowledge from data acquired in related tasks. When it comes to meta-learning in Gaussian process models, approaches in this setting have mostly focused on learning the…

    Submitted 14 February, 2020; v1 submitted 23 January, 2019; originally announced January 2019.

  44. arXiv:1810.10368  [pdf, other]

    stat.ML cs.AI cs.LG

    Scalable Gaussian Processes on Discrete Domains

    Authors: Vincent Fortuin, Gideon Dresdner, Heiko Strathmann, Gunnar Rätsch

    Abstract: Kernel methods on discrete domains have shown great promise for many challenging data types, for instance, biological sequence data and molecular structure data. Scalable kernel methods like Support Vector Machines may offer good predictive performances but do not intrinsically provide uncertainty estimates. In contrast, probabilistic kernel methods like Gaussian Processes offer uncertainty estima…

    Submitted 26 May, 2021; v1 submitted 24 October, 2018; originally announced October 2018.

    Comments: Published at IEEE Access

  45. arXiv:1806.02199  [pdf, other]

    cs.LG stat.ML

    SOM-VAE: Interpretable Discrete Representation Learning on Time Series

    Authors: Vincent Fortuin, Matthias Hüser, Francesco Locatello, Heiko Strathmann, Gunnar Rätsch

    Abstract: High-dimensional time series are common in many domains. Since human cognition is not optimized to work well in high-dimensional spaces, these areas could benefit from interpretable low-dimensional representations. However, most representation learning algorithms for time series data are difficult to interpret. This is due to non-intuitive mappings from data features to salient properties of the r…

    Submitted 4 January, 2019; v1 submitted 6 June, 2018; originally announced June 2018.

    Comments: Accepted for publication at the Seventh International Conference on Learning Representations (ICLR 2019)