
Showing 1–50 of 51 results for author: Tomczak, J M

  1. arXiv:2407.11001  [pdf, other]

    cs.CL cs.LG

    Generative AI Systems: A Systems-based Perspective on Generative AI

    Authors: Jakub M. Tomczak

    Abstract: Large Language Models (LLMs) have revolutionized AI systems by enabling communication with machines using natural language. Recent developments in Generative AI (GenAI) like Vision-Language Models (GPT-4V) and Gemini have shown great promise in using LLMs as multimodal systems. This new research line results in building Generative AI systems, GenAISys for short, that are capable of multimodal proc…

    Submitted 25 June, 2024; originally announced July 2024.

  2. arXiv:2404.06549  [pdf, other]

    cs.LG stat.ML

    Variational Stochastic Gradient Descent for Deep Neural Networks

    Authors: Haotian Chen, Anna Kuzina, Babak Esmaeili, Jakub M Tomczak

    Abstract: Optimizing deep neural networks is one of the main tasks in successful deep learning. Current state-of-the-art optimizers are adaptive gradient-based optimization methods such as Adam. Recently, there has been an increasing interest in formulating gradient-based optimizers in a probabilistic framework for better estimation of gradients and modeling uncertainties. Here, we propose to combine both a…

    Submitted 9 April, 2024; originally announced April 2024.

  3. arXiv:2311.02455  [pdf, other]

    cs.LG q-bio.GN q-bio.QM stat.AP

    Mixed Models with Multiple Instance Learning

    Authors: Jan P. Engelmann, Alessandro Palma, Jakub M. Tomczak, Fabian J. Theis, Francesco Paolo Casale

    Abstract: Predicting patient features from single-cell data can help identify cellular states implicated in health and disease. Linear models and average cell type expressions are typically favored for this task for their efficiency and robustness, but they overlook the rich cell heterogeneity inherent in single-cell data. To address this gap, we introduce MixMIL, a framework integrating Generalized Linear…

    Submitted 8 March, 2024; v1 submitted 4 November, 2023; originally announced November 2023.

    Comments: AISTATS 2024 Oral, Code: https://github.com/AIH-SGML/MixMIL

  4. arXiv:2310.02066  [pdf, other]

    cs.LG cs.AI

    De Novo Drug Design with Joint Transformers

    Authors: Adam Izdebski, Ewelina Weglarz-Tomczak, Ewa Szczurek, Jakub M. Tomczak

    Abstract: De novo drug design requires simultaneously generating novel molecules outside of training data and predicting their target properties, making it a hard task for generative models. To address this, we propose Joint Transformer that combines a Transformer decoder, Transformer encoder, and a predictor in a joint generative model with shared weights. We formulate a probabilistic black-box optimizatio…

    Submitted 4 December, 2023; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023 Generative AI and Biology Workshop

  5. arXiv:2303.15342  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Exploring Continual Learning of Diffusion Models

    Authors: Michał Zając, Kamil Deja, Anna Kuzina, Jakub M. Tomczak, Tomasz Trzciński, Florian Shkurti, Piotr Miłoś

    Abstract: Diffusion models have achieved remarkable success in generating high-quality images thanks to their novel training procedures applied to unprecedented amounts of data. However, training a diffusion model from scratch is computationally expensive. This highlights the need to investigate the possibility of training these models iteratively, reusing computation while the data distribution changes. In…

    Submitted 27 March, 2023; originally announced March 2023.

  6. arXiv:2302.09976  [pdf, other]

    cs.LG cs.CV

    Discouraging posterior collapse in hierarchical Variational Autoencoders using context

    Authors: Anna Kuzina, Jakub M. Tomczak

    Abstract: Hierarchical Variational Autoencoders (VAEs) are among the most popular likelihood-based generative models. There is a consensus that the top-down hierarchical VAEs allow effective learning of deep latent structures and avoid problems like posterior collapse. Here, we show that this is not necessarily the case, and the problem of collapsing posteriors remains. To discourage this issue, we propose…

    Submitted 28 September, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: Code: https://github.com/AKuzina/dct_vae

  7. arXiv:2301.13622  [pdf, other]

    cs.LG cs.CV stat.ML

    Learning Data Representations with Joint Diffusion Models

    Authors: Kamil Deja, Tomasz Trzcinski, Jakub M. Tomczak

    Abstract: Joint machine learning models that allow synthesizing and classifying data often offer uneven performance between those tasks or are unstable to train. In this work, we depart from a set of empirical observations that indicate the usefulness of internal representations built by contemporary deep diffusion-based generative models not only for generating but also predicting. We then propose to exten…

    Submitted 5 April, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: Code: https://github.com/KamilDeja/joint_diffusion

  8. arXiv:2301.10540  [pdf, other]

    cs.CV

    Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN

    Authors: David M. Knigge, David W. Romero, Albert Gu, Efstratios Gavves, Erik J. Bekkers, Jakub M. Tomczak, Mark Hoogendoorn, Jan-Jakob Sonke

    Abstract: Performant Convolutional Neural Network (CNN) architectures must be tailored to specific tasks in order to consider the length, resolution, and dimensionality of the input data. In this work, we tackle the need for problem-specific CNN architectures. We present the Continuous Convolutional Neural Network (CCNN): a single CNN able to process data of arbitrary resolution, dimensionality and length w…

    Submitted 16 April, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

  9. arXiv:2212.12393  [pdf, other]

    cs.LG cs.AI cs.LO stat.ML

    A-NeSI: A Scalable Approximate Method for Probabilistic Neurosymbolic Inference

    Authors: Emile van Krieken, Thiviyan Thanapalasingam, Jakub M. Tomczak, Frank van Harmelen, Annette ten Teije

    Abstract: We study the problem of combining neural networks with symbolic reasoning. Recently introduced frameworks for Probabilistic Neurosymbolic Learning (PNL), such as DeepProbLog, perform exponential-time exact inference, limiting the scalability of PNL solutions. We introduce Approximate Neurosymbolic Inference (A-NeSI): a new framework for PNL that uses neural networks for scalable approximate infere…

    Submitted 22 September, 2023; v1 submitted 23 December, 2022; originally announced December 2022.

    Comments: Accepted to NeurIPS 2023. 13 pages, 11 appendix pages, 7 figures

  10. arXiv:2206.03398  [pdf, other]

    cs.LG cs.CV

    Towards a General Purpose CNN for Long Range Dependencies in $N$D

    Authors: David W. Romero, David M. Knigge, Albert Gu, Erik J. Bekkers, Efstratios Gavves, Jakub M. Tomczak, Mark Hoogendoorn

    Abstract: The use of Convolutional Neural Networks (CNNs) is widespread in Deep Learning due to a range of desirable model properties which result in an efficient and effective machine learning framework. However, performant CNN architectures must be tailored to specific tasks in order to incorporate considerations such as the input length, resolution, and dimensionality. In this work, we overcome the need…

    Submitted 5 July, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: First two authors contributed equally to this work

  11. arXiv:2206.00070  [pdf, other]

    cs.LG

    On Analyzing Generative and Denoising Capabilities of Diffusion-based Deep Generative Models

    Authors: Kamil Deja, Anna Kuzina, Tomasz Trzciński, Jakub M. Tomczak

    Abstract: Diffusion-based Deep Generative Models (DDGMs) offer state-of-the-art performance in generative modeling. Their main strength comes from their unique setup in which a model (the backward diffusion process) is trained to reverse the forward diffusion process, which gradually adds noise to the input signal. Although DDGMs are well studied, it is still unclear how the small amount of noise is transfo…

    Submitted 31 May, 2022; originally announced June 2022.

  12. arXiv:2203.09940  [pdf, other]

    cs.LG

    Alleviating Adversarial Attacks on Variational Autoencoders with MCMC

    Authors: Anna Kuzina, Max Welling, Jakub M. Tomczak

    Abstract: Variational autoencoders (VAEs) are latent variable models that can generate complex objects and provide meaningful latent representations. Moreover, they could be further used in downstream tasks such as classification. As previous work has shown, one can easily fool VAEs to produce unexpected latent representations and reconstructions for a visually slightly modified input. Here, we examine seve…

    Submitted 12 October, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

  13. arXiv:2111.09851  [pdf, other]

    cs.RO cs.AI

    The Effects of Learning in Morphologically Evolving Robot Systems

    Authors: Jie Luo, Aart Stuurman, Jakub M. Tomczak, Jacintha Ellers, Agoston E. Eiben

    Abstract: Simultaneously evolving morphologies (bodies) and controllers (brains) of robots can cause a mismatch between the inherited body and brain in the offspring. To mitigate this problem, the addition of an infant learning period by the so-called Triangle of Life framework was proposed relatively long ago. However, an empirical assessment is still lacking to date. In this paper we investigate the…

    Submitted 18 November, 2021; originally announced November 2021.

    Comments: Frontiers in Robotics and AI. arXiv admin note: text overlap with arXiv:2107.08249

  14. arXiv:2110.08059  [pdf, other]

    cs.CV cs.LG

    FlexConv: Continuous Kernel Convolutions with Differentiable Kernel Sizes

    Authors: David W. Romero, Robert-Jan Bruintjes, Jakub M. Tomczak, Erik J. Bekkers, Mark Hoogendoorn, Jan C. van Gemert

    Abstract: When designing Convolutional Neural Networks (CNNs), one must select the size of the convolutional kernels before training. Recent works show CNNs benefit from different kernel sizes at different layers, but exploring all possible combinations is unfeasible in practice. A more efficient approach is to learn the kernel size during training. However, existing works that learn the kernel size h…

    Submitted 17 March, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: First two authors contributed equally to this work

  15. arXiv:2109.11045  [pdf, other]

    cs.NE cs.LG

    Training Deep Spiking Auto-encoders without Bursting or Dying Neurons through Regularization

    Authors: Justus F. Hübotter, Pablo Lanillos, Jakub M. Tomczak

    Abstract: Spiking neural networks are a promising approach towards next-generation models of the brain in computational neuroscience. Moreover, compared to classic artificial neural networks, they could serve as an energy-efficient deployment of AI by enabling fast computation in specialized neuromorphic hardware. However, training deep spiking neural networks, especially in an unsupervised manner, is chall…

    Submitted 22 September, 2021; originally announced September 2021.

    Comments: Under review

  16. arXiv:2104.00428  [pdf, other]

    stat.ML cs.AI cs.LG

    Storchastic: A Framework for General Stochastic Automatic Differentiation

    Authors: Emile van Krieken, Jakub M. Tomczak, Annette ten Teije

    Abstract: Modelers use automatic differentiation (AD) of computation graphs to implement complex Deep Learning models without defining gradient computations. Stochastic AD extends AD to stochastic computation graphs with sampling steps, which arise when modelers handle the intractable expectations common in Reinforcement Learning and Variational Inference. However, current methods for stochastic AD are limi…

    Submitted 26 October, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: 30 pages, 2 figures, 1 table, accepted in NeurIPS 2021

  17. arXiv:2103.06701  [pdf, other]

    cs.CR cs.LG stat.ML

    Diagnosing Vulnerability of Variational Auto-Encoders to Adversarial Attacks

    Authors: Anna Kuzina, Max Welling, Jakub M. Tomczak

    Abstract: In this work, we explore adversarial attacks on Variational Autoencoders (VAEs). We show how to modify a data point to obtain a prescribed latent code (supervised attack) or just get a drastically different code (unsupervised attack). We examine the influence of model modifications ($β$-VAE, NVAE) on the robustness of VAEs and suggest metrics to quantify it.

    Submitted 6 May, 2021; v1 submitted 10 March, 2021; originally announced March 2021.

  18. arXiv:2102.02694  [pdf, other]

    stat.ML cs.LG

    Invertible DenseNets with Concatenated LipSwish

    Authors: Yura Perugachi-Diaz, Jakub M. Tomczak, Sandjai Bhulai

    Abstract: We introduce Invertible Dense Networks (i-DenseNets), a more parameter efficient extension of Residual Flows. The method relies on an analysis of the Lipschitz continuity of the concatenation in DenseNets, where we enforce invertibility of the network by satisfying the Lipschitz constant. Furthermore, we propose a learnable weighted concatenation, which not only improves the model performance but…

    Submitted 23 October, 2021; v1 submitted 4 February, 2021; originally announced February 2021.

    Comments: Accepted at Neural Information Processing Systems (NeurIPS) 2021. This is an extension of Invertible DenseNets (arXiv:2010.02125). arXiv admin note: text overlap with arXiv:2010.02125

  19. arXiv:2102.02611  [pdf, other]

    cs.LG

    CKConv: Continuous Kernel Convolution For Sequential Data

    Authors: David W. Romero, Anna Kuzina, Erik J. Bekkers, Jakub M. Tomczak, Mark Hoogendoorn

    Abstract: Conventional neural architectures for sequential data present important limitations. Recurrent networks suffer from exploding and vanishing gradients, small effective memory horizons, and must be trained sequentially. Convolutional networks are unable to handle sequences of unknown size and their memory horizon must be defined a priori. In this work, we show that all these problems can be solved b…

    Submitted 17 March, 2022; v1 submitted 4 February, 2021; originally announced February 2021.

  20. arXiv:2011.15056  [pdf, other]

    cs.LG stat.ML

    General Invertible Transformations for Flow-based Generative Modeling

    Authors: Jakub M. Tomczak

    Abstract: In this paper, we present a new class of invertible transformations with an application to flow-based generative models. We indicate that many well-known invertible transformations in reversible logic and reversible neural networks could be derived from our proposition. Next, we propose two new coupling layers that are important building blocks of flow-based generative models. In the experiments o…

    Submitted 12 July, 2021; v1 submitted 30 November, 2020; originally announced November 2020.

    Comments: Code: https://github.com/jmtomczak/git_flow, accepted to INNF+ 2021 at ICML

  21. arXiv:2010.09790  [pdf, other]

    stat.ML cs.LG

    ABC-Di: Approximate Bayesian Computation for Discrete Data

    Authors: Ilze Amanda Auzina, Jakub M. Tomczak

    Abstract: Many real-life problems are represented as a black-box, i.e., the internal workings are inaccessible or a closed-form mathematical expression of the likelihood function cannot be defined. For continuous random variables, likelihood-free inference problems can be solved by a group of methods under the name of Approximate Bayesian Computation (ABC). However, a similar approach for discrete random var…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: Code: https://github.com/IlzeAmandaA/ABCdiscrete

  22. arXiv:2010.09531  [pdf, other]

    cs.AI cs.NE cs.RO

    Learning Locomotion Skills in Evolvable Robots

    Authors: Gongjin Lan, Maarten van Hooft, Matteo De Carlo, Jakub M. Tomczak, A. E. Eiben

    Abstract: The challenge of robotic reproduction -- making of new robots by recombining two existing ones -- has been recently cracked and physically evolving robot systems have come within reach. Here we address the next big hurdle: producing an adequate brain for a newborn robot. In particular, we address the task of targeted locomotion which is arguably a fundamental skill in any practical implementation.…

    Submitted 19 October, 2020; originally announced October 2020.

    Comments: 12 pages

  23. arXiv:2010.06456  [pdf, other]

    q-bio.BM cs.NE

    Population-based Optimization for Kinetic Parameter Identification in Glycolytic Pathway in Saccharomyces cerevisiae

    Authors: Ewelina Weglarz-Tomczak, Jakub M. Tomczak, Agoston E. Eiben, Stanley Brul

    Abstract: Models in systems biology are mathematical descriptions of biological processes that are used to answer questions and gain a better understanding of biological phenomena. Dynamic models represent the network through rates of the production and consumption for the individual species. The ordinary differential equations that describe rates of the reactions in the model include a set of parameters. T…

    Submitted 19 September, 2020; originally announced October 2020.

    Comments: Code at https://github.com/jmtomczak/popi

  24. arXiv:2010.02125  [pdf, other]

    cs.LG cs.CV stat.ML

    Invertible DenseNets

    Authors: Yura Perugachi-Diaz, Jakub M. Tomczak, Sandjai Bhulai

    Abstract: We introduce Invertible Dense Networks (i-DenseNets), a more parameter efficient alternative to Residual Flows. The method relies on an analysis of the Lipschitz continuity of the concatenation in DenseNets, where we enforce the invertibility of the network by satisfying the Lipschitz constraint. Additionally, we extend this method by proposing a learnable concatenation, which not only improves th…

    Submitted 8 January, 2021; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Accepted at 3rd Symposium on Advances in Approximate Bayesian Inference (AABI)

  25. arXiv:2010.02014  [pdf, other]

    stat.ML cs.LG

    Self-Supervised Variational Auto-Encoders

    Authors: Ioannis Gatopoulos, Jakub M. Tomczak

    Abstract: Density estimation, compression and data generation are crucial tasks in artificial intelligence. Variational Auto-Encoders (VAEs) constitute a single framework to achieve these goals. Here, we present a novel class of generative models, called self-supervised Variational Auto-Encoder (selfVAE), that utilizes deterministic and discrete variational posteriors. This class of models allows to perform…

    Submitted 6 October, 2020; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: 19 pages, 14 figures, 2 tables

  26. arXiv:2006.05259  [pdf, other]

    cs.LG stat.ML

    Wavelet Networks: Scale-Translation Equivariant Learning From Raw Time-Series

    Authors: David W. Romero, Erik J. Bekkers, Jakub M. Tomczak, Mark Hoogendoorn

    Abstract: Leveraging the symmetries inherent to specific data domains for the construction of equivariant neural networks has led to remarkable improvements in terms of data efficiency and generalization. However, most existing research focuses on symmetries arising from planar and volumetric data, leaving a crucial data source largely underexplored: time-series. In this work, we fill this gap by leveragin…

    Submitted 21 January, 2024; v1 submitted 9 June, 2020; originally announced June 2020.

  27. arXiv:2006.05218  [pdf, other]

    cs.LG cs.CV stat.ML

    Super-resolution Variational Auto-Encoders

    Authors: Ioannis Gatopoulos, Maarten Stol, Jakub M. Tomczak

    Abstract: The framework of variational autoencoders (VAEs) provides a principled method for jointly learning latent-variable models and corresponding inference models. However, the main drawback of this approach is the blurriness of the generated images. Some studies link this effect to the objective function, namely, the (negative) log-likelihood. Here, we propose to enhance VAEs by adding a random variabl…

    Submitted 30 June, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: 13 pages, 11 figures, 3 tables. Code available at: https://github.com/ioangatop/srVAE

  28. arXiv:2006.01910  [pdf, other]

    cs.LG cs.CV stat.ML

    The Convolution Exponential and Generalized Sylvester Flows

    Authors: Emiel Hoogeboom, Victor Garcia Satorras, Jakub M. Tomczak, Max Welling

    Abstract: This paper introduces a new method to build linear flows, by taking the exponential of a linear transformation. This linear transformation does not need to be invertible itself, and the exponential has the following desirable properties: it is guaranteed to be invertible, its inverse is straightforward to compute and the log Jacobian determinant is equal to the trace of the linear transformation.…

    Submitted 26 October, 2020; v1 submitted 2 June, 2020; originally announced June 2020.

    Comments: Accepted to Neural Information Processing Systems (NeurIPS) 2020

  29. arXiv:2005.04166  [pdf, other]

    cs.NE cs.AI

    Time Efficiency in Optimization with a Bayesian-Evolutionary Algorithm

    Authors: Gongjin Lan, Jakub M. Tomczak, Diederik M. Roijers, A. E. Eiben

    Abstract: Not all generate-and-test search algorithms are created equal. Bayesian Optimization (BO) invests a lot of computation time to generate the candidate solution that best balances the predicted value and the uncertainty given all previous data, taking increasingly more time as the number of evaluations performed grows. Evolutionary Algorithms (EA) on the other hand rely on search heuristics that typ…

    Submitted 4 May, 2020; originally announced May 2020.

    Comments: 13 pages, 10 Figures

  30. arXiv:2005.01856  [pdf, other]

    stat.ML cs.CV cs.LG

    Selecting Data Augmentation for Simulating Interventions

    Authors: Maximilian Ilse, Jakub M. Tomczak, Patrick Forré

    Abstract: Machine learning models trained with purely observational data and the principle of empirical risk minimization (Vapnik, 1992) can fail to generalize to unseen domains. In this paper, we focus on the case where the problem arises through spurious correlation between the observed domains and the actual task labels. We find that many domain generalization methods do not explicitly ta…

    Submitted 26 October, 2020; v1 submitted 4 May, 2020; originally announced May 2020.

  31. arXiv:2002.03830  [pdf, other]

    cs.CV cs.LG stat.ML

    Attentive Group Equivariant Convolutional Networks

    Authors: David W. Romero, Erik J. Bekkers, Jakub M. Tomczak, Mark Hoogendoorn

    Abstract: Although group convolutional networks are able to learn powerful representations based on symmetry patterns, they lack explicit means to learn meaningful relationships among them (e.g., relative positions and poses). In this paper, we present attentive group equivariant convolutions, a generalization of the group convolution, in which attention is applied during the course of convolution to accent…

    Submitted 30 June, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

    Comments: Proceedings of the 37th International Conference on Machine Learning (ICML), 2020

  32. arXiv:2002.02869  [pdf, other]

    cs.NE

    Differential Evolution with Reversible Linear Transformations

    Authors: Jakub M. Tomczak, Ewelina Weglarz-Tomczak, Agoston E. Eiben

    Abstract: Differential evolution (DE) is a well-known type of evolutionary algorithm (EA). Similarly to other EA variants, it can suffer from small populations and lose diversity too quickly. This paper presents a new approach to mitigate this issue: We propose to generate new candidate solutions by utilizing reversible linear transformation applied to a triplet of solutions from the population. In other w…

    Submitted 7 February, 2020; originally announced February 2020.

    Comments: Code: https://github.com/jmtomczak

  33. arXiv:2001.11235  [pdf, other]

    cs.LG stat.ML

    Learning Discrete Distributions by Dequantization

    Authors: Emiel Hoogeboom, Taco S. Cohen, Jakub M. Tomczak

    Abstract: Media is generally stored digitally and is therefore discrete. Many successful deep distribution models in deep learning learn a density, i.e., the distribution of a continuous random variable. Naïve optimization on discrete data leads to arbitrarily high likelihoods, and instead, it has become standard practice to add noise to datapoints. In this paper, we present a general framework for dequanti…

    Submitted 30 January, 2020; originally announced January 2020.

  34. arXiv:2001.07804  [pdf]

    cs.NE cs.AI

    Learning Directed Locomotion in Modular Robots with Evolvable Morphologies

    Authors: Gongjin Lan, Matteo De Carlo, Fuda van Diggelen, Jakub M. Tomczak, Diederik M. Roijers, A. E. Eiben

    Abstract: We generalize the well-studied problem of gait learning in modular robots in two dimensions. Firstly, we address locomotion in a given target direction that goes beyond learning a typical undirected gait. Secondly, rather than studying one fixed robot morphology we consider a test suite of different modular robots. This study is based on our interest in evolutionary robot systems where both morpho…

    Submitted 21 January, 2020; originally announced January 2020.

    Comments: 30 pages, 14 figures

  35. arXiv:1910.02912  [pdf, other]

    stat.ML cs.LG

    Increasing Expressivity of a Hyperspherical VAE

    Authors: Tim R. Davidson, Jakub M. Tomczak, Efstratios Gavves

    Abstract: Learning suitable latent representations for observed, high-dimensional data is an important research topic underlying many recent advances in machine learning. While traditionally the Gaussian normal distribution has been the go-to latent parameterization, recently a variety of works have successfully proposed the use of manifold-valued latents. In one such work (Davidson et al., 2018), the autho…

    Submitted 7 October, 2019; originally announced October 2019.

    Comments: NeurIPS 2019, in Workshop on Bayesian Deep Learning

  36. arXiv:1908.05717  [pdf, other]

    eess.IV cs.LG stat.ML

    Video Compression With Rate-Distortion Autoencoders

    Authors: Amirhossein Habibian, Ties van Rozendaal, Jakub M. Tomczak, Taco S. Cohen

    Abstract: In this paper we present a deep generative model for lossy video compression. We employ a model that consists of a 3D autoencoder with a discrete latent space and an autoregressive prior used for entropy coding. Both autoencoder and prior are trained jointly to minimize a rate-distortion loss, which is closely related to the ELBO used in variational autoencoders. Despite its simplicity, we find…

    Submitted 13 November, 2019; v1 submitted 14 August, 2019; originally announced August 2019.

    Comments: Accepted to ICCV 2019

  37. arXiv:1905.10427  [pdf, other]

    stat.ML cs.LG

    DIVA: Domain Invariant Variational Autoencoders

    Authors: Maximilian Ilse, Jakub M. Tomczak, Christos Louizos, Max Welling

    Abstract: We consider the problem of domain generalization, namely, how to learn representations given data from a set of domains that generalize to data from a previously unseen domain. We propose the Domain Invariant Variational Autoencoder (DIVA), a generative model that tackles this problem by learning three independent latent subspaces, one for the domain, one for the class, and one for any residual va…

    Submitted 7 October, 2019; v1 submitted 24 May, 2019; originally announced May 2019.

    Comments: Code available at https://github.com/AMLab-Amsterdam/DIVA

  38. arXiv:1904.11876  [pdf, other]

    stat.ML cs.LG

    Simulating Execution Time of Tensor Programs using Graph Neural Networks

    Authors: Jakub M. Tomczak, Romain Lepert, Auke Wiggers

    Abstract: Optimizing the execution time of a tensor program, e.g., a convolution, involves finding its optimal configuration. Searching the configuration space exhaustively is typically infeasible in practice. In line with recent research using TVM, we propose to learn a surrogate model to overcome this issue. The model is trained on an acyclic graph called an abstract syntax tree, and utilizes a graph convol…

    Submitted 27 November, 2019; v1 submitted 26 April, 2019; originally announced April 2019.

    Comments: All authors contributed equally. Accepted as a workshop paper at Representation Learning on Graphs and Manifolds @ ICLR 2019. Fixed values in Table 1

  39. arXiv:1902.00448  [pdf, other]

    stat.ML cs.LG

    Combinatorial Bayesian Optimization using the Graph Cartesian Product

    Authors: Changyong Oh, Jakub M. Tomczak, Efstratios Gavves, Max Welling

    Abstract: This paper focuses on Bayesian Optimization (BO) for objectives on combinatorial search spaces, including ordinal and categorical variables. Despite the abundance of potential applications of Combinatorial BO, including chipset configuration search and neural architecture search, only a handful of methods have been proposed. We introduce COMBO, a new Gaussian Process (GP) BO. COMBO quantifies "smo…

    Submitted 28 October, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: Accepted to NeurIPS 2019, code: https://github.com/QUVA-Lab/COMBO

  40. arXiv:1806.09918  [pdf, other]

    stat.ML cs.LG

    Hierarchical VampPrior Variational Fair Auto-Encoder

    Authors: Philip Botros, Jakub M. Tomczak

    Abstract: Decision making is a process that is extremely prone to different biases. In this paper we consider learning fair representations that aim at removing nuisance (sensitive) information from the decision process. For this purpose, we propose to use deep generative modeling and adapt a hierarchical Variational Auto-Encoder to learn these fair representations. Moreover, we utilize the mutual informati…

    Submitted 3 July, 2018; v1 submitted 26 June, 2018; originally announced June 2018.

    Comments: ICML Workshop on Theoretical Foundations and Applications of Deep Generative Models 2018, final version

  41. arXiv:1804.00891  [pdf, other]

    stat.ML cs.LG

    Hyperspherical Variational Auto-Encoders

    Authors: Tim R. Davidson, Luca Falorsi, Nicola De Cao, Thomas Kipf, Jakub M. Tomczak

    Abstract: The Variational Auto-Encoder (VAE) is one of the most used unsupervised machine learning models. But although the default choice of a Gaussian distribution for both the prior and posterior represents a mathematically convenient distribution often leading to competitive results, we show that this parameterization fails to model data with a latent hyperspherical structure. To address this issue we p…

    Submitted 27 September, 2022; v1 submitted 3 April, 2018; originally announced April 2018.

    Comments: Code at http://github.com/nicola-decao/s-vae-tf and https://github.com/nicola-decao/s-vae-pytorch, Blogpost: https://nicola-decao.github.io/s-vae

    Journal ref: Uncertainty in Artificial Intelligence (UAI). Proceedings of the Thirty-Fourth Conference (2018) 856-865

  42. arXiv:1803.05649  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    Sylvester Normalizing Flows for Variational Inference

    Authors: Rianne van den Berg, Leonard Hasenclever, Jakub M. Tomczak, Max Welling

    Abstract: Variational inference relies on flexible approximate posterior distributions. Normalizing flows provide a general recipe to construct flexible variational posteriors. We introduce Sylvester normalizing flows, which can be seen as a generalization of planar flows. Sylvester normalizing flows remove the well-known single-unit bottleneck from planar flows, making a single transformation much more fle…

    Submitted 20 February, 2019; v1 submitted 15 March, 2018; originally announced March 2018.

    Comments: Published at UAI 2018, 12 pages, 3 figures, code at: https://github.com/riannevdberg/sylvester-flows

  43. arXiv:1802.04712  [pdf, other]

    cs.LG stat.ML

    Attention-based Deep Multiple Instance Learning

    Authors: Maximilian Ilse, Jakub M. Tomczak, Max Welling

    Abstract: Multiple instance learning (MIL) is a variation of supervised learning where a single class label is assigned to a bag of instances. In this paper, we state the MIL problem as learning the Bernoulli distribution of the bag label where the bag label probability is fully parameterized by neural networks. Furthermore, we propose a neural network-based permutation-invariant aggregation operator that c…

    Submitted 28 June, 2018; v1 submitted 13 February, 2018; originally announced February 2018.

    Comments: ICML 2018 paper, code source: https://github.com/AMLab-Amsterdam/AttentionDeepMIL

  44. arXiv:1712.00310  [pdf, other]

    cs.LG stat.ML

    Deep Learning with Permutation-invariant Operator for Multi-instance Histopathology Classification

    Authors: Jakub M. Tomczak, Maximilian Ilse, Max Welling

    Abstract: The computer-aided analysis of medical scans is a longstanding goal in the medical imaging field. Currently, deep learning has become a dominant methodology for supporting pathologists and radiologists. Deep learning algorithms have been successfully applied to digital pathology and radiology; nevertheless, there are still practical issues that prevent these tools from being widely used in practice. The…

    Submitted 5 December, 2017; v1 submitted 1 December, 2017; originally announced December 2017.

    Comments: Workshop on "Medical Imaging meets NIPS" at NIPS 2017

  45. arXiv:1705.07120  [pdf, other]

    cs.LG cs.AI stat.ML

    VAE with a VampPrior

    Authors: Jakub M. Tomczak, Max Welling

    Abstract: Many different methods to train deep generative models have been introduced in the past. In this paper, we propose to extend the variational auto-encoder (VAE) framework with a new type of prior which we call "Variational Mixture of Posteriors" prior, or VampPrior for short. The VampPrior consists of a mixture distribution (e.g., a mixture of Gaussians) with components given by variational posteri…

    Submitted 26 February, 2018; v1 submitted 19 May, 2017; originally announced May 2017.

    Comments: 16 pages, final version, AISTATS 2018

  46. arXiv:1611.09630  [pdf, other]

    cs.LG stat.ML

    Improving Variational Auto-Encoders using Householder Flow

    Authors: Jakub M. Tomczak, Max Welling

    Abstract: Variational auto-encoders (VAE) are scalable and powerful generative models. However, the choice of the variational posterior determines tractability and flexibility of the VAE. Commonly, latent variables are modeled using the normal distribution with a diagonal covariance matrix. This results in computational efficiency but typically it is not flexible enough to match the true posterior distribut…

    Submitted 26 January, 2017; v1 submitted 29 November, 2016; originally announced November 2016.

    Comments: A corrected version of the paper submitted to Bayesian Deep Learning Workshop (NIPS 2016)

  47. Learning Deep Architectures for Interaction Prediction in Structure-based Virtual Screening

    Authors: Adam Gonczarek, Jakub M. Tomczak, Szymon Zaręba, Joanna Kaczmar, Piotr Dąbrowski, Michał J. Walczak

    Abstract: We introduce a deep learning architecture for structure-based virtual screening that generates fixed-sized fingerprints of proteins and small molecules by applying learnable atom convolution and softmax operations to each compound separately. These fingerprints are further transformed non-linearly, their inner-product is calculated and used to predict the binding potential. Moreover, we show that…

    Submitted 19 September, 2017; v1 submitted 23 October, 2016; originally announced October 2016.

    Comments: Workshop on Machine Learning in Computational Biology, 30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain. Extended version published in Computers in Biology and Medicine and available online: http://www.sciencedirect.com/science/article/pii/S0010482517302974

  48. arXiv:1505.02581  [pdf, other]

    cs.LG cs.NE

    Improving neural networks with bunches of neurons modeled by Kumaraswamy units: Preliminary study

    Authors: Jakub Mikolaj Tomczak

    Abstract: Deep neural networks have recently achieved state-of-the-art results in many machine learning problems, e.g., speech recognition or object recognition. Hitherto, work on rectified linear units (ReLU) provides empirical and theoretical evidence of a performance increase of neural networks compared to the typically used sigmoid activation function. In this paper, we investigate a new manner of improving…

    Submitted 11 May, 2015; originally announced May 2015.

    Comments: 7 pages, 4 figures

  49. arXiv:1407.4422  [pdf, other]

    cs.LG

    Subspace Restricted Boltzmann Machine

    Authors: Jakub M. Tomczak, Adam Gonczarek

    Abstract: The subspace Restricted Boltzmann Machine (subspaceRBM) is a third-order Boltzmann machine where multiplicative interactions are between one visible and two hidden units. There are two kinds of hidden units, namely, gate units and subspace units. The subspace units reflect variations of a pattern in data and the gate unit is responsible for activating the subspace units. Additionally, the gate uni…

    Submitted 16 July, 2014; originally announced July 2014.

    Comments: 7 pages

  50. arXiv:1308.6324  [pdf, other]

    cs.LG

    Prediction of breast cancer recurrence using Classification Restricted Boltzmann Machine with Dropping

    Authors: Jakub M. Tomczak

    Abstract: In this paper, we apply Classification Restricted Boltzmann Machine (ClassRBM) to the problem of predicting breast cancer recurrence. According to the Polish National Cancer Registry, in 2010 alone, breast cancer accounted for almost 25% of all diagnosed cases of cancer in Poland. We propose how to use ClassRBM for predicting breast cancer return and discovering relevant inputs (symptoms) in illness r…

    Submitted 30 October, 2013; v1 submitted 28 August, 2013; originally announced August 2013.

    Comments: technical report