Showing 1–16 of 16 results for author: van Merrienboer, B

  1. arXiv:2404.16436  [pdf]

    cs.SD cs.AI cs.LG eess.AS

    Leveraging tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics

    Authors: Ben Williams, Bart van Merriënboer, Vincent Dumoulin, Jenny Hamer, Eleni Triantafillou, Abram B. Fleishman, Matthew McKown, Jill E. Munger, Aaron N. Rice, Ashlee Lillis, Clemency E. White, Catherine A. D. Hobbs, Tries B. Razak, Kate E. Jones, Tom Denton

    Abstract: Machine learning has the potential to revolutionize passive acoustic monitoring (PAM) for ecological assessments. However, high annotation and compute costs limit the field's efficacy. Generalizable pretrained networks can overcome these costs, but high-quality pretraining requires vast annotated libraries, limiting its current applicability primarily to bird taxa. Here, we identify the optimum pr…

    Submitted 7 May, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: 18 pages, 5 figures

  2. arXiv:2312.07439  [pdf, other]

    cs.LG

    BIRB: A Generalization Benchmark for Information Retrieval in Bioacoustics

    Authors: Jenny Hamer, Eleni Triantafillou, Bart van Merriënboer, Stefan Kahl, Holger Klinck, Tom Denton, Vincent Dumoulin

    Abstract: The ability for a machine learning model to cope with differences in training and deployment conditions--e.g. in the presence of distribution shift or the generalization to new classes altogether--is crucial for real-world use cases. However, most empirical work in this area has focused on the image domain with artificial benchmarks constructed to measure individual aspects of generalization. We p…

    Submitted 13 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

  3. arXiv:2302.06658  [pdf, other]

    cs.LG

    In Search for a Generalizable Method for Source Free Domain Adaptation

    Authors: Malik Boudiaf, Tom Denton, Bart van Merriënboer, Vincent Dumoulin, Eleni Triantafillou

    Abstract: Source-free domain adaptation (SFDA) is compelling because it allows adapting an off-the-shelf model to a new domain using only unlabelled data. In this work, we apply existing SFDA techniques to a challenging set of naturally-occurring distribution shifts in bioacoustics, which are very different from the ones commonly studied in computer vision. We find existing methods perform differently relat…

    Submitted 24 June, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: ICML 2023

  4. arXiv:2201.05125  [pdf, other]

    cs.LG cs.CV

    GradMax: Growing Neural Networks using Gradient Information

    Authors: Utku Evci, Bart van Merriënboer, Thomas Unterthiner, Max Vladymyrov, Fabian Pedregosa

    Abstract: The architecture and the parameters of neural networks are often optimized independently, which requires costly retraining of the parameters whenever the architecture is modified. In this work we instead focus on growing the architecture without requiring costly retraining. We present a method that adds new neurons during training without impacting what is already learned, while improving the trai…

    Submitted 7 June, 2022; v1 submitted 13 January, 2022; originally announced January 2022.

    Comments: ICLR 2022

    Journal ref: International Conference on Learning Representations, 2022

  5. arXiv:1906.11786  [pdf, other]

    stat.ML cs.LG

    Fast Training of Sparse Graph Neural Networks on Dense Hardware

    Authors: Matej Balog, Bart van Merriënboer, Subhodeep Moitra, Yujia Li, Daniel Tarlow

    Abstract: Graph neural networks have become increasingly popular in recent years due to their ability to naturally encode relational input data and their ability to scale to large graphs by operating on a sparse representation of graph adjacency matrices. As we look to scale up these models using custom hardware, a natural assumption would be that we need hardware tailored to sparse operations and/or dynami…

    Submitted 27 June, 2019; originally announced June 2019.

  6. arXiv:1906.07774  [pdf, other]

    cs.LG stat.ML

    On the interplay between noise and curvature and its effect on optimization and generalization

    Authors: Valentin Thomas, Fabian Pedregosa, Bart van Merriënboer, Pierre-Antoine Manzagol, Yoshua Bengio, Nicolas Le Roux

    Abstract: The speed at which one can minimize an expected loss using stochastic methods depends on two properties: the curvature of the loss and the variance of the gradients. While most previous works focus on one or the other of these properties, we explore how their interaction affects optimization speed. Further, as the ultimate goal is good generalization performance, we clarify how both curvature and…

    Submitted 6 April, 2020; v1 submitted 18 June, 2019; originally announced June 2019.

    Comments: Accepted to AISTATS 2020

  7. arXiv:1810.11530  [pdf, other]

    cs.LG cs.PL stat.ML

    Automatic differentiation in ML: Where we are and where we should be going

    Authors: Bart van Merriënboer, Olivier Breuleux, Arnaud Bergeron, Pascal Lamblin

    Abstract: We review the current state of automatic differentiation (AD) for array programming in machine learning (ML), including the different approaches such as operator overloading (OO) and source transformation (ST) used for AD, graph-based intermediate representations for programs, and source languages. Based on these insights, we introduce a new graph-based intermediate representation (IR) which speci…

    Submitted 2 January, 2019; v1 submitted 26 October, 2018; originally announced October 2018.

  8. arXiv:1809.09569  [pdf, other]

    cs.LG cs.SE stat.ML

    Tangent: Automatic differentiation using source-code transformation for dynamically typed array programming

    Authors: Bart van Merriënboer, Dan Moldovan, Alexander B Wiltschko

    Abstract: The need to efficiently calculate first- and higher-order derivatives of increasingly complex models expressed in Python has stressed or exceeded the capabilities of available tools. In this work, we explore techniques from the field of automatic differentiation (AD) that can give researchers expressive power, performance and strong usability. These include source-code transformation (SCT), flexib…

    Submitted 26 September, 2018; v1 submitted 25 September, 2018; originally announced September 2018.

  9. arXiv:1711.02712  [pdf, other]

    cs.MS stat.ML

    Tangent: Automatic Differentiation Using Source Code Transformation in Python

    Authors: Bart van Merriënboer, Alexander B. Wiltschko, Dan Moldovan

    Abstract: Automatic differentiation (AD) is an essential primitive for machine learning programming systems. Tangent is a new library that performs AD using source code transformation (SCT) in Python. It takes numeric functions written in a syntactic subset of Python and NumPy as input, and generates new Python functions which calculate a derivative. This approach to automatic differentiation is different f…

    Submitted 7 November, 2017; originally announced November 2017.

  10. arXiv:1707.00762  [pdf, ps, other]

    stat.ML cs.LG

    Multiscale sequence modeling with a learned dictionary

    Authors: Bart van Merriënboer, Amartya Sanyal, Hugo Larochelle, Yoshua Bengio

    Abstract: We propose a generalization of neural network sequence models. Instead of predicting one symbol at a time, our multi-scale model makes predictions over multiple, potentially overlapping multi-symbol tokens. A variation of the byte-pair encoding (BPE) compression algorithm is used to learn the dictionary of tokens that the model is trained with. When applied to language modelling, our model has the…

    Submitted 5 July, 2017; v1 submitted 3 July, 2017; originally announced July 2017.

  11. arXiv:1605.02688  [pdf, other]

    cs.SC cs.LG cs.MS

    Theano: A Python framework for fast computation of mathematical expressions

    Authors: The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, et al. (88 additional authors not shown)

    Abstract: Theano is a Python library that allows to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements. Theano is being actively and continuously developed since 2008, mu…

    Submitted 9 May, 2016; originally announced May 2016.

    Comments: 19 pages, 5 figures

  12. arXiv:1506.00619  [pdf, ps, other]

    cs.LG cs.NE stat.ML

    Blocks and Fuel: Frameworks for deep learning

    Authors: Bart van Merriënboer, Dzmitry Bahdanau, Vincent Dumoulin, Dmitriy Serdyuk, David Warde-Farley, Jan Chorowski, Yoshua Bengio

    Abstract: We introduce two Python frameworks to train neural networks on large datasets: Blocks and Fuel. Blocks is based on Theano, a linear algebra compiler with CUDA-support. It facilitates the training of complex neural network models by providing parametrized Theano operations, attaching metadata to Theano's symbolic computational graph, and providing an extensive set of utilities to assist training th…

    Submitted 1 June, 2015; originally announced June 2015.

  13. arXiv:1502.05698  [pdf, ps, other]

    cs.AI cs.CL stat.ML

    Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks

    Authors: Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin, Tomas Mikolov

    Abstract: One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent. To measure progress towards that goal, we argue for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering. Our tasks measure understanding in several ways: whether a system is a…

    Submitted 31 December, 2015; v1 submitted 19 February, 2015; originally announced February 2015.

  14. arXiv:1409.1259  [pdf, other]

    cs.CL stat.ML

    On the Properties of Neural Machine Translation: Encoder-Decoder Approaches

    Authors: Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, Yoshua Bengio

    Abstract: Neural machine translation is a relatively new approach to statistical machine translation based purely on neural networks. The neural machine translation models often consist of an encoder and a decoder. The encoder extracts a fixed-length representation from a variable-length input sentence, and the decoder generates a correct translation from this representation. In this paper, we focus on anal…

    Submitted 7 October, 2014; v1 submitted 3 September, 2014; originally announced September 2014.

    Comments: Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8)

  15. arXiv:1409.1257  [pdf, other]

    cs.CL cs.LG cs.NE stat.ML

    Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation

    Authors: Jean Pouget-Abadie, Dzmitry Bahdanau, Bart van Merrienboer, Kyunghyun Cho, Yoshua Bengio

    Abstract: The authors of (Cho et al., 2014a) have shown that the recently introduced neural network translation systems suffer from a significant drop in translation quality when translating long sentences, unlike existing phrase-based translation systems. In this paper, we propose a way to address this issue by automatically segmenting an input sentence into phrases that can be easily translated by the neu…

    Submitted 7 October, 2014; v1 submitted 3 September, 2014; originally announced September 2014.

    Comments: Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation (SSST-8)

  16. arXiv:1406.1078  [pdf, other]

    cs.CL cs.LG cs.NE stat.ML

    Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation

    Authors: Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio

    Abstract: In this paper, we propose a novel neural network model called RNN Encoder-Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixed-length vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of…

    Submitted 2 September, 2014; v1 submitted 3 June, 2014; originally announced June 2014.

    Comments: EMNLP 2014