Showing 1–9 of 9 results for author: Lavoie, S

Searching in archive cs.
  1. arXiv:2405.17247  [pdf, other]

    cs.LG

    An Introduction to Vision-Language Modeling

    Authors: Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Megan Richards, Samuel Lavoie, et al. (16 additional authors not shown)

    Abstract: Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, the vision-language model (VLM) applications will significantly impact our relationship with technol…

    Submitted 27 May, 2024; originally announced May 2024.

  2. arXiv:2405.00740  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Modeling Caption Diversity in Contrastive Vision-Language Pretraining

    Authors: Samuel Lavoie, Polina Kirichenko, Mark Ibrahim, Mahmoud Assran, Andrew Gordon Wilson, Aaron Courville, Nicolas Ballas

    Abstract: There are a thousand ways to caption an image. Contrastive Language-Image Pretraining (CLIP), on the other hand, works by mapping an image and its caption to a single vector, limiting how well CLIP-like models can represent the diverse ways to describe an image. In this work, we introduce Llip, Latent Language Image Pretraining, which models the diversity of captions that could match an image. Llip's v…

    Submitted 14 May, 2024; v1 submitted 29 April, 2024; originally announced May 2024.

    Comments: 14 pages, 8 figures, 7 tables, to be published at ICML 2024

  3. arXiv:2404.15721  [pdf, other]

    cs.CV cs.AI

    SPARO: Selective Attention for Robust and Compositional Transformer Encodings for Vision

    Authors: Ankit Vani, Bac Nguyen, Samuel Lavoie, Ranjay Krishna, Aaron Courville

    Abstract: Selective attention helps us focus on task-relevant aspects in the constant flood of our sensory input. This constraint in our perception allows us to robustly generalize under distractions and to new compositions of perceivable concepts. Transformers employ a similar notion of attention in their architecture, but representation learning models with transformer backbones like CLIP and DINO often f…

    Submitted 24 April, 2024; originally announced April 2024.

  4. arXiv:2312.07551  [pdf, other]

    cs.CL

    Language Model Alignment with Elastic Reset

    Authors: Michael Noukhovitch, Samuel Lavoie, Florian Strub, Aaron Courville

    Abstract: Finetuning language models with reinforcement learning (RL), e.g. from human feedback (HF), is a prominent method for alignment. But optimizing against a reward model can improve on reward while degrading performance in other areas, a phenomenon known as reward hacking, alignment tax, or language drift. First, we argue that commonly-used test metrics are insufficient and instead measure how differ…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: Published at NeurIPS 2023

  5. arXiv:2310.18777  [pdf, other]

    cs.LG cs.AI

    Improving Compositional Generalization Using Iterated Learning and Simplicial Embeddings

    Authors: Yi Ren, Samuel Lavoie, Mikhail Galkin, Danica J. Sutherland, Aaron Courville

    Abstract: Compositional generalization, the ability of an agent to generalize to unseen combinations of latent factors, is easy for humans but hard for deep neural networks. A line of research in cognitive science has hypothesized a process, "iterated learning," to help explain how human language developed this ability; the theory rests on simultaneous pressures towards compressibility (when an ignorant a…

    Submitted 28 October, 2023; originally announced October 2023.

  6. arXiv:2304.05369  [pdf, other]

    cs.LG

    A surprisingly simple technique to control the pretraining bias for better transfer: Expand or Narrow your representation

    Authors: Florian Bordes, Samuel Lavoie, Randall Balestriero, Nicolas Ballas, Pascal Vincent

    Abstract: Self-Supervised Learning (SSL) models rely on a pretext task to learn representations. Because this pretext task differs from the downstream tasks used to evaluate the performance of these models, there is an inherent misalignment or pretraining bias. A commonly used trick in SSL, shown to make deep networks more robust to such bias, is the addition of a small projector (usually a 2 or 3 layer mul…

    Submitted 11 April, 2023; originally announced April 2023.

  7. arXiv:2204.00616  [pdf, other]

    cs.LG cs.CV

    Simplicial Embeddings in Self-Supervised Learning and Downstream Classification

    Authors: Samuel Lavoie, Christos Tsirigotis, Max Schwarzer, Ankit Vani, Michael Noukhovitch, Kenji Kawaguchi, Aaron Courville

    Abstract: Simplicial Embeddings (SEM) are representations learned through self-supervised learning (SSL), wherein a representation is projected into $L$ simplices of $V$ dimensions each using a softmax operation. This procedure conditions the representation onto a constrained space during pretraining and imparts an inductive bias for group sparsity. For downstream classification, we formally prove that the…

    Submitted 30 September, 2022; v1 submitted 1 April, 2022; originally announced April 2022.

    Comments: 30 pages, 8 figures, Preprint

  8. arXiv:2010.01262  [pdf, other]

    cs.LG stat.ML

    Integrating Categorical Semantics into Unsupervised Domain Translation

    Authors: Samuel Lavoie, Faruk Ahmed, Aaron Courville

    Abstract: While unsupervised domain translation (UDT) has seen a lot of success recently, we argue that mediating its translation via categorical semantic features could broaden its applicability. In particular, we demonstrate that categorical semantics improves the translation between perceptually different domains sharing multiple object categories. We propose a method to learn, in an unsupervised manner,…

    Submitted 16 March, 2021; v1 submitted 2 October, 2020; originally announced October 2020.

    Comments: 22 pages. In submission to the International Conference on Learning Representations (ICLR) 2021

  9. arXiv:1901.03984  [pdf, other]

    cs.SE cs.DC

    Serverless architecture efficiency: an exploratory study

    Authors: Samuel Lavoie, Anthony Garant, Fabio Petrillo

    Abstract: Cloud service providers offer services to incentivize customers to use their platforms. Different services can achieve the same result at different costs. In this paper, we study the efficiency of a serverless architecture for running highly parallelizable tasks, comparing these services in order to find the most efficient in terms of performance and cost. More precisely, we look at the compute tim…

    Submitted 13 January, 2019; originally announced January 2019.