Showing 1–18 of 18 results for author: Chapados, N

Searching in archive cs.
  1. arXiv:2407.06423  [pdf, other]

    cs.AI

    InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation

    Authors: Gaurav Sahu, Abhay Puri, Juan Rodriguez, Alexandre Drouin, Perouz Taslakian, Valentina Zantedeschi, Alexandre Lacoste, David Vazquez, Nicolas Chapados, Christopher Pal, Sai Rajeswar Mudumba, Issam Hadj Laradji

    Abstract: Data analytics is essential for extracting valuable insights from data that can assist organizations in making effective decisions. We introduce InsightBench, a benchmark dataset with three key features. First, it consists of 31 datasets representing diverse business use cases such as finance and incident management, each accompanied by a carefully curated set of insights planted in the datasets.…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2407.05291  [pdf, other]

    cs.AI

    WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks

    Authors: Léo Boisvert, Megh Thakkar, Maxime Gasse, Massimo Caccia, Thibault Le Sellier De Chezelles, Quentin Cappart, Nicolas Chapados, Alexandre Lacoste, Alexandre Drouin

    Abstract: The ability of large language models (LLMs) to mimic human-like intelligence has led to a surge in LLM-based autonomous agents. Though recent LLMs seem capable of planning and reasoning given user instructions, their effectiveness in applying these capabilities for autonomous task solving remains underexplored. This is especially true in enterprise settings, where automated agents hold the promise…

    Submitted 7 July, 2024; originally announced July 2024.

  3. arXiv:2406.11811  [pdf, other]

    cs.CL cs.AI

    RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content

    Authors: Joao Monteiro, Pierre-Andre Noel, Etienne Marcotte, Sai Rajeswar, Valentina Zantedeschi, David Vazquez, Nicolas Chapados, Christopher Pal, Perouz Taslakian

    Abstract: Large Language Models (LLMs) are trained on vast amounts of data, most of which is automatically scraped from the internet. This data includes encyclopedic documents that harbor a vast amount of general knowledge (e.g., Wikipedia) but also potentially overlap with benchmark datasets used for evaluating LLMs. Consequently, evaluating models on test splits that might have leaked into the training se…

    Submitted 17 June, 2024; originally announced June 2024.

  4. arXiv:2404.15420  [pdf, other]

    cs.CL cs.AI

    XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference

    Authors: João Monteiro, Étienne Marcotte, Pierre-André Noël, Valentina Zantedeschi, David Vázquez, Nicolas Chapados, Christopher Pal, Perouz Taslakian

    Abstract: In-context learning (ICL) approaches typically leverage prompting to condition decoder-only language model generation on reference information. Just-in-time processing of a context is inefficient due to the quadratic cost of self-attention operations, and caching is desirable. However, caching transformer states can easily require almost as much space as the model parameters. When the right contex…

    Submitted 23 April, 2024; originally announced April 2024.

  5. arXiv:2404.05961  [pdf, other]

    cs.CL cs.AI

    LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

    Authors: Parishad BehnamGhader, Vaibhav Adlakha, Marius Mosbach, Dzmitry Bahdanau, Nicolas Chapados, Siva Reddy

    Abstract: Large decoder-only language models (LLMs) are the state-of-the-art models on most of today's NLP tasks and benchmarks. Yet, the community is only slowly adopting these models for text embedding tasks, which require rich contextualized representations. In this work, we introduce LLM2Vec, a simple unsupervised approach that can transform any decoder-only LLM into a strong text encoder. LLM2Vec consi…

    Submitted 21 August, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted to COLM 2024

  6. arXiv:2403.07718  [pdf, other]

    cs.LG cs.AI

    WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?

    Authors: Alexandre Drouin, Maxime Gasse, Massimo Caccia, Issam H. Laradji, Manuel Del Verme, Tom Marty, Léo Boisvert, Megh Thakkar, Quentin Cappart, David Vazquez, Nicolas Chapados, Alexandre Lacoste

    Abstract: We study the use of large language model-based agents for interacting with software via web browsers. Unlike prior work, we focus on measuring the agents' ability to perform tasks that span the typical daily work of knowledge workers utilizing enterprise software systems. To this end, we propose WorkArena, a remote-hosted benchmark of 33 tasks based on the widely-used ServiceNow platform. We also…

    Submitted 23 July, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: 21 pages, 11 figures, preprint

  7. arXiv:2402.19173  [pdf, other]

    cs.SE cs.AI

    StarCoder 2 and The Stack v2: The Next Generation

    Authors: Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, et al. (41 additional authors not shown)

    Abstract: The BigCode project, an open-scientific collaboration focused on the responsible development of Large Language Models for Code (Code LLMs), introduces StarCoder2. In partnership with Software Heritage (SWH), we build The Stack v2 on top of the digital commons of their source code archive. Alongside the SWH repositories spanning 619 programming languages, we carefully select other high-quality data…

    Submitted 29 February, 2024; originally announced February 2024.

  8. arXiv:2312.13876  [pdf, other]

    cs.LG cs.CL stat.ML

    Capture the Flag: Uncovering Data Insights with Large Language Models

    Authors: Issam Laradji, Perouz Taslakian, Sai Rajeswar, Valentina Zantedeschi, Alexandre Lacoste, Nicolas Chapados, David Vazquez, Christopher Pal, Alexandre Drouin

    Abstract: The extraction of a small number of relevant insights from vast amounts of data is a crucial component of data-driven decision-making. However, accomplishing this task requires considerable technical skills, domain expertise, and human labor. This study explores the potential of using Large Language Models (LLMs) to automate the discovery of insights in data, leveraging recent advances in reasonin…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: 14 pages, 1 figure, Foundation Models for Decision Making Workshop at NeurIPS 2023

  9. arXiv:2310.08278  [pdf, other]

    cs.LG cs.AI

    Lag-Llama: Towards Foundation Models for Probabilistic Time Series Forecasting

    Authors: Kashif Rasul, Arjun Ashok, Andrew Robert Williams, Hena Ghonia, Rishika Bhagwatkar, Arian Khorasani, Mohammad Javad Darvishi Bayazi, George Adamopoulos, Roland Riachi, Nadhir Hassen, Marin Biloš, Sahil Garg, Anderson Schneider, Nicolas Chapados, Alexandre Drouin, Valentina Zantedeschi, Yuriy Nevmyvaka, Irina Rish

    Abstract: Over the past years, foundation models have caused a paradigm shift in machine learning due to their unprecedented capabilities for zero-shot and few-shot generalization. However, despite the success of foundation models in modalities such as natural language processing and computer vision, the development of foundation models for time series forecasting has lagged behind. We present Lag-Llama, a…

    Submitted 8 February, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: First two authors contributed equally. All data, models and code used are open-source. GitHub: https://github.com/time-series-foundation-models/lag-llama

  10. arXiv:2310.01327  [pdf, other]

    cs.LG cs.AI stat.ML

    TACTiS-2: Better, Faster, Simpler Attentional Copulas for Multivariate Time Series

    Authors: Arjun Ashok, Étienne Marcotte, Valentina Zantedeschi, Nicolas Chapados, Alexandre Drouin

    Abstract: We introduce a new model for multivariate probabilistic time series prediction, designed to flexibly address a range of tasks including forecasting, interpolation, and their combinations. Building on copula theory, we propose a simplified objective for the recently-introduced transformer-based attentional copulas (TACTiS), wherein the number of distributional parameters now scales linearly with th…

    Submitted 25 March, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: 28 pages, 15 figures, The Twelfth International Conference on Learning Representations (ICLR 2024)

  11. arXiv:2304.09836  [pdf, other]

    cs.LG stat.ML

    Regions of Reliability in the Evaluation of Multivariate Probabilistic Forecasts

    Authors: Étienne Marcotte, Valentina Zantedeschi, Alexandre Drouin, Nicolas Chapados

    Abstract: Multivariate probabilistic time series forecasts are commonly evaluated via proper scoring rules, i.e., functions that are minimal in expectation for the ground-truth distribution. However, this property is not sufficient to guarantee good discrimination in the non-asymptotic regime. In this paper, we provide the first systematic finite-sample study of proper scoring rules for time-series forecast…

    Submitted 6 June, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: 47 pages, 37 figures, camera-ready version, Fortieth International Conference on Machine Learning (ICML 2023)

  12. arXiv:2202.03528  [pdf, other]

    cs.LG stat.ML

    TACTiS: Transformer-Attentional Copulas for Time Series

    Authors: Alexandre Drouin, Étienne Marcotte, Nicolas Chapados

    Abstract: The estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance. However, the practical utility of such estimates is limited by how accurately they quantify predictive uncertainty. In this work, we address the problem of estimating the joint predictive distribution of high-dimensional multivariate time series. We propose a versatile…

    Submitted 27 June, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: 47 pages, 33 figures, camera-ready version, Thirty-ninth International Conference on Machine Learning (ICML 2022)

  13. arXiv:2002.02887  [pdf, other]

    cs.LG stat.ML

    Meta-learning framework with applications to zero-shot time-series forecasting

    Authors: Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, Yoshua Bengio

    Abstract: Can meta-learning discover generic ways of processing time series (TS) from a diverse dataset so as to greatly improve generalization on new TS coming from different datasets? This work provides positive evidence to this using a broad meta-learning framework which we show subsumes many existing meta-learning algorithms. Our theoretical analysis suggests that residual connections act as a meta-lear…

    Submitted 14 December, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

  14. arXiv:1905.10437  [pdf, other]

    cs.LG stat.ML

    N-BEATS: Neural basis expansion analysis for interpretable time series forecasting

    Authors: Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, Yoshua Bengio

    Abstract: We focus on solving the univariate time series point forecasting problem using deep learning. We propose a deep neural architecture based on backward and forward residual links and a very deep stack of fully-connected layers. The architecture has a number of desirable properties, being interpretable, applicable without modification to a wide array of target domains, and fast to train. We test the…

    Submitted 20 February, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

  15. CASED: Curriculum Adaptive Sampling for Extreme Data Imbalance

    Authors: Andrew Jesson, Nicolas Guizard, Sina Hamidi Ghalehjegh, Damien Goblot, Florian Soudan, Nicolas Chapados

    Abstract: We introduce CASED, a novel curriculum sampling algorithm that facilitates the optimization of deep learning segmentation or detection models on data sets with extreme class imbalance. We evaluate the CASED learning framework on the task of lung nodule detection in chest CT. In contrast to two-stage solutions, wherein nodule candidates are first proposed by a segmentation model and refined by a se…

    Submitted 27 July, 2018; originally announced July 2018.

    Comments: 20th International Conference on Medical Image Computing and Computer Assisted Intervention 2017

  16. arXiv:1807.05344  [pdf]

    stat.ML cs.LG

    Adversarially Learned Mixture Model

    Authors: Andrew Jesson, Cécile Low-Kam, Tanya Nair, Florian Soudan, Florent Chandelier, Nicolas Chapados

    Abstract: The Adversarially Learned Mixture Model (AMM) is a generative model for unsupervised or semi-supervised data clustering. The AMM is the first adversarially optimized method to model the conditional dependence between inferred continuous and categorical latent variables. Experiments on the MNIST and SVHN datasets show that the AMM allows for semantic separation of complex data when little or no lab…

    Submitted 23 April, 2022; v1 submitted 14 July, 2018; originally announced July 2018.

  17. arXiv:1806.00852  [pdf, other]

    cs.LG cs.AI stat.ML

    On the Importance of Attention in Meta-Learning for Few-Shot Text Classification

    Authors: Xiang Jiang, Mohammad Havaei, Gabriel Chartrand, Hassan Chouaib, Thomas Vincent, Andrew Jesson, Nicolas Chapados, Stan Matwin

    Abstract: Current deep learning based text classification methods are limited by their ability to achieve fast learning and generalization when the data is scarce. We address this problem by integrating a meta-learning procedure that uses the knowledge learned across many tasks as an inductive bias towards better natural language understanding. Based on the Model-Agnostic Meta-Learning framework (MAML), we…

    Submitted 3 June, 2018; originally announced June 2018.

    Comments: 13 pages, 4 figures, submitted to NIPS

  18. arXiv:1607.05194  [pdf, other]

    cs.CV

    HeMIS: Hetero-Modal Image Segmentation

    Authors: Mohammad Havaei, Nicolas Guizard, Nicolas Chapados, Yoshua Bengio

    Abstract: We introduce a deep learning image segmentation framework that is extremely robust to missing imaging modalities. Instead of attempting to impute or synthesize missing data, the proposed approach learns, for each modality, an embedding of the input image into a single latent vector space for which arithmetic operations (such as taking the mean) are well defined. Points in that space, which are ave…

    Submitted 18 July, 2016; originally announced July 2016.

    Comments: Accepted as an oral presentation at MICCAI 2016