Showing 1–14 of 14 results for author: Zappella, L

  1. arXiv:2407.12824

    cs.CL cs.AI

    Whispering Experts: Neural Interventions for Toxicity Mitigation in Language Models

    Authors: Xavier Suau, Pieter Delobelle, Katherine Metcalf, Armand Joulin, Nicholas Apostoloff, Luca Zappella, Pau Rodríguez

    Abstract: An important issue with Large Language Models (LLMs) is their undesired ability to generate toxic language. In this work, we show that the neurons responsible for toxicity can be determined by their power to discriminate toxic sentences, and that toxic language can be mitigated by reducing their activation levels proportionally to this power. We propose AUROC adaptation (AurA), an intervention tha…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: ICML 2024, 8 pages + appendix

  2. arXiv:2310.13040

    cs.LG cs.AI cs.CV

    Robust multimodal models have outlier features and encode more concepts

    Authors: Jonathan Crabbé, Pau Rodríguez, Vaishaal Shankar, Luca Zappella, Arno Blaas

    Abstract: What distinguishes robust models from non-robust ones? This question has gained traction with the appearance of large-scale multimodal models, such as CLIP. These models have demonstrated unprecedented robustness with respect to natural distribution shifts. While it has been shown that such differences in robustness can be traced back to differences in training data, so far it is not known what th…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 29 pages, 18 figures

  3. arXiv:2309.16318

    cs.LG

    DeepPCR: Parallelizing Sequential Operations in Neural Networks

    Authors: Federico Danieli, Miguel Sarabia, Xavier Suau, Pau Rodríguez, Luca Zappella

    Abstract: Parallelization techniques have become ubiquitous for accelerating inference and training of deep neural networks. Despite this, several operations are still performed in a sequential manner. For instance, the forward and backward passes are executed layer-by-layer, and the output of diffusion models is produced by applying a sequence of denoising steps. This sequential approach results in a compu…

    Submitted 27 October, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

  4. arXiv:2308.09514

    cs.SD cs.AI cs.LG eess.AS

    Spatial LibriSpeech: An Augmented Dataset for Spatial Audio Learning

    Authors: Miguel Sarabia, Elena Menyaylenko, Alessandro Toso, Skyler Seto, Zakaria Aldeneh, Shadi Pirhosseinloo, Luca Zappella, Barry-John Theobald, Nicholas Apostoloff, Jonathan Sheaffer

    Abstract: We present Spatial LibriSpeech, a spatial audio dataset with over 650 hours of 19-channel audio, first-order ambisonics, and optional distractor noise. Spatial LibriSpeech is designed for machine learning model training, and it includes labels for source position, speaking direction, room acoustics and geometry. Spatial LibriSpeech is generated by augmenting LibriSpeech samples with 200k+ simulate…

    Submitted 18 August, 2023; originally announced August 2023.

    Journal ref: Proceedings of INTERSPEECH (2023), pp. 3724-3728

  5. arXiv:2307.10907

    cs.LG

    The Role of Entropy and Reconstruction in Multi-View Self-Supervised Learning

    Authors: Borja Rodríguez-Gálvez, Arno Blaas, Pau Rodríguez, Adam Goliński, Xavier Suau, Jason Ramapuram, Dan Busbridge, Luca Zappella

    Abstract: The mechanisms behind the success of multi-view self-supervised learning (MVSSL) are not yet fully understood. Contrastive MVSSL methods have been studied through the lens of InfoNCE, a lower bound of the Mutual Information (MI). However, the relation between other MVSSL methods and MI remains unclear. We consider a different lower bound on the MI consisting of an entropy and a reconstruction term…

    Submitted 9 December, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: 18 pages: 9 of main text, 2 of references, and 7 of supplementary material [Updated typo in page 6 (Section 3.2)]. Appears in the proceedings of ICML 2023

  6. arXiv:2306.16058

    cs.LG cs.AI

    DUET: 2D Structured and Approximately Equivariant Representations

    Authors: Xavier Suau, Federico Danieli, T. Anderson Keller, Arno Blaas, Chen Huang, Jason Ramapuram, Dan Busbridge, Luca Zappella

    Abstract: Multiview Self-Supervised Learning (MSSL) is based on learning invariances with respect to a set of input transformations. However, invariance partially or totally removes transformation-related information from the representations, which might harm performance for specific downstream tasks that require such information. We propose 2D strUctured and EquivarianT representations (coined DUET), which…

    Submitted 17 November, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: Accepted at ICML 2023

  7. arXiv:2301.10319

    cs.HC cs.AI cs.LG

    Designing Data: Proactive Data Collection and Iteration for Machine Learning

    Authors: Aspen Hopkins, Fred Hohman, Luca Zappella, Xavier Suau Cuadros, Dominik Moritz

    Abstract: Lack of diversity in data collection has caused significant failures in machine learning (ML) applications. While ML developers perform post-collection interventions, these are time intensive and rarely comprehensive. Thus, new methods to track & manage data collection, iteration, and model training are necessary for evaluating whether datasets reflect real world variability. We present designing…

    Submitted 28 July, 2023; v1 submitted 24 January, 2023; originally announced January 2023.

    Comments: AI + HCI workshop at ICML 2023

  8. arXiv:2211.08282

    cs.LG cs.AI

    Homomorphic Self-Supervised Learning

    Authors: T. Anderson Keller, Xavier Suau, Luca Zappella

    Abstract: In this work, we observe that many existing self-supervised learning algorithms can be both unified and generalized when seen through the lens of equivariant representations. Specifically, we introduce a general framework we call Homomorphic Self-Supervised Learning, and theoretically show how it may subsume the use of input-augmentations provided an augmentation-homomorphic feature extractor. We…

    Submitted 15 November, 2022; originally announced November 2022.

  9. arXiv:2211.05304

    cs.CV cs.LG

    Contrastive Self-Supervised Learning for Skeleton Representations

    Authors: Nico Lingg, Miguel Sarabia, Luca Zappella, Barry-John Theobald

    Abstract: Human skeleton point clouds are commonly used to automatically classify and predict the behaviour of others. In this paper, we use a contrastive self-supervised learning method, SimCLR, to learn representations that capture the semantics of skeleton point clouds. This work focuses on systematically evaluating the effects that different algorithmic decisions (including augmentations, dataset partit…

    Submitted 9 November, 2022; originally announced November 2022.

    Comments: 8 pages, 2 figures, 4 tables. Accepted at the NeurIPS 2022 Workshop: Self-Supervised Learning - Theory and Practice

  10. arXiv:2202.03586

    cs.CV cs.AI cs.LG

    Fair SA: Sensitivity Analysis for Fairness in Face Recognition

    Authors: Aparna R. Joshi, Xavier Suau, Nivedha Sivakumar, Luca Zappella, Nicholas Apostoloff

    Abstract: As the use of deep learning in high impact domains becomes ubiquitous, it is increasingly important to assess the resilience of models. One such high impact domain is that of face recognition, with real world applications involving images affected by various degradations, such as motion blur or high exposure. Moreover, images captured across different attributes, such as gender and race, can also…

    Submitted 9 February, 2022; v1 submitted 7 February, 2022; originally announced February 2022.

    Comments: 8 pages, 5 figures, to be published in NeurIPS 2021 Workshop, Algorithmic Fairness through the Lens of Causality and Robustness

  11. arXiv:2111.12427

    cs.LG cs.CV

    Challenges of Adversarial Image Augmentations

    Authors: Arno Blaas, Xavier Suau, Jason Ramapuram, Nicholas Apostoloff, Luca Zappella

    Abstract: Image augmentations applied during training are crucial for the generalization performance of image classifiers. Therefore, a large body of research has focused on finding the optimal augmentation policy for a given task. Yet, RandAugment [2], a simple random augmentation policy, has recently been shown to outperform existing sophisticated policies. Only Adversarial AutoAugment (AdvAA) [11], an ap…

    Submitted 3 December, 2021; v1 submitted 24 November, 2021; originally announced November 2021.

    Comments: To appear at the ICBINB 2021 NeurIPS Workshop

  12. arXiv:2110.02802

    cs.CL

    Self-conditioning pre-trained language models

    Authors: Xavier Suau, Luca Zappella, Nicholas Apostoloff

    Abstract: In this paper we aim to investigate the mechanisms that guide text generation with pre-trained Transformer-based Language Models (TLMs). Grounded on the Product of Experts formulation by Hinton (1999), we describe a generative mechanism that exploits expert units which naturally exist in TLMs. Such units are responsible for detecting concepts in the input and conditioning text generation on such c…

    Submitted 14 June, 2023; v1 submitted 30 September, 2021; originally announced October 2021.

    Comments: 8 pages and supplementary material, accepted at ICML 2022

  13. arXiv:2005.07647

    cs.AI cs.CL cs.LG

    Finding Experts in Transformer Models

    Authors: Xavier Suau, Luca Zappella, Nicholas Apostoloff

    Abstract: In this work we study the presence of expert units in pre-trained Transformer Models (TM), and how they impact a model's performance. We define expert units to be neurons that are able to classify a concept with a given average precision, where a concept is represented by a binary set of sentences containing the concept (or not). Leveraging the OneSec dataset (Scarlini et al., 2019), we compile a…

    Submitted 15 May, 2020; originally announced May 2020.

  14. arXiv:1807.10585

    cs.CV

    Filter Distillation for Network Compression

    Authors: Xavier Suau, Luca Zappella, Nicholas Apostoloff

    Abstract: In this paper we introduce Principal Filter Analysis (PFA), an easy to use and effective method for neural network compression. PFA exploits the correlation between filter responses within network layers to recommend a smaller network that maintains as much as possible the accuracy of the full model. We propose two algorithms: the first allows users to target compression to specific network propert…

    Submitted 11 December, 2019; v1 submitted 20 July, 2018; originally announced July 2018.

    Comments: 10 pages, 3 figures, Deep neural network compression, spectral analysis, machine learning

    Journal ref: WACV 2020