Showing 1–10 of 10 results for author: Bie, A

  1. arXiv:2407.12108  [pdf, other]

    cs.LG cs.CL cs.CR

    Private prediction for large-scale synthetic text generation

    Authors: Kareem Amin, Alex Bie, Weiwei Kong, Alexey Kurakin, Natalia Ponomareva, Umar Syed, Andreas Terzis, Sergei Vassilvitskii

    Abstract: We present an approach for generating differentially private synthetic text using large language models (LLMs), via private prediction. In the private prediction framework, we only require the output synthetic data to satisfy differential privacy guarantees. This is in contrast to approaches that train a generative model on potentially sensitive user-supplied source data and seek to ensure the mod…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 12 pages main text + 15 pages appendix

  2. arXiv:2406.17814  [pdf, ps, other]

    stat.ML cs.DS cs.IT cs.LG math.ST

    Distribution Learnability and Robustness

    Authors: Shai Ben-David, Alex Bie, Gautam Kamath, Tosca Lechner

    Abstract: We examine the relationship between learnability and robust (or agnostic) learnability for the problem of distribution learning. We show that, contrary to other learning settings (e.g., PAC learning of function classes), realizable learnability of a class of probability distributions does not imply its agnostic learnability. We go on to examine what type of data corruption can disrupt the learnabi…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: In NeurIPS 2023

  3. arXiv:2402.01862  [pdf, other]

    cs.LG cs.AI

    Parametric Feature Transfer: One-shot Federated Learning with Foundation Models

    Authors: Mahdi Beitollahi, Alex Bie, Sobhan Hemati, Leo Maxime Brunswic, Xu Li, Xi Chen, Guojun Zhang

    Abstract: In one-shot federated learning (FL), clients collaboratively train a global model in a single round of communication. Existing approaches for one-shot FL enhance communication efficiency at the expense of diminished accuracy. This paper introduces FedPFT (Federated Learning with Parametric Feature Transfer), a methodology that harnesses the transferability of foundation models to enhance both accu…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 20 pages, 12 figures

  4. arXiv:2308.09565  [pdf, other]

    cs.LG stat.ML

    Understanding the Role of Layer Normalization in Label-Skewed Federated Learning

    Authors: Guojun Zhang, Mahdi Beitollahi, Alex Bie, Xi Chen

    Abstract: Layer normalization (LN) is a widely adopted deep learning technique, especially in the era of foundation models. Recently, LN has been shown to be surprisingly effective in federated learning (FL) with non-i.i.d. data. However, exactly why and how it works remains mysterious. In this work, we reveal the profound connection between layer normalization and the label shift problem in federated learni…

    Submitted 14 February, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: accepted at TMLR

  5. arXiv:2308.06239  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Private Distribution Learning with Public Data: The View from Sample Compression

    Authors: Shai Ben-David, Alex Bie, Clément L. Canonne, Gautam Kamath, Vikrant Singhal

    Abstract: We study the problem of private distribution learning with access to public data. In this setup, which we refer to as public-private learning, the learner is given public and private samples drawn from an unknown distribution $p$ belonging to a class $\mathcal Q$, with the goal of outputting an estimate of $p$ while adhering to privacy constraints (here, pure differential privacy) only with respec…

    Submitted 14 August, 2023; v1 submitted 11 August, 2023; originally announced August 2023.

    Comments: 31 pages

  6. arXiv:2302.02936  [pdf, other]

    cs.LG cs.CR cs.CV

    Private GANs, Revisited

    Authors: Alex Bie, Gautam Kamath, Guojun Zhang

    Abstract: We show that the canonical approach for training differentially private GANs -- updating the discriminator with differentially private stochastic gradient descent (DPSGD) -- can yield significantly improved results after modifications to training. Specifically, we propose that existing instantiations of this approach neglect to consider how adding noise only to discriminator updates inhibits discr…

    Submitted 5 October, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: 28 pages; revisions and new experiments from TMLR camera-ready + code release at https://github.com/alexbie98/dpgan-revisit

  7. arXiv:2208.07984  [pdf, other]

    cs.LG cs.CR stat.ML

    Private Estimation with Public Data

    Authors: Alex Bie, Gautam Kamath, Vikrant Singhal

    Abstract: We initiate the study of differentially private (DP) estimation with access to a small amount of public data. For private estimation of d-dimensional Gaussians, we assume that the public data comes from a Gaussian that may have vanishing similarity in total variation distance with the underlying Gaussian of the private data. We show that under the constraints of pure or concentrated DP, d+1 public…

    Submitted 5 April, 2023; v1 submitted 16 August, 2022; originally announced August 2022.

    Comments: 55 pages; updated funding acknowledgement + simulation results from NeurIPS 2022 camera-ready

  8. arXiv:2111.01177  [pdf, other]

    cs.LG cs.CR

    Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence

    Authors: Tianshi Cao, Alex Bie, Arash Vahdat, Sanja Fidler, Karsten Kreis

    Abstract: Although machine learning models trained on massive data have led to breakthroughs in several areas, their deployment in privacy-sensitive domains remains limited due to restricted access to data. Generative models trained with privacy constraints on private data can sidestep this challenge, providing indirect access to private data instead. We propose DP-Sinkhorn, a novel optimal transport-based…

    Submitted 29 November, 2021; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: Accepted to NeurIPS 2021. 13 pages, 7 pages of supplementary; 6 tables, 8 figures

    Journal ref: Advances in Neural Information Processing Systems, Volume 34, pages 12480--12492, year 2021

  9. A Model-Based Approach to Synthetic Data Set Generation for Patient-Ventilator Waveforms for Machine Learning and Educational Use

    Authors: A. van Diepen, T. H. G. F. Bakkes, A. J. R. De Bie, S. Turco, R. A. Bouwman, P. H. Woerlee, M. Mischi

    Abstract: Although mechanical ventilation is a lifesaving intervention in the ICU, it has harmful side-effects, such as barotrauma and volutrauma. These harms can occur due to asynchronies. Asynchronies are defined as a mismatch between the ventilator timing and patient respiratory effort. Automatic detection of these asynchronies, and subsequent feedback, would improve lung ventilation and reduce the proba…

    Submitted 7 May, 2021; v1 submitted 29 March, 2021; originally announced March 2021.

    Journal ref: J Clin Monit Comput (2022)

  10. arXiv:1911.03604  [pdf, other]

    cs.CL cs.SD eess.AS

    A Simplified Fully Quantized Transformer for End-to-end Speech Recognition

    Authors: Alex Bie, Bharat Venkitesh, Joao Monteiro, Md. Akmal Haidar, Mehdi Rezagholizadeh

    Abstract: While significant improvements have been made in recent years in terms of end-to-end automatic speech recognition (ASR) performance, such improvements were obtained through the use of very large neural networks, unfit for embedded use on edge devices. That being said, in this paper, we work on simplifying and compressing Transformer-based encoder-decoder architectures for the end-to-end ASR task.…

    Submitted 24 March, 2020; v1 submitted 8 November, 2019; originally announced November 2019.

    Comments: Submitted to IEEE Signal Processing Letters. Minor changes in Section 3