Showing 1–3 of 3 results for author: Cohen-Wang, B

Searching in archive cs.
  1. arXiv:2403.00194

    cs.LG

    Ask Your Distribution Shift if Pre-Training is Right for You

    Authors: Benjamin Cohen-Wang, Joshua Vendrow, Aleksander Madry

    Abstract: Pre-training is a widely used approach to develop models that are robust to distribution shifts. However, in practice, its effectiveness varies: fine-tuning a pre-trained model improves robustness significantly in some cases but not at all in others (compared to training from scratch). In this work, we seek to characterize the failure modes that pre-training can and cannot address. In particular,…

    Submitted 29 February, 2024; originally announced March 2024.

  2. arXiv:2312.02132

    cs.LG cs.AI cs.CR cs.DS

    Hot PATE: Private Aggregation of Distributions for Diverse Task

    Authors: Edith Cohen, Benjamin Cohen-Wang, Xin Lyu, Jelani Nelson, Tamas Sarlos, Uri Stemmer

    Abstract: The Private Aggregation of Teacher Ensembles (PATE) framework is a versatile approach to privacy-preserving machine learning. In PATE, teacher models that are not privacy-preserving are trained on distinct portions of sensitive data. Privacy-preserving knowledge transfer to a student model is then facilitated by privately aggregating teachers' predictions on new examples. Employing PATE with gener…

    Submitted 17 May, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

  3. arXiv:2103.02761

    cs.LG stat.ML

    Comparing the Value of Labeled and Unlabeled Data in Method-of-Moments Latent Variable Estimation

    Authors: Mayee F. Chen, Benjamin Cohen-Wang, Stephen Mussmann, Frederic Sala, Christopher Ré

    Abstract: Labeling data for modern machine learning is expensive and time-consuming. Latent variable models can be used to infer labels from weaker, easier-to-acquire sources operating on unlabeled data. Such models can also be trained using labeled data, presenting a key question: should a user invest in few labeled or many unlabeled points? We answer this via a framework centered on model misspecification…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: To appear in AISTATS 2021