Showing 1–8 of 8 results for author: Sohoni, N S

  1. arXiv:2203.01517  [pdf, other]

    cs.LG

    Correct-N-Contrast: A Contrastive Approach for Improving Robustness to Spurious Correlations

    Authors: Michael Zhang, Nimit S. Sohoni, Hongyang R. Zhang, Chelsea Finn, Christopher Ré

    Abstract: Spurious correlations pose a major challenge for robust machine learning. Models trained with empirical risk minimization (ERM) may learn to rely on correlations between class labels and spurious attributes, leading to poor performance on data groups without these correlations. This is particularly challenging to address when spurious attribute labels are unavailable. To improve worst-group perfor…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: 38 pages, 14 figures. Preprint

  2. arXiv:2201.00072  [pdf, other]

    cs.LG

    BARACK: Partially Supervised Group Robustness With Guarantees

    Authors: Nimit S. Sohoni, Maziar Sanjabi, Nicolas Ballas, Aditya Grover, Shaoliang Nie, Hamed Firooz, Christopher Ré

    Abstract: While neural networks have shown remarkable success on classification tasks in terms of average-case performance, they often fail to perform well on certain groups of the data. Such group information may be expensive to obtain; thus, recent works in robustness and fairness have proposed ways to improve worst-group performance even when group labels are unavailable for the training data. However, t…

    Submitted 10 April, 2022; v1 submitted 31 December, 2021; originally announced January 2022.

    Comments: 26 pages

  3. arXiv:2109.05720  [pdf, other]

    cs.CV cs.LG

    Low-Shot Validation: Active Importance Sampling for Estimating Classifier Performance on Rare Categories

    Authors: Fait Poms, Vishnu Sarukkai, Ravi Teja Mullapudi, Nimit S. Sohoni, William R. Mark, Deva Ramanan, Kayvon Fatahalian

    Abstract: For machine learning models trained with limited labeled training data, validation stands to become the main bottleneck to reducing overall annotation costs. We propose a statistical validation algorithm that accurately estimates the F-score of binary classifiers for rare categories, where finding relevant examples to evaluate on is particularly challenging. Our key insight is that simultaneous ca…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: Accepted to ICCV 2021; 12 pages, 12 figures

  4. arXiv:2107.00643  [pdf, other]

    cs.LG

    Mandoline: Model Evaluation under Distribution Shift

    Authors: Mayee Chen, Karan Goel, Nimit S. Sohoni, Fait Poms, Kayvon Fatahalian, Christopher Ré

    Abstract: Machine learning models are often deployed in different settings than they were trained and validated on, posing a challenge to practitioners who wish to predict how well the deployed model will perform on a target distribution. If an unlabeled sample from the target distribution is available, along with a labeled sample from a possibly different source distribution, standard approaches such as im…

    Submitted 10 April, 2022; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: 33 pages. Published as a conference paper at ICML 2021

  5. arXiv:2012.14966  [pdf, other]

    cs.LG stat.ML

    Kaleidoscope: An Efficient, Learnable Representation For All Structured Linear Maps

    Authors: Tri Dao, Nimit S. Sohoni, Albert Gu, Matthew Eichhorn, Amit Blonder, Megan Leszczynski, Atri Rudra, Christopher Ré

    Abstract: Modern neural network architectures use structured linear transformations, such as low-rank matrices, sparse matrices, permutations, and the Fourier transform, to improve inference speed and reduce memory usage compared to general linear maps. However, choosing which of the myriad structured transformations to use (and its associated parameterization) is a laborious task that requires trading off…

    Submitted 5 January, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

    Comments: International Conference on Learning Representations (ICLR) 2020 spotlight

  6. arXiv:2011.12945  [pdf, other]

    cs.LG cs.CV

    No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems

    Authors: Nimit S. Sohoni, Jared A. Dunnmon, Geoffrey Angus, Albert Gu, Christopher Ré

    Abstract: In real-world classification tasks, each class often comprises multiple finer-grained "subclasses." As the subclass labels are frequently unavailable, models trained using only the coarser-grained class labels often exhibit highly variable performance across different subclasses. This phenomenon, known as hidden stratification, has important consequences for models deployed in safety-critical appl…

    Submitted 10 April, 2022; v1 submitted 25 November, 2020; originally announced November 2020.

    Comments: 40 pages. Published as a conference paper at NeurIPS 2020

  7. arXiv:1906.11985  [pdf, other]

    math.OC cs.CC cs.DS cs.LG stat.ML

    Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond

    Authors: Oliver Hinder, Aaron Sidford, Nimit S. Sohoni

    Abstract: In this paper, we provide near-optimal accelerated first-order methods for minimizing a broad class of smooth nonconvex functions that are strictly unimodal on all lines through a minimizer. This function class, which we call the class of smooth quasar-convex functions, is parameterized by a constant $\gamma \in (0,1]$, where $\gamma = 1$ encompasses the classes of smooth convex and star-convex functions, and…

    Submitted 24 February, 2023; v1 submitted 27 June, 2019; originally announced June 2019.

    Comments: 48 pages. Published as a conference paper at COLT 2020

  8. arXiv:1904.10631  [pdf, other]

    cs.LG stat.ML

    Low-Memory Neural Network Training: A Technical Report

    Authors: Nimit S. Sohoni, Christopher R. Aberger, Megan Leszczynski, Jian Zhang, Christopher Ré

    Abstract: Memory is increasingly often the bottleneck when training neural network models. Despite this, techniques to lower the overall memory requirements of training have been less widely studied compared to the extensive literature on reducing the memory requirements of inference. In this paper we study a fundamental question: How much memory is actually needed to train a neural network? To answer this…

    Submitted 8 April, 2022; v1 submitted 23 April, 2019; originally announced April 2019.

    Comments: Version notes: Copyedits and citation fixes