Showing 1–24 of 24 results for author: Muthukumar, V

Searching in archive stat.
  1. arXiv:2406.02769  [pdf, other]

    stat.ML cs.LG

    Precise asymptotics of reweighted least-squares algorithms for linear diagonal networks

    Authors: Chiraag Kaushik, Justin Romberg, Vidya Muthukumar

    Abstract: The classical iteratively reweighted least-squares (IRLS) algorithm aims to recover an unknown signal from linear measurements by performing a sequence of weighted least squares problems, where the weights are recursively updated at each step. Varieties of this algorithm have been shown to achieve favorable empirical performance and theoretical guarantees for sparse recovery and $\ell_p$-norm mini…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 25 pages, 3 figures

  2. arXiv:2405.06546  [pdf, other]

    stat.ML cs.IT cs.LG

    Sharp analysis of out-of-distribution error for "importance-weighted" estimators in the overparameterized regime

    Authors: Kuo-Wei Lai, Vidya Muthukumar

    Abstract: Overparameterized models that achieve zero training error are observed to generalize well on average, but degrade in performance when faced with data that is under-represented in the training sample. In this work, we study an overparameterized Gaussian mixture model imbued with a spurious feature, and sharply analyze the in-distribution and out-of-distribution test error of a cost-sensitive interp…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: A short version of this work will be presented at IEEE ISIT 2024

  3. arXiv:2404.05819  [pdf, other]

    stat.ML cs.IT cs.LG math.PR math.ST

    Just Wing It: Optimal Estimation of Missing Mass in a Markovian Sequence

    Authors: Ashwin Pananjady, Vidya Muthukumar, Andrew Thangaraj

    Abstract: We study the problem of estimating the stationary mass -- also called the unigram mass -- that is missing from a single trajectory of a discrete-time, ergodic Markov chain. This problem has several applications -- for example, estimating the stationary missing mass is critical for accurately smoothing probability estimates in sequence models. While the classical Good--Turing estimator from the 195…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 40 pages, 5 figures

  4. arXiv:2402.11742  [pdf, other]

    cs.LG stat.ML

    Balanced Data, Imbalanced Spectra: Unveiling Class Disparities with Spectral Imbalance

    Authors: Chiraag Kaushik, Ran Liu, Chi-Heng Lin, Amrit Khera, Matthew Y Jin, Wenrui Ma, Vidya Muthukumar, Eva L Dyer

    Abstract: Classification models are expected to perform equally well for different classes, yet in practice, there are often large gaps in their performance. This issue of class bias is widely studied in cases of datasets with sample imbalance, but is relatively overlooked in balanced datasets. In this work, we introduce the concept of spectral imbalance in features as a potential source for class dispariti…

    Submitted 3 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: 25 pages, 9 figures

  5. arXiv:2305.02304  [pdf, ps, other]

    stat.ML cs.LG

    New Equivalences Between Interpolation and SVMs: Kernels and Structured Features

    Authors: Chiraag Kaushik, Andrew D. McRae, Mark A. Davenport, Vidya Muthukumar

    Abstract: The support vector machine (SVM) is a supervised learning algorithm that finds a maximum-margin linear classifier, often after mapping the data to a high-dimensional feature space via the kernel trick. Recent work has demonstrated that in certain sufficiently overparameterized settings, the SVM decision function coincides exactly with the minimum-norm label interpolant. This phenomenon of support…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: 23 pages, 2 figures

    MSC Class: 62H30; 62J07; 68Q32; 46E22

  6. arXiv:2303.07475  [pdf, other]

    stat.ML cs.LG math.OC

    General Loss Functions Lead to (Approximate) Interpolation in High Dimensions

    Authors: Kuo-Wei Lai, Vidya Muthukumar

    Abstract: We provide a unified framework, applicable to a general family of convex losses and across binary and multiclass settings in the overparameterized regime, to approximately characterize the implicit bias of gradient descent in closed form. Specifically, we show that the implicit bias is approximated by (but not exactly equal to) the minimum-norm interpolation in high dimensions, which arises from trai…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: 52 pages

  7. arXiv:2302.09451  [pdf, other]

    cs.LG stat.ML

    Estimating Optimal Policy Value in General Linear Contextual Bandits

    Authors: Jonathan N. Lee, Weihao Kong, Aldo Pacchiano, Vidya Muthukumar, Emma Brunskill

    Abstract: In many bandit problems, the maximal reward achievable by a policy is often unknown in advance. We consider the problem of estimating the optimal policy value in the sublinear data regime before the optimal policy is even learnable. We refer to this as $V^*$ estimation. It was recently shown that fast $V^*$ estimation is possible but only in disjoint linear bandits with Gaussian covariates. Whethe…

    Submitted 18 February, 2023; originally announced February 2023.

  8. arXiv:2210.09385  [pdf, other]

    cs.LG stat.ML

    Adaptive Oracle-Efficient Online Learning

    Authors: Guanghui Wang, Zihao Hu, Vidya Muthukumar, Jacob Abernethy

    Abstract: The classical algorithms for online learning and decision-making have the benefit of achieving the optimal performance guarantees, but suffer from computational complexity limitations when implemented at scale. More recent sophisticated techniques, which we refer to as oracle-efficient methods, address this problem by dispatching to an offline optimization oracle that can search through an exponen…

    Submitted 17 October, 2022; originally announced October 2022.

  9. arXiv:2210.05021  [pdf, other]

    cs.LG stat.ML

    The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective

    Authors: Chi-Heng Lin, Chiraag Kaushik, Eva L. Dyer, Vidya Muthukumar

    Abstract: Data augmentation (DA) is a powerful workhorse for bolstering performance in modern machine learning. Specific augmentations like translations and scaling in computer vision are traditionally believed to improve generalization by generating new (artificial) data from the same distribution. However, this traditional viewpoint does not explain the success of prevalent augmentations in modern machine…

    Submitted 27 February, 2024; v1 submitted 10 October, 2022; originally announced October 2022.

    Comments: 72 pages, 8 figures

  10. arXiv:2111.05198  [pdf, other]

    stat.ML cs.LG math.ST

    Harmless interpolation in regression and classification with structured features

    Authors: Andrew D. McRae, Santhosh Karnik, Mark A. Davenport, Vidya Muthukumar

    Abstract: Overparametrized neural networks tend to perfectly fit noisy training data yet generalize well on test data. Inspired by this empirical observation, recent work has sought to understand this phenomenon of benign overfitting or harmless interpolation in the much simpler linear model. Previous theoretical work critically assumes that either the data features are statistically independent or the inpu…

    Submitted 21 February, 2022; v1 submitted 9 November, 2021; originally announced November 2021.

  11. arXiv:2111.04688  [pdf, ps, other]

    cs.LG stat.ML

    Universal and data-adaptive algorithms for model selection in linear contextual bandits

    Authors: Vidya Muthukumar, Akshay Krishnamurthy

    Abstract: Model selection in contextual bandits is an important complementary problem to regret minimization with respect to a fixed model class. We consider the simplest non-trivial instance of model-selection: distinguishing a simple multi-armed bandit problem from a linear contextual bandit problem. Even in this instance, current state-of-the-art methods explore in a suboptimal manner and require strong…

    Submitted 30 June, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

    Comments: 30 pages, to appear in ICML 2022

  12. arXiv:2109.13215  [pdf, other]

    cs.LG cs.IT stat.ML

    Classification and Adversarial examples in an Overparameterized Linear Model: A Signal Processing Perspective

    Authors: Adhyyan Narang, Vidya Muthukumar, Anant Sahai

    Abstract: State-of-the-art deep learning classifiers are heavily overparameterized with respect to the amount of training examples and observed to generalize well on "clean" data, but be highly susceptible to infinitesimal adversarial perturbations. In this paper, we identify an overparameterized linear ensemble, that uses the "lifted" Fourier feature map, that demonstrates both of these behaviors. The input…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: 32 pages, 10 figures

  13. arXiv:2109.02355  [pdf, other]

    stat.ML cs.LG

    A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning

    Authors: Yehuda Dar, Vidya Muthukumar, Richard G. Baraniuk

    Abstract: The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex with respect to the size of the training dataset, which results in them perfectly fitting (i.e., interpo…

    Submitted 6 September, 2021; originally announced September 2021.

  14. arXiv:2106.14866  [pdf, other]

    stat.ML cs.AI cs.IT cs.LG cs.RO

    Learning from an Exploring Demonstrator: Optimal Reward Estimation for Bandits

    Authors: Wenshuo Guo, Kumar Krishna Agrawal, Aditya Grover, Vidya Muthukumar, Ashwin Pananjady

    Abstract: We introduce the "inverse bandit" problem of estimating the rewards of a multi-armed bandit instance from observing the learning process of a low-regret demonstrator. Existing approaches to the related problem of inverse reinforcement learning assume the execution of an optimal policy, and thereby suffer from an identifiability issue. In contrast, we propose to leverage the demonstrator's behavior…

    Submitted 22 February, 2022; v1 submitted 28 June, 2021; originally announced June 2021.

    Comments: Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2022

  15. arXiv:2106.10865  [pdf, other]

    stat.ML cs.IT cs.LG

    Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation

    Authors: Ke Wang, Vidya Muthukumar, Christos Thrampoulidis

    Abstract: The literature on "benign overfitting" in overparameterized models has been mostly restricted to regression or binary classification; however, modern machine learning operates in the multiclass setting. Motivated by this discrepancy, we study benign overfitting in multiclass linear classification. Specifically, we consider the following training algorithms on separable data: (i) empirical risk min…

    Submitted 11 July, 2023; v1 submitted 21 June, 2021; originally announced June 2021.

    Comments: Corrected subtle issue in the proofs of Lemmas 4 and 5. Relaxed Assumptions 1 and 2 and added error bound for ETF geometry

  16. arXiv:2012.02125  [pdf, other]

    cs.GT stat.ML

    On the Impossibility of Convergence of Mixed Strategies with No Regret Learning

    Authors: Vidya Muthukumar, Soham Phade, Anant Sahai

    Abstract: We study the limiting behavior of the mixed strategies that result from optimal no-regret learning strategies in a repeated game setting where the stage game is any 2 by 2 competitive game. We consider optimal no-regret algorithms that are mean-based and monotonic in their argument. We show that for any such algorithm, the limiting mixed strategies of the players cannot converge almost surely to a…

    Submitted 2 March, 2022; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: 47 pages, 12 figures

  17. arXiv:2011.09750  [pdf, ps, other]

    cs.LG stat.ML

    Online Model Selection for Reinforcement Learning with Function Approximation

    Authors: Jonathan N. Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

    Abstract: Deep reinforcement learning has achieved impressive successes yet often requires a very large amount of interaction data. This result is perhaps unsurprising, as using complicated function approximation often requires more data to fit, and early theoretical results on linear Markov decision processes provide regret bounds that scale with the dimension of the linear approximation. Ideally, we would…

    Submitted 19 November, 2020; originally announced November 2020.

  18. arXiv:2009.10670  [pdf, other]

    math.ST cs.LG stat.ML

    On the proliferation of support vectors in high dimensions

    Authors: Daniel Hsu, Vidya Muthukumar, Ji Xu

    Abstract: The support vector machine (SVM) is a well-established classification method whose name refers to the particular training examples, called support vectors, that determine the maximum margin separating hyperplane. The SVM classifier is known to enjoy good generalization properties when the number of support vectors is small compared to the number of training examples. However, recent research has s…

    Submitted 13 June, 2022; v1 submitted 22 September, 2020; originally announced September 2020.

  19. arXiv:2005.08054  [pdf, other]

    cs.LG cs.IT stat.ML

    Classification vs regression in overparameterized regimes: Does the loss function matter?

    Authors: Vidya Muthukumar, Adhyyan Narang, Vignesh Subramanian, Mikhail Belkin, Daniel Hsu, Anant Sahai

    Abstract: We compare classification and regression tasks in an overparameterized linear model with Gaussian features. On the one hand, we show that with sufficient overparameterization all training points are support vectors: solutions obtained by least-squares minimum-norm interpolation, typically used for regression, are identical to those produced by the hard-margin support vector machine (SVM) that mini…

    Submitted 14 October, 2021; v1 submitted 16 May, 2020; originally announced May 2020.

    Journal ref: Journal of Machine Learning Research, 22(222):1-69, 2021

  20. arXiv:1905.10040  [pdf, other]

    stat.ML cs.LG

    OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits

    Authors: Niladri S. Chatterji, Vidya Muthukumar, Peter L. Bartlett

    Abstract: We consider the stochastic linear (multi-armed) contextual bandit problem with the possibility of hidden simple multi-armed bandit structure in which the rewards are independent of the contextual information. Algorithms that are designed solely for one of the regimes are known to be sub-optimal for the alternate regime. We design a single computationally efficient algorithm that simultaneously obt…

    Submitted 5 October, 2020; v1 submitted 24 May, 2019; originally announced May 2019.

  21. arXiv:1903.09139  [pdf, other]

    cs.LG stat.ML

    Harmless interpolation of noisy data in regression

    Authors: Vidya Muthukumar, Kailas Vodrahalli, Vignesh Subramanian, Anant Sahai

    Abstract: A continuing mystery in understanding the empirical success of deep neural networks is their ability to achieve zero training error and generalize well, even when the training data is noisy and there are more parameters than data points. We investigate this overparameterized regime in linear regression, where all solutions that minimize training error interpolate the data, including noise. We char…

    Submitted 9 September, 2019; v1 submitted 21 March, 2019; originally announced March 2019.

    Comments: 52 pages, expanded version of the paper presented at ITA in San Diego in Feb 2019, ISIT in Paris in July 2019, at Simons in July, and as a plenary at ITW in Visby in August 2019

  22. arXiv:1812.00099  [pdf, other]

    cs.CV cs.CY stat.ML

    Understanding Unequal Gender Classification Accuracy from Face Images

    Authors: Vidya Muthukumar, Tejaswini Pedapati, Nalini Ratha, Prasanna Sattigeri, Chai-Wah Wu, Brian Kingsbury, Abhishek Kumar, Samuel Thomas, Aleksandra Mojsilovic, Kush R. Varshney

    Abstract: Recent work shows unequal performance of commercial face classification services in the gender classification task across intersectional groups defined by skin type and gender. Accuracy on dark-skinned females is significantly worse than on any other group. In this paper, we conduct several analyses to try to uncover the reason for this gap. The main finding, perhaps surprisingly, is that skin typ…

    Submitted 30 November, 2018; originally announced December 2018.

  23. arXiv:1805.08562  [pdf, other]

    cs.LG stat.ML

    Best of many worlds: Robust model selection for online supervised learning

    Authors: Vidya Muthukumar, Mitas Ray, Anant Sahai, Peter L. Bartlett

    Abstract: We introduce algorithms for online, full-information prediction that are competitive with contextual tree experts of unknown complexity, in both probabilistic and adversarial settings. We show that by incorporating a probabilistic framework of structural risk minimization into existing adaptive algorithms, we can robustly learn not only the presence of stochastic structure when it exists (leading…

    Submitted 22 May, 2018; originally announced May 2018.

    Comments: 33 pages, 5 figures

  24. arXiv:1707.06217  [pdf, other]

    cs.LG cs.AI cs.IT stat.ML

    Worst-case vs Average-case Design for Estimation from Fixed Pairwise Comparisons

    Authors: Ashwin Pananjady, Cheng Mao, Vidya Muthukumar, Martin J. Wainwright, Thomas A. Courtade

    Abstract: Pairwise comparison data arises in many domains, including tournament rankings, web search, and preference elicitation. Given noisy comparisons of a fixed subset of pairs of items, we study the problem of estimating the underlying comparison probabilities under the assumption of strong stochastic transitivity (SST). We also consider the noisy sorting subclass of the SST model. We show that when th…

    Submitted 19 July, 2017; originally announced July 2017.