Showing 1–50 of 96 results for author: Mohri, M

Searching in archive cs.
  1. arXiv:2407.13746  [pdf, ps, other]

    cs.LG stat.ML

    Multi-Label Learning with Stronger Consistency Guarantees

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: We present a detailed study of surrogate losses and algorithms for multi-label learning, supported by $H$-consistency bounds. We first show that, for the simplest form of multi-label loss (the popular Hamming loss), the well-known consistent binary relevance surrogate suffers from a sub-optimal dependency on the number of labels in terms of $H$-consistency bounds, when using smooth losses such as…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.13732  [pdf, other]

    cs.LG stat.ML

    Realizable $H$-Consistent and Bayes-Consistent Loss Functions for Learning to Defer

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: We present a comprehensive study of surrogate loss functions for learning to defer. We introduce a broad family of surrogate losses, parameterized by a non-increasing function $Ψ$, and establish their realizable $H$-consistency under mild conditions. For cost functions based on classification error, we further show that these losses admit $H$-consistency bounds when the hypothesis set is symmetric…

    Submitted 18 July, 2024; originally announced July 2024.

  3. arXiv:2407.13722  [pdf, ps, other]

    cs.LG stat.ML

    Enhanced $H$-Consistency Bounds

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: Recent research has introduced a key notion of $H$-consistency bounds for surrogate losses. These bounds offer finite-sample guarantees, quantifying the relationship between the zero-one estimation error (or other target loss) and the surrogate loss estimation error for a specific hypothesis set. However, previous bounds were derived under the condition that a lower bound of the surrogate loss con…

    Submitted 18 July, 2024; originally announced July 2024.

  4. arXiv:2407.07140  [pdf, other]

    cs.LG stat.ML

    Cardinality-Aware Set Prediction and Top-$k$ Classification

    Authors: Corinna Cortes, Anqi Mao, Christopher Mohri, Mehryar Mohri, Yutao Zhong

    Abstract: We present a detailed study of cardinality-aware top-$k$ classification, a novel approach that aims to learn an accurate top-$k$ set predictor while maintaining a low cardinality. We introduce a new target loss function tailored to this setting that accounts for both the classification error and the cardinality of the set predicted. To optimize this loss function, we propose two families of surrog…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2403.19625

  5. arXiv:2406.07585  [pdf, other]

    stat.ML cs.LG

    Rate-Preserving Reductions for Blackwell Approachability

    Authors: Christoph Dann, Yishay Mansour, Mehryar Mohri, Jon Schneider, Balasubramanian Sivan

    Abstract: Abernethy et al. (2011) showed that Blackwell approachability and no-regret learning are equivalent, in the sense that any algorithm that solves a specific Blackwell approachability instance can be converted to a sublinear regret algorithm for a specific no-regret learning instance, and vice versa. In this paper, we study a more fine-grained form of such reductions, and ask when this translation b…

    Submitted 17 July, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

  6. arXiv:2405.05968  [pdf, other]

    cs.LG stat.ML

    A Universal Growth Rate for Learning with Smooth Surrogate Losses

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: This paper presents a comprehensive analysis of the growth rate of $H$-consistency bounds (and excess error bounds) for various surrogate losses used in classification. We prove a square-root growth rate near zero for smooth margin-based surrogate losses in binary classification, providing both upper and lower bounds under mild assumptions. This result also translates to excess error bounds. Our l…

    Submitted 8 July, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  7. arXiv:2403.19625  [pdf, other]

    cs.LG stat.ML

    Top-$k$ Classification and Cardinality-Aware Prediction

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: We present a detailed study of top-$k$ classification, the task of predicting the $k$ most probable classes for an input, extending beyond single-class prediction. We demonstrate that several prevalent surrogate loss functions in multi-class classification, such as comp-sum and constrained losses, are supported by $H$-consistency bounds with respect to the top-$k$ loss. These bounds guarantee cons…

    Submitted 28 March, 2024; originally announced March 2024.

  8. arXiv:2403.19494  [pdf, ps, other]

    cs.LG stat.ML

    Regression with Multi-Expert Deferral

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: Learning to defer with multiple experts is a framework where the learner can choose to defer the prediction to several experts. While this problem has received significant attention in classification contexts, it presents unique challenges in regression due to the infinite and continuous nature of the label space. In this work, we introduce a novel framework of regression with deferral, which invo…

    Submitted 28 March, 2024; originally announced March 2024.

  9. arXiv:2403.19480  [pdf, ps, other]

    cs.LG stat.ML

    $H$-Consistency Guarantees for Regression

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: We present a detailed study of $H$-consistency bounds for regression. We first present new theorems that generalize the tools previously given to establish $H$-consistency bounds. This generalization proves essential for analyzing $H$-consistency bounds specific to regression. Next, we prove a series of novel $H$-consistency bounds for surrogate loss functions of the squared loss, under the assump…

    Submitted 28 March, 2024; originally announced March 2024.

  10. arXiv:2310.14774  [pdf, ps, other]

    cs.LG stat.ML

    Principled Approaches for Learning to Defer with Multiple Experts

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: We present a study of surrogate losses and algorithms for the general problem of learning to defer with multiple experts. We first introduce a new family of surrogate losses specifically tailored for the multiple-expert setting, where the prediction and deferral functions are learned simultaneously. We then prove that these surrogate losses benefit from strong $H$-consistency bounds. We illustrate…

    Submitted 31 March, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: ISAIM 2024

  11. arXiv:2310.14772  [pdf, other]

    cs.LG stat.ML

    Predictor-Rejector Multi-Class Abstention: Theoretical Analysis and Algorithms

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: We study the key framework of learning with abstention in the multi-class classification setting. In this setting, the learner can choose to abstain from making a prediction with some pre-defined cost. We present a series of new theoretical and algorithmic results for this learning problem in the predictor-rejector framework. We introduce several new families of surrogate losses for which we prove…

    Submitted 31 March, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: ALT 2024

  12. arXiv:2310.14770  [pdf, ps, other]

    cs.LG stat.ML

    Theoretically Grounded Loss Functions and Algorithms for Score-Based Multi-Class Abstention

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: Learning with abstention is a key scenario where the learner can abstain from making a prediction at some cost. In this paper, we analyze the score-based formulation of learning with abstention in the multi-class classification setting. We introduce new families of surrogate losses for the abstention loss function, which include the state-of-the-art surrogate losses in the single-stage setting and…

    Submitted 31 March, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: AISTATS 2024

  13. arXiv:2307.02035  [pdf, ps, other]

    cs.LG stat.ML

    Ranking with Abstention

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: We introduce a novel framework of ranking with abstention, where the learner can abstain from making a prediction at some limited cost $c$. We present an extensive theoretical analysis of this framework including a series of $H$-consistency bounds for both the family of linear functions and that of neural networks with one hidden layer. These theoretical guarantees are the state-of-the-art consistenc…

    Submitted 5 July, 2023; originally announced July 2023.

  14. arXiv:2306.08838  [pdf, other]

    cs.LG cs.CR stat.ML

    Differentially Private Domain Adaptation with Theoretical Guarantees

    Authors: Raef Bassily, Corinna Cortes, Anqi Mao, Mehryar Mohri

    Abstract: In many applications, the labeled data at the learner's disposal is subject to privacy constraints and is relatively limited. To derive a more accurate predictor for the target domain, it is often beneficial to leverage publicly available labeled data from an alternative domain, somewhat close to the target domain. This is the modern problem of supervised domain adaptation from a public source to…

    Submitted 4 February, 2024; v1 submitted 15 June, 2023; originally announced June 2023.

  15. arXiv:2305.05816  [pdf, other]

    cs.LG stat.ML

    Best-Effort Adaptation

    Authors: Pranjal Awasthi, Corinna Cortes, Mehryar Mohri

    Abstract: We study a problem of best-effort adaptation motivated by several applications and considerations, which consists of determining an accurate predictor for a target domain, for which a moderate amount of labeled samples are available, while leveraging information from another domain for which substantially more labeled samples are at one's disposal. We present a new and general discrepancy-based th…

    Submitted 9 May, 2023; originally announced May 2023.

  16. arXiv:2304.07288  [pdf, other]

    cs.LG stat.ML

    Cross-Entropy Loss Functions: Theoretical Analysis and Applications

    Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: Cross-entropy is a widely used loss function in applications. It coincides with the logistic loss applied to the outputs of a neural network, when the softmax is used. But, what guarantees can we rely on when using cross-entropy as a surrogate loss? We present a theoretical analysis of a broad family of loss functions, comp-sum losses, that includes cross-entropy (or logistic loss), generalized cr…

    Submitted 19 June, 2023; v1 submitted 14 April, 2023; originally announced April 2023.

    Comments: ICML 2023

  17. arXiv:2302.01517  [pdf, ps, other]

    cs.LG

    Pseudonorm Approachability and Applications to Regret Minimization

    Authors: Christoph Dann, Yishay Mansour, Mehryar Mohri, Jon Schneider, Balasubramanian Sivan

    Abstract: Blackwell's celebrated approachability theory provides a general framework for a variety of learning problems, including regret minimization. However, Blackwell's proof and implicit algorithm measure approachability using the $\ell_2$ (Euclidean) distance. We argue that in many applications such as regret minimization, it is more useful to study approachability under other distance metrics, most c…

    Submitted 2 February, 2023; originally announced February 2023.

    Comments: To appear at ALT 2023

  18. arXiv:2208.10904  [pdf, ps, other]

    cs.LG

    A Provably Efficient Model-Free Posterior Sampling Method for Episodic Reinforcement Learning

    Authors: Christoph Dann, Mehryar Mohri, Tong Zhang, Julian Zimmert

    Abstract: Thompson Sampling is one of the most effective methods for contextual bandits and has been generalized to posterior sampling for certain MDP settings. However, existing posterior sampling methods for reinforcement learning are limited by being model-based or lack worst-case theoretical guarantees beyond linear MDPs. This paper proposes a new model-free formulation of posterior sampling that applie…

    Submitted 23 August, 2022; originally announced August 2022.

    Journal ref: Dann C, Mohri M, Zhang T, Zimmert J. A provably efficient model-free posterior sampling method for episodic reinforcement learning. Advances in Neural Information Processing Systems. 2021 Dec 6;34:12040-51

  19. arXiv:2208.06135  [pdf, other]

    cs.LG cs.CR stat.ML

    Private Domain Adaptation from a Public Source

    Authors: Raef Bassily, Mehryar Mohri, Ananda Theertha Suresh

    Abstract: A key problem in a variety of applications is that of domain adaptation from a public source domain, for which a relatively large amount of labeled data with no privacy constraints is at one's disposal, to a private target domain, for which a private sample is available with very few or no labeled data. In regression problems with no privacy constraints on the source or target data, a discrepancy…

    Submitted 12 August, 2022; originally announced August 2022.

  20. arXiv:2206.10022  [pdf, other]

    cs.LG

    Stochastic Online Learning with Feedback Graphs: Finite-Time and Asymptotic Optimality

    Authors: Teodor V. Marinov, Mehryar Mohri, Julian Zimmert

    Abstract: We revisit the problem of stochastic online learning with feedback graphs, with the goal of devising algorithms that are optimal, up to constants, both asymptotically and in finite time. We show that, surprisingly, the notion of optimal finite-time regret is not a uniquely defined property in this context and that, in general, it is decoupled from the asymptotic rate. We discuss alternative choice…

    Submitted 20 June, 2022; originally announced June 2022.

  21. arXiv:2206.09421  [pdf, other]

    cs.LG

    Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation

    Authors: Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

    Abstract: Myopic exploration policies such as epsilon-greedy, softmax, or Gaussian noise fail to explore efficiently in some reinforcement learning tasks and yet, they perform well in many others. In fact, in practice, they are often selected as the top choices, due to their simplicity. But, for what tasks do such policies succeed? Can we give theoretical guarantees for their favorable performance? These cr…

    Submitted 19 June, 2022; originally announced June 2022.

    Comments: to appear at ICML 2022

  22. arXiv:2205.08562  [pdf, ps, other]

    cs.LG cs.GT

    Strategizing against Learners in Bayesian Games

    Authors: Yishay Mansour, Mehryar Mohri, Jon Schneider, Balasubramanian Sivan

    Abstract: We study repeated two-player games where one of the players, the learner, employs a no-regret learning strategy, while the other, the optimizer, is a rational utility maximizer. We consider general Bayesian games, where the payoffs of both the optimizer and the learner could depend on the type, which is drawn from a publicly known distribution, but revealed privately to the learner. We address the…

    Submitted 17 May, 2022; originally announced May 2022.

  23. arXiv:2205.08017  [pdf, other]

    cs.LG stat.ML

    $\mathscr{H}$-Consistency Estimation Error of Surrogate Loss Minimizers

    Authors: Pranjal Awasthi, Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: We present a detailed study of estimation errors in terms of surrogate loss estimation errors. We refer to such guarantees as $\mathscr{H}$-consistency estimation error bounds, since they account for the hypothesis set $\mathscr{H}$ adopted. These guarantees are significantly stronger than $\mathscr{H}$-calibration or $\mathscr{H}$-consistency. They are also more informative than similar excess er…

    Submitted 16 May, 2022; originally announced May 2022.

    Comments: ICML 2022 (long presentation)

  24. arXiv:2204.10376  [pdf, other]

    cs.LG stat.ML

    Differentially Private Learning with Margin Guarantees

    Authors: Raef Bassily, Mehryar Mohri, Ananda Theertha Suresh

    Abstract: We present a series of new differentially private (DP) algorithms with dimension-independent margin guarantees. For the family of linear hypotheses, we give a pure DP learning algorithm that benefits from relative deviation margin guarantees, as well as an efficient DP learning algorithm with margin guarantees. We also present a new efficient DP learning algorithm with margin guarantees for kernel…

    Submitted 21 April, 2022; originally announced April 2022.

  25. arXiv:2112.01694  [pdf, other]

    cs.LG stat.ML

    On the Existence of the Adversarial Bayes Classifier (Extended Version)

    Authors: Pranjal Awasthi, Natalie S. Frank, Mehryar Mohri

    Abstract: Adversarial robustness is a critical property in a variety of modern machine learning applications. While it has been the subject of several recent theoretical studies, many important questions related to adversarial robustness are still open. In this work, we study a fundamental question regarding Bayes optimality for adversarial robustness. We provide general sufficient conditions under which th…

    Submitted 28 August, 2023; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: 27 pages, 3 figures. Version 2: Corrects 2 errors in the paper "On the Existence of the Adversarial Bayes Classifier" published in NeurIPS. Version 3: Update to acknowledgements

  26. arXiv:2107.06917  [pdf, other]

    cs.LG

    A Field Guide to Federated Optimization

    Authors: Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz, et al. (28 additional authors not shown)

    Abstract: Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection. The distributed learning process can be formulated as solving federated optimization problems, which emphasize communication efficiency, data heterogeneity, compatibility with privacy and system requirements, and…

    Submitted 14 July, 2021; originally announced July 2021.

  27. arXiv:2107.05745  [pdf, ps, other]

    cs.LG stat.ML

    Adapting to Misspecification in Contextual Bandits

    Authors: Dylan J. Foster, Claudio Gentile, Mehryar Mohri, Julian Zimmert

    Abstract: A major research direction in contextual bandits is to develop algorithms that are computationally efficient, yet support flexible, general-purpose function approximation. Algorithms based on modeling rewards have shown strong empirical performance, but typically require a well-specified model, and can fail when this assumption does not hold. Can we design algorithms that are efficient and flexibl…

    Submitted 12 July, 2021; originally announced July 2021.

    Comments: Appeared at NeurIPS 2020

  28. arXiv:2107.01264  [pdf, other]

    cs.LG

    Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds for Episodic Reinforcement Learning

    Authors: Christoph Dann, Teodor V. Marinov, Mehryar Mohri, Julian Zimmert

    Abstract: We provide improved gap-dependent regret bounds for reinforcement learning in finite episodic Markov decision processes. Compared to prior work, our bounds depend on alternative definitions of gaps. These definitions are based on the insight that, in order to achieve a favorable regret, an algorithm does not need to learn how to behave optimally in states that are not reached by an optimal policy.…

    Submitted 26 October, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

  29. arXiv:2106.11519  [pdf, other]

    cs.LG cs.AI eess.SY

    Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations

    Authors: Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

    Abstract: There have been many recent advances on provably efficient Reinforcement Learning (RL) in problems with rich observation spaces. However, all these works share a strong realizability assumption about the optimal value function of the true MDP. Such realizability assumptions are often too strong to hold in practice. In this work, we consider the more realistic setting of agnostic RL with rich obser…

    Submitted 21 June, 2021; originally announced June 2021.

  30. arXiv:2105.01550  [pdf, ps, other]

    cs.LG stat.ML

    A Finer Calibration Analysis for Adversarial Robustness

    Authors: Pranjal Awasthi, Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: We present a more general analysis of $H$-calibration for adversarially robust classification. By adopting a finer definition of calibration, we can cover settings beyond the restricted hypothesis sets studied in previous work. In particular, our results hold for most common hypothesis sets used in machine learning. We both fix some previous calibration results (Bao et al., 2020) and generalize ot…

    Submitted 6 May, 2021; v1 submitted 4 May, 2021; originally announced May 2021.

    Comments: arXiv admin note: text overlap with arXiv:2104.09658

  31. arXiv:2104.09658  [pdf, other]

    cs.LG stat.ML

    Calibration and Consistency of Adversarial Surrogate Losses

    Authors: Pranjal Awasthi, Natalie Frank, Anqi Mao, Mehryar Mohri, Yutao Zhong

    Abstract: Adversarial robustness is an increasingly critical property of classifiers in applications. The design of robust algorithms relies on surrogate losses since the optimization of the adversarial loss with most hypothesis sets is NP-hard. But which surrogate losses should be used and when do they benefit from theoretical guarantees? We present an extensive study of this question, including a detailed…

    Submitted 4 May, 2021; v1 submitted 19 April, 2021; originally announced April 2021.

  32. arXiv:2104.02748  [pdf, other]

    cs.LG

    Communication-Efficient Agnostic Federated Averaging

    Authors: Jae Ro, Mingqing Chen, Rajiv Mathews, Mehryar Mohri, Ananda Theertha Suresh

    Abstract: In distributed learning settings such as federated learning, the training algorithm can be potentially biased towards different clients. Mohri et al. (2019) proposed a domain-agnostic learning algorithm, where the model is optimized for any target distribution formed by a mixture of the client distributions in order to overcome this bias. They further proposed an algorithm for the cross-silo feder…

    Submitted 15 June, 2021; v1 submitted 6 April, 2021; originally announced April 2021.

  33. arXiv:2102.11845  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Learning with User-Level Privacy

    Authors: Daniel Levy, Ziteng Sun, Kareem Amin, Satyen Kale, Alex Kulesza, Mehryar Mohri, Ananda Theertha Suresh

    Abstract: We propose and analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints. Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution ($m \ge 1$ samples), providing more stringent but more realistic protection against information leaks. We show that for high-dimensional mean estimation, empirical…

    Submitted 3 December, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

    Comments: NeurIPS 2021. 43 pages, 0 figure

  34. arXiv:2008.11036  [pdf, other]

    cs.LG stat.ML

    A Discriminative Technique for Multiple-Source Adaptation

    Authors: Corinna Cortes, Mehryar Mohri, Ananda Theertha Suresh, Ningshan Zhang

    Abstract: We present a new discriminative technique for the multiple-source adaptation, MSA, problem. Unlike previous work, which relies on density estimation for each source domain, our solution only requires conditional probabilities that can easily be accurately estimated from unlabeled data from the source domains. We give a detailed analysis of our new technique, including general guarantees based on R…

    Submitted 12 February, 2021; v1 submitted 25 August, 2020; originally announced August 2020.

  35. arXiv:2008.09490  [pdf, other]

    cs.LG stat.ML

    Beyond Individual and Group Fairness

    Authors: Pranjal Awasthi, Corinna Cortes, Yishay Mansour, Mehryar Mohri

    Abstract: We present a new data-driven model of fairness that, unlike existing static definitions of individual or group fairness, is guided by the unfairness complaints received by the system. Our model supports multiple fairness criteria and takes into account their potential incompatibilities. We consider both a stochastic and an adversarial setting of our model. In the stochastic setting, we show that ou…

    Submitted 21 August, 2020; originally announced August 2020.

  36. arXiv:2008.03606  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    Mime: Mimicking Centralized Stochastic Algorithms in Federated Learning

    Authors: Sai Praneeth Karimireddy, Martin Jaggi, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh

    Abstract: Federated learning (FL) is a challenging setting for optimization due to the heterogeneity of the data across different clients which gives rise to the client drift phenomenon. In fact, obtaining an algorithm for FL which is uniformly better than simple centralized training has been a major open problem thus far. In this work, we propose a general algorithmic framework, Mime, which i) mitigates cl…

    Submitted 8 June, 2021; v1 submitted 8 August, 2020; originally announced August 2020.

    Comments: Version 2 provides stronger theoretical results and more thorough experiments

    MSC Class: 68W40; 68W15; 90C25; 90C06

    ACM Class: G.1.6; F.2.1; E.4

  37. arXiv:2007.11045  [pdf, other]

    cs.LG stat.ML

    On the Rademacher Complexity of Linear Hypothesis Sets

    Authors: Pranjal Awasthi, Natalie Frank, Mehryar Mohri

    Abstract: Linear predictors form a rich class of hypotheses used in a variety of learning algorithms. We present a tight analysis of the empirical Rademacher complexity of the family of linear hypothesis classes with weight vectors bounded in $\ell_p$-norm for any $p \geq 1$. This provides a tight analysis of generalization using these hypothesis sets and helps derive sharp data-dependent learning guarantee…

    Submitted 21 July, 2020; originally announced July 2020.

  38. arXiv:2007.09762  [pdf, other]

    cs.LG stat.ML

    A Theory of Multiple-Source Adaptation with Limited Target Labeled Data

    Authors: Yishay Mansour, Mehryar Mohri, Jae Ro, Ananda Theertha Suresh, Ke Wu

    Abstract: We present a theoretical and algorithmic study of the multiple-source domain adaptation problem in the common scenario where the learner has access only to a limited amount of labeled target data, but where the learner has at its disposal a large amount of labeled data from multiple source domains. We show that a new family of algorithms based on model selection ideas benefits from very favorable guar…

    Submitted 29 October, 2020; v1 submitted 19 July, 2020; originally announced July 2020.

    Comments: 20 pages

  39. arXiv:2006.14950  [pdf, other]

    cs.LG stat.ML

    Relative Deviation Margin Bounds

    Authors: Corinna Cortes, Mehryar Mohri, Ananda Theertha Suresh

    Abstract: We present a series of new and more favorable margin-based learning guarantees that depend on the empirical margin loss of a predictor. We give two types of learning bounds, both distribution-dependent and valid for general families, in terms of the Rademacher complexity or the empirical $\ell_\infty$ covering number of the hypothesis set used. Furthermore, using our relative deviation margin boun…

    Submitted 28 October, 2020; v1 submitted 26 June, 2020; originally announced June 2020.

    Comments: 29 pages

  40. arXiv:2006.09255  [pdf, other]

    cs.LG stat.ML

    Corralling Stochastic Bandit Algorithms

    Authors: Raman Arora, Teodor V. Marinov, Mehryar Mohri

    Abstract: We study the problem of corralling stochastic bandit algorithms, that is, combining multiple bandit algorithms designed for a stochastic environment, with the goal of devising a corralling algorithm that performs almost as well as the best base algorithm. We give two general algorithms for this setting, which we show benefit from favorable regret guarantees. We show that the regret of the corrallin…

    Submitted 28 February, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

  41. arXiv:2005.03789  [pdf, other]

    cs.LG cs.AI stat.ML

    Reinforcement Learning with Feedback Graphs

    Authors: Christoph Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

    Abstract: We study episodic reinforcement learning in Markov decision processes when the agent receives additional feedback per step in the form of several transition observations. Such additional observations are available in a range of tasks through extended sensors or prior knowledge about the environment (e.g., when certain actions yield similar outcomes). We formalize this setting using a feedback graph…

    Submitted 7 May, 2020; originally announced May 2020.

  42. arXiv:2004.13617  [pdf, other]

    cs.LG stat.ML

    Adversarial Learning Guarantees for Linear Hypotheses and Neural Networks

    Authors: Pranjal Awasthi, Natalie Frank, Mehryar Mohri

    Abstract: Adversarial or test time robustness measures the susceptibility of a classifier to perturbations to the test input. While there has been a flurry of recent work on designing defenses against such perturbations, the theory of adversarial robustness is not well understood. In order to make progress on this, we focus on the problem of understanding generalization in adversarial settings, via the lens…

    Submitted 28 April, 2020; originally announced April 2020.

  43. arXiv:2002.10619  [pdf, other]

    cs.LG stat.ML

    Three Approaches for Personalization with Applications to Federated Learning

    Authors: Yishay Mansour, Mehryar Mohri, Jae Ro, Ananda Theertha Suresh

    Abstract: The standard objective in machine learning is to train a single model for all users. However, in many learning scenarios, such as cloud computing and federated learning, it is possible to learn a personalized model per user. In this work, we present a systematic learning-theoretic study of personalization. We propose and analyze three approaches: user clustering, data interpolation, and model inte…

    Submitted 19 July, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: 24 pages

  44. arXiv:2002.07348  [pdf, other]

    cs.LG stat.ML

    Adaptive Region-Based Active Learning

    Authors: Corinna Cortes, Giulia DeSalvo, Claudio Gentile, Mehryar Mohri, Ningshan Zhang

    Abstract: We present a new active learning algorithm that adaptively partitions the input space into a finite number of regions, and subsequently seeks a distinct predictor for each region, both phases actively requesting labels. We prove theoretical guarantees for both the generalization error and the label complexity of our algorithm, and analyze the number of regions defined by the algorithm under some m…

    Submitted 17 February, 2020; originally announced February 2020.

  45. arXiv:1912.04977  [pdf, other]

    cs.LG cs.CR stat.ML

    Advances and Open Problems in Federated Learning

    Authors: Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, et al. (34 additional authors not shown)

    Abstract: Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs re…

    Submitted 8 March, 2021; v1 submitted 10 December, 2019; originally announced December 2019.

    Comments: Published in Foundations and Trends in Machine Learning Vol 4 Issue 1. See: https://www.nowpublishers.com/article/Details/MAL-083

  46. arXiv:1910.08965  [pdf, other]

    cs.LG stat.ML

    Learning GANs and Ensembles Using Discrepancy

    Authors: Ben Adlam, Corinna Cortes, Mehryar Mohri, Ningshan Zhang

    Abstract: Generative adversarial networks (GANs) generate data based on minimizing a divergence between two distributions. The choice of that divergence is therefore critical. We argue that the divergence must take into account the hypothesis set and the loss function used in a subsequent learning task, where the data generated by a GAN serves for training. Taking that structural information into account is…

    Submitted 5 November, 2019; v1 submitted 20 October, 2019; originally announced October 2019.

  47. arXiv:1910.06378  [pdf, other]

    cs.LG cs.DC math.OC stat.ML

    SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

    Authors: Sai Praneeth Karimireddy, Satyen Kale, Mehryar Mohri, Sashank J. Reddi, Sebastian U. Stich, Ananda Theertha Suresh

    Abstract: Federated Averaging (FedAvg) has emerged as the algorithm of choice for federated learning due to its simplicity and low communication cost. However, in spite of recent research efforts, its performance is not fully understood. We obtain tight convergence rates for FedAvg and prove that it suffers from 'client-drift' when the data is heterogeneous (non-iid), resulting in unstable and slow converge…

    Submitted 9 April, 2021; v1 submitted 14 October, 2019; originally announced October 2019.

    Comments: v2 contains analysis of FedAvg, non-convex rates of Scaffold, and experimental evaluation. v3 fixes typos, ICML version. v4 slightly improves rate of SCAFFOLD for general convex functions

    MSC Class: 68W40; 68W15; 90C25; 90C06

    ACM Class: G.1.6; F.2.1; E.4

  48. arXiv:1907.12189  [pdf, ps, other]

    cs.LG stat.ML

    Bandits with Feedback Graphs and Switching Costs

    Authors: Raman Arora, Teodor V. Marinov, Mehryar Mohri

    Abstract: We study the adversarial multi-armed bandit problem where partial observations are available and where, in addition to the loss incurred for each action, a switching cost is incurred for shifting to a new action. All previously known results incur a factor proportional to the independence number of the feedback graph. We give a new algorithm whose regret guarantee depends only on the domina…

    Submitted 22 March, 2020; v1 submitted 28 July, 2019; originally announced July 2019.

    Comments: Camera ready from NeurIPS 2019, new algorithm and improved results in Section 3.2

  49. arXiv:1905.00080  [pdf, other]

    cs.LG stat.ML

    AdaNet: A Scalable and Flexible Framework for Automatically Learning Ensembles

    Authors: Charles Weill, Javier Gonzalvo, Vitaly Kuznetsov, Scott Yang, Scott Yak, Hanna Mazzawi, Eugen Hotaj, Ghassen Jerfel, Vladimir Macko, Ben Adlam, Mehryar Mohri, Corinna Cortes

    Abstract: AdaNet is a lightweight TensorFlow-based (Abadi et al., 2015) framework for automatically learning high-quality ensembles with minimal expert intervention. Our framework is inspired by the AdaNet algorithm (Cortes et al., 2017) which learns the structure of a neural network as an ensemble of subnetworks. We designed it to: (1) integrate with the existing TensorFlow ecosystem, (2) offer sensible de…

    Submitted 30 April, 2019; originally announced May 2019.

  50. arXiv:1904.04755  [pdf, other]

    cs.LG stat.ML

    Hypothesis Set Stability and Generalization

    Authors: Dylan J. Foster, Spencer Greenberg, Satyen Kale, Haipeng Luo, Mehryar Mohri, Karthik Sridharan

    Abstract: We present a study of generalization for data-dependent hypothesis sets. We give a general learning guarantee for data-dependent hypothesis sets based on a notion of transductive Rademacher complexity. Our main result is a generalization bound for data-dependent hypothesis sets expressed in terms of a notion of hypothesis set stability and a notion of Rademacher complexity for data-dependent hypot…

    Submitted 5 October, 2020; v1 submitted 9 April, 2019; originally announced April 2019.

    Comments: Published in NeurIPS 2019. This version is equivalent to the camera-ready version but also includes the supplementary material.