
Showing 1–50 of 58 results for author: Hanneke, S

  1. arXiv:2408.16189  [pdf, ps, other]

    stat.ML cs.AI cs.LG math.ST

    A More Unified Theory of Transfer Learning

    Authors: Steve Hanneke, Samory Kpotufe

    Abstract: We show that some basic moduli of continuity $δ$ -- which measure how fast target risk decreases as source risk decreases -- appear to be at the root of many of the classical relatedness measures in transfer learning and related literature. Namely, bounds in terms of $δ$ recover many of the existing bounds in terms of other measures of relatedness -- both in regression and classification -- and ca…

    Submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2407.19777  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Revisiting Agnostic PAC Learning

    Authors: Steve Hanneke, Kasper Green Larsen, Nikita Zhivotovskiy

    Abstract: PAC learning, dating back to Valiant'84 and Vapnik and Chervonenkis'64,'74, is a classic model for studying supervised learning. In the agnostic setting, we have access to a hypothesis set $\mathcal{H}$ and a training set of labeled samples $(x_1,y_1),\dots,(x_n,y_n) \in \mathcal{X} \times \{-1,1\}$ drawn i.i.d. from an unknown distribution $\mathcal{D}$. The goal is to produce a classifier…

    Submitted 29 July, 2024; originally announced July 2024.

  3. arXiv:2407.07765  [pdf, other]

    cs.LG cs.CR cs.DS math.CO stat.ML

    Ramsey Theorems for Trees and a General 'Private Learning Implies Online Learning' Theorem

    Authors: Simone Fioravanti, Steve Hanneke, Shay Moran, Hilla Schefler, Iska Tsubari

    Abstract: This work continues to investigate the link between differentially private (DP) and online learning. Alon, Livni, Malliaris, and Moran (2019) showed that for binary concept classes, DP learnability of a given class implies that it has a finite Littlestone dimension (equivalently, that it is online learnable). Their proof relies on a model-theoretic result by Hodges (1997), which demonstrates that…

    Submitted 14 August, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

  4. arXiv:2405.17120  [pdf, ps, other]

    cs.DM cs.LG

    Dual VC Dimension Obstructs Sample Compression by Embeddings

    Authors: Zachary Chase, Bogdan Chornomaz, Steve Hanneke, Shay Moran, Amir Yehudayoff

    Abstract: This work studies embedding of arbitrary VC classes in well-behaved VC classes, focusing particularly on extremal classes. Our main result expresses an impossibility: such embeddings necessarily require a significant increase in dimension. In particular, we prove that for every $d$ there is a class with VC dimension $d$ that cannot be embedded in any extremal class of VC dimension smaller than exp…

    Submitted 27 May, 2024; originally announced May 2024.

    ACM Class: I.2.6; G.2.1

  5. arXiv:2403.10889  [pdf, other]

    cs.LG stat.ML

    List Sample Compression and Uniform Convergence

    Authors: Steve Hanneke, Shay Moran, Tom Waknine

    Abstract: List learning is a variant of supervised classification where the learner outputs multiple plausible labels for each instance rather than just one. We investigate classical principles related to generalization within the context of list learning. Our primary goal is to determine whether classical principles in the PAC setting retain their applicability in the domain of list PAC learning. We focus…

    Submitted 16 March, 2024; originally announced March 2024.

  6. arXiv:2402.13400  [pdf, other]

    stat.ML cs.LG

    The Dimension of Self-Directed Learning

    Authors: Pramith Devulapalli, Steve Hanneke

    Abstract: Understanding the self-directed learning complexity has been an important problem that has captured the attention of the online learning theory community since the early 1990s. Within this framework, the learner is allowed to adaptively choose its next data point in making predictions unlike the setting in adversarial online learning. In this paper, we study the self-directed learning complexity…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: ALT 2024 Camera ready version

  7. arXiv:2402.07453  [pdf, ps, other]

    cs.LG stat.ML

    Bandit-Feedback Online Multiclass Classification: Variants and Tradeoffs

    Authors: Yuval Filmus, Steve Hanneke, Idan Mehalel, Shay Moran

    Abstract: Consider the domain of multiclass classification within the adversarial online setting. What is the price of relying on bandit feedback as opposed to full information? To what extent can an adaptive adversary amplify the loss compared to an oblivious one? To what extent can a randomized learner reduce the loss compared to a deterministic one? We study these questions in the mistake bound model and…

    Submitted 12 February, 2024; originally announced February 2024.

  8. arXiv:2311.06428  [pdf, other]

    cs.LG

    A Trichotomy for Transductive Online Learning

    Authors: Steve Hanneke, Shay Moran, Jonathan Shafer

    Abstract: We present new upper and lower bounds on the number of learner mistakes in the `transductive' online learning setting of Ben-David, Kushilevitz and Mansour (1997). This setting is similar to standard online learning, except that the adversary fixes a sequence of instances $x_1,\dots,x_n$ to be labeled at the start of the game, and this sequence is known to the learner. Qualitatively, we prove a tr…

    Submitted 29 November, 2023; v1 submitted 10 November, 2023; originally announced November 2023.

  9. arXiv:2309.17016  [pdf, other]

    cs.LG math.ST stat.ML

    Efficient Agnostic Learning with Average Smoothness

    Authors: Steve Hanneke, Aryeh Kontorovich, Guy Kornowski

    Abstract: We study distribution-free nonparametric regression following a notion of average smoothness initiated by Ashlagi et al. (2021), which measures the "effective" smoothness of a function with respect to an arbitrary unknown underlying distribution. While the recent work of Hanneke et al. (2023) established tight uniform convergence bounds for average-smooth functions in the realizable case and provi…

    Submitted 13 February, 2024; v1 submitted 29 September, 2023; originally announced September 2023.

    Comments: ALT 2024 camera ready version. arXiv admin note: text overlap with arXiv:2302.06005

  10. arXiv:2307.03848  [pdf, other]

    cs.LG cs.AI stat.ML

    Optimal Learners for Realizable Regression: PAC Learning and Online Learning

    Authors: Idan Attias, Steve Hanneke, Alkis Kalavasis, Amin Karbasi, Grigoris Velegkas

    Abstract: In this work, we aim to characterize the statistical complexity of realizable regression both in the PAC learning setting and the online learning setting. Previous work had established the sufficiency of finiteness of the fat shattering dimension for PAC learnability and the necessity of finiteness of the scaled Natarajan dimension, but little progress had been made towards a more complete charact…

    Submitted 29 October, 2023; v1 submitted 7 July, 2023; originally announced July 2023.

  11. arXiv:2307.02066  [pdf, ps, other]

    cs.LG stat.ML

    Universal Rates for Multiclass Learning

    Authors: Steve Hanneke, Shay Moran, Qian Zhang

    Abstract: We study universal rates for multiclass classification, establishing the optimal rates (up to log factors) for all hypothesis classes. This generalizes previous results on binary classification (Bousquet, Hanneke, Moran, van Handel, and Yehudayoff, 2021), and resolves an open question studied by Kalavasis, Velegkas, and Karbasi (2022) who handled the multiclass setting with a bounded number of cla…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: 67 pages, accepted to the 36th Annual Conference on Learning Theory (COLT 2023)

  12. arXiv:2306.13119  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Adversarial Resilience in Sequential Prediction via Abstention

    Authors: Surbhi Goel, Steve Hanneke, Shay Moran, Abhishek Shetty

    Abstract: We study the problem of sequential prediction in the stochastic setting with an adversary that is allowed to inject clean-label adversarial (or out-of-distribution) examples. Algorithms designed to handle purely stochastic data tend to fail in the presence of such adversarial examples, often leading to erroneous predictions. This is undesirable in many high-stakes applications such as medical reco…

    Submitted 24 January, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

  13. arXiv:2305.00152  [pdf, other]

    stat.ML cs.LG

    Limits of Model Selection under Transfer Learning

    Authors: Steve Hanneke, Samory Kpotufe, Yasaman Mahdaviyeh

    Abstract: Theoretical studies on transfer learning or domain adaptation have so far focused on situations with a known hypothesis class or model; however in practice, some amount of model selection is usually involved, often appearing under the umbrella term of hyperparameter-tuning: for example, one may think of the problem of tuning for the right neural network architecture towards a target task, while le…

    Submitted 12 October, 2023; v1 submitted 28 April, 2023; originally announced May 2023.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2023

  14. arXiv:2304.03370  [pdf, other]

    cs.LG cs.CR

    Reliable learning in challenging environments

    Authors: Maria-Florina Balcan, Steve Hanneke, Rattana Pukdee, Dravyansh Sharma

    Abstract: The problem of designing learners that provide guarantees that their predictions are provably correct is of increasing importance in machine learning. However, learning theoretic guarantees have only been considered in very specific settings. In this work, we consider the design and analysis of reliable learners in challenging test-time environments as encountered in modern machine learning proble…

    Submitted 29 October, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

    Journal ref: NeurIPS 2023

  15. arXiv:2303.17716  [pdf, ps, other]

    cs.LG stat.ML

    Multiclass Online Learning and Uniform Convergence

    Authors: Steve Hanneke, Shay Moran, Vinod Raman, Unique Subedi, Ambuj Tewari

    Abstract: We study multiclass classification in the agnostic adversarial online learning setting. As our main result, we prove that any multiclass concept class is agnostically learnable if and only if its Littlestone dimension is finite. This solves an open problem studied by Daniely, Sabato, Ben-David, and Shalev-Shwartz (2011,2015) who handled the case when the number of classes (or labels) is bounded. W…

    Submitted 7 July, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: COLT Camera-Ready, 15 pages

  16. arXiv:2302.13849  [pdf, ps, other]

    cs.LG

    Optimal Prediction Using Expert Advice and Randomized Littlestone Dimension

    Authors: Yuval Filmus, Steve Hanneke, Idan Mehalel, Shay Moran

    Abstract: A classical result in online learning characterizes the optimal mistake bound achievable by deterministic learners using the Littlestone dimension (Littlestone '88). We prove an analogous result for randomized learners: we show that the optimal expected mistake bound in learning a class $\mathcal{H}$ equals its randomized Littlestone dimension, which is the largest $d$ for which there exists a tre…

    Submitted 17 August, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

  17. arXiv:2302.07186  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Adversarial Rewards in Universal Learning for Contextual Bandits

    Authors: Moise Blanchard, Steve Hanneke, Patrick Jaillet

    Abstract: We study the fundamental limits of learning in contextual bandits, where a learner's rewards depend on their actions and a known context, which extends the canonical multi-armed bandit to the case where side-information is available. We are interested in universally consistent algorithms, which achieve sublinear regret compared to any measurable fixed policy, without any function class restriction…

    Submitted 12 June, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

  18. arXiv:2302.06005  [pdf, other]

    cs.LG math.ST stat.ML

    Near-optimal learning with average Hölder smoothness

    Authors: Steve Hanneke, Aryeh Kontorovich, Guy Kornowski

    Abstract: We generalize the notion of average Lipschitz smoothness proposed by Ashlagi et al. (COLT 2021) by extending it to Hölder smoothness. This measure of the "effective smoothness" of a function is sensitive to the underlying distribution and can be dramatically smaller than its classic "worst-case" Hölder constant. We consider both the realizable and the agnostic (noisy) regression settings, proving…

    Submitted 30 October, 2023; v1 submitted 12 February, 2023; originally announced February 2023.

    Comments: NeurIPS 2023 camera ready version

  19. arXiv:2301.00241  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Contextual Bandits and Optimistically Universal Learning

    Authors: Moise Blanchard, Steve Hanneke, Patrick Jaillet

    Abstract: We consider the contextual bandit problem on general action and context spaces, where the learner's rewards depend on their selected actions and an observable context. This generalizes the standard multi-armed bandit to the case where side information is available, e.g., patients' records or customers' history, which allows for personalized treatment. We focus on consistency -- vanishing regret co…

    Submitted 31 December, 2022; originally announced January 2023.

  20. arXiv:2210.02713  [pdf, ps, other]

    cs.LG cs.CR

    On Optimal Learning Under Targeted Data Poisoning

    Authors: Steve Hanneke, Amin Karbasi, Mohammad Mahmoody, Idan Mehalel, Shay Moran

    Abstract: Consider the task of learning a hypothesis class $\mathcal{H}$ in the presence of an adversary that can replace up to an $η$ fraction of the examples in the training set with arbitrary adversarial examples. The adversary aims to fail the learner on a particular target test point $x$ which is known to the adversary but not to the learner. In this work we aim to characterize the smallest achievable…

    Submitted 12 October, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

  21. arXiv:2209.07369  [pdf, ps, other]

    cs.LG stat.ML

    Adversarially Robust Learning: A Generic Minimax Optimal Learner and Characterization

    Authors: Omar Montasser, Steve Hanneke, Nathan Srebro

    Abstract: We present a minimax optimal learner for the problem of learning predictors robust to adversarial examples at test-time. Interestingly, we find that this requires new algorithmic ideas and approaches to adversarially robust learning. In particular, we show, in a strong negative sense, the suboptimality of the robust learner proposed by Montasser, Hanneke, and Srebro (2019) and a broader family of…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: To appear in NeurIPS 2022

  22. arXiv:2208.14615  [pdf, other]

    cs.LG cs.CC stat.ML

    Fine-Grained Distribution-Dependent Learning Curves

    Authors: Olivier Bousquet, Steve Hanneke, Shay Moran, Jonathan Shafer, Ilya Tolstikhin

    Abstract: Learning curves plot the expected error of a learning algorithm as a function of the number of labeled samples it receives from a target distribution. They are widely used as a measure of an algorithm's performance, but classic PAC learning theory cannot explain their behavior. As observed by Antos and Lugosi (1996, 1998), the classic `No Free Lunch' lower bounds only trace the upper envelope a…

    Submitted 10 November, 2022; v1 submitted 30 August, 2022; originally announced August 2022.

  23. arXiv:2206.12977  [pdf, ps, other]

    cs.LG stat.ML

    Adversarially Robust PAC Learnability of Real-Valued Functions

    Authors: Idan Attias, Steve Hanneke

    Abstract: We study robustness to test-time adversarial attacks in the regression setting with $\ell_p$ losses and arbitrary perturbation sets. We address the question of which function classes are PAC learnable in this setting. We show that classes of finite fat-shattering dimension are learnable in both realizable and agnostic settings. Moreover, for convex function classes, they are even properly learnabl…

    Submitted 5 May, 2024; v1 submitted 26 June, 2022; originally announced June 2022.

    Comments: Accepted to ICML 2023

  24. arXiv:2203.06046  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Universally Consistent Online Learning with Arbitrarily Dependent Responses

    Authors: Steve Hanneke

    Abstract: This work provides an online learning rule that is universally consistent under processes on (X,Y) pairs, under conditions only on the X process. As a special case, the conditions admit all processes on (X,Y) such that the process on X is stationary. This generalizes past results which required stationarity for the joint process on (X,Y), and additionally required this process to be ergodic. In pa…

    Submitted 11 March, 2022; originally announced March 2022.

  25. arXiv:2203.04160  [pdf, other]

    cs.LG cs.AI cs.CR cs.DS

    Robustly-reliable learners under poisoning attacks

    Authors: Maria-Florina Balcan, Avrim Blum, Steve Hanneke, Dravyansh Sharma

    Abstract: Data poisoning attacks, in which an adversary corrupts a training set with the goal of inducing specific desired mistakes, have raised substantial concern: even just the possibility of such an attack can make a user no longer trust the results of a learning system. In this work, we show how to achieve strong robustness guarantees in the face of such attacks across multiple axes. We provide robus…

    Submitted 8 March, 2022; originally announced March 2022.

  26. arXiv:2202.05420  [pdf, other]

    cs.LG stat.ML

    A Characterization of Semi-Supervised Adversarially-Robust PAC Learnability

    Authors: Idan Attias, Steve Hanneke, Yishay Mansour

    Abstract: We study the problem of learning an adversarially robust predictor to test time attacks in the semi-supervised PAC model. We address the question of how many labeled and unlabeled examples are required to ensure learning. We show that having enough unlabeled data (the size of a labeled sample that a fully-supervised method would require), the labeled sample complexity can be arbitrarily smaller co…

    Submitted 5 May, 2024; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: NeurIPS 2022 camera-ready

  27. arXiv:2201.08903  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Universal Online Learning with Unbounded Losses: Memory Is All You Need

    Authors: Moise Blanchard, Romain Cosson, Steve Hanneke

    Abstract: We resolve an open problem of Hanneke on the subject of universally consistent online learning with non-i.i.d. processes and unbounded losses. The notion of an optimistically universal learning rule was defined by Hanneke in an effort to study learning theory under minimal assumptions. A given learning rule is said to be optimistically universal if it achieves a low long-run average loss whenever…

    Submitted 21 January, 2022; originally announced January 2022.

  28. arXiv:2110.10602  [pdf, ps, other]

    cs.LG stat.ML

    Transductive Robust Learning Guarantees

    Authors: Omar Montasser, Steve Hanneke, Nathan Srebro

    Abstract: We study the problem of adversarially robust learning in the transductive setting. For classes $\mathcal{H}$ of bounded VC dimension, we propose a simple transductive learner that when presented with a set of labeled training examples and a set of unlabeled test examples (both sets possibly adversarially perturbed), it correctly labels the test examples with a robust error rate that is linear in t…

    Submitted 20 October, 2021; originally announced October 2021.

  29. arXiv:2107.09542  [pdf, ps, other]

    cs.LG cs.AI math.PR math.ST stat.ML

    Open Problem: Is There an Online Learning Algorithm That Learns Whenever Online Learning Is Possible?

    Authors: Steve Hanneke

    Abstract: This open problem asks whether there exists an online learning algorithm for binary classification that guarantees, for all target concepts, to make a sublinear number of mistakes, under only the assumption that the (possibly random) sequence of points X allows that such a learning algorithm can exist for that sequence. As a secondary problem, it also asks whether a specific concise condition comp…

    Submitted 20 July, 2021; originally announced July 2021.

  30. arXiv:2107.08444  [pdf, ps, other]

    cs.LG cs.AI cs.CC cs.CG stat.ML

    A Theory of PAC Learnability of Partial Concept Classes

    Authors: Noga Alon, Steve Hanneke, Ron Holzman, Shay Moran

    Abstract: We extend the theory of PAC learning in a way which allows to model a rich variety of learning tasks where the data satisfy special properties that ease the learning process. For example, tasks where the distance of the data from the decision boundary is bounded away from zero. The basic and simple idea is to consider partial concepts: these are functions that can be undefined on certain parts of…

    Submitted 20 July, 2021; v1 submitted 18 July, 2021; originally announced July 2021.

  31. arXiv:2103.00671  [pdf, other]

    cs.LG

    Robust learning under clean-label attack

    Authors: Avrim Blum, Steve Hanneke, Jian Qian, Han Shao

    Abstract: We study the problem of robust learning under clean-label data-poisoning attacks, where the attacker injects (an arbitrary set of) correctly-labeled examples to the training set to fool the algorithm into making mistakes on specific test instances at test time. The learning goal is to minimize the attackable rate (the probability mass of attackable test instances), which is more difficult than opt…

    Submitted 6 July, 2021; v1 submitted 28 February, 2021; originally announced March 2021.

  32. arXiv:2102.02145  [pdf, ps, other]

    cs.LG stat.ML

    Adversarially Robust Learning with Unknown Perturbation Sets

    Authors: Omar Montasser, Steve Hanneke, Nathan Srebro

    Abstract: We study the problem of learning predictors that are robust to adversarial examples with respect to an unknown perturbation set, relying instead on interaction with an adversarial attacker or access to attack oracles, examining different models for such interactions. We obtain upper bounds on the sample complexity and upper and lower bounds on the number of required interactions, or number of succ…

    Submitted 3 February, 2021; originally announced February 2021.

  33. arXiv:2102.01646  [pdf, ps, other]

    cs.LG cs.DS cs.GT stat.ML

    Online Learning with Simple Predictors and a Combinatorial Characterization of Minimax in 0/1 Games

    Authors: Steve Hanneke, Roi Livni, Shay Moran

    Abstract: Which classes can be learned properly in the online model? -- that is, by an algorithm that at each round uses a predictor from the concept class. While there are simple and natural cases where improper learning is necessary, it is natural to ask how complex must the improper predictors be in such cases. Can one always achieve nearly optimal mistake/regret bounds using "simple" predictors? In th…

    Submitted 2 February, 2021; originally announced February 2021.

  34. arXiv:2011.04586  [pdf, ps, other]

    cs.LG stat.ML

    Stable Sample Compression Schemes: New Applications and an Optimal SVM Margin Bound

    Authors: Steve Hanneke, Aryeh Kontorovich

    Abstract: We analyze a family of supervised learning algorithms based on sample compression schemes that are stable, in the sense that removing points from the training set which were not selected for the compression set does not alter the resulting classifier. We use this technique to derive a variety of novel or improved data-dependent generalization bounds for several learning algorithms. In particular,…

    Submitted 9 November, 2020; originally announced November 2020.

  35. arXiv:2011.04483  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    A Theory of Universal Learning

    Authors: Olivier Bousquet, Steve Hanneke, Shay Moran, Ramon van Handel, Amir Yehudayoff

    Abstract: How quickly can a given class of concepts be learned from examples? It is common to measure the performance of a supervised machine learning algorithm by plotting its "learning curve", that is, the decay of the error rate as a function of the number of training examples. However, the classical theoretical framework for understanding learnability, the PAC model of Vapnik-Chervonenkis and Valiant, d…

    Submitted 9 November, 2020; originally announced November 2020.

  36. arXiv:2010.12039  [pdf, ps, other]

    cs.LG

    Reducing Adversarially Robust Learning to Non-Robust PAC Learning

    Authors: Omar Montasser, Steve Hanneke, Nathan Srebro

    Abstract: We study the problem of reducing adversarially robust learning to standard PAC learning, i.e. the complexity of learning adversarially robust predictors using access to only a black-box non-robust learner. We give a reduction that can robustly learn any hypothesis class $\mathcal{C}$ using any non-robust learner $\mathcal{A}$ for $\mathcal{C}$. The number of calls to $\mathcal{A}$ depends logarith…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: To appear in NeurIPS 2020

  37. arXiv:2006.15785  [pdf, other]

    cs.LG math.ST stat.ML

    A No-Free-Lunch Theorem for MultiTask Learning

    Authors: Steve Hanneke, Samory Kpotufe

    Abstract: Multitask learning and related areas such as multi-source domain adaptation address modern settings where datasets from $N$ related distributions $\{P_t\}$ are to be combined towards improving performance on any single such distribution ${\cal D}$. A perplexing fact remains in the evolving theory on the subject: while we would hope for performance bounds that account for the contribution from mult…

    Submitted 5 August, 2020; v1 submitted 28 June, 2020; originally announced June 2020.

  38. arXiv:2005.11818  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Proper Learning, Helly Number, and an Optimal SVM Bound

    Authors: Olivier Bousquet, Steve Hanneke, Shay Moran, Nikita Zhivotovskiy

    Abstract: The classical PAC sample complexity bounds are stated for any Empirical Risk Minimizer (ERM) and contain an extra logarithmic factor $\log(1/ε)$ which is known to be necessary for ERM in general. It has been recently shown by Hanneke (2016) that the optimal sample complexity of PAC learning for any VC class C is achieved by a particular improper learning algorithm, which outputs a specific majorit…

    Submitted 24 May, 2020; originally announced May 2020.

  39. arXiv:2002.04747  [pdf, other]

    cs.LG stat.ML

    On the Value of Target Data in Transfer Learning

    Authors: Steve Hanneke, Samory Kpotufe

    Abstract: We aim to understand the value of additional labeled or unlabeled target data in transfer learning, for any given amount of source data; this is motivated by practical questions around minimizing sampling costs, whereby, target data is usually harder or costlier to acquire than source data, but can yield better accuracy. To this aim, we establish the first minimax-rates in terms of both source and…

    Submitted 11 February, 2020; originally announced February 2020.

    Journal ref: NeurIPS 2019

  40. arXiv:1906.09855  [pdf, other]

    cs.LG math.ST stat.ML

    Universal Bayes consistency in metric spaces

    Authors: Steve Hanneke, Aryeh Kontorovich, Sivan Sabato, Roi Weiss

    Abstract: We extend a recently proposed 1-nearest-neighbor based multiclass learning algorithm and prove that our modification is universally strongly Bayes-consistent in all metric spaces admitting any such learner, making it an "optimistically universal" Bayes-consistent learner. This is the first learning algorithm known to enjoy this property; by comparison, the $k$-NN classifier and its variants are no…

    Submitted 6 January, 2021; v1 submitted 24 June, 2019; originally announced June 2019.

    Comments: To appear in Annals of Statistics

    Journal ref: Annals of Statistics 2021, Vol. 49, No. 4, 2129-2150, August 2021

  41. arXiv:1902.04217  [pdf, ps, other]

    cs.LG stat.ML

    VC Classes are Adversarially Robustly Learnable, but Only Improperly

    Authors: Omar Montasser, Steve Hanneke, Nathan Srebro

    Abstract: We study the question of learning an adversarially robust predictor. We show that any hypothesis class $\mathcal{H}$ with finite VC dimension is robustly PAC learnable with an improper learning rule. The requirement of being improper is necessary as we exhibit examples of hypothesis classes $\mathcal{H}$ with finite VC dimension that are not robustly PAC learnable with any proper learning rule.

    Submitted 3 July, 2019; v1 submitted 11 February, 2019; originally announced February 2019.

    Comments: COLT 2019 Camera Ready

  42. arXiv:1810.01864  [pdf, other]

    cs.LG cs.IT math.ST stat.ML

    Agnostic Sample Compression Schemes for Regression

    Authors: Idan Attias, Steve Hanneke, Aryeh Kontorovich, Menachem Sadigurschi

    Abstract: We obtain the first positive results for bounded sample compression in the agnostic regression setting with the $\ell_p$ loss, where $p\in [1,\infty]$. We construct a generic approximate sample compression scheme for real-valued function classes exhibiting exponential size in the fat-shattering dimension but independent of the sample size. Notably, for linear regression, an approximate compression…

    Submitted 3 February, 2024; v1 submitted 3 October, 2018; originally announced October 2018.

    Comments: New results in this version: (1) Approximate agnostic sample compression scheme for function classes with finite fat-shattering dimension and the $\ell_p$ loss (section 3); (2) Near-optimal approximate compression for linear functions and the $\ell_p$ loss (section 4.1). The results in sections 4.2 and 4.3 appear in the previous version.

  43. arXiv:1805.08254  [pdf, ps, other]

    cs.LG stat.ML

    Sample Compression for Real-Valued Learners

    Authors: Steve Hanneke, Aryeh Kontorovich, Menachem Sadigurschi

    Abstract: We give an algorithmically efficient version of the learner-to-compression scheme conversion in Moran and Yehudayoff (2016). In extending this technique to real-valued hypotheses, we also obtain an efficient regression-to-bounded sample compression converter. To our knowledge, this is the first general compressed regression result (regardless of efficiency or boundedness) guaranteeing uniform appr…

    Submitted 21 May, 2018; originally announced May 2018.

  44. arXiv:1805.08140  [pdf, ps, other]

    cs.LG math.ST stat.ML

    A New Lower Bound for Agnostic Learning with Sample Compression Schemes

    Authors: Steve Hanneke, Aryeh Kontorovich

    Abstract: We establish a tight characterization of the worst-case rates for the excess risk of agnostic learning with sample compression schemes and for uniform convergence for agnostic sample compression schemes. In particular, we find that the optimal rates of convergence for size-$k$ agnostic sample compression schemes are of the form $\sqrt{\frac{k \log(n/k)}{n}}$, which contrasts with agnostic learning…

    Submitted 21 May, 2018; originally announced May 2018.

  45. arXiv:1802.07229  [pdf, other]

    cs.LG cs.DS stat.ML

    Actively Avoiding Nonsense in Generative Models

    Authors: Steve Hanneke, Adam Kalai, Gautam Kamath, Christos Tzamos

    Abstract: A generative model may generate utter nonsense when it is fit to maximize the likelihood of observed data. This happens due to "model error," i.e., when the true data generating distribution does not fit within the class of generative models being learned. To address this, we propose a model of active distribution learning using a binary invalidity oracle that identifies some examples as clearly i…

    Submitted 20 February, 2018; originally announced February 2018.

  46. arXiv:1706.07669  [pdf, ps, other]

    cs.DS cs.LG

    Testing Piecewise Functions

    Authors: Steve Hanneke, Liu Yang

    Abstract: This work explores the query complexity of property testing for general piecewise functions on the real line, in the active and passive property testing settings. The results are proven under an abstract zero-measure crossings condition, which has as special cases piecewise constant functions and piecewise polynomial functions. We find that, in the active testing setting, the query complexity of t…

    Submitted 20 May, 2018; v1 submitted 23 June, 2017; originally announced June 2017.

  47. arXiv:1706.01418  [pdf, ps, other]

    stat.ML cs.LG math.PR math.ST

    Learning Whenever Learning is Possible: Universal Learning under General Stochastic Processes

    Authors: Steve Hanneke

    Abstract: This work initiates a general study of learning and generalization without the i.i.d. assumption, starting from first principles. While the traditional approach to statistical learning theory typically relies on standard assumptions from probability theory (e.g., i.i.d. or stationary ergodic), in this work we are interested in developing a theory of learning based only on the most fundamental and…

    Submitted 20 October, 2020; v1 submitted 5 June, 2017; originally announced June 2017.

  48. arXiv:1705.00219  [pdf, other]

    cs.LG stat.CO stat.ML

    Learning with Changing Features

    Authors: Amit Dhurandhar, Steve Hanneke, Liu Yang

    Abstract: In this paper we study the setting where features are added or change interpretation over time, which has applications in multiple domains such as retail, manufacturing, finance. In particular, we propose an approach to provably determine the time instant from which the new/changed features start becoming relevant with respect to an output variable in an agnostic (supervised) learning setting. We…

    Submitted 29 April, 2017; originally announced May 2017.

  49. arXiv:1512.08064  [pdf, ps, other]

    cs.LG stat.ML

    Statistical Learning under Nonstationary Mixing Processes

    Authors: Steve Hanneke, Liu Yang

    Abstract: We study a special case of the problem of statistical learning without the i.i.d. assumption. Specifically, we suppose a learning method is presented with a sequence of data points, and required to make a prediction (e.g., a classification) for each one, and can then observe the loss incurred by this prediction. We go beyond traditional analyses, which have focused on stationary mixing processes o…

    Submitted 20 May, 2018; v1 submitted 25 December, 2015; originally announced December 2015.

  50. arXiv:1512.07146  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Refined Error Bounds for Several Learning Algorithms

    Authors: Steve Hanneke

    Abstract: This article studies the achievable guarantees on the error rates of certain learning algorithms, with particular focus on refining logarithmic factors. Many of the results are based on a general technique for obtaining bounds on the error rates of sample-consistent classifiers with monotonic error regions, in the realizable case. We prove bounds of this type expressed in terms of either the VC di…

    Submitted 10 September, 2016; v1 submitted 22 December, 2015; originally announced December 2015.

    Journal ref: Journal of Machine Learning Research, Vol. 17 (2016), No. 135, pp. 1-55