Showing 1–35 of 35 results for author: Kempe, J

  1. arXiv:2408.01420  [pdf, other]

    cs.LG cs.AI cs.CL

    Mission Impossible: A Statistical Perspective on Jailbreaking LLMs

    Authors: Jingtong Su, Julia Kempe, Karen Ullrich

    Abstract: Large language models (LLMs) are trained on a deluge of text data with limited quality control. As a result, LLMs can exhibit unintended or even harmful behaviours, such as leaking information, fake news or hate speech. Countermeasures, commonly referred to as preference alignment, include fine-tuning the pretrained LLMs with carefully crafted text examples of desired behaviour. Even then, empiric…

    Submitted 2 August, 2024; originally announced August 2024.

  2. arXiv:2406.07515  [pdf, other]

    cs.LG cs.AI stat.ML

    Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement

    Authors: Yunzhen Feng, Elvis Dohmatob, Pu Yang, Francois Charton, Julia Kempe

    Abstract: Synthesized data from generative models is increasingly considered as an alternative to human-annotated data for fine-tuning Large Language Models. This raises concerns about model collapse: a drop in performance of models fine-tuned on generated data. Considering that it is easier for both humans and machines to tell between good and bad examples than to generate high-quality samples, we investig…

    Submitted 11 June, 2024; originally announced June 2024.

  3. arXiv:2406.04981  [pdf, other]

    cs.LG stat.ML

    The Price of Implicit Bias in Adversarially Robust Generalization

    Authors: Nikolaos Tsilivis, Natalie Frank, Nathan Srebro, Julia Kempe

    Abstract: We study the implicit bias of optimization in robust empirical risk minimization (robust ERM) and its connection with robust generalization. In classification settings under adversarial perturbations with linear models, we study what type of regularization should ideally be applied for a given perturbation set to improve (robust) generalization. We then show that the implicit bias of optimization…

    Submitted 7 June, 2024; originally announced June 2024.

  4. arXiv:2406.02128  [pdf, other]

    cs.LG cs.AI cs.CL

    Iteration Head: A Mechanistic Study of Chain-of-Thought

    Authors: Vivien Cabannes, Charles Arnal, Wassim Bouaziz, Alice Yang, Francois Charton, Julia Kempe

    Abstract: Chain-of-Thought (CoT) reasoning is known to improve Large Language Models both empirically and in terms of theoretical approximation power. However, our understanding of the inner workings of CoT capabilities, and of the conditions under which they emerge, remains limited. This paper helps fill this gap by demonstrating how CoT reasoning emerges in transformers in a controlled and interpretable setting. In particul…

    Submitted 4 June, 2024; originally announced June 2024.

  5. arXiv:2404.19640  [pdf, other]

    cs.LG cs.AI cs.CV stat.ME stat.ML

    Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks

    Authors: Yunzhen Feng, Tim G. J. Rudner, Nikolaos Tsilivis, Julia Kempe

    Abstract: Adversarial examples have been shown to cause neural networks to fail on a wide range of vision and language tasks, but recent work has claimed that Bayesian neural networks (BNNs) are inherently robust to adversarial perturbations. In this work, we examine this claim. To study the adversarial robustness of BNNs, we investigate whether it is possible to successfully break state-of-the-art BNN infe…

    Submitted 26 April, 2024; originally announced April 2024.

  6. arXiv:2404.05579  [pdf, other]

    cs.LG cs.CV

    Robust Data Pruning: Uncovering and Overcoming Implicit Bias

    Authors: Artem Vysogorets, Kartik Ahuja, Julia Kempe

    Abstract: In the era of exceptionally data-hungry models, careful selection of the training data is essential to mitigate the extensive costs of deep learning. Data pruning offers a solution by removing redundant or uninformative samples from the dataset, which yields faster convergence and improved neural scaling laws. However, little is known about its impact on classification bias of the trained models.…

    Submitted 8 April, 2024; originally announced April 2024.

  7. arXiv:2403.09869  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    Mind the GAP: Improving Robustness to Subpopulation Shifts with Group-Aware Priors

    Authors: Tim G. J. Rudner, Ya Shi Zhang, Andrew Gordon Wilson, Julia Kempe

    Abstract: Machine learning models often perform poorly under subpopulation shifts in the data distribution. Developing methods that allow machine learning models to better generalize to such shifts is crucial for safe deployment in real-world settings. In this paper, we develop a family of group-aware prior (GAP) distributions over neural network parameters that explicitly favor models that generalize well…

    Submitted 14 March, 2024; originally announced March 2024.

    Comments: Published in Proceedings of the 27th International Conference on Artificial Intelligence and Statistics (AISTATS 2024)

  8. arXiv:2402.07712  [pdf, other]

    cs.LG cs.AI stat.ML

    Model Collapse Demystified: The Case of Regression

    Authors: Elvis Dohmatob, Yunzhen Feng, Julia Kempe

    Abstract: In the era of proliferation of large language and image generation models, the phenomenon of "model collapse" refers to the situation whereby as a model is trained recursively on data generated from previous generations of itself over time, its performance degrades until the model eventually becomes completely useless, i.e., the model collapses. In this work, we study this phenomenon in the setting…

    Submitted 30 April, 2024; v1 submitted 12 February, 2024; originally announced February 2024.

  9. arXiv:2402.07043  [pdf, other]

    cs.LG cs.AI cs.CL

    A Tale of Tails: Model Collapse as a Change of Scaling Laws

    Authors: Elvis Dohmatob, Yunzhen Feng, Pu Yang, Francois Charton, Julia Kempe

    Abstract: As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the size of original (human or natural) training data. Yet, the widespread use of popular models means that the ecosystem of online data and text will co-evolve to progressively contain increased amounts of synthesized data. In this paper we ask: How will…

    Submitted 31 May, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Journal ref: ICML 2024

  10. arXiv:2402.03579  [pdf, other]

    cs.LG math.OC

    Deconstructing the Goldilocks Zone of Neural Network Initialization

    Authors: Artem Vysogorets, Anna Dawid, Julia Kempe

    Abstract: The second-order properties of the training loss have a massive impact on the optimization dynamics of deep learning models. Fort & Scherlis (2019) discovered that a large excess of positive curvature and local convexity of the loss Hessian is associated with highly trainable initial points located in a region coined the "Goldilocks zone". Only a handful of subsequent studies touched upon this rel…

    Submitted 4 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning, PMLR (2024) 235:49717-49732

  11. arXiv:2311.17967  [pdf, other]

    cs.CV astro-ph.IM cs.LG

    Discovering Galaxy Features via Dataset Distillation

    Authors: Haowen Guan, Xuan Zhao, Zishi Wang, Zhiyang Li, Julia Kempe

    Abstract: In many applications, Neural Nets (NNs) have classification performance on par with or even exceeding human capacity. Moreover, it is likely that NNs leverage underlying features that might differ from those humans perceive to classify. Can we "reverse-engineer" pertinent features to enhance our scientific understanding? Here, we apply this idea to the notoriously difficult task of galaxy classificatio…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted to NeurIPS Workshop on Machine Learning and the Physical Sciences, 2023

  12. arXiv:2311.07444  [pdf, other]

    cs.LG

    On the Robustness of Neural Collapse and the Neural Collapse of Robustness

    Authors: Jingtong Su, Ya Shi Zhang, Nikolaos Tsilivis, Julia Kempe

    Abstract: Neural Collapse refers to the curious phenomenon at the end of training of a neural network, where feature vectors and classification weights converge to a very simple geometrical arrangement (a simplex). While it has been observed empirically in various cases and has been theoretically motivated, its connection with crucial properties of neural networks, like their generalization and robustness,…

    Submitted 13 November, 2023; originally announced November 2023.

  13. arXiv:2311.07025  [pdf, other]

    cs.LG cs.AI stat.ML

    Embarassingly Simple Dataset Distillation

    Authors: Yunzhen Feng, Ramakrishna Vedantam, Julia Kempe

    Abstract: Dataset distillation extracts a small set of synthetic training samples from a large dataset with the goal of achieving competitive performance on test data when trained on this sample. In this work, we tackle dataset distillation at its core by treating it directly as a bilevel optimization problem. Re-examining the foundational back-propagation through time method, we study the pronounced varian…

    Submitted 12 November, 2023; originally announced November 2023.

    Comments: Short version appears at NeurIPS 2023 WANT workshop

  14. arXiv:2307.02693  [pdf, other]

    cs.LG stat.ML

    Kernels, Data & Physics

    Authors: Francesco Cagnetta, Deborah Oliveira, Mahalakshmi Sabanayagam, Nikolaos Tsilivis, Julia Kempe

    Abstract: Lecture notes from the course given by Professor Julia Kempe at the summer school "Statistical physics of Machine Learning" in Les Houches. The notes discuss the so-called NTK approach to problems in machine learning, which consists of gaining an understanding of generally unsolvable problems by finding a tractable kernel formulation. The notes are mainly focused on practical applications such as…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: These are notes from the lecture of Julia Kempe given at the summer school "Statistical Physics & Machine Learning", which took place at the Les Houches School of Physics in France from 4th to 29th July 2022

  15. arXiv:2304.09403  [pdf, other]

    cs.LG cs.CV

    Wavelets Beat Monkeys at Adversarial Robustness

    Authors: Jingtong Su, Julia Kempe

    Abstract: Research on improving the robustness of neural networks to adversarial noise - imperceptible malicious perturbations of the data - has received significant attention. The currently uncontested state-of-the-art defense to obtain robust deep neural networks is Adversarial Training (AT), but it consumes significantly more resources compared to standard training and trades off accuracy for robustness.…

    Submitted 18 April, 2023; originally announced April 2023.

    Comments: Machine Learning and the Physical Sciences Workshop, NeurIPS 2022

  16. arXiv:2210.05577  [pdf, other]

    cs.LG cs.CR

    What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?

    Authors: Nikolaos Tsilivis, Julia Kempe

    Abstract: The adversarial vulnerability of neural nets, and subsequent techniques to create robust models have attracted significant attention; yet we still lack a full understanding of this phenomenon. Here, we study adversarial examples of trained neural networks through analytical tools afforded by recent theory advances connecting neural networks and kernel methods, namely the Neural Tangent Kernel (NTK…

    Submitted 30 January, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022; added link to GitHub repository

  17. arXiv:2210.01987  [pdf, other]

    cs.CV cs.LG

    ImpressLearn: Continual Learning via Combined Task Impressions

    Authors: Dhrupad Bhardwaj, Julia Kempe, Artem Vysogorets, Angela M. Teng, Evaristus C. Ezekwem

    Abstract: This work proposes a new method to sequentially train deep neural networks on multiple tasks without suffering catastrophic forgetting, while endowing it with the capability to quickly adapt to unseen tasks. Starting from existing work on network masking (Wortsman et al., 2020), we show that simply learning a linear combination of a small number of task-specific supermasks (impressions) on a rando…

    Submitted 31 January, 2023; v1 submitted 4 October, 2022; originally announced October 2022.

  18. arXiv:2207.11727  [pdf, other]

    cs.LG cs.CV

    Can we achieve robustness from data alone?

    Authors: Nikolaos Tsilivis, Jingtong Su, Julia Kempe

    Abstract: We introduce a meta-learning algorithm for adversarially robust classification. The proposed method tries to be as model agnostic as possible and optimizes a dataset prior to its deployment in a machine learning system, aiming to effectively erase its non-robust features. Once the dataset has been created, in principle no specialized algorithm (besides standard gradient descent) is needed to train…

    Submitted 30 January, 2023; v1 submitted 24 July, 2022; originally announced July 2022.

  19. arXiv:2107.02306  [pdf, other]

    cs.LG cs.CV

    Connectivity Matters: Neural Network Pruning Through the Lens of Effective Sparsity

    Authors: Artem Vysogorets, Julia Kempe

    Abstract: Neural network pruning is a fruitful area of research with surging interest in high sparsity regimes. Benchmarking in this domain heavily relies on faithful representation of the sparsity of subnetworks, which has been traditionally computed as the fraction of removed connections (direct sparsity). This definition, however, fails to recognize unpruned parameters that detached from input or output…

    Submitted 7 April, 2023; v1 submitted 5 July, 2021; originally announced July 2021.

  20. Hardness of approximation for quantum problems

    Authors: Sevag Gharibian, Julia Kempe

    Abstract: The polynomial hierarchy plays a central role in classical complexity theory. Here, we define a quantum generalization of the polynomial hierarchy, and initiate its study. We show that not only are there natural complete problems for the second level of this quantum hierarchy, but that these problems are in fact hard to approximate. Using these techniques, we also obtain hardness of approximation…

    Submitted 5 September, 2012; originally announced September 2012.

    Comments: 21 pages, 1 figure, extended abstract appeared in Proceedings of the 39th International Colloquium on Automata, Languages and Programming (ICALP), pages 387-398, Springer, 2012

    Journal ref: Quantum Information & Computation 14 (5 & 6): 517-540, 2014. Also in Proceedings of ICALP 2012

  21. arXiv:1101.3884  [pdf, ps, other]

    quant-ph cs.CC

    Approximation algorithms for QMA-complete problems

    Authors: Sevag Gharibian, Julia Kempe

    Abstract: Approximation algorithms for classical constraint satisfaction problems are one of the main research areas in theoretical computer science. Here we define a natural approximation version of the QMA-complete local Hamiltonian problem and initiate its study. We present two main results. The first shows that a non-trivial approximation ratio can be obtained in the class NP using product states. The s…

    Submitted 20 January, 2011; originally announced January 2011.

    Comments: 22 pages, comments welcome

    Journal ref: SIAM Journal on Computing 41(4): 1028-1050, 2012. Also in Proceedings of 26th IEEE Conference on Computational Complexity (CCC), 178-188, 2011

  22. arXiv:1005.0512  [pdf, ps, other]

    quant-ph cs.CC

    Two-Source Extractors Secure Against Quantum Adversaries

    Authors: Roy Kasher, Julia Kempe

    Abstract: We initiate the study of multi-source extractors in the quantum world. In this setting, our goal is to extract random bits from two independent weak random sources, on which two quantum adversaries store a bounded amount of information. Our main result is a two-source extractor secure against quantum adversaries, with parameters closely matching the classical case and tight in several instances. M…

    Submitted 4 May, 2010; originally announced May 2010.

    Comments: 20 pages, no figures

  23. A Quantum Lovasz Local Lemma

    Authors: Andris Ambainis, Julia Kempe, Or Sattath

    Abstract: The Lovasz Local Lemma (LLL) is a powerful tool in probability theory to show the existence of combinatorial objects meeting a prescribed collection of "weakly dependent" criteria. We show that the LLL extends to a much more general geometric setting, where events are replaced with subspaces and probability is replaced with relative dimension, which allows one to lower bound the dimension of the int…

    Submitted 9 November, 2009; originally announced November 2009.

    Comments: 19 pages

    Journal ref: Journal of the ACM, Volume 59 Issue 5, October 2012, Article No. 24

  24. arXiv:0911.0201  [pdf, ps, other]

    quant-ph cs.CC

    No Strong Parallel Repetition with Entangled and Non-signaling Provers

    Authors: Julia Kempe, Oded Regev

    Abstract: We consider one-round games between a classical verifier and two provers. One of the main questions in this area is the parallel repetition question: If the game is played $\ell$ times in parallel, does the maximum winning probability decay exponentially in $\ell$? In the classical setting, this question was answered in the affirmative by Raz. More recently the question arose whether the…

    Submitted 1 November, 2009; originally announced November 2009.

    Comments: 15 pages, 2 figures

  25. arXiv:quant-ph/0607174  [pdf, ps, other]

    quant-ph cs.CC

    Exponential Separation of Quantum and Classical One-Way Communication Complexity for a Boolean Function

    Authors: Dmytro Gavinsky, Julia Kempe, Ronald de Wolf

    Abstract: We give an exponential separation between one-way quantum and classical communication complexity for a Boolean function. Earlier such a separation was known only for a relation. A very similar result was obtained earlier but independently by Kerenidis and Raz [KR06]. Our version of the result gives an example in the bounded storage model of cryptography, where the key is secure if the adversary…

    Submitted 25 July, 2006; originally announced July 2006.

    Comments: 8 pages, no figures

  26. arXiv:quant-ph/0603173  [pdf, ps, other]

    quant-ph cs.CC

    Strengths and Weaknesses of Quantum Fingerprinting

    Authors: Dmytro Gavinsky, Julia Kempe, Ronald de Wolf

    Abstract: We study the power of quantum fingerprints in the simultaneous message passing (SMP) setting of communication complexity. Yao recently showed how to simulate, with exponential overhead, classical shared-randomness SMP protocols by means of quantum SMP protocols without shared randomness ($Q^\parallel$-protocols). Our first result is to extend Yao's simulation to the strongest possible model: eve…

    Submitted 20 March, 2006; originally announced March 2006.

    Comments: 13 pages, no figures, to appear in CCC'06

    Journal ref: Proc. 21st CCC (Complexity), p. 288-295 (2006)

  27. arXiv:quant-ph/0511013  [pdf, ps, other]

    quant-ph cs.CC

    Bounded-Error Quantum State Identification and Exponential Separations in Communication Complexity

    Authors: Dmytro Gavinsky, Julia Kempe, Oded Regev, Ronald de Wolf

    Abstract: We consider the problem of bounded-error quantum state identification: given either state α_0 or state α_1, we are required to output `0', `1' or `?' ("don't know"), such that conditioned on outputting `0' or `1', our guess is correct with high probability. The goal is to maximize the probability of not outputting `?'. We prove a direct product theorem: if we're given two such problems, with opt…

    Submitted 2 November, 2005; originally announced November 2005.

    Comments: 20 pages, no figures

  28. arXiv:quant-ph/0411051  [pdf, ps, other]

    quant-ph cs.CC

    Quantum Communication Cannot Simulate a Public Coin

    Authors: Dmytro Gavinsky, Julia Kempe, Ronald de Wolf

    Abstract: We study the simultaneous message passing model of communication complexity. Building on the quantum fingerprinting protocol of Buhrman et al., Yao recently showed that a large class of efficient classical public-coin protocols can be turned into efficient quantum protocols without public coin. This raises the question whether this can be done always, i.e. whether quantum communication can alway…

    Submitted 8 November, 2004; originally announced November 2004.

    Comments: 12 pages LaTeX

  29. arXiv:quant-ph/0406180  [pdf, ps, other]

    quant-ph cs.CC

    The Complexity of the Local Hamiltonian Problem

    Authors: Julia Kempe, Alexei Kitaev, Oded Regev

    Abstract: The k-local Hamiltonian problem is a natural complete problem for the complexity class QMA, the quantum analog of NP. It is similar in spirit to MAX-k-SAT, which is NP-complete for k >= 2. It was known that the problem is QMA-complete for any k >= 3. On the other hand 1-local Hamiltonian is in P, and hence not believed to be QMA-complete. The complexity of the 2-local Hamiltonian problem has long…

    Submitted 2 October, 2005; v1 submitted 24 June, 2004; originally announced June 2004.

    Comments: 30 pages, 3 figures, replaced with revised version, numerous improvements to readability and expanded adiabatic section

    Journal ref: SIAM Journal on Computing, Vol. 35(5), p. 1070-1097 (2006), conference version in Proc. 24th FSTTCS, p. 372-383 (2004)

  30. arXiv:quant-ph/0406046  [pdf, ps, other]

    quant-ph cs.CC

    The hidden subgroup problem and permutation group theory

    Authors: Julia Kempe, Aner Shalev

    Abstract: We employ concepts and tools from the theory of finite permutation groups in order to analyse the Hidden Subgroup Problem via Quantum Fourier Sampling (QFS) for the symmetric group. We show that under very general conditions both the weak and the random-strong form (strong form with random choices of basis) of QFS fail to provide any advantage over classical exhaustive search. In particular we g…

    Submitted 8 June, 2004; originally announced June 2004.

    Comments: 12 pages

    Journal ref: Proc. 16th ACM-SIAM SODA, p. 1118-1125 (2005)

  31. arXiv:quant-ph/0402107  [pdf, ps, other]

    quant-ph cs.DS

    Coins Make Quantum Walks Faster

    Authors: Andris Ambainis, Julia Kempe, Alexander Rivosh

    Abstract: We show how to search N items arranged on a $\sqrt{N}\times\sqrt{N}$ grid in time $O(\sqrt N \log N)$, using a discrete time quantum walk. This result for the first time exhibits a significant difference between discrete time and continuous time walks without coin degrees of freedom, since it has been shown recently that such a continuous time walk needs time $\Omega(N)$ to perform the same task. Our…

    Submitted 16 February, 2004; originally announced February 2004.

    Comments: 25 pages, no figures

    Journal ref: Proc. 16th ACM-SIAM SODA, p. 1099-1108 (2005)

  32. Quantum random walks - an introductory overview

    Authors: Julia Kempe

    Abstract: This article aims to provide an introductory survey on quantum random walks. Starting from a physical effect to illustrate the main ideas we will introduce quantum random walks, review some of their properties and outline their striking differences to classical walks. We will touch upon both physical effects and computer science applications, introducing some of the main concepts and language of…

    Submitted 13 March, 2003; originally announced March 2003.

    Comments: 20 pages, 13 figures, to appear in Contemporary Physics

    Journal ref: Contemporary Physics, Vol. 44 (4), p.307-327, 2003

  33. arXiv:quant-ph/0302079  [pdf, ps, other]

    quant-ph cs.CC

    3-Local Hamiltonian is QMA-complete

    Authors: Julia Kempe, Oded Regev

    Abstract: It has been shown by Kitaev that the 5-local Hamiltonian problem is QMA-complete. Here we reduce the locality of the problem by showing that 3-local Hamiltonian is already QMA-complete.

    Submitted 20 May, 2003; v1 submitted 10 February, 2003; originally announced February 2003.

    Comments: 7 pages, minor changes and corrections, published version

    Journal ref: Quantum Information & Computation, Vol. 3(3), p. 258-264, 2003

  34. A Quantum Random Walk Search Algorithm

    Authors: Neil Shenvi, Julia Kempe, K. Birgitta Whaley

    Abstract: Quantum random walks on graphs have been shown to display many interesting properties, including exponentially fast hitting times when compared with their classical counterparts. However, it is still unclear how to use these novel properties to gain an algorithmic speed-up over classical algorithms. In this paper, we present a quantum search algorithm based on the quantum random walk architectur…

    Submitted 9 October, 2002; originally announced October 2002.

    Comments: 13 pages, 3 figures

    Journal ref: Phys. Rev. A, Vol. 67 (5), 052307 (2003)

  35. arXiv:quant-ph/0205083  [pdf, ps, other]

    quant-ph cs.CC

    Quantum Random Walks Hit Exponentially Faster

    Authors: Julia Kempe

    Abstract: We show that the hitting time of the discrete time quantum random walk on the n-bit hypercube from one corner to its opposite is polynomial in n. This gives the first exponential quantum-classical gap in the hitting time of discrete quantum random walks. We provide the framework for quantum hitting time and give two alternative definitions to set the ground for its study on general graphs. We th…

    Submitted 14 May, 2002; originally announced May 2002.

    Comments: 15 pages, no figures

    Journal ref: Probability Theory and Related Fields, Vol. 133(2), p. 215-235 (2005), conference version in Proc. 7th RANDOM, p. 354-69, 2003