Showing 1–20 of 20 results for author: May, A

Searching in archive cs.
  1. arXiv:2408.15237 [pdf, other]

    cs.LG cs.AI

    The Mamba in the Llama: Distilling and Accelerating Hybrid Models

    Authors: Junxiong Wang, Daniele Paliotta, Avner May, Alexander M. Rush, Tri Dao

    Abstract: Linear RNN architectures, like Mamba, can be competitive with Transformer models in language modeling while having advantageous deployment characteristics. Given the focus on training large-scale Transformer models, we consider the challenge of converting these pretrained models for deployment. We demonstrate that it is feasible to distill large Transformers into linear RNNs by reusing the linear…

    Submitted 27 August, 2024; originally announced August 2024.

    Comments: Code is open-sourced at https://github.com/jxiw/MambaInLlama

  2. arXiv:2406.02532 [pdf, other]

    cs.CL

    SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices

    Authors: Ruslan Svirschevski, Avner May, Zhuoming Chen, Beidi Chen, Zhihao Jia, Max Ryabinin

    Abstract: As large language models gain widespread adoption, running them efficiently becomes crucial. Recent works on LLM inference use speculative decoding to achieve extreme speedups. However, most of these works implicitly design their algorithms for high-end datacenter hardware. In this work, we ask the opposite question: how fast can we run LLMs on consumer machines? Consumer GPUs can no longer fit th…

    Submitted 25 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: preprint

  3. arXiv:2404.03039 [pdf, ps, other]

    cs.FL

    Illustrating Finite Automata with Grail+ and TikZ

    Authors: Alastair May, Taylor J. Smith

    Abstract: In this article, we discuss a new software tool that interacts with Grail+, a library of automata-theoretic command-line utilities. Our software, the Grail+ Visualizer, takes the textual representation of a finite automaton produced by Grail+ and generates TikZ code to illustrate the finite automaton, with automatic layout of states and transitions. In addition to giving an overview of the basics…

    Submitted 3 April, 2024; originally announced April 2024.

    MSC Class: 68-04 (primary); 68Q45 (secondary)

  4. arXiv:2402.12374 [pdf, other]

    cs.CL

    Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding

    Authors: Zhuoming Chen, Avner May, Ruslan Svirschevski, Yuhsun Huang, Max Ryabinin, Zhihao Jia, Beidi Chen

    Abstract: As the usage of large language models (LLMs) grows, performing efficient inference with these models becomes increasingly important. While speculative decoding has recently emerged as a promising direction for speeding up inference, existing methods are limited in their ability to scale to larger speculation budgets, and adapt to different hyperparameters and hardware. This paper introduces Sequoi…

    Submitted 29 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

  5. arXiv:2312.09369 [pdf, other]

    cs.SD cs.AI eess.AS

    Audio-visual fine-tuning of audio-only ASR models

    Authors: Avner May, Dmitriy Serdyuk, Ankit Parag Shah, Otavio Braga, Olivier Siohan

    Abstract: Audio-visual automatic speech recognition (AV-ASR) models are very effective at reducing word error rates on noisy speech, but require large amounts of transcribed AV training data. Recently, audio-visual self-supervised learning (SSL) approaches have been developed to reduce this dependence on transcribed AV data, but these methods are quite complex and computationally expensive. In this work, we…

    Submitted 14 December, 2023; originally announced December 2023.

  6. Pseudonymization at Scale: OLCF's Summit Usage Data Case Study

    Authors: Ketan Maheshwari, Sean R. Wilkinson, Alex May, Tyler Skluzacek, Olga A. Kuchar, Rafael Ferreira da Silva

    Abstract: The analysis of vast amounts of data and the processing of complex computational jobs have traditionally relied upon high performance computing (HPC) systems. Understanding these analyses' needs is paramount for designing solutions that can lead to better science, and similarly, understanding the characteristics of the user behavior on those systems is important for improving user experiences on H…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: 9 pages, 5 figures, accepted to BTSD 2022 workshop (see https://sites.google.com/view/btsd2022 for more information), to be published in the proceedings of IEEE Big Data 2022

  7. How Not to Protect Your IP -- An Industry-Wide Break of IEEE 1735 Implementations

    Authors: Julian Speith, Florian Schweins, Maik Ender, Marc Fyrbiak, Alexander May, Christof Paar

    Abstract: Modern hardware systems are composed of a variety of third-party Intellectual Property (IP) cores to implement their overall functionality. Since hardware design is a globalized process involving various (untrusted) stakeholders, a secure management of the valuable IP between authors and users is inevitable to protect them from unauthorized access and modification. To this end, the widely adopted…

    Submitted 9 December, 2021; originally announced December 2021.

  8. arXiv:2005.09117 [pdf, other]

    cs.CL cs.LG

    Contextual Embeddings: When Are They Worth It?

    Authors: Simran Arora, Avner May, Jian Zhang, Christopher Ré

    Abstract: We study the settings for which deep contextual embeddings (e.g., BERT) give large improvements in performance relative to classic pretrained embeddings (e.g., GloVe), and an even simpler baseline---random word embeddings---focusing on the impact of the training set size and the linguistic properties of the task. Surprisingly, we find that both of these simpler baselines can match contextual embed…

    Submitted 18 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  9. arXiv:2003.04983 [pdf, other]

    cs.CL cs.LG stat.ML

    Understanding the Downstream Instability of Word Embeddings

    Authors: Megan Leszczynski, Avner May, Jian Zhang, Sen Wu, Christopher R. Aberger, Christopher Ré

    Abstract: Many industrial machine learning (ML) systems require frequent retraining to keep up-to-date with constantly changing data. This retraining exacerbates a large challenge facing ML systems today: model training is unstable, i.e., small changes in training data can cause significant changes in the model's predictions. In this paper, we work on developing a deeper understanding of this instability, w…

    Submitted 28 February, 2020; originally announced March 2020.

    Comments: In Proceedings of the 3rd MLSys Conference, 2020

  10. arXiv:1910.00802 [pdf, other]

    cs.CR quant-ph

    Noisy Simon Period Finding

    Authors: Alexander May, Lars Schlieper, Jonathan Schwinger

    Abstract: Let $f: \mathbb{F}_2^n \rightarrow \mathbb{F}_2^n$ be a Boolean function with period $\vec s$. It is well-known that Simon's algorithm finds $\vec s$ in time polynomial in $n$ on quantum devices that are capable of performing error-correction. However, today's quantum devices are inherently noisy, too limited for error correction, and Simon's algorithm is not error-tolerant. We show that even no…

    Submitted 10 March, 2021; v1 submitted 2 October, 2019; originally announced October 2019.

  11. arXiv:1909.01264 [pdf, other]

    cs.LG stat.ML

    On the Downstream Performance of Compressed Word Embeddings

    Authors: Avner May, Jian Zhang, Tri Dao, Christopher Ré

    Abstract: Compressing word embeddings is important for deploying NLP models in memory-constrained settings. However, understanding what makes compressed embeddings perform well on downstream tasks is challenging---existing measures of compression quality often fail to distinguish between embeddings that perform well and those that do not. We thus propose the eigenspace overlap score as a new measure. We rel…

    Submitted 14 January, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

    Comments: NeurIPS 2019 spotlight (Conference on Neural Information Processing Systems)

  12. arXiv:1907.04295   

    cs.DS cs.CR

    Better Sample -- Random Subset Sum in $2^{0.255n}$ and its Impact on Decoding Random Linear Codes

    Authors: Andre Esser, Alexander May

    Abstract: We propose a new heuristic algorithm for solving random subset sum instances $a_1, \ldots, a_n, t \in \mathbb{Z}_{2^n}$, which play a crucial role in cryptographic constructions. Our algorithm is search tree-based and solves the instances in a divide-and-conquer method using the representation method. From a high level perspective, our algorithm is similar to the algorithm of Howgrave-Graham-Joux…

    Submitted 21 October, 2019; v1 submitted 9 July, 2019; originally announced July 2019.

    Comments: Issue with counting duplicate representations

  13. arXiv:1905.10074 [pdf, other]

    cs.CR quant-ph

    Quantum Period Finding is Compression Robust

    Authors: Alexander May, Lars Schlieper

    Abstract: We study quantum period finding algorithms such as Simon and Shor (and its variants Ekerå-Håstad and Mosca-Ekert). For a periodic function $f$ these algorithms produce -- via some quantum embedding of $f$ -- a quantum superposition $\sum_x |x\rangle|f(x)\rangle$, which requires a certain amount of output qubits that represent $|f(x)\rangle$. We show that one can lower this amount to a single outpu…

    Submitted 15 February, 2021; v1 submitted 24 May, 2019; originally announced May 2019.

  14. Gender Differences in Participation and Reward on Stack Overflow

    Authors: Anna May, Johannes Wachs, Aniko Hannak

    Abstract: Programming is a valuable skill in the labor market, making the underrepresentation of women in computing an increasingly important issue. Online question and answer platforms serve a dual purpose in this field: they form a body of knowledge useful as a reference and learning tool, and they provide opportunities for individuals to demonstrate credible, verifiable expertise. Issues, such as male-or…

    Submitted 28 November, 2018; originally announced November 2018.

    Journal ref: Empirical Software Engineering 2019

  15. arXiv:1811.00155 [pdf, other]

    cs.LG cs.AI stat.ML

    Low-Precision Random Fourier Features for Memory-Constrained Kernel Approximation

    Authors: Jian Zhang, Avner May, Tri Dao, Christopher Ré

    Abstract: We investigate how to train kernel approximation methods that generalize well under a memory budget. Building on recent theoretical work, we define a measure of kernel approximation error which we find to be more predictive of the empirical generalization performance of kernel approximation methods than conventional metrics. An important consequence of this definition is that a kernel approximatio…

    Submitted 20 March, 2019; v1 submitted 31 October, 2018; originally announced November 2018.

    Comments: International Conference on Artificial Intelligence and Statistics (AISTATS) 2019

  16. arXiv:1701.03577 [pdf, ps, other]

    stat.ML cs.AI cs.CL cs.LG

    Kernel Approximation Methods for Speech Recognition

    Authors: Avner May, Alireza Bagheri Garakani, Zhiyun Lu, Dong Guo, Kuan Liu, Aurélien Bellet, Linxi Fan, Michael Collins, Daniel Hsu, Brian Kingsbury, Michael Picheny, Fei Sha

    Abstract: We study large-scale kernel methods for acoustic modeling in speech recognition and compare their performance to deep neural networks (DNNs). We perform experiments on four speech recognition datasets, including the TIMIT and Broadcast News benchmark tasks, and compare these two types of models on frame-level performance metrics (accuracy, cross-entropy), as well as on recognition metrics (word/ch…

    Submitted 13 January, 2017; originally announced January 2017.

  17. arXiv:1604.07163 [pdf, other]

    cs.MS

    Extreme-scale Multigrid Components within PETSc

    Authors: Dave A. May, Patrick Sanan, Karl Rupp, Matthew G. Knepley, Barry F. Smith

    Abstract: Elliptic partial differential equations (PDEs) frequently arise in continuum descriptions of physical processes relevant to science and engineering. Multilevel preconditioners represent a family of scalable techniques for solving discrete PDEs of this type and thus are the method of choice for high-resolution simulations. The scalability and time-to-solution of massively parallel multilevel precon…

    Submitted 25 April, 2016; originally announced April 2016.

  18. arXiv:1603.05800 [pdf, ps, other]

    cs.LG stat.ML

    A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition

    Authors: Zhiyun Lu, Dong Guo, Alireza Bagheri Garakani, Kuan Liu, Avner May, Aurelien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha

    Abstract: We study large-scale kernel methods for acoustic modeling and compare to DNNs on performance metrics related to both acoustic modeling and recognition. Measuring perplexity and frame-level classification accuracy, kernel-based acoustic models are as effective as their DNN counterparts. However, on token-error-rates DNN models can be significantly better. We have discovered that this might be attri…

    Submitted 18 March, 2016; originally announced March 2016.

    Comments: arXiv admin note: text overlap with arXiv:1411.4000

  19. arXiv:1411.4000 [pdf, other]

    cs.LG cs.AI stat.ML

    How to Scale Up Kernel Methods to Be As Good As Deep Neural Nets

    Authors: Zhiyun Lu, Avner May, Kuan Liu, Alireza Bagheri Garakani, Dong Guo, Aurélien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha

    Abstract: The computational complexity of kernel methods has often been a major barrier for applying them to large-scale learning problems. We argue that this barrier can be effectively overcome. In particular, we develop methods to scale up kernel models to successfully tackle large-scale learning problems that are so far only approachable by deep learning architectures. Based on the seminal work by Rahimi…

    Submitted 17 June, 2015; v1 submitted 14 November, 2014; originally announced November 2014.

  20. arXiv:1107.5951 [pdf, other]

    cs.CE cs.DC physics.geo-ph

    Optimal, scalable forward models for computing gravity anomalies

    Authors: Dave A. May, Matthew G. Knepley

    Abstract: We describe three approaches for computing a gravity signal from a density anomaly. The first approach consists of the classical "summation" technique, whilst the remaining two methods solve the Poisson problem for the gravitational potential using either a Finite Element (FE) discretization employing a multilevel preconditioner, or a Green's function evaluated with the Fast Multipole Method (FMM)…

    Submitted 29 July, 2011; originally announced July 2011.

    Comments: 38 pages, 13 figures; accepted by Geophysical Journal International

    Journal ref: Geophysical Journal International, 187(1):161-177, 2011