Showing 1–3 of 3 results for author: Chern, F

  1. arXiv:2210.06313  [pdf, other]

    cs.LG cs.CL cs.CV stat.ML

    The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers

    Authors: Zonglin Li, Chong You, Srinadh Bhojanapalli, Daliang Li, Ankit Singh Rawat, Sashank J. Reddi, Ke Ye, Felix Chern, Felix Yu, Ruiqi Guo, Sanjiv Kumar

    Abstract: This paper studies the curious phenomenon for machine learning models with Transformer architectures that their activation maps are sparse. By activation map we refer to the intermediate output of the multi-layer perceptrons (MLPs) after a ReLU activation function, and by sparse we mean that on average very few entries (e.g., 3.0% for T5-Base and 6.3% for ViT-B16) are nonzero for each input to MLP…

    Submitted 9 June, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: A short version was presented at ICLR 2023. Previous title: Large Models are Parsimonious Learners: Activation Sparsity in Trained Transformers

  2. arXiv:2206.14286  [pdf, ps, other]

    cs.PF cs.LG

    TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s

    Authors: Felix Chern, Blake Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, Sanjiv Kumar

    Abstract: This paper presents a novel nearest neighbor search algorithm achieving TPU (Google Tensor Processing Unit) peak performance, outperforming state-of-the-art GPU algorithms with similar level of recall. The design of the proposed algorithm is motivated by an accurate accelerator performance model that takes into account both the memory and instruction bottlenecks. Our algorithm comes with an analyt…

    Submitted 30 June, 2022; v1 submitted 28 June, 2022; originally announced June 2022.

  3. arXiv:1908.10396  [pdf, other]

    cs.LG stat.ML

    Accelerating Large-Scale Inference with Anisotropic Vector Quantization

    Authors: Ruiqi Guo, Philip Sun, Erik Lindgren, Quan Geng, David Simcha, Felix Chern, Sanjiv Kumar

    Abstract: Quantization based techniques are the current state-of-the-art for scaling maximum inner product search to massive databases. Traditional approaches to quantization aim to minimize the reconstruction error of the database points. Based on the observation that for a given query, the database points that have the largest inner products are more relevant, we develop a family of anisotropic quantizati…

    Submitted 4 December, 2020; v1 submitted 27 August, 2019; originally announced August 2019.