
Showing 1–12 of 12 results for author: Kuzmin, A

Searching in archive cs.
  1. arXiv:2406.14983  [pdf, other]

    cs.LG cs.IR stat.ML

    Hierarchical thematic classification of major conference proceedings

    Authors: Arsentii Kuzmin, Alexander Aduenko, Vadim Strijov

    Abstract: In this paper, we develop a decision support system for the hierarchical text classification. We consider text collections with a fixed hierarchical structure of topics given by experts in the form of a tree. The system sorts the topics by relevance to a given document. The experts choose one of the most relevant topics to finish the classification. We propose a weighted hierarchical similarity fu…

    Submitted 21 June, 2024; originally announced June 2024.

  2. arXiv:2402.15319  [pdf, other]

    cs.LG cs.CL

    GPTVQ: The Blessing of Dimensionality for LLM Quantization

    Authors: Mart van Baalen, Andrey Kuzmin, Markus Nagel, Peter Couperus, Cedric Bastoul, Eric Mahurin, Tijmen Blankevoort, Paul Whatmough

    Abstract: In this work we show that the size versus accuracy trade-off of neural network quantization can be significantly improved by increasing the quantization dimensionality. We propose the GPTVQ method, a new fast method for post-training vector quantization (VQ) that scales well to Large Language Models (LLMs). Our method interleaves quantization of one or more columns with updates to the remaining un…

    Submitted 23 February, 2024; originally announced February 2024.

  3. arXiv:2311.09770  [pdf, other]

    cs.SD eess.AS

    DINO-VITS: Data-Efficient Zero-Shot TTS with Self-Supervised Speaker Verification Loss for Noise Robustness

    Authors: Vikentii Pankov, Valeria Pronina, Alexander Kuzmin, Maksim Borisov, Nikita Usoltsev, Xingshan Zeng, Alexander Golubkov, Nikolai Ermolenko, Aleksandra Shirshova, Yulia Matveeva

    Abstract: We address zero-shot TTS systems' noise-robustness problem by proposing a dual-objective training for the speaker encoder using self-supervised DINO loss. This approach enhances the speaker encoder with the speech synthesis objective, capturing a wider range of speech characteristics beneficial for voice cloning. At the same time, the DINO objective improves speaker representation learning, ensuri…

    Submitted 18 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

    Comments: Accepted to Interspeech2024

  4. arXiv:2307.02973  [pdf, other]

    cs.LG

    Pruning vs Quantization: Which is Better?

    Authors: Andrey Kuzmin, Markus Nagel, Mart van Baalen, Arash Behboodi, Tijmen Blankevoort

    Abstract: Neural network pruning and quantization techniques are almost as old as neural networks themselves. However, to date only ad-hoc comparisons between the two have been published. In this paper, we set out to answer the question on which is better: neural network quantization or pruning? By answering this question, we hope to inform design decisions made on neural network hardware going forward. We…

    Submitted 16 February, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

  5. arXiv:2303.17951  [pdf, other]

    cs.LG

    FP8 versus INT8 for efficient deep learning inference

    Authors: Mart van Baalen, Andrey Kuzmin, Suparna S Nair, Yuwei Ren, Eric Mahurin, Chirag Patel, Sundar Subramanian, Sanghyuk Lee, Markus Nagel, Joseph Soriaga, Tijmen Blankevoort

    Abstract: Recently, the idea of using FP8 as a number format for neural network training has been floating around the deep learning world. Given that most training is currently conducted with entire networks in FP32, or sometimes FP16 with mixed-precision, the step to having some parts of a network run in FP8 with 8-bit weights is an appealing potential speed-up for the generally costly and time-intensive t…

    Submitted 15 June, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

  6. arXiv:2208.09225  [pdf, other]

    cs.LG

    FP8 Quantization: The Power of the Exponent

    Authors: Andrey Kuzmin, Mart Van Baalen, Yuwei Ren, Markus Nagel, Jorn Peters, Tijmen Blankevoort

    Abstract: When quantizing neural networks for efficient inference, low-bit integers are the go-to format for efficiency. However, low-bit floating point numbers have an extra degree of freedom, assigning some bits to work on an exponential scale instead. This paper in-depth investigates this benefit of the floating point format for neural network inference. We detail the choices that can be made for the FP8…

    Submitted 23 February, 2024; v1 submitted 19 August, 2022; originally announced August 2022.

  7. arXiv:2207.11048  [pdf, other]

    cs.LG

    Quantized Sparse Weight Decomposition for Neural Network Compression

    Authors: Andrey Kuzmin, Mart van Baalen, Markus Nagel, Arash Behboodi

    Abstract: In this paper, we introduce a novel method of neural network weight compression. In our method, we store weight tensors as sparse, quantized matrix factors, whose product is computed on the fly during inference to generate the target model's weights. We use projected gradient descent methods to find quantized and sparse factorization of the weight tensors. We show that this approach can be seen as…

    Submitted 22 July, 2022; originally announced July 2022.

  8. arXiv:2202.01290  [pdf, other]

    cs.LG cs.CV

    Cyclical Pruning for Sparse Neural Networks

    Authors: Suraj Srinivas, Andrey Kuzmin, Markus Nagel, Mart van Baalen, Andrii Skliar, Tijmen Blankevoort

    Abstract: Current methods for pruning neural network weights iteratively apply magnitude-based pruning on the model weights and re-train the resulting model to recover lost accuracy. In this work, we show that such strategies do not allow for the recovery of erroneously pruned weights. To enable weight recovery, we propose a simple strategy called cyclical pruning which requires the pruning schedul…

    Submitted 2 February, 2022; originally announced February 2022.

  9. arXiv:2109.09684  [pdf]

    eess.SP cs.SD eess.AS physics.ao-ph

    Development of In Situ Acoustic Instruments for The Aquatic Environment Study

    Authors: Aleksandr N. Grekov, Nikolay A. Grekov, Evgeniy Sychov, K. A. Kuzmin

    Abstract: Based on the analysis of existing acoustic methods and instruments, a prototype of an automated instrument has been developed to perform joint measurements in situ of two parameters: sound speed and ultrasound attenuation. The device is based on existing sound velocity profilers. It was proposed to replace the TDC-GP22 converters used in the sound speed meter ISZ-1 with more advanced modern modifi…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: 8 pages, 3 figures

    Journal ref: Monitoring systems of environment 2 (2019): 22-29

  10. arXiv:1912.09802  [pdf, other]

    cs.LG cs.CV stat.ML

    Taxonomy and Evaluation of Structured Compression of Convolutional Neural Networks

    Authors: Andrey Kuzmin, Markus Nagel, Saurabh Pitre, Sandeep Pendyam, Tijmen Blankevoort, Max Welling

    Abstract: The success of deep neural networks in many real-world applications is leading to new challenges in building more efficient architectures. One effective way of making networks more efficient is neural network compression. We provide an overview of existing neural network compression methods that can be used to make neural networks more efficient by changing the architecture of the network. First,…

    Submitted 20 December, 2019; originally announced December 2019.

  11. arXiv:1612.07697  [pdf, other]

    cs.CV

    Set2Model Networks: Learning Discriminatively To Learn Generative Models

    Authors: A. Vakhitov, A. Kuzmin, V. Lempitsky

    Abstract: We present a new "learning-to-learn"-type approach that enables rapid learning of concepts from small-to-medium sized training sets and is primarily designed for web-initialized image retrieval. At the core of our approach is a deep architecture (a Set2Model network) that maps sets of examples to simple generative probabilistic models such as Gaussians or mixtures of Gaussians in the space of high…

    Submitted 27 October, 2017; v1 submitted 22 December, 2016; originally announced December 2016.

  12. arXiv:1611.05689  [pdf, other]

    cs.CV

    End-to-end Learning of Cost-Volume Aggregation for Real-time Dense Stereo

    Authors: Andrey Kuzmin, Dmitry Mikushin, Victor Lempitsky

    Abstract: We present a new deep learning-based approach for dense stereo matching. Compared to previous works, our approach does not use deep learning of pixel appearance descriptors, employing very fast classical matching scores instead. At the same time, our approach uses a deep convolutional network to predict the local parameters of cost volume aggregation process, which in this paper we implement using…

    Submitted 17 November, 2016; originally announced November 2016.