Showing 1–6 of 6 results for author: Saxena, U

Searching in archive cs.
  1. arXiv:2408.05646  [pdf, other]

    cs.LG cs.AI cs.CL

    Eigen Attention: Attention in Low-Rank Space for KV Cache Compression

    Authors: Utkarsh Saxena, Gobinda Saha, Sakshi Choudhary, Kaushik Roy

    Abstract: Large language models (LLMs) represent a groundbreaking advancement in the domain of natural language processing due to their impressive reasoning abilities. Recently, there has been considerable interest in increasing the context lengths for these models to enhance their applicability to complex tasks. However, at long context lengths and large batch sizes, the key-value (KV) cache, which stores…

    Submitted 10 August, 2024; originally announced August 2024.

    Comments: 12 pages, 6 figures, 6 tables

  2. arXiv:2403.13577  [pdf, other]

    cs.AR

    HCiM: ADC-Less Hybrid Analog-Digital Compute in Memory Accelerator for Deep Learning Workloads

    Authors: Shubham Negi, Utkarsh Saxena, Deepika Sharma, Kaushik Roy

    Abstract: Analog Compute-in-Memory (CiM) accelerators are increasingly recognized for their efficiency in accelerating Deep Neural Networks (DNN). However, their dependence on Analog-to-Digital Converters (ADCs) for accumulating partial sums from crossbars leads to substantial power and area overhead. Moreover, the high area overhead of ADCs constrains the throughput due to the limited number of ADCs that c…

    Submitted 20 March, 2024; originally announced March 2024.

  3. Hardware/Software co-design with ADC-Less In-memory Computing Hardware for Spiking Neural Networks

    Authors: Marco Paul E. Apolinario, Adarsh Kumar Kosta, Utkarsh Saxena, Kaushik Roy

    Abstract: Spiking Neural Networks (SNNs) are bio-plausible models that hold great potential for realizing energy-efficient implementations of sequential tasks on resource-constrained edge devices. However, commercial edge platforms based on standard GPUs are not optimized to deploy SNNs, resulting in high energy and latency. While analog In-Memory Computing (IMC) platforms can serve as energy-efficient infe…

    Submitted 4 June, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: 13 pages, 14 figures

    Journal ref: IEEE Transactions on Emerging Topics in Computing (2023)

  4. arXiv:2101.10552  [pdf, other]

    cs.LG

    A Unified Paths Perspective for Pruning at Initialization

    Authors: Thomas Gebhart, Udit Saxena, Paul Schrater

    Abstract: A number of recent approaches have been proposed for pruning neural network parameters at initialization with the goal of reducing the size and computational burden of models while minimally affecting their training dynamics and generalization performance. While each of these approaches have some amount of well-founded motivation, a rigorous analysis of the effect of these pruning methods on netwo…

    Submitted 25 January, 2021; originally announced January 2021.

    Comments: 13 pages, 3 figures

  5. arXiv:1907.00625  [pdf, other]

    cs.NE eess.SY

    On-chip learning in a conventional silicon MOSFET based Analog Hardware Neural Network

    Authors: Nilabjo Dey, Janak Sharda, Utkarsh Saxena, Divya Kaushik, Utkarsh Singh, Debanjan Bhowmik

    Abstract: On-chip learning in a crossbar array based analog hardware Neural Network (NN) has been shown to have major advantages in terms of speed and energy compared to training NN on a traditional computer. However analog hardware NN proposals and implementations thus far have mostly involved Non Volatile Memory (NVM) devices like Resistive Random Access Memory (RRAM), Phase Change Memory (PCM), spintroni…

    Submitted 1 July, 2019; originally announced July 2019.

    Comments: 18 pages, 10 figures, 1 table (shorter version submitted to conference for review)

  6. arXiv:1811.09966  [pdf, other]

    physics.app-ph cs.ET cs.NE

    On-chip learning for domain wall synapse based Fully Connected Neural Network

    Authors: Apoorv Dankar, Anand Verma, Utkarsh Saxena, Divya Kaushik, Shouri Chatterjee, Debanjan Bhowmik

    Abstract: Spintronic devices are considered as promising candidates in implementing neuromorphic systems or hardware neural networks, which are expected to perform better than other existing computing systems for certain data classification and regression tasks. In this paper, we have designed a feedforward Fully Connected Neural Network (FCNN) with no hidden layer using spin orbit torque driven domain wall…

    Submitted 25 November, 2018; originally announced November 2018.

    Comments: Submitted on November 5, 2018 for review in journal

    Report number: Accepted for publication in Journal of Magnetism and Magnetic Materials on June 7, 2019

    Journal ref: Journal of Magnetism and Magnetic Materials vol. 489, no. 165434, 2019