Search | arXiv e-print repository

Employing Vector Field Techniques on the Analysis of Memristor Cellular Nonlinear Networks Cell Dynamics

Authors: Chandan Singh, Vasileios Ntinas, Dimitrios Prousalis, Yongmin Wang, Ahmet Samil Demirkol, Ioannis Messaris, Vikas Rana, Stephan Menzel, Alon Ascoli, Ronald Tetzlaff

Abstract: This paper introduces an innovative graphical analysis tool for investigating the dynamics of Memristor Cellular Nonlinear Networks (M-CNNs) featuring 2nd-order processing elements, known as M-CNN cells. In the era of specialized hardware catering to the demands of intelligent autonomous systems, the integration of memristors within Cellular Nonlinear Networks (CNNs) has emerged as a promising paradigm due to their exceptional characteristics. However, the standard Dynamic Route Map (DRM) analysis, applicable to 1st-order systems, fails to address the intricacies of 2nd-order M-CNN cell dynamics, and the 2nd-order DRM (DRM2) is likewise limited in its graphical illustration of local dynamical properties of the M-CNN cells, e.g. the magnitude of the state derivative. To address this limitation, we propose a novel integration of the M-CNN cell vector field into the cell's phase portrait, enhancing the analysis efficacy and enabling efficient M-CNN cell design. A comprehensive exploration of M-CNN cell dynamics is presented, showcasing the utility of the proposed graphical tool for various scenarios, including bistable and monostable behavior, and demonstrating its superior ability to reveal subtle variations in cell behavior. Through this work, we offer a refined perspective on the analysis and design of M-CNNs, paving the way for advanced applications in edge computing and specialized hardware.

Submitted 6 August, 2024; originally announced August 2024.

Comments: Presented at the 18th IEEE International Workshop on Cellular Nanoscale Networks and their Applications (CNNA'23) and the 8th Memristor and Memristive Symposium

arXiv:2407.02921 [pdf, other]

In-Memory Mirroring: Cloning Without Reading

Authors: Simranjeet Singh, Ankit Bende, Chandan Kumar Jha, Vikas Rana, Rolf Drechsler, Sachin Patkar, Farhad Merchant

Abstract: In-memory computing (IMC) has gained significant attention recently as it attempts to reduce the impact of memory bottlenecks. Numerous schemes for digital IMC are presented in the literature, focusing on logic operations. Often, an application's description has data dependencies that must be resolved. Contemporary IMC architectures perform read followed by write operations for this purpose, which results in performance and energy penalties. To solve this fundamental problem, this paper presents in-memory mirroring (IMM). IMM eliminates the need for read and write-back steps, thus avoiding energy and performance penalties. Instead, we perform data movement within memory, involving row-wise and column-wise data transfers. Additionally, the IMM scheme enables parallel cloning of an entire row (word) with a complexity of $\mathcal{O}(1)$. Moreover, we analyze the energy consumption of the proposed technique using a resistive random-access memory crossbar and the experimentally validated JART VCM v1b model. IMM increases energy efficiency and shows a 2$\times$ performance improvement compared to conventional data movement methods.

Submitted 4 July, 2024; v1 submitted 3 July, 2024; originally announced July 2024.

Comments: Accepted in IFIP/IEEE VLSI-SoC 2024

arXiv:2404.09818 [pdf, other]

Error Detection and Correction Codes for Safe In-Memory Computations

Authors: Luca Parrini, Taha Soliman, Benjamin Hettwer, Jan Micha Borrmann, Simranjeet Singh, Ankit Bende, Vikas Rana, Farhad Merchant, Norbert Wehn

Abstract: In-Memory Computing (IMC) introduces a new paradigm of computation that offers high efficiency in terms of latency and power consumption for AI accelerators. However, the non-idealities and defects of emerging technologies used in advanced IMC can severely degrade the accuracy of inferred Neural Networks (NN) and lead to malfunctions in safety-critical applications. In this paper, we investigate an architectural-level mitigation technique based on the coordinated action of multiple checksum codes, to detect and correct errors at run-time. This implementation demonstrates higher efficiency in recovering accuracy across different AI algorithms and technologies compared to more traditional methods such as Triple Modular Redundancy (TMR). The results show that several configurations of our implementation recover more than 91% of the original accuracy with less than half of the area required by TMR and less than 40% of latency overhead.

Submitted 15 April, 2024; originally announced April 2024.

Comments: This paper will be presented at the 29th IEEE European Test Symposium (ETS) 2024

arXiv:2310.10460 [pdf, other]

Experimental Validation of Memristor-Aided Logic Using 1T1R TaOx RRAM Crossbar Array

Authors: Ankit Bende, Simranjeet Singh, Chandan Kumar Jha, Tim Kempen, Felix Cüppers, Christopher Bengel, Andre Zambanini, Dennis Nielinger, Sachin Patkar, Rolf Drechsler, Rainer Waser, Farhad Merchant, Vikas Rana

Abstract: The memristor-aided logic (MAGIC) design style holds high promise for realizing digital logic-in-memory functionality. The ability to implement a specific gate in a MAGIC design style hinges on the SET-to-RESET threshold ratio. The TaOx memristive devices exhibit distinct SET-to-RESET ratios, enabling the implementation of OR and NOT operations. As the adoption of the MAGIC design style gains momentum, it becomes crucial to understand the breakdown of energy consumption in the various phases of its operation. This paper presents experimental demonstrations of the OR and NOT gates on a 1T1R crossbar array. Additionally, it provides insights into the energy distribution for performing these operations at different stages. Through our experiments across different gates, we found that the energy consumption is dominated by initialization in the MAGIC design style. The energy split-up is 14.8%, 85%, and 0.2% for execution, initialization, and read operations, respectively.

Submitted 16 October, 2023; originally announced October 2023.

Comments: Accepted in VLSID 2024

arXiv:2309.04868 [pdf, other]

MemSPICE: Automated Simulation and Energy Estimation Framework for MAGIC-Based Logic-in-Memory

Authors: Simranjeet Singh, Chandan Kumar Jha, Ankit Bende, Vikas Rana, Sachin Patkar, Rolf Drechsler, Farhad Merchant

Abstract: Existing logic-in-memory (LiM) research is limited to generating mappings and micro-operations. In this paper, we present MemSPICE, a novel framework that addresses this gap by automatically generating both the netlist and testbench needed to evaluate the LiM on a memristive crossbar. MemSPICE goes beyond conventional approaches by providing energy estimation scripts to calculate the precise energy consumption of the testbench at the SPICE level. We propose an automated framework that utilizes the mapping obtained from the SIMPLER tool to perform accurate energy estimation through SPICE simulations. To the best of our knowledge, no existing framework is capable of generating a SPICE netlist from a hardware description language. By offering a comprehensive solution for SPICE-based netlist generation, testbench creation, and accurate energy estimation, MemSPICE empowers researchers and engineers working on memristor-based LiM to enhance their understanding and optimization of energy usage in these systems. Finally, we tested the circuits from the ISCAS'85 benchmark on MemSPICE and conducted a detailed energy analysis.

Submitted 9 September, 2023; originally announced September 2023.

Comments: Accepted in ASP-DAC 2024

arXiv:2307.16352 [pdf, other]

Semi-Quantitative Group Testing for Efficient and Accurate qPCR Screening of Pathogens with a Wide Range of Loads

Authors: Ananthan Nambiar, Chao Pan, Vishal Rana, Mahdi Cheraghchi, João Ribeiro, Sergei Maslov, Olgica Milenkovic

Abstract: Pathogenic infections pose a significant threat to global health, affecting millions of people every year and presenting substantial challenges to healthcare systems worldwide. Efficient and timely testing plays a critical role in disease control and transmission prevention. Group testing is a well-established method for reducing the number of tests needed to screen large populations when the disease prevalence is low. However, it does not fully utilize the quantitative information provided by qPCR methods, nor is it able to accommodate a wide range of pathogen loads. To address these issues, we introduce a novel adaptive semi-quantitative group testing (SQGT) scheme to efficiently screen populations via two-stage qPCR testing. The SQGT method quantizes cycle threshold ($Ct$) values into multiple bins, leveraging the information from the first stage of screening to improve the detection sensitivity. Dynamic $Ct$ threshold adjustments mitigate dilution effects and enhance test accuracy. Comparisons with traditional binary outcome GT methods show that SQGT reduces the number of tests by $24$% while maintaining a negligible false negative rate.

Submitted 2 August, 2023; v1 submitted 30 July, 2023; originally announced July 2023.

Comments: Corrected a misspelled name in the author list on page 1

arXiv:2307.03669 [pdf, other]

Should We Even Optimize for Execution Energy? Rethinking Mapping for MAGIC Design Style

Authors: Simranjeet Singh, Chandan Kumar Jha, Ankit Bende, Phrangboklang Lyngton Thangkhiew, Vikas Rana, Sachin Patkar, Rolf Drechsler, Farhad Merchant

Abstract: Memristor-based logic-in-memory (LiM) has become popular as a means to overcome the von Neumann bottleneck in traditional data-intensive computing. Recently, the memristor-aided logic (MAGIC) design style has gained immense traction for LiM due to its simplicity. However, understanding the energy distribution during the design of logic operations within the memristive memory is crucial in assessing such an implementation's significance. The current energy estimation methods rely on coarse-grained techniques, which underestimate the energy consumption of MAGIC-styled operations performed on a memristor crossbar. To address this issue, we analyze the energy breakdown in MAGIC operations and propose a solution that utilizes mapping from the SIMPLER MAGIC tool to achieve accurate energy estimation through SPICE simulations. In contrast to existing research that primarily focuses on optimizing execution energy, our findings reveal that the memristor's initialization energy in the MAGIC design style is, on average, 68x higher. We demonstrate that this initialization energy significantly dominates the overall energy consumption. By highlighting this aspect, we aim to redirect the attention of designers towards developing algorithms and strategies that prioritize optimizations in initializations rather than execution for more effective energy savings.

Submitted 7 July, 2023; originally announced July 2023.

Comments: Accepted for publication in IEEE Embedded Systems Letters

arXiv:2304.13552 [pdf, other]

Finite State Automata Design using 1T1R ReRAM Crossbar

Authors: Simranjeet Singh, Omar Ghazal, Chandan Kumar Jha, Vikas Rana, Rolf Drechsler, Rishad Shafik, Alex Yakovlev, Sachin Patkar, Farhad Merchant

Abstract: Data movement costs constitute a significant bottleneck in modern machine learning (ML) systems. When combined with the computational complexity of algorithms, such as neural networks, designing hardware accelerators with a low energy footprint remains challenging. Finite state automata (FSA) constitute a type of computation model used as a low-complexity learning unit in ML systems. The implementation of an FSA consists of a number of memory states. However, an FSA can be in only one of these states at a given time. It switches to another state based on the present state and the input to the FSA. Due to its natural synergy with memory, it is a promising candidate for in-memory computing for reduced data movement costs. This work focuses on a novel FSA implementation using resistive RAM (ReRAM) for state storage in series with a CMOS transistor for biasing controls. We propose using multi-level ReRAM technology capable of transitioning between states depending on bias pulse amplitude and duration. We use an asynchronous control circuit for writing each ReRAM-transistor cell for the on-demand switching of the FSA. We investigate the impact of the device-to-device and cycle-to-cycle variations on the cell and show that FSA transitions can be seamlessly achieved without degradation of performance. Through extensive experimental evaluation, we demonstrate the implementation of FSA on a 1T1R ReRAM crossbar.

Submitted 30 June, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

Comments: Accepted by 21st IEEE Interregional NEWCAS Conference 2023 (NEWCAS 2023)

arXiv:2304.13531 [pdf, other]

Integrated Architecture for Neural Networks and Security Primitives using RRAM Crossbar

Authors: Simranjeet Singh, Furqan Zahoor, Gokulnath Rajendran, Vikas Rana, Sachin Patkar, Anupam Chattopadhyay, Farhad Merchant

Abstract: This paper proposes an architecture that integrates neural networks (NNs) and hardware security modules using a single resistive random access memory (RRAM) crossbar. The proposed architecture enables using a single crossbar to implement NN, true random number generator (TRNG), and physical unclonable function (PUF) applications while exploiting the multi-state storage characteristic of the RRAM crossbar for the vector-matrix multiplication operation required for the implementation of NN. The TRNG is implemented by utilizing the crossbar's variation in device switching thresholds to generate random bits. The PUF is implemented using the same crossbar initialized as an entropy source for the TRNG. Additionally, the weights locking concept is introduced to enhance the security of NNs by preventing unauthorized access to the NN weights. The proposed architecture provides flexibility to configure the RRAM device in multiple modes to suit different applications. It shows promise in achieving a more efficient and compact design for the hardware implementation of NNs and security primitives.

Submitted 1 May, 2023; v1 submitted 26 April, 2023; originally announced April 2023.

arXiv:2210.16424 [pdf, other]

Machine Unlearning of Federated Clusters

Authors: Chao Pan, Jin Sima, Saurav Prakash, Vishal Rana, Olgica Milenkovic

Abstract: Federated clustering (FC) is an unsupervised learning problem that arises in a number of practical applications, including personalized recommender and healthcare systems. With the adoption of recent laws ensuring the "right to be forgotten", the problem of machine unlearning for FC methods has become of significant importance. We introduce, for the first time, the problem of machine unlearning for FC, and propose an efficient unlearning mechanism for a customized secure FC framework. Our FC framework utilizes special initialization procedures that we show are well-suited for unlearning. To protect client data privacy, we develop the secure compressed multiset aggregation (SCMA) framework that addresses sparse secure federated learning (FL) problems encountered during clustering as well as more general problems. To simultaneously facilitate low communication complexity and secret sharing protocols, we integrate Reed-Solomon encoding with special evaluation points into our SCMA pipeline, and prove that the client communication cost is logarithmic in the vector dimension. Additionally, to demonstrate the benefits of our unlearning mechanism over complete retraining, we provide a theoretical analysis for the unlearning performance of our approach. Simulation results show that the new FC framework exhibits superior clustering performance compared to previously reported FC baselines when the cluster sizes are highly imbalanced. Compared to completely retraining K-means++ locally and globally for each removal request, our unlearning procedure offers an average speed-up of roughly 84x across seven datasets. Our implementation for the proposed method is available at https://github.com/thupchnsky/mufc.

Submitted 30 June, 2023; v1 submitted 28 October, 2022; originally announced October 2022.

Comments: 27 pages. ICLR 2023

arXiv:2206.03477 [pdf, other]

Short Blocklength Wiretap Channel Codes via Deep Learning: Design and Performance Evaluation

Authors: Vidhi Rana, Remi A. Chou

Abstract: We design short blocklength codes for the Gaussian wiretap channel under information-theoretic security guarantees. Our approach consists in decoupling the reliability and secrecy constraints in our code design. Specifically, we handle the reliability constraint via an autoencoder, and handle the secrecy constraint with hash functions. For blocklengths smaller than or equal to 128, we evaluate through simulations the probability of error at the legitimate receiver and the leakage at the eavesdropper for our code construction. This leakage is defined as the mutual information between the confidential message and the eavesdropper's channel observations, and is empirically measured via a neural network-based mutual information estimator. Our simulation results provide examples of codes with positive secrecy rates that outperform the best known achievable secrecy rates obtained non-constructively for the Gaussian wiretap channel. Additionally, we show that our code design is suitable for the compound and arbitrarily varying Gaussian wiretap channels, for which the channel statistics are not perfectly known but only known to belong to a pre-specified uncertainty set. These models not only capture uncertainty related to channel statistics estimation, but also scenarios where the eavesdropper jams the legitimate transmission or influences its own channel statistics by changing its location.

Submitted 23 January, 2023; v1 submitted 7 June, 2022; originally announced June 2022.

arXiv:2112.01087 [pdf, ps, other]

doi 10.23919/DATE54114.2022.9774651

NeuroHammer: Inducing Bit-Flips in Memristive Crossbar Memories

Authors: Felix Staudigl, Hazem Al Indari, Daniel Schön, Dominik Sisejkovic, Farhad Merchant, Jan Moritz Joseph, Vikas Rana, Stephan Menzel, Rainer Leupers

Abstract: Emerging non-volatile memory (NVM) technologies offer unique advantages in energy efficiency, latency, and features such as computing-in-memory. Consequently, emerging NVM technologies are considered an ideal substrate for computation and storage in future-generation neuromorphic platforms. These technologies need to be evaluated for fundamental reliability and security issues. In this paper, we present NeuroHammer, a security threat in ReRAM crossbars caused by thermal crosstalk between memory cells. We demonstrate that bit-flips can be deliberately induced in ReRAM devices in a crossbar by systematically writing adjacent memory cells. A simulation flow is developed to evaluate NeuroHammer and the impact of physical parameters on the effectiveness of the attack. Finally, we discuss the security implications in the context of possible attack scenarios.

Submitted 6 December, 2021; v1 submitted 2 December, 2021; originally announced December 2021.

arXiv:2010.07536 [pdf, other]

Secret Sharing from Correlated Gaussian Random Variables and Public Communication

Authors: Vidhi Rana, Remi A. Chou, Hyuck Kwon

Abstract: In this paper, we study an information-theoretic secret sharing problem, where a dealer distributes shares of a secret among a set of participants under the following constraints: (i) authorized sets of users can recover the secret by pooling their shares, and (ii) non-authorized sets of colluding users cannot learn any information about the secret. We assume that the dealer and participants observe the realizations of correlated Gaussian random variables and that the dealer can communicate with participants through a one-way, authenticated, rate-limited, and public channel. Unlike traditional secret sharing protocols, in our setting, no perfectly secure channel is needed between the dealer and the participants. Our main result is a closed-form characterization of the fundamental trade-off between secret rate and public communication rate.

Submitted 11 November, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

Comments: 11 pages, two-column, 3 figures, accepted to IEEE Transactions on Information Theory

arXiv:1710.09409 [pdf, other]

doi 10.1051/epjconf/201817509006

Performance Portability Strategies for Grid C++ Expression Templates

Authors: Peter A. Boyle, M. A. Clark, Carleton DeTar, Meifeng Lin, Verinder Rana, Alejandro Vaquero Avilés-Casco

Abstract: One of the key requirements for the Lattice QCD Application Development as part of the US Exascale Computing Project is performance portability across multiple architectures. Using the Grid C++ expression template as a starting point, we report on the progress made with regard to the Grid GPU offloading strategies. We present both the successes and issues encountered in using CUDA, OpenACC and Just-In-Time compilation. Experimentation and performance on GPUs with an SU(3)$\times$SU(3) streaming test will be reported. We will also report on the challenges of using current OpenMP 4.x for GPU offloading in the same code.

Submitted 25 October, 2017; originally announced October 2017.

Comments: 8 pages, 4 figures. Talk presented at the 35th International Symposium on Lattice Field Theory, 18-24 June 2017, Granada, Spain

Showing 1–14 of 14 results for author: Rana, V