Search | arXiv e-print repository

Enhancing K-user Interference Alignment for Discrete Constellations via Learning

Authors: Rajesh Mishra, Syed Jafar, Sriram Vishwanath, Hyeji Kim

Abstract: In this paper, we consider a K-user interference channel where interference among the users is neither too strong nor too weak, a scenario that is relatively underexplored in the literature. We propose a novel deep learning-based approach to design the encoder and decoder functions that aim to maximize the sum-rate of the interference channel for discrete constellations. We first consider the MaxSINR algorithm, a state-of-the-art linear scheme for Gaussian inputs, as the baseline and then propose a modified version of the algorithm for discrete inputs. We then propose a neural network-based approach that learns a constellation mapping with the objective of maximizing the sum-rate. We provide numerical results to show that the constellations learned by the neural network-based approach provide enhanced alignments, not just in beamforming directions but also in terms of the effective constellation at the receiver, thereby leading to improved sum-rate performance.

Submitted 21 July, 2024; originally announced July 2024.

arXiv:2406.17688 [pdf, other]

Unified Auto-Encoding with Masked Diffusion

Authors: Philippe Hansen-Estruch, Sriram Vishwanath, Amy Zhang, Manan Tomar

Abstract: At the core of both successful generative and self-supervised representation learning models there is a reconstruction objective that incorporates some form of image corruption. Diffusion models implement this approach through a scheduled Gaussian corruption process, while masked auto-encoder models do so by masking patches of the image. Despite their different approaches, the underlying similarity in their methodologies suggests a promising avenue for an auto-encoder capable of both de-noising tasks. We propose a unified self-supervised objective, dubbed Unified Masked Diffusion (UMD), that combines patch-based and noise-based corruption techniques within a single auto-encoding framework. Specifically, UMD modifies the diffusion transformer (DiT) training process by introducing an additional noise-free, high masking representation step in the diffusion noising schedule, and utilizes a mixed masked and noised image for subsequent timesteps. By integrating features useful for diffusion modeling and for predicting masked patch tokens, UMD achieves strong performance in downstream generative and representation learning tasks, including linear probing and class-conditional generation. This is achieved without the need for heavy data augmentations, multiple views, or additional encoders. Furthermore, UMD improves over the computational efficiency of prior diffusion based methods in total training time. We release our code at https://github.com/philippe-eecs/small-vision.

Submitted 25 June, 2024; originally announced June 2024.

Comments: 19 Pages, 8 Figures, 3 Tables

ACM Class: I.2.10

arXiv:2406.14657 [pdf, other]

OpenDebateEvidence: A Massive-Scale Argument Mining and Summarization Dataset

Authors: Allen Roush, Yusuf Shabazz, Arvind Balaji, Peter Zhang, Stefano Mezza, Markus Zhang, Sanjay Basu, Sriram Vishwanath, Mehdi Fatemi, Ravid Shwartz-Ziv

Abstract: We introduce OpenDebateEvidence, a comprehensive dataset for argument mining and summarization sourced from the American Competitive Debate community. This dataset includes over 3.5 million documents with rich metadata, making it one of the most extensive collections of debate evidence. OpenDebateEvidence captures the complexity of arguments in high school and college debates, providing valuable resources for training and evaluation. Our extensive experiments demonstrate the efficacy of fine-tuning state-of-the-art large language models for argumentative abstractive summarization across various methods, models, and datasets. By providing this comprehensive resource, we aim to advance computational argumentation and support practical applications for debaters, educators, and researchers. OpenDebateEvidence is publicly available to support further research and innovation in computational argumentation. Access it here: https://huggingface.co/datasets/Yusuf5/OpenCaselist

Submitted 5 July, 2024; v1 submitted 20 June, 2024; originally announced June 2024.

Comments: Accepted for Publication to ARGMIN 2024 at ACL2024

arXiv:2403.04607 [pdf, other]

Repelling-Attracting Hamiltonian Monte Carlo

Authors: Siddharth Vishwanath, Hyungsuk Tak

Abstract: We propose a variant of Hamiltonian Monte Carlo (HMC), called the Repelling-Attracting Hamiltonian Monte Carlo (RAHMC), for sampling from multimodal distributions. The key idea that underpins RAHMC is a departure from the conservative dynamics of Hamiltonian systems, which form the basis of traditional HMC, and turning instead to the dissipative dynamics of conformal Hamiltonian systems. In particular, RAHMC involves two stages: a mode-repelling stage, which encourages the sampler to move away from regions of high probability density; and a mode-attracting stage, which helps the sampler find and settle near alternative modes. We achieve this by introducing just one additional tuning parameter -- the coefficient of friction. The proposed method adapts to the geometry of the target distribution, e.g., modes and density ridges, and can generate proposals that cross low-probability barriers with little to no computational overhead in comparison to traditional HMC. Notably, RAHMC requires no additional information about the target distribution or memory of previously visited modes. We establish the theoretical basis for RAHMC, and we discuss repelling-attracting extensions to several variants of HMC in the literature. Finally, we provide a tuning-free implementation via dual-averaging, and we demonstrate its effectiveness in sampling from both multimodal and unimodal distributions in high dimensions.

Submitted 7 March, 2024; originally announced March 2024.

Comments: 41 pages, 10 figures, 4 tables

MSC Class: 62-08

arXiv:2402.05132 [pdf, other]

TexShape: Information Theoretic Sentence Embedding for Language Models

Authors: Kaan Kale, Homa Esfahanizadeh, Noel Elias, Oguzhan Baser, Muriel Medard, Sriram Vishwanath

Abstract: With the exponential growth in data volume and the emergence of data-intensive applications, particularly in the field of machine learning, concerns related to resource utilization, privacy, and fairness have become paramount. This paper focuses on the textual domain of data and addresses challenges regarding encoding sentences to their optimized representations through the lens of information theory. In particular, we use empirical estimates of mutual information, using the Donsker-Varadhan definition of Kullback-Leibler divergence. Our approach leverages this estimation to train an information-theoretic sentence embedding, called TexShape, for (task-based) data compression or for filtering out sensitive information, enhancing privacy and fairness. In this study, we employ a benchmark language model for initial text representation, complemented by neural networks for information-theoretic compression and mutual information estimations. Our experiments demonstrate significant advancements in preserving maximal targeted information and minimal sensitive information over adverse compression ratios, in terms of predictive accuracy of downstream models that are trained using the compressed data.

Submitted 11 May, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

Comments: Submitted to the 2024 IEEE International Symposium on Information Theory

arXiv:2312.07709 [pdf, other]

Majority is Not Required: A Rational Analysis of the Private Double-Spend Attack from a Sub-Majority Adversary

Authors: Yanni Georghiades, Rajesh Mishra, Karl Kreder, Sriram Vishwanath

Abstract: We study the incentives behind double-spend attacks on Nakamoto-style Proof-of-Work cryptocurrencies. In these systems, miners are allowed to choose which transactions to reference with their block, and a common strategy for selecting transactions is to simply choose those with the highest fees. This can be problematic if these transactions originate from an adversary with substantial (but less than 50%) computational power, as high-value transactions can present an incentive for a rational adversary to attempt a double-spend attack if they expect to profit. The most common mechanism for deterring double-spend attacks is for the recipients of large transactions to wait for additional block confirmations (i.e., to increase the attack cost). We argue that this defense mechanism is not satisfactory, as the security of the system is contingent on the actions of its users. Instead, we propose that defending against double-spend attacks should be the responsibility of the miners; specifically, miners should limit the amount of transaction value they include in a block (i.e., reduce the attack reward). To this end, we model cryptocurrency mining as a mean-field game in which we augment the standard mining reward function to simulate the presence of a rational, double-spending adversary. We design and implement an algorithm which characterizes the behavior of miners at equilibrium, and we show that miners who use the adversary-aware reward function accumulate more wealth than those who do not. We show that the optimal strategy for honest miners is to limit the amount of value transferred by each block such that the adversary's expected profit is 0. Additionally, we examine Bitcoin's resilience to double-spend attacks. Assuming a 6 block confirmation time, we find that an attacker with at least 25% of the network mining power can expect to profit from a double-spend attack.

Submitted 12 December, 2023; originally announced December 2023.

arXiv:2310.10900 [pdf, other]

Stability of Sequential Lateration and of Stress Minimization in the Presence of Noise

Authors: Ery Arias-Castro, Siddharth Vishwanath

Abstract: Sequential lateration is a class of methods for multidimensional scaling where a suitable subset of nodes is first embedded by some method, e.g., a clique embedded by classical scaling, and then the remaining nodes are recursively embedded by lateration. A graph is a lateration graph when it can be embedded by such a procedure. We provide a stability result for a particular variant of sequential lateration. We do so in a setting where the dissimilarities represent noisy Euclidean distances between nodes in a geometric lateration graph. We then deduce, as a corollary, a perturbation bound for stress minimization. To argue that our setting applies broadly, we show that a (large) random geometric graph is a lateration graph with high probability under mild conditions, extending a previous result of Aspnes et al. (2006).

Submitted 26 March, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2207.07218

arXiv:2309.16878 [pdf, other]

Investigating Human-Identifiable Features Hidden in Adversarial Perturbations

Authors: Dennis Y. Menn, Tzu-hsun Feng, Sriram Vishwanath, Hung-yi Lee

Abstract: Neural networks perform exceedingly well across various machine learning tasks but are not immune to adversarial perturbations. This vulnerability has implications for real-world applications. While much research has been conducted, the underlying reasons why neural networks fall prey to adversarial attacks are not yet fully understood. Central to our study, which explores up to five attack algorithms across three datasets, is the identification of human-identifiable features in adversarial perturbations. Additionally, we uncover two distinct effects manifesting within human-identifiable features. Specifically, the masking effect is prominent in untargeted attacks, while the generation effect is more common in targeted attacks. Using pixel-level annotations, we extract such features and demonstrate their ability to compromise target models. In addition, our findings indicate a notable extent of similarity in perturbations across different attack algorithms when averaged over multiple models. This work also provides insights into phenomena associated with adversarial perturbations, such as transferability and model interpretability. Our study contributes to a deeper understanding of the underlying mechanisms behind adversarial attacks and offers insights for the development of more resilient defense strategies for neural networks.

Submitted 28 September, 2023; originally announced September 2023.

arXiv:2304.05354 [pdf, other]

iDML: Incentivized Decentralized Machine Learning

Authors: Haoxiang Yu, Hsiao-Yuan Chen, Sangsu Lee, Sriram Vishwanath, Xi Zheng, Christine Julien

Abstract: With the rising emergence of decentralized and opportunistic approaches to machine learning, end devices are increasingly tasked with training deep learning models on-device using crowd-sourced data that they collect themselves. These approaches are desirable from a resource consumption perspective and also from a privacy preservation perspective. When the devices benefit directly from the trained models, the incentives are implicit - contributing devices' resources are incentivized by the availability of the higher-accuracy model that results from collaboration. However, explicit incentive mechanisms must be provided when end-user devices are asked to contribute their resources (e.g., computation, communication, and data) to a task performed primarily for the benefit of others, e.g., training a model for a task that a neighbor device needs but the device owner is uninterested in. In this project, we propose a novel blockchain-based incentive mechanism for completely decentralized and opportunistic learning architectures. We leverage a smart contract not only for providing explicit incentives to end devices to participate in decentralized learning but also to create a fully decentralized mechanism to inspect and reflect on the behavior of the learning architecture.

Submitted 10 April, 2023; originally announced April 2023.

arXiv:2210.12881 [pdf, other]

A Control Theoretic Approach to Infrastructure-Centric Blockchain Tokenomics

Authors: Oguzhan Akcin, Robert P. Streit, Benjamin Oommen, Sriram Vishwanath, Sandeep Chinchali

Abstract: There are a multitude of Blockchain-based physical infrastructure systems, operating on a crypto-currency enabled token economy, where infrastructure suppliers are rewarded with tokens for enabling, validating, managing and/or securing the system. However, today's token economies are largely designed without infrastructure systems in mind, and often operate with a fixed token supply (e.g., Bitcoin). This paper argues that token economies for infrastructure networks should be structured differently - they should continually incentivize new suppliers to join the network to provide services and support to the ecosystem. As such, the associated token rewards should gracefully scale with the size of the decentralized system, but should be carefully balanced with consumer demand to manage inflation and be designed to ultimately reach an equilibrium. To achieve such an equilibrium, the decentralized token economy should be adaptable and controllable so that it maximizes the total utility of all users, such as achieving stable (overall non-inflationary) token economies. Our main contribution is to model infrastructure token economies as dynamical systems - the circulating token supply, price, and consumer demand change as a function of the payment to nodes and costs to consumers for infrastructure services. Crucially, this dynamical systems view enables us to leverage tools from mathematical control theory to optimize the overall decentralized network's performance. Moreover, our model extends easily to a Stackelberg game between the controller and the nodes, which we use for robust, strategic pricing. In short, we develop predictive, optimization-based controllers that outperform traditional algorithmic stablecoin heuristics by up to 2.4× in simulations based on real demand data from existing decentralized wireless networks.

Submitted 23 October, 2022; originally announced October 2022.

arXiv:2210.08093 [pdf, other]

Spatial and Statistical Modeling of Multi-Panel Millimeter Wave Self-Interference

Authors: Ian P. Roberts, Aditya Chopra, Thomas Novlan, Sriram Vishwanath, Jeffrey G. Andrews

Abstract: Characterizing self-interference is essential to the design and evaluation of in-band full-duplex communication systems. Until now, little has been understood about this coupling in full-duplex systems operating at millimeter wave (mmWave) frequencies, and it has been shown that the highly-idealized models proposed for such do not align with practice. This work presents the first spatial and statistical model of mmWave self-interference backed by measurements, enabling engineers to draw realizations that exhibit the large-scale and small-scale spatial characteristics observed in our nearly 6.5 million measurements taken at 28 GHz. Core to our model is its use of system and model parameters having real-world meaning, which facilitates its extension to systems beyond our own phased array platform through proper parameterization. We demonstrate this by collecting nearly 13 million additional measurements to show that our model can generalize to two other system configurations. We assess our model by comparing it against actual measurements to confirm its ability to align spatially and in distribution with real-world self-interference. In addition, using both measurements and our model of self-interference, we evaluate an existing beamforming-based full-duplex mmWave solution to illustrate that our model can be reliably used to design new solutions and validate the performance improvements they may offer.

Submitted 4 March, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

arXiv:2207.07281 [pdf, other]

STEER: Beam Selection for Full-Duplex Millimeter Wave Communication Systems

Authors: Ian P. Roberts, Aditya Chopra, Thomas Novlan, Sriram Vishwanath, Jeffrey G. Andrews

Abstract: Modern millimeter wave (mmWave) communication systems rely on beam alignment to deliver sufficient beamforming gain to close the link between devices. We present a novel beam selection methodology for multi-panel, full-duplex mmWave systems, which we call STEER, that delivers high beamforming gain while significantly reducing the full-duplex self-interference coupled between the transmit and receive beams. STEER does not necessitate changes to conventional beam alignment methodologies nor additional over-the-air feedback, making it compatible with existing cellular standards. Instead, STEER uses conventional beam alignment to identify the general directions beams should be steered, and then it makes use of a minimal number of self-interference measurements to jointly select transmit and receive beams that deliver high gain in these directions while coupling low self-interference. We implement STEER on an industry-grade 28 GHz phased array platform and use further simulation to show that full-duplex operation with beams selected by STEER can notably outperform both half-duplex and full-duplex operation with beams chosen via conventional beam selection. For instance, STEER can reliably reduce self-interference by more than 20 dB and improve SINR by more than 10 dB, compared to conventional beam selection. Our experimental results highlight that beam alignment can be used not only to deliver high beamforming gain in full-duplex mmWave systems but also to mitigate self-interference to levels near or below the noise floor, rendering additional self-interference cancellation unnecessary with STEER.

Submitted 15 July, 2022; originally announced July 2022.

arXiv:2207.05869 [pdf, other]

Achieving Almost All Blockchain Functionalities with Polylogarithmic Storage

Authors: Parikshit Hegde, Robert Streit, Yanni Georghiades, Chaya Ganesh, Sriram Vishwanath

Abstract: In current blockchain systems, full nodes that perform all of the available functionalities need to store the entire blockchain. In addition to the blockchain, full nodes also store a blockchain-summary, called the state, which is used to efficiently verify transactions. With the size of popular blockchains and their states growing rapidly, full nodes require massive storage resources in order to keep up with the scaling. This leads to a tug-of-war between scaling and decentralization since fewer entities can afford expensive resources. We present hybrid nodes for proof-of-work (PoW) cryptocurrencies which can validate transactions, validate blocks, validate states, mine, select the main chain, bootstrap new hybrid nodes, and verify payment proofs. With the use of a protocol called trimming, hybrid nodes retain only a number of blocks that is polylogarithmic in the chain length in order to represent the proof-of-work of the blockchain. Hybrid nodes are also optimized for the storage of the state with the use of stateless blockchain protocols. The lowered storage requirements should enable more entities to join as hybrid nodes and improve the decentralization of the system. We define novel theoretical security models for hybrid nodes and show that they are provably secure. We also show that the storage requirement of hybrid nodes is near-optimal with respect to our security definitions.

Submitted 12 July, 2022; originally announced July 2022.

arXiv:2206.11418 [pdf, other]

LoneSTAR: Analog Beamforming Codebooks for Full-Duplex Millimeter Wave Systems

Authors: Ian P. Roberts, Sriram Vishwanath, Jeffrey G. Andrews

Abstract: This work develops LoneSTAR, a novel enabler of full-duplex millimeter wave (mmWave) communication systems through the design of analog beamforming codebooks. LoneSTAR codebooks deliver high beamforming gain and broad coverage while simultaneously reducing the self-interference coupled by transmit and receive beams at a full-duplex mmWave transceiver. Our design framework accomplishes this by tolerating some variability in transmit and receive beamforming gain to strategically shape beams that reject self-interference spatially while accounting for digitally-controlled analog beamforming networks and self-interference channel estimation error. By leveraging the coherence time of the self-interference channel, a mmWave system can use the same LoneSTAR design over many time slots to serve several downlink-uplink user pairs in a full-duplex fashion without the need for additional self-interference cancellation. Compared to those using conventional codebooks, full-duplex mmWave systems employing LoneSTAR codebooks can mitigate higher levels of self-interference, tolerate more cross-link interference, and demand lower SNRs in order to outperform half-duplex operation -- all while supporting beam alignment. This makes LoneSTAR a potential standalone solution for enabling simultaneous transmission and reception in mmWave systems, from which it derives its name.

Submitted 22 June, 2022; originally announced June 2022.

arXiv:2206.10137 [pdf, other]

Few-Max: Few-Shot Domain Adaptation for Unsupervised Contrastive Representation Learning

Authors: Ali Lotfi Rezaabad, Sidharth Kumar, Sriram Vishwanath, Jonathan I. Tamir

Abstract: Contrastive self-supervised learning methods learn to map data points such as images into non-parametric representation space without requiring labels. While highly successful, current methods require a large amount of data in the training phase. In situations where the target training set is limited in size, generalization is known to be poor. Pretraining on a large source data set and fine-tuning on the target samples is prone to overfitting in the few-shot regime, where only a small number of target samples are available. Motivated by this, we propose a domain adaptation method for self-supervised contrastive learning, termed Few-Max, to address the issue of adaptation to a target distribution under few-shot learning. To quantify the representation quality, we evaluate Few-Max on a range of source and target datasets, including ImageNet, VisDA, and fastMRI, on which Few-Max consistently outperforms other approaches.

Submitted 22 June, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

arXiv:2206.07816 [pdf, other]

Beamformed Self-Interference Measurements at 28 GHz: Spatial Insights and Angular Spread

Authors: Ian P. Roberts, Aditya Chopra, Thomas Novlan, Sriram Vishwanath, Jeffrey G. Andrews

Abstract: We present measurements and analysis of self-interference in multi-panel millimeter wave (mmWave) full-duplex communication systems at 28 GHz. In an anechoic chamber, we measure the self-interference power between the input of a transmitting phased array and the output of a colocated receiving phased array, each of which is electronically steered across a number of directions in azimuth and elevation. These self-interference power measurements shed light on the potential for a full-duplex communication system to successfully receive a desired signal while transmitting in-band. Our nearly 6.5 million measurements illustrate that more self-interference tends to be coupled when the transmitting and receiving phased arrays steer their beams toward one another but that slight shifts in steering direction (on the order of one degree) can lead to significant fluctuations in self-interference power. We analyze these measurements to characterize the spatial variability of self-interference to better quantify and statistically model this sensitivity. Our analyses and statistical results can be useful references when developing and evaluating mmWave full-duplex systems and motivate a variety of future topics including beam selection, beamforming codebook design, and self-interference channel modeling.

Submitted 15 June, 2022; originally announced June 2022.

arXiv:2206.01795 [pdf, other]

Robust Topological Inference in the Presence of Outliers

Authors: Siddharth Vishwanath, Bharath K. Sriperumbudur, Kenji Fukumizu, Satoshi Kuriki

Abstract: The distance function to a compact set plays a crucial role in the paradigm of topological data analysis. In particular, the sublevel sets of the distance function are used in the computation of persistent homology -- a backbone of the topological data analysis pipeline. Despite its stability to perturbations in the Hausdorff distance, persistent homology is highly sensitive to outliers. In this work, we develop a framework of statistical inference for persistent homology in the presence of outliers. Drawing inspiration from recent developments in robust statistics, we propose a $\textit{median-of-means}$ variant of the distance function ($\textsf{MoM Dist}$), and establish its statistical properties. In particular, we show that, even in the presence of outliers, the sublevel filtrations and weighted filtrations induced by $\textsf{MoM Dist}$ are both consistent estimators of the true underlying population counterpart, and their rates of convergence in the bottleneck metric are controlled by the fraction of outliers in the data. Finally, we demonstrate the advantages of the proposed methodology through simulations and applications.

Submitted 3 June, 2022; originally announced June 2022.

Comments: 50 pages, 10 figures

MSC Class: 62R40; 55N31; 68T09

arXiv:2201.01283 [pdf, other]

Self-supervised Learning from 100 Million Medical Images

Authors: Florin C. Ghesu, Bogdan Georgescu, Awais Mansoor, Youngjin Yoo, Dominik Neumann, Pragneshkumar Patel, R. S. Vishwanath, James M. Balter, Yue Cao, Sasa Grbic, Dorin Comaniciu

Abstract: Building accurate and robust artificial intelligence systems for medical image assessment requires not only the research and design of advanced deep learning models but also the creation of large and curated sets of annotated training examples. Constructing such datasets, however, is often very costly -- due to the complex nature of annotation tasks and the high level of expertise required for the interpretation of medical images (e.g., expert radiologists). To counter this limitation, we propose a method for self-supervised learning of rich image features based on contrastive learning and online feature clustering. For this purpose we leverage large training datasets of over 100,000,000 medical images of various modalities, including radiography, computed tomography (CT), magnetic resonance (MR) imaging and ultrasonography. We propose to use these features to guide model training in supervised and hybrid self-supervised/supervised regimes on various downstream tasks.
We highlight a number of advantages of this strategy on challenging image assessment problems in radiography, CT and MR: 1) Significant increase in accuracy compared to the state-of-the-art (e.g., AUC boost of 3-7% for detection of abnormalities from chest radiography scans and hemorrhage detection on brain CT); 2) Acceleration of model convergence during training by up to 85% compared to using no pretraining (e.g., 83% when training a model for detection of brain metastases in MR scans); 3) Increase in robustness to various image augmentations, such as intensity variations, rotations or scaling reflective of data variation seen in the field.

Submitted 4 January, 2022; originally announced January 2022.

arXiv:2112.11072 [pdf, other]

doi 10.1109/BLOCKCHAIN55522.2022.00072

Scalable Multi-Chain Coordination via the Hierarchical Longest Chain Rule

Authors: Yanni Georghiades, Karl Kreder, Jonathan Downing, Alan Orwick, Sriram Vishwanath

Abstract: This paper introduces BlockReduce, a Proof-of-Work (PoW) based blockchain system which achieves high transaction throughput through a hierarchy of merged mined blockchains, each operating in parallel on a partition of the overall application state. Most notably, the full PoW available within the network is applied to all blockchains in BlockReduce, and cross-blockchain state transitions are enabled seamlessly within the core protocol. This paper shows that, given a hierarchy of blockchains and its associated security model, the protocol scales superlinearly in transaction throughput with the number of blockchains operated by the protocol.

Submitted 27 December, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

Journal ref: 2022 IEEE International Conference on Blockchain (Blockchain)

arXiv:2111.12002 [pdf, other]

Armada: A Robust Latency-Sensitive Edge Cloud in Heterogeneous Edge-Dense Environments

Authors: Lei Huang, Zhiying Liang, Nikhil Sreekumar, Sumanth Kaushik Vishwanath, Cody Perakslis, Abhishek Chandra, Jon Weissman

Abstract: Edge computing has enabled a large set of emerging edge applications by exploiting data proximity and offloading latency-sensitive and computation-intensive workloads to nearby edge servers. However, supporting edge application users at scale in wide-area environments poses challenges due to limited point-of-presence edge sites and constrained elasticity. In this paper, we introduce Armada: a densely-distributed edge cloud infrastructure that explores the use of dedicated and volunteer resources to serve geo-distributed users in heterogeneous environments. We describe the lightweight Armada architecture and optimization techniques including performance-aware edge selection, auto-scaling and load balancing on the edge, fault tolerance, and in-situ data access. We evaluate Armada in both real-world volunteer environments and emulated platforms to show how common edge applications, namely real-time object detection and face recognition, can be easily deployed on Armada serving distributed users at scale with low latency.

Submitted 23 November, 2021; originally announced November 2021.

Comments: 13 pages, 13 figures

ACM Class: C.2.4; D.4.5; D.4.7

arXiv:2105.13450 [pdf, other]

Millimeter Wave Analog Beamforming Codebooks Robust to Self-Interference

Authors: Ian P. Roberts, Hardik B. Jain, Sriram Vishwanath, Jeffrey G. Andrews

Abstract: This paper develops a novel methodology for designing analog beamforming codebooks for full-duplex millimeter wave (mmWave) transceivers, the first such codebooks to the best of our knowledge. Our design reduces the self-interference coupled by transmit-receive beam pairs and simultaneously delivers high beamforming gain over desired coverage regions, allowing mmWave full-duplex systems to support beam alignment while minimizing self-interference. To do so, our methodology allows some variability in beamforming gain to strategically shape beams that reject self-interference while still having substantial gain. We present an algorithm for approximately solving our codebook design problem while accounting for the non-convexity posed by digitally-controlled phase shifters and attenuators. Numerical results suggest that our design can outperform or nearly match existing codebooks in sum spectral efficiency across a wide range of self-interference power levels. Results show that our design offers an extra 20-50 dB of robustness to self-interference, depending on hardware constraints.

Submitted 27 May, 2021; originally announced May 2021.

arXiv:2105.08892 [pdf, other]

A Phase Transition in Large Network Games

Authors: Abhishek Shende, Deepanshu Vasal, Sriram Vishwanath

Abstract: In this paper, we use a model of a large random network game, where the agents play selfishly and are affected by their neighbors, to explore the conditions under which the Nash equilibrium (NE) of the game is affected by a perturbation in the network. We use a phase transition phenomenon observed in finite rank deformations of large random matrices to study how the NE changes on crossing critical threshold points. Our main contribution is as follows: when the perturbation strength is greater than a critical point, it impacts the NE of the game, whereas when this perturbation is below this critical point, the NE remains independent of the perturbation parameter. This demonstrates a phase transition in the NE, which suggests that perturbations can affect the behavior of the society only if their strength is above a critical threshold. We provide numerical examples for this result and present scenarios under which this phenomenon could potentially occur in real-world applications.

Submitted 18 May, 2021; originally announced May 2021.

arXiv:2103.02087 [pdf, other]

Deep J-Sense: Accelerated MRI Reconstruction via Unrolled Alternating Optimization

Authors: Marius Arvinte, Sriram Vishwanath, Ahmed H. Tewfik, Jonathan I. Tamir

Abstract: Accelerated multi-coil magnetic resonance imaging reconstruction has seen a substantial recent improvement combining compressed sensing with deep learning. However, most of these methods rely on estimates of the coil sensitivity profiles, or on calibration data for estimating model parameters. Prior work has shown that these methods degrade in performance when the quality of these estimators is poor or when the scan parameters differ from the training conditions. Here we introduce Deep J-Sense as a deep learning approach that builds on unrolled alternating minimization and increases robustness: our algorithm refines both the magnetization (image) kernel and the coil sensitivity maps. Experimental results on a subset of the knee fastMRI dataset show that this increases reconstruction performance and provides a significant degree of robustness to varying acceleration factors and calibration region sizes.

Submitted 11 April, 2021; v1 submitted 2 March, 2021; originally announced March 2021.

arXiv:2012.14296 [pdf, other]

Network Design for Social Welfare

Authors: Abhishek Shende, Deepanshu Vasal, Sriram Vishwanath

Abstract: In this paper, we consider the problem of network design on network games. We study the conditions on the adjacency matrix of the underlying network to design a game such that the Nash equilibrium coincides with the social optimum. We provide examples of linear quadratic games that satisfy this condition. Furthermore, we identify conditions on properties of the adjacency matrix that provide a unique solution using a variational inequality formulation, and verify the robustness and continuity of the social cost under perturbations of the network. Finally, we comment on individual rationality and extension of our results to large random networked games.

Submitted 10 September, 2022; v1 submitted 23 December, 2020; originally announced December 2020.

arXiv:2012.12843 [pdf, other]

EQ-Net: A Unified Deep Learning Framework for Log-Likelihood Ratio Estimation and Quantization

Authors: Marius Arvinte, Ahmed H. Tewfik, Sriram Vishwanath

Abstract: In this work, we introduce EQ-Net: the first holistic framework that solves both the tasks of log-likelihood ratio (LLR) estimation and quantization using a data-driven method. We motivate our approach with theoretical insights on two practical estimation algorithms at the ends of the complexity spectrum and reveal a connection between the complexity of an algorithm and the information bottleneck method: simpler algorithms admit smaller bottlenecks when representing their solution. This motivates us to propose a two-stage algorithm that uses LLR compression as a pretext task for estimation and is focused on low-latency, high-performance implementations via deep neural networks. We carry out extensive experimental evaluation and demonstrate that our single architecture achieves state-of-the-art results on both tasks when compared to previous methods, with gains in quantization efficiency as high as $20\%$ and reduced estimation latency by up to $60\%$ when measured on general-purpose and graphics processing units (GPUs). In particular, our approach reduces the GPU inference latency by more than two times in several multiple-input multiple-output (MIMO) configurations. Finally, we demonstrate that our scheme is robust to distributional shifts and retains a significant part of its performance when evaluated on 5G channel models, as well as channel estimation errors.

Submitted 3 May, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

arXiv:2012.11647 [pdf, other]

Hybrid Beamforming for Millimeter Wave Full-Duplex under Limited Receive Dynamic Range

Authors: Ian P. Roberts, Jeffrey G. Andrews, Sriram Vishwanath

Abstract: Full-duplex millimeter wave (mmWave) communication has shown increasing promise for self-interference cancellation via hybrid precoding and combining. This paper proposes a novel mmWave multiple-input multiple-output (MIMO) design for configuring the analog and digital beamformers of a full-duplex transceiver. Our design is the first to holistically consider the key practical constraints of analog beamforming codebooks, a minimal number of radio frequency (RF) chains, limited channel knowledge, beam alignment, and a limited receive dynamic range. To prevent self-interference from saturating the receiver of a full-duplex device having limited dynamic range, our design addresses saturation on a per-antenna and per-RF chain basis. Numerical results evaluate our design in a variety of settings and validate the need to prevent receiver-side saturation. These results and the corresponding insights serve as useful design references for practical full-duplex mmWave transceivers.

Submitted 21 December, 2020; originally announced December 2020.

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2011.00194 [pdf, other]

Hyperbolic Graph Embedding with Enhanced Semi-Implicit Variational Inference

Authors: Ali Lotfi Rezaabad, Rahi Kalantari, Sriram Vishwanath, Mingyuan Zhou, Jonathan Tamir

Abstract: Efficient modeling of relational data arising in physical, social, and information sciences is challenging due to complicated dependencies within the data. In this work, we build off of semi-implicit graph variational auto-encoders to capture higher-order statistics in a low-dimensional graph latent representation. We incorporate hyperbolic geometry in the latent space through a Poincaré embedding to efficiently represent graphs exhibiting hierarchical structure. To address the naive posterior latent distribution assumptions in classical variational inference, we use semi-implicit hierarchical variational Bayes to implicitly capture posteriors of given graph data, which may exhibit heavy tails, multiple modes, skewness, and highly correlated latent structures. We show that the existing semi-implicit variational inference objective provably reduces information in the observed graph. Based on this observation, we estimate and add an additional mutual information term to the semi-implicit variational inference learning objective to capture rich correlations arising between the input and latent spaces. We show that the inclusion of this regularization term in conjunction with the Poincaré embedding boosts the quality of learned high-level representations and enables more flexible and faithful graphical modeling. We experimentally demonstrate that our approach outperforms existing graph variational auto-encoders both in Euclidean and in hyperbolic spaces for edge link prediction and node classification.

Submitted 10 March, 2021; v1 submitted 31 October, 2020; originally announced November 2020.

arXiv:2009.06048 [pdf, other]

Millimeter Wave Full-Duplex Radios: New Challenges and Techniques

Authors: Ian P. Roberts, Jeffrey G. Andrews, Hardik B. Jain, Sriram Vishwanath

Abstract: Equipping millimeter wave (mmWave) systems with full-duplex capability would accelerate and transform next-generation wireless applications and forge a path for new ones. Full-duplex mmWave transceivers could capitalize on the already attractive features of mmWave communication by supplying spectral efficiency gains and latency improvements while also affording future networks deployment solutions in the form of interference management and wireless backhaul. Foreseeable challenges and obstacles in making mmWave full-duplex a reality are presented in this article along with noteworthy unknowns warranting further investigation. With these novelties of mmWave full-duplex in mind, we lay out potential solutions (beyond active self-interference cancellation) that harness the spatial degrees of freedom bestowed by dense antenna arrays to enable simultaneous transmission and reception in-band.

Submitted 13 September, 2020; originally announced September 2020.

Comments: Submitted to the IEEE Wireless Communications Magazine: Special Issue on Full Duplex Communications

arXiv:2007.04779 [pdf, other]

Long Short-Term Memory Spiking Networks and Their Applications

Authors: Ali Lotfi Rezaabad, Sriram Vishwanath

Abstract: Recent advances in event-based neuromorphic systems have resulted in significant interest in the use and development of spiking neural networks (SNNs). However, the non-differentiable nature of spiking neurons makes SNNs incompatible with conventional backpropagation techniques. In spite of the significant progress made in training conventional deep neural networks (DNNs), training methods for SNNs still remain relatively poorly understood. In this paper, we present a novel framework for training recurrent SNNs. Analogous to the benefits presented by recurrent neural networks (RNNs) in learning time series models within DNNs, we develop SNNs based on long short-term memory (LSTM) networks. We show that LSTM spiking networks learn the timing of the spikes and temporal dependencies. We also develop a methodology for error backpropagation within LSTM-based SNNs. The developed architecture and method for backpropagation within LSTM-based SNNs enable them to learn long-term dependencies with comparable results to conventional LSTMs.

Submitted 9 July, 2020; originally announced July 2020.

arXiv:2007.04258 [pdf, other]

Quantifying and Leveraging Predictive Uncertainty for Medical Image Assessment

Authors: Florin C. Ghesu, Bogdan Georgescu, Awais Mansoor, Youngjin Yoo, Eli Gibson, R. S. Vishwanath, Abishek Balachandran, James M. Balter, Yue Cao, Ramandeep Singh, Subba R. Digumarthy, Mannudeep K. Kalra, Sasa Grbic, Dorin Comaniciu

Abstract: The interpretation of medical images is a challenging task, often complicated by the presence of artifacts, occlusions, limited contrast and more. Most notable is the case of chest radiography, where there is a high inter-rater variability in the detection and classification of abnormalities. This is largely due to inconclusive evidence in the data or subjective definitions of disease appearance. An additional example is the classification of anatomical views based on 2D ultrasound images. Often, the anatomical context captured in a frame is not sufficient to recognize the underlying anatomy. Current machine learning solutions for these problems are typically limited to providing probabilistic predictions, relying on the capacity of underlying models to adapt to limited information and the high degree of label noise. In practice, however, this leads to overconfident systems with poor generalization on unseen data. To account for this, we propose a system that learns not only the probabilistic estimate for classification, but also an explicit uncertainty measure which captures the confidence of the system in the predicted output. We argue that this approach is essential to account for the inherent ambiguity characteristic of medical images from different radiologic exams including computed radiography, ultrasonography and magnetic resonance imaging.
In our experiments we demonstrate that sample rejection based on the predicted uncertainty can significantly improve the ROC-AUC for various tasks, e.g., by 8% to 0.91 with an expected rejection rate of under 25% for the classification of different abnormalities in chest radiographs. In addition, we show that using uncertainty-driven bootstrapping to filter the training data, one can achieve a significant increase in robustness and accuracy.

Submitted 8 July, 2020; originally announced July 2020.

Comments: Under review at Medical Image Analysis

arXiv:2006.15189 [pdf, other]

Interpretable Factorization for Neural Network ECG Models

Authors: Christopher Snyder, Sriram Vishwanath

Abstract: The ability of deep learning (DL) to improve the practice of medicine and its clinical outcomes faces a looming obstacle: model interpretation. Without description of how outputs are generated, a collaborating physician can neither resolve when the model's conclusions are in conflict with his or her own, nor learn to anticipate model behavior. Current research aims to interpret networks that diagnose ECG recordings, which has great potential impact as recordings become more personalized and widely deployed. A generalizable impact beyond ECGs lies in the ability to provide a rich test-bed for the development of interpretive techniques in medicine. Interpretive techniques for Deep Neural Networks (DNNs), however, tend to be heuristic and observational in nature, lacking the mathematical rigor one might expect in the analysis of math equations. The motivation of this paper is to offer a third option, a scientific approach. We treat the model output itself as a phenomenon to be explained through component parts and equations governing their behavior. We argue that these component parts should also be "black boxes" -- additional targets to interpret heuristically with clear functional connection to the original. We show how to rigorously factor a DNN into a hierarchical equation consisting of black box variables. This is not a subdivision into physical parts, like an organism into its cells; it is but one choice of an equation into a collection of abstract functions.
Yet, for DNNs trained to identify normal ECG waveforms on PhysioNet 2017 Challenge data, we demonstrate this choice yields interpretable component models identified with visual composite sketches of ECG samples in corresponding input regions. Moreover, the recursion distills this interpretation: additional factorization of component black boxes corresponds to ECG partitions that are more morphologically pure.

Submitted 26 June, 2020; originally announced June 2020.

arXiv:2006.10012 [pdf, other]

Robust Persistence Diagrams using Reproducing Kernels

Authors: Siddharth Vishwanath, Kenji Fukumizu, Satoshi Kuriki, Bharath Sriperumbudur

Abstract: Persistent homology has become an important tool for extracting geometric and topological features from data, whose multi-scale features are summarized in a persistence diagram. From a statistical perspective, however, persistence diagrams are very sensitive to perturbations in the input space. In this work, we develop a framework for constructing robust persistence diagrams from superlevel filtrations of robust density estimators constructed using reproducing kernels. Using an analogue of the influence function on the space of persistence diagrams, we establish the proposed framework to be less sensitive to outliers. The robust persistence diagrams are shown to be consistent estimators in bottleneck distance, with the convergence rate controlled by the smoothness of the kernel. This, in turn, allows us to construct uniform confidence bands in the space of persistence diagrams. Finally, we demonstrate the superiority of the proposed approach on benchmark datasets.

Submitted 3 June, 2022; v1 submitted 17 June, 2020; originally announced June 2020.

MSC Class: 55N31; 62R40; 62G07; 46E22

arXiv:2006.03638 [pdf, other]

Robust Face Verification via Disentangled Representations

Authors: Marius Arvinte, Ahmed H. Tewfik, Sriram Vishwanath

Abstract: We introduce a robust algorithm for face verification, i.e., deciding whether two images are of the same person or not. Our approach is a novel take on the idea of using deep generative networks for adversarial robustness. We use the generative model during training as an online augmentation method instead of a test-time purifier that removes adversarial noise. Our architecture uses a contrastive loss term and a disentangled generative model to sample negative pairs. Instead of randomly pairing two real images, we pair an image with its class-modified counterpart while keeping its content (pose, head tilt, hair, etc.) intact. This enables us to efficiently sample hard negative pairs for the contrastive loss. We experimentally show that, when coupled with adversarial training, the proposed scheme converges with a weak inner solver and has a higher clean and robust accuracy than state-of-the-art methods when evaluated against white-box physical attacks.

Submitted 23 June, 2020; v1 submitted 5 June, 2020; originally announced June 2020.

Comments: Preprint

arXiv:2005.11853 [pdf, other]

Model-free Reinforcement Learning for Stochastic Stackelberg Security Games

Authors: Rajesh K Mishra, Deepanshu Vasal, Sriram Vishwanath

Abstract: In this paper, we consider a sequential stochastic Stackelberg game with two players, a leader and a follower. The follower has access to the state of the system while the leader does not. Assuming that the players act in their respective best interests, the follower's strategy is to play the best response to the leader's strategy. In such a scenario, the leader has the advantage of committing to a policy which maximizes its own returns given the knowledge that the follower is going to play the best response to its policy. Thus, both players converge to a pair of policies that form the Stackelberg equilibrium of the game. Recently, [1] provided a sequential decomposition algorithm to compute the Stackelberg equilibrium for such games, which allows for the computation of Markovian equilibrium policies in linear time as opposed to double exponential, as before. In this paper, we extend the idea to an MDP whose dynamics are not known to the players, to propose an RL algorithm based on Expected Sarsa that learns the Stackelberg equilibrium policy by simulating a model of the MDP. We use particle filters to estimate the belief update for a common agent which computes the optimal policy based on the information which is common to both the players. We present a security game example to illustrate the policy learned by our algorithm.

Submitted 24 May, 2020; originally announced May 2020.

arXiv:2004.02073 [pdf, other]

Model-free Reinforcement Learning for Non-stationary Mean Field Games

Authors: Rajesh K Mishra, Deepanshu Vasal, Sriram Vishwanath

Abstract: In this paper, we consider a finite horizon, non-stationary mean field game (MFG) with a large population of homogeneous players, sequentially making strategic decisions, where each player is affected by other players through an aggregate population state termed the mean field state. Each player has a private type that only it can observe, and a mean field population state representing the empirical distribution of other players' types, which is shared among all of them. Recently, the authors in [1] provided a sequential decomposition algorithm to compute the mean field equilibrium (MFE) for such games, which allows for the computation of equilibrium policies in linear time rather than exponential, as before. In this paper, we extend it to the case when state transitions are not known, to propose a reinforcement learning algorithm based on Expected Sarsa with a policy gradient approach that learns the MFE policy by learning the dynamics of the game simultaneously. We illustrate our results using a cyber-physical security example.

Submitted 4 April, 2020; originally announced April 2020.

Comments: 7 pages, 2 figures

arXiv:2003.11619 [pdf, other]

Deep Networks as Logical Circuits: Generalization and Interpretation

Authors: Christopher Snyder, Sriram Vishwanath

Abstract: Not only are Deep Neural Networks (DNNs) black box models, but also we frequently conceptualize them as such. We lack good interpretations of the mechanisms linking inputs to outputs. Therefore, we find it difficult to analyze in human-meaningful terms (1) what the network learned and (2) whether the network learned. We present a hierarchical decomposition of the DNN discrete classification map into logical (AND/OR) combinations of intermediate (True/False) classifiers of the input. Those classifiers that cannot be further decomposed, called atoms, are (interpretable) linear classifiers. Taken together, we obtain a logical circuit with linear classifier inputs that computes the same label as the DNN. This circuit does not structurally resemble the network architecture, and it may require many fewer parameters, depending on the configuration of weights. In these cases, we obtain simultaneously an interpretation and generalization bound (for the original DNN), connecting two fronts which have historically been investigated separately. Unlike compression techniques, our representation is. We motivate the utility of this perspective by studying DNNs in simple, controlled settings, where we obtain superior generalization bounds despite using only combinatorial information (e.g. no margin information). We demonstrate how to "open the black box" on the MNIST dataset. We show that the learned, internal, logical computations correspond to semantically meaningful (unlabeled) categories that allow DNN descriptions in plain English.
We improve the generalization of an already trained network by interpreting, diagnosing, and replacing components of the logical circuit that is the DNN.

Submitted 26 June, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

arXiv:2003.10185 [pdf, other]

Decentralized multi-agent reinforcement learning with shared actions

Authors: Rajesh K Mishra, Deepanshu Vasal, Sriram Vishwanath

Abstract: In this paper, we propose a novel model-free reinforcement learning algorithm to compute the optimal policies for a multi-agent system with $N$ cooperative agents where each agent privately observes its own private type and publicly observes each other's actions. The goal is to maximize their collective reward. The problem belongs to the broad class of decentralized control problems with partial information. We use the common agent approach wherein some fictitious common agent picks the best policy based on a belief on the current states of the agents. These beliefs are updated individually for each agent from their current belief and action histories. Updating belief states without knowledge of the system dynamics is a challenge. In this paper, we employ a particle filter called the bootstrap filter, distributed across agents, to update the belief. We provide a model-free reinforcement learning (RL) method for this multi-agent partially observable Markov decision process using the particle filter and sampled trajectories to estimate the optimal policies for the agents. We showcase our results with the help of a smart grid application where the users strive to reduce the collective cost of power for all the agents in the grid. Finally, we compare the performance of model-based and model-free implementations of the RL algorithm, establishing the effectiveness of the particle filter (PF) method.

Submitted 23 March, 2020; originally announced March 2020.

Comments: 9 Pages, Two column, 2 figures

arXiv:2002.12504 [pdf, other]

Detecting Patch Adversarial Attacks with Image Residuals

Authors: Marius Arvinte, Ahmed Tewfik, Sriram Vishwanath

Abstract: We introduce an adversarial sample detection algorithm based on image residuals, specifically designed to guard against patch-based attacks. The image residual is obtained as the difference between an input image and a denoised version of it, and a discriminator is trained to distinguish between clean and adversarial samples. More precisely, we use a wavelet domain algorithm for denoising images and demonstrate that the obtained residuals act as a digital fingerprint for adversarial attacks. To emulate the limitations of a physical adversary, we evaluate the performance of our approach against localized (patch-based) adversarial attacks, including in settings where the adversary has complete knowledge about the detection scheme. Our results show that the proposed detection method generalizes to previously unseen, stronger attacks and that it is able to reduce the success rate (conversely, increase the computational effort) of an adaptive attacker.

Submitted 2 March, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

arXiv:2002.02567 [pdf, other]

doi 10.1145/3392153

Stability and Scalability of Blockchain Systems

Authors: Aditya Gopalan, Abishek Sankararaman, Anwar Walid, Sriram Vishwanath

Abstract: The blockchain paradigm provides a mechanism for content dissemination and distributed consensus on Peer-to-Peer (P2P) networks. While this paradigm has been widely adopted in industry, it has not been carefully analyzed in terms of its network scaling with respect to the number of peers. Applications for blockchain systems, such as cryptocurrencies and IoT, require this form of network scaling. In this paper, we propose a new stochastic network model for a blockchain system. We identify a structural property called one-endedness, which we show to be desirable in any blockchain system as it is directly related to distributed consensus among the peers. We show that the stochastic stability of the network is sufficient for the one-endedness of a blockchain. We further establish that our model belongs to a class of network models, called monotone separable models. This allows us to establish upper and lower bounds on the stability region. The bounds on stability depend on the connectivity of the P2P network through its conductance and allow us to analyze the scalability of blockchain systems on large P2P networks. We verify our theoretical insights using both synthetic data and real data from the Bitcoin network.

Submitted 18 December, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

Comments: This is the revised version of the paper

MSC Class: 94A06 (Primary); 60H06 (Secondary)

ACM Class: C.4; G.3; H.4.3

Journal ref: Proc. ACM Meas. Anal. Comput. Syst. Vol. 4 No. 2 (2020) Article 35, pages 1-35

arXiv:2002.02127 [pdf, other]

Equipping Millimeter-Wave Full-Duplex with Analog Self-Interference Cancellation

Authors: Ian P. Roberts, Hardik B. Jain, Sriram Vishwanath

Abstract: There have been recent works on enabling in-band full-duplex operation using millimeter-wave (mmWave) transceivers. These works are based solely on creating sufficient isolation between a transceiver's transmitter and receiver via multiple-input multiple-output (MIMO) precoding and combining. In this work, we propose supplementing these beamforming strategies with analog self-interference cancellation (A-SIC). By leveraging A-SIC, a portion of the self-interference is cancelled without the need for beamforming, allowing for more optimal beamforming strategies to be used in serving users. We use simulation to demonstrate that even with finite resolution A-SIC solutions, there are significant gains to be had in sum spectral efficiency. With a single bit of A-SIC resolution, improvements over a beamforming-only design are present. With 8 bits of A-SIC resolution, our design nearly approaches that of ideal full-duplex operation. To the best of our knowledge, this is the first mmWave full-duplex design that combines both beamforming and A-SIC to achieve simultaneous transmission and reception in-band.

Submitted 6 February, 2020; originally announced February 2020.

arXiv:2001.09599 [pdf, other]

Achieving Multi-Port Memory Performance on Single-Port Memory with Coding Techniques

Authors: Hardik Jain, Matthew Edwards, Ethan Elenberg, Ankit Singh Rawat, Sriram Vishwanath

Abstract: Many performance critical systems today must rely on performance enhancements, such as multi-port memories, to keep up with the increasing demand of memory-access capacity. However, the large area footprints and complexity of existing multi-port memory designs limit their applicability. This paper explores a coding theoretic framework to address this problem. In particular, this paper introduces a framework to encode data across multiple single-port memory banks in order to algorithmically realize the functionality of multi-port memory. This paper proposes three code designs with significantly less storage overhead compared to the existing replication based emulations of multi-port memories. To further improve performance, we also demonstrate a memory controller design that utilizes redundancy across coded memory banks to more efficiently schedule read and write requests sent across multiple cores. Furthermore, guided by DRAM traces, the paper explores dynamic coding techniques to improve the efficiency of the coding based memory design. We then show significant performance improvements in critical word read and write latency in the proposed coded-memory design when compared to a traditional uncoded-memory design.

Submitted 27 January, 2020; originally announced January 2020.

Comments: 10 pages, 20 figures, ICICT 2020 conference

arXiv:2001.05633 [pdf, ps, other]

Master equation of discrete time graphon mean field games and teams

Authors: Deepanshu Vasal, Rajesh K Mishra, Sriram Vishwanath

Abstract: In this paper, we present a sequential decomposition algorithm, equivalent to the master equation, to compute the graphon mean field equilibrium (GMFE) of graphon mean field games (GMFGs) and the graphon optimal Markovian policies (GOMPs) of graphon mean field teams (GMFTs). We consider a large population of players sequentially making strategic decisions, where the actions of each player affect their neighbors, which is captured in a graph generated by a known graphon. Each player observes a private state and also common information in the form of a graphon mean-field population state, which represents the empirical networked distribution of other players' types. We consider non-stationary population state dynamics and present a novel backward recursive algorithm to compute both the GMFE and the GOMP, which depend on both a player's private type and the current (dynamic) population state determined through the graphon. Each step in computing the GMFE consists of solving a fixed-point equation, while computing the GOMP involves solving an optimization problem. We provide conditions on model parameters for which such a GMFE exists. Using this algorithm, we obtain the GMFE and GOMP for a specific security setup in cyber-physical systems for different graphons that capture the interactions between the nodes in the system.

Submitted 7 June, 2022; v1 submitted 15 January, 2020; originally announced January 2020.

Comments: 26 pages, 6 figures. arXiv admin note: text overlap with arXiv:1905.04154

arXiv:2001.00220 [pdf, other]

On the Limits of Topological Data Analysis for Statistical Inference

Authors: Siddharth Vishwanath, Kenji Fukumizu, Satoshi Kuriki, Bharath Sriperumbudur

Abstract: Topological data analysis has emerged as a powerful tool for extracting the metric, geometric and topological features underlying the data as a multi-resolution summary statistic, and has found applications in several areas where data arises from complex sources. In this paper, we examine the use of topological summary statistics through the lens of statistical inference. We investigate necessary and sufficient conditions under which valid statistical inference is possible using topological summary statistics. Additionally, we provide examples of models that demonstrate invariance with respect to topological summaries.

Submitted 15 February, 2024; v1 submitted 1 January, 2020; originally announced January 2020.

Comments: 36 pages, 9 figures

MSC Class: 62F30; 55N31; 62R40

arXiv:1912.13361 [pdf, other]

Learning Representations by Maximizing Mutual Information in Variational Autoencoders

Authors: Ali Lotfi Rezaabad, Sriram Vishwanath

Abstract: Variational autoencoders (VAEs) have ushered in a new era of unsupervised learning methods for complex distributions. Although these techniques are elegant in their approach, they are typically not useful for representation learning. In this work, we propose a simple yet powerful class of VAEs that simultaneously result in meaningful learned representations. Our solution is to combine traditional VAEs with mutual information maximization, with the goal of enhancing amortized inference in VAEs using information-theoretic techniques. We call this approach InfoMax-VAE, and such an approach can significantly boost the quality of learned high-level representations. We realize this through the explicit maximization of information measures associated with the representation. Using extensive experiments on varied datasets and setups, we show that InfoMax-VAE outperforms contemporary popular approaches, including Info-VAE and $β$-VAE.

Submitted 7 January, 2020; v1 submitted 21 December, 2019; originally announced December 2019.

arXiv:1911.11283 [pdf, other]

Enabling In-Band Coexistence of Millimeter-Wave Communication and Radar

Authors: Hardik B. Jain, Ian P. Roberts, Sriram Vishwanath

Abstract: The wide bandwidths available at millimeter-wave (mmWave) frequencies have offered exciting potential to wireless communication systems and radar alike. Communication systems can offer higher rates and support more users with mmWave bands while radar systems can benefit from higher resolution captures. This leads to the possibility that portions of mmWave spectrum will be occupied by both communication and radar (e.g., 60 GHz industrial, scientific, and medical (ISM) band). This potential coexistence motivates the work of this paper, in which we present a design that can enable simultaneous, in-band operation of a communication system and radar system across the same mmWave frequencies. To enable such a feat, we mitigate the interference that would otherwise be incurred by leveraging the numerous antennas offered in mmWave communication systems. Dense antenna arrays allow us to avoid interference spatially, even with the hybrid beamforming constraints often imposed by mmWave communication systems. Simulation shows that our design sufficiently enables simultaneous, in-band coexistence of a mmWave radar and communication system.

Submitted 25 November, 2019; originally announced November 2019.

arXiv:1910.11983 [pdf, other]

Frequency-Selective Beamforming Cancellation Design for Millimeter-Wave Full-Duplex

Authors: Ian P. Roberts, Hardik B. Jain, Sriram Vishwanath

Abstract: The wide bandwidths offered at millimeter-wave (mmWave) frequencies have made them an attractive choice for future wireless communication systems. Recent works have presented beamforming strategies for enabling in-band full-duplex (FD) capability at mmWave even under the constraints of hybrid beamforming, extending the exciting possibilities of next-generation wireless. Existing mmWave FD designs, however, do not consider frequency-selective mmWave channels. Wideband communication at mmWave suggests that frequency-selectivity will likely be of concern since communication channels will be on the order of hundreds of megahertz or more. This has motivated the work of this paper, in which we present a frequency-selective beamforming design to enable practical wideband mmWave FD applications. In our designs, we account for the challenges associated with hybrid analog/digital beamforming such as phase shifter resolution, a desirably low number of radio frequency (RF) chains, and the frequency-flat nature of analog beamformers. We use simulation to validate our work, which indicates that spectral efficiency gains can be achieved with our design by enabling simultaneous transmission and reception in-band.

Submitted 25 October, 2019; originally announced October 2019.

arXiv:1908.06505 [pdf, other]

doi 10.1109/GLOBECOM38437.2019.9013116

Beamforming Cancellation Design for Millimeter-Wave Full-Duplex

Authors: Ian P. Roberts, Sriram Vishwanath

Abstract: In recent years, there has been extensive research on millimeter-wave (mmWave) communication and on in-band full-duplex (FD) communication, but work on the combination of the two is relatively lacking. FD mmWave systems could offer increased spectral efficiency and decreased latency while also suggesting the redesign of existing mmWave applications. While FD technology has been well-explored for sub-6 GHz systems, the developed methods do not translate well to mmWave. This turns us to a method called beamforming cancellation (BFC), where the highly directional mmWave beams are steered to mitigate self-interference (SI) and enable simultaneous transmission and reception in-band. In this paper, we present BFC designs for two fully-connected hybrid beamforming scenarios, both of which sufficiently suppress the SI such that the sum spectral efficiency approaches that of an SI-free FD system. A simulation and its results are then used to verify our designs.

Submitted 30 June, 2020; v1 submitted 18 August, 2019; originally announced August 2019.

Comments: Conference paper presented at 2019 IEEE Global Communications Conference (GLOBECOM)

Journal ref: 2019 IEEE Global Communications Conference (GLOBECOM), Waikoloa, HI, USA, 2019, pp. 1-6

arXiv:1906.07849 [pdf, other]

Deep Learning-Based Quantization of L-Values for Gray-Coded Modulation

Authors: Marius Arvinte, Sriram Vishwanath, Ahmed H. Tewfik

Abstract: In this work, a deep learning-based quantization scheme for log-likelihood ratio (L-value) storage is introduced. We analyze the dependency between the average magnitude of different L-values from the same quadrature amplitude modulation (QAM) symbol and show they follow a consistent ordering. Based on this, we design a deep autoencoder that jointly compresses and separately reconstructs each L-value, allowing the use of a weighted loss function that aims to more accurately reconstruct low magnitude inputs. Our method is shown to be competitive with state-of-the-art maximum mutual information quantization schemes, reducing the required memory footprint by a ratio of up to two and a loss of performance smaller than 0.1 dB with less than two effective bits per L-value or smaller than 0.04 dB with 2.25 effective bits. We experimentally show that our proposed method is a universal compression scheme in the sense that after training on an LDPC-coded Rayleigh fading scenario we can reuse the same network without further training on other channel models and codes while preserving the same performance benefits.

Submitted 9 May, 2021; v1 submitted 18 June, 2019; originally announced June 2019.

Comments: Submitted to IEEE Globecom 2019

arXiv:1903.04656 [pdf, other]

Deep Log-Likelihood Ratio Quantization

Authors: Marius Arvinte, Ahmed H. Tewfik, Sriram Vishwanath

Abstract: In this work, a deep learning-based method for log-likelihood ratio (LLR) lossy compression and quantization is proposed, with emphasis on a single-input single-output uncorrelated fading communication setting. A deep autoencoder network is trained to compress, quantize and reconstruct the bit log-likelihood ratios corresponding to a single transmitted symbol. Specifically, the encoder maps to a latent space with dimension equal to the number of sufficient statistics required to recover the inputs - equal to three in this case - while the decoder aims to reconstruct a noisy version of the latent representation with the purpose of modeling quantization effects in a differentiable way. Simulation results show that, when applied to a standard rate-1/2 low-density parity-check (LDPC) code, a finite precision compression factor of nearly three times is achieved when storing an entire codeword, with an incurred loss of performance lower than 0.1 dB compared to straightforward scalar quantization of the log-likelihood ratios.

Submitted 9 May, 2021; v1 submitted 11 March, 2019; originally announced March 2019.

Comments: Accepted for publication at EUSIPCO 2019. Camera-ready version

arXiv:1902.00112 [pdf, other]

HashCore: Proof-of-Work Functions for General Purpose Processors

Authors: Yanni Georghiades, Steven Flolid, Sriram Vishwanath

Abstract: Over the past five years, the rewards associated with mining Proof-of-Work blockchains have increased substantially. As a result, miners are heavily incentivized to design and utilize Application Specific Integrated Circuits (ASICs) that can compute hashes far more efficiently than existing general purpose hardware. Currently, it is difficult for most users to purchase and operate ASICs due to pricing and availability constraints, resulting in a relatively small number of miners with respect to total user base for most popular cryptocurrencies. In this work, we aim to invert the problem of ASIC development by constructing a Proof-of-Work function for which an existing general purpose processor (GPP, such as an x86 IC) is already an optimized ASIC. In doing so, we will ensure that any would-be miner either already owns an ASIC for the Proof-of-Work system they wish to participate in or can attain one at a competitive price with relative ease. In order to achieve this, we present HashCore, a Proof-of-Work function composed of "widgets" generated pseudo-randomly at runtime that each execute a sequence of general purpose processor instructions designed to stress the computational resources of such a GPP. The widgets will be modeled after workloads that GPPs have been optimized for, for example, the SPEC CPU 2017 benchmark suite for x86 ICs, in a technique we refer to as inverted benchmarking. We provide a proof that HashCore is collision-resistant regardless of how the widgets are implemented.
We observe that GPP designers/developers essentially create an ASIC for benchmarks such as SPEC CPU 2017. By modeling HashCore after such benchmarks, we create a Proof-of-Work function that can be run most efficiently on a GPP, resulting in a more accessible, competitive, and balanced mining market.

Submitted 15 April, 2019; v1 submitted 31 January, 2019; originally announced February 2019.

Showing 1–50 of 149 results for author: Vishwanath, S