
ILLUMINER: Instruction-tuned Large Language Models as Few-shot Intent Classifier and Slot Filler

Authors: Paramita Mirza, Viju Sudhi, Soumya Ranjan Sahoo, Sinchana Ramakanth Bhat

Abstract: State-of-the-art intent classification (IC) and slot filling (SF) methods often rely on data-intensive deep learning models, limiting their practicality for industry applications. Large language models on the other hand, particularly instruction-tuned models (Instruct-LLMs), exhibit remarkable zero-shot performance across various natural language tasks. This study evaluates Instruct-LLMs on popular benchmark datasets for IC and SF, emphasizing their capacity to learn from fewer examples. We introduce ILLUMINER, an approach framing IC and SF as language generation tasks for Instruct-LLMs, with a more efficient SF-prompting method compared to prior work. A comprehensive comparison with multiple baselines shows that our approach, using the FLAN-T5 11B model, outperforms the state-of-the-art joint IC+SF method and in-context learning with GPT3.5 (175B), particularly in slot filling by 11.1--32.2 percentage points. Additionally, our in-depth ablation study demonstrates that parameter-efficient fine-tuning requires less than 6% of training data to yield comparable performance with traditional full-weight fine-tuning.

Submitted 26 March, 2024; originally announced March 2024.

Comments: Accepted at LREC-COLING 2024

arXiv:2403.00290 [pdf, ps, other]

Semantic Text Transmission via Prediction with Small Language Models: Cost-Similarity Trade-off

Authors: Bhavani A Madhabhavi, Gangadhar Karevvanavar, Rajshekhar V Bhat, Nikolaos Pappas

Abstract: We consider the communication of natural language text from a source to a destination over noiseless and character-erasure channels. We exploit language's inherent correlations and predictability to constrain transmission costs by allowing the destination to predict or complete words with potential dissimilarity with the source text. Concretely, our objective is to obtain achievable $(\bar{c}, \bar{s})$ pairs, where $\bar{c}$ is the average transmission cost at the source and $\bar{s}$ is the average semantic similarity measured via cosine similarity between vector embedding of words at the source and those predicted/completed at the destination. We obtain $(\bar{c}, \bar{s})$ pairs for neural language and first-order Markov chain-based small language models (SLM) for prediction, using both a threshold policy that transmits a word if its cosine similarity with that predicted/completed at the destination is below a threshold, and a periodic policy, which transmits words after a specific interval and predicts/completes the words in between, at the destination. We adopt an SLM for word completion. We demonstrate that, when communication occurs over a noiseless channel, the threshold policy achieves a higher $\bar{s}$ for a given $\bar{c}$ than the periodic policy and that the $\bar{s}$ achieved with the neural SLM is greater than or equal to that of the Markov chain-based algorithm for the same $\bar{c}$. The improved performance comes with a higher complexity in terms of time and computing requirements. However, when communication occurs over a character-erasure channel, all prediction algorithms and scheduling policies perform poorly. Furthermore, if character-level Huffman coding is used, the required $\bar{c}$ to achieve a given $\bar{s}$ is reduced, but the above observations still apply.

Submitted 1 March, 2024; originally announced March 2024.
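
Illustration (not from the paper): the threshold policy described in the abstract above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the predictions are supplied as a fixed list rather than produced by a language model, the cost is counted as one unit per transmitted word, and the names threshold_policy, embeddings and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def threshold_policy(source_words, predicted_words, embeddings, threshold):
    """Transmit a source word only when the destination's prediction is too dissimilar.

    Returns (received_words, average_cost, average_similarity).
    Assumption: each transmitted word costs one unit.
    """
    n = len(source_words)
    if n == 0:
        return [], 0.0, 0.0
    received = []
    transmitted = 0
    total_similarity = 0.0
    for src, pred in zip(source_words, predicted_words):
        sim = cosine_similarity(embeddings[src], embeddings[pred])
        if sim < threshold:
            # Prediction is too far from the source word: pay to send the word.
            received.append(src)
            transmitted += 1
            total_similarity += 1.0
        else:
            # Prediction is close enough: keep it and send nothing.
            received.append(pred)
            total_similarity += sim
    return received, transmitted / n, total_similarity / n


# Toy two-dimensional embeddings, made up for this example.
embeddings = {"cat": (1.0, 0.0), "kitten": (0.8, 0.6), "car": (0.0, 1.0)}
received, avg_cost, avg_sim = threshold_policy(
    ["cat", "cat", "car"], ["kitten", "car", "car"], embeddings, threshold=0.5
)
```

Raising the threshold sends more words, which raises both the average cost and the average similarity; that is the trade-off the $(\bar{c}, \bar{s})$ pairs describe.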

arXiv:2311.09975 [pdf, ps, other]

Version Age of Information Minimization over Fading Broadcast Channels

Authors: Gangadhar Karevvanavar, Hrishikesh Pable, Om Patil, Rajshekhar V Bhat, Nikolaos Pappas

Abstract: We consider a base station (BS) that receives version update packets from multiple exogenous streams and broadcasts them to corresponding users over a fading broadcast channel using a non-orthogonal multiple access (NOMA) scheme. Sequentially indexed packets arrive randomly in each stream, with new packets making the previous ones obsolete. In this case, we consider the version age of information (VAoI) at a user, defined as the difference in the version index of the latest available packet at the BS and that at the user, as a metric of freshness of information. Our objective is to minimize a weighted sum of average VAoI across users subject to an average power constraint at the BS by optimally scheduling the update packets from various streams for transmission and transmitting them with sufficient powers to guarantee their successful delivery. We consider the class of channel-only stationary randomized policies (CO-SRP), which rely solely on channel power gains for transmission decisions. We solve the resulting non-convex problem optimally and show that the VAoI achieved under the optimal CO-SRP is within twice the optimal achievable VAoI. We also obtained a Constrained Markov Decision Process (CMDP)-based solution and its structural properties. Numerical simulations show a close performance between the optimal CO-SRP and CMDP-based solutions. Additionally, a time division multiple access (TDMA) scheme, which allows transmission to at most one user at a time, matches NOMA's performance under tight average power constraints. However, NOMA outperforms TDMA as the constraint is relaxed.

Submitted 12 February, 2024; v1 submitted 16 November, 2023; originally announced November 2023.
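
Illustration (not from the paper): the VAoI definition in the abstract above, the version index at the BS minus the version index at the user, can be traced slot by slot for a single stream. This is a minimal sketch assuming a slotted timeline in which an arrival in a slot is processed before a delivery in the same slot; the function name version_age_trace and the sample sequences are hypothetical.

```python
def version_age_trace(arrivals, deliveries):
    """Return the VAoI at the end of each slot for one stream.

    arrivals[t]   -- True if a new version reaches the BS in slot t
    deliveries[t] -- True if the BS delivers its latest version to the user in slot t
    Assumption: within a slot, the arrival is handled before the delivery.
    """
    bs_version = 0    # index of the latest version available at the BS
    user_version = 0  # index of the latest version held by the user
    trace = []
    for arrived, delivered in zip(arrivals, deliveries):
        if arrived:
            bs_version += 1  # the new version makes the previous one obsolete
        if delivered:
            user_version = bs_version  # the user catches up to the BS
        trace.append(bs_version - user_version)
    return trace


arrivals = [True, True, False, True, False]
deliveries = [False, False, True, False, True]
trace = version_age_trace(arrivals, deliveries)
average_vaoi = sum(trace) / len(trace)
```

The age grows by one with every arrival that is not yet delivered and drops to zero on each successful delivery, so a scheduler lowers the average by delivering soon after arrivals.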

arXiv:2310.09536 [pdf, other]

CarExpert: Leveraging Large Language Models for In-Car Conversational Question Answering

Authors: Md Rashad Al Hasan Rony, Christian Suess, Sinchana Ramakanth Bhat, Viju Sudhi, Julia Schneider, Maximilian Vogel, Roman Teucher, Ken E. Friedl, Soumya Sahoo

Abstract: Large language models (LLMs) have demonstrated remarkable performance by following natural language instructions without fine-tuning them on domain-specific tasks and data. However, leveraging LLMs for domain-specific question answering suffers from severe limitations. The generated answer tends to hallucinate due to the training data collection time (when using off-the-shelf), complex user utterance and wrong retrieval (in retrieval-augmented generation). Furthermore, due to the lack of awareness about the domain and expected output, such LLMs may generate unexpected and unsafe answers that are not tailored to the target domain. In this paper, we propose CarExpert, an in-car retrieval-augmented conversational question-answering system leveraging LLMs for different tasks. Specifically, CarExpert employs LLMs to control the input, provide domain-specific documents to the extractive and generative answering components, and control the output to ensure safe and domain-specific answers. A comprehensive empirical evaluation exhibits that CarExpert outperforms state-of-the-art LLMs in generating natural, safe and car-specific answers.

Submitted 14 October, 2023; originally announced October 2023.

Comments: Accepted into EMNLP 2023 (industry track), corresponding Author: Md Rashad Al Hasan Rony

arXiv:2309.05974 [pdf, ps, other]

Optimizing Reported Age of Information with Short Error Correction and Detection Codes

Authors: Sumanth S Raikar, Rajshekhar V Bhat

Abstract: Timely sampling and fresh information delivery are important in 6G communications. This is achieved by encoding samples into short packets/codewords for transmission, with potential decoding errors. We consider a broadcasting base station (BS) that samples information from multiple sources and transmits to respective destinations/users, using short-blocklength cyclic and deep learning (DL) based codes for error correction, and cyclic-redundancy-check (CRC) codes for error detection. We use a metric called reported age of information (AoI), abbreviated as RAoI, to measure the freshness of information, which increases from an initial value if the CRC reports a failure, else is reset. We minimize long-term average expected RAoI, subject to constraints on transmission power and distortion, for which we obtain age-agnostic randomized and age-aware drift-plus-penalty policies that decide which user to transmit to, with what message-word length and transmit power, and derive bounds on their performance. Simulations show that longer CRC codes lead to higher RAoI, but the RAoI achieved is closer to the true, genie-aided AoI. DL-based codes achieve lower RAoI. Finally, we conclude that prior AoI optimization literature with finite blocklengths substantially underestimates AoI because they assume that all errors can be detected perfectly without using CRC.

Submitted 12 September, 2023; originally announced September 2023.

arXiv:2308.07335 [pdf, other]

An Encoder-Decoder Approach for Packing Circles

Authors: Akshay Kiran Jose, Gangadhar Karevvanavar, Rajshekhar V Bhat

Abstract: The problem of packing smaller objects within a larger object has been of interest for decades. In these problems, in addition to the requirement that the smaller objects must lie completely inside the larger object, they are expected to not overlap or have minimum overlap with each other. Due to this, the problem of packing turns out to be a non-convex problem, obtaining whose optimal solution is challenging. As such, several heuristic approaches have been used for obtaining sub-optimal solutions in general, and provably optimal solutions for some special instances. In this paper, we propose a novel encoder-decoder architecture consisting of an encoder block, a perturbation block and a decoder block, for packing identical circles within a larger circle. In our approach, the encoder takes the index of a circle to be packed as an input and outputs its center through a normalization layer, the perturbation layer adds controlled perturbations to the center, ensuring that it does not deviate beyond the radius of the smaller circle to be packed, and the decoder takes the perturbed center as input and estimates the index of the intended circle for packing. We parameterize the encoder and decoder by a neural network and optimize it to reduce an error between the decoder's estimated index and the actual index of the circle provided as input to the encoder. The proposed approach can be generalized to pack objects of higher dimensions and different shapes by carefully choosing normalization and perturbation layers. The approach gives a sub-optimal solution and is able to pack smaller objects within a larger object with competitive performance with respect to classical methods.

Submitted 11 August, 2023; originally announced August 2023.

arXiv:2307.11317 [pdf, other]

XLDA: Linear Discriminant Analysis for Scaling Continual Learning to Extreme Classification at the Edge

Authors: Karan Shah, Vishruth Veerendranath, Anushka Hebbar, Raghavendra Bhat

Abstract: Streaming Linear Discriminant Analysis (LDA), while proven in Class-incremental Learning deployments at the edge with limited classes (up to 1000), has not been proven for deployment in extreme classification scenarios. In this paper, we present: (a) XLDA, a framework for Class-IL in edge deployment where LDA classifier is proven to be equivalent to FC layer including in extreme classification scenarios, and (b) optimizations to enable XLDA-based training and inference for edge deployment where there is a constraint on available compute resources. We show up to 42x speed up using a batched training approach and up to 5x inference speedup with nearest neighbor search on extreme datasets like AliProducts (50k classes) and Google Landmarks V2 (81k classes).

Submitted 20 July, 2023; originally announced July 2023.

Comments: Submitted at ICML 2023: PAC-Bayes Interactive Learning Workshop

arXiv:2305.11790 [pdf, other]

Prompting with Pseudo-Code Instructions

Authors: Mayank Mishra, Prince Kumar, Riyaz Bhat, Rudra Murthy V, Danish Contractor, Srikanth Tamilselvam

Abstract: Prompting with natural language instructions has recently emerged as a popular method of harnessing the capabilities of large language models. Given the inherent ambiguity present in natural language, it is intuitive to consider the possible advantages of prompting with less ambiguous prompt styles, such as the use of pseudo-code. In this paper we explore if prompting via pseudo-code instructions helps improve the performance of pre-trained language models. We manually create a dataset of pseudo-code prompts for 132 different tasks spanning classification, QA and generative language tasks, sourced from the Super-NaturalInstructions dataset. Using these prompts along with their counterparts in natural language, we study their performance on two LLM families - BLOOM and CodeGen. Our experiments show that using pseudo-code instructions leads to better results, with an average increase (absolute) of 7-16 points in F1 scores for classification tasks and an improvement (relative) of 12-38% in aggregate ROUGE-L scores across all tasks. We include detailed ablation studies which indicate that code comments, docstrings, and the structural clues encoded in pseudo-code all contribute towards the improvement in performance. To the best of our knowledge our work is the first to demonstrate how pseudo-code prompts can be helpful in improving the performance of pre-trained LMs.

Submitted 19 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

Comments: Published in EMNLP 2023 main track

arXiv:2303.16143 [pdf, ps, other]

Importance-Aware Fresh Delivery of Versions over Energy Harvesting MACs

Authors: Gangadhar Karevvanavar, Rajshekhar V Bhat

Abstract: We consider a scenario where multiple users, powered by energy harvesting, send version updates over a fading multiple access channel (MAC) to an access point (AP). Version updates having random importance weights arrive at a user according to an exogenous arrival process, and a new version renders all previous versions obsolete. As energy harvesting imposes a time-varying peak power constraint, it is not possible to deliver all the bits of a version instantaneously. Accordingly, the AP chooses the objective of minimizing a finite-horizon time average expectation of the product of importance weight and a convex increasing function of the number of remaining bits of a version to be transmitted at each time instant. The objective enables importance-aware delivery of as many bits, as soon as possible. In this setup, the AP optimizes the objective function subject to an achievable rate-region constraint of the MAC and energy constraints at the users, by deciding the transmit power and the number of bits to be transmitted by each user. We obtain a Markov Decision Process (MDP)-based optimal online policy to the problem and derive structural properties of the policy. We then develop a neural network (NN)-based online heuristic policy, for which we train an NN on the optimal offline policy derived for different sample paths of energy, version arrival and channel power gain processes. Via numerical simulations, we observe that the NN-based online policy performs competitively with respect to the MDP-based online policy.

Submitted 28 March, 2023; originally announced March 2023.

arXiv:2303.00850 [pdf, other]

Distortion Minimization with Age of Information and Cost Constraints

Authors: Jayanth S, Nikolaos Pappas, Rajshekhar V Bhat

Abstract: We consider a source monitoring a stochastic process with a transmitter to transmit timely information through a wireless ON/OFF channel to a destination. We assume that once the source samples the data, the sampled data has to be processed to identify the state of the stochastic process. The processing can take place either at the source before transmission or after transmission at the destination. The objective is to minimize the distortion while keeping the age of information (AoI) that measures the timeliness of information under a certain threshold. We use a stationary randomized policy (SRP) framework to solve the formulated problem. We show that the two-dimensional discrete-time Markov chain considering the AoI and instantaneous distortion as the state is lumpable and we obtain the expression for the expected AoI under the SRP.

Submitted 24 June, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

Comments: 8 pages, 6 figures

arXiv:2302.11512 [pdf, other]

Maximization of Timely Throughput with Target Wake Time in IEEE 802.11ax

Authors: Rishabh Roy, Rajshekhar V Bhat, Preyas Hathi, Nadeem Akhtar, Naveen Mysore Balasubramanya

Abstract: In the IEEE 802.11ax standard, a mode of operation called target wake time (TWT) is introduced towards enabling deterministic scheduling in WLAN networks. In the TWT mode, a group of stations (STAs) can negotiate with the access point (AP) a periodically repeating time window, referred to as TWT Service Period (TWT-SP), over which they are awake and outside which they sleep for saving power. The offset from a common starting time to the first TWT-SP is referred to as the TWT Offset (TWT-O) and the periodicity of TWT-SP is referred to as the TWT Wake Interval (TWT-WI). In this work, we consider communication between multiple STAs with heterogeneous traffic flows and an AP of an IEEE 802.11ax network operating in the TWT mode. Our objective is to maximize a long-term weighted average timely throughput across the STAs, where the instantaneous timely throughput is defined as the number of packets delivered successfully before their deadlines at a decision instant. To achieve this, we obtain algorithms, composed of (i) an inner resource allocation (RA) routine that allocates resource units (RUs) and transmit powers to STAs, and (ii) an outer grouping routine that assigns STAs to (TWT-SP, TWT-O, TWT-WI) triplets. For inner RA, we propose a near-optimal low-complexity algorithm using the drift-plus-penalty (DPP) framework and we adopt a greedy algorithm as outer grouping routine. Via numerical simulations, we observe that the proposed algorithm, composed of a DPP based RA and a greedy grouping routine, performs better than other competitive algorithms.

Submitted 22 February, 2023; originally announced February 2023.

arXiv:2301.09715 [pdf, other]

PrimeQA: The Prime Repository for State-of-the-Art Multilingual Question Answering Research and Development

Authors: Avirup Sil, Jaydeep Sen, Bhavani Iyer, Martin Franz, Kshitij Fadnis, Mihaela Bornea, Sara Rosenthal, Scott McCarley, Rong Zhang, Vishwajeet Kumar, Yulong Li, Md Arafat Sultan, Riyaz Bhat, Radu Florian, Salim Roukos

Abstract: The field of Question Answering (QA) has made remarkable progress in recent years, thanks to the advent of large pre-trained language models, newer realistic benchmark datasets with leaderboards, and novel algorithms for key components such as retrievers and readers. In this paper, we introduce PRIMEQA: a one-stop and open-source QA repository with an aim to democratize QA research and facilitate easy replication of state-of-the-art (SOTA) QA methods. PRIMEQA supports core QA functionalities like retrieval and reading comprehension as well as auxiliary capabilities such as question generation. It has been designed as an end-to-end toolkit for various use cases: building front-end applications, replicating SOTA methods on public benchmarks, and expanding pre-existing methods. PRIMEQA is available at: https://github.com/primeqa.

Submitted 25 January, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

arXiv:2301.01015 [pdf, other]

Semi-Structured Object Sequence Encoders

Authors: Rudra Murthy V, Riyaz Bhat, Chulaka Gunasekara, Siva Sankalp Patel, Hui Wan, Tejas Indulal Dhamecha, Danish Contractor, Marina Danilevsky

Abstract: In this paper we explore the task of modeling semi-structured object sequences; in particular, we focus our attention on the problem of developing a structure-aware input representation for such sequences. Examples of such data include user activity on websites, machine logs, and many others. This type of data is often represented as a sequence of sets of key-value pairs over time and can present modeling challenges due to an ever-increasing sequence length. We propose a two-part approach, which first considers each key independently and encodes a representation of its values over time; we then self-attend over these value-aware key representations to accomplish a downstream task. This allows us to operate on longer object sequences than existing methods. We introduce a novel shared-attention-head architecture between the two modules and present an innovative training schedule that interleaves the training of both modules with shared weights for some attention heads. Our experiments on multiple prediction tasks using real-world data demonstrate that our approach outperforms a unified network with hierarchical encoding, as well as other methods including a record-centric representation and a flattened representation of the sequence.

Submitted 22 May, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

arXiv:2112.15566 [pdf, ps, other]

In Lieu of Privacy: Anonymous Contact Tracing

Authors: Rohit Bhat, Shranav Palakurthi, Naman Tiwari

Abstract: We present Tracer Tokens, a hardware token for privacy-preserving contact tracing utilizing the Exposure Notification protocol [GAEN]. Through subnetworks, we show that any disease spread by proximity can be traced, such as seasonal flu, cold, regional strains of COVID-19, or Tuberculosis. Further, we show this protocol to notify $n^n$ users in parallel, providing a speed of information unmatched by current contact tracing methods.

Submitted 31 December, 2021; originally announced December 2021.

Comments: 9 pages, 2 figures, student project

arXiv:2111.01374 [pdf, ps, other]

A Game of Primes

Authors: Raghavendra Bhat

Abstract: The basis for most of the ideas mentioned in this paper is the theory of cellular automata. A cellular automaton contains a regular grid of cells, with each cell having a pre-defined set of finite states. The initial state is determined at time/state zero. At this point all the cells are assigned their respective starting states. The automaton is defined by a set of simple rules that decide the subsequent states of the cells. We aim to create a cellular automaton of prime numbers and come up with some axioms, theorems and conjectures for the same.

Submitted 5 October, 2022; v1 submitted 2 November, 2021; originally announced November 2021.
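
Illustration (not from the paper): the general cellular-automaton mechanism described in the abstract above (a grid of cells, a finite set of states, an initial assignment at time zero, and a simple rule for the next state) can be sketched with a one-dimensional, two-state automaton. This is a generic elementary automaton with wrap-around boundaries, not the prime-number automaton the paper proposes; the function names step and run and the choice of rule 90 are assumptions made for the example.

```python
def step(cells, rule):
    """Advance a one-dimensional, two-state cellular automaton by one time step.

    cells -- list of 0/1 states; the row wraps around at both ends
    rule  -- integer 0..255 in the usual elementary-automaton numbering:
             bit k of the rule is the next state for the neighborhood whose
             (left, center, right) bits spell the number k
    """
    n = len(cells)
    next_cells = []
    for i in range(n):
        left = cells[(i - 1) % n]
        center = cells[i]
        right = cells[(i + 1) % n]
        pattern = (left << 2) | (center << 1) | right
        next_cells.append((rule >> pattern) & 1)
    return next_cells


def run(cells, rule, steps):
    """Return the list of generations, starting with the initial state."""
    history = [list(cells)]
    for _ in range(steps):
        history.append(step(history[-1], rule))
    return history


# Time zero: a single live cell in the middle of a row of seven.
initial = [0, 0, 0, 1, 0, 0, 0]
history = run(initial, rule=90, steps=2)
```

Under rule 90 each cell's next state is the XOR of its two neighbors, so the single live cell splits into two cells that move apart on the following step.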

arXiv:2108.05670 [pdf, other]

Communication Optimization in Large Scale Federated Learning using Autoencoder Compressed Weight Updates

Authors: Srikanth Chandar, Pravin Chandran, Raghavendra Bhat, Avinash Chakravarthi

Abstract: Federated Learning (FL) solves many of this decade's concerns regarding data privacy and computation challenges. FL ensures no data leaves its source as the model is trained where the data resides. However, FL comes with its own set of challenges. The communication of model weight updates in this distributed environment comes with significant network bandwidth costs. In this context, we propose a mechanism of compressing the weight updates using Autoencoders (AE), which learn the data features of the weight updates and subsequently perform compression. The encoder is set up on each of the nodes where the training is performed while the decoder is set up on the node where the weights are aggregated. This setup achieves compression through the encoder and recreates the weights at the end of every communication round using the decoder. This paper shows that the dynamic and orthogonal AE based weight compression technique could serve as an advantageous alternative (or an add-on) in a large scale FL, as it not only achieves compression ratios ranging from 500x to 1720x and beyond, but can also be modified based on the accuracy requirements, computational capacity, and other requirements of the given FL setup.

Submitted 12 August, 2021; originally announced August 2021.

Comments: 7 pages, 11 figures, International Workshop on Federated and Transfer Learning for Data Sparsity and Confidentiality in Conjunction with IJCAI 2021 (FTL-IJCAI'21)

Report number: Paper 14

arXiv:2108.00578 [pdf, other]

Is My Model Using The Right Evidence? Systematic Probes for Examining Evidence-Based Tabular Reasoning

Authors: Vivek Gupta, Riyaz A. Bhat, Atreya Ghosal, Manish Shrivastava, Maneesh Singh, Vivek Srikumar

Abstract: Neural models command state-of-the-art performance across NLP tasks, including ones involving "reasoning". Models claiming to reason about the evidence presented to them should attend to the correct parts of the input avoiding spurious patterns therein, be self-consistent in their predictions across inputs, and be immune to biases derived from their pre-training in a nuanced, context-sensitive fashion. Do the prevalent *BERT-family of models do so? In this paper, we study this question using the problem of reasoning on tabular data. Tabular inputs are especially well-suited for the study -- they admit systematic probes targeting the properties listed above. Our experiments demonstrate that a RoBERTa-based model, representative of the current state-of-the-art, fails at reasoning on the following counts: it (a) ignores relevant parts of the evidence, (b) is over-sensitive to annotation artifacts, and (c) relies on the knowledge encoded in the pre-trained language model rather than the evidence presented in its tabular inputs. Finally, through inoculation experiments, we show that fine-tuning the model on perturbed data does not help it overcome the above challenges.

Submitted 5 March, 2022; v1 submitted 1 August, 2021; originally announced August 2021.

Comments: 20 pages, 17 figures, 11 tables, TACL 2022, pre-MIT Press publication version

arXiv:2107.08540 [pdf, other]

Distributed Planning for Serving Cooperative Tasks with Time Windows: A Game Theoretic Approach

Authors: Yasin Yazicioglu, Raghavendra Bhat, Derya Aksaray

Abstract: We study distributed planning for multi-robot systems to provide optimal service to cooperative tasks that are distributed over space and time. Each task requires service by sufficiently many robots at the specified location within the specified time window. Tasks arrive over episodes and the robots try to maximize the total value of service in each episode by planning their own trajectories based on the specifications of incoming tasks. Robots are required to start and end each episode at their assigned stations in the environment. We present a game theoretic solution to this problem by mapping it to a game, where the action of each robot is its trajectory in an episode, and using a suitable learning algorithm to obtain optimal joint plans in a distributed manner. We present a systematic way to design minimal action sets (subsets of feasible trajectories) for robots based on the specifications of incoming tasks to facilitate fast learning. We then provide the performance guarantees for the cases where all the robots follow a best response or noisy best response algorithm to iteratively plan their trajectories. While the best response algorithm leads to a Nash equilibrium, the noisy best response algorithm leads to globally optimal joint plans with high probability. We show that the proposed game can in general have arbitrarily poor Nash equilibria, which makes the noisy best response algorithm preferable unless the task specifications are known to have some special structure. We also describe a family of special cases where all the equilibria are guaranteed to have bounded suboptimality. Simulations and experimental results are provided to demonstrate the proposed approach.

Submitted 18 July, 2021; originally announced July 2021.

arXiv:2106.15325 [pdf, other]

SE-MD: A Single-encoder multiple-decoder deep network for point cloud generation from 2D images

Authors: Abdul Mueed Hafiz, Rouf Ul Alam Bhat, Shabir Ahmad Parah, M. Hassaballah

Abstract: 3D model generation from single 2D RGB images is a challenging and actively researched computer vision task. Various techniques using conventional network architectures have been proposed for the same. However, the body of research work is limited and there are various issues like using inefficient 3D representation formats, weak 3D model generation backbones, inability to generate dense point clouds, dependence on post-processing for generation of dense point clouds, and dependence on silhouettes in RGB images. In this paper, a novel 2D RGB image to point cloud conversion technique is proposed, which improves the state of the art in the field due to its efficient, robust and simple model by using the concept of parallelization in network architecture. It not only uses the efficient and rich 3D representation of point clouds, but also uses a novel and robust point cloud generation backbone in order to address the prevalent issues. This involves using a single-encoder multiple-decoder deep network architecture wherein each decoder generates certain fixed viewpoints. This is followed by fusing all the viewpoints to generate a dense point cloud. Various experiments are conducted on the technique and its performance is compared with those of other state of the art techniques and impressive gains in performance are demonstrated. Code is available at https://github.com/mueedhafiz1982/

Submitted 17 June, 2021; originally announced June 2021.

arXiv:2106.14503 [pdf, other]

Weight Divergence Driven Divide-and-Conquer Approach for Optimal Federated Learning from non-IID Data

Authors: Pravin Chandran, Raghavendra Bhat, Avinash Chakravarthi, Srikanth Chandar

Abstract: Federated Learning allows training of data stored in distributed devices without the need for centralizing training data, thereby maintaining data privacy. Addressing the ability to handle data heterogeneity (non-identical and independent distribution or non-IID) is a key enabler for the wider deployment of Federated Learning. In this paper, we propose a novel Divide-and-Conquer training methodology that enables the use of the popular FedAvg aggregation algorithm by overcoming the acknowledged FedAvg limitations in non-IID environments. We propose a novel use of Cosine-distance based Weight Divergence metric to determine the exact point where a Deep Learning network can be divided into class agnostic initial layers and class-specific deep layers for performing a Divide and Conquer training. We show that the methodology achieves trained model accuracy at par (and in certain cases exceeding) with numbers achieved by state-of-the-art Aggregation algorithms like FedProx, FedMA, etc. Also, we show that this methodology leads to compute and bandwidth optimizations under certain documented conditions.

Submitted 29 June, 2021; v1 submitted 28 June, 2021; originally announced June 2021.

arXiv:2106.07550 [pdf, other]

Attention mechanisms and deep learning for machine vision: A survey of the state of the art

Authors: Abdul Mueed Hafiz, Shabir Ahmad Parah, Rouf Ul Alam Bhat

Abstract: With the advent of state of the art nature-inspired pure attention based models i.e. transformers, and their success in natural language processing (NLP), their extension to machine vision (MV) tasks was inevitable and much felt. Subsequently, vision transformers (ViTs) were introduced which are giving quite a challenge to the established deep learning based machine vision techniques. However, pure attention based models/architectures like transformers require huge data, large training times and large computational resources. Some recent works suggest that combinations of these two varied fields can prove to build systems which have the advantages of both these fields. Accordingly, this state of the art survey paper is introduced which hopefully will help readers get useful information about this interesting and potential research area. A gentle introduction to attention mechanisms is given, followed by a discussion of the popular attention based deep architectures. Subsequently, the major categories of the intersection of attention mechanisms and deep learning for machine vision (MV) are discussed. Afterwards, the major algorithms, issues and trends within the scope of the paper are discussed.

Submitted 3 June, 2021; originally announced June 2021.

arXiv:2101.01546 [pdf, other]

Brain Tumor Segmentation and Survival Prediction using Automatic Hard mining in 3D CNN Architecture

Authors: Vikas Kumar Anand, Sanjeev Grampurohit, Pranav Aurangabadkar, Avinash Kori, Mahendra Khened, Raghavendra S Bhat, Ganapathy Krishnamurthi

Abstract: We utilize 3-D fully convolutional neural networks (CNN) to segment gliomas and their constituents from multimodal Magnetic Resonance Images (MRI). The architecture uses dense connectivity patterns to reduce the number of weights and residual connections and is initialized with weights obtained from training this model with BraTS 2018 dataset. Hard mining is done during training to train for the difficult cases of segmentation tasks by increasing the dice similarity coefficient (DSC) threshold to choose the hard cases as epoch increases. On the BraTS2020 validation data (n = 125), this architecture achieved a tumor core, whole tumor, and active tumor dice of 0.744, 0.876, 0.714, respectively. On the test dataset, we get an increment in DSC of tumor core and active tumor by approximately 7%. In terms of DSC, our network performances on the BraTS 2020 test data are 0.775, 0.815, and 0.85 for enhancing tumor, tumor core, and whole tumor, respectively. Overall survival of a subject is determined using conventional machine learning from radiomics features obtained using a generated segmentation mask. Our approach has achieved 0.448 and 0.452 as the accuracy on the validation and test dataset.

Submitted 5 January, 2021; originally announced January 2021.

Comments: 11 pages, 4 Figures

arXiv:1911.07499 [pdf, ps, other]

Throughput Maximization with an Average Age of Information Constraint in Fading Channels

Authors: Rajshekhar Vishweshwar Bhat, Rahul Vaze, Mehul Motani

Abstract: In the emerging fifth generation (5G) technology, communication nodes are expected to support two crucial classes of information traffic, namely, the enhanced mobile broadband (eMBB) traffic with high data rate requirements, and ultra-reliable low-latency communications (URLLC) traffic with strict requirements on latency and reliability. The URLLC traffic, which is usually analyzed by a metric called the age of information (AoI), is assigned the first priority over the resources at a node. Motivated by this, we consider long-term average throughput maximization problems subject to average AoI and power constraints in a single user fading channel, when (i) perfect and (ii) no channel state information at the transmitter (CSIT) is available. We propose simple age-independent stationary randomized policies (AI-SRP), which allocate powers at the transmitter based only on the channel state and/or distribution information, without any knowledge of the AoI. We show that the optimal throughputs achieved by the AI-SRPs for scenarios (i) and (ii) are at least equal to the half of the respective optimal long-term average throughputs, independent of all the parameters of the problem, and that they are within additive gaps, expressed in terms of the optimal dual variable corresponding to their average AoI constraints, from the respective optimal long-term average throughputs.

Submitted 18 November, 2019; originally announced November 2019.

arXiv:1908.05630 [pdf, other]

Distributed Path Planning for Executing Cooperative Tasks with Time Windows

Authors: Raghavendra Bhat, Yasin Yazicioglu, Derya Aksaray

Abstract: We investigate the distributed planning of robot trajectories for optimal execution of cooperative tasks with time windows. In this setting, each task has a value and is completed if sufficiently many robots are simultaneously present at the necessary location within the specified time window. Tasks keep arriving periodically over cycles. The task specifications (required number of robots, location, time window, and value) are unknown a priori and the robots try to maximize the value of completed tasks by planning their own trajectories for the upcoming cycle based on their past observations in a distributed manner. Considering the recharging and maintenance needs, robots are required to start and end each cycle at their assigned stations located in the environment. We map this problem to a game theoretic formulation and maximize the collective performance through distributed learning. Some simulation results are also provided to demonstrate the performance of the proposed approach.

Submitted 15 August, 2019; originally announced August 2019.

Comments: Accepted to the 8th IFAC Workshop on Distributed Estimation and Control in Networked Systems

arXiv:1907.01284 [pdf, other]

Semi-Bagging Based Deep Neural Architecture to Extract Text from High Entropy Images

Authors: Pranay Dugar, Anirban Chatterjee, Rajesh Shreedhar Bhat, Saswata Sahoo

Abstract: Extracting texts of various size and shape from images containing multiple objects is an important problem in many contexts, especially, in connection to e-commerce, augmented reality assistance system in natural scene, etc. The existing works (based on only CNN) often perform sub-optimally when the image contains regions of high entropy having multiple objects. This paper presents an end-to-end text detection strategy combining a segmentation algorithm and an ensemble of multiple text detectors of different types to detect text in each individual image segment independently. The proposed strategy involves a super-pixel based image segmenter which splits an image into multiple regions. A convolutional deep neural architecture is developed which works on each of the segments and detects texts of multiple shapes, sizes, and structures. It outperforms the competing methods in terms of coverage in detecting texts in images especially the ones where the text of various types and sizes are compacted in a small region along with various other objects. Furthermore, the proposed text detection method along with a text recognizer outperforms the existing state-of-the-art approaches in extracting text from high entropy images. We validate the results on a dataset consisting of product images on an e-commerce website.

Submitted 2 July, 2019; originally announced July 2019.

Comments: 10 pages

arXiv:1905.09063 [pdf, other]

NTP : A Neural Network Topology Profiler

Authors: Raghavendra Bhat, Pravin Chandran, Juby Jose, Viswanath Dibbur, Prakash Sirra Ajith

Abstract: Performance of end-to-end neural networks on a given hardware platform is a function of its compute and memory signature, which in turn, is governed by a wide range of parameters such as topology size, primitives used, framework used, batching strategy, latency requirements, precision etc. Current benchmarking tools suffer from limitations such as a) being either too granular like DeepBench [1] (or) b) mandate a working implementation that is either framework specific or hardware-architecture specific or both (or) c) provide only high level benchmark metrics. In this paper, we present NTP (Neural Net Topology Profiler), a sophisticated benchmarking framework, to effectively identify memory and compute signature of an end-to-end topology on multiple hardware architectures, without the need for an actual implementation. NTP is tightly integrated with hardware specific benchmarking tools to enable exhaustive data collection and analysis. Using NTP, a deep learning researcher can quickly establish baselines needed to understand performance of an end-to-end neural network topology and make high level architectural decisions. Further, integration of NTP with frameworks like Tensorflow, Pytorch, Intel OpenVINO etc. allows for performance comparison along several vectors like a) Comparison of different frameworks on a given hardware b) Comparison of different hardware using a given framework c) Comparison across different heterogeneous hardware configurations for given framework etc. These capabilities empower a researcher to effortlessly make architectural decisions needed for achieving optimized performance on any hardware platform. The paper documents the architectural approach of NTP and demonstrates the capabilities of the tool by benchmarking Mozilla DeepSpeech, a popular Speech Recognition topology.

Submitted 24 May, 2019; v1 submitted 22 May, 2019; originally announced May 2019.

arXiv:1902.05085 [pdf, ps, other]

Leveraging Newswire Treebanks for Parsing Conversational Data with Argument Scrambling

Authors: Riyaz Ahmad Bhat, Irshad Ahmad Bhat, Dipti Misra Sharma

Abstract: We investigate the problem of parsing conversational data of morphologically-rich languages such as Hindi where argument scrambling occurs frequently. We evaluate a state-of-the-art non-linear transition-based parsing system on a new dataset containing 506 dependency trees for sentences from Bollywood (Hindi) movie scripts and Twitter posts of Hindi monolingual speakers. We show that a dependency parser trained on a newswire treebank is strongly biased towards the canonical structures and degrades when applied to conversational data. Inspired by Transformational Generative Grammar, we mitigate the sampling bias by generating all theoretically possible alternative word orders of a clause from the existing (kernel) structures in the treebank. Training our parser on canonical and transformed structures improves performance on conversational data by around 9% LAS over the baseline newswire parser.

Submitted 13 February, 2019; originally announced February 2019.

Comments: Proceedings of the 15th International Conference on Parsing Technologies, pages 61-66, Pisa, Italy; September 20-22, 2017. Association for Computational Linguistics

arXiv:1809.02147 [pdf, other]

Upcycle Your OCR: Reusing OCRs for Post-OCR Text Correction in Romanised Sanskrit

Authors: Amrith Krishna, Bodhisattwa Prasad Majumder, Rajesh Shreedhar Bhat, Pawan Goyal

Abstract: We propose a post-OCR text correction approach for digitising texts in Romanised Sanskrit. Owing to the lack of resources our approach uses OCR models trained for other languages written in Roman. Currently, there exists no dataset available for Romanised Sanskrit OCR. So, we bootstrap a dataset of 430 images, scanned in two different settings and their corresponding ground truth. For training, we synthetically generate training images for both the settings. We find that the use of a copying mechanism (Gu et al., 2016) yields a percentage increase of 7.69 in Character Recognition Rate (CRR) over the current state of the art model in solving monotone sequence-to-sequence tasks (Schnober et al., 2016). We find that our system is robust in combating OCR-prone errors, as it obtains a CRR of 87.01% from an OCR output with CRR of 35.76% for one of the dataset settings. A human judgment survey performed on the models shows that our proposed model results in predictions which are faster to comprehend and faster to improve for a human than the other systems.

Submitted 6 September, 2018; originally announced September 2018.

Comments: This paper has been accepted as a full paper in the SIGNLL Conference on Computational Natural Language Learning (CoNLL), 2018. The code, data and the supplementary material is available at https://github.com/majumderb/sanskrit-ocr

arXiv:1807.01928 [pdf, other]

FocusST Solution for Analysis of Cryptographic Properties

Authors: Maria Spichkova, Radhika Bhat

Abstract: To analyse cryptographic properties of distributed systems in a systematic way, a formal theory is required. In this paper, we present a theory that allows (1) to specify distributed systems formally, (2) to verify their cryptographic wrt. composition properties, and (3) to demonstrate the correctness of syntactic interfaces for specified system components automatically. To demonstrate the feasibility of the approach we use a typical example from the domain of crypto-based systems: a variant of the Internet security protocol TLS. A security flaw in the initial version of TLS specification was revealed using a semi-automatic theorem prover, Isabelle/HOL.

Submitted 5 July, 2018; originally announced July 2018.

Comments: Preprint. Accepted to the 13th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2018). Final version published by SCITEPRESS

arXiv:1804.05868 [pdf, other]

Universal Dependency Parsing for Hindi-English Code-switching

Authors: Irshad Ahmad Bhat, Riyaz Ahmad Bhat, Manish Shrivastava, Dipti Misra Sharma

Abstract: Code-switching is a phenomenon of mixing grammatical structures of two or more languages under varied social constraints. The code-switching data differ so radically from the benchmark corpora used in NLP community that the application of standard technologies to these data degrades their performance sharply. Unlike standard corpora, these data often need to go through additional processes such as language identification, normalization and/or back-transliteration for their efficient processing. In this paper, we investigate these indispensable processes and other problems associated with syntactic parsing of code-switching data and propose methods to mitigate their effects. In particular, we study dependency parsing of code-switching data of Hindi and English multilingual speakers from Twitter. We present a treebank of Hindi-English code-switching tweets under Universal Dependencies scheme and propose a neural stacking model for parsing that efficiently leverages part-of-speech tag and syntactic tree annotations in the code-switching treebank and the preexisting Hindi and English treebanks. We also present normalization and back-transliteration models with a decoding process tailored for code-switching data. Results show that our neural stacking parser is 1.5% LAS points better than the augmented parsing model and our decoding process improves results by 3.8% LAS points over the first-best normalization and/or back-transliteration.

Submitted 24 April, 2018; v1 submitted 16 April, 2018; originally announced April 2018.

arXiv:1804.03258 [pdf]

doi 10.1016/j.surg.2019.01.002

Comparing Clinical Judgment with MySurgeryRisk Algorithm for Preoperative Risk Assessment: A Pilot Study

Authors: Meghan Brennan, Sahil Puri, Tezcan Ozrazgat-Baslanti, Rajendra Bhat, Zheng Feng, Petar Momcilovic, Xiaolin Li, Daisy Zhe Wang, Azra Bihorac

Abstract: Background: Major postoperative complications are associated with increased short and long-term mortality, increased healthcare cost, and adverse long-term consequences. The large amount of data contained in the electronic health record (EHR) creates barriers for physicians to recognize patients most at risk. We hypothesize, if presented in an optimal format, information from data-driven predictive risk algorithms for postoperative complications can improve physician risk assessment. Methods: Prospective, non-randomized, interventional pilot study of twenty perioperative physicians at a quaternary academic medical center. Using 150 clinical cases we compared physicians' risk assessment before and after interaction with MySurgeryRisk, a validated machine-learning algorithm predicting preoperative risk for six major postoperative complications using EHR data. Results: The area under the curve (AUC) of MySurgeryRisk algorithm ranged between 0.73 and 0.85 and was significantly higher than physicians' risk assessments (AUC between 0.47 and 0.69) for all postoperative complications except cardiovascular complications. The AUC for repeated physician's risk assessment improved by 2% to 5% for all complications with the exception of thirty-day mortality. Physicians' risk assessment for acute kidney injury and intensive care unit admission longer than 48 hours significantly improved after knowledge exchange, resulting in net reclassification improvement of 12.4% and 16%, respectively. Conclusions: The validated MySurgeryRisk algorithm predicted postoperative complications with equal or higher accuracy than pilot cohort of physicians using available clinical preoperative data. The interaction with algorithm significantly improved physicians' risk assessment.

Submitted 9 April, 2018; originally announced April 2018.

Comments: 21 pages, 4 tables

Report number: PMCID: PMC6502657

Journal ref: Surgery 165(5):1035-1045 (2019)

arXiv:1801.03813 [pdf, ps, other]

Energy Harvesting Communications Using Dual Alternating Batteries

Authors: Rajshekhar Vishweshwar Bhat, Mehul Motani, Chandra R Murthy, Rahul Vaze

Abstract: Practical energy harvesting (EH) based communication systems typically use a battery to temporarily store the harvested energy prior to its use for communication. The batteries can be damaged when they are repeatedly charged (discharged) after being partially discharged (charged), overcharged or deeply discharged. This motivates the cycle constraint which says that a battery must be charged (discharged) only after it is sufficiently discharged (charged). We also assume Bernoulli energy arrivals, and a half-duplex constraint due to which the batteries are not charged and discharged simultaneously. In this context, we study EH communication systems with: (a) a single-battery with capacity 2B units and (b) dual-batteries, each having capacity of B units. The aim is to obtain the best possible long-term average throughputs and throughput regions in point-to-point (P2P) channels and multiple access channels (MAC), respectively. For the P2P channel, we obtain an analytical optimal solution in the single-battery case, and propose optimal and sub-optimal power allocation policies for the dual-battery case. We extend these policies to obtain achievable throughput regions in MACs by jointly allocating rates and powers. From numerical simulations, we find that the optimal throughput in the dual-battery case is significantly higher than that in the single-battery case, although the total storage capacity in both cases is 2B units. Further, in the proposed policies, the largest throughput region in the single-battery case is contained within that of the dual-battery case.

Submitted 18 December, 2018; v1 submitted 11 January, 2018; originally announced January 2018.

Comments: A single battery case is added and its performance is compared with that of the dual-battery case, with additional simulation results

arXiv:1801.03794 [pdf, ps, other]

Hybrid NOMA-TDMA for Multiple Access Channels with Non-Ideal Batteries and Circuit Cost

Authors: Rajshekhar Vishweshwar Bhat, Mehul Motani, Teng Joon Lim

Abstract: We consider a multiple-access channel where the users are powered from batteries having non-negligible internal resistance. When power is drawn from the battery, a variable fraction of the power, which is a function of the power drawn from the battery, is lost across the internal resistance. Hence, the power delivered to the load is less than the power drawn from the battery. The users consume a constant power for the circuit operation during transmission but do not consume any power when not transmitting. In this setting, we obtain the maximum sum-rates and achievable rate regions under various cases. We show that, unlike in the ideal battery case, the TDMA (time-division multiple access) strategy, wherein the users transmit orthogonally in time, may not always achieve the maximum sum-rate when the internal resistance is non-zero. The users may need to adopt a hybrid NOMA-TDMA strategy which combines the features of NOMA (non-orthogonal multiple access) and TDMA, wherein a set of users are allocated fixed time windows for orthogonal single-user and non-orthogonal joint transmissions, respectively. We also numerically show that the maximum achievable rate regions in NOMA and TDMA strategies are contained within the maximum achievable rate region of the hybrid NOMA-TDMA strategy.

Submitted 15 January, 2018; v1 submitted 11 January, 2018; originally announced January 2018.

arXiv:1709.10192 [pdf, other]

Intelligent Perioperative System: Towards Real-time Big Data Analytics in Surgery Risk Assessment

Authors: Zheng Feng, Rajendra Rana Bhat, Xiaoyong Yuan, Daniel Freeman, Tezcan Baslanti, Azra Bihorac, Xiaolin Li

Abstract: Surgery risk assessment is an effective tool for physicians to manage the treatment of patients, but most current research projects fall short in providing a comprehensive platform to evaluate the patients' surgery risk in terms of different complications. The recent evolution of big data analysis techniques makes it possible to develop a real-time platform to dynamically analyze the surgery risk from large-scale patients information. In this paper, we propose the Intelligent Perioperative System (IPS), a real-time system that assesses the risk of postoperative complications (PC) and dynamically interacts with physicians to improve the predictive results. In order to process large volume patients data in real-time, we design the system by integrating several big data computing and storage frameworks with the high through-output streaming data processing components. We also implement a system prototype along with the visualization results to show the feasibility of system design.

Submitted 28 September, 2017; originally announced September 2017.

Comments: 6 pages, 8 figures

arXiv:1703.10772 [pdf, ps, other]

Joining Hands: Exploiting Monolingual Treebanks for Parsing of Code-mixing Data

Authors: Irshad Ahmad Bhat, Riyaz Ahmad Bhat, Manish Shrivastava, Dipti Misra Sharma

Abstract: In this paper, we propose efficient and less resource-intensive strategies for parsing of code-mixed data. These strategies are not constrained by in-domain annotations, rather they leverage pre-existing monolingual annotated resources for training. We show that these methods can produce significantly better results as compared to an informed baseline. Besides, we also present a data set of 450 Hindi and English code-mixed tweets of Hindi multilingual speakers for evaluation. The data set is manually annotated with Universal Dependencies.

Submitted 31 March, 2017; originally announced March 2017.

Comments: 5 pages, EACL 2017 short paper

arXiv:1702.04258 [pdf, other]

Layered Coding for Energy Harvesting Communication Without CSIT

Authors: Rajshekhar Vishweshwar Bhat, Mehul Motani, Teng Joon Lim

Abstract: Due to stringent constraints on resources, it may be infeasible to acquire the current channel state information at the transmitter in energy harvesting communication systems. In this paper, we optimize an energy harvesting transmitter, communicating over a slow fading channel, using layered coding. The transmitter has access to the channel statistics, but does not know the exact channel state. In layered coding, the codewords are first designed for each of the channel states at different rates, and then the codewords are either time-multiplexed or superimposed before the transmission, leading to two transmission strategies. The receiver then decodes the information adaptively based on the realized channel state. The transmitter is equipped with a finite-capacity battery having non-zero internal resistance. In each of the transmission strategies, we first formulate and study an average rate maximization problem with non-causal knowledge of the harvested power variations. Further, assuming statistical knowledge and causal information of the harvested power variations, we propose a sub-optimal algorithm, and compare with the stochastic dynamic programming based solution and a greedy policy.

Submitted 14 April, 2017; v1 submitted 14 February, 2017; originally announced February 2017.

Comments: Elaborated on the system model, added a result (Lemma 1), added 2 more references

arXiv:1701.02444 [pdf, ps, other]

Energy Harvesting Communication Using Finite-Capacity Batteries with Internal Resistance

Authors: Rajshekhar Vishweshwar Bhat, Mehul Motani, Teng Joon Lim

Abstract: Modern systems will increasingly rely on energy harvested from their environment. Such systems utilize batteries to smoothen out the random fluctuations in harvested energy. These fluctuations induce highly variable battery charge and discharge rates, which affect the efficiencies of practical batteries that typically have non-zero internal resistances. In this paper, we study an energy harvesting communication system using a finite battery with non-zero internal resistance. We adopt a dual-path architecture, in which harvested energy can be directly used, or stored and then used. In a frame, both time and power can be split between energy storage and data transmission. For a single frame, we derive an analytical expression for the rate optimal time and power splitting ratios between harvesting energy and transmitting data. We then optimize the time and power splitting ratios for a group of frames, assuming non-causal knowledge of harvested power and fading channel gains, by giving an approximate solution. When only the statistics of the energy arrivals and channel gains are known, we derive a dynamic programming based policy and, propose three sub-optimal policies, which are shown to perform competitively. In summary, our study suggests that battery internal resistance significantly impacts the design and performance of energy harvesting communication systems and must be taken into account.

Submitted 10 January, 2017; originally announced January 2017.

Comments: 30 single column pages

arXiv:1612.03211 [pdf, other]

DeepCancer: Detecting Cancer through Gene Expressions via Deep Generative Learning

Authors: Rajendra Rana Bhat, Vivek Viswanath, Xiaolin Li

Abstract: Transcriptional profiling on microarrays to obtain gene expressions has been used to facilitate cancer diagnosis. We propose a deep generative machine learning architecture (called DeepCancer) that learn features from unlabeled microarray data. These models have been used in conjunction with conventional classifiers that perform classification of the tissue samples as either being cancerous or non-cancerous. The proposed model has been tested on two different clinical datasets. The evaluation demonstrates that DeepCancer model achieves a very high precision score, while significantly controlling the false positive and false negative scores.

Submitted 13 December, 2016; v1 submitted 9 December, 2016; originally announced December 2016.

arXiv:1602.06456 [pdf, other]

Millimeter Wave Vehicular Communication to Support Massive Automotive Sensing

Authors: Junil Choi, Vutha Va, Nuria Gonzalez-Prelcic, Robert Daniels, Chandra R. Bhat, Robert W. Heath Jr

Abstract: As driving becomes more automated, vehicles are being equipped with more sensors generating even higher data rates. Radars (RAdio Detection and Ranging) are used for object detection, visual cameras as virtual mirrors, and LIDARs (LIght Detection and Ranging) for generating high resolution depth associated range maps, all to enhance the safety and efficiency of driving. Connected vehicles can use wireless communication to exchange sensor data, allowing them to enlarge their sensing range and improve automated driving functions. Unfortunately, conventional technologies, such as dedicated short-range communication (DSRC) and 4G cellular communication, do not support the gigabit-per-second data rates that would be required for raw sensor data exchange between vehicles. This paper makes the case that millimeter wave (mmWave) communication is the only viable approach for high bandwidth connected vehicles. The motivations and challenges associated with using mmWave for vehicle-to-vehicle and vehicle-to-infrastructure applications are highlighted. A high-level solution to one key challenge - the overhead of mmWave beam training - is proposed. The critical feature of this solution is to leverage information derived from the sensors or DSRC as side information for the mmWave communication link configuration. Examples and simulation results show that the beam alignment overhead can be reduced by using position information obtained from DSRC.

Submitted 18 May, 2016; v1 submitted 20 February, 2016; originally announced February 2016.

Comments: 7 pages, 5 figures, 1 table, submitted to IEEE Communications Magazine

arXiv:1503.06009 [pdf, other]

A Framework for Textbook Enhancement and Learning using Crowdsourced Annotations

Authors: Anamika Chhabra, S. R. S. Iyengar, Poonam Saini, Rajesh Shreedhar Bhat

Abstract: Despite a significant improvement in the educational aids in terms of effective teaching-learning process, most of the educational content available to the students is less than optimal in the context of being up-to-date, exhaustive and easy-to-understand. There is a need to iteratively improve the educational material based on the feedback collected from the students' learning experience. This can be achieved by observing the students' interactions with the content, and then having the authors modify it based on this feedback. Hence, we aim to facilitate and promote communication between the communities of authors, instructors and students in order to gradually improve the educational material. Such a system will also help in students' learning process by encouraging student-to-student teaching. Underpinning these objectives, we provide the framework of a platform named Crowdsourced Annotation System (CAS) where the people from these communities can collaborate and benefit from each other. We use the concept of in-context annotations, through which, the students can add their comments about the given text while learning it. An experiment was conducted on 60 students who try to learn an article of a textbook by annotating it for four days. According to the result of the experiment, most of the students were highly satisfied with the use of CAS. They stated that the system is extremely useful for learning and they would like to use it for learning other concepts in future.

Submitted 11 August, 2015; v1 submitted 20 March, 2015; originally announced March 2015.

Comments: 11 pages, 3 figures, 1 table

arXiv:1502.06719 [pdf, other]

Ecosystem: A Characteristic Of Crowdsourced Environments

Authors: Anamika Chhabra, S. R. S. Iyengar, Poonam Saini, Rajesh Shreedhar Bhat, Vijay Kumar

Abstract: The phenomenal success of certain crowdsourced online platforms, such as Wikipedia, is accredited to their ability to tap the crowd's potential to collaboratively build knowledge. While it is well known that the crowd's collective wisdom surpasses the cumulative individual expertise, little is understood on the dynamics of knowledge building in a crowdsourced environment. A proper understanding of the dynamics of knowledge building in a crowdsourced environment would enable one in the better designing of such environments to solicit knowledge from the crowd. Our experiment on crowdsourced systems based on annotations shows that an important reason for the rapid knowledge building in such environments is due to variance in expertise. First, we used as our test bed, a customized Crowdsourced Annotation System (CAS) which provides a group of users the facility to annotate a given document while trying to understand it. Our results showed the presence of different genres of proficiency amongst the users of an annotation system. We observed that the ecosystem in crowdsourced annotation system comprised of mainly four categories of contributors, namely: Probers, Solvers, Articulators and Explorers. We inferred from our experiment that the knowledge garnering mainly happens due to the synergetic interaction across these categories. Further, we conducted an analysis on the dataset of Wikipedia and Stack Overflow and noticed the ecosystem presence in these portals as well. From this study, we claim that the ecosystem is a universal characteristic of all crowdsourced portals.

Submitted 27 August, 2015; v1 submitted 24 February, 2015; originally announced February 2015.

Comments: 21 pages, 9 figures, 7 tables

arXiv:1501.05992 [pdf, other]

doi 10.1017/pasa.2015.5

The Murchison Widefield Array Correlator

Authors: S. M. Ord, B. Crosse, D. Emrich, D. Pallot, R. B. Wayth, M. A. Clark, S. E. Tremblay, W. Arcus, D. Barnes, M. Bell, G. Bernardi, N. D. R. Bhat, J. D. Bowman, F. Briggs, J. D. Bunton, R. J. Cappallo, B. E. Corey, A. A. Deshpande, L. deSouza, A. Ewell-Wice, L. Feng, R. Goeke, L. J. Greenhill, B. J. Hazelton, D. Herne, et al. (42 additional authors not shown)

Abstract: The Murchison Widefield Array (MWA) is a Square Kilometre Array (SKA) Precursor. The telescope is located at the Murchison Radio--astronomy Observatory (MRO) in Western Australia (WA). The MWA consists of 4096 dipoles arranged into 128 dual polarisation aperture arrays forming a connected element interferometer that cross-correlates signals from all 256 inputs. A hybrid approach to the correlation task is employed, with some processing stages being performed by bespoke hardware, based on Field Programmable Gate Arrays (FPGAs), and others by Graphics Processing Units (GPUs) housed in general purpose rack mounted servers. The correlation capability required is approximately 8 TFLOPS (Tera FLoating point Operations Per Second). The MWA has commenced operations and the correlator is generating 8.3 TB/day of correlation products, that are subsequently transferred 700 km from the MRO to Perth (WA) in real-time for storage and offline processing. In this paper we outline the correlator design, signal path, and processing elements and present the data format for the internal and external interfaces.

Submitted 23 January, 2015; originally announced January 2015.

Comments: 17 pages, 9 figures. Accepted for publication in PASA. Some figures altered to meet astro-ph submission requirements

arXiv:1112.4438 [pdf, ps, other]

Barcoding-free BAC Pooling Enables Combinatorial Selective Sequencing of the Barley Gene Space

Authors: Stefano Lonardi, Denisa Duma, Matthew Alpert, Francesca Cordero, Marco Beccuti, Prasanna R. Bhat, Yonghui Wu, Gianfranco Ciardo, Burair Alsaihati, Yaqin Ma, Steve Wanamaker, Josh Resnik, Timothy J. Close

Abstract: We propose a new sequencing protocol that combines recent advances in combinatorial pooling design and second-generation sequencing technology to efficiently approach de novo selective genome sequencing. We show that combinatorial pooling is a cost-effective and practical alternative to exhaustive DNA barcoding when dealing with hundreds or thousands of DNA samples, such as genome-tiling gene-rich BAC clones. The novelty of the protocol hinges on the computational ability to efficiently compare hundreds of million of short reads and assign them to the correct BAC clones so that the assembly can be carried out clone-by-clone. Experimental results on simulated data for the rice genome show that the deconvolution is extremely accurate (99.57% of the deconvoluted reads are assigned to the correct BAC), and the resulting BAC assemblies have very high quality (BACs are covered by contigs over about 77% of their length, on average). Experimental results on real data for a gene-rich subset of the barley genome confirm that the deconvolution is accurate (almost 70% of left/right pairs in paired-end reads are assigned to the same BAC, despite being processed independently) and the BAC assemblies have good quality (the average sum of all assembled contigs is about 88% of the estimated BAC length).

Submitted 19 December, 2011; originally announced December 2011.

Showing 1–43 of 43 results for author: Bhat, R