Search | arXiv e-print repository

Human-Interpretable Adversarial Prompt Attack on Large Language Models with Situational Context

Authors: Nilanjana Das, Edward Raff, Manas Gaur

Abstract: Previous research on testing the vulnerabilities in Large Language Models (LLMs) using adversarial attacks has primarily focused on nonsensical prompt injections, which are easily detected upon manual or automated review (e.g., via byte entropy). However, the exploration of innocuous human-understandable malicious prompts augmented with adversarial injections remains limited. In this research, we explore converting a nonsensical suffix attack into a sensible prompt via a situation-driven contextual re-writing. This allows us to show suffix conversion without any gradients, using only LLMs to perform the attacks, and thus better understand the scope of possible risks. We combine an independent, meaningful adversarial insertion and situations derived from movies to check if this can trick an LLM. The situations are extracted from the IMDB dataset, and prompts are defined following a few-shot chain-of-thought prompting. Our approach demonstrates that a successful situation-driven attack can be executed on both open-source and proprietary LLMs. We find that across many LLMs, as few as 1 attempt produces an attack and that these attacks transfer between LLMs.

Submitted 25 July, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

arXiv:2406.10131 [pdf, other]

Linear Contextual Bandits with Hybrid Payoff: Revisited

Authors: Nirjhar Das, Gaurav Sinha

Abstract: We study the Linear Contextual Bandit problem in the hybrid reward setting. In this setting every arm's reward model contains arm specific parameters in addition to parameters shared across the reward models of all the arms. We can reduce this setting to two closely related settings (a) Shared - no arm specific parameters, and (b) Disjoint - only arm specific parameters, enabling the application of two popular state of the art algorithms - $\texttt{LinUCB}$ and $\texttt{DisLinUCB}$ (Algorithm 1 in (Li et al. 2010)). When the arm features are stochastic and satisfy a popular diversity condition, we provide new regret analyses for both algorithms, significantly improving on the known regret guarantees of these algorithms. Our novel analysis critically exploits the hybrid reward structure and the diversity condition. Moreover, we introduce a new algorithm $\texttt{HyLinUCB}$ that crucially modifies $\texttt{LinUCB}$ (using a new exploration coefficient) to account for sparsity in the hybrid setting. Under the same diversity assumptions, we prove that $\texttt{HyLinUCB}$ also incurs only $O(\sqrt{T})$ regret for $T$ rounds. We perform extensive experiments on synthetic and real-world datasets demonstrating strong empirical performance of $\texttt{HyLinUCB}$. For number of arm specific parameters much larger than the number of shared parameters, we observe that $\texttt{DisLinUCB}$ incurs the lowest regret. In this case, regret of $\texttt{HyLinUCB}$ is the second best and extremely competitive to $\texttt{DisLinUCB}$. In all other situations, including our real-world dataset, $\texttt{HyLinUCB}$ has significantly lower regret than $\texttt{LinUCB}$, $\texttt{DisLinUCB}$ and other SOTA baselines we considered. We also empirically observe that the regret of $\texttt{HyLinUCB}$ grows much slower with the number of arms compared to baselines, making it suitable even for very large action spaces.

Submitted 14 June, 2024; originally announced June 2024.

Comments: Accepted at ECML PKDD 2024 as a Research Track Paper

arXiv:2406.08606 [pdf, other]

End-to-End Argument Mining as Augmented Natural Language Generation

Authors: Nilmadhab Das, Vishal Choudhary, V. Vijaya Saradhi, Ashish Anand

Abstract: Argument Mining (AM) is a crucial aspect of computational argumentation, which deals with the identification and extraction of Argumentative Components (ACs) and their corresponding Argumentative Relations (ARs). Most prior works have solved these problems by dividing them into multiple subtasks. And the available end-to-end setups are mostly based on the dependency parsing approach. This work proposes a unified end-to-end framework based on a generative paradigm, in which the argumentative structures are framed into label-augmented text, called Augmented Natural Language (ANL). Additionally, we explore the role of different types of markers in solving AM tasks. Through different marker-based fine-tuning strategies, we present an extensive study by integrating marker knowledge into our generative model. The proposed framework achieves competitive results to the state-of-the-art (SoTA) model and outperforms several baselines.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2405.08317 [pdf, other]

SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models

Authors: Raghuveer Peri, Sai Muralidhar Jayanthi, Srikanth Ronanki, Anshu Bhatia, Karel Mundnich, Saket Dingliwal, Nilaksh Das, Zejiang Hou, Goeric Huybrechts, Srikanth Vishnubhotla, Daniel Garcia-Romero, Sundararajan Srinivasan, Kyu J Han, Katrin Kirchhoff

Abstract: Integrated Speech and Large Language Models (SLMs) that can follow speech instructions and generate relevant text responses have gained popularity lately. However, the safety and robustness of these models remain largely unclear. In this work, we investigate the potential vulnerabilities of such instruction-following speech-language models to adversarial attacks and jailbreaking. Specifically, we design algorithms that can generate adversarial examples to jailbreak SLMs in both white-box and black-box attack settings without human involvement. Additionally, we propose countermeasures to thwart such jailbreaking attacks. Our models, trained on dialog data with speech instructions, achieve state-of-the-art performance on a spoken question-answering task, scoring over 80% on both safety and helpfulness metrics. Despite safety guardrails, experiments on jailbreaking demonstrate the vulnerability of SLMs to adversarial perturbations and transfer attacks, with average attack success rates of 90% and 10% respectively when evaluated on a dataset of carefully designed harmful questions spanning 12 different toxic categories. However, we demonstrate that our proposed countermeasures reduce the attack success significantly.

Submitted 14 May, 2024; originally announced May 2024.

Comments: 9+6 pages, Submitted to ACL 2024

arXiv:2405.08295 [pdf, other]

SpeechVerse: A Large-scale Generalizable Audio Language Model

Authors: Nilaksh Das, Saket Dingliwal, Srikanth Ronanki, Rohit Paturi, Zhaocheng Huang, Prashant Mathur, Jie Yuan, Dhanush Bekal, Xing Niu, Sai Muralidhar Jayanthi, Xilai Li, Karel Mundnich, Monica Sunkara, Sundararajan Srinivasan, Kyu J Han, Katrin Kirchhoff

Abstract: Large language models (LLMs) have shown incredible proficiency in performing tasks that require semantic understanding of natural language instructions. Recently, many works have further expanded this capability to perceive multimodal audio and text inputs, but their capabilities are often limited to specific fine-tuned tasks such as automatic speech recognition and translation. We therefore develop SpeechVerse, a robust multi-task training and curriculum learning framework that combines pre-trained speech and text foundation models via a small set of learnable parameters, while keeping the pre-trained models frozen during training. The models are instruction finetuned using continuous latent representations extracted from the speech foundation model to achieve optimal zero-shot performance on a diverse range of speech processing tasks using natural language instructions. We perform extensive benchmarking that includes comparing our model performance against traditional baselines across several datasets and tasks. Furthermore, we evaluate the model's capability for generalized instruction following by testing on out-of-domain datasets, novel prompts, and unseen tasks. Our empirical experiments reveal that our multi-task SpeechVerse model is even superior to conventional task-specific baselines on 9 out of the 11 tasks.

Submitted 31 May, 2024; v1 submitted 13 May, 2024; originally announced May 2024.

Comments: Single Column, 13 pages

arXiv:2405.05292 [pdf, other]

Smart Portable Computer

Authors: Niladri Das

Abstract: Amidst the COVID-19 pandemic, with many organizations, schools, colleges, and universities transitioning to virtual platforms, students encountered difficulties in acquiring PCs such as desktops or laptops. The starting prices, around 15,000 INR, often failed to offer adequate system specifications, posing a challenge for consumers. Additionally, those reliant on laptops for work found the conventional approach cumbersome. Enter the "Portable Smart Computer," a leap into the future of computing. This innovative device boasts speed and performance comparable to traditional desktops but in a compact, energy-efficient, and cost-effective package. It delivers a seamless desktop experience, whether one is editing documents, browsing multiple tabs, managing spreadsheets, or creating presentations. Moreover, it supports programming languages like Python, C, C++, as well as compilers such as Keil and Xilinx, catering to the needs of programmers.

Submitted 8 May, 2024; originally announced May 2024.

Comments: 34 pages

Report number: Lovely Professional University Report 001

arXiv:2404.17367 [pdf]

An Optimised Brushless DC Motor Control Scheme for Robotics Applications

Authors: Nilabha Das, Laxman Rao S. Paragond, Balkrushna H. Waghmare

Abstract: This work aims to develop an integrated control strategy for Brushless Direct Current Motors for a wide range of applications in robotics systems. The controller is suited for both high torque - low speed and high-speed control of the motors. Hardware validation is done by developing a custom BLDC drive system, and the circuit elements are optimised for power efficiency.

Submitted 26 April, 2024; originally announced April 2024.

Comments: 6 Pages, 8 figures, 1 table

arXiv:2404.06831 [pdf, other]

Generalized Linear Bandits with Limited Adaptivity

Authors: Ayush Sawarni, Nirjhar Das, Siddharth Barman, Gaurav Sinha

Abstract: We study the generalized linear contextual bandit problem within the constraints of limited adaptivity. In this paper, we present two algorithms, $\texttt{B-GLinCB}$ and $\texttt{RS-GLinCB}$, that address, respectively, two prevalent limited adaptivity settings. Given a budget $M$ on the number of policy updates, in the first setting, the algorithm needs to decide upfront $M$ rounds at which it will update its policy, while in the second setting it can adaptively perform $M$ policy updates during its course. For the first setting, we design an algorithm $\texttt{B-GLinCB}$, that incurs $\tilde{O}(\sqrt{T})$ regret when $M = \Omega\left( \log{\log T} \right)$ and the arm feature vectors are generated stochastically. For the second setting, we design an algorithm $\texttt{RS-GLinCB}$ that updates its policy $\tilde{O}(\log^2 T)$ times and achieves a regret of $\tilde{O}(\sqrt{T})$ even when the arm feature vectors are adversarially generated. Notably, in these bounds, we manage to eliminate the dependence on a key instance dependent parameter $\kappa$, that captures non-linearity of the underlying reward model. Our novel approach for removing this dependence for generalized linear contextual bandits might be of independent interest.

Submitted 14 June, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

Comments: Reorganization; New Experiments

arXiv:2404.04245 [pdf]

Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism

Authors: Trilokesh Ranjan Sarkar, Nilanjan Das, Pralay Sankar Maitra, Bijoy Some, Ritwik Saha, Orijita Adhikary, Bishal Bose, Jaydip Sen

Abstract: This technical report delves into an in-depth exploration of adversarial attacks specifically targeted at Deep Neural Networks (DNNs) utilized for image classification. The study also investigates defense mechanisms aimed at bolstering the robustness of machine learning models. The research focuses on comprehending the ramifications of two prominent attack methodologies: the Fast Gradient Sign Method (FGSM) and the Carlini-Wagner (CW) approach. These attacks are examined concerning three pre-trained image classifiers: Resnext50_32x4d, DenseNet-201, and VGG-19, utilizing the Tiny-ImageNet dataset. Furthermore, the study proposes the robustness of defensive distillation as a defense mechanism to counter FGSM and CW attacks. This defense mechanism is evaluated using the CIFAR-10 dataset, where CNN models, specifically resnet101 and Resnext50_32x4d, serve as the teacher and student models, respectively. The proposed defensive distillation model exhibits effectiveness in thwarting attacks such as FGSM. However, it is noted to remain susceptible to more sophisticated techniques like the CW attack. The document presents a meticulous validation of the proposed scheme. It provides detailed and comprehensive results, elucidating the efficacy and limitations of the defense mechanisms employed. Through rigorous experimentation and analysis, the study offers insights into the dynamics of adversarial attacks on DNNs, as well as the effectiveness of defensive strategies in mitigating their impact.
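For readers unfamiliar with FGSM, the following is a minimal sketch of the standard update, x_adv = x + eps * sign(grad_x loss), applied to a toy logistic-regression classifier rather than the image classifiers named in the abstract. The weights, input, and function names are hypothetical and chosen only for illustration; the report's own experimental setup is not reproduced here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    # Probability of class 1 under a toy logistic-regression model.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def bce_loss(w, b, x, y):
    # Binary cross-entropy of the model's prediction against label y.
    p = predict(w, b, x)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def input_gradient(w, b, x, y):
    # For logistic regression, d(loss)/dx_i = (p - y) * w_i.
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    # Single-step attack: move each input coordinate by eps
    # in the direction that increases the loss.
    grad = input_gradient(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# Hypothetical toy model and input (assumptions, not from the report).
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1
x_adv = fgsm(w, b, x, y, eps=0.1)
print(bce_loss(w, b, x, y), bce_loss(w, b, x_adv, y))
```

The perturbation is bounded by eps in every coordinate, which is what makes FGSM an L-infinity attack; CW, by contrast, solves an optimization problem and is not sketched here.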

Submitted 5 April, 2024; originally announced April 2024.

Comments: This report pertains to the Capstone Project done by Group 1 of the Fall batch of 2023 students at Praxis Tech School, Kolkata, India. The report consists of 35 pages and includes 15 figures and 10 tables. This is the preprint which will be submitted to an IEEE international conference for review

arXiv:2403.10885 [pdf, other]

Could We Generate Cytology Images from Histopathology Images? An Empirical Study

Authors: Soumyajyoti Dey, Sukanta Chakraborty, Utso Guha Roy, Nibaran Das

Abstract: Automation in medical imaging is quite challenging due to the unavailability of annotated datasets and the scarcity of domain experts. In recent years, deep learning techniques have solved some complex medical imaging tasks like disease classification, important object localization, segmentation, etc. However, most of these tasks require a large amount of annotated data for their successful implementation. To mitigate the shortage of data, different generative models are proposed for data augmentation purposes which can boost the classification performances. For this, different synthetic medical image data generation models are developed to increase the dataset. Unpaired image-to-image translation models here shift the source domain to the target domain. In the breast malignancy identification domain, FNAC is one of the low-cost low-invasive modalities normally used by medical practitioners. But the availability of public datasets in this domain is very poor, whereas automation of cytology image analysis needs a large amount of annotated data. Therefore synthetic cytology images are generated by translating breast histopathology samples which are publicly available. In this study, we have explored traditional image-to-image transfer models like CycleGAN and Neural Style Transfer. Further, it is observed that the generated cytology images are quite similar to real breast cytology samples by measuring FID and KID scores.

Submitted 16 March, 2024; originally announced March 2024.

Comments: Accepted at International Conference on Advanced Computing and Applications (ICACA-2024)

arXiv:2403.10884 [pdf, other]

Fuzzy Rank-based Late Fusion Technique for Cytology image Segmentation

Authors: Soumyajyoti Dey, Sukanta Chakraborty, Utso Guha Roy, Nibaran Das

Abstract: Cytology image segmentation is quite challenging due to its complex cellular structure and multiple overlapping regions. On the other hand, for supervised machine learning techniques, we need a large amount of annotated data, which is costly. In recent years, late fusion techniques have given some promising performances in the field of image classification. In this paper, we have explored a fuzzy-based late fusion technique for cytology image segmentation. This fusion rule integrates three traditional semantic segmentation models: UNet, SegNet, and PSPNet. The technique is applied on two cytology image datasets, i.e., cervical cytology (HErlev) and breast cytology (JUCYT-v1) image datasets. We have achieved maximum MeanIoU scores of 84.27% and 83.79% on the HErlev and JUCYT-v1 datasets, respectively, after the proposed late fusion technique, which are better than those of traditional fusion rules such as average probability, geometric mean, Borda Count, etc. The codes of the proposed model are available on GitHub.

Submitted 16 March, 2024; originally announced March 2024.

Comments: Accepted at International Conference on Data, Electronics and Computing (ICDEC-2023)

arXiv:2403.10881 [pdf, other]

Regularizing CNNs using Confusion Penalty Based Label Smoothing for Histopathology Images

Authors: Somenath Kuiry, Alaka Das, Mita Nasipuri, Nibaran Das

Abstract: Deep Learning, particularly Convolutional Neural Networks (CNN), has been successful in computer vision tasks and medical image analysis. However, modern CNNs can be overconfident, making them difficult to deploy in real-world scenarios. Researchers propose regularizing techniques, such as Label Smoothing (LS), which introduces soft labels for training data, making the classifier more regularized. LS captures disagreements or lack of confidence in the training phase, making the classifier more regularized. Although LS is quite simple and effective, traditional LS techniques utilize a weighted average between target distribution and a uniform distribution across the classes, which limits the objective of LS as well as the performance. This paper introduces a novel LS technique based on the confusion penalty, which treats model confusion for each class with more importance than others. We have performed extensive experiments with well-known CNN architectures with this technique on publicly available Colorectal Histology datasets and got satisfactory results. Also, we have compared our findings with the State-of-the-art and shown our method's efficacy with Reliability diagrams and t-distributed Stochastic Neighbor Embedding (t-SNE) plots of feature space.
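The "weighted average between target distribution and a uniform distribution" that the abstract attributes to traditional LS can be sketched as below. This shows only the conventional baseline; the confusion-penalty variant proposed in the paper is not specified in the abstract and is not attempted here. The function name and the default eps are assumptions for illustration.

```python
def smooth_labels(target, num_classes, eps=0.1):
    """Traditional label smoothing: (1 - eps) * one_hot + eps * uniform."""
    uniform = 1.0 / num_classes
    return [
        (1.0 - eps) * (1.0 if k == target else 0.0) + eps * uniform
        for k in range(num_classes)
    ]

# Class 2 out of 4 classes, with 10% of the mass spread uniformly.
print(smooth_labels(2, 4, eps=0.1))
```

Because the uniform term treats every non-target class identically, this baseline cannot reflect which classes the model actually confuses, which is the limitation the abstract points at.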

Submitted 16 March, 2024; originally announced March 2024.

Comments: Accepted at CICBA 2024 : 6th International Conference on Computational Intelligence in Communications, and Business Analytics

arXiv:2403.10880 [pdf, other]

COVID-CT-H-UNet: a novel COVID-19 CT segmentation network based on attention mechanism and Bi-category Hybrid loss

Authors: Anay Panja, Somenath Kuiry, Alaka Das, Mita Nasipuri, Nibaran Das

Abstract: Since 2019, the global COVID-19 outbreak has emerged as a crucial focus in healthcare research. Although RT-PCR stands as the primary method for COVID-19 detection, its extended detection time poses a significant challenge. Consequently, supplementing RT-PCR with the pathological study of COVID-19 through CT imaging has become imperative. The current segmentation approach based on TVLoss enhances the connectivity of afflicted areas. Nevertheless, it tends to misclassify normal pixels between certain adjacent diseased regions as diseased pixels. The typical binary cross-entropy (BCE) based U-shaped network only concentrates on the entire CT images without emphasizing the affected regions, which results in hazy borders and low contrast in the projected output. In addition, the fraction of infected pixels in CT images is small, which makes it a challenge for segmentation models to make accurate predictions. In this paper, we propose COVID-CT-H-UNet, a COVID-19 CT segmentation network to solve these problems. To recognize the unaffected pixels between neighbouring diseased regions, extra visual layer information is captured by combining the attention module on the skip connections with the proposed composite function Bi-category Hybrid Loss. The issue of hazy boundaries and poor contrast brought on by the BCE Loss in conventional techniques is resolved by utilizing the composite function Bi-category Hybrid Loss that concentrates on the pixels in the diseased area. The experiment shows that, compared to the previous COVID-19 segmentation networks, the proposed COVID-CT-H-UNet's segmentation impact has greatly improved, and it may be used to identify and study clinical COVID-19.

Submitted 16 March, 2024; originally announced March 2024.

Comments: Accepted at CICBA 2024 : 6th International Conference on Computational Intelligence in Communications, and Business Analytics

arXiv:2402.10500 [pdf, other]

Active Preference Optimization for Sample Efficient RLHF

Authors: Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury

Abstract: Reinforcement Learning from Human Feedback (RLHF) is pivotal in aligning Large Language Models (LLMs) with human preferences. Although aligned generative models have shown remarkable abilities in various tasks, their reliance on high-quality human preference data creates a costly bottleneck in the practical application of RLHF. One primary reason is that current methods rely on uniformly picking prompt-generation pairs from a dataset of prompt-generations, to collect human feedback, resulting in sub-optimal alignment under a constrained budget, which highlights the criticality of adaptive strategies in efficient alignment. Recent works [Mehta et al., 2023, Muldrew et al., 2024] have tried to address this problem by designing various heuristics based on generation uncertainty. However, either the assumptions in [Mehta et al., 2023] are restrictive, or [Muldrew et al., 2024] do not provide any rigorous theoretical guarantee. To address these, we reformulate RLHF within contextual preference bandit framework, treating prompts as contexts, and develop an active-learning algorithm, $\textit{Active Preference Optimization}$ ($\texttt{APO}$), which enhances model alignment by querying preference data from the most important samples, achieving superior performance for small sample budget. We analyze the theoretical performance guarantees of $\texttt{APO}$ under the BTL preference model showing that the suboptimality gap of the policy learned via $\texttt{APO}$ scales as $O(1/\sqrt{T})$ for a budget of $T$. We also show that collecting preference data by choosing prompts randomly leads to a policy that suffers a constant sub-optimality. We perform detailed experimental evaluations on practical preference datasets to validate $\texttt{APO}$'s efficacy over the existing methods, establishing it as a sample-efficient and practical solution of alignment in a cost-effective and scalable manner.

Submitted 5 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: New experimental results added. Some reorganization

arXiv:2309.12421 [pdf, other]

Change Management using Generative Modeling on Digital Twins

Authors: Nilanjana Das, Anantaa Kotal, Daniel Roseberry, Anupam Joshi

Abstract: A key challenge faced by small and medium-sized business entities is securely managing software updates and changes. Specifically, with rapidly evolving cybersecurity threats, changes/updates/patches to software systems are necessary to stay ahead of emerging threats and are often mandated by regulators or statutory authorities to counter these. However, security patches/updates require stress testing before they can be released in the production system. Stress testing in production environments is risky and poses security threats. Large businesses usually have a non-production environment where such changes can be made and tested before being released into production. Smaller businesses do not have such facilities. In this work, we show how "digital twins", especially for a mix of IT and IoT environments, can be created on the cloud. These digital twins act as a non-production environment where changes can be applied, and the system can be securely tested before patch release. Additionally, the non-production digital twin can be used to collect system data and run stress tests on the environment, both manually and automatically. In this paper, we show how using a small sample of real data/interactions, Generative Artificial Intelligence (AI) models can be used to generate testing scenarios to check for points of failure.

Submitted 21 September, 2023; originally announced September 2023.

arXiv:2308.11395 [pdf, other]

ULGss: A Strategy to construct a Library of Universal Logic Gates for $N$-variable Boolean Logic beyond NAND and NOR

Authors: Aadarsh G. Goenka, Shyamali Mitra, Mrinal K. Naskar, Nibaran Das

Abstract: In literature, NAND and NOR are two logic gates that display functional completeness, hence regarded as Universal gates. So, the present effort is focused on exploring a library of universal gates in binary that are still unexplored in literature along with a broad and systematic approach to classify the logic connectives. The study shows that the number of Universal Gates in any logic system grows exponentially with the number of input variables $N$. It is revealed that there are $56$ Universal gates in binary for $N=3$. It is shown that the ratio of the count of Universal gates to the total number of Logic gates is $\approx \frac{1}{4}$ or 0.25. Adding constants $0,1$ allows for the creation of $4$ additional (for $N=2$) and $169$ additional Universal Gates (for $N=3$). In this article, the mathematical and logical underpinnings of the concept of universal logic gates are presented, along with a search strategy $ULG_{SS}$ exploring multiple paths leading to their identification. A fast-track approach has been introduced that uses the hexadecimal representation of a logic gate to quickly ascertain its attribute.
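The counts quoted in this abstract can be cross-checked by brute force. The sketch below does not implement the paper's $ULG_{SS}$ search; it swaps in Post's completeness criterion, under which a single gate is universal exactly when it lies outside all five Post classes (0-preserving, 1-preserving, self-dual, monotone, affine). When the constants 0 and 1 are available, only the monotone and affine classes remain as obstacles. Gates are encoded as truth-table integers, and all function names are hypothetical.

```python
def bit(tt, x):
    # Output of the gate with truth table `tt` on input assignment `x`.
    return (tt >> x) & 1

def preserves_zero(tt, n):
    return bit(tt, 0) == 0

def preserves_one(tt, n):
    return bit(tt, (1 << n) - 1) == 1

def self_dual(tt, n):
    full = (1 << n) - 1
    return all(bit(tt, x ^ full) != bit(tt, x) for x in range(1 << n))

def monotone(tt, n):
    # Raising any single input from 0 to 1 never lowers the output.
    return all(bit(tt, x) <= bit(tt, x | (1 << j))
               for x in range(1 << n) for j in range(n))

def affine(tt, n):
    # f(x) = c0 XOR (XOR of c_j over the set bits j of x).
    c0 = bit(tt, 0)
    coeffs = [bit(tt, 1 << j) ^ c0 for j in range(n)]
    for x in range(1 << n):
        val = c0
        for j in range(n):
            if (x >> j) & 1:
                val ^= coeffs[j]
        if val != bit(tt, x):
            return False
    return True

POST_CLASSES = (preserves_zero, preserves_one, self_dual, monotone, affine)

def count_universal(n, with_constants=False):
    # Enumerate all 2^(2^n) gates on n inputs and apply Post's criterion.
    classes = (monotone, affine) if with_constants else POST_CLASSES
    return sum(
        not any(in_class(tt, n) for in_class in classes)
        for tt in range(1 << (1 << n))
    )

for n in (2, 3):
    base = count_universal(n)
    extra = count_universal(n, with_constants=True) - base
    print(n, base, extra)
```

Under these assumptions the enumeration gives 2 universal gates for N=2 (NAND and NOR) and 56 for N=3, with 4 and 169 additional gates once constants are allowed, in agreement with the figures in the abstract. Note that the enumeration counts every truth table on N inputs, including gates that ignore one of their inputs.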

Submitted 22 August, 2023; originally announced August 2023.

Comments: 8 pages, 10 tables, 11 figures

arXiv:2305.08130 [pdf, ps, other]

doi 10.1007/978-3-031-45170-6_19

Inverse Reinforcement Learning With Constraint Recovery

Authors: Nirjhar Das, Arpan Chattopadhyay

Abstract: In this work, we propose a novel inverse reinforcement learning (IRL) algorithm for constrained Markov decision process (CMDP) problems. In standard IRL problems, the inverse learner or agent seeks to recover the reward function of the MDP, given a set of trajectory demonstrations for the optimal policy. In this work, we seek to infer not only the reward functions of the CMDP, but also the constraints. Using the principle of maximum entropy, we show that the IRL with constraint recovery (IRL-CR) problem can be cast as a constrained non-convex optimization problem. We reduce it to an alternating constrained optimization problem whose sub-problems are convex. We use the exponentiated gradient descent algorithm to solve it. Finally, we demonstrate the efficacy of our algorithm for the grid world environment.

Submitted 14 May, 2023; originally announced May 2023.

arXiv:2305.03837 [pdf, other]

Mask The Bias: Improving Domain-Adaptive Generalization of CTC-based ASR with Internal Language Model Estimation

Authors: Nilaksh Das, Monica Sunkara, Sravan Bodapati, Jinglun Cai, Devang Kulshreshtha, Jeff Farris, Katrin Kirchhoff

Abstract: End-to-end ASR models trained on large amounts of data tend to be implicitly biased towards language semantics of the training data. Internal language model estimation (ILME) has been proposed to mitigate this bias for autoregressive models such as attention-based encoder-decoder and RNN-T. Typically, ILME is performed by modularizing the acoustic and language components of the model architecture, and eliminating the acoustic input to perform log-linear interpolation with the text-only posterior. However, for CTC-based ASR, it is not as straightforward to decouple the model into such acoustic and language components, as CTC log-posteriors are computed in a non-autoregressive manner. In this work, we propose a novel ILME technique for CTC-based ASR models. Our method iteratively masks the audio timesteps to estimate a pseudo log-likelihood of the internal LM by accumulating log-posteriors for only the masked timesteps. Extensive evaluation across multiple out-of-domain datasets reveals that the proposed approach improves WER by up to 9.8% and OOV F1-score by up to 24.6% relative to Shallow Fusion, when only text data from target domain is available. In the case of zero-shot domain adaptation, with no access to any target domain data, we demonstrate that removing the source domain bias with ILME can still outperform Shallow Fusion to improve WER by up to 9.3% relative.

Submitted 5 May, 2023; originally announced May 2023.

Comments: Accepted to ICASSP 2023

arXiv:2304.11265 [pdf, other]

Time Series Classification for Detecting Parkinson's Disease from Wrist Motions

Authors: Cedric Donié, Neha Das, Satoshi Endo, Sandra Hirche

Abstract: Parkinson's disease (PD) is a neurodegenerative condition characterized by frequently changing motor symptoms, necessitating continuous symptom monitoring for more targeted treatment. Classical time series classification and deep learning techniques have demonstrated limited efficacy in monitoring PD symptoms using wearable accelerometer data due to complex PD movement patterns and the small size of available datasets. We investigate InceptionTime and RandOm Convolutional KErnel Transform (ROCKET) as they are promising for PD symptom monitoring, with InceptionTime's high learning capacity being well-suited to modeling complex movement patterns while ROCKET is suited to small datasets. With random search methodology, we identify the highest-scoring InceptionTime architecture and compare its performance to ROCKET with a ridge classifier and a multi-layer perceptron (MLP) on wrist motion data from PD patients. Our findings indicate that all approaches are suitable for estimating tremor severity and bradykinesia presence but encounter challenges in detecting dyskinesia. ROCKET demonstrates superior performance in identifying dyskinesia, whereas InceptionTime exhibits slightly better performance in tremor and bradykinesia detection. Notably, both methods outperform the multi-layer perceptron. In conclusion, InceptionTime exhibits the capability to classify complex wrist motion time series and holds the greatest potential for continuous symptom monitoring in PD.

Submitted 20 May, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

Comments: The source code is available under https://github.com/cedricdonie/tsc-for-wrist-motion-pd-detection

ACM Class: I.5; J.2; J.3

arXiv:2304.07949 [pdf, other]

Metrics for Bayesian Optimal Experiment Design under Model Misspecification

Authors: Tommie A. Catanach, Niladri Das

Abstract: The conventional approach to Bayesian decision-theoretic experiment design involves searching over possible experiments to select a design that maximizes the expected value of a specified utility function. The expectation is over the joint distribution of all unknown variables implied by the statistical model that will be used to analyze the collected data. The utility function defines the objective of the experiment where a common utility function is the information gain. This article introduces an expanded framework for this process, where we go beyond the traditional Expected Information Gain criteria and introduce the Expected General Information Gain which measures robustness to the model discrepancy and Expected Discriminatory Information as a criterion to quantify how well an experiment can detect model discrepancy. The functionality of the framework is showcased through its application to a scenario involving a linearized spring mass damper system and an F-16 model where the model discrepancy is taken into account while doing Bayesian optimal experiment design.

Submitted 16 April, 2023; originally announced April 2023.

arXiv:2304.03201

Device-Independent Quantum Secure Direct Communication with User Authentication

Authors: Nayana Das, Goutam Paul

Abstract: Quantum Secure Direct Communication (QSDC) is an important branch of quantum cryptography, which enables the secure transmission of messages without prior key encryption. However, traditional quantum communication protocols rely on the security and trustworthiness of the devices employed to implement the protocols, which can be susceptible to attacks. Device-independent (DI) quantum protocols, on the other hand, aim to secure quantum communication independent of the devices used by leveraging fundamental principles of quantum mechanics. In this research paper, we introduce the first DI-QSDC protocol that includes user identity authentication to establish the authenticity of both sender and receiver before message exchange. We also extend this approach to a DI Quantum Dialogue (QD) protocol where both parties can send secret messages upon mutual authentication.

Submitted 14 August, 2024; v1 submitted 6 April, 2023; originally announced April 2023.

Comments: There is a security loophole in this article, so we are withdrawing this

arXiv:2212.14410 [pdf, other]

Shared Cache Coded Caching Schemes Using Designs and Circuits of Matrices

Authors: Niladri Das, B. Sundar Rajan

Abstract: In this paper, we study shared cache coded caching (SC-CC): a set of caches serves a larger set of users; each user accesses one cache, and a cache may serve many users. For this problem, under uncoded placement, Parrinello, Ünsal, and Elia showed an optimal SC-CC scheme, in which the subpacketization level depends upon the number of caches. We show an SC-CC scheme where the subpacketization level does not directly depend upon the number of users or caches; any number of caches and users can be accommodated for a fixed subpacketization level. Furthermore, new caches can be added without re-doing the placement of the existing caches. We show that given an upper limit on the allowable subpacketization level, our SC-CC scheme may achieve a lesser rate than other relevant SC-CC schemes. Our scheme is constructed using matrices and designs. A matroid can be obtained from a matrix over a finite field; the placement of our scheme is decided by a design constructed from a matrix; the circuits of a matroid obtained from the matrix and the design are used to decide the delivery.

Submitted 29 December, 2022; originally announced December 2022.

Comments: 36 pages, the paper has been submitted to IEEE Transactions on Information Theory

arXiv:2212.00478 [pdf, ps, other]

Safe Learning-Based Control of Elastic Joint Robots via Control Barrier Functions

Authors: Armin Lederer, Azra Begzadić, Neha Das, Sandra Hirche

Abstract: Ensuring safety is of paramount importance in physical human-robot interaction applications. This requires both adherence to safety constraints defined on the system state, as well as guaranteeing compliant behavior of the robot. If the underlying dynamical system is known exactly, the former can be addressed with the help of control barrier functions. The incorporation of elastic actuators in the robot's mechanical design can address the latter requirement. However, this elasticity can increase the complexity of the resulting system, leading to unmodeled dynamics, such that control barrier functions cannot directly ensure safety. In this paper, we mitigate this issue by learning the unknown dynamics using Gaussian process regression. By employing the model in a feedback linearizing control law, the safety conditions resulting from control barrier functions can be robustified to take into account model errors, while remaining feasible. In order to enforce them on-line, we formulate the derived safety conditions in the form of a second-order cone program. We demonstrate our proposed approach with simulations on a two-degree-of-freedom planar robot with elastic joints.

Submitted 14 April, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

arXiv:2211.06581 [pdf, other]

Variational Augmentation for Enhancing Historical Document Image Binarization

Authors: Avirup Dey, Nibaran Das, Mita Nasipuri

Abstract: Historical Document Image Binarization is a well-known segmentation problem in image processing. Despite ubiquity, traditional thresholding algorithms achieved limited success on severely degraded document images. With the advent of deep learning, several segmentation models were proposed that made significant progress in the field but were limited by the unavailability of large training datasets. To mitigate this problem, we have proposed a novel two-stage framework -- the first of which comprises a generator that generates degraded samples using variational inference and the second being a CNN-based binarization network that trains on the generated data. We evaluated our framework on a range of DIBCO datasets, where it achieved competitive results against previous state-of-the-art methods.

Submitted 12 November, 2022; originally announced November 2022.

Comments: Accepted at ICVGIP 2022

MSC Class: I.4.6

arXiv:2210.12492 [pdf, other]

NeuroMapper: In-browser Visualizer for Neural Network Training

Authors: Zhiyan Zhou, Kevin Li, Haekyu Park, Megan Dass, Austin Wright, Nilaksh Das, Duen Horng Chau

Abstract: We present our ongoing work NeuroMapper, an in-browser visualization tool that helps machine learning (ML) developers interpret the evolution of a model during training, providing a new way to monitor the training process and visually discover reasons for suboptimal training. While most existing deep neural network (DNN) interpretation tools are designed for already-trained models, NeuroMapper scalably visualizes the evolution of the embeddings of a model's blocks across training epochs, enabling real-time visualization of 40,000 embedded points. To promote the embedding visualizations' spatial coherence across epochs, NeuroMapper adapts AlignedUMAP, a recent nonlinear dimensionality reduction technique, to align the embeddings. With NeuroMapper, users can explore the training dynamics of a Resnet-50 model, and adjust the embedding visualizations' parameters in real time. NeuroMapper is open-sourced at https://github.com/poloclub/NeuroMapper and runs in all modern web browsers. A demo of the tool in action is available at: https://poloclub.github.io/NeuroMapper/.

Submitted 22 October, 2022; originally announced October 2022.

Comments: IEEE VIS 2022

arXiv:2206.13577 [pdf, other]

A View Independent Classification Framework for Yoga Postures

Authors: Mustafa Chasmai, Nirjhar Das, Aman Bhardwaj, Rahul Garg

Abstract: Yoga is a globally acclaimed and widely recommended practice for healthy living. Maintaining correct posture while performing a Yogasana is of utmost importance. In this work, we employ transfer learning from Human Pose Estimation models for extracting 136 key-points spread all over the body to train a Random Forest classifier which is used for estimation of the Yogasanas. The results are evaluated on an in-house collected extensive yoga video database of 51 subjects recorded from 4 different camera angles. We propose a 3 step scheme for evaluating the generalizability of a Yoga classifier by testing it on 1) unseen frames, 2) unseen subjects, and 3) unseen camera angles. We argue that for most of the applications, validation accuracies on unseen subjects and unseen camera angles would be most important. We empirically analyze, over three public datasets, the advantage of transfer learning and the possibilities of target leakage. We further demonstrate that the classification accuracies critically depend on the cross validation method employed and can often be misleading. To promote further research, we have made the key-points dataset and code publicly available.

Submitted 14 August, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

arXiv:2204.13089 [pdf, other]

Variational Kalman Filtering with Hinf-Based Correction for Robust Bayesian Learning in High Dimensions

Authors: Niladri Das, Jed A. Duersch, Thomas A. Catanach

Abstract: In this paper, we address the problem of convergence of sequential variational inference filter (VIF) through the application of a robust variational objective and Hinf-norm based correction for a linear Gaussian system. As the dimension of state or parameter space grows, performing the full Kalman update with the dense covariance matrix for a large scale system requires increased storage and computational complexity, making it impractical. The VIF approach, based on mean-field Gaussian variational inference, reduces this burden through the variational approximation to the covariance usually in the form of a diagonal covariance approximation. The challenge is to retain convergence and correct for biases introduced by the sequential VIF steps. We desire a framework that improves feasibility while still maintaining reasonable proximity to the optimal Kalman filter as data is assimilated. To accomplish this goal, a Hinf-norm based optimization perturbs the VIF covariance matrix to improve robustness. This yields a novel VIF-Hinf recursion that employs consecutive variational inference and Hinf based optimization steps. We explore the development of this method and investigate a numerical example to illustrate the effectiveness of the proposed filter.

Submitted 27 April, 2022; originally announced April 2022.

arXiv:2204.02381 [pdf, other]

Hear No Evil: Towards Adversarial Robustness of Automatic Speech Recognition via Multi-Task Learning

Authors: Nilaksh Das, Duen Horng Chau

Abstract: As automatic speech recognition (ASR) systems are now being widely deployed in the wild, the increasing threat of adversarial attacks raises serious questions about the security and reliability of using such systems. On the other hand, multi-task learning (MTL) has shown success in training models that can resist adversarial attacks in the computer vision domain. In this work, we investigate the impact of performing such multi-task learning on the adversarial robustness of ASR models in the speech domain. We conduct extensive MTL experimentation by combining semantically diverse tasks such as accent classification and ASR, and evaluate a wide range of adversarial settings. Our thorough analysis reveals that performing MTL with semantically diverse tasks consistently makes it harder for an adversarial attack to succeed. We also discuss in detail the serious pitfalls and their related remedies that have a significant impact on the robustness of MTL models. Our proposed MTL approach shows considerable absolute improvements in adversarially targeted WER ranging from 17.25 up to 59.90 compared to single-task learning baselines (attention decoder and CTC respectively). Ours is the first in-depth study that uncovers adversarial robustness gains from multi-task learning for ASR.

Submitted 5 April, 2022; originally announced April 2022.

Comments: Submitted to Interspeech 2022

arXiv:2204.00734 [pdf, other]

SkeleVision: Towards Adversarial Resiliency of Person Tracking with Multi-Task Learning

Authors: Nilaksh Das, Sheng-Yun Peng, Duen Horng Chau

Abstract: Person tracking using computer vision techniques has wide ranging applications such as autonomous driving, home security and sports analytics. However, the growing threat of adversarial attacks raises serious concerns regarding the security and reliability of such techniques. In this work, we study the impact of multi-task learning (MTL) on the adversarial robustness of the widely used SiamRPN tracker, in the context of person tracking. Specifically, we investigate the effect of jointly learning with semantically analogous tasks of person tracking and human keypoint detection. We conduct extensive experiments with more powerful adversarial attacks that can be physically realizable, demonstrating the practical value of our approach. Our empirical study with simulated as well as real-world datasets reveals that training with MTL consistently makes it harder to attack the SiamRPN tracker, compared to typically training only on the single task of person tracking.

Submitted 1 April, 2022; originally announced April 2022.

arXiv:2203.16475 [pdf, other]

Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries

Authors: Haekyu Park, Seongmin Lee, Benjamin Hoover, Austin P. Wright, Omar Shaikh, Rahul Duggal, Nilaksh Das, Kevin Li, Judy Hoffman, Duen Horng Chau

Abstract: We present ConceptEvo, a unified interpretation framework for deep neural networks (DNNs) that reveals the inception and evolution of learned concepts during training. Our work addresses a critical gap in DNN interpretation research, as existing methods primarily focus on post-training interpretation. ConceptEvo introduces two novel technical contributions: (1) an algorithm that generates a unified semantic space, enabling side-by-side comparison of different models during training, and (2) an algorithm that discovers and quantifies important concept evolutions for class predictions. Through a large-scale human evaluation and quantitative experiments, we demonstrate that ConceptEvo successfully identifies concept evolutions across different models, which are not only comprehensible to humans but also crucial for class predictions. ConceptEvo is applicable to both modern DNN architectures, such as ConvNeXt, and classic DNNs, such as VGGs and InceptionV3.

Submitted 22 August, 2023; v1 submitted 30 March, 2022; originally announced March 2022.

Comments: Accepted at CIKM'23

arXiv:2203.08977 [pdf, other]

Adaptive n-ary Activation Functions for Probabilistic Boolean Logic

Authors: Jed A. Duersch, Thomas A. Catanach, Niladri Das

Abstract: Balancing model complexity against the information contained in observed data is the central challenge to learning. In order for complexity-efficient models to exist and be discoverable in high dimensions, we require a computational framework that relates a credible notion of complexity to simple parameter representations. Further, this framework must allow excess complexity to be gradually removed via gradient-based optimization. Our n-ary, or n-argument, activation functions fill this gap by approximating belief functions (probabilistic Boolean logic) using logit representations of probability. Just as Boolean logic determines the truth of a consequent claim from relationships among a set of antecedent propositions, probabilistic formulations generalize predictions when antecedents, truth tables, and consequents all retain uncertainty. Our activation functions demonstrate the ability to learn arbitrary logic, such as the binary exclusive disjunction (p xor q) and ternary conditioned disjunction ( c ? p : q ), in a single layer using an activation function of matching or greater arity. Further, we represent belief tables using a basis that directly associates the number of nonzero parameters to the effective arity of the belief function, thus capturing a concrete relationship between logical complexity and efficient parameter representations. This opens optimization approaches to reduce logical complexity by inducing parameter sparsity.

Submitted 16 March, 2022; originally announced March 2022.

arXiv:2202.05515 [pdf, ps, other]

Multi-Access Coded Caching Schemes from Maximal Cross Resolvable Designs

Authors: Niladri Das, B. Sundar Rajan

Abstract: We study the problem of multi-access coded caching (MACC): a central server has $N$ files, $K$ ($K \leq N$) caches each of which stores $M$ out of the $N$ files, $K$ users each of which demands one out of the $N$ files, and each user accesses $z$ caches. The objective is to jointly design the placement, delivery, and user-to-cache association, to optimize the achievable rate. This problem has been extensively studied in the literature under the assumption that a user accesses only one cache. However, when a user accesses more caches, this problem has been studied only under the assumption that a user accesses $z$ consecutive caches with a cyclic wrap-around over the boundaries. A natural question is how other user-to-cache associations fare against the cyclic wrap-around user-to-cache association. A bipartite graph can describe a general user-to-cache association. We identify a class of bipartite graphs that, when used as a user-to-cache association, achieves either a lesser rate or a lesser subpacketization than all other existing MACC schemes using a cyclic wrap-around user-to-cache association. The placement and delivery strategy of our MACC scheme is constructed using a combinatorial structure called maximal cross resolvable design.

Submitted 11 February, 2022; originally announced February 2022.

Comments: 34 pages, 5 figures and 3 tables

arXiv:2201.04314 [pdf, other]

doi 10.1109/LRA.2022.3147458

Configuration Space Decomposition for Scalable Proxy Collision Checking in Robot Planning and Control

Authors: Mrinal Verghese, Nikhil Das, Yuheng Zhi, Michael Yip

Abstract: Real-time robot motion planning in complex high-dimensional environments remains an open problem. Motion planning algorithms, and their underlying collision checkers, are crucial to any robot control stack. Collision checking takes up a large portion of the computational time in robot motion planning. Existing collision checkers make trade-offs between speed and accuracy and scale poorly to high-dimensional, complex environments. We present a novel space decomposition method using K-Means clustering in the Forward Kinematics space to accelerate proxy collision checking. We train individual configuration space models using Fastron, a kernel perceptron algorithm, on these decomposed subspaces, yielding compact yet highly accurate models that can be queried rapidly and scale better to more complex environments. We demonstrate this new method, called Decomposed Fast Perceptron (D-Fastron), on the 7-DOF Baxter robot producing on average 29x faster collision checks and up to 9.8x faster motion planning compared to state-of-the-art geometric collision checkers.

Submitted 26 January, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

Comments: 8 pages, 9 figures, Accepted to IEEE Robotics and Automation Letters

arXiv:2108.12931 [pdf, other]

NeuroCartography: Scalable Automatic Visual Summarization of Concepts in Deep Neural Networks

Authors: Haekyu Park, Nilaksh Das, Rahul Duggal, Austin P. Wright, Omar Shaikh, Fred Hohman, Duen Horng Chau

Abstract: Existing research on making sense of deep neural networks often focuses on neuron-level interpretation, which may not adequately capture the bigger picture of how concepts are collectively encoded by multiple neurons. We present NeuroCartography, an interactive system that scalably summarizes and visualizes concepts learned by neural networks. It automatically discovers and groups neurons that detect the same concepts, and describes how such neuron groups interact to form higher-level concepts and the subsequent predictions. NeuroCartography introduces two scalable summarization techniques: (1) neuron clustering groups neurons based on the semantic similarity of the concepts detected by neurons (e.g., neurons detecting "dog faces" of different breeds are grouped); and (2) neuron embedding encodes the associations between related concepts based on how often they co-occur (e.g., neurons detecting "dog face" and "dog tail" are placed closer in the embedding space). Key to our scalable techniques is the ability to efficiently compute all neuron pairs' relationships, in time linear to the number of neurons instead of quadratic time. NeuroCartography scales to large data, such as the ImageNet dataset with 1.2M images. The system's tightly coordinated views integrate the scalable techniques to visualize the concepts and their relationships, projecting the concept associations to a 2D space in Neuron Projection View, and summarizing neuron clusters and their relationships in Graph View. Through a large-scale human evaluation, we demonstrate that our technique discovers neuron groups that represent coherent, human-meaningful concepts. And through usage scenarios, we describe how our approaches enable interesting and surprising discoveries, such as concept cascades of related and isolated concepts. The NeuroCartography visualization runs in modern browsers and is open-sourced.

Submitted 29 August, 2021; originally announced August 2021.

Comments: Accepted to IEEE VIS'21

arXiv:2108.09460 [pdf, other]

Ensemble of CNN classifiers using Sugeno Fuzzy Integral Technique for Cervical Cytology Image Classification

Authors: Rohit Kundu, Hritam Basak, Akhil Koilada, Soham Chattopadhyay, Sukanta Chakraborty, Nibaran Das

Abstract: Cervical cancer is the fourth most common category of cancer, affecting more than 500,000 women annually, owing to the slow detection procedure. Early diagnosis can help in treating and even curing cancer, but the tedious, time-consuming testing process makes it impossible to conduct population-wise screening. To aid the pathologists in efficient and reliable detection, in this paper, we propose a fully automated computer-aided diagnosis tool for classifying single-cell and slide images of cervical cancer. The main concern in developing an automatic detection tool for biomedical image classification is the low availability of publicly accessible data. Ensemble Learning is a popular approach for image classification, but simplistic approaches that leverage pre-determined weights to classifiers fail to perform satisfactorily. In this research, we use the Sugeno Fuzzy Integral to ensemble the decision scores from three popular pretrained deep learning models, namely, Inception v3, DenseNet-161 and ResNet-34. The proposed Fuzzy fusion is capable of taking into consideration the confidence scores of the classifiers for each sample, and thus adaptively changing the importance given to each classifier, capturing the complementary information supplied by each, thus leading to superior classification performance.
We evaluated the proposed method on three publicly available datasets, the Mendeley Liquid Based Cytology (LBC) dataset, the SIPaKMeD Whole Slide Image (WSI) dataset, and the SIPaKMeD Single Cell Image (SCI) dataset, and the results thus yielded are promising. Analysis of the approach using GradCAM-based visual representations and statistical tests, and comparison of the method with existing and baseline models in literature justify the efficacy of the approach.

Submitted 21 August, 2021; originally announced August 2021.

Comments: 16 pages
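
The fusion step named in the abstract above can be illustrated with a minimal sketch of a Sugeno fuzzy integral over per-classifier confidence scores. This is not the authors' implementation: the function names `sugeno_integral` and `fuse`, the example scores, and the density values are all hypothetical, and the sketch assumes the simplest case of an additive fuzzy measure (densities summing to 1) rather than whatever measure the paper actually uses.

```python
def sugeno_integral(scores, densities):
    """Sugeno fuzzy integral of `scores` under an additive fuzzy measure.

    Assumption for this sketch: `densities` are non-negative and sum to 1,
    so the measure of a set of classifiers is the sum of their densities.
    """
    if len(scores) != len(densities):
        raise ValueError("scores and densities must have the same length")
    if abs(sum(densities) - 1.0) > 1e-9:
        raise ValueError("this sketch assumes densities that sum to 1")
    # Visit classifiers from most to least confident.
    ranked = sorted(zip(scores, densities), key=lambda pair: pair[0], reverse=True)
    measure = 0.0  # measure of the classifiers visited so far
    best = 0.0
    for score, density in ranked:
        measure += density
        best = max(best, min(score, measure))
    return best


def fuse(per_classifier_scores, densities):
    """Fuse class scores from several classifiers; return (label, fused scores)."""
    n_classes = len(per_classifier_scores[0])
    fused = [
        sugeno_integral([scores[c] for scores in per_classifier_scores], densities)
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=fused.__getitem__)
    return label, fused


# Hypothetical scores from three classifiers on a two-class problem.
label, fused = fuse([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]], [0.5, 0.3, 0.2])
```

Because the integral takes a max over min(score, measure) terms, a single overconfident classifier with a small density cannot dominate the fused score, which is one reading of the "adaptive importance" the abstract describes.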

arXiv:2108.03470 [pdf, other]

A distillation based approach for the diagnosis of diseases

Authors: Hmrishav Bandyopadhyay, Shuvayan Ghosh Dastidar, Bisakh Mondal, Biplab Banerjee, Nibaran Das

Abstract: Presently, Covid-19 is a serious threat to the world at large. Efforts are being made to reduce disease screening times and to develop a vaccine to resist this disease, even as thousands succumb to it every day. We propose a novel method of automated screening of diseases like Covid-19 and pneumonia from Chest X-Ray images with the help of Computer Vision. Unlike computer vision classification algorithms which come with heavy computational costs, we propose a knowledge distillation based approach which allows us to bring down the model depth, while preserving the accuracy. We make use of an augmentation of the standard distillation module with an auxiliary intermediate assistant network that aids in the continuity of the flow of information. Following this approach, we are able to build an extremely light student network, consisting of just 3 convolutional blocks without any compromise on accuracy. We thus propose a method of classification of diseases which can not only lead to faster screening, but can also operate seamlessly on low-end devices.

Submitted 7 August, 2021; originally announced August 2021.
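
For readers unfamiliar with the standard distillation module that the abstract above builds on, here is a minimal sketch of the usual temperature-softened distillation loss. It is a generic illustration, not the paper's method: the names `softmax` and `distillation_loss`, the temperature value, and the example logits are assumptions, and the assistant network mentioned in the abstract is not modelled here.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    shift = max(scaled)
    exps = [math.exp(z - shift) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(p * math.log(p / q) for p, q in zip(teacher, student) if p > 0.0)
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
matched = distillation_loss([2.0, 1.0, 0.0], [2.0, 1.0, 0.0])
mismatched = distillation_loss([0.0, 1.0, 2.0], [2.0, 1.0, 0.0])
```

In a teacher, assistant, student chain such as the one the abstract describes, a loss of this form could plausibly be applied at each hop, but how the paper combines the terms is not stated in the abstract.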

arXiv:2106.12970 [pdf, other]

RikoNet: A Novel Anime Recommendation Engine

Authors: Badal Soni, Debangan Thakuria, Nilutpal Nath, Navarun Das, Bhaskarananda Boro

Abstract: Anime is quite well-received today, especially among the younger generations. With many genres of available shows, more and more people are increasingly getting attracted to this niche section of the entertainment industry. As anime has recently garnered mainstream attention, we have insufficient information regarding users' penchant and watching habits. Therefore, it is an uphill task to build a recommendation engine for this relatively obscure entertainment medium. In this attempt, we have built a novel hybrid recommendation system that could act both as a recommendation system and as a means of exploring new anime genres and titles. We have analyzed the general trends in this field and the users' watching habits for coming up with our efficacious solution. Our solution employs deep autoencoders for the tasks of predicting ratings and generating embeddings. Following this, we formed clusters using the embeddings of the anime titles. These clusters form the search space for anime with similarities and are used to find anime similar to the ones liked and disliked by the user. This method, combined with the predicted ratings, forms the novel hybrid filter. In this article, we have demonstrated this idea and compared the performance of our implemented model with the existing state-of-the-art techniques.

Submitted 24 June, 2021; originally announced June 2021.

Comments: 19 pages

MSC Class: ams.org

arXiv:2106.12701 [pdf]

doi 10.13140/RG.2.2.24148.94086

Object Detection and Ranging for Autonomous Navigation of Mobile Robots

Authors: Md Ziaul Haque Zim, Nimai Chandra Das

Abstract: In the recent decade, electronic technology has advanced day by day, and the methodologies should be updated as well. For the purpose of ranging, various methods such as Radio Detection and Ranging (RADAR), Light Detection and Ranging (LIDAR), and Sonic Navigation and Ranging (SONAR) are used. Later, by adapting the earlier technologies and further modifying the purposes of detection and ranging in navigation, the technology of Sonic Detection and Ranging (SODAR) is used in modern robotics. SODAR can be defined as a child of SONAR and also a twin of the echo sounder. The echo sounder is used only for ranging, but SODAR uses a low-frequency wave of 33 kHz to measure the underwater depth and also to detect objects below the water medium. So, this work comprises the design of a system to evaluate Object Detection and Ranging for Autonomous Navigation of Mobile Robots.

Submitted 23 June, 2021; originally announced June 2021.

arXiv:2106.04919 [pdf, other]

doi 10.1007/s42979-021-00741-2

Cervical Cytology Classification Using PCA & GWO Enhanced Deep Features Selection

Authors: Hritam Basak, Rohit Kundu, Sukanta Chakraborty, Nibaran Das

Abstract: Cervical cancer is one of the most deadly and common diseases among women worldwide. It is completely curable if diagnosed in an early stage, but the tedious and costly detection procedure makes it unviable to conduct population-wise screening. Thus, to augment the effort of the clinicians, in this paper, we propose a fully automated framework that utilizes Deep Learning and feature selection using evolutionary optimization for cytology image classification. The proposed framework extracts deep features from several Convolutional Neural Network models and uses a two-step feature reduction approach to ensure reduction in computation cost and faster convergence. The features extracted from the CNN models form a large feature space whose dimensionality is reduced using Principal Component Analysis while preserving 99% of the variance. A non-redundant, optimal feature subset is selected from this feature space using an evolutionary optimization algorithm, the Grey Wolf Optimizer, thus improving the classification performance. Finally, the selected feature subset is used to train an SVM classifier for generating the final predictions. The proposed framework is evaluated on three publicly available benchmark datasets: Mendeley Liquid Based Cytology (4-class) dataset, Herlev Pap Smear (7-class) dataset, and the SIPaKMeD Pap Smear (5-class) dataset achieving classification accuracies of 99.47%, 98.32% and 97.87% respectively, thus justifying the reliability of the approach.
The relevant codes for the proposed approach can be found in: https://github.com/DVLP-CMATERJU/Two-Step-Feature-Enhancement

Submitted 9 June, 2021; originally announced June 2021.

Comments: 28 pages
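
The "preserving 99% of the variance" step in the abstract above amounts to keeping the fewest principal components whose eigenvalues account for at least that share of the total. A minimal sketch of that selection rule follows; the function name `components_for_variance` and the eigenvalues are hypothetical, and the sketch assumes the covariance eigenvalues have already been computed elsewhere. It is not taken from the authors' linked repository.

```python
def components_for_variance(eigenvalues, target=0.99):
    """Smallest number of leading components whose share of variance >= target."""
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)


# Hypothetical covariance eigenvalues for a four-dimensional feature space.
kept = components_for_variance([0.05, 3.0, 6.0, 0.95], target=0.99)
```

With these made-up values the three largest components carry 99.5% of the variance, so the fourth is dropped. In the pipeline the abstract describes, the reduced features would then go to the Grey Wolf Optimizer for subset selection.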

arXiv:2104.11620 [pdf, other]

GuideBP: Guiding Backpropagation Through Weaker Pathways of Parallel Logits

Authors: Bodhisatwa Mandal, Swarnendu Ghosh, Teresa Gonçalves, Paulo Quaresma, Mita Nasipuri, Nibaran Das

Abstract: Convolutional neural networks often generate multiple logits and use simple techniques like addition or averaging for loss computation. But this allows gradients to be distributed equally among all paths. The proposed approach guides the gradients of backpropagation along the weakest concept representations. A weakness score defines the class-specific performance of individual pathways, which is then used to create a logit that would guide gradients along the weakest pathways. The proposed approach has been shown to perform better than traditional column merging techniques and can be used in several application scenarios. Not only can the proposed model be used as an efficient technique for training multiple instances of a model in parallel, but CNNs with multiple output branches have also been shown to perform better with the proposed upgrade. Various experiments establish the flexibility of the learning technique, which is simple yet effective in various multi-objective scenarios both empirically and statistically.

Submitted 23 April, 2021; originally announced April 2021.

arXiv:2104.06348 [pdf, other]

Optimal Multi-Manipulator Arm Placement for Maximal Dexterity during Robotics Surgery

Authors: James Di, Mingwei Xu, Nikhil Das, Michael C. Yip

Abstract: Robot arm placements are oftentimes a limitation in surgical preoperative procedures, relying on trained staff to evaluate and decide on the optimal positions for the arms. Given new and different patient anatomies, it can be challenging to make an informed choice, leading to more frequently colliding arms or limited manipulator workspaces. In this paper, we develop a method to generate the optimal manipulator base positions for the multi-port da Vinci surgical system that minimizes self-collision and environment-collision, and maximizes the surgeon's reachability inside the patient. Scoring functions are defined for each criterion so that they may be optimized over. Since for multi-manipulator setups, a large number of free parameters are available to adjust the base positioning of each arm, a challenge becomes how one can expediently assess possible setups. We thus also propose methods that perform fast queries of each measure with the use of a proxy collision-checker. We then develop an optimization method to determine the optimal position using the scoring functions. We evaluate the optimality of the base positions for the robot arms on canonical trajectories, and show that the solution yielded by the optimization program can satisfy each criterion. The metrics and optimization strategy are generalizable to other surgical robotic platforms so that patient-side manipulator positioning may be optimized and solved.

Submitted 13 April, 2021; originally announced April 2021.

arXiv:2103.16435 [pdf, other]

EnergyVis: Interactively Tracking and Exploring Energy Consumption for ML Models

Authors: Omar Shaikh, Jon Saad-Falcon, Austin P Wright, Nilaksh Das, Scott Freitas, Omar Isaac Asensio, Duen Horng Chau

Abstract: The advent of larger machine learning (ML) models has improved state-of-the-art (SOTA) performance in various modeling tasks, ranging from computer vision to natural language. As ML models continue increasing in size, so do their respective energy consumption and computational requirements. However, the methods for tracking, reporting, and comparing energy consumption remain limited. We present EnergyVis, an interactive energy consumption tracker for ML models. Consisting of multiple coordinated views, EnergyVis enables researchers to interactively track, visualize and compare model energy consumption across key energy consumption and carbon footprint metrics (kWh and CO2), helping users explore alternative deployment locations and hardware that may reduce carbon footprints. EnergyVis aims to raise awareness concerning computational sustainability by interactively highlighting excessive energy usage during model training; and by providing alternative training options to reduce energy usage.

Submitted 30 March, 2021; originally announced March 2021.

Comments: 7 pages, 5 figures; CHI 2021 Extended Abstracts

arXiv:2102.10335 [pdf, ps, other]

Exploring Knowledge Distillation of a Deep Neural Network for Multi-Script identification

Authors: Shuvayan Ghosh Dastidar, Kalpita Dutta, Nibaran Das, Mahantapas Kundu, Mita Nasipuri

Abstract: Multi-lingual script identification is a difficult task consisting of different languages with complex backgrounds in scene text images. According to the current research scenario, deep neural networks are employed as teacher models to train a smaller student network by utilizing the teacher model's predictions. This process is known as dark knowledge transfer. It has been quite successful in many domains where the final result obtained is unachievable through directly training the student network with a simple architecture. In this paper, we explore the dark knowledge transfer approach using a long short-term memory (LSTM) and CNN based assistant model and various deep neural networks as the teacher model, with a simple CNN based student network, in this domain of multi-script identification from natural scene text images. We explore the performance of different teacher models and their ability to transfer knowledge to a student network. Despite the student network's limited size, our approach obtains satisfactory results on a well-known script identification dataset, CVSI-2015.

Submitted 20 February, 2021; originally announced February 2021.

Comments: 14 pages, 6 figures, 7 tables

arXiv:2102.07413 [pdf, other]

DiffCo: Auto-Differentiable Proxy Collision Detection with Multi-class Labels for Safety-Aware Trajectory Optimization

Authors: Yuheng Zhi, Nikhil Das, Michael Yip

Abstract: The objective of trajectory optimization algorithms is to achieve an optimal collision-free path between a start and goal state. In real-world scenarios where environments can be complex and non-homogeneous, a robot needs to be able to gauge whether a state will be in collision with various objects in order to meet some safety metrics. The collision detector should be computationally efficient and, ideally, analytically differentiable to facilitate stable and rapid gradient descent during optimization. However, methods today lack an elegant approach to detect collision differentiably, relying rather on numerical gradients that can be unstable. We present DiffCo, the first, fully auto-differentiable, non-parametric model for collision detection. Its non-parametric behavior allows one to compute collision boundaries on-the-fly and update them, requiring no pre-training and allowing it to update continuously in dynamic environments. It provides robust gradients for trajectory optimization via backpropagation and is often 10-100x faster to compute than its geometric counterparts. DiffCo also extends trivially to modeling different object collision classes for semantically informed trajectory optimization.

Submitted 18 February, 2022; v1 submitted 15 February, 2021; originally announced February 2021.

Comments: This work has been accepted for publication at IEEE Transactions on Robotics

arXiv:2102.03115 [pdf]

Multispectral Object Detection with Deep Learning

Authors: Md Osman Gani, Somenath Kuiry, Alaka Das, Mita Nasipuri, Nibaran Das

Abstract: Object detection in natural scenes can be a challenging task. In many real-life situations, the visible spectrum is not suitable for traditional computer vision tasks. Moving outside the visible spectrum range, such as to the thermal spectrum or near-infrared (NIR) images, is much more beneficial in low-visibility conditions, and NIR images are very helpful for understanding the object's material quality. In this work, we have taken images with both the Thermal and NIR spectrum for the object detection task. As multi-spectral data with both Thermal and NIR is not available for the detection task, we needed to collect data ourselves. Data collection is a time-consuming process, and we faced many obstacles that we had to overcome. We train the YOLO v3 network from scratch to detect an object from multi-spectral images. Also, to avoid overfitting, we have performed data augmentation and tuned hyperparameters.

Submitted 5 February, 2021; originally announced February 2021.

arXiv:2102.01120 [pdf, other]

RectiNet-v2: A stacked network architecture for document image dewarping

Authors: Hmrishav Bandyopadhyay, Tanmoy Dasgupta, Nibaran Das, Mita Nasipuri

Abstract: With the advent of mobile and hand-held cameras, document images have found their way into almost every domain. Dewarping of these images for the removal of perspective distortions and folds is essential so that they can be understood by document recognition algorithms. For this, we propose an end-to-end CNN architecture that can produce distortion free document images from warped documents it takes as input. We train this model on warped document images simulated synthetically to compensate for lack of enough natural data. Our method is novel in the use of a bifurcated decoder with shared weights to prevent intermingling of grid coordinates, in the use of residual networks in the U-Net skip connections to allow flow of data from different receptive fields in the model, and in the use of a gated network to help the model focus on structure and line level detail of the document image. We evaluate our method on the DocUNet dataset, a benchmark in this domain, and obtain results comparable to state-of-the-art methods.

Submitted 1 February, 2021; originally announced February 2021.

Comments: 6 Pages

arXiv:2101.10586 [pdf, other]

SkeletonVis: Interactive Visualization for Understanding Adversarial Attacks on Human Action Recognition Models

Authors: Haekyu Park, Zijie J. Wang, Nilaksh Das, Anindya S. Paul, Pruthvi Perumalla, Zhiyan Zhou, Duen Horng Chau

Abstract: Skeleton-based human action recognition technologies are increasingly used in video based applications, such as home robotics, healthcare on aging population, and surveillance. However, such models are vulnerable to adversarial attacks, raising serious concerns for their use in safety-critical applications. To develop an effective defense against attacks, it is essential to understand how such attacks mislead the pose detection models into making incorrect predictions. We present SkeletonVis, the first interactive system that visualizes how the attacks work on the models to enhance human understanding of attacks.

Submitted 26 January, 2021; originally announced January 2021.

Comments: Accepted at AAAI'21 Demo

arXiv:2101.05560 [pdf, ps, other]

doi 10.26421/QIC21.3-4-2

Secure Multi-Party Quantum Conference and Xor Computation

Authors: Nayana Das, Goutam Paul

Abstract: Quantum conference is a process of securely exchanging messages between three or more parties, using quantum resources. A Measurement Device Independent Quantum Dialogue (MDI-QD) protocol that is secure against information leakage was proposed in 2017 (Quantum Information Processing 16.12 (2017): 305), but it is proven to be insecure against the intercept-and-resend attack strategy. We first modify this protocol and generalize this MDI-QD to a three-party quantum conference and then to a multi-party quantum conference. We also propose a protocol for quantum multi-party XOR computation. None of the three protocols proposed here uses entanglement as a resource, and we prove the correctness and security of our proposed protocols.

Submitted 14 January, 2021; originally announced January 2021.

Comments: Accepted in Quantum Information and Computation

Journal ref: Quantum Information and Computation, Vol.21 No.3&4 March 2021

arXiv:2101.03577 [pdf, other]

doi 10.1007/s10773-021-04952-4

Quantum Secure Direct Communication with Mutual Authentication using a Single Basis

Authors: Nayana Das, Goutam Paul, Ritajit Majumdar

Abstract: In this paper, we propose a new theoretical scheme for quantum secure direct communication (QSDC) with user authentication. Different from the previous QSDC protocols, the present protocol uses only one orthogonal basis of single-qubit states to encode the secret message. Moreover, this is a one-time and one-way communication protocol, which uses qubits prepared in a randomly chosen arbitrary basis to transmit the secret message. We discuss the security of the proposed protocol against some common attacks and show that no eavesdropper can get any information from the quantum and classical channels. We have also studied the performance of this protocol under realistic device noise. We have executed the protocol on the IBMQ Armonk device and proposed a repetition code based protection scheme that requires minimal overhead.

Submitted 14 January, 2021; v1 submitted 10 January, 2021; originally announced January 2021.

Journal ref: International Journal of Theoretical Physics (2021)

arXiv:2011.03882 [pdf, other]

Multi-Modal Learning of Keypoint Predictive Models for Visual Object Manipulation

Authors: Sarah Bechtle, Neha Das, Franziska Meier

Abstract: Humans have impressive generalization capabilities when it comes to manipulating objects and tools in completely novel environments. These capabilities are, at least partially, a result of humans having internal models of their bodies and any grasped object. How to learn such body schemas for robots remains an open problem. In this work, we develop a self-supervised approach that can extend a robot's kinematic model when grasping an object from visual latent representations. Our framework comprises two components: (1) we present a multi-modal keypoint detector: an autoencoder architecture trained by fusing proprioception and vision to predict visual keypoints on an object; (2) we show how we can use our learned keypoint detector to learn an extension of the kinematic chain by regressing virtual joints from the predicted visual keypoints. Our evaluation shows that our approach learns to consistently predict visual keypoints on objects in the manipulator's hand, and thus can easily facilitate learning an extended kinematic chain to include the object grasped in various configurations, from a few seconds of visual data. Finally, we show that this extended kinematic chain lends itself to object manipulation tasks such as placing a grasped object, and we present experiments in simulation and on hardware.

Submitted 25 June, 2021; v1 submitted 7 November, 2020; originally announced November 2020.

Showing 1–50 of 118 results for author: Das, N