Search | arXiv e-print repository

Joint Data Inpainting and Graph Learning via Unrolled Neural Networks

Authors: Subbareddy Batreddy, Pushkal Mishra, Yaswanth Kakarla, Aditya Siripuram

Abstract: Given partial measurements of a time-varying graph signal, we propose an algorithm to simultaneously estimate both the underlying graph topology and the missing measurements. The proposed algorithm operates by training an interpretable neural network, designed from the unrolling framework. The proposed technique can be used both as a graph learning and a graph signal reconstruction algorithm. This… ▽ More Given partial measurements of a time-varying graph signal, we propose an algorithm to simultaneously estimate both the underlying graph topology and the missing measurements. The proposed algorithm operates by training an interpretable neural network, designed from the unrolling framework. The proposed technique can be used both as a graph learning and a graph signal reconstruction algorithm. This work enhances prior work in graph signal reconstruction by allowing the underlying graph to be unknown; and also builds on prior work in graph learning by tailoring the learned graph to the signal reconstruction task. △ Less

Submitted 16 July, 2024; originally announced July 2024.

arXiv:2407.04866 [pdf, other]

Explainable Metric Learning for Deflating Data Bias

Authors: Emma Andrews, Prabhat Mishra

Abstract: Image classification is an essential part of computer vision which assigns a given input image to a specific category based on the similarity evaluation within given criteria. While promising classifiers can be obtained through deep learning models, these approaches lack explainability, where the classification results are hard to interpret in a human-understandable way. In this paper, we present… ▽ More Image classification is an essential part of computer vision which assigns a given input image to a specific category based on the similarity evaluation within given criteria. While promising classifiers can be obtained through deep learning models, these approaches lack explainability, where the classification results are hard to interpret in a human-understandable way. In this paper, we present an explainable metric learning framework, which constructs hierarchical levels of semantic segments of an image for better interpretability. The key methodology involves a bottom-up learning strategy, starting by training the local metric learning model for the individual segments and then combining segments to compose comprehensive metrics in a tree. Specifically, our approach enables a more human-understandable similarity measurement between two images based on the semantic segments within it, which can be utilized to generate new samples to reduce bias in a training dataset. Extensive experimental evaluation demonstrates that the proposed approach can drastically improve model accuracy compared with state-of-the-art methods. △ Less

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.03689 [pdf, other]

Text2TimeSeries: Enhancing Financial Forecasting through Time Series Prediction Updates with Event-Driven Insights from Large Language Models

Authors: Litton Jose Kurisinkel, Pruthwik Mishra, Yue Zhang

Abstract: Time series models, typically trained on numerical data, are designed to forecast future values. These models often rely on weighted averaging techniques over time intervals. However, real-world time series data is seldom isolated and is frequently influenced by non-numeric factors. For instance, stock price fluctuations are impacted by daily random events in the broader world, with each event exe… ▽ More Time series models, typically trained on numerical data, are designed to forecast future values. These models often rely on weighted averaging techniques over time intervals. However, real-world time series data is seldom isolated and is frequently influenced by non-numeric factors. For instance, stock price fluctuations are impacted by daily random events in the broader world, with each event exerting a unique influence on price signals. Previously, forecasts in financial markets have been approached in two main ways: either as time-series problems over price sequence or sentiment analysis tasks. The sentiment analysis tasks aim to determine whether news events will have a positive or negative impact on stock prices, often categorizing them into discrete labels. Recognizing the need for a more comprehensive approach to accurately model time series prediction, we propose a collaborative modeling framework that incorporates textual information about relevant events for predictions. Specifically, we leverage the intuition of large language models about future changes to update real number time series predictions. We evaluated the effectiveness of our approach on financial market data. △ Less

Submitted 4 July, 2024; originally announced July 2024.

Comments: 21 pages, 12 figures

arXiv:2406.10893 [pdf, other]

Development and Validation of Fully Automatic Deep Learning-Based Algorithms for Immunohistochemistry Reporting of Invasive Breast Ductal Carcinoma

Authors: Sumit Kumar Jha, Purnendu Mishra, Shubham Mathur, Gursewak Singh, Rajiv Kumar, Kiran Aatre, Suraj Rengarajan

Abstract: Immunohistochemistry (IHC) analysis is a well-accepted and widely used method for molecular subtyping, a procedure for prognosis and targeted therapy of breast carcinoma, the most common type of tumor affecting women. There are four molecular biomarkers namely progesterone receptor (PR), estrogen receptor (ER), antigen Ki67, and human epidermal growth factor receptor 2 (HER2) whose assessment is n… ▽ More Immunohistochemistry (IHC) analysis is a well-accepted and widely used method for molecular subtyping, a procedure for prognosis and targeted therapy of breast carcinoma, the most common type of tumor affecting women. There are four molecular biomarkers namely progesterone receptor (PR), estrogen receptor (ER), antigen Ki67, and human epidermal growth factor receptor 2 (HER2) whose assessment is needed under IHC procedure to decide prognosis as well as predictors of response to therapy. However, IHC scoring is based on subjective microscopic examination of tumor morphology and suffers from poor reproducibility, high subjectivity, and often incorrect scoring in low-score cases. In this paper, we present, a deep learning-based semi-supervised trained, fully automatic, decision support system (DSS) for IHC scoring of invasive ductal carcinoma. Our system automatically detects the tumor region removing artifacts and scores based on Allred standard. The system is developed using 3 million pathologist-annotated image patches from 300 slides, fifty thousand in-house cell annotations, and forty thousand pixels marking of HER2 membrane. We have conducted multicentric trials at four centers with three different types of digital scanners in terms of percentage agreement with doctors. And achieved agreements of 95, 92, 88 and 82 percent for Ki67, HER2, ER, and PR stain categories, respectively. In addition to overall accuracy, we found that there is 5 percent of cases where pathologist have changed their score in favor of algorithm score while reviewing with detailed algorithmic analysis. Our approach could improve the accuracy of IHC scoring and subsequent therapy decisions, particularly where specialist expertise is unavailable. Our system is highly modular. The proposed algorithm modules can be used to develop DSS for other cancer types. △ Less

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2406.05828 [pdf, other]

Multi-Stain Multi-Level Convolutional Network for Multi-Tissue Breast Cancer Image Segmentation

Authors: Akash Modi, Sumit Kumar Jha, Purnendu Mishra, Rajiv Kumar, Kiran Aatre, Gursewak Singh, Shubham Mathur

Abstract: Digital pathology and microscopy image analysis are widely employed in the segmentation of digitally scanned IHC slides, primarily to identify cancer and pinpoint regions of interest (ROI) indicative of tumor presence. However, current ROI segmentation models are either stain-specific or suffer from the issues of stain and scanner variance due to different staining protocols or modalities across m… ▽ More Digital pathology and microscopy image analysis are widely employed in the segmentation of digitally scanned IHC slides, primarily to identify cancer and pinpoint regions of interest (ROI) indicative of tumor presence. However, current ROI segmentation models are either stain-specific or suffer from the issues of stain and scanner variance due to different staining protocols or modalities across multiple labs. Also, tissues like Ductal Carcinoma in Situ (DCIS), acini, etc. are often classified as Tumors due to their structural similarities and color compositions. In this paper, we proposed a novel convolutional neural network (CNN) based Multi-class Tissue Segmentation model for histopathology whole-slide Breast slides which classify tumors and segments other tissue regions such as Ducts, acini, DCIS, Squamous epithelium, Blood Vessels, Necrosis, etc. as a separate class. Our unique pixel-aligned non-linear merge across spatial resolutions empowers models with both local and global fields of view for accurate detection of various classes. Our proposed model is also able to separate bad regions such as folds, artifacts, blurry regions, bubbles, etc. from tissue regions using multi-level context from different resolutions of WSI. Multi-phase iterative training with context-aware augmentation and increasing noise was used to efficiently train a multi-stain generic model with partial and noisy annotations from 513 slides. Our training pipeline used 12 million patches generated using context-aware augmentations which made our model stain and scanner invariant across data sources. To extrapolate stain and scanner invariance, our model was evaluated on 23000 patches which were for a completely new stain (Hematoxylin and Eosin) from a completely new scanner (Motic) from a different lab. The mean IOU was 0.72 which is on par with model performance on other data sources and scanners. △ Less

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2405.07007 [pdf, ps, other]

A New Algorithm for Computing Branch Number of Non-Singular Matrices over Finite Fields

Authors: P. R. Mishra, Yogesh Kumar, Susanta Samanta, Atul Gaur

Abstract: The notion of branch numbers of a linear transformation is crucial for both linear and differential cryptanalysis. The number of non-zero elements in a state difference or linear mask directly correlates with the active S-Boxes. The differential or linear branch number indicates the minimum number of active S-Boxes in two consecutive rounds of an SPN cipher, specifically for differential or linear… ▽ More The notion of branch numbers of a linear transformation is crucial for both linear and differential cryptanalysis. The number of non-zero elements in a state difference or linear mask directly correlates with the active S-Boxes. The differential or linear branch number indicates the minimum number of active S-Boxes in two consecutive rounds of an SPN cipher, specifically for differential or linear cryptanalysis, respectively. This paper presents a new algorithm for computing the branch number of non-singular matrices over finite fields. The algorithm is based on the existing classical method but demonstrates improved computational complexity compared to its predecessor. We conduct a comparative study of the proposed algorithm and the classical approach, providing an analytical estimation of the algorithm's complexity. Our analysis reveals that the computational complexity of our algorithm is the square root of that of the classical approach. △ Less

Submitted 11 May, 2024; originally announced May 2024.

arXiv:2405.04829 [pdf, other]

Fine-tuning Pre-trained Named Entity Recognition Models For Indian Languages

Authors: Sankalp Bahad, Pruthwik Mishra, Karunesh Arora, Rakesh Chandra Balabantaray, Dipti Misra Sharma, Parameswari Krishnamurthy

Abstract: Named Entity Recognition (NER) is a useful component in Natural Language Processing (NLP) applications. It is used in various tasks such as Machine Translation, Summarization, Information Retrieval, and Question-Answering systems. The research on NER is centered around English and some other major languages, whereas limited attention has been given to Indian languages. We analyze the challenges an… ▽ More Named Entity Recognition (NER) is a useful component in Natural Language Processing (NLP) applications. It is used in various tasks such as Machine Translation, Summarization, Information Retrieval, and Question-Answering systems. The research on NER is centered around English and some other major languages, whereas limited attention has been given to Indian languages. We analyze the challenges and propose techniques that can be tailored for Multilingual Named Entity Recognition for Indian Languages. We present a human annotated named entity corpora of 40K sentences for 4 Indian languages from two of the major Indian language families. Additionally,we present a multilingual model fine-tuned on our dataset, which achieves an F1 score of 0.80 on our dataset on average. We achieve comparable performance on completely unseen benchmark datasets for Indian languages which affirms the usability of our model. △ Less

Submitted 10 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

Comments: 8 pages, accepted in NAACL-SRW, 2024

arXiv:2404.18840 [pdf, other]

Fast Quantum Process Tomography via Riemannian Gradient Descent

Authors: Daniel Volya, Andrey Nikitin, Prabhat Mishra

Abstract: Constrained optimization plays a crucial role in the fields of quantum physics and quantum information science and becomes especially challenging for high-dimensional complex structure problems. One specific issue is that of quantum process tomography, in which the goal is to retrieve the underlying quantum process based on a given set of measurement data. In this paper, we introduce a modified ve… ▽ More Constrained optimization plays a crucial role in the fields of quantum physics and quantum information science and becomes especially challenging for high-dimensional complex structure problems. One specific issue is that of quantum process tomography, in which the goal is to retrieve the underlying quantum process based on a given set of measurement data. In this paper, we introduce a modified version of stochastic gradient descent on a Riemannian manifold that integrates recent advancements in numerical methods for Riemannian optimization. This approach inherently supports the physically driven constraints of a quantum process, takes advantage of state-of-the-art large-scale stochastic objective optimization, and has superior performance to traditional approaches such as maximum likelihood estimation and projected least squares. The data-driven approach enables accurate, order-of-magnitude faster results, and works with incomplete data. We demonstrate our approach on simulations of quantum processes and in hardware by characterizing an engineered process on quantum computers. △ Less

Submitted 29 April, 2024; originally announced April 2024.

arXiv:2404.18469 [pdf, other]

Error-Resilient Weakly Constrained Coding via Row-by-Row Coding

Authors: Prachi Mishra, Navin Kashyap

Abstract: A weakly constrained code is a collection of finite-length strings over a finite alphabet in which certain substrings or patterns occur according to some prescribed frequencies. Buzaglo and Siegel (ITW 2017) gave a construction of weakly constrained codes based on row-by-row coding, that achieved the capacity of the weak constraint. In this paper, we propose a method to make this row-by-row coding… ▽ More A weakly constrained code is a collection of finite-length strings over a finite alphabet in which certain substrings or patterns occur according to some prescribed frequencies. Buzaglo and Siegel (ITW 2017) gave a construction of weakly constrained codes based on row-by-row coding, that achieved the capacity of the weak constraint. In this paper, we propose a method to make this row-by-row coding scheme resilient to errors. △ Less

Submitted 29 April, 2024; originally announced April 2024.

Comments: 9 pages, a shorter version is submitted at the International Symposium on Information Theory (ISIT) 2024

arXiv:2404.13194 [pdf, other]

Privacy-Preserving Debiasing using Data Augmentation and Machine Unlearning

Authors: Zhixin Pan, Emma Andrews, Laura Chang, Prabhat Mishra

Abstract: Data augmentation is widely used to mitigate data bias in the training dataset. However, data augmentation exposes machine learning models to privacy attacks, such as membership inference attacks. In this paper, we propose an effective combination of data augmentation and machine unlearning, which can reduce data bias while providing a provable defense against known attacks. Specifically, we maint… ▽ More Data augmentation is widely used to mitigate data bias in the training dataset. However, data augmentation exposes machine learning models to privacy attacks, such as membership inference attacks. In this paper, we propose an effective combination of data augmentation and machine unlearning, which can reduce data bias while providing a provable defense against known attacks. Specifically, we maintain the fairness of the trained model with diffusion-based data augmentation, and then utilize multi-shard unlearning to remove identifying information of original data from the ML model for protection against privacy attacks. Experimental evaluation across diverse datasets demonstrates that our approach can achieve significant improvements in bias reduction as well as robustness against state-of-the-art privacy attacks. △ Less

Submitted 19 April, 2024; originally announced April 2024.

arXiv:2404.08250 [pdf, other]

doi 10.1007/s12190-024-02142-z

A Systematic Construction Approach for All $4\times 4$ Involutory MDS Matrices

Authors: Yogesh Kumar, P. R. Mishra, Susanta Samanta, Atul Gaur

Abstract: Maximum distance separable (MDS) matrices play a crucial role not only in coding theory but also in the design of block ciphers and hash functions. Of particular interest are involutory MDS matrices, which facilitate the use of a single circuit for both encryption and decryption in hardware implementations. In this article, we present several characterizations of involutory MDS matrices of even or… ▽ More Maximum distance separable (MDS) matrices play a crucial role not only in coding theory but also in the design of block ciphers and hash functions. Of particular interest are involutory MDS matrices, which facilitate the use of a single circuit for both encryption and decryption in hardware implementations. In this article, we present several characterizations of involutory MDS matrices of even order. Additionally, we introduce a new matrix form for obtaining all involutory MDS matrices of even order and compare it with other matrix forms available in the literature. We then propose a technique to systematically construct all $4 \times 4$ involutory MDS matrices over a finite field $\mathbb{F}_{2^m}$. This method significantly reduces the search space by focusing on involutory MDS class representative matrices, leading to the generation of all such matrices within a substantially smaller set compared to considering all $4 \times 4$ involutory matrices. Specifically, our approach involves searching for these representative matrices within a set of cardinality $(2^m-1)^5$. Through this method, we provide an explicit enumeration of the total number of $4 \times 4$ involutory MDS matrices over $\mathbb{F}_{2^m}$ for $m=3,4,\ldots,8$. △ Less

Submitted 17 June, 2024; v1 submitted 12 April, 2024; originally announced April 2024.

Journal ref: Journal of Applied Mathematics and Computing, 14 Jun 2024

arXiv:2404.02512 [pdf, other]

Towards Large Language Model driven Reference-less Translation Evaluation for English and Indian Languages

Authors: Vandan Mujadia, Pruthwik Mishra, Arafat Ahsan, Dipti Misra Sharma

Abstract: With the primary focus on evaluating the effectiveness of large language models for automatic reference-less translation assessment, this work presents our experiments on mimicking human direct assessment to evaluate the quality of translations in English and Indian languages. We constructed a translation evaluation task where we performed zero-shot learning, in-context example-driven learning, an… ▽ More With the primary focus on evaluating the effectiveness of large language models for automatic reference-less translation assessment, this work presents our experiments on mimicking human direct assessment to evaluate the quality of translations in English and Indian languages. We constructed a translation evaluation task where we performed zero-shot learning, in-context example-driven learning, and fine-tuning of large language models to provide a score out of 100, where 100 represents a perfect translation and 1 represents a poor translation. We compared the performance of our trained systems with existing methods such as COMET, BERT-Scorer, and LABSE, and found that the LLM-based evaluator (LLaMA-2-13B) achieves a comparable or higher overall correlation with human judgments for the considered Indian language pairs. △ Less

Submitted 3 April, 2024; originally announced April 2024.

Comments: arXiv admin note: text overlap with arXiv:2311.09216

arXiv:2404.01822 [pdf, other]

A (More) Realistic Evaluation Setup for Generalisation of Community Models on Malicious Content Detection

Authors: Ivo Verhoeven, Pushkar Mishra, Rahel Beloch, Helen Yannakoudakis, Ekaterina Shutova

Abstract: Community models for malicious content detection, which take into account the context from a social graph alongside the content itself, have shown remarkable performance on benchmark datasets. Yet, misinformation and hate speech continue to propagate on social media networks. This mismatch can be partially attributed to the limitations of current evaluation setups that neglect the rapid evolution… ▽ More Community models for malicious content detection, which take into account the context from a social graph alongside the content itself, have shown remarkable performance on benchmark datasets. Yet, misinformation and hate speech continue to propagate on social media networks. This mismatch can be partially attributed to the limitations of current evaluation setups that neglect the rapid evolution of online content and the underlying social graph. In this paper, we propose a novel evaluation setup for model generalisation based on our few-shot subgraph sampling approach. This setup tests for generalisation through few labelled examples in local explorations of a larger graph, emulating more realistic application settings. We show this to be a challenging inductive setup, wherein strong performance on the training graph is not indicative of performance on unseen tasks, domains, or graph structures. Lastly, we show that graph meta-learners trained with our proposed few-shot subgraph sampling outperform standard community models in the inductive setup. We make our code publicly available. △ Less

Submitted 2 April, 2024; originally announced April 2024.

Comments: To be published at Findings of NAACL 2024

arXiv:2403.10372 [pdf, other]

Construction of all MDS and involutory MDS matrices

Authors: Yogesh Kumar, P. R. Mishra, Susanta Samanta, Kishan Chand Gupta, Atul Gaur

Abstract: In this paper, we propose two algorithms for a hybrid construction of all $n\times n$ MDS and involutory MDS matrices over a finite field $\mathbb{F}_{p^m}$, respectively. The proposed algorithms effectively narrow down the search space to identify $(n-1) \times (n-1)$ MDS matrices, facilitating the generation of all $n \times n$ MDS and involutory MDS matrices over $\mathbb{F}_{p^m}$. To the best… ▽ More In this paper, we propose two algorithms for a hybrid construction of all $n\times n$ MDS and involutory MDS matrices over a finite field $\mathbb{F}_{p^m}$, respectively. The proposed algorithms effectively narrow down the search space to identify $(n-1) \times (n-1)$ MDS matrices, facilitating the generation of all $n \times n$ MDS and involutory MDS matrices over $\mathbb{F}_{p^m}$. To the best of our knowledge, existing literature lacks methods for generating all $n\times n$ MDS and involutory MDS matrices over $\mathbb{F}_{p^m}$. In our approach, we introduce a representative matrix form for generating all $n\times n$ MDS and involutory MDS matrices over $\mathbb{F}_{p^m}$. The determination of these representative MDS matrices involves searching through all $(n-1)\times (n-1)$ MDS matrices over $\mathbb{F}_{p^m}$. Our contributions extend to proving that the count of all $3\times 3$ MDS matrices over $\mathbb{F}_{2^m}$ is precisely $(2^m-1)^5(2^m-2)(2^m-3)(2^{2m}-9\cdot 2^m+21)$. Furthermore, we explicitly provide the count of all $4\times 4$ MDS and involutory MDS matrices over $\mathbb{F}_{2^m}$ for $m=2, 3, 4$. △ Less

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2403.08132 [pdf, other]

Information Leakage through Physical Layer Supply Voltage Coupling Vulnerability

Authors: Sahan Sanjaya, Aruna Jayasena, Prabhat Mishra

Abstract: Side-channel attacks exploit variations in non-functional behaviors to expose sensitive information across security boundaries. Existing methods leverage side-channels based on power consumption, electromagnetic radiation, silicon substrate coupling, and channels created by malicious implants. Power-based side-channel attacks are widely known for extracting information from data processed within a… ▽ More Side-channel attacks exploit variations in non-functional behaviors to expose sensitive information across security boundaries. Existing methods leverage side-channels based on power consumption, electromagnetic radiation, silicon substrate coupling, and channels created by malicious implants. Power-based side-channel attacks are widely known for extracting information from data processed within a device while assuming that an attacker has physical access or the ability to modify the device. In this paper, we introduce a novel side-channel vulnerability that leaks data-dependent power variations through physical layer supply voltage coupling (PSVC). Unlike traditional power side-channel attacks, the proposed vulnerability allows an adversary to mount an attack and extract information without modifying the device. We assess the effectiveness of PSVC vulnerability through three case studies, demonstrating several end-to-end attacks on general-purpose microcontrollers with varying adversary capabilities. These case studies provide evidence for the existence of PSVC vulnerability, its applicability for on-chip as well as on-board side-channel attacks, and how it can eliminate the need for physical access to the target device, making it applicable to any off-the-shelf hardware. Our experiments also reveal that designing devices to operate at the lowest operational voltage significantly reduces the risk of PSVC side-channel vulnerability. △ Less

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2402.13919 [pdf, other]

SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization

Authors: Prakamya Mishra, Zonghai Yao, Parth Vashisht, Feiyun Ouyang, Beining Wang, Vidhi Dhaval Mody, Hong Yu

Abstract: Large Language Models (LLMs) such as GPT & Llama have demonstrated significant achievements in summarization tasks but struggle with factual inaccuracies, a critical issue in clinical NLP applications where errors could lead to serious consequences. To counter the high costs and limited availability of expert-annotated data for factual alignment, this study introduces an innovative pipeline that u… ▽ More Large Language Models (LLMs) such as GPT & Llama have demonstrated significant achievements in summarization tasks but struggle with factual inaccuracies, a critical issue in clinical NLP applications where errors could lead to serious consequences. To counter the high costs and limited availability of expert-annotated data for factual alignment, this study introduces an innovative pipeline that utilizes >100B parameter GPT variants like GPT-3.5 & GPT-4 to act as synthetic experts to generate high-quality synthetics feedback aimed at enhancing factual consistency in clinical note summarization. Our research primarily focuses on edit feedback generated by these synthetic feedback experts without additional human annotations, mirroring and optimizing the practical scenario in which medical professionals refine AI system outputs. Although such 100B+ parameter GPT variants have proven to demonstrate expertise in various clinical NLP tasks, such as the Medical Licensing Examination, there is scant research on their capacity to act as synthetic feedback experts and deliver expert-level edit feedback for improving the generation quality of weaker (<10B parameter) LLMs like GPT-2 (1.5B) & Llama 2 (7B) in clinical domain. So in this work, we leverage 100B+ GPT variants to act as synthetic feedback experts offering expert-level edit feedback, that is used to reduce hallucinations and align weaker (<10B parameter) LLMs with medical facts using two distinct alignment algorithms (DPO & SALT), endeavoring to narrow the divide between AI-generated content and factual accuracy. This highlights the substantial potential of LLM-based synthetic edits in enhancing the alignment of clinical factuality. △ Less

Submitted 18 April, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: Equal contribution for the first two authors

arXiv:2402.09769 [pdf, other]

Representation Learning Using a Single Forward Pass

Authors: Aditya Somasundaram, Pushkal Mishra, Ayon Borthakur

Abstract: We propose a neuroscience-inspired Solo Pass Embedded Learning Algorithm (SPELA). SPELA is a prime candidate for training and inference applications in Edge AI devices. At the same time, SPELA can optimally cater to the need for a framework to study perceptual representation learning and formation. SPELA has distinctive features such as neural priors (in the form of embedded vectors), no weight tr… ▽ More We propose a neuroscience-inspired Solo Pass Embedded Learning Algorithm (SPELA). SPELA is a prime candidate for training and inference applications in Edge AI devices. At the same time, SPELA can optimally cater to the need for a framework to study perceptual representation learning and formation. SPELA has distinctive features such as neural priors (in the form of embedded vectors), no weight transport, no update locking of weights, complete local Hebbian learning, single forward pass with no storage of activations, and single weight update per sample. Juxtaposed with traditional approaches, SPELA operates without the need for backpropagation. We show that our algorithm can perform nonlinear classification on a noisy boolean operation dataset. Additionally, we exhibit high performance using SPELA across MNIST, KMNIST, and Fashion MNIST. Lastly, we show the few-shot and 1-epoch learning capabilities of SPELA on MNIST, KMNIST, and Fashion MNIST, where it consistently outperforms backpropagation. △ Less

Submitted 15 February, 2024; originally announced February 2024.

Comments: Under review

arXiv:2401.15605 [pdf, other]

AI as a Medical Ally: Evaluating ChatGPT's Usage and Impact in Indian Healthcare

Authors: Aryaman Raina, Prateek Mishra, Harshit goyal, Dhruv Kumar

Abstract: This study investigates the integration and impact of Large Language Models (LLMs), like ChatGPT, in India's healthcare sector. Our research employs a dual approach, engaging both general users and medical professionals through surveys and interviews respectively. Our findings reveal that healthcare professionals value ChatGPT in medical education and preliminary clinical settings, but exercise ca… ▽ More This study investigates the integration and impact of Large Language Models (LLMs), like ChatGPT, in India's healthcare sector. Our research employs a dual approach, engaging both general users and medical professionals through surveys and interviews respectively. Our findings reveal that healthcare professionals value ChatGPT in medical education and preliminary clinical settings, but exercise caution due to concerns about reliability, privacy, and the need for cross-verification with medical references. General users show a preference for AI interactions in healthcare, but concerns regarding accuracy and trust persist. The study underscores the need for these technologies to complement, not replace, human medical expertise, highlighting the importance of developing LLMs in collaboration with healthcare providers. This paper enhances the understanding of LLMs in healthcare, detailing current usage, user trust, and improvement areas. Our insights inform future research and development, underscoring the need for ethically compliant, user-focused LLM advancements that address healthcare-specific challenges. △ Less

Submitted 28 January, 2024; originally announced January 2024.

Comments: Under review

arXiv:2312.14542 [pdf, other]

Automatic Data Retrieval for Cross Lingual Summarization

Authors: Nikhilesh Bhatnagar, Ashok Urlana, Vandan Mujadia, Pruthwik Mishra, Dipti Misra Sharma

Abstract: Cross-lingual summarization involves the summarization of text written in one language to a different one. There is a body of research addressing cross-lingual summarization from English to other European languages. In this work, we aim to perform cross-lingual summarization from English to Hindi. We propose pairing up the coverage of newsworthy events in textual and video format can prove to be h… ▽ More Cross-lingual summarization involves the summarization of text written in one language to a different one. There is a body of research addressing cross-lingual summarization from English to other European languages. In this work, we aim to perform cross-lingual summarization from English to Hindi. We propose pairing up the coverage of newsworthy events in textual and video format can prove to be helpful for data acquisition for cross lingual summarization. We analyze the data and propose methods to match articles to video descriptions that serve as document and summary pairs. We also outline filtering methods over reasonable thresholds to ensure the correctness of the summaries. Further, we make available 28,583 mono and cross-lingual article-summary pairs https://github.com/tingc9/Cross-Sum-News-Aligned. We also build and analyze multiple baselines on the collected data and report error analysis. △ Less

Submitted 22 December, 2023; originally announced December 2023.

Comments: 6 pages, 6 tables, 2 figures, conference: ICON 2023

arXiv:2312.11395 [pdf, other]

Verb Categorisation for Hindi Word Problem Solving

Authors: Harshita Sharma, Pruthwik Mishra, Dipti Misra Sharma

Abstract: Word problem Solving is a challenging NLP task that deals with solving mathematical problems described in natural language. Recently, there has been renewed interest in developing word problem solvers for Indian languages. As part of this paper, we have built a Hindi arithmetic word problem solver which makes use of verbs. Additionally, we have created verb categorization data for Hindi. Verbs are… ▽ More Word problem Solving is a challenging NLP task that deals with solving mathematical problems described in natural language. Recently, there has been renewed interest in developing word problem solvers for Indian languages. As part of this paper, we have built a Hindi arithmetic word problem solver which makes use of verbs. Additionally, we have created verb categorization data for Hindi. Verbs are very important for solving word problems with addition/subtraction operations as they help us identify the set of operations required to solve the word problems. We propose a rule-based solver that uses verb categorisation to identify operations in a word problem and generate answers for it. To perform verb categorisation, we explore several approaches and present a comparative study. △ Less

Submitted 18 December, 2023; originally announced December 2023.

Comments: 16 pages, 17 figures, ICON 2023 Conference

ACM Class: I.2.7

arXiv:2311.09212 [pdf, other]

Controllable Text Summarization: Unraveling Challenges, Approaches, and Prospects -- A Survey

Authors: Ashok Urlana, Pruthwik Mishra, Tathagato Roy, Rahul Mishra

Abstract: Generic text summarization approaches often fail to address the specific intent and needs of individual users. Recently, scholarly attention has turned to the development of summarization methods that are more closely tailored and controlled to align with specific objectives and user needs. Despite a growing corpus of controllable summarization research, there is no comprehensive survey available… ▽ More Generic text summarization approaches often fail to address the specific intent and needs of individual users. Recently, scholarly attention has turned to the development of summarization methods that are more closely tailored and controlled to align with specific objectives and user needs. Despite a growing corpus of controllable summarization research, there is no comprehensive survey available that thoroughly explores the diverse controllable attributes employed in this context, delves into the associated challenges, and investigates the existing solutions. In this survey, we formalize the Controllable Text Summarization (CTS) task, categorize controllable attributes according to their shared characteristics and objectives, and present a thorough examination of existing datasets and methods within each category. Moreover, based on our findings, we uncover limitations and research gaps, while also exploring potential solutions and future directions for CTS. We release our detailed analysis of CTS papers at https://github.com/ashokurlana/controllable_text_summarization_survey. △ Less

Submitted 27 May, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

Comments: 21 pages, 6 figures, Accepted in ACL Findings 2024

ACM Class: I.2.7

arXiv:2311.07818 [pdf, other]

Container Resource Allocation versus Performance of Data-intensive Applications on Different Cloud Servers

Authors: Qing Wang, Snigdhaswin Kar, Prabodh Mishra, Caleb Linduff, Ryan Izard, Khayam Anjam, Geddings Barrineau, Junaid Zulfiqar, Kuang-Ching Wang

Abstract: In recent years, data-intensive applications have been increasingly deployed on cloud systems. Such applications utilize significant compute, memory, and I/O resources to process large volumes of data. Optimizing the performance and cost-efficiency for such applications is a non-trivial problem. The problem becomes even more challenging with the increasing use of containers, which are popular due… ▽ More In recent years, data-intensive applications have been increasingly deployed on cloud systems. Such applications utilize significant compute, memory, and I/O resources to process large volumes of data. Optimizing the performance and cost-efficiency for such applications is a non-trivial problem. The problem becomes even more challenging with the increasing use of containers, which are popular due to their lower operational overheads and faster boot speed at the cost of weaker resource assurances for the hosted applications. In this paper, two containerized data-intensive applications with very different performance objectives and resource needs were studied on cloud servers with Docker containers running on Intel Xeon E5 and AMD EPYC Rome multi-core processors with a range of CPU, memory, and I/O configurations. Primary findings from our experiments include: 1) Allocating multiple cores to a compute-intensive application can improve performance, but only if the cores do not contend for the same caches, and the optimal core counts depend on the specific workload; 2) allocating more memory to a memory-intensive application than its deterministic data workload does not further improve performance; however, 3) having multiple such memory-intensive containers on the same server can lead to cache and memory bus contention leading to significant and volatile performance degradation. The comparative observations on Intel and AMD servers provided insights into trade-offs between larger numbers of distributed chiplets interconnected with higher speed buses (AMD) and larger numbers of centrally integrated cores and caches with lesser speed buses (Intel). For the two types of applications studied, the more distributed caches and faster data buses have benefited the deployment of larger numbers of containers. △ Less

Submitted 13 November, 2023; originally announced November 2023.

arXiv:2311.07015 [pdf, other]

QudCom: Towards Quantum Compilation for Qudit Systems

Authors: Daniel Volya, Prabhat Mishra

Abstract: Qudit-based quantum computation offers unique advantages over qubit-based systems in terms of noise mitigation capabilities as well as algorithmic complexity improvements. However, the software ecosystem for multi-state quantum systems is severely limited. In this paper, we highlight a quantum workflow for describing and compiling qudit systems. We investigate the design and implementation of a qu… ▽ More Qudit-based quantum computation offers unique advantages over qubit-based systems in terms of noise mitigation capabilities as well as algorithmic complexity improvements. However, the software ecosystem for multi-state quantum systems is severely limited. In this paper, we highlight a quantum workflow for describing and compiling qudit systems. We investigate the design and implementation of a quantum compiler for qudit systems. We also explore several key theoretical properties of qudit computing as well as efficient optimization techniques. Finally, we provide demonstrations using physical quantum computers as well as simulations of the proposed quantum toolchain. △ Less

Submitted 12 November, 2023; originally announced November 2023.

arXiv:2311.00579 [pdf, other]

Revealing CNN Architectures via Side-Channel Analysis in Dataflow-based Inference Accelerators

Authors: Hansika Weerasena, Prabhat Mishra

Abstract: Convolution Neural Networks (CNNs) are widely used in various domains. Recent advances in dataflow-based CNN accelerators have enabled CNN inference in resource-constrained edge devices. These dataflow accelerators utilize inherent data reuse of convolution layers to process CNN models efficiently. Concealing the architecture of CNN models is critical for privacy and security. This paper evaluates… ▽ More Convolution Neural Networks (CNNs) are widely used in various domains. Recent advances in dataflow-based CNN accelerators have enabled CNN inference in resource-constrained edge devices. These dataflow accelerators utilize inherent data reuse of convolution layers to process CNN models efficiently. Concealing the architecture of CNN models is critical for privacy and security. This paper evaluates memory-based side-channel information to recover CNN architectures from dataflow-based CNN inference accelerators. The proposed attack exploits spatial and temporal data reuse of the dataflow mapping on CNN accelerators and architectural hints to recover the structure of CNN models. Experimental results demonstrate that our proposed side-channel attack can recover the structures of popular CNN models, namely Lenet, Alexnet, and VGGnet16. △ Less

Submitted 1 November, 2023; originally announced November 2023.

arXiv:2310.20033 [pdf, other]

Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization

Authors: Prakamya Mishra, Zonghai Yao, Shuwei Chen, Beining Wang, Rohan Mittal, Hong Yu

Abstract: Large Language Models (LLMs) like the GPT and LLaMA families have demonstrated exceptional capabilities in capturing and condensing critical contextual information and achieving state-of-the-art performance in the summarization task. However, community concerns about these models' hallucination issues continue to rise. LLMs sometimes generate factually hallucinated summaries, which can be extremel… ▽ More Large Language Models (LLMs) like the GPT and LLaMA families have demonstrated exceptional capabilities in capturing and condensing critical contextual information and achieving state-of-the-art performance in the summarization task. However, community concerns about these models' hallucination issues continue to rise. LLMs sometimes generate factually hallucinated summaries, which can be extremely harmful in the clinical domain NLP tasks (e.g., clinical note summarization), where factually incorrect statements can lead to critically erroneous diagnoses. Fine-tuning LLMs using human feedback has shown the promise of aligning LLMs to be factually consistent during generation, but such training procedure requires high-quality human-annotated data, which can be extremely expensive to get in the clinical domain. In this work, we propose a new pipeline using ChatGPT instead of human experts to generate high-quality feedback data for improving factual consistency in the clinical note summarization task. We focus specifically on edit feedback because recent work discusses the shortcomings of human alignment via preference feedback in complex situations (such as clinical NLP tasks that require extensive expert knowledge), as well as some advantages of collecting edit feedback from domain experts. In addition, although GPT has reached the expert level in many clinical NLP tasks (e.g., USMLE QA), there is not much previous work discussing whether GPT can generate expert-level edit feedback for LMs in the clinical note summarization task. We hope to fill this gap. Finally, our evaluations demonstrate the potential use of GPT edits in human alignment, especially from a factuality perspective. △ Less

Submitted 3 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

Comments: To appear in NeurIPS 2023 Workshop SyntheticData4ML

arXiv:2309.15687 [pdf, other]

Breaking On-Chip Communication Anonymity using Flow Correlation Attacks

Authors: Hansika Weerasena, Prabhat Mishra

Abstract: Network-on-Chip (NoC) is widely used to facilitate communication between components in sophisticated System-on-Chip (SoC) designs. Security of the on-chip communication is crucial because exploiting any vulnerability in shared NoC would be a goldmine for an attacker that puts the entire computing infrastructure at risk. NoC security relies on effective countermeasures against diverse attacks, incl… ▽ More Network-on-Chip (NoC) is widely used to facilitate communication between components in sophisticated System-on-Chip (SoC) designs. Security of the on-chip communication is crucial because exploiting any vulnerability in shared NoC would be a goldmine for an attacker that puts the entire computing infrastructure at risk. NoC security relies on effective countermeasures against diverse attacks, including attacks on anonymity. We investigate the security strength of existing anonymous routing protocols in NoC architectures. Specifically, this paper makes two important contributions. We show that the existing anonymous routing is vulnerable to machine learning (ML) based flow correlation attacks on NoCs. We propose lightweight anonymous routing with traffic obfuscation techniques to defend against ML-based flow correlation attacks. Experimental studies using both real and synthetic traffic reveal that our proposed attack is successful against state-of-the-art anonymous routing in NoC architectures with high accuracy (up to 99%) for diverse traffic patterns, while our lightweight countermeasure can defend against ML-based attacks with minor hardware and performance overhead. △ Less

Submitted 1 February, 2024; v1 submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.15067 [pdf, other]

Logic Locking based Trojans: A Friend Turns Foe

Authors: Yuntao Liu, Aruna Jayasena, Prabhat Mishra, Ankur Srivastava

Abstract: Logic locking and hardware Trojans are two fields in hardware security that have been mostly developed independently from each other. In this paper, we identify the relationship between these two fields. We find that a common structure that exists in many logic locking techniques has desirable properties of hardware Trojans (HWT). We then construct a novel type of HWT, called Trojans based on Logi… ▽ More Logic locking and hardware Trojans are two fields in hardware security that have been mostly developed independently from each other. In this paper, we identify the relationship between these two fields. We find that a common structure that exists in many logic locking techniques has desirable properties of hardware Trojans (HWT). We then construct a novel type of HWT, called Trojans based on Logic Locking (TroLL), in a way that can evade state-of-the-art ATPG-based HWT detection techniques. In an effort to detect TroLL, we propose customization of existing state-of-the-art ATPG-based HWT detection approaches as well as adapting the SAT-based attacks on logic locking to HWT detection. In our experiments, we use random sampling as reference. It is shown that the customized ATPG-based approaches are the best performing but only offer limited improvement over random sampling. Moreover, their efficacy also diminishes as TroLL's triggers become longer, i.e., have more bits specified). We thereby highlight the need to find a scalable HWT detection approach for TroLL. △ Less

Submitted 26 September, 2023; originally announced September 2023.

Comments: 9 pages, double column, 8 figures, IEEE format

arXiv:2309.08002 [pdf, other]

doi 10.1109/TCAD.2024.3383961

HIVE: Scalable Hardware-Firmware Co-Verification using Scenario-based Decomposition and Automated Hint Extraction

Authors: Aruna Jayasena, Prabhat Mishra

Abstract: Hardware-firmware co-verification is critical to design trustworthy systems. While formal methods can provide verification guarantees, due to the complexity of firmware and hardware, it can lead to state space explosion. There are promising avenues to reduce the state space during firmware verification through manual abstraction of hardware or manual generation of hints. Manual development of abst… ▽ More Hardware-firmware co-verification is critical to design trustworthy systems. While formal methods can provide verification guarantees, due to the complexity of firmware and hardware, it can lead to state space explosion. There are promising avenues to reduce the state space during firmware verification through manual abstraction of hardware or manual generation of hints. Manual development of abstraction or hints requires domain expertise and can be time-consuming and error-prone, leading to incorrect proofs or inaccurate results. In this paper, we effectively combine the scalability of simulation-based validation and the completeness of formal verification. Our proposed approach is applicable to actual firmware and hardware implementations without requiring any manual intervention during formal model generation or hint extraction. To reduce the state space complexity, we utilize both static module-level analysis and dynamic execution of verification scenarios to automatically generate system-level hints. These hints guide the underlying solver to perform scalable equivalence checking using proofs. The extracted hints are validated against the implementation before using them in the proofs. Experimental evaluation on RISC-V based systems demonstrates that our proposed framework is scalable due to scenario-based decomposition and automated hint extraction. Moreover, our fully automated framework can identify complex bugs in actual firmware-hardware implementations. △ Less

Submitted 16 April, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

Comments: in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2024

arXiv:2307.14322 [pdf, other]

Modeling Inverse Demand Function with Explainable Dual Neural Networks

Authors: Zhiyu Cao, Zihan Chen, Prerna Mishra, Hamed Amini, Zachary Feinstein

Abstract: Financial contagion has been widely recognized as a fundamental risk to the financial system. Particularly potent is price-mediated contagion, wherein forced liquidations by firms depress asset prices and propagate financial stress, enabling crises to proliferate across a broad spectrum of seemingly unrelated entities. Price impacts are currently modeled via exogenous inverse demand functions. How… ▽ More Financial contagion has been widely recognized as a fundamental risk to the financial system. Particularly potent is price-mediated contagion, wherein forced liquidations by firms depress asset prices and propagate financial stress, enabling crises to proliferate across a broad spectrum of seemingly unrelated entities. Price impacts are currently modeled via exogenous inverse demand functions. However, in real-world scenarios, only the initial shocks and the final equilibrium asset prices are typically observable, leaving actual asset liquidations largely obscured. This missing data presents significant limitations to calibrating the existing models. To address these challenges, we introduce a novel dual neural network structure that operates in two sequential stages: the first neural network maps initial shocks to predicted asset liquidations, and the second network utilizes these liquidations to derive resultant equilibrium prices. This data-driven approach can capture both linear and non-linear forms without pre-specifying an analytical structure; furthermore, it functions effectively even in the absence of observable liquidation data. Experiments with simulated datasets demonstrate that our model can accurately predict equilibrium asset prices based solely on initial shocks, while revealing a strong alignment between predicted and true liquidations. Our explainable framework contributes to the understanding and modeling of price-mediated contagion and provides valuable insights for financial authorities to construct effective stress tests and regulatory policies. △ Less

Submitted 5 October, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

Comments: Accepted and selected for oral presentation at ICAIF 2023, NY, US

ACM Class: J.1; I.2.6

arXiv:2307.13165 [pdf, other]

doi 10.1007/978-3-031-56060-6_14

Investigating the Robustness of Sequential Recommender Systems Against Training Data Perturbations

Authors: Filippo Betello, Federico Siciliano, Pushkar Mishra, Fabrizio Silvestri

Abstract: Sequential Recommender Systems (SRSs) are widely employed to model user behavior over time. However, their robustness in the face of perturbations in training data remains a largely understudied yet critical issue. A fundamental challenge emerges in previous studies aimed at assessing the robustness of SRSs: the Rank-Biased Overlap (RBO) similarity is not particularly suited for this task as it is… ▽ More Sequential Recommender Systems (SRSs) are widely employed to model user behavior over time. However, their robustness in the face of perturbations in training data remains a largely understudied yet critical issue. A fundamental challenge emerges in previous studies aimed at assessing the robustness of SRSs: the Rank-Biased Overlap (RBO) similarity is not particularly suited for this task as it is designed for infinite rankings of items and thus shows limitations in real-world scenarios. For instance, it fails to achieve a perfect score of 1 for two identical finite-length rankings. To address this challenge, we introduce a novel contribution: Finite Rank-Biased Overlap (FRBO), an enhanced similarity tailored explicitly for finite rankings. This innovation facilitates a more intuitive evaluation in practical settings. In pursuit of our goal, we empirically investigate the impact of removing items at different positions within a temporally ordered sequence. We evaluate two distinct SRS models across multiple datasets, measuring their performance using metrics such as Normalized Discounted Cumulative Gain (NDCG) and Rank List Sensitivity. Our results demonstrate that removing items at the end of the sequence has a statistically significant impact on performance, with NDCG decreasing up to 60%. Conversely, removing items from the beginning or middle has no significant effect. These findings underscore the criticality of the position of perturbed items in the training data. As we spotlight the vulnerabilities inherent in current SRSs, we fervently advocate for intensified research efforts to fortify their robustness against adversarial perturbations. △ Less

Submitted 27 December, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

Journal ref: Advances in Information Retrieval. ECIR 2024. Lecture Notes in Computer Science, vol 14609

arXiv:2307.09288 [pdf, other]

Llama 2: Open Foundation and Fine-Tuned Chat Models

Authors: Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini , et al. (43 additional authors not shown)

Abstract: In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be… ▽ More In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be a suitable substitute for closed-source models. We provide a detailed description of our approach to fine-tuning and safety improvements of Llama 2-Chat in order to enable the community to build on our work and contribute to the responsible development of LLMs. △ Less

Submitted 19 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

arXiv:2307.05162 [pdf, other]

SuryaKiran at MEDIQA-Sum 2023: Leveraging LoRA for Clinical Dialogue Summarization

Authors: Kunal Suri, Prakhar Mishra, Saumajit Saha, Atul Singh

Abstract: Finetuning Large Language Models helps improve the results for domain-specific use cases. End-to-end finetuning of large language models is time and resource intensive and has high storage requirements to store the finetuned version of the large language model. Parameter Efficient Fine Tuning (PEFT) methods address the time and resource challenges by keeping the large language model as a fixed bas… ▽ More Finetuning Large Language Models helps improve the results for domain-specific use cases. End-to-end finetuning of large language models is time and resource intensive and has high storage requirements to store the finetuned version of the large language model. Parameter Efficient Fine Tuning (PEFT) methods address the time and resource challenges by keeping the large language model as a fixed base and add additional layers, which the PEFT methods finetune. This paper demonstrates the evaluation results for one such PEFT method Low Rank Adaptation (LoRA), for Clinical Dialogue Summarization. The evaluation results show that LoRA works at par with end-to-end finetuning for a large language model. The paper presents the evaluations done for solving both the Subtask A and B from ImageCLEFmedical {https://www.imageclef.org/2023/medical} △ Less

Submitted 11 July, 2023; originally announced July 2023.

arXiv:2305.04887 [pdf, other]

Hardware Acceleration of Explainable Artificial Intelligence

Authors: Zhixin Pan, Prabhat Mishra

Abstract: Machine learning (ML) is successful in achieving human-level artificial intelligence in various fields. However, it lacks the ability to explain an outcome due to its black-box nature. While recent efforts on explainable AI (XAI) has received significant attention, most of the existing solutions are not applicable in real-time systems since they map interpretability as an optimization problem, whi… ▽ More Machine learning (ML) is successful in achieving human-level artificial intelligence in various fields. However, it lacks the ability to explain an outcome due to its black-box nature. While recent efforts on explainable AI (XAI) has received significant attention, most of the existing solutions are not applicable in real-time systems since they map interpretability as an optimization problem, which leads to numerous iterations of time-consuming complex computations. Although there are existing hardware-based acceleration framework for XAI, they are implemented through FPGA and designed for specific tasks, leading to expensive cost and lack of flexibility. In this paper, we propose a simple yet efficient framework to accelerate various XAI algorithms with existing hardware accelerators. Specifically, this paper makes three important contributions. (1) The proposed method is the first attempt in exploring the effectiveness of Tensor Processing Unit (TPU) to accelerate XAI. (2) Our proposed solution explores the close relationship between several existing XAI algorithms with matrix computations, and exploits the synergy between convolution and Fourier transform, which takes full advantage of TPU's inherent ability in accelerating matrix computations. (3) Our proposed approach can lead to real-time outcome interpretation. Extensive experimental evaluation demonstrates that proposed approach deployed on TPU can provide drastic improvement in interpretation time (39x on average) as well as energy efficiency (69x on average) compared to existing acceleration techniques. △ Less

Submitted 4 May, 2023; originally announced May 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2103.11927

arXiv:2304.11315 [pdf, other]

Unmatched uncertainty mitigation through neural network supported model predictive control

Authors: Mateus V. Gasparino, Prabhat K. Mishra, Girish Chowdhary

Abstract: This paper presents a deep learning based model predictive control (MPC) algorithm for systems with unmatched and bounded state-action dependent uncertainties of unknown structure. We utilize a deep neural network (DNN) as an oracle in the underlying optimization problem of learning based MPC (LBMPC) to estimate unmatched uncertainties. Generally, non-parametric oracles such as DNN are considered… ▽ More This paper presents a deep learning based model predictive control (MPC) algorithm for systems with unmatched and bounded state-action dependent uncertainties of unknown structure. We utilize a deep neural network (DNN) as an oracle in the underlying optimization problem of learning based MPC (LBMPC) to estimate unmatched uncertainties. Generally, non-parametric oracles such as DNN are considered difficult to employ with LBMPC due to the technical difficulties associated with estimation of their coefficients in real time. We employ a dual-timescale adaptation mechanism, where the weights of the last layer of the neural network are updated in real time while the inner layers are trained on a slower timescale using the training data collected online and selectively stored in a buffer. Our results are validated through a numerical experiment on the compression system model of jet engine. These results indicate that the proposed approach is implementable in real time and carries the theoretical guarantees of LBMPC. △ Less

Submitted 22 April, 2023; originally announced April 2023.

arXiv:2303.06274 [pdf]

CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting

Authors: Simon Graham, Quoc Dang Vu, Mostafa Jahanifar, Martin Weigert, Uwe Schmidt, Wenhua Zhang, Jun Zhang, Sen Yang, Jinxi Xiang, Xiyue Wang, Josef Lorenz Rumberger, Elias Baumann, Peter Hirsch, Lihao Liu, Chenyang Hong, Angelica I. Aviles-Rivero, Ayushi Jain, Heeyoung Ahn, Yiyu Hong, Hussam Azzuni, Min Xu, Mohammad Yaqub, Marie-Claire Blache, Benoît Piégu, Bertrand Vernay , et al. (64 additional authors not shown)

Abstract: Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we setup a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of repro… ▽ More Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we setup a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of reproducible algorithms for cellular recognition with real-time result inspection on public leaderboards. We conducted an extensive post-challenge analysis based on the top-performing models using 1,658 whole-slide images of colon tissue. With around 700 million detected nuclei per model, associated features were used for dysplasia grading and survival analysis, where we demonstrated that the challenge's improvement over the previous state-of-the-art led to significant boosts in downstream performance. Our findings also suggest that eosinophils and neutrophils play an important role in the tumour microevironment. We release challenge models and WSI-level results to foster the development of further methods for biomarker discovery. △ Less

Submitted 14 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

arXiv:2303.01550 [pdf, other]

Modeling and Exploration of Gain Competition Attacks in Optical Network-on-Chip Architectures

Authors: Khushboo Rani, Hansika Weerasena, Stephen A. Butler, Subodha Charles, Prabhat Mishra

Abstract: Network-on-Chip (NoC) enables energy-efficient communication between numerous components in System-on-Chip architectures. The optical NoC is widely considered a key technology to overcome the bandwidth and energy limitations of traditional electrical on-chip interconnects. While optical NoC can offer high performance, they come with inherent security vulnerabilities due to the nature of optical in… ▽ More Network-on-Chip (NoC) enables energy-efficient communication between numerous components in System-on-Chip architectures. The optical NoC is widely considered a key technology to overcome the bandwidth and energy limitations of traditional electrical on-chip interconnects. While optical NoC can offer high performance, they come with inherent security vulnerabilities due to the nature of optical interconnects. In this paper, we investigate the gain competition attack in optical NoCs, which can be initiated by an attacker injecting a high-power signal to the optical waveguide, robbing the legitimate signals of amplification. To the best of our knowledge, our proposed approach is the first attempt to investigate gain competition attacks as a security threat in optical NoCs. We model the attack and analyze its effects on optical NoC performance. We also propose potential attack detection techniques and countermeasures to mitigate the attack. Our experimental evaluation using different NoC topologies and diverse traffic patterns demonstrates the effectiveness of our modeling and exploration of gain competition attacks in optical NoC architectures. △ Less

Submitted 2 March, 2023; originally announced March 2023.

Comments: 6 pages, 13 figures

arXiv:2303.01487 [pdf, other]

quAssert: Automatic Generation of Quantum Assertions

Authors: Hasini Witharana, Daniel Volya, Prabhat Mishra

Abstract: Functional validation is necessary to detect any errors during quantum computation. There are promising avenues to debug quantum circuits using runtime assertions. However, the existing approaches rely on the expertise of the verification engineers to manually design and insert the assertions in suitable locations. In this paper, we propose automated generation and placement of quantum assertions… ▽ More Functional validation is necessary to detect any errors during quantum computation. There are promising avenues to debug quantum circuits using runtime assertions. However, the existing approaches rely on the expertise of the verification engineers to manually design and insert the assertions in suitable locations. In this paper, we propose automated generation and placement of quantum assertions based on static analysis and random sampling of quantum circuits. Specifically, this paper makes two important contributions. We automatically uncover special properties of a quantum circuit, such as purely classical states, superposition states, and entangled states using statistical methods. We also perform automated placement of quantum assertions to maximize the functional coverage as well as minimize the hardware overhead. We demonstrate the effectiveness of the generated assertions in error detection using a suite of quantum benchmarks, including Shor's factoring algorithm and Grover's search algorithm. △ Less

Submitted 2 March, 2023; originally announced March 2023.

arXiv:2302.13558 [pdf, other]

Deep Model Predictive Control

Authors: Prabhat K. Mishra, Mateus V. Gasparino, Andres E. B. Velasquez, Girish Chowdhary

Abstract: This paper presents a deep learning based model predictive control algorithm for control affine nonlinear discrete time systems with matched and bounded state-dependent uncertainties of unknown structure. Since the structure of uncertainties is not known, a deep neural network (DNN) is employed to approximate the disturbances. In order to avoid any unwanted behavior during the learning phase, a tu… ▽ More This paper presents a deep learning based model predictive control algorithm for control affine nonlinear discrete time systems with matched and bounded state-dependent uncertainties of unknown structure. Since the structure of uncertainties is not known, a deep neural network (DNN) is employed to approximate the disturbances. In order to avoid any unwanted behavior during the learning phase, a tube based model predictive controller is employed, which ensures satisfaction of constraints and input-to-state stability of the closed-loop states. △ Less

Submitted 27 February, 2023; originally announced February 2023.

Comments: arXiv admin note: text overlap with arXiv:2104.07171

Journal ref: Conference on Robot Learning (CoRL'22): Workshop on Learning to Adapt and Improve in the Real World, 2022

arXiv:2302.12241 [pdf, other]

doi 10.1145/3655621

Sequence-Based Incremental Concolic Testing of RTL Models

Authors: Hasini Witharana, Aruna Jayasena, Prabhat Mishra

Abstract: Concolic testing is a scalable solution for automated generation of directed tests for validation of hardware designs. Unfortunately, concolic testing also fails to cover complex corner cases such as hard-to-activate branches. In this paper, we propose an incremental concolic testing technique to cover hard-to-activate branches in register-transfer level models. We show that a complex branch condi… ▽ More Concolic testing is a scalable solution for automated generation of directed tests for validation of hardware designs. Unfortunately, concolic testing also fails to cover complex corner cases such as hard-to-activate branches. In this paper, we propose an incremental concolic testing technique to cover hard-to-activate branches in register-transfer level models. We show that a complex branch condition can be viewed as a sequence of easy-to-activate events. We map the branch coverage problem to the coverage of a sequence of events. We propose an efficient algorithm to cover the sequence of events using concolic testing. Specifically, the test generated to activate the current event is used as the starting point to activate the next event in the sequence. Experimental results demonstrate that our approach can be used to generate directed tests to cover complex corner cases while state-of-the-art methods fail to activate them. △ Less

Submitted 23 February, 2023; originally announced February 2023.

ACM Class: B.5.3

Journal ref: ACM Transactions on Design Automation of Electronic Systems, 2024

arXiv:2302.08984 [pdf, ps, other]

doi 10.1109/VLSID60093.2024.00079

Design for Trust utilizing Rareness Reduction

Authors: Aruna Jayasena, Prabhat Mishra

Abstract: Increasing design complexity and reduced time-to-market have motivated manufacturers to outsource some parts of the System-on-Chip (SoC) design flow to third-party vendors. This provides an opportunity for attackers to introduce hardware Trojans by constructing stealthy triggers consisting of rare events (e.g., rare signals, states, and transitions). There are promising test generation-based hardw… ▽ More Increasing design complexity and reduced time-to-market have motivated manufacturers to outsource some parts of the System-on-Chip (SoC) design flow to third-party vendors. This provides an opportunity for attackers to introduce hardware Trojans by constructing stealthy triggers consisting of rare events (e.g., rare signals, states, and transitions). There are promising test generation-based hardware Trojan detection techniques that rely on the activation of rare events. In this paper, we investigate rareness reduction as a design-for-trust solution to make it harder for an adversary to hide Trojans (easier for Trojan detection). Specifically, we analyze different avenues to reduce the potential rare trigger cases, including design diversity and area optimization. While there is a good understanding of the relationship between area, power, energy, and performance, this research provides a better insight into the dependency between area and security. Our experimental evaluation demonstrates that area reduction leads to a reduction in rareness. It also reveals that reducing rareness leads to faster Trojan detection as well as improved coverage by Trojan detection methods. △ Less

Submitted 16 April, 2024; v1 submitted 17 February, 2023; originally announced February 2023.

Comments: 37th International Conference on VLSI Design, 2024

arXiv:2302.06118 [pdf, other]

Lightweight Encryption and Anonymous Routing in NoC based SoCs

Authors: Subodha Charles, Prabhat Mishra

Abstract: Advances in manufacturing technologies have enabled System-on-Chip (SoC) designers to integrate an increasing number of cores on a single SoC. Increasing SoC complexity coupled with tight time-to-market deadlines has led to increased utilization of Intellectual Property (IP) cores from third-party vendors. SoC supply chain is widely acknowledged as a major source of security vulnerabilities. Poten… ▽ More Advances in manufacturing technologies have enabled System-on-Chip (SoC) designers to integrate an increasing number of cores on a single SoC. Increasing SoC complexity coupled with tight time-to-market deadlines has led to increased utilization of Intellectual Property (IP) cores from third-party vendors. SoC supply chain is widely acknowledged as a major source of security vulnerabilities. Potentially malicious third-party IPs integrated on the same Network-on-Chip (NoC) with the trusted components can lead to security and trust concerns. While secure communication is a well-studied problem in the computer networks domain, it is not feasible to implement those solutions on resource-constrained SoCs. In this paper, we present a lightweight encryption and anonymous routing protocol for communication between IP cores in NoC based SoCs. Our method eliminates the major overhead associated with traditional encryption and anonymous routing protocols using a novel secret sharing mechanism while ensuring that the desired security goals are met. Experimental results demonstrate that existing security solutions on NoC can introduce significant (1.5X) performance degradation, whereas our approach provides the same security features with minor (4%) impact on performance. △ Less

Submitted 13 February, 2023; originally announced February 2023.

arXiv:2301.09738 [pdf, other]

Security of Electrical, Optical and Wireless On-Chip Interconnects: A Survey

Authors: Hansika Weerasena, Prabhat Mishra

Abstract: The advancement of manufacturing technologies has enabled the integration of more intellectual property (IP) cores on the same system-on-chip (SoC). Scalable and high throughput on-chip communication architecture has become a vital component in today's SoCs. Diverse technologies such as electrical, wireless, optical, and hybrid are available for on-chip communication with different architectures s… ▽ More The advancement of manufacturing technologies has enabled the integration of more intellectual property (IP) cores on the same system-on-chip (SoC). Scalable and high throughput on-chip communication architecture has become a vital component in today's SoCs. Diverse technologies such as electrical, wireless, optical, and hybrid are available for on-chip communication with different architectures supporting them. Security of the on-chip communication is crucial because exploiting any vulnerability would be a goldmine for an attacker. In this survey, we provide a comprehensive review of threat models, attacks, and countermeasures over diverse on-chip communication technologies as well as sophisticated architectures. △ Less

Submitted 28 September, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

Comments: 41 pages, 24 figures, 4 tables

arXiv:2301.03980 [pdf]

Language Models sounds the Death Knell of Knowledge Graphs

Authors: Kunal Suri, Atul Singh, Prakhar Mishra, Swapna Sourav Rout, Rajesh Sabapathy

Abstract: Healthcare domain generates a lot of unstructured and semi-structured text. Natural Language processing (NLP) has been used extensively to process this data. Deep Learning based NLP especially Large Language Models (LLMs) such as BERT have found broad acceptance and are used extensively for many applications. A Language Model is a probability distribution over a word sequence. Self-supervised Lear… ▽ More Healthcare domain generates a lot of unstructured and semi-structured text. Natural Language processing (NLP) has been used extensively to process this data. Deep Learning based NLP especially Large Language Models (LLMs) such as BERT have found broad acceptance and are used extensively for many applications. A Language Model is a probability distribution over a word sequence. Self-supervised Learning on a large corpus of data automatically generates deep learning-based language models. BioBERT and Med-BERT are language models pre-trained for the healthcare domain. Healthcare uses typical NLP tasks such as question answering, information extraction, named entity recognition, and search to simplify and improve processes. However, to ensure robust application of the results, NLP practitioners need to normalize and standardize them. One of the main ways of achieving normalization and standardization is the use of Knowledge Graphs. A Knowledge Graph captures concepts and their relationships for a specific domain, but their creation is time-consuming and requires manual intervention from domain experts, which can prove expensive. SNOMED CT (Systematized Nomenclature of Medicine -- Clinical Terms), Unified Medical Language System (UMLS), and Gene Ontology (GO) are popular ontologies from the healthcare domain. SNOMED CT and UMLS capture concepts such as disease, symptoms and diagnosis and GO is the world's largest source of information on the functions of genes. Healthcare has been dealing with an explosion in information about different types of drugs, diseases, and procedures. This paper argues that using Knowledge Graphs is not the best solution for solving problems in this domain. We present experiments using LLMs for the healthcare domain to demonstrate that language models provide the same functionality as knowledge graphs, thereby making knowledge graphs redundant. △ Less

Submitted 10 January, 2023; originally announced January 2023.

Comments: 5 pages

arXiv:2301.03957 [pdf, ps, other]

AI based approach to Trailer Generation for Online Educational Courses

Authors: Prakhar Mishra, Chaitali Diwan, Srinath Srinivasa, G. Srinivasaraghavan

Abstract: In this paper, we propose an AI based approach to Trailer Generation in the form of short videos for online educational courses. Trailers give an overview of the course to the learners and help them make an informed choice about the courses they want to learn. It also helps to generate curiosity and interest among the learners and encourages them to pursue a course. While it is possible to manuall… ▽ More In this paper, we propose an AI based approach to Trailer Generation in the form of short videos for online educational courses. Trailers give an overview of the course to the learners and help them make an informed choice about the courses they want to learn. It also helps to generate curiosity and interest among the learners and encourages them to pursue a course. While it is possible to manually generate the trailers, it requires extensive human efforts and skills over a broad spectrum of design, span selection, video editing, domain knowledge, etc., thus making it time-consuming and expensive, especially in an academic setting. The framework we propose in this work is a template based method for video trailer generation, where most of the textual content of the trailer is auto-generated and the trailer video is automatically generated, by leveraging Machine Learning and Natural Language Processing techniques. The proposed trailer is in the form of a timeline consisting of various fragments created by selecting, para-phrasing or generating content using various proposed techniques. The fragments are further enhanced by adding voice-over text, subtitles, animations, etc., to create a holistic experience. Finally, we perform user evaluation with 63 human evaluators for evaluating the trailers generated by our system and the results obtained were encouraging. △ Less

Submitted 10 January, 2023; originally announced January 2023.

Comments: 8 pages, 9 figures

arXiv:2301.00114 [pdf, other]

Skeletal Video Anomaly Detection using Deep Learning: Survey, Challenges and Future Directions

Authors: Pratik K. Mishra, Alex Mihailidis, Shehroz S. Khan

Abstract: The existing methods for video anomaly detection mostly utilize videos containing identifiable facial and appearance-based features. The use of videos with identifiable faces raises privacy concerns, especially when used in a hospital or community-based setting. Appearance-based features can also be sensitive to pixel-based noise, straining the anomaly detection methods to model the changes in the… ▽ More The existing methods for video anomaly detection mostly utilize videos containing identifiable facial and appearance-based features. The use of videos with identifiable faces raises privacy concerns, especially when used in a hospital or community-based setting. Appearance-based features can also be sensitive to pixel-based noise, straining the anomaly detection methods to model the changes in the background and making it difficult to focus on the actions of humans in the foreground. Structural information in the form of skeletons describing the human motion in the videos is privacy-protecting and can overcome some of the problems posed by appearance-based features. In this paper, we present a survey of privacy-protecting deep learning anomaly detection methods using skeletons extracted from videos. We present a novel taxonomy of algorithms based on the various learning approaches. We conclude that skeleton-based approaches for anomaly detection can be a plausible privacy-protecting alternative for video anomaly detection. Lastly, we identify major open research questions and provide guidelines to address them. △ Less

Submitted 17 January, 2024; v1 submitted 30 December, 2022; originally announced January 2023.

Comments: This work has been accepted by IEEE Transactions on Emerging Topics in Computational Intelligence

arXiv:2212.10682 [pdf, other]

Privacy-Protecting Behaviours of Risk Detection in People with Dementia using Videos

Authors: Pratik K. Mishra, Andrea Iaboni, Bing Ye, Kristine Newman, Alex Mihailidis, Shehroz S. Khan

Abstract: People living with dementia often exhibit behavioural and psychological symptoms of dementia that can put their and others' safety at risk. Existing video surveillance systems in long-term care facilities can be used to monitor such behaviours of risk to alert the staff to prevent potential injuries or death in some cases. However, these behaviours of risk events are heterogeneous and infrequent i… ▽ More People living with dementia often exhibit behavioural and psychological symptoms of dementia that can put their and others' safety at risk. Existing video surveillance systems in long-term care facilities can be used to monitor such behaviours of risk to alert the staff to prevent potential injuries or death in some cases. However, these behaviours of risk events are heterogeneous and infrequent in comparison to normal events. Moreover, analyzing raw videos can also raise privacy concerns. In this paper, we present two novel privacy-protecting video-based anomaly detection approaches to detect behaviours of risks in people with dementia. We either extracted body pose information as skeletons or used semantic segmentation masks to replace multiple humans in the scene with their semantic boundaries. Our work differs from most existing approaches for video anomaly detection that focus on appearance-based features, which can put the privacy of a person at risk and is also susceptible to pixel-based noise, including illumination and viewing direction. We used anonymized videos of normal activities to train customized spatio-temporal convolutional autoencoders and identify behaviours of risk as anomalies. We showed our results on a real-world study conducted in a dementia care unit with patients with dementia, containing approximately 21 hours of normal activities data for training and 9 hours of data containing normal and behaviours of risk events for testing. We compared our approaches with the original RGB videos and obtained a similar area under the receiver operating characteristic curve performance of 0.807 for the skeleton-based approach and 0.823 for the segmentation mask-based approach. △ Less

Submitted 17 January, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

arXiv:2211.15268 [pdf, other]

Scientific and Creative Analogies in Pretrained Language Models

Authors: Tamara Czinczoll, Helen Yannakoudakis, Pushkar Mishra, Ekaterina Shutova

Abstract: This paper examines the encoding of analogy in large-scale pretrained language models, such as BERT and GPT-2. Existing analogy datasets typically focus on a limited set of analogical relations, with a high similarity of the two domains between which the analogy holds. As a more realistic setup, we introduce the Scientific and Creative Analogy dataset (SCAN), a novel analogy dataset containing sys… ▽ More This paper examines the encoding of analogy in large-scale pretrained language models, such as BERT and GPT-2. Existing analogy datasets typically focus on a limited set of analogical relations, with a high similarity of the two domains between which the analogy holds. As a more realistic setup, we introduce the Scientific and Creative Analogy dataset (SCAN), a novel analogy dataset containing systematic mappings of multiple attributes and relational structures across dissimilar domains. Using this dataset, we test the analogical reasoning capabilities of several widely-used pretrained language models (LMs). We find that state-of-the-art LMs achieve low performance on these complex analogy tasks, highlighting the challenges still posed by analogy understanding. △ Less

Submitted 28 November, 2022; originally announced November 2022.

Comments: To be published in Findings of EMNLP 2022

arXiv:2211.01338 [pdf, other]

Technology Pipeline for Large Scale Cross-Lingual Dubbing of Lecture Videos into Multiple Indian Languages

Authors: Anusha Prakash, Arun Kumar, Ashish Seth, Bhagyashree Mukherjee, Ishika Gupta, Jom Kuriakose, Jordan Fernandes, K V Vikram, Mano Ranjith Kumar M, Metilda Sagaya Mary, Mohammad Wajahat, Mohana N, Mudit Batra, Navina K, Nihal John George, Nithya Ravi, Pruthwik Mishra, Sudhanshu Srivastava, Vasista Sai Lodagala, Vandan Mujadia, Kada Sai Venkata Vineeth, Vrunda Sukhadia, Dipti Sharma, Hema Murthy, Pushpak Bhattacharya , et al. (2 additional authors not shown)

Abstract: Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, text-to-speech synthesis followed by isochronous lipsyncing to the original video. This task becomes challenging when the source and target languages… ▽ More Cross-lingual dubbing of lecture videos requires the transcription of the original audio, correction and removal of disfluencies, domain term discovery, text-to-text translation into the target language, chunking of text using target language rhythm, text-to-speech synthesis followed by isochronous lipsyncing to the original video. This task becomes challenging when the source and target languages belong to different language families, resulting in differences in generated audio duration. This is further compounded by the original speaker's rhythm, especially for extempore speech. This paper describes the challenges in regenerating English lecture videos in Indian languages semi-automatically. A prototype is developed for dubbing lectures into 9 Indian languages. A mean-opinion-score (MOS) is obtained for two languages, Hindi and Tamil, on two different courses. The output video is compared with the original video in terms of MOS (1-5) and lip synchronisation with scores of 4.09 and 3.74, respectively. The human effort also reduces by 75%. △ Less

Submitted 1 November, 2022; originally announced November 2022.

arXiv:2210.15540 [pdf, other]

doi 10.1142/S0129065722500307

Masked Transformer for image Anomaly Localization

Authors: Axel De Nardin, Pankaj Mishra, Gian Luca Foresti, Claudio Piciarelli

Abstract: Image anomaly detection consists in detecting images or image portions that are visually different from the majority of the samples in a dataset. The task is of practical importance for various real-life applications like biomedical image analysis, visual inspection in industrial production, banking, traffic management, etc. Most of the current deep learning approaches rely on image reconstruction… ▽ More Image anomaly detection consists in detecting images or image portions that are visually different from the majority of the samples in a dataset. The task is of practical importance for various real-life applications like biomedical image analysis, visual inspection in industrial production, banking, traffic management, etc. Most of the current deep learning approaches rely on image reconstruction: the input image is projected in some latent space and then reconstructed, assuming that the network (mostly trained on normal data) will not be able to reconstruct the anomalous portions. However, this assumption does not always hold. We thus propose a new model based on the Vision Transformer architecture with patch masking: the input image is split in several patches, and each patch is reconstructed only from the surrounding data, thus ignoring the potentially anomalous information contained in the patch itself. We then show that multi-resolution patches and their collective embeddings provide a large improvement in the model's performance compared to the exclusive use of the traditional square patches. The proposed model has been tested on popular anomaly detection datasets such as MVTec and head CT and achieved good results when compared to other state-of-the-art approaches. △ Less

Submitted 27 October, 2022; originally announced October 2022.

Journal ref: Int J Neural Syst. 2022;32(7):2250030

arXiv:2207.09980 [pdf, other]

ReFactor GNNs: Revisiting Factorisation-based Models from a Message-Passing Perspective

Authors: Yihong Chen, Pushkar Mishra, Luca Franceschi, Pasquale Minervini, Pontus Stenetorp, Sebastian Riedel

Abstract: Factorisation-based Models (FMs), such as DistMult, have enjoyed enduring success for Knowledge Graph Completion (KGC) tasks, often outperforming Graph Neural Networks (GNNs). However, unlike GNNs, FMs struggle to incorporate node features and generalise to unseen nodes in inductive settings. Our work bridges the gap between FMs and GNNs by proposing ReFactor GNNs. This new architecture draws upon… ▽ More Factorisation-based Models (FMs), such as DistMult, have enjoyed enduring success for Knowledge Graph Completion (KGC) tasks, often outperforming Graph Neural Networks (GNNs). However, unlike GNNs, FMs struggle to incorporate node features and generalise to unseen nodes in inductive settings. Our work bridges the gap between FMs and GNNs by proposing ReFactor GNNs. This new architecture draws upon both modelling paradigms, which previously were largely thought of as disjoint. Concretely, using a message-passing formalism, we show how FMs can be cast as GNNs by reformulating the gradient descent procedure as message-passing operations, which forms the basis of our ReFactor GNNs. Across a multitude of well-established KGC benchmarks, our ReFactor GNNs achieve comparable transductive performance to FMs, and state-of-the-art inductive performance while using an order of magnitude fewer parameters. △ Less

Submitted 27 October, 2022; v1 submitted 20 July, 2022; originally announced July 2022.

Comments: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

MSC Class: 68T05; 68T07; 68T50 ACM Class: I.2.7; I.2.6

Showing 1–50 of 117 results for author: Mishra, P