Search | arXiv e-print repository

Sampling-Based Attack for Centrality Disruption in Complex Networks

Authors: Fariba Afrin Irany, Soumya Sarakar, Animesh Mukherjee, Sanjukta Bhowmick

Abstract: Many mobile networks are represented as graphs to obtain insight into their connectivity and transmission properties. Among these properties, centrality resilience, that is, how well centralities such as closeness and betweenness are maintained under attacks, is a critical factor for the proper functioning of a network. In this paper, we study the centrality resilience of complex networks by developing attack models to disrupt the rank of the top path-based centrality vertices. To develop our attack models, we extend the concept of rich clubs of influential vertices to the more general framework of scattered rich clubs. We define scattered rich clubs as dense subgraphs of high centrality vertices that are spread (scattered) across the network. Finding scattered rich clubs, although of polynomial time complexity, is extremely expensive computationally. We use snowball sampling to identify these important substructures as well as to identify which edges to target in our proposed attack models. Our results over a set of real-world networks demonstrate that our proposed algorithm is effective in finding the single or scattered rich clubs efficiently and in successfully disrupting the centrality rankings of the network. To summarize, we propose sampling-based attack models for testing the resilience of networks with respect to centrality rankings. As part of this process, we introduce scattered rich clubs, a generalized form of the rich club model, along with efficient algorithms to detect them, and demonstrate their relation to network resilience.

Submitted 9 July, 2024; originally announced July 2024.

Comments: 10 pages, 8 figures, 3 tables
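
The abstract above names snowball sampling but does not spell out the procedure. The following is a minimal sketch of generic k-neighbor snowball sampling on an adjacency-list graph, not the authors' algorithm; the function name `snowball_sample` and the parameters `waves` and `k` are hypothetical names chosen for illustration.

```python
import random


def snowball_sample(adj, seed, waves, k, rng=None):
    """Return the vertex set reached by generic snowball sampling.

    adj   : dict mapping each vertex to a list of its neighbors
    seed  : starting vertex
    waves : number of expansion rounds (hypothetical parameter name)
    k     : max number of new neighbors recruited per vertex per round
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch reproducible
    sampled = {seed}
    frontier = [seed]
    for _ in range(waves):
        next_frontier = []
        for v in frontier:
            # Only neighbors that have not been sampled yet can be recruited.
            unseen = [u for u in adj[v] if u not in sampled]
            rng.shuffle(unseen)
            for u in unseen[:k]:
                sampled.add(u)
                next_frontier.append(u)
        frontier = next_frontier
    return sampled


# Usage on a small path graph 0-1-2-3-4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(sorted(snowball_sample(path, seed=0, waves=2, k=1)))
```

The sample grows outward from the seed one wave at a time, so the cost depends on `waves` and `k` rather than on the size of the whole network, which is the usual motivation for sampling when an exact search is expensive.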

arXiv:2407.06137 [pdf, ps, other]

OMuSense-23: A Multimodal Dataset for Contactless Breathing Pattern Recognition and Biometric Analysis

Authors: Manuel Lage Cañellas, Le Nguyen, Anirban Mukherjee, Constantino Álvarez Casado, Xiaoting Wu, Praneeth Susarla, Sasan Sharifipour, Dinesh B. Jayagopi, Miguel Bordallo López

Abstract: In the domain of non-contact biometrics and human activity recognition, the lack of a versatile, multimodal dataset poses a significant bottleneck. To address this, we introduce the Oulu Multi Sensing (OMuSense-23) dataset that includes biosignals obtained from a mmWave radar and an RGB-D camera. The dataset features data from 50 individuals in three distinct poses -- standing, sitting, and lying down -- each featuring four specific breathing pattern activities: regular breathing, reading, guided breathing, and apnea, encompassing both typical situations (e.g., sitting with normal breathing) and critical conditions (e.g., lying down without breathing). In our work, we present a detailed overview of the OMuSense-23 dataset, detailing the data acquisition protocol and describing the process for each participant. In addition, we provide a baseline evaluation of several data analysis tasks related to biometrics, breathing pattern recognition and pose identification. Our results achieve a pose identification accuracy of 87% and breathing pattern activity recognition of 83% using features extracted from biosignals. The OMuSense-23 dataset is publicly available as a resource for other researchers and practitioners in the field.

Submitted 22 May, 2024; originally announced July 2024.

arXiv:2407.02067 [pdf, other]

Crossroads of Continents: Automated Artifact Extraction for Cultural Adaptation with Large Multimodal Models

Authors: Anjishnu Mukherjee, Ziwei Zhu, Antonios Anastasopoulos

Abstract: In this work, we present a comprehensive three-phase study to examine (1) the effectiveness of large multimodal models (LMMs) in recognizing cultural contexts; (2) the accuracy of their representations of diverse cultures; and (3) their ability to adapt content across cultural boundaries. We first introduce Dalle Street, a large-scale dataset generated by DALL-E 3 and validated by humans, containing 9,935 images of 67 countries and 10 concept classes. We reveal disparities in cultural understanding at the sub-region level with both open-weight (LLaVA) and closed-source (GPT-4V) models on Dalle Street and other existing benchmarks. Next, we assess models' deeper culture understanding by an artifact extraction task and identify over 18,000 artifacts associated with different countries. Finally, we propose a highly composable pipeline, CultureAdapt, to adapt images from culture to culture. Our findings reveal a nuanced picture of the cultural competence of LMMs, highlighting the need to develop culture-aware systems. Dataset and code are available at https://github.com/iamshnoo/crossroads

Submitted 2 July, 2024; originally announced July 2024.

Comments: under review

arXiv:2407.02066 [pdf, other]

BiasDora: Exploring Hidden Biased Associations in Vision-Language Models

Authors: Chahat Raj, Anjishnu Mukherjee, Aylin Caliskan, Antonios Anastasopoulos, Ziwei Zhu

Abstract: Existing works examining Vision Language Models (VLMs) for social biases predominantly focus on a limited set of documented bias associations, such as gender:profession or race:crime. This narrow scope often overlooks a vast range of unexamined implicit associations, restricting the identification and, hence, mitigation of such biases. We address this gap by probing VLMs to (1) uncover hidden, implicit associations across 9 bias dimensions. We systematically explore diverse input and output modalities and (2) demonstrate how biased associations vary in their negativity, toxicity, and extremity. Our work (3) identifies subtle and extreme biases that are typically not recognized by existing methodologies. We make the Dataset of retrieved associations (Dora) publicly available at https://github.com/chahatraj/BiasDora.

Submitted 2 July, 2024; originally announced July 2024.

Comments: Under Review

arXiv:2407.02030 [pdf, other]

Breaking Bias, Building Bridges: Evaluation and Mitigation of Social Biases in LLMs via Contact Hypothesis

Authors: Chahat Raj, Anjishnu Mukherjee, Aylin Caliskan, Antonios Anastasopoulos, Ziwei Zhu

Abstract: Large Language Models (LLMs) perpetuate social biases, reflecting prejudices in their training data and reinforcing societal stereotypes and inequalities. Our work explores the potential of the Contact Hypothesis, a concept from social psychology, for debiasing LLMs. We simulate various forms of social contact through LLM prompting to measure their influence on the model's biases, mirroring how intergroup interactions can reduce prejudices in social contexts. We create a dataset of 108,000 prompts following a principled approach replicating social contact to measure biases in three LLMs (LLaMA 2, Tulu, and NousHermes) across 13 social bias dimensions. We propose a unique debiasing technique, Social Contact Debiasing (SCD), that instruction-tunes these models with unbiased responses to prompts. Our research demonstrates that LLM responses exhibit social biases when subjected to contact probing, but more importantly, these biases can be significantly reduced by up to 40% in 1 epoch of instruction tuning LLaMA 2 following our SCD strategy. Our code and data are available at https://github.com/chahatraj/breakingbias.

Submitted 2 July, 2024; originally announced July 2024.

Comments: Under Review

arXiv:2407.01732 [pdf, other]

Investigating Nudges toward Related Sellers on E-commerce Marketplaces: A Case Study on Amazon

Authors: Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Krishna P. Gummadi

Abstract: E-commerce marketplaces provide business opportunities to millions of sellers worldwide. Some of these sellers have special relationships with the marketplace by virtue of using its subsidiary services (e.g., fulfillment and/or shipping services provided by the marketplace) -- we refer to such sellers collectively as Related Sellers. When multiple sellers offer to sell the same product, the marketplace helps a customer in selecting an offer (by a seller) through (a) a default offer selection algorithm, (b) showing features about each of the offers and the corresponding sellers (price, seller performance metrics, seller's number of ratings etc.), and (c) finally evaluating the sellers along these features. In this paper, we perform an end-to-end investigation into how the above apparatus can nudge customers toward the Related Sellers on Amazon's four different marketplaces in India, USA, Germany and France. We find that given explicit choices, customers' preferred offers and algorithmically selected offers can be significantly different. We highlight that Amazon is adopting different performance metric evaluation policies for different sellers, potentially benefiting Related Sellers. For instance, such policies result in a notable discrepancy between the actual performance metric and the presented performance metric of Related Sellers. We further observe that among the seller-centric features visible to customers, sellers' number of ratings influences their decisions the most, yet it may not reflect the true quality of service by the seller, rather reflecting the scale at which the seller operates, thereby implicitly steering customers toward larger Related Sellers. Moreover, when customers are shown the rectified metrics for the different sellers, their preference toward Related Sellers is almost halved.

Submitted 1 July, 2024; originally announced July 2024.

Comments: This work has been accepted for presentation at the ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW) 2024. It will appear in Proceedings of the ACM on Human-Computer Interaction

arXiv:2407.00229 [pdf, other]

SemUV: Deep Learning based semantic manipulation over UV texture map of virtual human heads

Authors: Anirban Mukherjee, Venkat Suprabath Bitra, Vignesh Bondugula, Tarun Reddy Tallapureddy, Dinesh Babu Jayagopi

Abstract: Designing and manipulating virtual human heads is essential across various applications, including AR, VR, gaming, human-computer interaction and VFX. Traditional graphic-based approaches require manual effort and resources to achieve accurate representation of human heads. While modern deep learning techniques can generate and edit highly photorealistic images of faces, their focus remains predominantly on 2D facial images. This limitation makes them less suitable for 3D applications. Recognizing the vital role of editing within the UV texture space as a key component in the 3D graphics pipeline, our work focuses on this aspect to benefit graphic designers by providing enhanced control and precision in appearance manipulation. Research on existing methods within the UV texture space is limited, complex, and poses challenges. In this paper, we introduce SemUV: a simple and effective approach using the FFHQ-UV dataset for semantic manipulation directly within the UV texture space. We train a StyleGAN model on the publicly available FFHQ-UV dataset, and subsequently train a boundary for interpolation and semantic feature manipulation. Through experiments comparing our method with a 2D manipulation technique, we demonstrate its superior ability to preserve identity while effectively modifying semantic features such as age, gender, and facial hair. Our approach is simple, agnostic to other 3D components such as structure, lighting, and rendering, and also enables seamless integration into standard 3D graphics pipelines without demanding extensive domain expertise, time, or resources.

Submitted 28 June, 2024; originally announced July 2024.

Comments: CVIP 2024 Preprint

arXiv:2406.19543 [pdf, other]

Demarked: A Strategy for Enhanced Abusive Speech Moderation through Counterspeech, Detoxification, and Message Management

Authors: Seid Muhie Yimam, Daryna Dementieva, Tim Fischer, Daniil Moskovskiy, Naquee Rizwan, Punyajoy Saha, Sarthak Roy, Martin Semmann, Alexander Panchenko, Chris Biemann, Animesh Mukherjee

Abstract: Despite regulations imposed by nations and social media platforms, such as recent EU regulations targeting digital violence, abusive content persists as a significant challenge. Existing approaches primarily rely on binary solutions, such as outright blocking or banning, yet fail to address the complex nature of abusive speech. In this work, we propose a more comprehensive approach called Demarcation, which scores abusive speech based on four aspects -- (i) severity scale; (ii) presence of a target; (iii) context scale; (iv) legal scale -- and suggests more options for action, such as detoxification, counter speech generation, blocking, or, as a final measure, human intervention. Through a thorough analysis of abusive speech regulations across diverse jurisdictions, platforms, and research papers, we highlight the gap in preventive measures and advocate for tailored proactive steps to combat its multifaceted manifestations. Our work aims to inform future strategies for effectively addressing abusive speech online.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.14012 [pdf, other]

Seeing Through AI's Lens: Enhancing Human Skepticism Towards LLM-Generated Fake News

Authors: Navid Ayoobi, Sadat Shahriar, Arjun Mukherjee

Abstract: LLMs offer valuable capabilities, yet they can be utilized by malicious users to disseminate deceptive information and generate fake news. The growing prevalence of LLMs poses difficulties in crafting detection approaches that remain effective across various text domains. Additionally, the absence of precautionary measures for AI-generated news on online social platforms is concerning. Therefore, there is an urgent need to improve people's ability to differentiate between news articles written by humans and those produced by LLMs. By providing cues in human-written and LLM-generated news, we can help individuals increase their skepticism towards fake LLM-generated news. This paper aims to elucidate simple markers that help individuals distinguish between articles penned by humans and those created by LLMs. To achieve this, we initially collected a dataset comprising 39k news articles authored by humans or generated by four distinct LLMs with varying degrees of fakeness. We then devise a metric named Entropy-Shift Authorship Signature (ESAS) based on information theory and entropy principles. The proposed ESAS ranks terms or entities, like POS tagging, within news articles based on their relevance in discerning article authorship. We demonstrate the effectiveness of our metric by showing the high accuracy attained by a basic method, i.e., TF-IDF combined with a logistic regression classifier, using a small set of terms with the highest ESAS score. Consequently, we introduce and scrutinize these top ESAS-ranked terms to aid individuals in strengthening their skepticism towards LLM-generated fake news.

Submitted 20 June, 2024; originally announced June 2024.

arXiv:2406.12274 [pdf, other]

SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models

Authors: Somnath Banerjee, Soham Tripathy, Sayan Layek, Shanu Kumar, Animesh Mukherjee, Rima Hazra

Abstract: Safety-aligned language models often exhibit fragile and imbalanced safety mechanisms, increasing the likelihood of generating unsafe content. In addition, incorporating new knowledge through editing techniques to language models can further compromise safety. To address these issues, we propose SafeInfer, a context-adaptive, decoding-time safety alignment strategy for generating safe responses to user queries. SafeInfer comprises two phases: the safety amplification phase, which employs safe demonstration examples to adjust the model's hidden states and increase the likelihood of safer outputs, and the safety-guided decoding phase, which influences token selection based on safety-optimized distributions, ensuring the generated content complies with ethical guidelines. Further, we present HarmEval, a novel benchmark for extensive safety evaluations, designed to address potential misuse scenarios in accordance with the policies of leading AI tech giants.

Submitted 18 June, 2024; originally announced June 2024.

Comments: Under review

arXiv:2406.11139 [pdf, other]

Breaking Boundaries: Investigating the Effects of Model Editing on Cross-linguistic Performance

Authors: Somnath Banerjee, Avik Halder, Rajarshi Mandal, Sayan Layek, Ian Soboroff, Rima Hazra, Animesh Mukherjee

Abstract: The integration of pretrained language models (PLMs) like BERT and GPT has revolutionized NLP, particularly for English, but it has also created linguistic imbalances. This paper strategically identifies the need for linguistic equity by examining several knowledge editing techniques in multilingual contexts. We evaluate the performance of models such as Mistral, TowerInstruct, OpenHathi, Tamil-Llama, and Kan-Llama across languages including English, German, French, Italian, Spanish, Hindi, Tamil, and Kannada. Our research identifies significant discrepancies in normal and merged models concerning cross-lingual consistency. We employ strategies like 'each language for itself' (ELFI) and 'each language for others' (ELFO) to stress-test these models. Our findings demonstrate the potential for LLMs to overcome linguistic barriers, laying the groundwork for future research in achieving linguistic inclusivity in AI technologies.

Submitted 17 July, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

Comments: Under review

arXiv:2405.13559 [pdf]

Identification of microstructure from macroscopic measurement using inverse multiscale analysis

Authors: Anjan Mukherjee, Biswanath Banerjee

Abstract: Most of the tailored materials are heterogeneous at the ingredient level. Analysis of those heterogeneous structures requires the knowledge of microstructure. With the knowledge of microstructure, multiscale analysis is carried out with homogenization at the micro level. Second-order homogenization is carried out whenever the ingredient size is comparable to the structure size. Therefore, knowledge of microstructure and its size is indispensable to analyzing those heterogeneous structures. Again, any structural response contains all the information of microstructure, like microstructure distribution, volume fraction, size of ingredients, etc. Here, inverse analysis is carried out to identify a heterogeneous microstructure from macroscopic measurement. Two-step inverse analysis is carried out in the identification process; in the first step, the macrostructure's length scale and effective properties are identified from the macroscopic measurement using gradient-based optimization. In the second step, those effective properties and length scales are used to determine the microstructure in inverse second-order homogenization.

Submitted 22 May, 2024; originally announced May 2024.

Comments: Structural Engineering Convention SEC 2023

arXiv:2405.13384 [pdf, other]

Elastic-gap free strain gradient crystal plasticity model that effectively account for plastic slip gradient and grain boundary dissipation

Authors: Anjan Mukherjee, Biswanath Banerjee

Abstract: This paper proposes an elastic-gap free strain gradient crystal plasticity model that addresses dissipation caused by plastic slip gradient and grain boundary (GB) Burgers tensor. The model involves splitting plastic slip gradient and GB Burgers tensor into energetic and dissipative quantities. Unlike conventional models, the bulk and GB defect energy are considered to be a quadratic functional of the energetic portion of slip gradient and GB Burgers tensor. The higher-order stresses for each individual slip system and GB stresses are derived from the defect energy, following a similar evolution as the Armstrong-Frederick type backstress model in classical plasticity. The evolution equations consist of a hardening and a relaxation term. The relaxation term brings the nonlinearity in hardening and causes an additional dissipation. The applicability of the proposed model is numerically established with the help of two-dimensional finite element implementation. Specifically, the bulk and GB relaxation coefficients are critically evaluated based on various circumstances, considering single crystal infinite shear layer, periodic bicrystal shearing, and bicrystal tension problem. In contrast to the Gurtin-type model, the proposed model smoothly captures the apparent strengthening at saturation without causing any abrupt stress jump under non-proportional loading conditions. Moreover, when subjected to cyclic loading, the stress-strain curve maintains its curvature during reverse loading. The numerical simulation reveals that the movement of geometrically necessary dislocation (GND) towards the GB is influenced by the bulk recovery coefficient, while the dissipation and amount of accumulation of GND near the GB are controlled by the GB recovery coefficient.

Submitted 22 May, 2024; originally announced May 2024.

Comments: Submitted in Journal of the Mechanics and Physics of Solids

arXiv:2405.07795 [pdf, other]

Improved Bound for Robust Causal Bandits with Linear Models

Authors: Zirui Yan, Arpan Mukherjee, Burak Varıcı, Ali Tajer

Abstract: This paper investigates the robustness of causal bandits (CBs) in the face of temporal model fluctuations. This setting deviates from the existing literature's widely-adopted assumption of constant causal models. The focus is on causal systems with linear structural equation models (SEMs). The SEMs and the time-varying pre- and post-interventional statistical models are all unknown and subject to variations over time. The goal is to design a sequence of interventions that incur the smallest cumulative regret compared to an oracle aware of the entire causal model and its fluctuations. A robust CB algorithm is proposed, and its cumulative regret is analyzed by establishing both upper and lower bounds on the regret. It is shown that in a graph with maximum in-degree $d$, length of the largest causal path $L$, and an aggregate model deviation $C$, the regret is upper bounded by $\tilde{\mathcal{O}}(d^{L-\frac{1}{2}}(\sqrt{T} + C))$ and lower bounded by $\Omega(d^{\frac{L}{2}-2}\max\{\sqrt{T}\; ,\; d^2C\})$. The proposed algorithm achieves nearly optimal $\tilde{\mathcal{O}}(\sqrt{T})$ regret when $C$ is $o(\sqrt{T})$, maintaining sub-linear regret for a broad range of $C$.

Submitted 13 May, 2024; originally announced May 2024.

Comments: 11 pages, 3 figures. arXiv admin note: substantial text overlap with arXiv:2310.19794
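
The last claim of the abstract above follows from the stated upper bound in one step. The derivation below is a sketch that treats $d$ and $L$ as fixed constants and writes the cumulative regret as $R(T)$; both the constant-$d$, constant-$L$ simplification and the symbol $R(T)$ are assumptions made here for illustration.

```latex
\begin{align*}
R(T) &= \tilde{\mathcal{O}}\!\left(d^{L-\frac{1}{2}}\left(\sqrt{T} + C\right)\right)
  && \text{(upper bound quoted in the abstract)} \\
C = o(\sqrt{T}) \;&\Rightarrow\; \sqrt{T} + C \le 2\sqrt{T}
  && \text{for all sufficiently large } T \\
&\Rightarrow\; R(T) = \tilde{\mathcal{O}}\!\left(d^{L-\frac{1}{2}}\sqrt{T}\right)
  = \tilde{\mathcal{O}}\!\left(\sqrt{T}\right)
  && \text{for fixed } d,\, L.
\end{align*}
```

In the same regime the quoted lower bound reduces to $\Omega(\sqrt{T})$ in $T$, since $\max\{\sqrt{T}, d^2 C\} = \sqrt{T}$ once $d^2 C \le \sqrt{T}$, so the two bounds agree in their dependence on $T$ up to logarithmic factors. More generally, the upper bound remains sub-linear in $T$ whenever $C = o(T)$.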

arXiv:2405.03963 [pdf, other]

ERATTA: Extreme RAG for Table To Answers with Large Language Models

Authors: Sohini Roychowdhury, Marko Krema, Anvar Mahammad, Brian Moore, Arijit Mukherjee, Punit Prakashchandra

Abstract: Large language models (LLMs) with retrieval-augmented generation (RAG) have been the optimal choice for scalable generative AI solutions in the recent past. However, the choice of use-cases that incorporate RAG with LLMs has been either generic or extremely domain specific, thereby questioning the scalability and generalizability of RAG-LLM approaches. In this work, we propose a unique LLM-based system where multiple LLMs can be invoked to enable data authentication, user query routing, data retrieval and custom prompting for question answering capabilities from data tables that are highly varying and large in size. Our system is tuned to extract information from Enterprise-level data products and furnish real-time responses in under 10 seconds. One prompt manages user-to-data authentication, followed by three prompts to route, fetch data and generate customizable natural language responses. Additionally, we propose a five-metric scoring module that detects and reports hallucinations in the LLM responses. Our proposed system and scoring metrics achieve >90% confidence scores across hundreds of user queries in the sustainability, financial health and social media domains. Extensions to the proposed extreme RAG architectures can enable heterogeneous source querying using LLMs.

Submitted 14 May, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

Comments: 5 pages, 3 tables, Asilomar SSC Conference, 2024

arXiv:2404.08624 [pdf, other]

Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks

Authors: Matteo Tucat, Anirbit Mukherjee

Abstract: In this work, we instantiate a regularized form of the gradient clipping algorithm and prove that it can converge to the global minima of deep neural network loss functions provided that the net is of sufficient width. We present empirical evidence that our theoretically founded regularized gradient clipping algorithm is also competitive with the state-of-the-art deep-learning heuristics. Hence the algorithm presented here constitutes a new approach to rigorous deep learning. The modification we make to standard gradient clipping is designed to leverage the PL* condition, a variant of the Polyak-Lojasiewicz inequality which was recently proven to be true for various neural networks for any depth within a neighborhood of the initialisation.

Submitted 12 April, 2024; originally announced April 2024.

Comments: 16 pages, 4 figures
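
The abstract above does not state the exact form of the regularization. As a rough illustration only: standard norm-based gradient clipping scales the step by min(1, gamma / ||g||), and the sketch below adds a lower floor `delta` on that factor. The floor is an assumption made here to show what a "regularized" clip could look like, not a form confirmed by the abstract; the function names and the parameters `eta`, `gamma` and `delta` are hypothetical.

```python
import math


def clipped_step_size(grad, eta, gamma, delta=0.0):
    """Step size for norm-based gradient clipping with an optional floor.

    delta = 0.0 gives standard clipping: eta * min(1, gamma / ||grad||).
    delta > 0.0 keeps the factor from shrinking below delta; this floor is
    an assumed regularization, not taken from the abstract.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return eta  # zero gradient: the clipping factor is 1, nothing to clip
    return eta * max(delta, min(1.0, gamma / norm))


def gd_step(w, grad, eta, gamma, delta=0.0):
    """One gradient-descent update using the clipped step size."""
    h = clipped_step_size(grad, eta, gamma, delta)
    return [wi - h * gi for wi, gi in zip(w, grad)]


# Usage: a gradient of norm 5 clipped to threshold gamma = 1.
grad = [3.0, 4.0]
print(clipped_step_size(grad, eta=0.1, gamma=1.0))             # standard clipping
print(clipped_step_size(grad, eta=0.1, gamma=1.0, delta=0.5))  # with the assumed floor
```

With `delta = 0.0` the step size can become arbitrarily small as the gradient norm grows; a positive floor bounds it below by `eta * delta`, which is the kind of property a convergence argument might rely on.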

arXiv:2404.04979 [pdf, other]

CAVIAR: Categorical-Variable Embeddings for Accurate and Robust Inference

Authors: Anirban Mukherjee, Hannah Hanwen Chang

Abstract: Social science research often hinges on the relationship between categorical variables and outcomes. We introduce CAVIAR, a novel method for embedding categorical variables that assume values in a high-dimensional ambient space but are sampled from an underlying manifold. Our theoretical and numerical analyses outline challenges posed by such categorical variables in causal inference. Specifically, dynamically varying and sparse levels can lead to violations of the Donsker conditions and a failure of the estimation functionals to converge to a tight Gaussian process. Traditional approaches, including the exclusion of rare categorical levels and principled variable selection models like LASSO, fall short. CAVIAR embeds the data into a lower-dimensional global coordinate system. The mapping can be derived from both structured and unstructured data, and ensures stable and robust estimates through dimensionality reduction. In a dataset of direct-to-consumer apparel sales, we illustrate how high-dimensional categorical variables, such as zip codes, can be succinctly represented, facilitating inference and analysis.

Submitted 11 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.04436 [pdf, other]

AI Knowledge and Reasoning: Emulating Expert Creativity in Scientific Research

Authors: Anirban Mukherjee, Hannah Hanwen Chang

Abstract: We investigate whether modern AI can emulate expert creativity in complex scientific endeavors. We introduce a novel methodology that utilizes original research articles published after the AI's training cutoff, ensuring no prior exposure and mitigating concerns of rote memorization and prior training. The AI is tasked with redacting findings, predicting outcomes from redacted research, and assessing prediction accuracy against reported results. Analysis of 589 published studies in four leading psychology journals over a 28-month period showcases the AI's proficiency in understanding specialized research, deductive reasoning, and evaluating evidentiary alignment--cognitive hallmarks of human subject matter expertise and creativity. These findings suggest the potential of general-purpose AI to transform academia, with roles requiring knowledge-based creativity becoming increasingly susceptible to technological substitution.

Submitted 5 April, 2024; originally announced April 2024.

arXiv:2404.00185 [pdf, other]

On Inherent Adversarial Robustness of Active Vision Systems

Authors: Amitangshu Mukherjee, Timur Ibrayev, Kaushik Roy

Abstract: Current Deep Neural Networks are vulnerable to adversarial examples, which alter their predictions by adding carefully crafted noise. Since human eyes are robust to such inputs, it is possible that the vulnerability stems from the standard way of processing inputs in one shot by processing every pixel with the same importance. In contrast, neuroscience suggests that the human vision system can differentiate salient features by (1) switching between multiple fixation points (saccades) and (2) processing the surrounding with a non-uniform external resolution (foveation). In this work, we advocate that the integration of such active vision mechanisms into current deep learning systems can offer robustness benefits. Specifically, we empirically demonstrate the inherent robustness of two active vision methods - GFNet and FALcon - under a black box threat model. By learning and inferencing based on downsampled glimpses obtained from multiple distinct fixation points within an input, we show that these active methods achieve (2-3) times greater robustness compared to a standard passive convolutional network under state-of-the-art adversarial attacks. More importantly, we provide illustrative and interpretable visualization analysis that demonstrates how performing inference from distinct fixation points makes active vision methods less vulnerable to malicious inputs.

Submitted 5 April, 2024; v1 submitted 29 March, 2024; originally announced April 2024.

arXiv:2404.00017 [pdf, other]

Psittacines of Innovation? Assessing the True Novelty of AI Creations

Authors: Anirban Mukherjee

Abstract: We examine whether Artificial Intelligence (AI) systems generate truly novel ideas rather than merely regurgitating patterns learned during training. Utilizing a novel experimental design, we task an AI with generating project titles for hypothetical crowdfunding campaigns. We compare within AI-generated project titles, measuring repetition and complexity. We compare between the AI-generated titles and actual observed field data using an extension of maximum mean discrepancy--a metric derived from the application of kernel mean embeddings of statistical distributions to high-dimensional machine learning (large language) embedding vectors--yielding a structured analysis of AI output novelty. Results suggest that (1) the AI generates unique content even under increasing task complexity, and at the limits of its computational capabilities, (2) the generated content has face validity, being consistent with both inputs to other generative AI and in qualitative comparison to field data, and (3) exhibits divergence from field data, mitigating concerns relating to intellectual property rights. We discuss implications for copyright and trademark law.

Submitted 17 March, 2024; originally announced April 2024.

arXiv:2403.18623 [pdf, other]

Antitrust, Amazon, and Algorithmic Auditing

Authors: Abhisek Dash, Abhijnan Chakraborty, Saptarshi Ghosh, Animesh Mukherjee, Jens Frankenreiter, Stefan Bechtold, Krishna P. Gummadi

Abstract: In digital markets, antitrust law and special regulations aim to ensure that markets remain competitive despite the dominating role that digital platforms play today in everyone's life. Unlike traditional markets, market participant behavior is easily observable in these markets. We present a series of empirical investigations into the extent to which Amazon engages in practices that are typically described as self-preferencing. We discuss how the computer science tools used in this paper can be used in a regulatory environment that is based on algorithmic auditing and requires regulating digital markets at scale.

Submitted 25 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

Comments: The paper has been accepted to appear at Journal of Institutional and Theoretical Economics (JITE) 2024

arXiv:2403.17692 [pdf, other]

Manifold-Guided Lyapunov Control with Diffusion Models

Authors: Amartya Mukherjee, Thanin Quartz, Jun Liu

Abstract: This paper presents a novel approach to generating stabilizing controllers for a large class of dynamical systems using diffusion models. The core objective is to develop stabilizing control functions by identifying the closest asymptotically stable vector field relative to a predetermined manifold and adjusting the control function based on this finding. To achieve this, we employ a diffusion model trained on pairs consisting of asymptotically stable vector fields and their corresponding Lyapunov functions. Our numerical results demonstrate that this pre-trained model can achieve stabilization over previously unseen systems efficiently and rapidly, showcasing the potential of our approach in fast zero-shot control and generalizability.

Submitted 26 March, 2024; originally announced March 2024.

Comments: 14 pages

arXiv:2403.15977 [pdf, other]

doi 10.1109/TCDS.2024.3390597

Towards Two-Stream Foveation-based Active Vision Learning

Authors: Timur Ibrayev, Amitangshu Mukherjee, Sai Aparna Aketi, Kaushik Roy

Abstract: Deep neural network (DNN) based machine perception frameworks process the entire input in a one-shot manner to provide answers to both "what object is being observed" and "where it is located". In contrast, the "two-stream hypothesis" from neuroscience explains the neural processing in the human visual cortex as an active vision system that utilizes two separate regions of the brain to answer the what and the where questions. In this work, we propose a machine learning framework inspired by the "two-stream hypothesis" and explore the potential benefits that it offers. Specifically, the proposed framework models the following mechanisms: 1) ventral (what) stream focusing on the input regions perceived by the fovea part of an eye (foveation), 2) dorsal (where) stream providing visual guidance, and 3) iterative processing of the two streams to calibrate visual focus and process the sequence of focused image patches. The training of the proposed framework is accomplished by label-based DNN training for the ventral stream model and reinforcement learning for the dorsal stream model. We show that the two-stream foveation-based learning is applicable to the challenging task of weakly-supervised object localization (WSOL), where the training data is limited to the object class or its attributes. The framework is capable of both predicting the properties of an object and successfully localizing it by predicting its bounding box. We also show that, due to the independent nature of the two streams, the dorsal model can be applied on its own to unseen images to localize objects from different datasets.

Submitted 20 April, 2024; v1 submitted 23 March, 2024; originally announced March 2024.

Comments: Accepted version of the article, 18 pages, 14 figures

Journal ref: IEEE Transactions on Cognitive and Developmental Systems, 2024

arXiv:2403.14938 [pdf, ps, other]

On Zero-Shot Counterspeech Generation by LLMs

Authors: Punyajoy Saha, Aalok Agrawal, Abhik Jana, Chris Biemann, Animesh Mukherjee

Abstract: With the emergence of numerous Large Language Models (LLMs), the usage of such models in various Natural Language Processing (NLP) applications is increasing extensively. Counterspeech generation is one such key task where efforts are made to develop generative models by fine-tuning LLMs with hate speech-counterspeech pairs, but none of these attempts explores the intrinsic properties of large language models in zero-shot settings. In this work, we present a comprehensive analysis of the performance of four LLMs, namely GPT-2, DialoGPT, ChatGPT and FlanT5, in zero-shot settings for counterspeech generation, which is the first of its kind. For GPT-2 and DialoGPT, we further investigate the deviation in performance with respect to the sizes (small, medium, large) of the models. In addition, we propose three different prompting strategies for generating different types of counterspeech and analyse the impact of such strategies on the performance of the models. Our analysis shows that there is an improvement in generation quality for two datasets (17%); however, toxicity increases (25%) with an increase in model size. Considering the type of model, GPT-2 and FlanT5 are significantly better in terms of counterspeech quality but also have high toxicity compared to DialoGPT. ChatGPT is much better at generating counterspeech than the other models across all metrics. In terms of prompting, we find that our proposed strategies help in improving counterspeech generation across all the models.

Submitted 22 March, 2024; originally announced March 2024.

Comments: 12 pages, 7 tables, accepted at LREC-COLING 2024

arXiv:2403.14706 [pdf, other]

Safeguarding Marketing Research: The Generation, Identification, and Mitigation of AI-Fabricated Disinformation

Authors: Anirban Mukherjee

Abstract: Generative AI has ushered in the ability to generate content that closely mimics human contributions, introducing an unprecedented threat: Deployed en masse, these models can be used to manipulate public opinion and distort perceptions, resulting in a decline in trust towards digital platforms. This study contributes to marketing literature and practice in three ways. First, it demonstrates the proficiency of AI in fabricating disinformative user-generated content (UGC) that mimics the form of authentic content. Second, it quantifies the disruptive impact of such UGC on marketing research, highlighting the susceptibility of analytics frameworks to even minimal levels of disinformation. Third, it proposes and evaluates advanced detection frameworks, revealing that standard techniques are insufficient for filtering out AI-generated disinformation. We advocate for a comprehensive approach to safeguarding marketing research that integrates advanced algorithmic solutions, enhanced human oversight, and a reevaluation of regulatory and ethical frameworks. Our study seeks to serve as a catalyst, providing a foundation for future research and policy-making aimed at navigating the intricate challenges at the nexus of technology, ethics, and marketing.

Submitted 17 March, 2024; originally announced March 2024.

arXiv:2403.10058 [pdf, other]

RID-TWIN: An end-to-end pipeline for automatic face de-identification in videos

Authors: Anirban Mukherjee, Monjoy Narayan Choudhury, Dinesh Babu Jayagopi

Abstract: Face de-identification in videos is a challenging task in the domain of computer vision, primarily used in privacy-preserving applications. Despite the considerable progress achieved through generative vision models, there remain multiple challenges in the latest approaches. They lack a comprehensive discussion and evaluation of aspects such as realism, temporal coherence, and preservation of non-identifiable features. In our work, we propose RID-Twin: a novel pipeline that leverages the state-of-the-art generative models, and decouples identity from motion to perform automatic face de-identification in videos. We investigate the task from a holistic point of view and discuss how our approach addresses the pertinent existing challenges in this domain. We evaluate the performance of our methodology on the widely employed VoxCeleb2 dataset, and also a custom dataset designed to accommodate the limitations of certain behavioral variations absent in the VoxCeleb2 dataset. We discuss the implications and advantages of our work and suggest directions for future research.

Submitted 15 March, 2024; originally announced March 2024.

Comments: This work has been submitted to IEEE ICIP 2024

arXiv:2403.09404 [pdf, other]

Heuristic Reasoning in AI: Instrumental Use and Mimetic Absorption

Authors: Anirban Mukherjee, Hannah Hanwen Chang

Abstract: Deviating from conventional perspectives that frame artificial intelligence (AI) systems solely as logic emulators, we propose a novel program of heuristic reasoning. We distinguish between the 'instrumental' use of heuristics to match resources with objectives, and 'mimetic absorption,' whereby heuristics manifest randomly and universally. Through a series of innovative experiments, including variations of the classic Linda problem and a novel application of the Beauty Contest game, we uncover trade-offs between maximizing accuracy and reducing effort that shape the conditions under which AIs transition between exhaustive logical processing and the use of cognitive shortcuts (heuristics). We provide evidence that AIs manifest an adaptive balancing of precision and efficiency, consistent with principles of resource-rational human cognition as explicated in classical theories of bounded rationality and dual-process theory. Our findings reveal a nuanced picture of AI cognition, where trade-offs between resources and objectives lead to the emulation of biological systems, especially human cognition, despite AIs being designed without a sense of self and lacking introspective capabilities.

Submitted 18 March, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.09289 [pdf, other]

Silico-centric Theory of Mind

Authors: Anirban Mukherjee, Hannah Hanwen Chang

Abstract: Theory of Mind (ToM) refers to the ability to attribute mental states, such as beliefs, desires, intentions, and knowledge, to oneself and others, and to understand that these mental states can differ from one's own and from reality. We investigate ToM in environments with multiple, distinct, independent AI agents, each possessing unique internal states, information, and objectives. Inspired by human false-belief experiments, we present an AI ('focal AI') with a scenario where its clone undergoes a human-centric ToM assessment. We prompt the focal AI to assess whether its clone would benefit from additional instructions. Concurrently, we give its clones the ToM assessment, both with and without the instructions, thereby engaging the focal AI in higher-order counterfactual reasoning akin to human mentalizing--with respect to humans in one test and to other AI in another. We uncover a discrepancy: Contemporary AI demonstrates near-perfect accuracy on human-centric ToM assessments. Since information embedded in one AI is identically embedded in its clone, additional instructions are redundant. Yet, we observe AI crafting elaborate instructions for their clones, erroneously anticipating a need for assistance. An independent referee AI agrees with these unsupported expectations. Neither the focal AI nor the referee demonstrates ToM in our 'silico-centric' test.

Submitted 14 March, 2024; originally announced March 2024.

arXiv:2403.05434 [pdf]

Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs

Authors: Arijit Nag, Animesh Mukherjee, Niloy Ganguly, Soumen Chakrabarti

Abstract: Large Language Models (LLMs) exhibit impressive zero/few-shot inference and generation quality for high-resource languages (HRLs). A few of them have been trained on low-resource languages (LRLs) and give decent performance. Owing to the prohibitive costs of training LLMs, they are usually used as a network service, with the client charged by the count of input and output tokens. The number of tokens strongly depends on the script and language, as well as the LLM's subword vocabulary. We show that LRLs are at a pricing disadvantage, because the well-known LLMs produce more tokens for LRLs than HRLs. This is because most currently popular LLMs are optimized for HRL vocabularies. Our objective is to level the playing field: reduce the cost of processing LRLs in contemporary LLMs while ensuring that predictive and generative qualities are not compromised. As a means to reduce the number of tokens processed by the LLM, we consider code-mixing, translation, and transliteration of LRLs to HRLs. We perform an extensive study using the IndicXTREME classification and six generative tasks dataset, covering 15 Indic and 3 other languages, while using GPT-4 (one of the costliest LLM services released so far) as a commercial LLM. We observe and analyze interesting patterns involving token count, cost, and quality across a multitude of languages and tasks. We show that choosing the best policy to interact with the LLM can reduce cost by 90% while giving better or comparable performance compared to communicating with the LLM in the original LRL.

Submitted 18 April, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.04298 [pdf, other]

doi 10.1109/WI-IAT55865.2022.00096

Understanding how social discussion platforms like Reddit are influencing financial behavior

Authors: Sachin Thukral, Suyash Sangwan, Arnab Chatterjee, Lipika Dey, Aaditya Agrawal, Pramit Kumar Chandra, Animesh Mukherjee

Abstract: This study proposes content and interaction analysis techniques for a large repository created from social media content. Though we have presented our study for a large platform dedicated to discussions around financial topics, the proposed methods are generic and applicable to all platforms. Along with an extension of the topic extraction method using Latent Dirichlet Allocation, we propose a few measures to assess user participation, influence and topic affinities specifically. Our study also maps user-generated content to components of behavioral finance. While these types of information are usually gathered through surveys, large-scale data analysis from social media can reveal many potentially unknown or rare insights. We also use graphical analysis to characterise users based on their platform behavior, providing critical insights into how communities are formed and trust is established on these platforms.

Submitted 12 March, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

Comments: 8 pages, 8 figures, 3 tables, and 1 algorithm; Published in WI-IAT 2022 (The 21st IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology)

Journal ref: IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) 2022 (pp. 612-619)

arXiv:2402.16159 [pdf, other]

DistALANER: Distantly Supervised Active Learning Augmented Named Entity Recognition in the Open Source Software Ecosystem

Authors: Somnath Banerjee, Avik Dutta, Aaditya Agrawal, Rima Hazra, Animesh Mukherjee

Abstract: With the AI revolution in place, the trend for building automated systems to support professionals in different domains such as the open source software systems, healthcare systems, banking systems, transportation systems and many others has become increasingly prominent. A crucial requirement in the automation of support tools for such systems is the early identification of named entities, which serves as a foundation for developing specialized functionalities. However, due to the specific nature of each domain, different technical terminologies and specialized languages, expert annotation of available data becomes expensive and challenging. In light of these challenges, this paper proposes a novel named entity recognition (NER) technique specifically tailored for the open-source software systems. Our approach aims to address the scarcity of annotated software data by employing a comprehensive two-step distantly supervised annotation process. This process strategically leverages language heuristics, unique lookup tables, external knowledge sources, and an active learning approach. By harnessing these powerful techniques, we not only enhance model performance but also effectively mitigate the limitations associated with cost and the scarcity of expert annotators. It is noteworthy that our model outperforms the state-of-the-art LLMs by a substantial margin. We also show the effectiveness of NER in the downstream task of relation extraction.

Submitted 20 June, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

Comments: Accepted at ECML-PKDD 2024 (Long Paper)

arXiv:2402.15302 [pdf, other]

How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries

Authors: Somnath Banerjee, Sayan Layek, Rima Hazra, Animesh Mukherjee

Abstract: In this study, we tackle a growing concern around the safety and ethical use of large language models (LLMs). Despite their potential, these models can be tricked into producing harmful or unethical content through various sophisticated methods, including 'jailbreaking' techniques and targeted manipulation. Our work zeroes in on a specific issue: to what extent LLMs can be led astray by asking them to generate responses that are instruction-centric, such as pseudocode, a program or a software snippet, as opposed to vanilla text. To investigate this question, we introduce TechHazardQA, a dataset containing complex queries which should be answered in both text and instruction-centric formats (e.g., pseudocode), aimed at identifying triggers for unethical responses. We query a series of LLMs -- Llama-2-13b, Llama-2-7b, Mistral-V2 and Mistral 8X7B -- and ask them to generate both text and instruction-centric responses. For evaluation we report the harmfulness score metric as well as judgements from GPT-4 and humans. Overall, we observe that asking LLMs to produce instruction-centric responses enhances the unethical response generation by ~2-38% across the models. As an additional objective, we investigate the impact of model editing using the ROME technique, which further increases the propensity for generating undesirable content. In particular, asking edited LLMs to generate instruction-centric responses further increases the unethical response generation by ~3-16% across the different models.

Submitted 15 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

Comments: Under review. https://huggingface.co/datasets/SoftMINER-Group/TechHazardQA

arXiv:2402.14702 [pdf, other]

InfFeed: Influence Functions as a Feedback to Improve the Performance of Subjective Tasks

Authors: Somnath Banerjee, Maulindu Sarkar, Punyajoy Saha, Binny Mathew, Animesh Mukherjee

Abstract: Recently, influence functions have presented an apparatus for achieving explainability for deep neural models by quantifying the perturbation of individual training instances that might impact a test prediction. Our objectives in this paper are twofold. First, we incorporate influence functions as feedback into the model to improve its performance. Second, in a dataset extension exercise, we use influence functions to automatically identify data points that have been initially 'silver' annotated by some existing method and need to be cross-checked (and corrected) by annotators to improve the model performance. To meet these objectives, in this paper, we introduce InfFeed, which uses influence functions to compute the influential instances for a target instance. Toward the first objective, we adjust the label of the target instance based on the label(s) of its influencer(s). In doing this, InfFeed outperforms the state-of-the-art baselines (including LLMs) by a maximum macro F1-score margin of almost 4% for hate speech classification, 3.5% for stance classification, 3% for irony detection and 2% for sarcasm detection. Toward the second objective, we show that manually re-annotating only those silver annotated data points in the extension set that have a negative influence can immensely improve the model performance, bringing it very close to the scenario where all the data points in the extension set have gold labels. This allows for a huge reduction in the number of data points that need to be manually annotated since, out of the silver annotated extension dataset, the influence function scheme picks up ~1/1000 points that need manual correction.

Submitted 9 March, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

Comments: Accepted at LREC-COLING 2024 (Long Paper)

arXiv:2402.13771 [pdf, other]

Mask-up: Investigating Biases in Face Re-identification for Masked Faces

Authors: Siddharth D Jaiswal, Ankit Kr. Verma, Animesh Mukherjee

Abstract: AI based Face Recognition Systems (FRSs) are now widely distributed and deployed as MLaaS solutions all over the world, more so since the COVID-19 pandemic, for tasks ranging from validating individuals' faces while buying SIM cards to surveillance of citizens. Extensive biases have been reported against marginalized groups in these systems and have led to highly discriminatory outcomes. The post-pandemic world has normalized wearing face masks but FRSs have not kept up with the changing times. As a result, these systems are susceptible to mask based face occlusion. In this study, we audit four commercial and nine open-source FRSs for the task of face re-identification between different varieties of masked and unmasked images across five benchmark datasets (total 14,722 images). These simulate a realistic validation/surveillance task as deployed in all major countries around the world. Three of the commercial and five of the open-source FRSs are highly inaccurate; they further perpetuate biases against non-White individuals, with the lowest accuracy being 0%. A survey for the same task with 85 human participants also results in a low accuracy of 40%. Thus a human-in-the-loop moderation in the pipeline does not alleviate the concerns, as has been frequently hypothesized in literature. Our large-scale study shows that developers, lawmakers and users of such services need to rethink the design principles behind FRSs, especially for the task of face re-identification, taking cognizance of observed biases.

Submitted 21 February, 2024; originally announced February 2024.

Comments: This work has been submitted to the IEEE for possible publication

arXiv:2402.12881 [pdf, other]

GRAFFORD: A Benchmark Dataset for Testing the Knowledge of Object Affordances of Language and Vision Models

Authors: Sayantan Adak, Daivik Agrawal, Animesh Mukherjee, Somak Aditya

Abstract: We investigate the knowledge of object affordances in pre-trained language models (LMs) and pre-trained Vision-Language models (VLMs). Transformers-based large pre-trained language models (PTLM) learn contextual representation from massive amounts of unlabeled text and are shown to perform impressively in downstream NLU tasks. In parallel, a growing body of literature shows that PTLMs fail inconsistently and non-intuitively, showing a lack of reasoning and grounding. To take a first step toward quantifying the effect of grounding (or lack thereof), we curate a novel and comprehensive dataset of object affordances -- GrAFFORD, characterized by 15 affordance classes. Unlike affordance datasets collected in vision and language domains, we annotate in-the-wild sentences with objects and affordances. Experimental results reveal that PTLMs exhibit limited reasoning abilities when it comes to uncommon object affordances. We also observe that pre-trained VLMs do not necessarily capture object affordances effectively. Through few-shot fine-tuning, we demonstrate improvement in affordance knowledge in PTLMs and VLMs. Our research contributes a novel dataset for language grounding tasks, and presents insights into LM capabilities, advancing the understanding of object affordances. Codes and data are available at https://github.com/sayantan11995/Affordance

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.12198 [pdf, other]

Zero shot VLMs for hate meme detection: Are we there yet?

Authors: Naquee Rizwan, Paramananda Bhaskar, Mithun Das, Swadhin Satyaprakash Majhi, Punyajoy Saha, Animesh Mukherjee

Abstract: Multimedia content on social media is rapidly evolving, with memes gaining prominence as a distinctive form. Unfortunately, some malicious users exploit memes to target individuals or vulnerable communities, making it imperative to identify and address such instances of hateful memes. Extensive research has been conducted to address this issue by developing hate meme detection models. However, a notable limitation of traditional machine/deep learning models is the requirement for labeled datasets for accurate classification. Recently, the research community has witnessed the emergence of several visual language models that have exhibited outstanding performance across various tasks. In this study, we aim to investigate the efficacy of these visual language models in handling intricate tasks such as hate meme detection. We use various prompt settings to focus on zero-shot classification of hateful/harmful memes. Through our analysis, we observe that large VLMs are still vulnerable for zero-shot hate meme detection.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.08563 [pdf, other]

Denoising Diffusion Restoration Tackles Forward and Inverse Problems for the Laplace Operator

Authors: Amartya Mukherjee, Melissa M. Stadt, Lena Podina, Mohammad Kohandel, Jun Liu

Abstract: Diffusion models have emerged as a promising class of generative models that map noisy inputs to realistic images. More recently, they have been employed to generate solutions to partial differential equations (PDEs). However, they still struggle with inverse problems in the Laplacian operator, for instance, the Poisson equation, because the eigenvalues that are large in magnitude amplify the measurement noise. This paper presents a novel approach for the inverse and forward solution of PDEs through the use of denoising diffusion restoration models (DDRM). DDRMs were used in linear inverse problems to restore original clean signals by exploiting the singular value decomposition (SVD) of the linear operator. Equivalently, we present an approach to restore the solution and the parameters in the Poisson equation by exploiting the eigenvalues and the eigenfunctions of the Laplacian operator. Our results show that using denoising diffusion restoration significantly improves the estimation of the solution and parameters. Our research, as a result, pioneers the integration of diffusion models with the principles of underlying physics to solve PDEs.

Submitted 14 February, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

Comments: 29 pages

arXiv:2402.07262 [pdf, other]

Low-Resource Counterspeech Generation for Indic Languages: The Case of Bengali and Hindi

Authors: Mithun Das, Saurabh Kumar Pandey, Shivansh Sethi, Punyajoy Saha, Animesh Mukherjee

Abstract: With the rise of online abuse, the NLP community has begun investigating the use of neural architectures to generate counterspeech that can "counter" the vicious tone of such abusive speech and dilute/ameliorate their rippling effect over the social network. However, most of the efforts so far have been primarily focused on English. To bridge the gap for low-resource languages such as Bengali and Hindi, we create a benchmark dataset of 5,062 abusive speech/counterspeech pairs, of which 2,460 pairs are in Bengali and 2,602 pairs are in Hindi. We implement several baseline models considering various interlingual transfer mechanisms with different configurations to generate suitable counterspeech to set up an effective benchmark. We observe that the monolingual setup yields the best performance. Further, using synthetic transfer, language models can generate counterspeech to some extent; specifically, we notice that transferability is better when languages belong to the same language family.

Submitted 11 February, 2024; originally announced February 2024.

Comments: Accepted to the Findings of the ACL: EACL 2024

arXiv:2401.12671 [pdf, other]

Context Matters: Pushing the Boundaries of Open-Ended Answer Generation with Graph-Structured Knowledge Context

Authors: Somnath Banerjee, Amruit Sahoo, Sayan Layek, Avik Dutta, Rima Hazra, Animesh Mukherjee

Abstract: In the continuously advancing AI landscape, crafting context-rich and meaningful responses via Large Language Models (LLMs) is essential. Researchers are becoming more aware of the challenges that LLMs with fewer parameters encounter when trying to provide suitable answers to open-ended questions. To address these hurdles, the integration of cutting-edge strategies, such as the augmentation of rich external domain knowledge to LLMs, offers significant improvements. This paper introduces a novel framework that combines graph-driven context retrieval with knowledge-graph-based enhancement, honing the proficiency of LLMs, especially in domain-specific community question answering platforms like AskUbuntu, Unix, and ServerFault. We conduct experiments on various LLMs with different parameter sizes to evaluate their ability to ground knowledge and determine factual accuracy in answers to open-ended questions. Our methodology GraphContextGen consistently outperforms dominant text-based retrieval systems, demonstrating its robustness and adaptability to a larger number of use cases. This advancement highlights the importance of pairing context-rich data retrieval with LLMs, offering a renewed approach to knowledge sourcing and generation in AI systems. We also show that, due to rich contextual data retrieval, the crucial entities, along with the generated answer, remain factually coherent with the gold answer.

Submitted 5 March, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

arXiv:2401.05834 [pdf, ps, other]

Modeling Online Paging in Multi-Core Systems

Authors: Mathieu Mari, Anish Mukherjee, Runtian Ren, Piotr Sankowski

Abstract: Web requests have been growing exponentially since the 90s due to the rapid development of the Internet. This process was further accelerated by the introduction of cloud services. It has been observed statistically that memory or web requests generally follow a power-law distribution (Breslau et al., INFOCOM'99). That is, the $i^{\text{th}}$ most popular web page is requested with a probability proportional to $1 / i^α$ ($α > 0$ is a constant). Furthermore, this study, which was performed more than 20 years ago, indicated Zipf-like behavior, i.e., that $α \le 1$. Surprisingly, the memory access traces coming from petabyte-size modern cloud systems not only show that $α$ can be bigger than one but also illustrate a shifted power-law distribution -- called Pareto type II or Lomax. This previously unreported phenomenon calls for a statistical explanation. Our first contribution is a new statistical multi-core power-law model indicating that double-power law can be attributed to the presence of multiple cores running many virtual machines in parallel on such systems. We verify experimentally the applicability of this model using the Kolmogorov-Smirnov test (K-S test). The second contribution of this paper is a theoretical analysis indicating why LRU and LFU-based algorithms perform well in practice on data satisfying power-law or multi-core assumptions. We provide an explanation by studying the online paging problem in the stochastic input model, i.e., the input is a random sequence with each request independently drawn from a page set according to a distribution $π$. We derive formulas (as a function of the page probabilities in $π$) to upper bound their ratio-of-expectations, which help in establishing an O(1) performance ratio for random sequences following power-law and multi-core power-law distributions.

Submitted 12 January, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

arXiv:2401.05538 [pdf, other]

Multi-objective Feature Selection in Remote Health Monitoring Applications

Authors: Le Ngu Nguyen, Constantino Álvarez Casado, Manuel Lage Cañellas, Anirban Mukherjee, Nhi Nguyen, Dinesh Babu Jayagopi, Miguel Bordallo López

Abstract: Radio frequency (RF) signals have facilitated the development of non-contact human monitoring tasks, such as vital signs measurement, activity recognition, and user identification. In some specific scenarios, an RF signal analysis framework may prioritize the performance of one task over that of others. In response to this requirement, we employ a multi-objective optimization approach inspired by biological principles to select discriminative features that enhance the accuracy of breathing patterns recognition while simultaneously impeding the identification of individual users. This approach is validated using a novel vital signs dataset consisting of 50 subjects engaged in four distinct breathing patterns. Our findings indicate a remarkable result: a substantial divergence in accuracy between breathing recognition and user identification. As a complementary viewpoint, we present a contrariwise result to maximize user identification accuracy and minimize the system's capacity for breathing activity recognition.

Submitted 10 January, 2024; originally announced January 2024.

Comments: Under review

arXiv:2401.02649 [pdf, other]

Enhancing 3D-Air Signature by Pen Tip Tail Trajectory Awareness: Dataset and Featuring by Novel Spatio-temporal CNN

Authors: Saurabh Atreya, Maheswar Bora, Aritra Mukherjee, Abhijit Das

Abstract: This work proposes a novel process of using pen tip and tail 3D trajectory for air signature. To acquire the trajectories we developed a new pen tool and a stereo camera was used. We proposed SliT-CNN, a novel 2D spatial-temporal convolutional neural network (CNN) for better featuring of the air signature. In addition, we also collected an air signature dataset from 45 signers. Skilled forgery signatures per user are also collected. A detailed benchmarking of the proposed dataset using existing techniques, and of the proposed CNN on the existing and proposed datasets, exhibits the effectiveness of our methodology.

Submitted 5 January, 2024; originally announced January 2024.

Comments: Accepted and presented in IJCB 2023

arXiv:2401.02646 [pdf, other]

Recent Advancement in 3D Biometrics using Monocular Camera

Authors: Aritra Mukherjee, Abhijit Das

Abstract: Recent literature has witnessed significant interest towards 3D biometrics employing monocular vision for robust authentication methods. Motivated by this, in this work we seek to provide insight into recent developments in the area of 3D biometrics employing monocular vision. We present the similarity and dissimilarity of 3D monocular biometrics and classical biometrics, listing the strengths and challenges. Further, we provide an overview of recent techniques in 3D biometrics with monocular vision, as well as application systems adopted by the industry. Finally, we discuss open research problems in this area of research.

Submitted 5 January, 2024; originally announced January 2024.

Comments: Accepted and presented in IJCB 2023

arXiv:2312.16256 [pdf, other]

DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision

Authors: Lu Ling, Yichen Sheng, Zhi Tu, Wentian Zhao, Cheng Xin, Kun Wan, Lantao Yu, Qianyu Guo, Zixun Yu, Yawen Lu, Xuanmao Li, Xingpeng Sun, Rohan Ashok, Aniruddha Mukherjee, Hao Kang, Xiangrui Kong, Gang Hua, Tianyi Zhang, Bedrich Benes, Aniket Bera

Abstract: We have witnessed significant progress in deep learning-based 3D vision, ranging from neural radiance field (NeRF) based 3D representation learning to applications in novel view synthesis (NVS). However, existing scene-level datasets for deep learning-based 3D vision, limited to either synthetic environments or a narrow selection of real-world scenes, are quite insufficient. This insufficiency not only hinders a comprehensive benchmark of existing methods but also caps what could be explored in deep learning-based 3D analysis. To address this critical gap, we present DL3DV-10K, a large-scale scene dataset, featuring 51.2 million frames from 10,510 videos captured from 65 types of point-of-interest (POI) locations, covering both bounded and unbounded scenes, with different levels of reflection, transparency, and lighting. We conducted a comprehensive benchmark of recent NVS methods on DL3DV-10K, which revealed valuable insights for future research in NVS. In addition, we have obtained encouraging results in a pilot study to learn generalizable NeRF from DL3DV-10K, which manifests the necessity of a large-scale scene-level dataset to forge a path toward a foundation model for learning 3D representation. Our DL3DV-10K dataset, benchmark results, and models will be publicly accessible at https://dl3dv-10k.github.io/DL3DV-10K/.

Submitted 29 December, 2023; v1 submitted 25 December, 2023; originally announced December 2023.

arXiv:2312.07601 [pdf, other]

Non-contact Multimodal Indoor Human Monitoring Systems: A Survey

Authors: Le Ngu Nguyen, Praneeth Susarla, Anirban Mukherjee, Manuel Lage Cañellas, Constantino Álvarez Casado, Xiaoting Wu, Olli Silvén, Dinesh Babu Jayagopi, Miguel Bordallo López

Abstract: Indoor human monitoring systems leverage a wide range of sensors, including cameras, radio devices, and inertial measurement units, to collect extensive data from users and the environment. These sensors contribute diverse data modalities, such as video feeds from cameras, received signal strength indicators and channel state information from WiFi devices, and three-axis acceleration data from inertial measurement units. In this context, we present a comprehensive survey of multimodal approaches for indoor human monitoring systems, with a specific focus on their relevance in elderly care. Our survey primarily highlights non-contact technologies, particularly cameras and radio devices, as key components in the development of indoor human monitoring systems. Throughout this article, we explore well-established techniques for extracting features from multimodal data sources. Our exploration extends to methodologies for fusing these features and harnessing multiple modalities to improve the accuracy and robustness of machine learning models. Furthermore, we conduct comparative analysis across different data modalities in diverse human monitoring tasks and undertake a comprehensive examination of existing multimodal datasets. This extensive survey not only highlights the significance of indoor human monitoring systems but also affirms their versatile applications. In particular, we emphasize their critical role in enhancing the quality of elderly care, offering valuable insights into the development of non-contact monitoring solutions applicable to the needs of aging populations.

Submitted 11 December, 2023; originally announced December 2023.

Comments: 19 pages, 5 figures

arXiv:2312.05686 [pdf, other]

Privacy Preserving Multi-Agent Reinforcement Learning in Supply Chains

Authors: Ananta Mukherjee, Peeyush Kumar, Boling Yang, Nishanth Chandran, Divya Gupta

Abstract: This paper addresses privacy concerns in multi-agent reinforcement learning (MARL), specifically within the context of supply chains where individual strategic data must remain confidential. Organizations within the supply chain are modeled as agents, each seeking to optimize their own objectives while interacting with others. As each organization's strategy is contingent on neighboring strategies, maintaining privacy of state and action-related information is crucial. To tackle this challenge, we propose a game-theoretic, privacy-preserving mechanism, utilizing a secure multi-party computation (MPC) framework in MARL settings. Our major contribution is the successful implementation of a secure MPC framework, SecFloat on EzPC, to solve this problem. However, simply implementing policy gradient methods such as MADDPG operations using SecFloat, while conceptually feasible, would be programmatically intractable. To overcome this hurdle, we devise a novel approach that breaks down the forward and backward pass of the neural network into elementary operations compatible with SecFloat, creating efficient and secure versions of the MADDPG algorithm. Furthermore, we present a learning mechanism that carries out floating point operations in a privacy-preserving manner, an important feature for successful learning in the MARL framework. Experiments reveal that there is on average 68.19% less supply chain wastage in 2 PC compared to no data share, while also giving on average 42.27% better average cumulative revenue for each player. This work paves the way for practical, privacy-preserving MARL, promising significant improvements in secure computation within supply chain contexts and broadly.

Submitted 9 December, 2023; originally announced December 2023.

arXiv:2312.01500 [pdf, other]

Unsupervised Approach to Evaluate Sentence-Level Fluency: Do We Really Need Reference?

Authors: Gopichand Kanumolu, Lokesh Madasu, Pavan Baswani, Ananya Mukherjee, Manish Shrivastava

Abstract: Fluency is a crucial goal of all Natural Language Generation (NLG) systems. Widely used automatic evaluation metrics fall short in capturing the fluency of machine-generated text. Assessing the fluency of NLG systems poses a challenge since these models are not limited to simply reusing words from the input but may also generate abstractions. Existing reference-based fluency evaluations, such as word overlap measures, often exhibit weak correlations with human judgments. This paper adapts an existing unsupervised technique for measuring text fluency without the need for any reference. Our approach leverages various word embeddings and trains language models using Recurrent Neural Network (RNN) architectures. We also experiment with other available multilingual Language Models (LMs). To assess the performance of the models, we conduct a comparative analysis across 10 Indic languages, correlating the obtained fluency scores with human judgments. Our code and human-annotated benchmark test-set for fluency are available at https://github.com/AnanyaCoder/TextFluencyForIndicLanaguges.

Submitted 3 December, 2023; originally announced December 2023.

Comments: Accepted at IJCNLP-AACL SEALP Workshop

arXiv:2311.07592 [pdf, other]

Hallucination-minimized Data-to-answer Framework for Financial Decision-makers

Authors: Sohini Roychowdhury, Andres Alvarez, Brian Moore, Marko Krema, Maria Paz Gelpi, Federico Martin Rodriguez, Angel Rodriguez, Jose Ramon Cabrejas, Pablo Martinez Serrano, Punit Agrawal, Arijit Mukherjee

Abstract: Large Language Models (LLMs) have been applied to build several automation and personalized question-answering prototypes so far. However, scaling such prototypes to robust products with minimized hallucinations or fake responses still remains an open challenge, especially in niche data-table heavy domains such as financial decision making. In this work, we present a novel Langchain-based framework that transforms data tables into hierarchical textual data chunks to enable a wide variety of actionable question answering. First, the user-queries are classified by intention followed by automated retrieval of the most relevant data chunks to generate customized LLM prompts per query. Next, the custom prompts and their responses undergo multi-metric scoring to assess for hallucinations and response confidence. The proposed system is optimized with user-query intention classification, advanced prompting, data scaling capabilities and it achieves over 90% confidence scores for a variety of user-queries responses ranging from {What, Where, Why, How, predict, trend, anomalies, exceptions} that are crucial for financial decision making applications. The proposed data to answers framework can be extended to other analytical domains such as sales and payroll to ensure optimal hallucination control guardrails.

Submitted 9 November, 2023; originally announced November 2023.

Comments: 11 pages, 5 figures, 4 tables

arXiv:2311.05870 [pdf]

Automated Heterogeneous Low-Bit Quantization of Multi-Model Deep Learning Inference Pipeline

Authors: Jayeeta Mondal, Swarnava Dey, Arijit Mukherjee

Abstract: Multiple Deep Neural Networks (DNNs) integrated into single Deep Learning (DL) inference pipelines e.g. Multi-Task Learning (MTL) or Ensemble Learning (EL), etc., albeit very accurate, pose challenges for edge deployment. In these systems, models vary in their quantization tolerance and resource demands, requiring meticulous tuning for accuracy-latency balance. This paper introduces an automated heterogeneous quantization approach for DL inference pipelines with multiple DNNs.

Submitted 10 November, 2023; originally announced November 2023.

Journal ref: LBQNN@ICCV2023

arXiv:2310.19794 [pdf, other]

Robust Causal Bandits for Linear Models

Authors: Zirui Yan, Arpan Mukherjee, Burak Varıcı, Ali Tajer

Abstract: Sequential design of experiments for optimizing a reward function in causal systems can be effectively modeled by the sequential design of interventions in causal bandits (CBs). In the existing literature on CBs, a critical assumption is that the causal models remain constant over time. However, this assumption does not necessarily hold in complex systems, which constantly undergo temporal model fluctuations. This paper addresses the robustness of CBs to such model fluctuations. The focus is on causal systems with linear structural equation models (SEMs). The SEMs and the time-varying pre- and post-interventional statistical models are all unknown. Cumulative regret is adopted as the design criterion, based on which the objective is to design a sequence of interventions that incur the smallest cumulative regret with respect to an oracle aware of the entire causal model and its fluctuations. First, it is established that the existing approaches fail to maintain regret sub-linearity with even a few instances of model deviation. Specifically, when the number of instances with model deviation is as few as $T^{\frac{1}{2L}}$, where $T$ is the time horizon and $L$ is the longest causal path in the graph, the existing algorithms will have linear regret in $T$. Next, a robust CB algorithm is designed, and its regret is analyzed, where upper and information-theoretic lower bounds on the regret are established. Specifically, in a graph with $N$ nodes and maximum degree $d$, under a general measure of model deviation $C$, the cumulative regret is upper bounded by $\tilde{\mathcal{O}}(d^{L-\frac{1}{2}}(\sqrt{NT} + NC))$ and lower bounded by $Ω(d^{\frac{L}{2}-2}\max\{\sqrt{T},d^2C\})$. Comparing these bounds establishes that the proposed algorithm achieves nearly optimal $\tilde{\mathcal{O}}(\sqrt{T})$ regret when $C$ is $o(\sqrt{T})$ and maintains sub-linear regret for a broader range of $C$.

Submitted 4 March, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

Showing 1–50 of 293 results for author: Mukherjee, A