Search | arXiv e-print repository

V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM

Authors: Abdur Rahman, Rajat Chawla, Muskaan Kumar, Arkajit Datta, Adarsh Jha, Mukunda NS, Ishaan Bhola

Abstract: In the rapidly evolving landscape of AI research and application, Multimodal Large Language Models (MLLMs) have emerged as a transformative force, adept at interpreting and integrating information from diverse modalities such as text, images, and Graphical User Interfaces (GUIs). Despite these advancements, the nuanced interaction and understanding of GUIs pose a significant challenge, limiting th… ▽ More In the rapidly evolving landscape of AI research and application, Multimodal Large Language Models (MLLMs) have emerged as a transformative force, adept at interpreting and integrating information from diverse modalities such as text, images, and Graphical User Interfaces (GUIs). Despite these advancements, the nuanced interaction and understanding of GUIs pose a significant challenge, limiting the potential of existing models to enhance automation levels. To bridge this gap, this paper presents V-Zen, an innovative Multimodal Large Language Model (MLLM) meticulously crafted to revolutionise the domain of GUI understanding and grounding. Equipped with dual-resolution image encoders, V-Zen establishes new benchmarks in efficient grounding and next-action prediction, thereby laying the groundwork for self-operating computer systems. Complementing V-Zen is the GUIDE dataset, an extensive collection of real-world GUI elements and task-based sequences, serving as a catalyst for specialised fine-tuning. The successful integration of V-Zen and GUIDE marks the dawn of a new era in multimodal AI research, opening the door to intelligent, autonomous computing experiences. This paper extends an invitation to the research community to join this exciting journey, shaping the future of GUI automation. In the spirit of open science, our code, data, and model will be made publicly available, paving the way for multimodal dialogue scenarios with intricate and precise interactions. △ Less

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.08305 [pdf, other]

Collateral Portfolio Optimization in Crypto-Backed Stablecoins

Authors: Bretislav Hajek, Daniel Reijsbergen, Anwitaman Datta, Jussi Keppo

Abstract: Stablecoins - crypto tokens whose value is pegged to a real-world asset such as the US Dollar - are an important component of the DeFi ecosystem as they mitigate the impact of token price volatility. In crypto-backed stablecoins, the peg is founded on the guarantee that in case of system shutdown, each stablecoin can be exchanged for a basket of other crypto tokens worth approximately its nominal… ▽ More Stablecoins - crypto tokens whose value is pegged to a real-world asset such as the US Dollar - are an important component of the DeFi ecosystem as they mitigate the impact of token price volatility. In crypto-backed stablecoins, the peg is founded on the guarantee that in case of system shutdown, each stablecoin can be exchanged for a basket of other crypto tokens worth approximately its nominal value. However, price fluctuations that affect the collateral tokens may cause this guarantee to be invalidated. In this work, we investigate the impact of the collateral portfolio's composition on the resilience to this type of catastrophic event. For stablecoins whose developers maintain a significant portion of the collateral (e.g., MakerDAO's Dai), we propose two portfolio optimization methods, based on convex optimization and (semi)variance minimization, that account for the correlation between the various token prices. We compare the optimal portfolios to the historical evolution of Dai's collateral portfolio, and to aid reproducibility, we have made our data and code publicly available. △ Less

Submitted 14 May, 2024; originally announced May 2024.

Comments: Accepted for presentation at MARBLE 2024

arXiv:2404.03587 [pdf, other]

Anticipate & Collab: Data-driven Task Anticipation and Knowledge-driven Planning for Human-robot Collaboration

Authors: Shivam Singh, Karthik Swaminathan, Raghav Arora, Ramandeep Singh, Ahana Datta, Dipanjan Das, Snehasis Banerjee, Mohan Sridharan, Madhava Krishna

Abstract: An agent assisting humans in daily living activities can collaborate more effectively by anticipating upcoming tasks. Data-driven methods represent the state of the art in task anticipation, planning, and related problems, but these methods are resource-hungry and opaque. Our prior work introduced a proof of concept framework that used an LLM to anticipate 3 high-level tasks that served as goals f… ▽ More An agent assisting humans in daily living activities can collaborate more effectively by anticipating upcoming tasks. Data-driven methods represent the state of the art in task anticipation, planning, and related problems, but these methods are resource-hungry and opaque. Our prior work introduced a proof of concept framework that used an LLM to anticipate 3 high-level tasks that served as goals for a classical planning system that computed a sequence of low-level actions for the agent to achieve these goals. This paper describes DaTAPlan, our framework that significantly extends our prior work toward human-robot collaboration. Specifically, DaTAPlan planner computes actions for an agent and a human to collaboratively and jointly achieve the tasks anticipated by the LLM, and the agent automatically adapts to unexpected changes in human action outcomes and preferences. We evaluate DaTAPlan capabilities in a realistic simulation environment, demonstrating accurate task anticipation, effective human-robot collaboration, and the ability to adapt to unexpected changes. Project website: https://dataplan-hrc.github.io △ Less

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2404.01329 [pdf, other]

Unraveling the Dynamics of Television Debates and Social Media Engagement: Insights from an Indian News Show

Authors: Kiran Garimella, Abhilash Datta

Abstract: The relationship between television shows and social media has become increasingly intertwined in recent years. Social media platforms, particularly Twitter, have emerged as significant sources of public opinion and discourse on topics discussed in television shows. In India, news debates leverage the popularity of social media to promote hashtags and engage users in discussions and debates on a d… ▽ More The relationship between television shows and social media has become increasingly intertwined in recent years. Social media platforms, particularly Twitter, have emerged as significant sources of public opinion and discourse on topics discussed in television shows. In India, news debates leverage the popularity of social media to promote hashtags and engage users in discussions and debates on a daily basis. This paper focuses on the analysis of one of India's most prominent and widely-watched TV news debate shows: "Arnab Goswami-The Debate". The study examines the content of the show by analyzing the hashtags used to promote it and the social media data corresponding to these hashtags. The findings reveal that the show exhibits a strong bias towards the ruling Bharatiya Janata Party (BJP), with over 60% of the debates featuring either pro-BJP or anti-opposition content. Social media support for the show primarily comes from BJP supporters. Notably, BJP leaders and influencers play a significant role in promoting the show on social media, leveraging their existing networks and resources to artificially trend specific hashtags. Furthermore, the study uncovers a reciprocal flow of information between the TV show and social media. We find evidence that the show's choice of topics is linked to social media posts made by party workers, suggesting a dynamic interplay between traditional media and online platforms. By exploring the complex interaction between television debates and social media support, this study contributes to a deeper understanding of the evolving relationship between these two domains in the digital age. The findings hold implications for media researchers and practitioners, offering insights into the ways in which social media can influence traditional media and vice versa. △ Less

Submitted 29 March, 2024; originally announced April 2024.

Comments: Accepted at ICWSM 2024. Please cite the ICWSM version

arXiv:2404.01226 [pdf, other]

Stable Code Technical Report

Authors: Nikhil Pinnaparaju, Reshinth Adithyan, Duy Phung, Jonathan Tow, James Baicoianu, Ashish Datta, Maksym Zhuravinskyi, Dakota Mahan, Marco Bellagente, Carlos Riquelme, Nathan Cooper

Abstract: We introduce Stable Code, the first in our new-generation of code language models series, which serves as a general-purpose base code language model targeting code completion, reasoning, math, and other software engineering-based tasks. Additionally, we introduce an instruction variant named Stable Code Instruct that allows conversing with the model in a natural chat interface for performing quest… ▽ More We introduce Stable Code, the first in our new-generation of code language models series, which serves as a general-purpose base code language model targeting code completion, reasoning, math, and other software engineering-based tasks. Additionally, we introduce an instruction variant named Stable Code Instruct that allows conversing with the model in a natural chat interface for performing question-answering and instruction-based tasks. In this technical report, we detail the data and training procedure leading to both models. Their weights are available via Hugging Face for anyone to download and use at https://huggingface.co/stabilityai/stable-code-3b and https://huggingface.co/stabilityai/stable-code-instruct-3b. This report contains thorough evaluations of the models, including multilingual programming benchmarks, and the MT benchmark focusing on multi-turn dialogues. At the time of its release, Stable Code is the state-of-the-art open model under 3B parameters and even performs comparably to larger models of sizes 7 billion and 15 billion parameters on the popular Multi-PL benchmark. Stable Code Instruct also exhibits state-of-the-art performance on the MT-Bench coding tasks and on Multi-PL completion compared to other instruction tuned models. Given its appealing small size, we also provide throughput measurements on a number of edge devices. In addition, we open source several quantized checkpoints and provide their performance metrics compared to the original model. △ Less

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2403.10171 [pdf]

AUTONODE: A Neuro-Graphic Self-Learnable Engine for Cognitive GUI Automation

Authors: Arkajit Datta, Tushar Verma, Rajat Chawla, Mukunda N. S, Ishaan Bhola

Abstract: In recent advancements within the domain of Large Language Models (LLMs), there has been a notable emergence of agents capable of addressing Robotic Process Automation (RPA) challenges through enhanced cognitive capabilities and sophisticated reasoning. This development heralds a new era of scalability and human-like adaptability in goal attainment. In this context, we introduce AUTONODE (Autonomo… ▽ More In recent advancements within the domain of Large Language Models (LLMs), there has been a notable emergence of agents capable of addressing Robotic Process Automation (RPA) challenges through enhanced cognitive capabilities and sophisticated reasoning. This development heralds a new era of scalability and human-like adaptability in goal attainment. In this context, we introduce AUTONODE (Autonomous User-interface Transformation through Online Neuro-graphic Operations and Deep Exploration). AUTONODE employs advanced neuro-graphical techniques to facilitate autonomous navigation and task execution on web interfaces, thereby obviating the necessity for predefined scripts or manual intervention. Our engine empowers agents to comprehend and implement complex workflows, adapting to dynamic web environments with unparalleled efficiency. Our methodology synergizes cognitive functionalities with robotic automation, endowing AUTONODE with the ability to learn from experience. We have integrated an exploratory module, DoRA (Discovery and mapping Operation for graph Retrieval Agent), which is instrumental in constructing a knowledge graph that the engine utilizes to optimize its actions and achieve objectives with minimal supervision. The versatility and efficacy of AUTONODE are demonstrated through a series of experiments, highlighting its proficiency in managing a diverse array of web-based tasks, ranging from data extraction to transaction processing. △ Less

Submitted 27 May, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

Comments: Accepted in MIPR-2024

arXiv:2403.08773 [pdf, other]

Veagle: Advancements in Multimodal Representation Learning

Authors: Rajat Chawla, Arkajit Datta, Tushar Verma, Adarsh Jha, Anmol Gautam, Ayush Vatsal, Sukrit Chaterjee, Mukunda NS, Ishaan Bhola

Abstract: Lately, researchers in artificial intelligence have been really interested in how language and vision come together, giving rise to the development of multimodal models that aim to seamlessly integrate textual and visual information. Multimodal models, an extension of Large Language Models (LLMs), have exhibited remarkable capabilities in addressing a diverse array of tasks, ranging from image cap… ▽ More Lately, researchers in artificial intelligence have been really interested in how language and vision come together, giving rise to the development of multimodal models that aim to seamlessly integrate textual and visual information. Multimodal models, an extension of Large Language Models (LLMs), have exhibited remarkable capabilities in addressing a diverse array of tasks, ranging from image captioning and visual question answering (VQA) to visual grounding. While these models have showcased significant advancements, challenges persist in accurately interpreting images and answering the question, a common occurrence in real-world scenarios. This paper introduces a novel approach to enhance the multimodal capabilities of existing models. In response to the limitations observed in current Vision Language Models (VLMs) and Multimodal Large Language Models (MLLMs), our proposed model Veagle, incorporates a unique mechanism inspired by the successes and insights of previous works. Veagle leverages a dynamic mechanism to project encoded visual information directly into the language model. This dynamic approach allows for a more nuanced understanding of intricate details present in visual contexts. To validate the effectiveness of Veagle, we conduct comprehensive experiments on benchmark datasets, emphasizing tasks such as visual question answering and image understanding. Our results indicate a improvement of 5-6 \% in performance, with Veagle outperforming existing models by a notable margin. The outcomes underscore the model's versatility and applicability beyond traditional benchmarks. △ Less

Submitted 18 January, 2024; originally announced March 2024.

arXiv:2403.04026 [pdf, other]

Spanning Tree-based Query Plan Enumeration

Authors: Yesdaulet Izenov, Asoke Datta, Brian Tsan, Abylay Amanbayev, Florin Rusu

Abstract: In this work, we define the problem of finding an optimal query plan as finding spanning trees with low costs. This approach empowers the utilization of a series of spanning tree algorithms, thereby enabling systematic exploration of the plan search space over a join graph. Capitalizing on the polynomial time complexity of spanning tree algorithms, we present the Ensemble Spanning Tree Enumeration… ▽ More In this work, we define the problem of finding an optimal query plan as finding spanning trees with low costs. This approach empowers the utilization of a series of spanning tree algorithms, thereby enabling systematic exploration of the plan search space over a join graph. Capitalizing on the polynomial time complexity of spanning tree algorithms, we present the Ensemble Spanning Tree Enumeration (ESTE) strategy. ESTE employs two conventional spanning tree algorithms, Prim's and Kruskal's, together to enhance the robustness of the query optimizer. In ESTE, multiple query plans are enumerated exploring different areas of the search space. This positions ESTE as an intermediate strategy between exhaustive and heuristic enumeration strategies. We show that ESTE is more robust in identifying efficient query plans for large queries. In the case of data modifications and workload demand increase, we believe that our approach can be a cheaper alternative to maintain optimizer robustness by integrating additional spanning tree algorithms rather than completely changing the optimizer to another plan enumeration algorithm. The experimental evaluation shows that ESTE achieves better consistency in plan quality and optimization time than existing solutions while identifying similarly optimal plans. △ Less

Submitted 6 March, 2024; originally announced March 2024.

arXiv:2402.17834 [pdf, other]

Stable LM 2 1.6B Technical Report

Authors: Marco Bellagente, Jonathan Tow, Dakota Mahan, Duy Phung, Maksym Zhuravinskyi, Reshinth Adithyan, James Baicoianu, Ben Brooks, Nathan Cooper, Ashish Datta, Meng Lee, Emad Mostaque, Michael Pieler, Nikhil Pinnaparju, Paulo Rocha, Harry Saini, Hannah Teufel, Niccolo Zanichelli, Carlos Riquelme

Abstract: We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruction-tuned versions of StableLM 2 1.6B. The weights for both models are available via Hugging Face for anyone to download and use. The report contains thorough evaluations of these models, including z… ▽ More We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruction-tuned versions of StableLM 2 1.6B. The weights for both models are available via Hugging Face for anyone to download and use. The report contains thorough evaluations of these models, including zero- and few-shot benchmarks, multilingual benchmarks, and the MT benchmark focusing on multi-turn dialogues. At the time of publishing this report, StableLM 2 1.6B was the state-of-the-art open model under 2B parameters by a significant margin. Given its appealing small size, we also provide throughput measurements on a number of edge devices. In addition, we open source several quantized checkpoints and provide their performance metrics compared to the original model. △ Less

Submitted 27 February, 2024; originally announced February 2024.

Comments: 23 pages, 6 figures

arXiv:2402.15873 [pdf, ps, other]

SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection

Authors: Ayan Datta, Aryan Chandramania, Radhika Mamidi

Abstract: This document contains the details of the authors' submission to the proceedings of SemEval 2024's Task 8: Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection Subtask A (monolingual) and B. Detection of machine-generated text is becoming an increasingly important task, with the advent of large language models (LLMs). In this paper, we lay out how using weighted… ▽ More This document contains the details of the authors' submission to the proceedings of SemEval 2024's Task 8: Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection Subtask A (monolingual) and B. Detection of machine-generated text is becoming an increasingly important task, with the advent of large language models (LLMs). In this paper, we lay out how using weighted averages of RoBERTa layers lets us capture information about text that is relevant to machine-generated text detection. △ Less

Submitted 9 April, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

arXiv:2402.15037 [pdf, other]

Analyzing Games in Maker Protocol Part One: A Multi-Agent Influence Diagram Approach Towards Coordination

Authors: Abhimanyu Nag, Samrat Gupta, Sudipan Sinha, Arka Datta

Abstract: Decentralized Finance (DeFi) ecosystems, exemplified by the Maker Protocol, rely on intricate games to maintain stability and security. Understanding the dynamics of these games is crucial for ensuring the robustness of the system. This motivating research proposes a novel methodology leveraging Multi-Agent Influence Diagrams (MAID), originally proposed by Koller and Milch, to dissect and analyze… ▽ More Decentralized Finance (DeFi) ecosystems, exemplified by the Maker Protocol, rely on intricate games to maintain stability and security. Understanding the dynamics of these games is crucial for ensuring the robustness of the system. This motivating research proposes a novel methodology leveraging Multi-Agent Influence Diagrams (MAID), originally proposed by Koller and Milch, to dissect and analyze the games within the Maker stablecoin protocol. By representing users and governance of the Maker protocol as agents and their interactions as edges in a graph, we capture the complex network of influences governing agent behaviors. Furthermore in the upcoming papers, we will show a Nash Equilibrium model to elucidate strategies that promote coordination and enhance economic security within the ecosystem. Through this approach, we aim to motivate the use of this method to introduce a new method of formal verification of game theoretic security in DeFi platforms. △ Less

Submitted 22 February, 2024; originally announced February 2024.

arXiv:2402.05620 [pdf, ps, other]

Optimized Denial-of-Service Threats on the Scalability of LT Coded Blockchains

Authors: Harikrishnan K., J. Harshan, Anwitaman Datta

Abstract: Coded blockchains have acquired prominence in the recent past as a promising approach to slash the storage costs as well as to facilitate scalability. Within this class, Luby Transform (LT) coded blockchains are an appealing choice for scalability in heterogeneous networks owing to the availability of a wide range of low-complexity LT decoders. While these architectures have been studied from the… ▽ More Coded blockchains have acquired prominence in the recent past as a promising approach to slash the storage costs as well as to facilitate scalability. Within this class, Luby Transform (LT) coded blockchains are an appealing choice for scalability in heterogeneous networks owing to the availability of a wide range of low-complexity LT decoders. While these architectures have been studied from the aspects of storage savings and scalability, not much is known in terms of their security vulnerabilities. Pointing at this research gap, in this work, we present novel denial-of-service (DoS) threats on LT coded blockchains that target nodes with specific decoding capabilities, thereby preventing them from joining the network. Our proposed threats are non-oblivious in nature, wherein adversaries gain access to the archived blocks, and choose to execute their threat on a subset of them based on underlying coding scheme. We show that our optimized threats can achieve the same level of damage as that of blind attacks, however, with limited amount of resources. This is the first work of its kind that opens up new questions on designing coded blockchains to jointly provide storage savings, scalability and resilience to optimized threats. △ Less

Submitted 8 February, 2024; originally announced February 2024.

Comments: To be presented in IEEE International Conference on Communications 2024

arXiv:2402.04489 [pdf, other]

De-amplifying Bias from Differential Privacy in Language Model Fine-tuning

Authors: Sanjari Srivastava, Piotr Mardziel, Zhikhun Zhang, Archana Ahlawat, Anupam Datta, John C Mitchell

Abstract: Fairness and privacy are two important values machine learning (ML) practitioners often seek to operationalize in models. Fairness aims to reduce model bias for social/demographic sub-groups. Privacy via differential privacy (DP) mechanisms, on the other hand, limits the impact of any individual's training data on the resulting model. The trade-offs between privacy and fairness goals of trustworth… ▽ More Fairness and privacy are two important values machine learning (ML) practitioners often seek to operationalize in models. Fairness aims to reduce model bias for social/demographic sub-groups. Privacy via differential privacy (DP) mechanisms, on the other hand, limits the impact of any individual's training data on the resulting model. The trade-offs between privacy and fairness goals of trustworthy ML pose a challenge to those wishing to address both. We show that DP amplifies gender, racial, and religious bias when fine-tuning large language models (LLMs), producing models more biased than ones fine-tuned without DP. We find the cause of the amplification to be a disparity in convergence of gradients across sub-groups. Through the case of binary gender bias, we demonstrate that Counterfactual Data Augmentation (CDA), a known method for addressing bias, also mitigates bias amplification by DP. As a consequence, DP and CDA together can be used to fine-tune models while maintaining both fairness and privacy. △ Less

Submitted 6 February, 2024; originally announced February 2024.

arXiv:2401.10419 [pdf]

M3BUNet: Mobile Mean Max UNet for Pancreas Segmentation on CT-Scans

Authors: Juwita juwita, Ghulam Mubashar Hassan, Naveed Akhtar, Amitava Datta

Abstract: Segmenting organs in CT scan images is a necessary process for multiple downstream medical image analysis tasks. Currently, manual CT scan segmentation by radiologists is prevalent, especially for organs like the pancreas, which requires a high level of domain expertise for reliable segmentation due to factors like small organ size, occlusion, and varying shapes. When resorting to automated pancre… ▽ More Segmenting organs in CT scan images is a necessary process for multiple downstream medical image analysis tasks. Currently, manual CT scan segmentation by radiologists is prevalent, especially for organs like the pancreas, which requires a high level of domain expertise for reliable segmentation due to factors like small organ size, occlusion, and varying shapes. When resorting to automated pancreas segmentation, these factors translate to limited reliable labeled data to train effective segmentation models. Consequently, the performance of contemporary pancreas segmentation models is still not within acceptable ranges. To improve that, we propose M3BUNet, a fusion of MobileNet and U-Net neural networks, equipped with a novel Mean-Max (MM) attention that operates in two stages to gradually segment pancreas CT images from coarse to fine with mask guidance for object detection. This approach empowers the network to surpass segmentation performance achieved by similar network architectures and achieve results that are on par with complex state-of-the-art methods, all while maintaining a low parameter count. Additionally, we introduce external contour segmentation as a preprocessing step for the coarse stage to assist in the segmentation process through image standardization. For the fine segmentation stage, we found that applying a wavelet decomposition filter to create multi-input images enhances pancreas segmentation performance. We extensively evaluate our approach on the widely known NIH pancreas dataset and MSD pancreas dataset. Our approach demonstrates a considerable performance improvement, achieving an average Dice Similarity Coefficient (DSC) value of up to 89.53% and an Intersection Over Union (IOU) score of up to 81.16 for the NIH pancreas dataset, and 88.60% DSC and 79.90% IOU for the MSD Pancreas dataset. △ Less

Submitted 18 January, 2024; originally announced January 2024.

arXiv:2401.04385 [pdf, other]

Machine unlearning through fine-grained model parameters perturbation

Authors: Zhiwei Zuo, Zhuo Tang, Kenli Li, Anwitaman Datta

Abstract: Machine unlearning techniques, which involve retracting data records and reducing influence of said data on trained models, help with the user privacy protection objective but incur significant computational costs. Weight perturbation-based unlearning is a general approach, but it typically involves globally modifying the parameters. We propose fine-grained Top-K and Random-k parameters perturbed… ▽ More Machine unlearning techniques, which involve retracting data records and reducing influence of said data on trained models, help with the user privacy protection objective but incur significant computational costs. Weight perturbation-based unlearning is a general approach, but it typically involves globally modifying the parameters. We propose fine-grained Top-K and Random-k parameters perturbed inexact machine unlearning strategies that address the privacy needs while keeping the computational costs tractable. In order to demonstrate the efficacy of our strategies we also tackle the challenge of evaluating the effectiveness of machine unlearning by considering the model's generalization performance across both unlearning and remaining data. To better assess the unlearning effect and model generalization, we propose novel metrics, namely, the forgetting rate and memory retention rate. However, for inexact machine unlearning, current metrics are inadequate in quantifying the degree of forgetting that occurs after unlearning strategies are applied. To address this, we introduce SPD-GAN, which subtly perturbs the distribution of data targeted for unlearning. Then, we evaluate the degree of unlearning by measuring the performance difference of the models on the perturbed unlearning data before and after the unlearning process. By implementing these innovative techniques and metrics, we achieve computationally efficacious privacy protection in machine learning applications without significant sacrifice of model performance. Furthermore, this approach provides a novel method for evaluating the degree of unlearning. △ Less

Submitted 8 July, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

arXiv:2401.02457 [pdf, other]

eCIL-MU: Embedding based Class Incremental Learning and Machine Unlearning

Authors: Zhiwei Zuo, Zhuo Tang, Bin Wang, Kenli Li, Anwitaman Datta

Abstract: New categories may be introduced over time, or existing categories may need to be reclassified. Class incremental learning (CIL) is employed for the gradual acquisition of knowledge about new categories while preserving information about previously learned ones in such dynamic environments. It might also be necessary to also eliminate the influence of related categories on the model to adapt to re… ▽ More New categories may be introduced over time, or existing categories may need to be reclassified. Class incremental learning (CIL) is employed for the gradual acquisition of knowledge about new categories while preserving information about previously learned ones in such dynamic environments. It might also be necessary to also eliminate the influence of related categories on the model to adapt to reclassification. We thus introduce class-level machine unlearning (MU) within CIL. Typically, MU methods tend to be time-consuming and can potentially harm the model's performance. A continuous stream of unlearning requests could lead to catastrophic forgetting. To address these issues, we propose a non-destructive eCIL-MU framework based on embedding techniques to map data into vectors and then be stored in vector databases. Our approach exploits the overlap between CIL and MU tasks for acceleration. Experiments demonstrate the capability of achieving unlearning effectiveness and orders of magnitude (upto $\sim 278\times$) of acceleration. △ Less

Submitted 4 January, 2024; originally announced January 2024.

arXiv:2312.17013 [pdf, other]

Perspectives of Global and Hong Kong's Media on China's Belt and Road Initiative

Authors: Le Cong Khoo, Anwitaman Datta

Abstract: This study delves into the media analysis of China's ambitious Belt and Road Initiative (BRI), which, in a polarized world, and furthermore, owing to the very polarizing nature of the initiative itself, has received both strong criticisms and conversely positive coverage in media from across the world. In that context, Hong Kong's dynamic media environment, with a particular focus on its drastical… ▽ More This study delves into the media analysis of China's ambitious Belt and Road Initiative (BRI), which, in a polarized world, and furthermore, owing to the very polarizing nature of the initiative itself, has received both strong criticisms and conversely positive coverage in media from across the world. In that context, Hong Kong's dynamic media environment, with a particular focus on its drastically changing press freedom before and after the implementation of the National Security Law is of further interest. Leveraging data science techniques, this study employs Global Database of Events, Language, and Tone (GDELT) to comprehensively collect and analyse (English) news articles on the BRI. Through sentiment analysis, we uncover patterns in media coverage over different periods from several countries across the globe, and delve further to investigate the the media situation in the Hong Kong region. This work thus provides valuable insights into how the Belt and Road Initiative has been portrayed in the media and its evolving reception on the global stage, with a specific emphasis on the unique media landscape of Hong Kong. In an era characterised by increasing globalisation and inter-connectivity, but also competition for influence, animosity and trade-wars, understanding the perceptions and coverage of such significant international projects is crucial. This work stands as an interdisciplinary endeavour merging geopolitical science and data science to uncover the intricate dynamics of media coverage in general, and with an added emphasis on Hong Kong. △ Less

Submitted 28 December, 2023; originally announced December 2023.

Comments: 23 pages, 16 figures, 7 tables, 1 appendix

arXiv:2311.17293 [pdf, other]

Analyzing Query Optimizer Performance in the Presence and Absence of Cardinality Estimates

Authors: Asoke Datta, Brian Tsan, Yesdaulet Izenov, Florin Rusu

Abstract: Most query optimizers rely on cardinality estimates to determine optimal execution plans. While traditional databases such as PostgreSQL, Oracle, and Db2 utilize many types of synopses -- including histograms, samples, and sketches -- recent main-memory databases like DuckDB and Heavy.AI often operate with minimal or no estimates, yet their performance does not necessarily suffer. To the best of o… ▽ More Most query optimizers rely on cardinality estimates to determine optimal execution plans. While traditional databases such as PostgreSQL, Oracle, and Db2 utilize many types of synopses -- including histograms, samples, and sketches -- recent main-memory databases like DuckDB and Heavy.AI often operate with minimal or no estimates, yet their performance does not necessarily suffer. To the best of our knowledge, no analytical comparison has been conducted between optimizers with and without cardinality estimates to understand their performance characteristics in different settings, such as indexed, non-indexed, and multi-threaded. In this paper, we present a comparative analysis between optimizers that use cardinality estimates and those that do not. We use the Join Order Benchmark (JOB) for our evaluation and true cardinalities as the baseline. Our investigation reveals that cardinality estimates have marginal impact in non-indexed settings. Meanwhile, when indexes are available, inaccurate estimates may lead to sub-optimal physical operators -- even with an optimal join order. Furthermore, the impact of cardinality estimates is less significant in highly-parallel main-memory databases. △ Less

Submitted 28 November, 2023; originally announced November 2023.

arXiv:2311.05046 [pdf, other]

On the Consistency of Maximum Likelihood Estimation of Probabilistic Principal Component Analysis

Authors: Arghya Datta, Sayak Chakrabarty

Abstract: Probabilistic principal component analysis (PPCA) is currently one of the most used statistical tools to reduce the ambient dimension of the data. From multidimensional scaling to the imputation of missing data, PPCA has a broad spectrum of applications ranging from science and engineering to quantitative finance. Despite this wide applicability in various fields, hardly any theoretical guarante… ▽ More Probabilistic principal component analysis (PPCA) is currently one of the most used statistical tools to reduce the ambient dimension of the data. From multidimensional scaling to the imputation of missing data, PPCA has a broad spectrum of applications ranging from science and engineering to quantitative finance. Despite this wide applicability in various fields, hardly any theoretical guarantees exist to justify the soundness of the maximal likelihood (ML) solution for this model. In fact, it is well known that the maximum likelihood estimation (MLE) can only recover the true model parameters up to a rotation. The main obstruction is posed by the inherent identifiability nature of the PPCA model resulting from the rotational symmetry of the parameterization. To resolve this ambiguity, we propose a novel approach using quotient topological spaces and in particular, we show that the maximum likelihood solution is consistent in an appropriate quotient Euclidean space. Furthermore, our consistency results encompass a more general class of estimators beyond the MLE. Strong consistency of the ML estimate and consequently strong covariance estimation of the PPCA model have also been established under a compactness assumption. △ Less

Submitted 13 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

Comments: 15 pages, 1 figure, to appear in NeurIPS 2023. Update: included minor typographical corrections

arXiv:2310.19834 [pdf, other]

AMIR: Automated MisInformation Rebuttal -- A COVID-19 Vaccination Datasets based Recommendation System

Authors: Shakshi Sharma, Anwitaman Datta, Rajesh Sharma

Abstract: Misinformation has emerged as a major societal threat in recent years in general; specifically in the context of the COVID-19 pandemic, it has wrecked havoc, for instance, by fuelling vaccine hesitancy. Cost-effective, scalable solutions for combating misinformation are the need of the hour. This work explored how existing information obtained from social media and augmented with more curated fact… ▽ More Misinformation has emerged as a major societal threat in recent years in general; specifically in the context of the COVID-19 pandemic, it has wrecked havoc, for instance, by fuelling vaccine hesitancy. Cost-effective, scalable solutions for combating misinformation are the need of the hour. This work explored how existing information obtained from social media and augmented with more curated fact checked data repositories can be harnessed to facilitate automated rebuttal of misinformation at scale. While the ideas herein can be generalized and reapplied in the broader context of misinformation mitigation using a multitude of information sources and catering to the spectrum of social media platforms, this work serves as a proof of concept, and as such, it is confined in its scope to only rebuttal of tweets, and in the specific context of misinformation regarding COVID-19. It leverages two publicly available datasets, viz. FaCov (fact-checked articles) and misleading (social media Twitter) data on COVID-19 Vaccination. △ Less

Submitted 29 October, 2023; originally announced October 2023.

arXiv:2310.09361 [pdf, other]

Is Certifying $\ell_p$ Robustness Still Worthwhile?

Authors: Ravi Mangal, Klas Leino, Zifan Wang, Kai Hu, Weicheng Yu, Corina Pasareanu, Anupam Datta, Matt Fredrikson

Abstract: Over the years, researchers have developed myriad attacks that exploit the ubiquity of adversarial examples, as well as defenses that aim to guard against the security vulnerabilities posed by such attacks. Of particular interest to this paper are defenses that provide provable guarantees against the class of $\ell_p$-bounded attacks. Certified defenses have made significant progress, taking robus… ▽ More Over the years, researchers have developed myriad attacks that exploit the ubiquity of adversarial examples, as well as defenses that aim to guard against the security vulnerabilities posed by such attacks. Of particular interest to this paper are defenses that provide provable guarantees against the class of $\ell_p$-bounded attacks. Certified defenses have made significant progress, taking robustness certification from toy models and datasets to large-scale problems like ImageNet classification. While this is undoubtedly an interesting academic problem, as the field has matured, its impact in practice remains unclear, thus we find it useful to revisit the motivation for continuing this line of research. There are three layers to this inquiry, which we address in this paper: (1) why do we care about robustness research? (2) why do we care about the $\ell_p$-bounded threat model? And (3) why do we care about certification as opposed to empirical defenses? In brief, we take the position that local robustness certification indeed confers practical value to the field of machine learning. We focus especially on the latter two questions from above. With respect to the first of the two, we argue that the $\ell_p$-bounded threat model acts as a minimal requirement for safe application of models in security-critical domains, while at the same time, evidence has mounted suggesting that local robustness may lead to downstream external benefits not immediately related to robustness. As for the second, we argue that (i) certification provides a resolution to the cat-and-mouse game of adversarial attacks; and furthermore, that (ii) perhaps contrary to popular belief, there may not exist a fundamental trade-off between accuracy, robustness, and certifiability, while moreover, certified training techniques constitute a particularly promising way for learning robust models. △ Less

Submitted 13 October, 2023; originally announced October 2023.

arXiv:2309.05497 [pdf, other]

Personality Detection and Analysis using Twitter Data

Authors: Abhilash Datta, Souvic Chakraborty, Animesh Mukherjee

Abstract: Personality types are important in various fields as they hold relevant information about the characteristics of a human being in an explainable format. They are often good predictors of a person's behaviors in a particular environment and have applications ranging from candidate selection to marketing and mental health. Recently automatic detection of personality traits from texts has gained sign… ▽ More Personality types are important in various fields as they hold relevant information about the characteristics of a human being in an explainable format. They are often good predictors of a person's behaviors in a particular environment and have applications ranging from candidate selection to marketing and mental health. Recently automatic detection of personality traits from texts has gained significant attention in computational linguistics. Most personality detection and analysis methods have focused on small datasets making their experimental observations often limited. To bridge this gap, we focus on collecting and releasing the largest automatically curated dataset for the research community which has 152 million tweets and 56 thousand data points for the Myers-Briggs personality type (MBTI) prediction task. We perform a series of extensive qualitative and quantitative studies on our dataset to analyze the data patterns in a better way and infer conclusions. We show how our intriguing analysis results often follow natural intuition. We also perform a series of ablation studies to show how the baselines perform for our dataset. △ Less

Submitted 11 September, 2023; originally announced September 2023.

Comments: Submitted to ASONAM 2023

arXiv:2309.04505 [pdf, other]

doi 10.1109/TrustCom60117.2023.00377

COVID-19 Detection System: A Comparative Analysis of System Performance Based on Acoustic Features of Cough Audio Signals

Authors: Asmaa Shati, Ghulam Mubashar Hassan, Amitava Datta

Abstract: A wide range of respiratory diseases, such as cold and flu, asthma, and COVID-19, affect people's daily lives worldwide. In medical practice, respiratory sounds are widely used in medical services to diagnose various respiratory illnesses and lung disorders. The traditional diagnosis of such sounds requires specialized knowledge, which can be costly and reliant on human expertise. Despite this, re… ▽ More A wide range of respiratory diseases, such as cold and flu, asthma, and COVID-19, affect people's daily lives worldwide. In medical practice, respiratory sounds are widely used in medical services to diagnose various respiratory illnesses and lung disorders. The traditional diagnosis of such sounds requires specialized knowledge, which can be costly and reliant on human expertise. Despite this, recent advancements, such as cough audio recordings, have emerged as a means to automate the detection of respiratory conditions. Therefore, this research aims to explore various acoustic features that enhance the performance of machine learning (ML) models in detecting COVID-19 from cough signals. It investigates the efficacy of three feature extraction techniques, including Mel Frequency Cepstral Coefficients (MFCC), Chroma, and Spectral Contrast features, when applied to two machine learning algorithms, Support Vector Machine (SVM) and Multilayer Perceptron (MLP), and therefore proposes an efficient CovCepNet detection system. The proposed system provides a practical solution and demonstrates state-of-the-art classification performance, with an AUC of 0.843 on the COUGHVID dataset and 0.953 on the Virufy dataset for COVID-19 detection from cough audio signals. △ Less

Submitted 18 June, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

Comments: 8 pages, 3 figures

Journal ref: 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Exeter, United Kingdom, 2023, pp. 2706-2713

arXiv:2309.04125 [pdf, other]

Blockchain-enabled Data Governance for Privacy-Preserved Sharing of Confidential Data

Authors: Jingchi Zhang, Anwitaman Datta

Abstract: In a traditional cloud storage system, users benefit from the convenience it provides but also take the risk of certain security and privacy issues. To ensure confidentiality while maintaining data sharing capabilities, the Ciphertext-Policy Attribute-based Encryption (CP-ABE) scheme can be used to achieve fine-grained access control in cloud services. However, existing approaches are impaired by… ▽ More In a traditional cloud storage system, users benefit from the convenience it provides but also take the risk of certain security and privacy issues. To ensure confidentiality while maintaining data sharing capabilities, the Ciphertext-Policy Attribute-based Encryption (CP-ABE) scheme can be used to achieve fine-grained access control in cloud services. However, existing approaches are impaired by three critical concerns: illegal authorization, key disclosure, and privacy leakage. To address these, we propose a blockchain-based data governance system that employs blockchain technology and attribute-based encryption to prevent privacy leakage and credential misuse. First, our ABE encryption system can handle multi-authority use cases while protecting identity privacy and hiding access policy, which also protects data sharing against corrupt authorities. Second, applying the Advanced Encryption Standard (AES) for data encryption makes the whole system efficient and responsive to real-world conditions. Furthermore, the encrypted data is stored in a decentralized storage system such as IPFS, which does not rely on any centralized service provider and is, therefore, resilient against single-point failures. Third, illegal authorization activity can be readily identified through the logged on-chain data. Besides the system design, we also provide security proofs to demonstrate the robustness of the proposed system. △ Less

Submitted 8 September, 2023; originally announced September 2023.

Comments: 23 pages, 19 algorithms, 1 figure

arXiv:2309.03204 [pdf, other]

A 9 Transistor SRAM Featuring Array-level XOR Parallelism with Secure Data Toggling Operation

Authors: Zihan Yin, Annewsha Datta, Shwetha Vijayakumar, Ajey Jacob, Akhilesh Jaiswal

Abstract: Security and energy-efficiency are critical for computing applications in general and for edge applications in particular. Digital in-Memory Computing (IMC) in SRAM cells have widely been studied to accelerate inference tasks to maximize both throughput and energy efficiency for intelligent computing at the edge. XOR operations have been of particular interest due to their wide applicability in nu… ▽ More Security and energy-efficiency are critical for computing applications in general and for edge applications in particular. Digital in-Memory Computing (IMC) in SRAM cells have widely been studied to accelerate inference tasks to maximize both throughput and energy efficiency for intelligent computing at the edge. XOR operations have been of particular interest due to their wide applicability in numerous applications that include binary neural networks and encryption. However, existing IMC circuits for XOR acceleration are limited to two rows in a memory array and extending the XOR parallelism to multiple rows in an SRAM array has remained elusive. Further, SRAM is prone to both data imprinting and data remanence security issues, which poses limitations on security . Based on commerical Globalfoundries 22nm mode, we are proposing a novel 9T SRAM cell such that multiple rows of data (entire array) can be XORed in a massively parallel single cycle fashion. The new cell also supports data-toggling within the SRAM cell efficiently to circumvent imprinting attacks and erase the SRAM value in case of remanence attack. △ Less

Submitted 11 August, 2023; originally announced September 2023.

arXiv:2309.00639 [pdf, other]

Misinformation Concierge: A Proof-of-Concept with Curated Twitter Dataset on COVID-19 Vaccination

Authors: Shakshi Sharma, Anwitaman Datta, Vigneshwaran Shankaran, Rajesh Sharma

Abstract: We demonstrate the Misinformation Concierge, a proof-of-concept that provides actionable intelligence on misinformation prevalent in social media. Specifically, it uses language processing and machine learning tools to identify subtopics of discourse and discern non/misleading posts; presents statistical reports for policy-makers to understand the big picture of prevalent misinformation in a timel… ▽ More We demonstrate the Misinformation Concierge, a proof-of-concept that provides actionable intelligence on misinformation prevalent in social media. Specifically, it uses language processing and machine learning tools to identify subtopics of discourse and discern non/misleading posts; presents statistical reports for policy-makers to understand the big picture of prevalent misinformation in a timely manner; and recommends rebuttal messages for specific pieces of misinformation, identified from within the corpus of data - providing means to intervene and counter misinformation promptly. The Misinformation Concierge proof-of-concept using a curated dataset is accessible at: https://demo-frontend-uy34.onrender.com/ △ Less

Submitted 25 August, 2023; originally announced September 2023.

Comments: This is a preprinted version of our CIKM paper. Please cite our CIKM paper

arXiv:2308.14840 [pdf, other]

doi 10.1561/3300000041

Identifying and Mitigating the Security Risks of Generative AI

Authors: Clark Barrett, Brad Boyd, Elie Burzstein, Nicholas Carlini, Brad Chen, Jihye Choi, Amrita Roy Chowdhury, Mihai Christodorescu, Anupam Datta, Soheil Feizi, Kathleen Fisher, Tatsunori Hashimoto, Dan Hendrycks, Somesh Jha, Daniel Kang, Florian Kerschbaum, Eric Mitchell, John Mitchell, Zulfikar Ramzan, Khawaja Shams, Dawn Song, Ankur Taly, Diyi Yang

Abstract: Every major technical invention resurfaces the dual-use dilemma -- the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well… ▽ More Every major technical invention resurfaces the dual-use dilemma -- the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks. This paper reports the findings of a workshop held at Google (co-organized by Stanford University and the University of Wisconsin-Madison) on the dual-use dilemma posed by GenAI. This paper is not meant to be comprehensive, but is rather an attempt to synthesize some of the interesting findings from the workshop. We discuss short-term and long-term goals for the community on this topic. We hope this paper provides both a launching point for a discussion on this important topic as well as interesting problems that the research community can work to address. △ Less

Submitted 28 December, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

Journal ref: Foundations and Trends in Privacy and Security 6 (2023) 1-52

arXiv:2308.02163 [pdf, other]

BlockChain I/O: Enabling Cross-Chain Commerce

Authors: Anwitaman Datta, Daniël Reijsbergen, Jingchi Zhang, Suman Majumder

Abstract: Blockchain technology enables secure tokens transfers in digital marketplaces, and recent advances in this field provide other desirable properties such as efficiency, privacy, and price stability. However, these properties do not always generalize to a setting across multiple independent blockchains. Despite the growing number of existing blockchain platforms, there is a lack of an overarching fr… ▽ More Blockchain technology enables secure tokens transfers in digital marketplaces, and recent advances in this field provide other desirable properties such as efficiency, privacy, and price stability. However, these properties do not always generalize to a setting across multiple independent blockchains. Despite the growing number of existing blockchain platforms, there is a lack of an overarching framework whose components provide all of the necessary properties for practical cross-chain commerce. We present BlockChain I/O to provide such a framework. BlockChain I/O introduces entities called cross-chain services to relay information between different blockchains. The proposed design ensures that cross-chain services cannot violate transaction safety, and they are furthermore disincentivized from other types of misbehavior through an audit system. BlockChain I/O uses native stablecoins to mitigate price fluctuations, and a decentralized ID system to allow users to prove aspects of their identity without violating privacy. After presenting the core architecture of BlockChain I/O, we demonstrate how to use it to implement a cross-chain marketplace and discuss how its desirable properties continue to hold in the end-to-end system. Finally, we use experimental evaluations to demonstrate BlockChain I/O's practical performance. △ Less

Submitted 28 June, 2024; v1 submitted 4 August, 2023; originally announced August 2023.

arXiv:2307.14041 [pdf, other]

GovernR: Provenance and Confidentiality Guarantees In Research Data Repositories

Authors: Anwitaman Datta, Chua Chiah Soon, Wangfan Gu

Abstract: We propose cryptographic protocols to incorporate time provenance guarantees while meeting confidentiality and controlled sharing needs for research data. We demonstrate the efficacy of these mechanisms by developing and benchmarking a practical tool, GovernR, which furthermore takes into usability issues and is compatible with a popular open-sourced research data storage platform, Dataverse. In d… ▽ More We propose cryptographic protocols to incorporate time provenance guarantees while meeting confidentiality and controlled sharing needs for research data. We demonstrate the efficacy of these mechanisms by developing and benchmarking a practical tool, GovernR, which furthermore takes into usability issues and is compatible with a popular open-sourced research data storage platform, Dataverse. In doing so, we identify and provide a solution addressing an important gap (though applicable to only niche use cases) in practical research data management. △ Less

Submitted 26 July, 2023; originally announced July 2023.

Comments: 2 Figures, 3 Tables

arXiv:2307.05373 [pdf, other]

Classification of sleep stages from EEG, EOG and EMG signals by SSNet

Authors: Haifa Almutairi, Ghulam Mubashar Hassan, Amitava Datta

Abstract: Classification of sleep stages plays an essential role in diagnosing sleep-related diseases including Sleep Disorder Breathing (SDB) disease. In this study, we propose an end-to-end deep learning architecture, named SSNet, which comprises of two deep learning networks based on Convolutional Neuron Networks (CNN) and Long Short Term Memory (LSTM). Both deep learning networks extract features from t… ▽ More Classification of sleep stages plays an essential role in diagnosing sleep-related diseases including Sleep Disorder Breathing (SDB) disease. In this study, we propose an end-to-end deep learning architecture, named SSNet, which comprises of two deep learning networks based on Convolutional Neuron Networks (CNN) and Long Short Term Memory (LSTM). Both deep learning networks extract features from the combination of Electrooculogram (EOG), Electroencephalogram (EEG), and Electromyogram (EMG) signals, as each signal has distinct features that help in the classification of sleep stages. The features produced by the two-deep learning networks are concatenated to pass to the fully connected layer for the classification. The performance of our proposed model is evaluated by using two public datasets Sleep-EDF Expanded dataset and ISRUC-Sleep dataset. The accuracy and Kappa coefficient are 96.36% and 93.40% respectively, for classifying three classes of sleep stages using Sleep-EDF Expanded dataset. Whereas, the accuracy and Kappa coefficient are 96.57% and 83.05% respectively for five classes of sleep stages using Sleep-EDF Expanded dataset. Our model achieves the best performance in classifying sleep stages when compared with the state-of-the-art techniques. △ Less

Submitted 2 July, 2023; originally announced July 2023.

arXiv:2306.09754 [pdf, other]

CroCoDai: A Stablecoin for Cross-Chain Commerce

Authors: Daniël Reijsbergen, Bretislav Hajek, Tien Tuan Anh Dinh, Jussi Keppo, Hank Korth, Anwitaman Datta

Abstract: Decentralized Finance (DeFi), in which digital assets are exchanged without trusted intermediaries, has grown rapidly in value in recent years. The global DeFi ecosystem is fragmented into multiple blockchains, fueling the demand for cross-chain commerce. Existing approaches for cross-chain transactions, e.g., bridges and cross-chain deals, achieve atomicity by locking assets in escrow. However, l… ▽ More Decentralized Finance (DeFi), in which digital assets are exchanged without trusted intermediaries, has grown rapidly in value in recent years. The global DeFi ecosystem is fragmented into multiple blockchains, fueling the demand for cross-chain commerce. Existing approaches for cross-chain transactions, e.g., bridges and cross-chain deals, achieve atomicity by locking assets in escrow. However, locking up assets increases the financial risks for the participants, especially due to price fluctuations and the long latency of cross-chain transactions. Stablecoins, which are pegged to a non-volatile asset such as the US dollar, help mitigate the risk associated with price fluctuations. However, existing stablecoin designs are tied to individual blockchain platforms, and trusted parties or complex protocols are needed to exchange stablecoin tokens between blockchains. Our goal is to design a practical stablecoin for cross-chain commerce. Realizing this goal requires addressing two challenges. The first challenge is to support a large and growing number of blockchains efficiently. The second challenge is to be resilient to price fluctuations and blockchain platform failures. We present CroCoDai to address these challenges. We also present three prototype implementations of our stablecoin system, and show that it incurs small execution overhead. △ Less

Submitted 20 June, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

arXiv:2306.09735 [pdf, other]

PIEChain -- A Practical Blockchain Interoperability Framework

Authors: Daniël Reijsbergen, Aung Maw, Jingchi Zhang, Tien Tuan Anh Dinh, Anwitaman Datta

Abstract: A plethora of different blockchain platforms have emerged in recent years, but many of them operate in silos. As such, there is a need for reliable cross-chain communication to enable blockchain interoperability. Blockchain interoperability is challenging because transactions can typically not be reverted - as such, if one transaction is committed then the protocol must ensure that all related tra… ▽ More A plethora of different blockchain platforms have emerged in recent years, but many of them operate in silos. As such, there is a need for reliable cross-chain communication to enable blockchain interoperability. Blockchain interoperability is challenging because transactions can typically not be reverted - as such, if one transaction is committed then the protocol must ensure that all related transactions are committed as well. Existing interoperability approaches, e.g., Cosmos and Polkadot, are limited in the sense that they only support interoperability between their own subchains, or require intrusive changes to existing blockchains. To overcome this limitation, we propose PIEChain, a general, Kafka-based cross-chain communication framework. We utilize PIEChain for a practical case study: a cross-chain auction in which users who hold tokens on multiple chains bid for a ticket sold on another chain. PIEChain is the first publicly available, practical implementation of a general framework for cross-chain communication. △ Less

Submitted 16 June, 2023; originally announced June 2023.

arXiv:2306.01750 [pdf, other]

A Survey of Explainable AI and Proposal for a Discipline of Explanation Engineering

Authors: Clive Gomes, Lalitha Natraj, Shijun Liu, Anushka Datta

Abstract: In this survey paper, we deep dive into the field of Explainable Artificial Intelligence (XAI). After introducing the scope of this paper, we start by discussing what an "explanation" really is. We then move on to discuss some of the existing approaches to XAI and build a taxonomy of the most popular methods. Next, we also look at a few applications of these and other XAI techniques in four primar… ▽ More In this survey paper, we deep dive into the field of Explainable Artificial Intelligence (XAI). After introducing the scope of this paper, we start by discussing what an "explanation" really is. We then move on to discuss some of the existing approaches to XAI and build a taxonomy of the most popular methods. Next, we also look at a few applications of these and other XAI techniques in four primary domains: finance, autonomous driving, healthcare and manufacturing. We end by introducing a promising discipline, "Explanation Engineering," which includes a systematic approach for designing explainability into AI systems. △ Less

Submitted 20 May, 2023; originally announced June 2023.

arXiv:2306.01540 [pdf, other]

CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities

Authors: Ayush Agrawal, Raghav Arora, Ahana Datta, Snehasis Banerjee, Brojeshwar Bhowmick, Krishna Murthy Jatavallabhula, Mohan Sridharan, Madhava Krishna

Abstract: This paper introduces a novel method for determining the best room to place an object in, for embodied scene rearrangement. While state-of-the-art approaches rely on large language models (LLMs) or reinforcement learned (RL) policies for this task, our approach, CLIPGraphs, efficiently combines commonsense domain knowledge, data-driven methods, and recent advances in multimodal learning. Specifica… ▽ More This paper introduces a novel method for determining the best room to place an object in, for embodied scene rearrangement. While state-of-the-art approaches rely on large language models (LLMs) or reinforcement learned (RL) policies for this task, our approach, CLIPGraphs, efficiently combines commonsense domain knowledge, data-driven methods, and recent advances in multimodal learning. Specifically, it (a)encodes a knowledge graph of prior human preferences about the room location of different objects in home environments, (b) incorporates vision-language features to support multimodal queries based on images or text, and (c) uses a graph network to learn object-room affinities based on embeddings of the prior knowledge and the vision-language features. We demonstrate that our approach provides better estimates of the most appropriate location of objects from a benchmark set of object categories in comparison with state-of-the-art baselines △ Less

Submitted 2 June, 2023; originally announced June 2023.

Journal ref: RO-MAN 2023 Conference

arXiv:2305.18330 [pdf, other]

#REVAL: a semantic evaluation framework for hashtag recommendation

Authors: Areej Alsini, Du Q. Huynh, Amitava Datta

Abstract: Automatic evaluation of hashtag recommendation models is a fundamental task in many online social network systems. In the traditional evaluation method, the recommended hashtags from an algorithm are firstly compared with the ground truth hashtags for exact correspondences. The number of exact matches is then used to calculate the hit rate, hit ratio, precision, recall, or F1-score. This way of ev… ▽ More Automatic evaluation of hashtag recommendation models is a fundamental task in many online social network systems. In the traditional evaluation method, the recommended hashtags from an algorithm are firstly compared with the ground truth hashtags for exact correspondences. The number of exact matches is then used to calculate the hit rate, hit ratio, precision, recall, or F1-score. This way of evaluating hashtag similarities is inadequate as it ignores the semantic correlation between the recommended and ground truth hashtags. To tackle this problem, we propose a novel semantic evaluation framework for hashtag recommendation, called #REval. This framework includes an internal module referred to as BERTag, which automatically learns the hashtag embeddings. We investigate on how the #REval framework performs under different word embedding methods and different numbers of synonyms and hashtags in the recommendation using our proposed #REval-hit-ratio measure. Our experiments of the proposed framework on three large datasets show that #REval gave more meaningful hashtag synonyms for hashtag recommendation evaluation. Our analysis also highlights the sensitivity of the framework to the word embedding technique, with #REval based on BERTag more superior over #REval based on FastText and Word2Vec. △ Less

Submitted 24 May, 2023; originally announced May 2023.

Comments: 18 pages, 4 figures

ACM Class: I.2.7

arXiv:2305.10625 [pdf, other]

Measuring and Mitigating Local Instability in Deep Neural Networks

Authors: Arghya Datta, Subhrangshu Nandi, Jingcheng Xu, Greg Ver Steeg, He Xie, Anoop Kumar, Aram Galstyan

Abstract: Deep Neural Networks (DNNs) are becoming integral components of real world services relied upon by millions of users. Unfortunately, architects of these systems can find it difficult to ensure reliable performance as irrelevant details like random initialization can unexpectedly change the outputs of a trained system with potentially disastrous consequences. We formulate the model stability proble… ▽ More Deep Neural Networks (DNNs) are becoming integral components of real world services relied upon by millions of users. Unfortunately, architects of these systems can find it difficult to ensure reliable performance as irrelevant details like random initialization can unexpectedly change the outputs of a trained system with potentially disastrous consequences. We formulate the model stability problem by studying how the predictions of a model change, even when it is retrained on the same data, as a consequence of stochasticity in the training process. For Natural Language Understanding (NLU) tasks, we find instability in predictions for a significant fraction of queries. We formulate principled metrics, like per-sample ``label entropy'' across training runs or within a single training run, to quantify this phenomenon. Intriguingly, we find that unstable predictions do not appear at random, but rather appear to be clustered in data-specific ways. We study data-agnostic regularization methods to improve stability and propose new data-centric methods that exploit our local stability estimates. We find that our localized data-specific mitigation strategy dramatically outperforms data-agnostic methods, and comes within 90% of the gold standard, achieved by ensembling, at a fraction of the computational cost △ Less

Submitted 18 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

Comments: To be published in Findings of the Association for Computational Linguistics (ACL), 2023

arXiv:2305.06178 [pdf]

Sequence-Agnostic Multi-Object Navigation

Authors: Nandiraju Gireesh, Ayush Agrawal, Ahana Datta, Snehasis Banerjee, Mohan Sridharan, Brojeshwar Bhowmick, Madhava Krishna

Abstract: The Multi-Object Navigation (MultiON) task requires a robot to localize an instance (each) of multiple object classes. It is a fundamental task for an assistive robot in a home or a factory. Existing methods for MultiON have viewed this as a direct extension of Object Navigation (ON), the task of localising an instance of one object class, and are pre-sequenced, i.e., the sequence in which the obj… ▽ More The Multi-Object Navigation (MultiON) task requires a robot to localize an instance (each) of multiple object classes. It is a fundamental task for an assistive robot in a home or a factory. Existing methods for MultiON have viewed this as a direct extension of Object Navigation (ON), the task of localising an instance of one object class, and are pre-sequenced, i.e., the sequence in which the object classes are to be explored is provided in advance. This is a strong limitation in practical applications characterized by dynamic changes. This paper describes a deep reinforcement learning framework for sequence-agnostic MultiON based on an actor-critic architecture and a suitable reward specification. Our framework leverages past experiences and seeks to reward progress toward individual as well as multiple target object classes. We use photo-realistic scenes from the Gibson benchmark dataset in the AI Habitat 3D simulation environment to experimentally show that our method performs better than a pre-sequenced approach and a state of the art ON method extended to MultiON. △ Less

Submitted 10 May, 2023; originally announced May 2023.

Journal ref: ICRA 2023 conference

arXiv:2304.09157 [pdf, other]

Neural networks for geospatial data

Authors: Wentao Zhan, Abhirup Datta

Abstract: Analysis of geospatial data has traditionally been model-based, with a mean model, customarily specified as a linear regression on the covariates, and a covariance model, encoding the spatial dependence. We relax the strong assumption of linearity and propose embedding neural networks directly within the traditional geostatistical models to accommodate non-linear mean functions while retaining all… ▽ More Analysis of geospatial data has traditionally been model-based, with a mean model, customarily specified as a linear regression on the covariates, and a covariance model, encoding the spatial dependence. We relax the strong assumption of linearity and propose embedding neural networks directly within the traditional geostatistical models to accommodate non-linear mean functions while retaining all other advantages including use of Gaussian Processes to explicitly model the spatial covariance, enabling inference on the covariate effect through the mean and on the spatial dependence through the covariance, and offering predictions at new locations via kriging. We propose NN-GLS, a new neural network estimation algorithm for the non-linear mean in GP models that explicitly accounts for the spatial covariance through generalized least squares (GLS), the same loss used in the linear case. We show that NN-GLS admits a representation as a special type of graph neural network (GNN). This connection facilitates use of standard neural network computational techniques for irregular geospatial data, enabling novel and scalable mini-batching, backpropagation, and kriging schemes. Theoretically, we show that NN-GLS will be consistent for irregularly observed spatially correlated data processes. We also provide a finite sample concentration rate, which quantifies the need to accurately model the spatial covariance in neural networks for dependent data. To our knowledge, these are the first large-sample results for any neural network algorithm for irregular spatial data. We demonstrate the methodology through simulated and real datasets. △ Less

Submitted 24 May, 2024; v1 submitted 18 April, 2023; originally announced April 2023.

arXiv:2211.04628 [pdf]

doi 10.1016/j.bspc.2023.104780

MP-SeizNet: A Multi-Path CNN Bi-LSTM Network for Seizure-Type Classification Using EEG

Authors: Hezam Albaqami, Ghulam Mubashar Hassan, Amitava Datta

Abstract: Seizure type identification is essential for the treatment and management of epileptic patients. However, it is a difficult process known to be time consuming and labor intensive. Automated diagnosis systems, with the advancement of machine learning algorithms, have the potential to accelerate the classification process, alert patients, and support physicians in making quick and accurate decisions… ▽ More Seizure type identification is essential for the treatment and management of epileptic patients. However, it is a difficult process known to be time consuming and labor intensive. Automated diagnosis systems, with the advancement of machine learning algorithms, have the potential to accelerate the classification process, alert patients, and support physicians in making quick and accurate decisions. In this paper, we present a novel multi-path seizure-type classification deep learning network (MP-SeizNet), consisting of a convolutional neural network (CNN) and a bidirectional long short-term memory neural network (Bi-LSTM) with an attention mechanism. The objective of this study was to classify specific types of seizures, including complex partial, simple partial, absence, tonic, and tonic-clonic seizures, using only electroencephalogram (EEG) data. The EEG data is fed to our proposed model in two different representations. The CNN was fed with wavelet-based features extracted from the EEG signals, while the Bi-LSTM was fed with raw EEG signals to let our MP-SeizNet jointly learns from different representations of seizure data for more accurate information learning. The proposed MP-SeizNet was evaluated using the largest available EEG epilepsy database, the Temple University Hospital EEG Seizure Corpus, TUSZ v1.5.2. We evaluated our proposed model across different patient data using three-fold cross-validation and across seizure data using five-fold cross-validation, achieving F1 scores of 87.6% and 98.1%, respectively. △ Less

Submitted 1 March, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

Journal ref: Biomed. Signal Process. Control. 84 (2023) 104780

arXiv:2208.07665 [pdf, other]

QPQ 1DLT: A system for the rapid deployment of secure and efficient EVM-based blockchains

Authors: Simone Bottoni, Anwitaman Datta, Federico Franzoni, Emanuele Ragnoli, Roberto Ripamonti, Christian Rondanini, Gokhan Sagirlar, Alberto Trombetta

Abstract: Limited scalability and transaction costs are, among others, some of the critical issues that hamper a wider adoption of distributed ledger technologies (DLT). That is particularly true for the Ethereum blockchain, which, so far, has been the ecosystem with the highest adoption rate. Quite a few solutions, especially on the Ethereum side of things, have been attempted in the last few years. Most o… ▽ More Limited scalability and transaction costs are, among others, some of the critical issues that hamper a wider adoption of distributed ledger technologies (DLT). That is particularly true for the Ethereum blockchain, which, so far, has been the ecosystem with the highest adoption rate. Quite a few solutions, especially on the Ethereum side of things, have been attempted in the last few years. Most of them adopt the approach to offload transactions from the blockchain mainnet, a.k.a. Level 1 (L1), to a separate network. Such systems are collectively known as Level 2 (L2) systems. While mitigating the scalability issue, the adoption of L2 introduces additional drawbacks: users have to trust that the L2 system has correctly performed transactions or, conversely, high computational power is required to prove transactions correctness. In addition, significant technical knowledge is needed to set up and manage such an L2 system. To tackle such limitations, we propose 1DLT: a novel system that enables rapid and trustless deployment of an Ethereum Virtual Machine based blockchain that overcomes those drawbacks. △ Less

Submitted 16 August, 2022; originally announced August 2022.

arXiv:2206.00192 [pdf, other]

Order-sensitive Shapley Values for Evaluating Conceptual Soundness of NLP Models

Authors: Kaiji Lu, Anupam Datta

Abstract: Previous works show that deep NLP models are not always conceptually sound: they do not always learn the correct linguistic concepts. Specifically, they can be insensitive to word order. In order to systematically evaluate models for their conceptual soundness with respect to word order, we introduce a new explanation method for sequential data: Order-sensitive Shapley Values (OSV). We conduct an… ▽ More Previous works show that deep NLP models are not always conceptually sound: they do not always learn the correct linguistic concepts. Specifically, they can be insensitive to word order. In order to systematically evaluate models for their conceptual soundness with respect to word order, we introduce a new explanation method for sequential data: Order-sensitive Shapley Values (OSV). We conduct an extensive empirical evaluation to validate the method and surface how well various deep NLP models learn word order. Using synthetic data, we first show that OSV is more faithful in explaining model behavior than gradient-based methods. Second, applying to the HANS dataset, we discover that the BERT-based NLI model uses only the word occurrences without word orders. Although simple data augmentation improves accuracy on HANS, OSV shows that the augmented model does not fundamentally improve the model's learning of order. Third, we discover that not all sentiment analysis models learn negation properly: some fail to capture the correct syntax of the negation construct. Finally, we show that pretrained language models such as BERT may rely on the absolute positions of subject words to learn long-range Subject-Verb Agreement. With each NLP task, we also demonstrate how OSV can be leveraged to generate adversarial examples. △ Less

Submitted 31 May, 2022; originally announced June 2022.

arXiv:2205.11850 [pdf, other]

Faithful Explanations for Deep Graph Models

Authors: Zifan Wang, Yuhang Yao, Chaoran Zhang, Han Zhang, Youjie Kang, Carlee Joe-Wong, Matt Fredrikson, Anupam Datta

Abstract: This paper studies faithful explanations for Graph Neural Networks (GNNs). First, we provide a new and general method for formally characterizing the faithfulness of explanations for GNNs. It applies to existing explanation methods, including feature attributions and subgraph explanations. Second, our analytical and empirical results demonstrate that feature attribution methods cannot capture the… ▽ More This paper studies faithful explanations for Graph Neural Networks (GNNs). First, we provide a new and general method for formally characterizing the faithfulness of explanations for GNNs. It applies to existing explanation methods, including feature attributions and subgraph explanations. Second, our analytical and empirical results demonstrate that feature attribution methods cannot capture the nonlinear effect of edge features, while existing subgraph explanation methods are not faithful. Third, we introduce \emph{k-hop Explanation with a Convolutional Core} (KEC), a new explanation method that provably maximizes faithfulness to the original GNN by leveraging information about the graph structure in its adjacency matrix and its \emph{k-th} power. Lastly, our empirical results over both synthetic and real-world datasets for classification and anomaly detection tasks with GNNs demonstrate the effectiveness of our approach. △ Less

Submitted 24 May, 2022; originally announced May 2022.

arXiv:2205.07870 [pdf, other]

Unsupervised Driving Behavior Analysis using Representation Learning and Exploiting Group-based Training

Authors: Soma Bandyopadhyay, Anish Datta, Shruti Sachan, Arpan Pal

Abstract: Driving behavior monitoring plays a crucial role in managing road safety and decreasing the risk of traffic accidents. Driving behavior is affected by multiple factors like vehicle characteristics, types of roads, traffic, but, most importantly, the pattern of driving of individuals. Current work performs a robust driving pattern analysis by capturing variations in driving patterns. It forms consi… ▽ More Driving behavior monitoring plays a crucial role in managing road safety and decreasing the risk of traffic accidents. Driving behavior is affected by multiple factors like vehicle characteristics, types of roads, traffic, but, most importantly, the pattern of driving of individuals. Current work performs a robust driving pattern analysis by capturing variations in driving patterns. It forms consistent groups by learning compressed representation of time series (Auto Encoded Compact Sequence) using a multi-layer seq-2-seq autoencoder and exploiting hierarchical clustering along with recommending the choice of best distance measure. Consistent groups aid in identifying variations in driving patterns of individuals captured in the dataset. These groups are generated for both train and hidden test data. The consistent groups formed using train data, are exploited for training multiple instances of the classifier. Obtained choice of best distance measure is used to select the best train-test pair of consistent groups. We have experimented on the publicly available UAH-DriveSet dataset considering the signals captured from IMU sensors (accelerometer and gyroscope) for classifying driving behavior. We observe proposed method, significantly outperforms the benchmark performance. △ Less

Submitted 12 May, 2022; originally announced May 2022.

Comments: 7 figures, 8 pages , 7 tables, accepted and presented conference AAAI 2022 AI for Transportation Workshop (Prefinal version)

arXiv:2203.07731 [pdf]

Evaluating BERT-based Pre-training Language Models for Detecting Misinformation

Authors: Rini Anggrainingsih, Ghulam Mubashar Hassan, Amitava Datta

Abstract: It is challenging to control the quality of online information due to the lack of supervision over all the information posted online. Manual checking is almost impossible given the vast number of posts made on online media and how quickly they spread. Therefore, there is a need for automated rumour detection techniques to limit the adverse effects of spreading misinformation. Previous studies main… ▽ More It is challenging to control the quality of online information due to the lack of supervision over all the information posted online. Manual checking is almost impossible given the vast number of posts made on online media and how quickly they spread. Therefore, there is a need for automated rumour detection techniques to limit the adverse effects of spreading misinformation. Previous studies mainly focused on finding and extracting the significant features of text data. However, extracting features is time-consuming and not a highly effective process. This study proposes the BERT- based pre-trained language models to encode text data into vectors and utilise neural network models to classify these vectors to detect misinformation. Furthermore, different language models (LM) ' performance with different trainable parameters was compared. The proposed technique is tested on different short and long text datasets. The result of the proposed technique has been compared with the state-of-the-art techniques on the same datasets. The results show that the proposed technique performs better than the state-of-the-art techniques. We also tested the proposed technique by combining the datasets. The results demonstrated that the large data training and testing size considerably improves the technique's performance. △ Less

Submitted 15 March, 2022; originally announced March 2022.

Comments: 17 pages, 2 figures, 10 tables

arXiv:2203.00511 [pdf, other]

doi 10.3390/app12115702

Wavelet-Based Multi-Class Seizure Type Classification System

Authors: Hezam Albaqami, Ghulam Mubashar Hassan, Amitava Datta

Abstract: Epilepsy is one of the most common brain diseases that affect more than 1\% of the world's population. It is characterized by recurrent seizures, which come in different types and are treated differently. Electroencephalography (EEG) is commonly used in medical services to diagnose seizures and their types. The accurate identification of seizures helps to provide optimal treatment and accurate inf… ▽ More Epilepsy is one of the most common brain diseases that affect more than 1\% of the world's population. It is characterized by recurrent seizures, which come in different types and are treated differently. Electroencephalography (EEG) is commonly used in medical services to diagnose seizures and their types. The accurate identification of seizures helps to provide optimal treatment and accurate information to the patient. However, the manual diagnostic procedures of epileptic seizures are laborious and highly-specialized. Moreover, EEG manual evaluation is a process known to have a low inter-rater agreement among experts. This paper presents a novel automatic technique that involves extraction of specific features from EEG signals using Dual-tree Complex Wavelet Transform (DTCWT) and classifying them. We evaluated the proposed technique on TUH EEG Seizure Corpus (TUSZ) ver.1.5.2 dataset and compared the performance with existing state-of-the-art techniques using overall F1-score due to class imbalance seizure types. Our proposed technique achieved the best results of weighted F1-score of 99.1\% and 74.7\% for seizure-wise and patient-wise classification respectively, thereby setting new benchmark results for this dataset. △ Less

Submitted 19 February, 2022; originally announced March 2022.

arXiv:2112.06456 [pdf, other]

Real Time Action Recognition from Video Footage

Authors: Tasnim Sakib Apon, Mushfiqul Islam Chowdhury, MD Zubair Reza, Arpita Datta, Syeda Tanjina Hasan, MD. Golam Rabiul Alam

Abstract: Crime rate is increasing proportionally with the increasing rate of the population. The most prominent approach was to introduce Closed-Circuit Television (CCTV) camera-based surveillance to tackle the issue. Video surveillance cameras have added a new dimension to detect crime. Several research works on autonomous security camera surveillance are currently ongoing, where the fundamental goal is t… ▽ More Crime rate is increasing proportionally with the increasing rate of the population. The most prominent approach was to introduce Closed-Circuit Television (CCTV) camera-based surveillance to tackle the issue. Video surveillance cameras have added a new dimension to detect crime. Several research works on autonomous security camera surveillance are currently ongoing, where the fundamental goal is to discover violent activity from video feeds. From the technical viewpoint, this is a challenging problem because analyzing a set of frames, i.e., videos in temporal dimension to detect violence might need careful machine learning model training to reduce false results. This research focuses on this problem by integrating state-of-the-art Deep Learning methods to ensure a robust pipeline for autonomous surveillance for detecting violent activities, e.g., kicking, punching, and slapping. Initially, we designed a dataset of this specific interest, which contains 600 videos (200 for each action). Later, we have utilized existing pre-trained model architectures to extract features, and later used deep learning network for classification. Also, We have classified our models' accuracy, and confusion matrix on different pre-trained architectures like VGG16, InceptionV3, ResNet50, Xception and MobileNet V2 among which VGG16 and MobileNet V2 performed better. △ Less

Submitted 13 December, 2021; originally announced December 2021.

arXiv:2111.11623 [pdf, other]

A Modular Framework for Centrality and Clustering in Complex Networks

Authors: Frederique Oggier, Silivanxay Phetsouvanh, Anwitaman Datta

Abstract: The structure of many complex networks includes edge directionality and weights on top of their topology. Network analysis that can seamlessly consider combination of these properties are desirable. In this paper, we study two important such network analysis techniques, namely, centrality and clustering. An information-flow based model is adopted for clustering, which itself builds upon an informa… ▽ More The structure of many complex networks includes edge directionality and weights on top of their topology. Network analysis that can seamlessly consider combination of these properties are desirable. In this paper, we study two important such network analysis techniques, namely, centrality and clustering. An information-flow based model is adopted for clustering, which itself builds upon an information theoretic measure for computing centrality. Our principal contributions include a generalized model of Markov entropic centrality with the flexibility to tune the importance of node degrees, edge weights and directions, with a closed-form asymptotic analysis. It leads to a novel two-stage graph clustering algorithm. The centrality analysis helps reason about the suitability of our approach to cluster a given graph, and determine `query' nodes, around which to explore local community structures, leading to an agglomerative clustering mechanism. The entropic centrality computations are amortized by our clustering algorithm, making it computationally efficient: compared to prior approaches using Markov entropic centrality for clustering, our experiments demonstrate multiple orders of magnitude of speed-up. Our clustering algorithm naturally inherits the flexibility to accommodate edge directionality, as well as different interpretations and interplay between edge weights and node degrees. Overall, this paper thus not only makes significant theoretical and conceptual contributions, but also translates the findings into artifacts of practical relevance, yielding new, effective and scalable centrality computations and graph clustering algorithms, whose efficacy has been validated through extensive benchmarking experiments. △ Less

Submitted 22 November, 2021; originally announced November 2021.

arXiv:2111.00163 [pdf, other]

Simpli-Squared: A Very Simple Yet Unexpectedly Powerful Join Ordering Algorithm Without Cardinality Estimates

Authors: Asoke Datta, Yesdaulet Izenov, Brian Tsan, Florin Rusu

Abstract: The Join Order Benchmark (JOB) has become the de facto standard to assess the performance of relational database query optimizers due to its complexity and completeness. In order to compute the optimal execution plan -- join order -- existing solutions employ extensive data synopses and correlations -- functional dependencies -- between table attributes. These structures incur significant overhead… ▽ More The Join Order Benchmark (JOB) has become the de facto standard to assess the performance of relational database query optimizers due to its complexity and completeness. In order to compute the optimal execution plan -- join order -- existing solutions employ extensive data synopses and correlations -- functional dependencies -- between table attributes. These structures incur significant overhead to design, build, and maintain. In this paper, we present \textit{Simpli}city \textit{Simpli}fied (\textit{Simpli-Squared}), a very simple join ordering algorithm that achieves unexpectedly good results. Simpli-Squared computes the join order without using any statistics or cardinality estimates. It takes as input only the referential integrity constraints declared at schema definition and the number of tuples (size) in the base tables. The join order of a given query is computed by splitting the join graph along the many-to-many joins and sorting the tables based on their size. The tables involved in one-to-many joins are greedily included based on size and the query join graph. The resulting plan can be efficiently generated by a lightweight query rewriting procedure. Experiments on the JOB benchmark in PostgreSQL show that Simpli-Squared achieves runtimes having an increase of only up to 16\% -- and sometimes even a reduction -- compared to four state-of-the-art solutions that are considerably more intricate. Based on these results, we question whether JOB adequately tests query optimizers or if accurate cardinality estimation is such a fundamental requirement for performing well on the JOB benchmark. △ Less

Submitted 29 October, 2021; originally announced November 2021.

arXiv:2110.03109 [pdf, other]

Consistent Counterfactuals for Deep Models

Authors: Emily Black, Zifan Wang, Matt Fredrikson, Anupam Datta

Abstract: Counterfactual examples are one of the most commonly-cited methods for explaining the predictions of machine learning models in key areas such as finance and medical diagnosis. Counterfactuals are often discussed under the assumption that the model on which they will be used is static, but in deployment models may be periodically retrained or fine-tuned. This paper studies the consistency of model… ▽ More Counterfactual examples are one of the most commonly-cited methods for explaining the predictions of machine learning models in key areas such as finance and medical diagnosis. Counterfactuals are often discussed under the assumption that the model on which they will be used is static, but in deployment models may be periodically retrained or fine-tuned. This paper studies the consistency of model prediction on counterfactual examples in deep networks under small changes to initial training conditions, such as weight initialization and leave-one-out variations in data, as often occurs during model deployment. We demonstrate experimentally that counterfactual examples for deep models are often inconsistent across such small changes, and that increasing the cost of the counterfactual, a stability-enhancing mitigation suggested by prior work in the context of simpler models, is not a reliable heuristic in deep networks. Rather, our analysis shows that a model's local Lipschitz continuity around the counterfactual is key to its consistency across related models. To this end, we propose Stable Neighbor Search as a way to generate more consistent counterfactual explanations, and illustrate the effectiveness of this approach on several benchmark datasets. △ Less

Submitted 6 October, 2021; originally announced October 2021.

arXiv:2109.02975 [pdf, other]

BERT based classification system for detecting rumours on Twitter

Authors: Rini Anggrainingsih, Ghulam Mubashar Hassan, Amitava Datta

Abstract: The role of social media in opinion formation has far-reaching implications in all spheres of society. Though social media provide platforms for expressing news and views, it is hard to control the quality of posts due to the sheer volumes of posts on platforms like Twitter and Facebook. Misinformation and rumours have lasting effects on society, as they tend to influence people's opinions and als… ▽ More The role of social media in opinion formation has far-reaching implications in all spheres of society. Though social media provide platforms for expressing news and views, it is hard to control the quality of posts due to the sheer volumes of posts on platforms like Twitter and Facebook. Misinformation and rumours have lasting effects on society, as they tend to influence people's opinions and also may motivate people to act irrationally. It is therefore very important to detect and remove rumours from these platforms. The only way to prevent the spread of rumours is through automatic detection and classification of social media posts. Our focus in this paper is the Twitter social medium, as it is relatively easy to collect data from Twitter. The majority of previous studies used supervised learning approaches to classify rumours on Twitter. These approaches rely on feature extraction to obtain both content and context features from the text of tweets to distinguish rumours and non-rumours. Manually extracting features however is time-consuming considering the volume of tweets. We propose a novel approach to deal with this problem by utilising sentence embedding using BERT to identify rumours on Twitter, rather than the usual feature extraction techniques. We use sentence embedding using BERT to represent each tweet's sentences into a vector according to the contextual meaning of the tweet. We classify those vectors into rumours or non-rumours by using various supervised learning techniques. Our BERT based models improved the accuracy by approximately 10% as compared to previous methods. △ Less

Submitted 7 September, 2021; originally announced September 2021.

Comments: Consists of 10 pages, 5 figures, and 8 tables, has been submitted to IEEE transactions on Computational and Social Systems (still underreview process)

Showing 1–50 of 157 results for author: Datta, A