Showing 1–50 of 157 results for author: Datta, A

Searching in archive cs.
  1. arXiv:2405.15341  [pdf, other]

    cs.AI cs.CV

    V-Zen: Efficient GUI Understanding and Precise Grounding With A Novel Multimodal LLM

    Authors: Abdur Rahman, Rajat Chawla, Muskaan Kumar, Arkajit Datta, Adarsh Jha, Mukunda NS, Ishaan Bhola

    Abstract: In the rapidly evolving landscape of AI research and application, Multimodal Large Language Models (MLLMs) have emerged as a transformative force, adept at interpreting and integrating information from diverse modalities such as text, images, and Graphical User Interfaces (GUIs). Despite these advancements, the nuanced interaction and understanding of GUIs pose a significant challenge, limiting th…

    Submitted 24 May, 2024; originally announced May 2024.

  2. arXiv:2405.08305  [pdf, other]

    cs.CR

    Collateral Portfolio Optimization in Crypto-Backed Stablecoins

    Authors: Bretislav Hajek, Daniel Reijsbergen, Anwitaman Datta, Jussi Keppo

    Abstract: Stablecoins - crypto tokens whose value is pegged to a real-world asset such as the US Dollar - are an important component of the DeFi ecosystem as they mitigate the impact of token price volatility. In crypto-backed stablecoins, the peg is founded on the guarantee that in case of system shutdown, each stablecoin can be exchanged for a basket of other crypto tokens worth approximately its nominal…

    Submitted 14 May, 2024; originally announced May 2024.

    Comments: Accepted for presentation at MARBLE 2024

  3. arXiv:2404.03587  [pdf, other]

    cs.RO cs.AI

    Anticipate & Collab: Data-driven Task Anticipation and Knowledge-driven Planning for Human-robot Collaboration

    Authors: Shivam Singh, Karthik Swaminathan, Raghav Arora, Ramandeep Singh, Ahana Datta, Dipanjan Das, Snehasis Banerjee, Mohan Sridharan, Madhava Krishna

    Abstract: An agent assisting humans in daily living activities can collaborate more effectively by anticipating upcoming tasks. Data-driven methods represent the state of the art in task anticipation, planning, and related problems, but these methods are resource-hungry and opaque. Our prior work introduced a proof of concept framework that used an LLM to anticipate 3 high-level tasks that served as goals f…

    Submitted 4 April, 2024; originally announced April 2024.

  4. arXiv:2404.01329  [pdf, other]

    cs.SI

    Unraveling the Dynamics of Television Debates and Social Media Engagement: Insights from an Indian News Show

    Authors: Kiran Garimella, Abhilash Datta

    Abstract: The relationship between television shows and social media has become increasingly intertwined in recent years. Social media platforms, particularly Twitter, have emerged as significant sources of public opinion and discourse on topics discussed in television shows. In India, news debates leverage the popularity of social media to promote hashtags and engage users in discussions and debates on a d…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: Accepted at ICWSM 2024. Please cite the ICWSM version

  5. arXiv:2404.01226  [pdf, other]

    cs.CL

    Stable Code Technical Report

    Authors: Nikhil Pinnaparaju, Reshinth Adithyan, Duy Phung, Jonathan Tow, James Baicoianu, Ashish Datta, Maksym Zhuravinskyi, Dakota Mahan, Marco Bellagente, Carlos Riquelme, Nathan Cooper

    Abstract: We introduce Stable Code, the first in our new-generation of code language models series, which serves as a general-purpose base code language model targeting code completion, reasoning, math, and other software engineering-based tasks. Additionally, we introduce an instruction variant named Stable Code Instruct that allows conversing with the model in a natural chat interface for performing quest…

    Submitted 1 April, 2024; originally announced April 2024.

  6. arXiv:2403.10171  [pdf]

    cs.AI cs.CV

    AUTONODE: A Neuro-Graphic Self-Learnable Engine for Cognitive GUI Automation

    Authors: Arkajit Datta, Tushar Verma, Rajat Chawla, Mukunda N. S, Ishaan Bhola

    Abstract: In recent advancements within the domain of Large Language Models (LLMs), there has been a notable emergence of agents capable of addressing Robotic Process Automation (RPA) challenges through enhanced cognitive capabilities and sophisticated reasoning. This development heralds a new era of scalability and human-like adaptability in goal attainment. In this context, we introduce AUTONODE (Autonomo…

    Submitted 27 May, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted in MIPR-2024

  7. arXiv:2403.08773  [pdf, other]

    cs.CV cs.AI cs.CL cs.MM

    Veagle: Advancements in Multimodal Representation Learning

    Authors: Rajat Chawla, Arkajit Datta, Tushar Verma, Adarsh Jha, Anmol Gautam, Ayush Vatsal, Sukrit Chaterjee, Mukunda NS, Ishaan Bhola

    Abstract: Lately, researchers in artificial intelligence have been really interested in how language and vision come together, giving rise to the development of multimodal models that aim to seamlessly integrate textual and visual information. Multimodal models, an extension of Large Language Models (LLMs), have exhibited remarkable capabilities in addressing a diverse array of tasks, ranging from image cap…

    Submitted 18 January, 2024; originally announced March 2024.

  8. arXiv:2403.04026  [pdf, other]

    cs.DB

    Spanning Tree-based Query Plan Enumeration

    Authors: Yesdaulet Izenov, Asoke Datta, Brian Tsan, Abylay Amanbayev, Florin Rusu

    Abstract: In this work, we define the problem of finding an optimal query plan as finding spanning trees with low costs. This approach empowers the utilization of a series of spanning tree algorithms, thereby enabling systematic exploration of the plan search space over a join graph. Capitalizing on the polynomial time complexity of spanning tree algorithms, we present the Ensemble Spanning Tree Enumeration…

    Submitted 6 March, 2024; originally announced March 2024.

  9. arXiv:2402.17834  [pdf, other]

    cs.CL stat.ML

    Stable LM 2 1.6B Technical Report

    Authors: Marco Bellagente, Jonathan Tow, Dakota Mahan, Duy Phung, Maksym Zhuravinskyi, Reshinth Adithyan, James Baicoianu, Ben Brooks, Nathan Cooper, Ashish Datta, Meng Lee, Emad Mostaque, Michael Pieler, Nikhil Pinnaparju, Paulo Rocha, Harry Saini, Hannah Teufel, Niccolo Zanichelli, Carlos Riquelme

    Abstract: We introduce StableLM 2 1.6B, the first in a new generation of our language model series. In this technical report, we present in detail the data and training procedure leading to the base and instruction-tuned versions of StableLM 2 1.6B. The weights for both models are available via Hugging Face for anyone to download and use. The report contains thorough evaluations of these models, including z…

    Submitted 27 February, 2024; originally announced February 2024.

    Comments: 23 pages, 6 figures

  10. arXiv:2402.15873  [pdf, ps, other]

    cs.CL

    SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection

    Authors: Ayan Datta, Aryan Chandramania, Radhika Mamidi

    Abstract: This document contains the details of the authors' submission to the proceedings of SemEval 2024's Task 8: Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection Subtask A (monolingual) and B. Detection of machine-generated text is becoming an increasingly important task, with the advent of large language models (LLMs). In this paper, we lay out how using weighted…

    Submitted 9 April, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  11. arXiv:2402.15037  [pdf, other]

    cs.GT econ.GN

    Analyzing Games in Maker Protocol Part One: A Multi-Agent Influence Diagram Approach Towards Coordination

    Authors: Abhimanyu Nag, Samrat Gupta, Sudipan Sinha, Arka Datta

    Abstract: Decentralized Finance (DeFi) ecosystems, exemplified by the Maker Protocol, rely on intricate games to maintain stability and security. Understanding the dynamics of these games is crucial for ensuring the robustness of the system. This motivating research proposes a novel methodology leveraging Multi-Agent Influence Diagrams (MAID), originally proposed by Koller and Milch, to dissect and analyze…

    Submitted 22 February, 2024; originally announced February 2024.

  12. arXiv:2402.05620  [pdf, ps, other]

    cs.IT

    Optimized Denial-of-Service Threats on the Scalability of LT Coded Blockchains

    Authors: Harikrishnan K., J. Harshan, Anwitaman Datta

    Abstract: Coded blockchains have acquired prominence in the recent past as a promising approach to slash the storage costs as well as to facilitate scalability. Within this class, Luby Transform (LT) coded blockchains are an appealing choice for scalability in heterogeneous networks owing to the availability of a wide range of low-complexity LT decoders. While these architectures have been studied from the…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: To be presented in IEEE International Conference on Communications 2024

  13. arXiv:2402.04489  [pdf, other]

    cs.LG cs.CR cs.CY stat.ME

    De-amplifying Bias from Differential Privacy in Language Model Fine-tuning

    Authors: Sanjari Srivastava, Piotr Mardziel, Zhikhun Zhang, Archana Ahlawat, Anupam Datta, John C Mitchell

    Abstract: Fairness and privacy are two important values machine learning (ML) practitioners often seek to operationalize in models. Fairness aims to reduce model bias for social/demographic sub-groups. Privacy via differential privacy (DP) mechanisms, on the other hand, limits the impact of any individual's training data on the resulting model. The trade-offs between privacy and fairness goals of trustworth…

    Submitted 6 February, 2024; originally announced February 2024.

  14. arXiv:2401.10419  [pdf]

    eess.IV cs.CV cs.LG

    M3BUNet: Mobile Mean Max UNet for Pancreas Segmentation on CT-Scans

    Authors: Juwita juwita, Ghulam Mubashar Hassan, Naveed Akhtar, Amitava Datta

    Abstract: Segmenting organs in CT scan images is a necessary process for multiple downstream medical image analysis tasks. Currently, manual CT scan segmentation by radiologists is prevalent, especially for organs like the pancreas, which requires a high level of domain expertise for reliable segmentation due to factors like small organ size, occlusion, and varying shapes. When resorting to automated pancre…

    Submitted 18 January, 2024; originally announced January 2024.

  15. arXiv:2401.04385  [pdf, other]

    cs.LG cs.AI

    Machine unlearning through fine-grained model parameters perturbation

    Authors: Zhiwei Zuo, Zhuo Tang, Kenli Li, Anwitaman Datta

    Abstract: Machine unlearning techniques, which involve retracting data records and reducing influence of said data on trained models, help with the user privacy protection objective but incur significant computational costs. Weight perturbation-based unlearning is a general approach, but it typically involves globally modifying the parameters. We propose fine-grained Top-K and Random-k parameters perturbed…

    Submitted 8 July, 2024; v1 submitted 9 January, 2024; originally announced January 2024.

  16. arXiv:2401.02457  [pdf, other]

    cs.LG cs.AI

    eCIL-MU: Embedding based Class Incremental Learning and Machine Unlearning

    Authors: Zhiwei Zuo, Zhuo Tang, Bin Wang, Kenli Li, Anwitaman Datta

    Abstract: New categories may be introduced over time, or existing categories may need to be reclassified. Class incremental learning (CIL) is employed for the gradual acquisition of knowledge about new categories while preserving information about previously learned ones in such dynamic environments. It might also be necessary to eliminate the influence of related categories on the model to adapt to re…

    Submitted 4 January, 2024; originally announced January 2024.

  17. arXiv:2312.17013  [pdf, other]

    cs.SI

    Perspectives of Global and Hong Kong's Media on China's Belt and Road Initiative

    Authors: Le Cong Khoo, Anwitaman Datta

    Abstract: This study delves into the media analysis of China's ambitious Belt and Road Initiative (BRI), which, in a polarized world, and furthermore, owing to the very polarizing nature of the initiative itself, has received both strong criticisms and conversely positive coverage in media from across the world. In that context, Hong Kong's dynamic media environment, with a particular focus on its drastical…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: 23 pages, 16 figures, 7 tables, 1 appendix

  18. arXiv:2311.17293  [pdf, other]

    cs.DB

    Analyzing Query Optimizer Performance in the Presence and Absence of Cardinality Estimates

    Authors: Asoke Datta, Brian Tsan, Yesdaulet Izenov, Florin Rusu

    Abstract: Most query optimizers rely on cardinality estimates to determine optimal execution plans. While traditional databases such as PostgreSQL, Oracle, and Db2 utilize many types of synopses -- including histograms, samples, and sketches -- recent main-memory databases like DuckDB and Heavy.AI often operate with minimal or no estimates, yet their performance does not necessarily suffer. To the best of o…

    Submitted 28 November, 2023; originally announced November 2023.

  19. arXiv:2311.05046  [pdf, other]

    stat.ML cs.LG

    On the Consistency of Maximum Likelihood Estimation of Probabilistic Principal Component Analysis

    Authors: Arghya Datta, Sayak Chakrabarty

    Abstract: Probabilistic principal component analysis (PPCA) is currently one of the most used statistical tools to reduce the ambient dimension of the data. From multidimensional scaling to the imputation of missing data, PPCA has a broad spectrum of applications ranging from science and engineering to quantitative finance. Despite this wide applicability in various fields, hardly any theoretical guarante…

    Submitted 13 November, 2023; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: 15 pages, 1 figure, to appear in NeurIPS 2023. Update: included minor typographical corrections

  20. arXiv:2310.19834  [pdf, other]

    cs.AI cs.IR cs.SI

    AMIR: Automated MisInformation Rebuttal -- A COVID-19 Vaccination Datasets based Recommendation System

    Authors: Shakshi Sharma, Anwitaman Datta, Rajesh Sharma

    Abstract: Misinformation has emerged as a major societal threat in recent years in general; specifically in the context of the COVID-19 pandemic, it has wreaked havoc, for instance, by fuelling vaccine hesitancy. Cost-effective, scalable solutions for combating misinformation are the need of the hour. This work explored how existing information obtained from social media and augmented with more curated fact…

    Submitted 29 October, 2023; originally announced October 2023.

  21. arXiv:2310.09361  [pdf, other]

    cs.LG

    Is Certifying $\ell_p$ Robustness Still Worthwhile?

    Authors: Ravi Mangal, Klas Leino, Zifan Wang, Kai Hu, Weicheng Yu, Corina Pasareanu, Anupam Datta, Matt Fredrikson

    Abstract: Over the years, researchers have developed myriad attacks that exploit the ubiquity of adversarial examples, as well as defenses that aim to guard against the security vulnerabilities posed by such attacks. Of particular interest to this paper are defenses that provide provable guarantees against the class of $\ell_p$-bounded attacks. Certified defenses have made significant progress, taking robus…

    Submitted 13 October, 2023; originally announced October 2023.

  22. arXiv:2309.05497  [pdf, other]

    cs.CL cs.CY

    Personality Detection and Analysis using Twitter Data

    Authors: Abhilash Datta, Souvic Chakraborty, Animesh Mukherjee

    Abstract: Personality types are important in various fields as they hold relevant information about the characteristics of a human being in an explainable format. They are often good predictors of a person's behaviors in a particular environment and have applications ranging from candidate selection to marketing and mental health. Recently automatic detection of personality traits from texts has gained sign…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: Submitted to ASONAM 2023

  23. COVID-19 Detection System: A Comparative Analysis of System Performance Based on Acoustic Features of Cough Audio Signals

    Authors: Asmaa Shati, Ghulam Mubashar Hassan, Amitava Datta

    Abstract: A wide range of respiratory diseases, such as cold and flu, asthma, and COVID-19, affect people's daily lives worldwide. In medical practice, respiratory sounds are widely used in medical services to diagnose various respiratory illnesses and lung disorders. The traditional diagnosis of such sounds requires specialized knowledge, which can be costly and reliant on human expertise. Despite this, re…

    Submitted 18 June, 2024; v1 submitted 8 September, 2023; originally announced September 2023.

    Comments: 8 pages, 3 figures

    Journal ref: 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), Exeter, United Kingdom, 2023, pp. 2706-2713

  24. arXiv:2309.04125  [pdf, other]

    cs.CR

    Blockchain-enabled Data Governance for Privacy-Preserved Sharing of Confidential Data

    Authors: Jingchi Zhang, Anwitaman Datta

    Abstract: In a traditional cloud storage system, users benefit from the convenience it provides but also take the risk of certain security and privacy issues. To ensure confidentiality while maintaining data sharing capabilities, the Ciphertext-Policy Attribute-based Encryption (CP-ABE) scheme can be used to achieve fine-grained access control in cloud services. However, existing approaches are impaired by…

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: 23 pages, 19 algorithms, 1 figure

  25. arXiv:2309.03204  [pdf, other]

    cs.AR

    A 9 Transistor SRAM Featuring Array-level XOR Parallelism with Secure Data Toggling Operation

    Authors: Zihan Yin, Annewsha Datta, Shwetha Vijayakumar, Ajey Jacob, Akhilesh Jaiswal

    Abstract: Security and energy-efficiency are critical for computing applications in general and for edge applications in particular. Digital in-Memory Computing (IMC) in SRAM cells have widely been studied to accelerate inference tasks to maximize both throughput and energy efficiency for intelligent computing at the edge. XOR operations have been of particular interest due to their wide applicability in nu…

    Submitted 11 August, 2023; originally announced September 2023.

  26. arXiv:2309.00639  [pdf, other]

    cs.CL cs.SI

    Misinformation Concierge: A Proof-of-Concept with Curated Twitter Dataset on COVID-19 Vaccination

    Authors: Shakshi Sharma, Anwitaman Datta, Vigneshwaran Shankaran, Rajesh Sharma

    Abstract: We demonstrate the Misinformation Concierge, a proof-of-concept that provides actionable intelligence on misinformation prevalent in social media. Specifically, it uses language processing and machine learning tools to identify subtopics of discourse and discern non/misleading posts; presents statistical reports for policy-makers to understand the big picture of prevalent misinformation in a timel…

    Submitted 25 August, 2023; originally announced September 2023.

    Comments: This is a preprinted version of our CIKM paper. Please cite our CIKM paper

  27. Identifying and Mitigating the Security Risks of Generative AI

    Authors: Clark Barrett, Brad Boyd, Elie Burzstein, Nicholas Carlini, Brad Chen, Jihye Choi, Amrita Roy Chowdhury, Mihai Christodorescu, Anupam Datta, Soheil Feizi, Kathleen Fisher, Tatsunori Hashimoto, Dan Hendrycks, Somesh Jha, Daniel Kang, Florian Kerschbaum, Eric Mitchell, John Mitchell, Zulfikar Ramzan, Khawaja Shams, Dawn Song, Ankur Taly, Diyi Yang

    Abstract: Every major technical invention resurfaces the dual-use dilemma -- the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well…

    Submitted 28 December, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

    Journal ref: Foundations and Trends in Privacy and Security 6 (2023) 1-52

  28. arXiv:2308.02163  [pdf, other]

    cs.CR

    BlockChain I/O: Enabling Cross-Chain Commerce

    Authors: Anwitaman Datta, Daniël Reijsbergen, Jingchi Zhang, Suman Majumder

    Abstract: Blockchain technology enables secure token transfers in digital marketplaces, and recent advances in this field provide other desirable properties such as efficiency, privacy, and price stability. However, these properties do not always generalize to a setting across multiple independent blockchains. Despite the growing number of existing blockchain platforms, there is a lack of an overarching fr…

    Submitted 28 June, 2024; v1 submitted 4 August, 2023; originally announced August 2023.

  29. arXiv:2307.14041  [pdf, other]

    cs.CR cs.DL

    GovernR: Provenance and Confidentiality Guarantees In Research Data Repositories

    Authors: Anwitaman Datta, Chua Chiah Soon, Wangfan Gu

    Abstract: We propose cryptographic protocols to incorporate time provenance guarantees while meeting confidentiality and controlled sharing needs for research data. We demonstrate the efficacy of these mechanisms by developing and benchmarking a practical tool, GovernR, which furthermore takes into account usability issues and is compatible with a popular open-sourced research data storage platform, Dataverse. In d…

    Submitted 26 July, 2023; originally announced July 2023.

    Comments: 2 Figures, 3 Tables

  30. arXiv:2307.05373  [pdf, other]

    eess.SP cs.AI cs.LG

    Classification of sleep stages from EEG, EOG and EMG signals by SSNet

    Authors: Haifa Almutairi, Ghulam Mubashar Hassan, Amitava Datta

    Abstract: Classification of sleep stages plays an essential role in diagnosing sleep-related diseases including Sleep Disorder Breathing (SDB) disease. In this study, we propose an end-to-end deep learning architecture, named SSNet, which comprises two deep learning networks based on Convolutional Neural Networks (CNN) and Long Short Term Memory (LSTM). Both deep learning networks extract features from t…

    Submitted 2 July, 2023; originally announced July 2023.

  31. arXiv:2306.09754  [pdf, other]

    cs.CR

    CroCoDai: A Stablecoin for Cross-Chain Commerce

    Authors: Daniël Reijsbergen, Bretislav Hajek, Tien Tuan Anh Dinh, Jussi Keppo, Hank Korth, Anwitaman Datta

    Abstract: Decentralized Finance (DeFi), in which digital assets are exchanged without trusted intermediaries, has grown rapidly in value in recent years. The global DeFi ecosystem is fragmented into multiple blockchains, fueling the demand for cross-chain commerce. Existing approaches for cross-chain transactions, e.g., bridges and cross-chain deals, achieve atomicity by locking assets in escrow. However, l…

    Submitted 20 June, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

  32. arXiv:2306.09735  [pdf, other]

    cs.CR

    PIEChain -- A Practical Blockchain Interoperability Framework

    Authors: Daniël Reijsbergen, Aung Maw, Jingchi Zhang, Tien Tuan Anh Dinh, Anwitaman Datta

    Abstract: A plethora of different blockchain platforms have emerged in recent years, but many of them operate in silos. As such, there is a need for reliable cross-chain communication to enable blockchain interoperability. Blockchain interoperability is challenging because transactions can typically not be reverted - as such, if one transaction is committed then the protocol must ensure that all related tra…

    Submitted 16 June, 2023; originally announced June 2023.

  33. arXiv:2306.01750  [pdf, other]

    cs.AI cs.HC

    A Survey of Explainable AI and Proposal for a Discipline of Explanation Engineering

    Authors: Clive Gomes, Lalitha Natraj, Shijun Liu, Anushka Datta

    Abstract: In this survey paper, we deep dive into the field of Explainable Artificial Intelligence (XAI). After introducing the scope of this paper, we start by discussing what an "explanation" really is. We then move on to discuss some of the existing approaches to XAI and build a taxonomy of the most popular methods. Next, we also look at a few applications of these and other XAI techniques in four primar…

    Submitted 20 May, 2023; originally announced June 2023.

  34. arXiv:2306.01540  [pdf, other]

    cs.RO

    CLIPGraphs: Multimodal Graph Networks to Infer Object-Room Affinities

    Authors: Ayush Agrawal, Raghav Arora, Ahana Datta, Snehasis Banerjee, Brojeshwar Bhowmick, Krishna Murthy Jatavallabhula, Mohan Sridharan, Madhava Krishna

    Abstract: This paper introduces a novel method for determining the best room to place an object in, for embodied scene rearrangement. While state-of-the-art approaches rely on large language models (LLMs) or reinforcement learned (RL) policies for this task, our approach, CLIPGraphs, efficiently combines commonsense domain knowledge, data-driven methods, and recent advances in multimodal learning. Specifica…

    Submitted 2 June, 2023; originally announced June 2023.

    Journal ref: RO-MAN 2023 Conference

  35. arXiv:2305.18330  [pdf, other]

    cs.IR cs.AI cs.CL

    #REVAL: a semantic evaluation framework for hashtag recommendation

    Authors: Areej Alsini, Du Q. Huynh, Amitava Datta

    Abstract: Automatic evaluation of hashtag recommendation models is a fundamental task in many online social network systems. In the traditional evaluation method, the recommended hashtags from an algorithm are firstly compared with the ground truth hashtags for exact correspondences. The number of exact matches is then used to calculate the hit rate, hit ratio, precision, recall, or F1-score. This way of ev…

    Submitted 24 May, 2023; originally announced May 2023.

    Comments: 18 pages, 4 figures

    ACM Class: I.2.7

  36. arXiv:2305.10625  [pdf, other]

    cs.LG

    Measuring and Mitigating Local Instability in Deep Neural Networks

    Authors: Arghya Datta, Subhrangshu Nandi, Jingcheng Xu, Greg Ver Steeg, He Xie, Anoop Kumar, Aram Galstyan

    Abstract: Deep Neural Networks (DNNs) are becoming integral components of real world services relied upon by millions of users. Unfortunately, architects of these systems can find it difficult to ensure reliable performance as irrelevant details like random initialization can unexpectedly change the outputs of a trained system with potentially disastrous consequences. We formulate the model stability proble…

    Submitted 18 May, 2023; v1 submitted 17 May, 2023; originally announced May 2023.

    Comments: To be published in Findings of the Association for Computational Linguistics (ACL), 2023

  37. arXiv:2305.06178  [pdf]

    cs.RO cs.AI cs.LG

    Sequence-Agnostic Multi-Object Navigation

    Authors: Nandiraju Gireesh, Ayush Agrawal, Ahana Datta, Snehasis Banerjee, Mohan Sridharan, Brojeshwar Bhowmick, Madhava Krishna

    Abstract: The Multi-Object Navigation (MultiON) task requires a robot to localize an instance (each) of multiple object classes. It is a fundamental task for an assistive robot in a home or a factory. Existing methods for MultiON have viewed this as a direct extension of Object Navigation (ON), the task of localising an instance of one object class, and are pre-sequenced, i.e., the sequence in which the obj…

    Submitted 10 May, 2023; originally announced May 2023.

    Journal ref: ICRA 2023 conference

  38. arXiv:2304.09157  [pdf, other]

    stat.ML cs.LG stat.ME

    Neural networks for geospatial data

    Authors: Wentao Zhan, Abhirup Datta

    Abstract: Analysis of geospatial data has traditionally been model-based, with a mean model, customarily specified as a linear regression on the covariates, and a covariance model, encoding the spatial dependence. We relax the strong assumption of linearity and propose embedding neural networks directly within the traditional geostatistical models to accommodate non-linear mean functions while retaining all…

    Submitted 24 May, 2024; v1 submitted 18 April, 2023; originally announced April 2023.

  39. MP-SeizNet: A Multi-Path CNN Bi-LSTM Network for Seizure-Type Classification Using EEG

    Authors: Hezam Albaqami, Ghulam Mubashar Hassan, Amitava Datta

    Abstract: Seizure type identification is essential for the treatment and management of epileptic patients. However, it is a difficult process known to be time consuming and labor intensive. Automated diagnosis systems, with the advancement of machine learning algorithms, have the potential to accelerate the classification process, alert patients, and support physicians in making quick and accurate decisions…

    Submitted 1 March, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

    Journal ref: Biomed. Signal Process. Control. 84 (2023) 104780

  40. arXiv:2208.07665  [pdf, other]

    cs.DC

    QPQ 1DLT: A system for the rapid deployment of secure and efficient EVM-based blockchains

    Authors: Simone Bottoni, Anwitaman Datta, Federico Franzoni, Emanuele Ragnoli, Roberto Ripamonti, Christian Rondanini, Gokhan Sagirlar, Alberto Trombetta

    Abstract: Limited scalability and transaction costs are, among others, some of the critical issues that hamper a wider adoption of distributed ledger technologies (DLT). That is particularly true for the Ethereum blockchain, which, so far, has been the ecosystem with the highest adoption rate. Quite a few solutions, especially on the Ethereum side of things, have been attempted in the last few years. Most o…

    Submitted 16 August, 2022; originally announced August 2022.

  41. arXiv:2206.00192  [pdf, other]

    cs.CL cs.AI

    Order-sensitive Shapley Values for Evaluating Conceptual Soundness of NLP Models

    Authors: Kaiji Lu, Anupam Datta

    Abstract: Previous works show that deep NLP models are not always conceptually sound: they do not always learn the correct linguistic concepts. Specifically, they can be insensitive to word order. In order to systematically evaluate models for their conceptual soundness with respect to word order, we introduce a new explanation method for sequential data: Order-sensitive Shapley Values (OSV). We conduct an…

    Submitted 31 May, 2022; originally announced June 2022.

  42. arXiv:2205.11850  [pdf, other]

    cs.LG cs.AI

    Faithful Explanations for Deep Graph Models

    Authors: Zifan Wang, Yuhang Yao, Chaoran Zhang, Han Zhang, Youjie Kang, Carlee Joe-Wong, Matt Fredrikson, Anupam Datta

    Abstract: This paper studies faithful explanations for Graph Neural Networks (GNNs). First, we provide a new and general method for formally characterizing the faithfulness of explanations for GNNs. It applies to existing explanation methods, including feature attributions and subgraph explanations. Second, our analytical and empirical results demonstrate that feature attribution methods cannot capture the…

    Submitted 24 May, 2022; originally announced May 2022.

  43. arXiv:2205.07870  [pdf, other]

    cs.LG cs.AI

    Unsupervised Driving Behavior Analysis using Representation Learning and Exploiting Group-based Training

    Authors: Soma Bandyopadhyay, Anish Datta, Shruti Sachan, Arpan Pal

    Abstract: Driving behavior monitoring plays a crucial role in managing road safety and decreasing the risk of traffic accidents. Driving behavior is affected by multiple factors like vehicle characteristics, types of roads, traffic, but, most importantly, the pattern of driving of individuals. Current work performs a robust driving pattern analysis by capturing variations in driving patterns. It forms consi…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: 7 figures, 8 pages, 7 tables; accepted and presented at the AAAI 2022 AI for Transportation Workshop (prefinal version)

  44. arXiv:2203.07731  [pdf]

    cs.CL cs.LG

    Evaluating BERT-based Pre-training Language Models for Detecting Misinformation

    Authors: Rini Anggrainingsih, Ghulam Mubashar Hassan, Amitava Datta

    Abstract: It is challenging to control the quality of online information due to the lack of supervision over all the information posted online. Manual checking is almost impossible given the vast number of posts made on online media and how quickly they spread. Therefore, there is a need for automated rumour detection techniques to limit the adverse effects of spreading misinformation. Previous studies main…

    Submitted 15 March, 2022; originally announced March 2022.

    Comments: 17 pages, 2 figures, 10 tables

  45. Wavelet-Based Multi-Class Seizure Type Classification System

    Authors: Hezam Albaqami, Ghulam Mubashar Hassan, Amitava Datta

    Abstract: Epilepsy is one of the most common brain diseases that affect more than 1% of the world's population. It is characterized by recurrent seizures, which come in different types and are treated differently. Electroencephalography (EEG) is commonly used in medical services to diagnose seizures and their types. The accurate identification of seizures helps to provide optimal treatment and accurate inf…

    Submitted 19 February, 2022; originally announced March 2022.

  46. arXiv:2112.06456  [pdf, other]

    cs.CV

    Real Time Action Recognition from Video Footage

    Authors: Tasnim Sakib Apon, Mushfiqul Islam Chowdhury, MD Zubair Reza, Arpita Datta, Syeda Tanjina Hasan, MD. Golam Rabiul Alam

    Abstract: Crime rate is increasing proportionally with the increasing rate of the population. The most prominent approach was to introduce Closed-Circuit Television (CCTV) camera-based surveillance to tackle the issue. Video surveillance cameras have added a new dimension to detect crime. Several research works on autonomous security camera surveillance are currently ongoing, where the fundamental goal is t…

    Submitted 13 December, 2021; originally announced December 2021.

  47. arXiv:2111.11623  [pdf, other]

    cs.SI cs.IT cs.LG

    A Modular Framework for Centrality and Clustering in Complex Networks

    Authors: Frederique Oggier, Silivanxay Phetsouvanh, Anwitaman Datta

    Abstract: The structure of many complex networks includes edge directionality and weights on top of their topology. Network analyses that can seamlessly consider combinations of these properties are desirable. In this paper, we study two important such network analysis techniques, namely, centrality and clustering. An information-flow based model is adopted for clustering, which itself builds upon an informa…

    Submitted 22 November, 2021; originally announced November 2021.

  48. arXiv:2111.00163  [pdf, other]

    cs.DB

    Simpli-Squared: A Very Simple Yet Unexpectedly Powerful Join Ordering Algorithm Without Cardinality Estimates

    Authors: Asoke Datta, Yesdaulet Izenov, Brian Tsan, Florin Rusu

    Abstract: The Join Order Benchmark (JOB) has become the de facto standard to assess the performance of relational database query optimizers due to its complexity and completeness. In order to compute the optimal execution plan -- join order -- existing solutions employ extensive data synopses and correlations -- functional dependencies -- between table attributes. These structures incur significant overhead…

    Submitted 29 October, 2021; originally announced November 2021.

  49. arXiv:2110.03109  [pdf, other]

    cs.LG

    Consistent Counterfactuals for Deep Models

    Authors: Emily Black, Zifan Wang, Matt Fredrikson, Anupam Datta

    Abstract: Counterfactual examples are one of the most commonly-cited methods for explaining the predictions of machine learning models in key areas such as finance and medical diagnosis. Counterfactuals are often discussed under the assumption that the model on which they will be used is static, but in deployment models may be periodically retrained or fine-tuned. This paper studies the consistency of model…

    Submitted 6 October, 2021; originally announced October 2021.

  50. arXiv:2109.02975  [pdf, other]

    cs.LG

    BERT based classification system for detecting rumours on Twitter

    Authors: Rini Anggrainingsih, Ghulam Mubashar Hassan, Amitava Datta

    Abstract: The role of social media in opinion formation has far-reaching implications in all spheres of society. Though social media provide platforms for expressing news and views, it is hard to control the quality of posts due to the sheer volumes of posts on platforms like Twitter and Facebook. Misinformation and rumours have lasting effects on society, as they tend to influence people's opinions and als…

    Submitted 7 September, 2021; originally announced September 2021.

    Comments: 10 pages, 5 figures, and 8 tables; submitted to IEEE Transactions on Computational and Social Systems (still under review)