Showing 1–50 of 321 results for author: Saha, S

Searching in archive cs.
  1. arXiv:2407.05800  [pdf, other]

    cs.LG cs.AI cs.CV cs.DC

    FedMRL: Data Heterogeneity Aware Federated Multi-agent Deep Reinforcement Learning for Medical Imaging

    Authors: Pranab Sahoo, Ashutosh Tripathi, Sriparna Saha, Samrat Mondal

    Abstract: Despite recent advancements in federated learning (FL) for medical image diagnosis, addressing data heterogeneity among clients remains a significant challenge for practical implementation. A primary hurdle in FL arises from the non-IID nature of data samples across clients, which typically results in a decline in the performance of the aggregated global model. In this study, we introduce FedMRL,…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Accepted to MICCAI 2024

  2. arXiv:2407.02362  [pdf, other]

    cs.AR cs.AI cs.LG

    Fast, Scalable, Energy-Efficient Non-element-wise Matrix Multiplication on FPGA

    Authors: Xuqi Zhu, Huaizhi Zhang, JunKyu Lee, Jiacheng Zhu, Chandrajit Pal, Sangeet Saha, Klaus D. McDonald-Maier, Xiaojun Zhai

    Abstract: Modern Neural Network (NN) architectures heavily rely on vast numbers of multiply-accumulate arithmetic operations, constituting the predominant computational cost. Therefore, this paper proposes a high-throughput, scalable and energy efficient non-element-wise matrix multiplication unit on FPGAs as a basic component of the NNs. We firstly streamline inter-layer and intra-layer redundancies of MAD…

    Submitted 7 July, 2024; v1 submitted 2 July, 2024; originally announced July 2024.

  3. arXiv:2407.00480  [pdf]

    cs.CV

    Development of an interactive GUI using MATLAB for the detection of type and stage of Breast Tumor

    Authors: Poulmi Banerjee, Satadal Saha

    Abstract: Breast cancer is described as one of the most common types of cancer which has been diagnosed mainly in women. When compared in the ratio of male to female, it has been duly found that the prone of having breast cancer is more in females than males. Breast lumps are classified mainly into two groups namely: cancerous and non-cancerous. When we say that the lump in the breast is cancerous, it means…

    Submitted 29 June, 2024; originally announced July 2024.

  4. UltraGelBot: Autonomous Gel Dispenser for Robotic Ultrasound

    Authors: Deepak Raina, Ziming Zhao, Richard Voyles, Juan Wachs, Subir K. Saha, S. H. Chandrashekhara

    Abstract: Telerobotic and Autonomous Robotic Ultrasound Systems (RUS) help alleviate the need for operator-dependability in free-hand ultrasound examinations. However, the state-of-the-art RUSs still rely on a human operator to apply the ultrasound gel. The lack of standardization in this process often leads to poor imaging of the scanned region. The reason for this has to do with air-gaps between the probe…

    Submitted 28 June, 2024; originally announced June 2024.

    Comments: 2024 16th Hamlyn Symposium on Medical Robotics (HSMR)

  5. arXiv:2406.13272  [pdf, other]

    cs.CV

    AniFaceDiff: High-Fidelity Face Reenactment via Facial Parametric Conditioned Diffusion Models

    Authors: Ken Chen, Sachith Seneviratne, Wei Wang, Dongting Hu, Sanjay Saha, Md. Tarek Hasan, Sanka Rasnayaka, Tamasha Malepathirana, Mingming Gong, Saman Halgamuge

    Abstract: Face reenactment refers to the process of transferring the pose and facial expressions from a reference (driving) video onto a static facial (source) image while maintaining the original identity of the source image. Previous research in this domain has made significant progress by training controllable deep generative models to generate faces based on specific identity, pose and expression condit…

    Submitted 19 June, 2024; originally announced June 2024.

  6. arXiv:2406.12931  [pdf, other]

    eess.AS cs.CL cs.SD

    Automatic Speech Recognition for Biomedical Data in Bengali Language

    Authors: Shariar Kabir, Nazmun Nahar, Shyamasree Saha, Mamunur Rashid

    Abstract: This paper presents the development of a prototype Automatic Speech Recognition (ASR) system specifically designed for Bengali biomedical data. Recent advancements in Bengali ASR are encouraging, but a lack of domain-specific data limits the creation of practical healthcare ASR models. This project bridges this gap by developing an ASR system tailored for Bengali medical terms like symptoms, sever…

    Submitted 16 June, 2024; originally announced June 2024.

  7. arXiv:2406.09043  [pdf, other]

    cs.CL cs.AI

    Language Models are Crossword Solvers

    Authors: Soumadeep Saha, Sutanoya Chakraborty, Saptarshi Saha, Utpal Garain

    Abstract: Crosswords are a form of word puzzle that require a solver to demonstrate a high degree of proficiency in natural language understanding, wordplay, reasoning, and world knowledge, along with adherence to character and length constraints. In this paper we tackle the challenge of solving crosswords with Large Language Models (LLMs). We demonstrate that the current generation of state-of-the art (SoT…

    Submitted 14 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

    Comments: Edited to include missing citation

    ACM Class: I.2.7

  8. arXiv:2406.05598  [pdf, other]

    cs.CV

    Understanding Inhibition Through Maximally Tense Images

    Authors: Chris Hamblin, Srijani Saha, Talia Konkle, George Alvarez

    Abstract: We address the functional role of 'feature inhibition' in vision models; that is, what are the mechanisms by which a neural network ensures images do not express a given feature? We observe that standard interpretability tools in the literature are not immediately suited to the inhibitory case, given the asymmetry introduced by the ReLU activation function. Given this, we propose inhibition be und…

    Submitted 8 June, 2024; originally announced June 2024.

  9. arXiv:2406.05344  [pdf, other]

    cs.CL

    MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention

    Authors: Prince Jha, Raghav Jain, Konika Mandal, Aman Chadha, Sriparna Saha, Pushpak Bhattacharyya

    Abstract: In the digital world, memes present a unique challenge for content moderation due to their potential to spread harmful content. Although detection methods have improved, proactive solutions such as intervention are still limited, with current research focusing mostly on text-based content, neglecting the widespread influence of multimodal content like memes. Addressing this gap, we present \textit…

    Submitted 8 June, 2024; originally announced June 2024.

  10. arXiv:2406.03556  [pdf, other]

    cs.CV

    Npix2Cpix: A GAN-based Image-to-Image Translation Network with Retrieval-Classification Integration for Watermark Retrieval from Historical Document Images

    Authors: Utsab Saha, Sawradip Saha, Shaikh Anowarul Fattah, Mohammad Saquib

    Abstract: The identification and restoration of ancient watermarks have long been a major topic in codicology and history. Classifying historical documents based on watermarks can be difficult due to the diversity of watermarks, crowded and noisy samples, multiple modes of representation, and minor distinctions between classes and intra-class changes. This paper proposes a U-net-based conditional generative…

    Submitted 5 June, 2024; originally announced June 2024.

  11. arXiv:2406.02450  [pdf, other]

    cs.LG cs.AI

    A Generalized Apprenticeship Learning Framework for Modeling Heterogeneous Student Pedagogical Strategies

    Authors: Md Mirajul Islam, Xi Yang, John Hostetter, Adittya Soukarjya Saha, Min Chi

    Abstract: A key challenge in e-learning environments like Intelligent Tutoring Systems (ITSs) is to induce effective pedagogical policies efficiently. While Deep Reinforcement Learning (DRL) often suffers from sample inefficiency and reward function design difficulty, Apprenticeship Learning (AL) algorithms can overcome them. However, most AL algorithms cannot handle heterogeneity as they assume all demonst…

    Submitted 4 June, 2024; originally announced June 2024.

  12. arXiv:2405.20628  [pdf, other]

    cs.AI cs.CL cs.CV

    ToxVidLM: A Multimodal Framework for Toxicity Detection in Code-Mixed Videos

    Authors: Krishanu Maity, A. S. Poornash, Sriparna Saha, Pushpak Bhattacharyya

    Abstract: In an era of rapidly evolving internet technology, the surge in multimodal content, including videos, has expanded the horizons of online communication. However, the detection of toxic content in this diverse landscape, particularly in low-resource code-mixed languages, remains a critical challenge. While substantial research has addressed toxic content detection in textual data, the realm of vide…

    Submitted 14 July, 2024; v1 submitted 31 May, 2024; originally announced May 2024.

    Comments: Accepted as a Long Paper in ACL Findings 2024. For acceptance details, see https://2024.aclweb.org/program/finding_papers/

  13. arXiv:2405.18506  [pdf, other]

    cs.DM

    An Algorithm for the Decomposition of Complete Graph into Minimum Number of Edge-disjoint Trees

    Authors: Antika Sinha, Sanjoy Kumar Saha, Partha Basuchowdhuri

    Abstract: In this work, we study methodical decomposition of an undirected, unweighted complete graph ($K_n$ of order $n$, size $m$) into minimum number of edge-disjoint trees. We find that $x$, a positive integer, is minimum and $x=\lceil\frac{n}{2}\rceil$ as the edge set of $K_n$ is decomposed into edge-disjoint trees of size sequence $M = \{m_1,m_2,...,m_x\}$ where $m_i\le(n-1)$ and $\sum_{i=1}^{x} m_i$ =…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 10 pages, 4 figures and 3 tables

  14. arXiv:2405.15766  [pdf, other]

    cs.AI cs.CL cs.CV

    Enhancing Adverse Drug Event Detection with Multimodal Dataset: Corpus Creation and Model Development

    Authors: Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Aman Chadha, Samrat Mondal

    Abstract: The mining of adverse drug events (ADEs) is pivotal in pharmacovigilance, enhancing patient safety by identifying potential risks associated with medications, facilitating early detection of adverse events, and guiding regulatory decision-making. Traditional ADE detection methods are reliable but slow, not easily adaptable to large-scale operations, and offer limited information. With the exponent…

    Submitted 26 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: ACL Findings 2024

  15. arXiv:2405.11573  [pdf, other]

    cs.LG

    Quantile Activation: departing from single point estimation for better generalization across distortions

    Authors: Aditya Challa, Sravan Danda, Laurent Najman, Snehanshu Saha

    Abstract: A classifier is, in its essence, a function which takes an input and returns the class of the input and implicitly assumes an underlying distribution. We argue in this article that one has to move away from this basic tenet to obtain generalisation across distributions. Specifically, the class of the sample should depend on the points from its context distribution for better generalisation across…

    Submitted 19 May, 2024; originally announced May 2024.

  16. arXiv:2405.11181  [pdf, other]

    cs.AI cs.CL

    Towards Knowledge-Infused Automated Disease Diagnosis Assistant

    Authors: Mohit Tomar, Abhisek Tiwari, Sriparna Saha

    Abstract: With the advancement of internet communication and telemedicine, people are increasingly turning to the web for various healthcare activities. With an ever-increasing number of diseases and symptoms, diagnosing patients becomes challenging. In this work, we build a diagnosis assistant to assist doctors, which identifies diseases based on patient-doctor interaction. During diagnosis, doctors utiliz…

    Submitted 18 May, 2024; originally announced May 2024.

  17. arXiv:2405.09589  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV cs.SD eess.AS

    Unveiling Hallucination in Text, Image, Video, and Audio Foundation Models: A Comprehensive Survey

    Authors: Pranab Sahoo, Prabhash Meharia, Akash Ghosh, Sriparna Saha, Vinija Jain, Aman Chadha

    Abstract: The rapid advancement of foundation models (FMs) across language, image, audio, and video domains has shown remarkable capabilities in diverse tasks. However, the proliferation of FMs brings forth a critical challenge: the potential to generate hallucinated outputs, particularly in high-stakes applications. The tendency of foundation models to produce hallucinated content arguably represents the b…

    Submitted 20 May, 2024; v1 submitted 15 May, 2024; originally announced May 2024.

  18. arXiv:2405.06124  [pdf, other]

    cs.CR

    Demystifying Behavior-Based Malware Detection at Endpoints

    Authors: Yigitcan Kaya, Yizheng Chen, Shoumik Saha, Fabio Pierazzi, Lorenzo Cavallaro, David Wagner, Tudor Dumitras

    Abstract: Machine learning is widely used for malware detection in practice. Prior behavior-based detectors most commonly rely on traces of programs executed in controlled sandboxes. However, sandbox traces are unavailable to the last line of defense offered by security vendors: malware detection at endpoints. A detector at endpoints consumes the traces of programs running on real-world hosts, as sandbox an…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: Behavior-based malware detection with machine learning. 18 pages, 10 figures, 15 tables. Leaderboard: https://malwaredetectioninthewild.github.io

  19. arXiv:2405.05813  [pdf]

    cs.SE

    Revitalising Stagecraft: NLP-Driven Sentiment Analysis for Traditional Theater Revival

    Authors: Saikat Samanta, Saptarshi Karmakar, Satayajay Behuria, Shibam Dutta, Soujit Das, Soumik Saha

    Abstract: This paper explores the application of FilmFrenzy, a python based ticket booking web application, in the revival of traditional Indian theatres. Additionally, this research paper explores how NLP can be implemented to improve user experience. Through clarifying audience views and pinpointing opportunities for development, FilmFrenzy aims to promote involvement and rejuvenation in India's conventio…

    Submitted 9 May, 2024; originally announced May 2024.

  20. arXiv:2405.04610  [pdf, other]

    eess.IV cs.CV

    Exploring Explainable AI Techniques for Improved Interpretability in Lung and Colon Cancer Classification

    Authors: Mukaffi Bin Moin, Fatema Tuj Johora Faria, Swarnajit Saha, Busra Kamal Rafa, Mohammad Shafiul Alam

    Abstract: Lung and colon cancer are serious worldwide health challenges that require early and precise identification to reduce mortality risks. However, diagnosis, which is mostly dependent on histopathologists' competence, presents difficulties and hazards when expertise is insufficient. While diagnostic methods like imaging and blood markers contribute to early detection, histopathology remains the gold…

    Submitted 14 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted in 4th International Conference on Computing and Communication Networks (ICCCNet-2024)

  21. arXiv:2404.19306  [pdf]

    cs.LG cs.AI

    Comprehensive Forecasting-Based Analysis of Hybrid and Stacked Stateful/ Stateless Models

    Authors: Swayamjit Saha

    Abstract: Wind speed is a powerful source of renewable energy, which can be used as an alternative to the non-renewable resources for production of electricity. Renewable sources are clean, infinite and do not impact the environment negatively during production of electrical energy. However, while eliciting electrical energy from renewable resources viz. solar irradiance, wind speed, hydro should require sp…

    Submitted 30 April, 2024; originally announced April 2024.

    Comments: 8 pages, 14 figures

  22. arXiv:2404.18546  [pdf, other]

    cs.IR

    ir_explain: a Python Library of Explainable IR Methods

    Authors: Sourav Saha, Harsh Agarwal, Swastik Mohanty, Mandar Mitra, Debapriyo Majumdar

    Abstract: While recent advancements in Neural Ranking Models have resulted in significant improvements over traditional statistical retrieval models, it is generally acknowledged that the use of large neural architectures and the application of complex language models in Information Retrieval (IR) have reduced the transparency of retrieval methods. Consequently, Explainability and Interpretability have emer…

    Submitted 29 April, 2024; originally announced April 2024.

  23. arXiv:2404.10296  [pdf, other]

    cs.LG cs.AI cs.NE

    Engineering software 2.0 by interpolating neural networks: unifying training, solving, and calibration

    Authors: Chanwook Park, Sourav Saha, Jiachen Guo, Xiaoyu Xie, Satyajit Mojumder, Miguel A. Bessa, Dong Qian, Wei Chen, Gregory J. Wagner, Jian Cao, Wing Kam Liu

    Abstract: The evolution of artificial intelligence (AI) and neural network theories has revolutionized the way software is programmed, shifting from a hard-coded series of codes to a vast neural network. However, this transition in engineering software has faced challenges such as data scarcity, multi-modality of data, low model accuracy, and slow inference. Here, we propose a new network based on interpola…

    Submitted 22 April, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

    Comments: 9 pages, 3 figures

  24. arXiv:2404.07410  [pdf, other]

    cs.CV cs.LG

    Improving Shift Invariance in Convolutional Neural Networks with Translation Invariant Polyphase Sampling

    Authors: Sourajit Saha, Tejas Gokhale

    Abstract: Downsampling operators break the shift invariance of convolutional neural networks (CNNs) and this affects the robustness of features learned by CNNs when dealing with even small pixel-level shift. Through a large-scale correlation analysis framework, we study shift invariance of CNNs by inspecting existing downsampling operators in terms of their maximum-sampling bias (MSB), and find that MSB is…

    Submitted 10 April, 2024; originally announced April 2024.

  25. arXiv:2404.07214  [pdf, other]

    cs.CV cs.AI cs.CL

    Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions

    Authors: Akash Ghosh, Arkadeep Acharya, Sriparna Saha, Vinija Jain, Aman Chadha

    Abstract: The advent of Large Language Models (LLMs) has significantly reshaped the trajectory of the AI revolution. Nevertheless, these LLMs exhibit a notable limitation, as they are primarily adept at processing textual information. To address this constraint, researchers have endeavored to integrate visual capabilities with LLMs, resulting in the emergence of Vision-Language Models (VLMs). These advanced…

    Submitted 12 April, 2024; v1 submitted 20 February, 2024; originally announced April 2024.

    Comments: The most extensive and up to date Survey on Visual Language Models covering 76 Visual Language Models

  26. arXiv:2404.03799  [pdf, other]

    cs.CV cs.AI

    Language-Guided Instance-Aware Domain-Adaptive Panoptic Segmentation

    Authors: Elham Amin Mansour, Ozan Unal, Suman Saha, Benjamin Bejar, Luc Van Gool

    Abstract: The increasing relevance of panoptic segmentation is tied to the advancements in autonomous driving and AR/VR applications. However, the deployment of such models has been limited due to the expensive nature of dense data annotation, giving rise to unsupervised domain adaptation (UDA). A key challenge in panoptic UDA is reducing the domain gap between a labeled source and an unlabeled target domai…

    Submitted 4 April, 2024; originally announced April 2024.

  27. arXiv:2404.00471  [pdf, other]

    physics.med-ph cs.CV cs.LG eess.IV

    Score-Based Diffusion Models for Photoacoustic Tomography Image Reconstruction

    Authors: Sreemanti Dey, Snigdha Saha, Berthy T. Feng, Manxiu Cui, Laure Delisle, Oscar Leong, Lihong V. Wang, Katherine L. Bouman

    Abstract: Photoacoustic tomography (PAT) is a rapidly-evolving medical imaging modality that combines optical absorption contrast with ultrasound imaging depth. One challenge in PAT is image reconstruction with inadequate acoustic signals due to limited sensor coverage or due to the density of the transducer array. Such cases call for solving an ill-posed inverse reconstruction problem. In this work, we use…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 5 pages

    Journal ref: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Korea, Republic of, 2024, pp. 2470-2474

  28. arXiv:2403.14290  [pdf, other]

    cs.SD cs.CV cs.LG eess.AS

    Exploring Green AI for Audio Deepfake Detection

    Authors: Subhajit Saha, Md Sahidullah, Swagatam Das

    Abstract: The state-of-the-art audio deepfake detectors leveraging deep neural networks exhibit impressive recognition performance. Nonetheless, this advantage is accompanied by a significant carbon footprint. This is mainly due to the use of high-performance computing with accelerators and high training time. Studies show that average deep NLP model produces around 626k lbs of CO$_2$ which is…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: This manuscript is under review in a conference

  29. arXiv:2403.00643  [pdf, ps, other]

    cs.DS cs.CC math.NA

    Undercomplete Decomposition of Symmetric Tensors in Linear Time, and Smoothed Analysis of the Condition Number

    Authors: Pascal Koiran, Subhayan Saha

    Abstract: We study symmetric tensor decompositions, i.e., decompositions of the form $T = \sum_{i=1}^r u_i^{\otimes 3}$ where $T$ is a symmetric tensor of order 3 and $u_i \in \mathbb{C}^n$. In order to obtain efficient decomposition algorithms, it is necessary to require additional properties from $u_i$. In this paper we assume that the $u_i$ are linearly independent. This implies $r \leq n$, that is, the de…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 55 pages

    MSC Class: 68W20; 68W40; 65F35; 15A69 ACM Class: F.2.1; G.1.3

  30. arXiv:2402.15570  [pdf, other]

    cs.CR cs.AI cs.CL

    Fast Adversarial Attacks on Language Models In One GPU Minute

    Authors: Vinu Sankar Sadasivan, Shoumik Saha, Gaurang Sriramanan, Priyatham Kattakinda, Atoosa Chegini, Soheil Feizi

    Abstract: In this paper, we introduce a novel class of fast, beam search-based adversarial attack (BEAST) for Language Models (LMs). BEAST employs interpretable parameters, enabling attackers to balance between attack speed, success rate, and the readability of adversarial prompts. The computational efficiency of BEAST facilitates us to investigate its applications on LMs for jailbreaking, eliciting halluci…

    Submitted 23 February, 2024; originally announced February 2024.

  31. arXiv:2402.10453  [pdf, other]

    cs.CL

    Steering Conversational Large Language Models for Long Emotional Support Conversations

    Authors: Navid Madani, Sougata Saha, Rohini Srihari

    Abstract: In this study, we address the challenge of consistently following emotional support strategies in long conversations by large language models (LLMs). We introduce the Strategy-Relevant Attention (SRA) metric, a model-agnostic measure designed to evaluate the effectiveness of LLMs in adhering to strategic prompts in emotional support contexts. By analyzing conversations within the Emotional Support…

    Submitted 16 February, 2024; originally announced February 2024.

  32. arXiv:2402.10039  [pdf, other]

    cs.CV

    Feature Accentuation: Revealing 'What' Features Respond to in Natural Images

    Authors: Chris Hamblin, Thomas Fel, Srijani Saha, Talia Konkle, George Alvarez

    Abstract: Efforts to decode neural network vision models necessitate a comprehensive grasp of both the spatial and semantic facets governing feature responses within images. Most research has primarily centered around attribution methods, which provide explanations in the form of heatmaps, showing where the model directs its attention for a given feature. However, grasping 'where' alone falls short, as nume…

    Submitted 8 June, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

  33. arXiv:2402.07927  [pdf, other]

    cs.AI cs.CL cs.HC

    A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications

    Authors: Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Vinija Jain, Samrat Mondal, Aman Chadha

    Abstract: Prompt engineering has emerged as an indispensable technique for extending the capabilities of large language models (LLMs) and vision-language models (VLMs). This approach leverages task-specific instructions, known as prompts, to enhance model efficacy without modifying the core model parameters. Rather than updating the model parameters, prompts allow seamless integration of pre-trained models…

    Submitted 5 February, 2024; originally announced February 2024.

    Comments: 9 pages, 2 figures

  34. arXiv:2402.07281  [pdf, other]

    cs.LG

    Can Tree Based Approaches Surpass Deep Learning in Anomaly Detection? A Benchmarking Study

    Authors: Santonu Sarkar, Shanay Mehta, Nicole Fernandes, Jyotirmoy Sarkar, Snehanshu Saha

    Abstract: Detection of anomalous situations for complex mission-critical systems holds paramount importance when their service continuity needs to be ensured. A major challenge in detecting anomalies from the operational data arises due to the imbalanced class distribution problem since the anomalies are supposed to be rare events. This paper evaluates a diverse array of machine learning-based anomaly detec…

    Submitted 25 February, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

  35. arXiv:2402.05025  [pdf, other]

    cs.LG

    Strong convexity-guided hyper-parameter optimization for flatter losses

    Authors: Rahul Yedida, Snehanshu Saha

    Abstract: We propose a novel white-box approach to hyper-parameter optimization. Motivated by recent work establishing a relationship between flat minima and generalization, we first establish a relationship between the strong convexity of the loss and its flatness. Based on this, we seek to find hyper-parameter configurations that improve flatness by minimizing the strong convexity of the loss. By using th…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: v1

  36. arXiv:2402.01620  [pdf, other]

    cs.CL

    MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models

    Authors: Justin Chih-Yao Chen, Swarnadeep Saha, Elias Stengel-Eskin, Mohit Bansal

    Abstract: Multi-agent interactions between Large Language Model (LLM) agents have shown major improvements on diverse reasoning tasks. However, these involve long generations from multiple models across several rounds, making them expensive. Moreover, these multi-agent approaches fail to provide a final, single model for efficient inference. To address this, we introduce MAGDi, a new method for structured d…

    Submitted 7 June, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: ICML 2024 (Camera-ready); First two authors contributed equally; GitHub: https://github.com/dinobby/MAGDi

  37. arXiv:2401.17029  [pdf, other]

    astro-ph.CO astro-ph.IM cs.LG

    LADDER: Revisiting the Cosmic Distance Ladder with Deep Learning Approaches and Exploring its Applications

    Authors: Rahul Shah, Soumadeep Saha, Purba Mukherjee, Utpal Garain, Supratik Pal

    Abstract: We investigate the prospect of reconstructing the "cosmic distance ladder" of the Universe using a novel deep learning framework called LADDER - Learning Algorithm for Deep Distance Estimation and Reconstruction. LADDER is trained on the apparent magnitude data from the Pantheon Type Ia supernovae compilation, incorporating the full covariance information among data points, to produce prediction…

    Submitted 18 July, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

    Comments: 13 pages, 6 sets of figures, 5 tables. To appear in the Astrophys. J. Suppl. Ser. Code available at https://github.com/rahulshah1397/LADDER

  38. arXiv:2401.16559  [pdf, other]

    cs.CV

    IEEE BigData 2023 Keystroke Verification Challenge (KVC)

    Authors: Giuseppe Stragapede, Ruben Vera-Rodriguez, Ruben Tolosana, Aythami Morales, Ivan DeAndres-Tame, Naser Damer, Julian Fierrez, Javier-Ortega Garcia, Nahuel Gonzalez, Andrei Shadrikov, Dmitrii Gordin, Leon Schmitt, Daniel Wimmer, Christoph Grossmann, Joerdis Krieger, Florian Heinz, Ron Krestel, Christoffer Mayer, Simon Haberl, Helena Gschrey, Yosuke Yamagishi, Sanjay Saha, Sanka Rasnayaka, Sandareka Wickramanayake, Terence Sim, et al. (4 additional authors not shown)

    Abstract: This paper describes the results of the IEEE BigData 2023 Keystroke Verification Challenge (KVC), that considers the biometric verification performance of Keystroke Dynamics (KD), captured as tweet-long sequences of variable transcript text from over 185,000 subjects. The data are obtained from two of the largest public databases of KD up to date, the Aalto Desktop and Mobile Keystroke Databases,…

    Submitted 29 January, 2024; originally announced January 2024.

    Comments: 9 pages, 10 pages, 2 figures. arXiv admin note: text overlap with arXiv:2311.06000

  39. arXiv:2401.14098  [pdf, other]

    cs.CR

    Carry Your Fault: A Fault Propagation Attack on Side-Channel Protected LWE-based KEM

    Authors: Suparna Kundu, Siddhartha Chowdhury, Sayandeep Saha, Angshuman Karmakar, Debdeep Mukhopadhyay, Ingrid Verbauwhede

    Abstract: Post-quantum cryptographic (PQC) algorithms, especially those based on the learning with errors (LWE) problem, have been subjected to several physical attacks in the recent past. Although the attacks broadly belong to two classes - passive side-channel attacks and active fault attacks, the attack strategies vary significantly due to the inherent complexities of such algorithms. Exploring further a…

    Submitted 25 January, 2024; originally announced January 2024.

    ACM Class: E.3.3

  40. arXiv:2401.09899  [pdf, other]

    cs.CL

    Meme-ingful Analysis: Enhanced Understanding of Cyberbullying in Memes Through Multimodal Explanations

    Authors: Prince Jha, Krishanu Maity, Raghav Jain, Apoorv Verma, Sriparna Saha, Pushpak Bhattacharyya

    Abstract: Internet memes have gained significant influence in communicating political, psychological, and sociocultural ideas. While memes are often humorous, there has been a rise in the use of memes for trolling and cyberbullying. Although a wide variety of effective deep learning-based models have been developed for detecting offensive multimodal memes, only a few works have been done on explainability a…

    Submitted 18 January, 2024; originally announced January 2024.

    Comments: EACL2024

  41. arXiv:2401.09023  [pdf, other]

    cs.CL

    Explain Thyself Bully: Sentiment Aided Cyberbullying Detection with Explanation

    Authors: Krishanu Maity, Prince Jha, Raghav Jain, Sriparna Saha, Pushpak Bhattacharyya

    Abstract: Cyberbullying has become a big issue with the popularity of different social media networks and online communication apps. While plenty of research is going on to develop better models for cyberbullying detection in monolingual language, there is very little research on the code-mixed languages and explainability aspect of cyberbullying. Recent laws like "right to explanations" of General Data Pro…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: ICDAR 2023

  42. arXiv:2401.07810  [pdf, other]

    cs.CL cs.AI

    Consolidating Strategies for Countering Hate Speech Using Persuasive Dialogues

    Authors: Sougata Saha, Rohini Srihari

    Abstract: Hateful comments are prevalent on social media platforms. Although tools for automatically detecting, flagging, and blocking such false, offensive, and harmful content online have lately matured, such reactive and brute force methods alone provide short-term and superficial remedies while the perpetrators persist. With the public availability of large language models which can generate articulate…

    Submitted 15 January, 2024; originally announced January 2024.

  43. arXiv:2401.06807  [pdf, other]

    cs.CL cs.AI

    An EcoSage Assistant: Towards Building A Multimodal Plant Care Dialogue Assistant

    Authors: Mohit Tomar, Abhisek Tiwari, Tulika Saha, Prince Jha, Sriparna Saha

    Abstract: In recent times, there has been an increasing awareness about imminent environmental challenges, resulting in people showing a stronger dedication to taking care of the environment and nurturing green life. The current $19.6 billion indoor gardening industry, reflective of this growing sentiment, not only signifies a monetary value but also speaks of a profound human desire to reconnect with the n…

    Submitted 10 January, 2024; originally announced January 2024.

  44. arXiv:2401.05134  [pdf, other]

    cs.AI cs.CL

    Yes, this is what I was looking for! Towards Multi-modal Medical Consultation Concern Summary Generation

    Authors: Abhisek Tiwari, Shreyangshu Bera, Sriparna Saha, Pushpak Bhattacharyya, Samrat Ghosh

    Abstract: Over the past few years, the use of the Internet for healthcare-related tasks has grown by leaps and bounds, posing a challenge in effectively managing and processing information to ensure its efficient utilization. During moments of emotional turmoil and psychological challenges, we frequently turn to the internet as our initial source of support, choosing this over discussing our feelings with o…

    Submitted 10 January, 2024; originally announced January 2024.

  45. arXiv:2401.03005  [pdf, other]

    physics.soc-ph cs.CV

    Evolution of urban areas and land surface temperature

    Authors: Sudipan Saha, Tushar Verma, Dario Augusto Borges Oliveira

    Abstract: With the global population on the rise, our cities have been expanding to accommodate the growing number of people. The expansion of cities generally leads to the engulfment of peripheral areas. However, such expansion of urban areas is likely to cause increment in areas with increased land surface temperature (LST). By considering each summer as a data point, we form LST multi-year time-series an…

    Submitted 5 January, 2024; originally announced January 2024.

  46. arXiv:2401.01596  [pdf, other]

    cs.AI cs.CL

    MedSumm: A Multimodal Approach to Summarizing Code-Mixed Hindi-English Clinical Queries

    Authors: Akash Ghosh, Arkadeep Acharya, Prince Jha, Aniket Gaudgaul, Rajdeep Majumdar, Sriparna Saha, Aman Chadha, Raghav Jain, Setu Sinha, Shivani Agarwal

    Abstract: In the healthcare domain, summarizing medical questions posed by patients is critical for improving doctor-patient interactions and medical decision-making. Although medical data has grown in complexity and quantity, the current body of research in this domain has primarily concentrated on text-based methods, overlooking the integration of visual cues. Also prior works in the area of medical quest…

    Submitted 3 January, 2024; originally announced January 2024.

    Comments: ECIR 2024

  47. arXiv:2312.12624  [pdf, other]

    cs.CL

    Building a Llama2-finetuned LLM for Odia Language Utilizing Domain Knowledge Instruction Set

    Authors: Guneet Singh Kohli, Shantipriya Parida, Sambit Sekhar, Samirit Saha, Nipun B Nair, Parul Agarwal, Sonal Khosla, Kusumlata Patiyal, Debasish Dhal

    Abstract: Building LLMs for languages other than English is in great demand due to the unavailability and performance of multilingual LLMs, such as understanding the local context. The problem is critical for low-resource languages due to the need for instruction sets. In a multilingual country like India, there is a need for LLMs supporting Indic languages to provide generative AI and LLM-based technologie…

    Submitted 19 December, 2023; originally announced December 2023.

  48. arXiv:2312.11541  [pdf, other]

    cs.AI cs.CL

    CLIPSyntel: CLIP and LLM Synergy for Multimodal Question Summarization in Healthcare

    Authors: Akash Ghosh, Arkadeep Acharya, Raghav Jain, Sriparna Saha, Aman Chadha, Setu Sinha

    Abstract: In the era of modern healthcare, swiftly generating medical question summaries is crucial for informed and timely patient care. Despite the increasing complexity and volume of medical data, existing studies have focused solely on text-based summarization, neglecting the integration of visual information. Recognizing the untapped potential of combining textual queries with visual representations of…

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: AAAI 2024

  49. arXiv:2312.10933  [pdf, other]

    cs.CV cs.AI cs.LG

    SeeBel: Seeing is Believing

    Authors: Sourajit Saha, Shubhashis Roy Dipta

    Abstract: Semantic Segmentation is a significant research field in Computer Vision. Despite being a widely studied subject area, many visualization tools do not exist that capture segmentation quality and dataset statistics such as a class imbalance in the same view. While the significance of discovering and introspecting the correlation between dataset statistics and AI model performance for dense predicti…

    Submitted 18 December, 2023; originally announced December 2023.

    Comments: PrePrint

  50. arXiv:2312.09407  [pdf, other]

    cs.HC

    How Does User Behavior Evolve During Exploratory Visual Analysis?

    Authors: Sanad Saha, Nischal Aryal, Leilani Battle, Arash Termehchy

    Abstract: Exploratory visual analysis (EVA) is an essential stage of the data science pipeline, where users often lack clear analysis goals at the start and iteratively refine them as they learn more about their data. Accurate models of users' exploration behavior are becoming increasingly vital to developing responsive and personalized tools for exploratory visual analysis. Yet we observe a discrepancy bet…

    Submitted 14 December, 2023; originally announced December 2023.