
Showing 1–50 of 897 results for author: Singh, A

Searching in archive cs.
  1. arXiv:2407.13522  [pdf, other]

    cs.LG

    INDIC QA BENCHMARK: A Multilingual Benchmark to Evaluate Question Answering capability of LLMs for Indic Languages

    Authors: Abhishek Kumar Singh, Rudra Murthy, Vishwajeet kumar, Jaydeep Sen, Ganesh Ramakrishnan

    Abstract: Large Language Models (LLMs) have demonstrated remarkable zero-shot and few-shot capabilities in unseen tasks, including context-grounded question answering (QA) in English. However, the evaluation of LLMs' capabilities in non-English languages for context-based QA is limited by the scarcity of benchmarks in non-English languages. To address this gap, we introduce Indic-QA, the largest publicly av…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.10452  [pdf, other]

    cs.LG cs.AI

    GraphPrint: Extracting Features from 3D Protein Structure for Drug Target Affinity Prediction

    Authors: Amritpal Singh

    Abstract: Accurate drug target affinity prediction can improve drug candidate selection, accelerate the drug discovery process, and reduce drug production costs. Previous work focused on traditional fingerprints or used features extracted based on the amino acid sequence in the protein, ignoring its 3D structure which affects its binding affinity. In this work, we propose GraphPrint: a framework for incorpo…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted: The NeurIPS 2023 Workshop on New Frontiers of AI for Drug Discovery and Development (AI4D3 2023), New Orleans, LA, USA, 2023

  3. arXiv:2407.08989  [pdf, other]

    cs.CL cs.AI

    Robustness of LLMs to Perturbations in Text

    Authors: Ayush Singh, Navpreet Singh, Shubham Vatsal

    Abstract: Having a clean dataset has been the foundational assumption of most natural language processing (NLP) systems. However, properly written text is rarely found in real-world scenarios and hence, oftentimes invalidates the aforementioned foundational assumption. Recently, large language models (LLMs) have shown impressive performance, but can they handle the inevitable noise in real-world data? This…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: 8 pages, 1 figure, 6 tables, updated with results also from GPT-4, LLaMa-3

    ACM Class: I.7; I.2.7; I.2.4

  4. arXiv:2407.08888  [pdf, other]

    cs.LG

    Uncovering Semantics and Topics Utilized by Threat Actors to Deliver Malicious Attachments and URLs

    Authors: Andrey Yakymovych, Abhishek Singh

    Abstract: Recent threat reports highlight that email remains the top vector for delivering malware to endpoints. Despite these statistics, detecting malicious email attachments and URLs often neglects semantic cues, linguistic features, and contextual clues. Our study employs BERTopic unsupervised topic modeling to identify common semantics and themes embedded in email to deliver malicious attachments and cal…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 6 Pages, 7 Figures

  5. arXiv:2407.04087  [pdf, other]

    cs.NE cs.AI

    Advanced Artificial Intelligence Strategy for Optimizing Urban Rail Network Design using Nature-Inspired Algorithms

    Authors: Hariram Sampath Kumar, Archana Singh, Manish Kumar Ojha

    Abstract: This study introduces an innovative methodology for the planning of metro network routes within the urban environment of Chennai, Tamil Nadu, India. A comparative analysis of the modified Ant Colony Optimization (ACO) method (previously developed) with recent breakthroughs in nature-inspired algorithms demonstrates the modified ACO's superiority over modern techniques. By utilizing the modified AC…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 10 pages, 17 figures

  6. arXiv:2407.00434  [pdf, other]

    cs.CL

    Brevity is the soul of wit: Pruning long files for code generation

    Authors: Aaditya K. Singh, Yu Yang, Kushal Tirumala, Mostafa Elhoushi, Ari S. Morcos

    Abstract: Data curation is commonly considered a "secret-sauce" for LLM training, with higher quality data usually leading to better LLM performance. Given the scale of internet-scraped corpora, data pruning has become a larger and larger focus. Specifically, many have shown that de-duplicating data, or sub-selecting higher quality data, can lead to efficiency or performance improvements. Generally, three t…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 15 pages, 5 figures

  7. arXiv:2406.20005  [pdf, other]

    eess.IV cs.CV

    Malaria Cell Detection Using Deep Neural Networks

    Authors: Saurabh Sawant, Anurag Singh

    Abstract: Malaria remains one of the most pressing public health concerns globally, causing significant morbidity and mortality, especially in sub-Saharan Africa. Rapid and accurate diagnosis is crucial for effective treatment and disease management. Traditional diagnostic methods, such as microscopic examination of blood smears, are labor-intensive and require significant expertise, which may not be readil…

    Submitted 28 June, 2024; originally announced June 2024.

  8. arXiv:2406.17720  [pdf, other]

    cs.CV

    Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity

    Authors: Chih-Hsuan Yang, Benjamin Feuer, Zaki Jubery, Zi K. Deng, Andre Nakkab, Md Zahid Hasan, Shivani Chiranjeevi, Kelly Marshall, Nirmal Baishnab, Asheesh K Singh, Arti Singh, Soumik Sarkar, Nirav Merchant, Chinmay Hegde, Baskar Ganapathysubramanian

    Abstract: We introduce Arboretum, the largest publicly accessible dataset designed to advance AI for biodiversity applications. This dataset, curated from the iNaturalist community science platform and vetted by domain experts to ensure accuracy, includes 134.6 million images, surpassing existing datasets in scale by an order of magnitude. The dataset encompasses image-language paired data for a diverse set…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: Preprint under review

  9. arXiv:2406.17339  [pdf, other]

    cs.IT eess.SP

    Optimizing Configuration Selection in Reconfigurable-Antenna MIMO Systems: Physics-Inspired Heuristic Solvers

    Authors: I. Krikidis, C. Psomas, A. K. Singh, K. Jamieson

    Abstract: Reconfigurable antenna multiple-input multiple-output (MIMO) is a foundational technology for the continuing evolution of cellular systems, including upcoming 6G communication systems. In this paper, we address the problem of flexible/reconfigurable antenna configuration selection for point-to-point MIMO antenna systems by using physics-inspired heuristics. Firstly, we optimize the antenna configu…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2403.12571

    Journal ref: IEEE Transactions on Communications, 2004

  10. arXiv:2406.16176  [pdf, other]

    cs.AI cs.CL cs.LG

    GraphEval2000: Benchmarking and Improving Large Language Models on Graph Datasets

    Authors: Qiming Wu, Zichen Chen, Will Corcoran, Misha Sra, Ambuj K. Singh

    Abstract: Large language models (LLMs) have achieved remarkable success in natural language processing (NLP), demonstrating significant capabilities in processing and understanding text data. However, recent studies have identified limitations in LLMs' ability to reason about graph-structured data. To address this gap, we introduce GraphEval2000, the first comprehensive graph dataset, comprising 40 graph da…

    Submitted 23 June, 2024; originally announced June 2024.

    Comments: Submitted to NeurIPS 2024 Dataset and Benchmark track, under review

    MSC Class: H.2.8; I.2.6; I.2.7

  11. arXiv:2406.15335  [pdf, other]

    cs.CV cs.CY

    Keystroke Dynamics Against Academic Dishonesty in the Age of LLMs

    Authors: Debnath Kundu, Atharva Mehta, Rajesh Kumar, Naman Lal, Avinash Anand, Apoorv Singh, Rajiv Ratn Shah

    Abstract: The transition to online examinations and assignments raises significant concerns about academic integrity. Traditional plagiarism detection systems often struggle to identify instances of intelligent cheating, particularly when students utilize advanced generative AI tools to craft their responses. This study proposes a keystroke dynamics-based method to differentiate between bona fide and assist…

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: Accepted for publication at The IEEE International Joint Conference on Biometrics (IJCB2024), contains 9 pages, 3 figures, 3 tables

    ACM Class: I.5.4

  12. arXiv:2406.14639  [pdf, other]

    cs.RO

    Differentiable-Optimization Based Neural Policy for Occlusion-Aware Target Tracking

    Authors: Houman Masnavi, Arun Kumar Singh, Farrokh Janabi-Sharifi

    Abstract: Tracking a target in cluttered and dynamic environments is challenging but forms a core component in applications like aerial cinematography. The obstacles in the environment not only pose collision risk but can also occlude the target from the field-of-view of the robot. Moreover, the target future trajectory may be unknown and only its current state can be estimated. In this paper, we propose a…

    Submitted 20 June, 2024; originally announced June 2024.

  13. arXiv:2406.14008  [pdf, other]

    cs.AR

    AMC: Access to Miss Correlation Prefetcher for Evolving Graph Analytics

    Authors: Abhishek Singh, Christian Schulte, Xiaochen Guo

    Abstract: Modern memory hierarchies work well with applications that have good spatial locality. Evolving (dynamic) graphs are important applications widely used to model graphs and networks with edge and vertex changes. They exhibit irregular memory access patterns and suffer from a high miss ratio and long miss penalty. Prefetching can be employed to predict and fetch future demand misses. However, curren…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 14 pages, 16 figures

    ACM Class: C.1.1

  14. arXiv:2406.13869  [pdf, other]

    cs.LG q-bio.BM

    Global Human-guided Counterfactual Explanations for Molecular Properties via Reinforcement Learning

    Authors: Danqing Wang, Antonis Antoniades, Kha-Dinh Luong, Edwin Zhang, Mert Kosan, Jiachen Li, Ambuj Singh, William Yang Wang, Lei Li

    Abstract: Counterfactual explanations of Graph Neural Networks (GNNs) offer a powerful way to understand data that can naturally be represented by a graph structure. Furthermore, in many domains, it is highly desirable to derive data-driven global explanations or rules that can better explain the high-level properties of the models and data in question. However, evaluating global counterfactual explanations…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted by KDD 2024

  15. arXiv:2406.13081  [pdf, other]

    cs.CV

    Class-specific Data Augmentation for Plant Stress Classification

    Authors: Nasla Saleem, Aditya Balu, Talukder Zaki Jubery, Arti Singh, Asheesh K. Singh, Soumik Sarkar, Baskar Ganapathysubramanian

    Abstract: Data augmentation is a powerful tool for improving deep learning-based image classifiers for plant stress identification and classification. However, selecting an effective set of augmentations from a large pool of candidates remains a key challenge, particularly in imbalanced and confounding datasets. We propose an approach for automated class-specific data augmentation using a genetic algorithm.…

    Submitted 18 June, 2024; originally announced June 2024.

  16. arXiv:2406.10229  [pdf, other]

    cs.LG cs.AI

    Quantifying Variance in Evaluation Benchmarks

    Authors: Lovish Madaan, Aaditya K. Singh, Rylan Schaeffer, Andrew Poulton, Sanmi Koyejo, Pontus Stenetorp, Sharan Narang, Dieuwke Hupkes

    Abstract: Evaluation benchmarks are the cornerstone of measuring capabilities of large language models (LLMs), as well as driving progress in said capabilities. Originally designed to make claims about capabilities (or lack thereof) in fully pretrained models, evaluation benchmarks are now also extensively used to decide between various training choices. Despite this widespread usage, we rarely quantify the…

    Submitted 14 June, 2024; originally announced June 2024.

  17. arXiv:2406.09661  [pdf, other]

    cs.LO cs.AI eess.SY

    Temporal Planning via Interval Logic Satisfiability for Autonomous Systems

    Authors: Miquel Ramirez, Anubhav Singh, Peter Stuckey, Chris Manzie

    Abstract: Many automated planning methods and formulations rely on suitably designed abstractions or simplifications of the constrained dynamics associated with agents to attain computational scalability. We consider formulations of temporal planning where intervals are associated with both action and fluent atoms, and relations between these are given as sentences in Allen's Interval Logic. We propose a no…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: This publication is an extended version of a manuscript submitted to ICAPS-24 (and rejected). Please contact the first author for queries, comments or discussion of the paper

  18. arXiv:2406.07521  [pdf, other]

    cs.DS cs.LG

    Faster Spectral Density Estimation and Sparsification in the Nuclear Norm

    Authors: Yujia Jin, Ishani Karmarkar, Christopher Musco, Aaron Sidford, Apoorv Vikram Singh

    Abstract: We consider the problem of estimating the spectral density of the normalized adjacency matrix of an $n$-node undirected graph. We provide a randomized algorithm that, with $O(nε^{-2})$ queries to a degree and neighbor oracle and in $O(nε^{-3})$ time, estimates the spectrum up to $ε$ accuracy in the Wasserstein-1 metric. This improves on previous state-of-the-art methods, including an $O(nε^{-7})$…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2024

  19. arXiv:2406.07253  [pdf, other]

    cs.LG

    Hybrid Reinforcement Learning from Offline Observation Alone

    Authors: Yuda Song, J. Andrew Bagnell, Aarti Singh

    Abstract: We consider the hybrid reinforcement learning setting where the agent has access to both offline data and online interactive access. While Reinforcement Learning (RL) research typically assumes offline data contains complete action, reward and transition information, datasets with only state information (also known as observation-only datasets) are more general, abundant and practical. This motiva…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 34 pages, 7 figures, published at ICML 2024

  20. arXiv:2406.06739  [pdf, other]

    cs.CL cs.IR cs.LG

    Scaling the Vocabulary of Non-autoregressive Models for Efficient Generative Retrieval

    Authors: Ravisri Valluri, Akash Kumar Mohankumar, Kushal Dave, Amit Singh, Jian Jiao, Manik Varma, Gaurav Sinha

    Abstract: Generative Retrieval introduces a new approach to Information Retrieval by reframing it as a constrained generation task, leveraging recent advancements in Autoregressive (AR) language models. However, AR-based Generative Retrieval methods suffer from high inference latency and cost compared to traditional dense retrieval techniques, limiting their practical applicability. This paper investigates…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 14 pages, 6 tables, 2 figures

  21. arXiv:2406.03994  [pdf, other]

    cs.HC

    Exploring Topic Modelling of User Reviews as a Monitoring Mechanism for Emergent Issues Within Social VR Communities

    Authors: Angelo Singh, Joseph O'Hagan

    Abstract: Users of social virtual reality (VR) platforms often use user reviews to document incidents of witnessed and/or experienced user harassment. However, at present, research has yet to explore utilising this data as a monitoring mechanism to identify emergent issues within social VR communities. Such a system would be of much benefit to developers and researchers as it would enable the automatic i…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures, 1 table

  22. arXiv:2406.03893  [pdf, other]

    cs.CL

    How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages?

    Authors: Anushka Singh, Ananya B. Sai, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Mitesh M Khapra

    Abstract: While machine translation evaluation has been studied primarily for high-resource languages, there has been a recent interest in evaluation for low-resource languages due to the increasing availability of data and models. In this paper, we focus on a zero-shot evaluation setting for low-resource Indian languages, namely Assamese, Kannada, Maithili, and Punjabi. We collect sufficient Multi-…

    Submitted 6 June, 2024; originally announced June 2024.

  23. arXiv:2406.02290  [pdf, other]

    cs.LG

    A Study of Optimizations for Fine-tuning Large Language Models

    Authors: Arjun Singh, Nikhil Pandey, Anup Shirgaonkar, Pavan Manoj, Vijay Aski

    Abstract: Fine-tuning large language models is a popular choice among users trying to adapt them for specific applications. However, fine-tuning these models is a demanding task because the user has to examine several factors, such as resource budget, runtime, model size and context length among others. A specific challenge is that fine-tuning is memory intensive, imposing constraints on the required hardwa…

    Submitted 6 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: 10 pages, 4 figures. Revised text for clarity, updated references

  24. arXiv:2406.01462  [pdf, other]

    cs.LG cs.AI cs.CL

    The Importance of Online Data: Understanding Preference Fine-tuning via Coverage

    Authors: Yuda Song, Gokul Swamy, Aarti Singh, J. Andrew Bagnell, Wen Sun

    Abstract: Learning from human preference data has emerged as the dominant paradigm for fine-tuning large language models (LLMs). The two most common families of techniques -- online reinforcement learning (RL) such as Proximal Policy Optimization (PPO) and offline contrastive methods such as Direct Preference Optimization (DPO) -- were positioned as equivalent in prior work due to the fact that both have to…

    Submitted 16 July, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  25. arXiv:2406.00038  [pdf, ps, other]

    cs.CL cs.AI

    ViSpeR: Multilingual Audio-Visual Speech Recognition

    Authors: Sanath Narayan, Yasser Abdelaziz Dahou Djilali, Ankit Singh, Eustache Le Bihan, Hakim Hacid

    Abstract: This work presents an extensive and detailed study on Audio-Visual Speech Recognition (AVSR) for five widely spoken languages: Chinese, Spanish, English, Arabic, and French. We have collected large-scale datasets for each language except for English, and have engaged in the training of supervised learning models. Our model, ViSpeR, is trained in a multi-lingual setting, resulting in competitive pe…

    Submitted 27 May, 2024; originally announced June 2024.

  26. arXiv:2405.18682  [pdf, other]

    cs.CL cs.AI cs.LG

    Can GPT Redefine Medical Understanding? Evaluating GPT on Biomedical Machine Reading Comprehension

    Authors: Shubham Vatsal, Ayush Singh

    Abstract: Large language models (LLMs) have shown remarkable performance on many tasks in different domains. However, their performance in closed-book biomedical machine reading comprehension (MRC) has not been evaluated in depth. In this work, we evaluate GPT on four closed-book biomedical MRC benchmarks. We experiment with different conventional prompting techniques as well as introduce our own novel prom…

    Submitted 28 May, 2024; originally announced May 2024.

  27. arXiv:2405.17700  [pdf, other]

    cs.GT cs.LG

    Learning Social Welfare Functions

    Authors: Kanad Shrikar Pardeshi, Itai Shapira, Ariel D. Procaccia, Aarti Singh

    Abstract: Is it possible to understand or imitate a policy maker's rationale by looking at past decisions they made? We formalize this question as the problem of learning social welfare functions belonging to the well-studied family of power mean functions. We focus on two learning tasks; in the first, the input is vectors of utilities of an action (decision or policy) for individuals in a group and their a…

    Submitted 27 May, 2024; originally announced May 2024.

  28. arXiv:2405.15777  [pdf, other]

    cs.RO

    Multi-agent Collaborative Perception for Robotic Fleet: A Systematic Review

    Authors: Apoorv Singh, Gaurav Raut, Alka Choudhary

    Abstract: Collaborative perception in multi-robot fleets is a way to incorporate the power of unity in robotic fleets. Collaborative perception refers to the collective ability of multiple entities or agents to share and integrate their sensory information for a more comprehensive understanding of their environment. In other words, it involves the collaboration and fusion of data from various sensors or sou…

    Submitted 22 March, 2024; originally announced May 2024.

    Comments: 10 pages, 4 figures, 3 tables

  29. arXiv:2405.15766  [pdf, other]

    cs.AI cs.CL cs.CV

    Enhancing Adverse Drug Event Detection with Multimodal Dataset: Corpus Creation and Model Development

    Authors: Pranab Sahoo, Ayush Kumar Singh, Sriparna Saha, Aman Chadha, Samrat Mondal

    Abstract: The mining of adverse drug events (ADEs) is pivotal in pharmacovigilance, enhancing patient safety by identifying potential risks associated with medications, facilitating early detection of adverse events, and guiding regulatory decision-making. Traditional ADE detection methods are reliable but slow, not easily adaptable to large-scale operations, and offer limited information. With the exponent…

    Submitted 26 May, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: ACL Findings 2024

  30. arXiv:2405.15468  [pdf, other]

    cs.CV cs.GR

    Semantic Aware Diffusion Inverse Tone Mapping

    Authors: Abhishek Goswami, Aru Ranjan Singh, Francesco Banterle, Kurt Debattista, Thomas Bashford-Rogers

    Abstract: The range of real-world scene luminance is larger than the capture capability of many digital camera sensors which leads to details being lost in captured images, most typically in bright regions. Inverse tone mapping attempts to boost these captured Standard Dynamic Range (SDR) images back to High Dynamic Range (HDR) by creating a mapping that linearizes the well exposed values from the SDR image…

    Submitted 24 May, 2024; originally announced May 2024.

  31. arXiv:2405.11659  [pdf, other]

    cs.RO cs.CV cs.LG

    Auto-Platoon : Freight by example

    Authors: Tharun V. Puthanveettil, Abhijay Singh, Yashveer Jain, Vinay Bukka, Sameer Arjun S

    Abstract: The work introduces a bio-inspired leader-follower system based on an innovative mechanism proposed as software latching that aims to improve collaboration and coordination between a leader agent and the associated autonomous followers. The system utilizes software latching to establish real-time communication and synchronization between the leader and followers. A layered architecture is proposed…

    Submitted 19 May, 2024; originally announced May 2024.

  32. arXiv:2405.11487  [pdf, other]

    cs.CV

    "Previously on ..." From Recaps to Story Summarization

    Authors: Aditya Kumar Singh, Dhruv Srivastava, Makarand Tapaswi

    Abstract: We introduce multimodal story summarization by leveraging TV episode recaps - short video sequences interweaving key story moments from previous episodes to bring viewers up to speed. We propose PlotSnap, a dataset featuring two crime thriller TV shows with rich recaps and long episodes of 40 minutes. Story summarization labels are unlocked by matching recap shots to corresponding sub-stories in t…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: CVPR 2024; Project page: https://katha-ai.github.io/projects/recap-story-summ/

  33. arXiv:2405.11200  [pdf, other]

    cs.CL

    LexGen: Domain-aware Multilingual Lexicon Generation

    Authors: Karthika NJ, Ayush Maheshwari, Atul Kumar Singh, Preethi Jyothi, Ganesh Ramakrishnan, Krishnakant Bhatt

    Abstract: Lexicon or dictionary generation across domains is of significant societal importance, as it can potentially enhance information accessibility for a diverse user base while preserving language identity. Prior work in the field primarily focuses on bilingual lexical induction, which deals with word alignments using mapping-based or corpora-based approaches. Though initiated by researchers, the rese…

    Submitted 18 May, 2024; originally announced May 2024.

  34. arXiv:2405.10376  [pdf, ps, other]

    cs.CR cs.AI

    Dealing Doubt: Unveiling Threat Models in Gradient Inversion Attacks under Federated Learning, A Survey and Taxonomy

    Authors: Yichuan Shi, Olivera Kotevska, Viktor Reshniak, Abhishek Singh, Ramesh Raskar

    Abstract: Federated Learning (FL) has emerged as a leading paradigm for decentralized, privacy-preserving machine learning training. However, recent research on gradient inversion attacks (GIAs) has shown that gradient updates in FL can leak information on private training samples. While existing surveys on GIAs have focused on the honest-but-curious server threat model, there is a dearth of research categ…

    Submitted 16 May, 2024; originally announced May 2024.

  35. arXiv:2405.08776  [pdf]

    cs.CV

    FolkTalent: Enhancing Classification and Tagging of Indian Folk Paintings

    Authors: Nancy Hada, Aditya Singh, Kavita Vemuri

    Abstract: Indian folk paintings have a rich mosaic of symbols, colors, textures, and stories making them an invaluable repository of cultural legacy. The paper presents a novel approach to classifying these paintings into distinct art forms and tagging them with their unique salient features. A custom dataset named FolkTalent, comprising 2279 digital images of paintings across 12 different forms, has been p…

    Submitted 14 May, 2024; originally announced May 2024.

  36. arXiv:2405.06712  [pdf, other]

    cs.CL cs.AI

    Digital Diagnostics: The Potential Of Large Language Models In Recognizing Symptoms Of Common Illnesses

    Authors: Gaurav Kumar Gupta, Aditi Singh, Sijo Valayakkad Manikandan, Abul Ehtesham

    Abstract: The recent swift development of LLMs like GPT-4, Gemini, and GPT-3.5 offers a transformative opportunity in medicine and healthcare, especially in digital diagnostics. This study evaluates each model's diagnostic abilities by interpreting a user's symptoms and determining diagnoses that fit well with common illnesses, and it demonstrates how each of these models could significantly increase diagnostic…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 14 pages, 4 figures

  37. arXiv:2405.05574  [pdf, other]

    cs.CV

    Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft

    Authors: Debabrata Pal, Anvita Singh, Saumya Saumya, Shouvik Das

    Abstract: The intrinsic capability to perceive depth of field and extract salient information by the Human Vision System (HVS) stimulates a pilot to perform manual landing over an autoland approach. However, harsh weather creates visibility hindrances, and a pilot must have a clear view of runway elements before the minimum decision altitude. To help a pilot in manual landing, a vision-based system tailored…

    Submitted 9 May, 2024; originally announced May 2024.

  38. arXiv:2405.05378  [pdf, other]

    cs.CL cs.AI cs.CY cs.HC cs.LG

    "They are uncultured": Unveiling Covert Harms and Social Threats in LLM Generated Conversations

    Authors: Preetam Prabhu Srikar Dammu, Hayoung Jung, Anjali Singh, Monojit Choudhury, Tanushree Mitra

    Abstract: Large language models (LLMs) have emerged as an integral part of modern societies, powering user-facing applications such as personal assistants and enterprise applications like recruitment tools. Despite their utility, research indicates that LLMs perpetuate systemic biases. Yet, prior works on LLM harms predominantly focus on Western concepts like race and gender, often overlooking cultural conc…

    Submitted 8 May, 2024; originally announced May 2024.

  39. arXiv:2405.03075  [pdf, other]

    cs.LG

    AnoGAN for Tabular Data: A Novel Approach to Anomaly Detection

    Authors: Aditya Singh, Pavan Reddy

    Abstract: Anomaly detection, a critical facet in data analysis, involves identifying patterns that deviate from expected behavior. This research addresses the complexities inherent in anomaly detection, exploring challenges and adapting to sophisticated malicious activities. With applications spanning cybersecurity, healthcare, finance, and surveillance, anomalies often signify critical information or poten…

    Submitted 5 May, 2024; originally announced May 2024.

    Comments: 12 pages, 6 figures, accepted as Short paper at HCII 2024 (https://2024.hci.international)

  40. arXiv:2405.01796  [pdf, other]

    cs.CL cs.DL cs.IR

    TOPICAL: TOPIC Pages AutomagicaLly

    Authors: John Giorgi, Amanpreet Singh, Doug Downey, Sergey Feldman, Lucy Lu Wang

    Abstract: Topic pages aggregate useful information about an entity or concept into a single succinct and accessible article. Automated creation of topic pages would enable their rapid curation as information resources, providing an alternative to traditional web search. While most prior work has focused on generating topic pages about biographical entities, in this work, we develop a completely automated pr…

    Submitted 2 May, 2024; originally announced May 2024.

    Comments: 10 pages, 7 figures, 2 tables, NAACL System Demonstrations 2024

  41. arXiv:2404.18591  [pdf, other]

    cs.CV cs.AI

    FashionSD-X: Multimodal Fashion Garment Synthesis using Latent Diffusion

    Authors: Abhishek Kumar Singh, Ioannis Patras

    Abstract: The rapid evolution of the fashion industry increasingly intersects with technological advancements, particularly through the integration of generative AI. This study introduces a novel generative pipeline designed to transform the fashion design process by employing latent diffusion models. Utilizing ControlNet and LoRA fine-tuning, our approach generates high-quality images from multimodal input…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: 9 pages, 8 figures

  42. arXiv:2404.16294  [pdf, other]

    cs.CL cs.AI

    LLM-Based Section Identifiers Excel on Open Source but Stumble in Real World Applications

    Authors: Saranya Krishnamoorthy, Ayush Singh, Shabnam Tafreshi

    Abstract: Electronic health records (EHRs), even though a boon for healthcare practitioners, are growing more convoluted and longer every day. Sifting through these lengthy EHRs is taxing and becomes a cumbersome part of physician-patient interaction. Several approaches have been proposed to help alleviate this prevalent issue either via summarization or sectioning, however, only a few approaches have truly been he…

    Submitted 24 April, 2024; originally announced April 2024.

    Comments: To appear in NAACL 2024 at the 6th Clinical Natural Language Processing Workshop

  43. arXiv:2404.14367  [pdf, other]

    cs.LG

    Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

    Authors: Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar

    Abstract: Learning from preference labels plays a crucial role in fine-tuning large language models. There are several distinct approaches for preference fine-tuning, including supervised learning, on-policy reinforcement learning (RL), and contrastive learning. Different methods come with different implementation tradeoffs and performance differences, and existing empirical findings present different concl…

    Submitted 2 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: International Conference on Machine Learning (ICML), 2024

  44. arXiv:2404.12926  [pdf, other]

    cs.AI

    MM-PhyRLHF: Reinforcement Learning Framework for Multimodal Physics Question-Answering

    Authors: Avinash Anand, Janak Kapuriya, Chhavi Kirtani, Apoorv Singh, Jay Saraf, Naman Lal, Jatin Kumar, Adarsh Raj Shivam, Astha Verma, Rajiv Ratn Shah, Roger Zimmermann

    Abstract: Recent advancements in LLMs have shown their significant potential in tasks like text summarization and generation. Yet, they often encounter difficulty while solving complex physics problems that require arithmetic calculation and a good understanding of concepts. Moreover, many physics problems include images that contain important details required to understand the problem's context. We propose…

    Submitted 19 April, 2024; originally announced April 2024.

  45. arXiv:2404.11868  [pdf, other]

    cs.CV cs.LG

    OPTiML: Dense Semantic Invariance Using Optimal Transport for Self-Supervised Medical Image Representation

    Authors: Azad Singh, Vandan Gorade, Deepak Mishra

    Abstract: Self-supervised learning (SSL) has emerged as a promising technique for medical image analysis due to its ability to learn without annotations. However, despite the promising potential, conventional SSL methods encounter limitations, including challenges in achieving semantic alignment and capturing subtle details. This leads to suboptimal representations, which fail to accurately capture the unde…

    Submitted 11 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  46. arXiv:2404.11018  [pdf, other]

    cs.LG cs.AI cs.CL

    Many-Shot In-Context Learning

    Authors: Rishabh Agarwal, Avi Singh, Lei M. Zhang, Bernd Bohnet, Luis Rosias, Stephanie Chan, Biao Zhang, Ankesh Anand, Zaheer Abbas, Azade Nova, John D. Co-Reyes, Eric Chu, Feryal Behbahani, Aleksandra Faust, Hugo Larochelle

    Abstract: Large language models (LLMs) excel at few-shot in-context learning (ICL) -- learning from a few examples provided in context at inference, without any weight updates. Newly expanded context windows allow us to investigate ICL with hundreds or thousands of examples -- the many-shot regime. Going from few-shot to many-shot, we observe significant performance gains across a wide variety of generative…

    Submitted 22 May, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

  47. arXiv:2404.08704  [pdf, other]

    cs.CL cs.AI

    MM-PhyQA: Multimodal Physics Question-Answering With Multi-Image CoT Prompting

    Authors: Avinash Anand, Janak Kapuriya, Apoorv Singh, Jay Saraf, Naman Lal, Astha Verma, Rushali Gupta, Rajiv Shah

    Abstract: While Large Language Models (LLMs) can achieve human-level performance in various tasks, they continue to face challenges when it comes to effectively tackling multi-step physics reasoning tasks. To identify the shortcomings of existing models and facilitate further research in this area, we curated a novel dataset, MM-PhyQA, which comprises well-constructed, high school-level multimodal physics pr…

    Submitted 11 April, 2024; originally announced April 2024.

  48. arXiv:2404.07129  [pdf, other]

    cs.LG

    What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation

    Authors: Aaditya K. Singh, Ted Moskovitz, Felix Hill, Stephanie C. Y. Chan, Andrew M. Saxe

    Abstract: In-context learning is a powerful emergent ability in transformer models. Prior work in mechanistic interpretability has identified a circuit element that may be critical for in-context learning -- the induction head (IH), which performs a match-and-copy operation. During training of large transformers on natural language data, IHs emerge around the same time as a notable phase change in the loss.…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 26 pages, 18 figures

  49. arXiv:2404.05951  [pdf, other]

    cs.LO

    Syndicate: Synergistic Synthesis of Ranking Function and Invariants for Termination Analysis

    Authors: Yasmin Sarita, Avaljot Singh, Shaurya Gomber, Gagandeep Singh, Mahesh Vishwanathan

    Abstract: Several techniques have been developed to prove the termination of programs. Finding ranking functions is one of the common approaches to do so. A ranking function must be bounded and must reduce at every iteration for all the reachable program states. Since the set of reachable states is often unknown, invariants serve as an over-approximation. Further, in the case of nested loops, the initial se…

    Submitted 8 April, 2024; originally announced April 2024.

  50. arXiv:2404.05631  [pdf, other]

    cs.ET

    Multi Digit Ising Mapping for Low Precision Ising Solvers

    Authors: Abhishek Kumar Singh, Kyle Jamieson

    Abstract: The last couple of years have seen an ever-increasing interest in using different Ising solvers, like Quantum annealers, Coherent Ising machines, and Oscillator-based Ising machines, for solving tough computational problems in various domains. Although the simulations predict massive performance improvements for several tough computational problems, the real implementations of the Ising solvers te…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: version 1.0