Showing 1–50 of 105 results for author: Anmol

Searching in archive cs.
  1. arXiv:2409.06062  [pdf, other]

    eess.AS cs.SD

    Retrieval Augmented Correction of Named Entity Speech Recognition Errors

    Authors: Ernest Pusateri, Anmol Walia, Anirudh Kashi, Bortik Bandyopadhyay, Nadia Hyder, Sayantan Mahinder, Raviteja Anantha, Daben Liu, Sashank Gondala

    Abstract: In recent years, end-to-end automatic speech recognition (ASR) systems have proven themselves remarkably accurate and performant, but these systems still have a significant error rate for entity names which appear infrequently in their training data. In parallel to the rise of end-to-end ASR systems, large language models (LLMs) have proven to be a versatile tool for various natural language proce…

    Submitted 9 September, 2024; originally announced September 2024.

    Comments: Submitted to ICASSP 2025

  2. arXiv:2409.02302  [pdf, other]

    eess.AS cs.AI cs.SD

    Speech Foundation Model Ensembles for the Controlled Singing Voice Deepfake Detection (CtrSVDD) Challenge 2024

    Authors: Anmol Guragain, Tianchi Liu, Zihan Pan, Hardik B. Sailor, Qiongqiong Wang

    Abstract: This work details our approach to achieving a leading system with a 1.79% pooled equal error rate (EER) on the evaluation set of the Controlled Singing Voice Deepfake Detection (CtrSVDD). The rapid advancement of generative AI models presents significant challenges for detecting AI-generated deepfake singing voices, attracting increased research attention. The Singing Voice Deepfake Detection (SVD…

    Submitted 3 September, 2024; originally announced September 2024.

    Comments: Accepted to the IEEE Spoken Language Technology Workshop (SLT) 2024. Copyright may be transferred without notice, after which this version may no longer be accessible

  3. arXiv:2408.16647  [pdf, other]

    cs.CV cs.AI

    DriveGenVLM: Real-world Video Generation for Vision Language Model based Autonomous Driving

    Authors: Yongjie Fu, Anmol Jain, Xuan Di, Xu Chen, Zhaobin Mo

    Abstract: The advancement of autonomous driving technologies necessitates increasingly sophisticated methods for understanding and predicting real-world scenarios. Vision language models (VLMs) are emerging as revolutionary tools with significant potential to influence autonomous driving. In this paper, we propose the DriveGenVLM framework to generate driving videos and use VLMs to understand them. To achie…

    Submitted 29 August, 2024; originally announced August 2024.

  4. arXiv:2408.13786  [pdf, other]

    cs.CV cs.AI cs.MM

    Localization of Synthetic Manipulations in Western Blot Images

    Authors: Anmol Manjunath, Viola Negroni, Sara Mandelli, Daniel Moreira, Paolo Bestagini

    Abstract: Recent breakthroughs in deep learning and generative systems have significantly fostered the creation of synthetic media, as well as the local alteration of real content via the insertion of highly realistic synthetic manipulations. Local image manipulation, in particular, poses serious challenges to the integrity of digital content and societal trust. This problem is not only confined to multimed…

    Submitted 25 August, 2024; originally announced August 2024.

  5. arXiv:2408.08499  [pdf, ps, other]

    cs.LG cs.GT

    The Limitations of Model Retraining in the Face of Performativity

    Authors: Anmol Kabra, Kumar Kshitij Patel

    Abstract: We study stochastic optimization in the context of performative shifts, where the data distribution changes in response to the deployed model. We demonstrate that naive retraining can be provably suboptimal even for simple distribution shifts. The issue worsens when models are retrained given a finite number of samples at each retraining step. We show that adding regularization to retraining corre…

    Submitted 15 August, 2024; originally announced August 2024.

    Comments: Accepted to 2024 ICML Workshop on Humans, Algorithmic Decision-Making and Society

  6. arXiv:2407.07000  [pdf, other]

    cs.LG cs.AI cs.CL cs.DC

    Etalon: Holistic Performance Evaluation Framework for LLM Inference Systems

    Authors: Amey Agrawal, Anmol Agarwal, Nitin Kedia, Jayashree Mohan, Souvik Kundu, Nipun Kwatra, Ramachandran Ramjee, Alexey Tumanov

    Abstract: Serving large language models (LLMs) in production can incur substantial costs, which has prompted recent advances in inference system optimizations. Today, these systems are evaluated against conventional latency and throughput metrics (e.g., TTFT, TBT, Normalised Latency and TPOT). However, these metrics fail to fully capture the nuances of LLM inference, leading to an incomplete assessment of use…

    Submitted 29 August, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

  7. arXiv:2406.13831  [pdf, other]

    cs.DB

    A Comprehensive Overview of GPU Accelerated Databases

    Authors: Harshit Sharma, Anmol Sharma

    Abstract: Over the past decade, the landscape of data analytics has seen a notable shift towards heterogeneous architectures, particularly the integration of GPUs to enhance overall performance. In the realm of in-memory analytics, which often grapples with memory bandwidth constraints, the adoption of GPUs has proven advantageous, thanks to their superior bandwidth capabilities. The parallel processing pro…

    Submitted 19 June, 2024; originally announced June 2024.

  8. arXiv:2405.15380  [pdf, other]

    cs.AR cs.AI

    Full-stack evaluation of Machine Learning inference workloads for RISC-V systems

    Authors: Debjyoti Bhattacharjee, Anmol, Tommaso Marinelli, Karan Pathak, Peter Kourzanov

    Abstract: Architectural simulators hold a vital role in RISC-V research, providing a crucial platform for workload evaluation without the need for costly physical prototypes. They serve as a dynamic environment for exploring innovative architectural concepts, enabling swift iteration and thorough analysis of performance metrics. As deep learning algorithms become increasingly pervasive, it is essential to b…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: RISC-V Summit Europe 2024

  9. arXiv:2405.05572  [pdf, other]

    cs.CL cs.AI

    From Human Judgements to Predictive Models: Unravelling Acceptability in Code-Mixed Sentences

    Authors: Prashant Kodali, Anmol Goel, Likhith Asapu, Vamshi Krishna Bonagiri, Anirudh Govil, Monojit Choudhury, Manish Shrivastava, Ponnurangam Kumaraguru

    Abstract: Current computational approaches for analysing or generating code-mixed sentences do not explicitly model "naturalness" or "acceptability" of code-mixed sentences, but rely on training corpora to reflect distribution of acceptable code-mixed sentences. Modelling human judgement for the acceptability of code-mixed text can help in distinguishing natural code-mixed text and enable quality-controlled…

    Submitted 9 May, 2024; originally announced May 2024.

  10. arXiv:2405.01573  [pdf, other]

    cs.SE cs.AI

    Class-Level Code Generation from Natural Language Using Iterative, Tool-Enhanced Reasoning over Repository

    Authors: Ajinkya Deshpande, Anmol Agarwal, Shashank Shet, Arun Iyer, Aditya Kanade, Ramakrishna Bairi, Suresh Parthasarathy

    Abstract: LLMs have demonstrated significant potential in code generation tasks, achieving promising results at the function or statement level across various benchmarks. However, the complexities associated with creating code artifacts like classes, particularly within the context of real-world software repositories, remain underexplored. Prior research treats class-level generation as an isolated task, ne…

    Submitted 5 June, 2024; v1 submitted 21 April, 2024; originally announced May 2024.

    Comments: Preprint with additional experiments

  11. arXiv:2404.12643  [pdf]

    cs.HC

    AipanVR: A Virtual Reality Experience for Preserving Uttarakhand's Traditional Art Form

    Authors: Nishant Chaudhary, Mihir Raj, Richik Bhattacharjee, Anmol Srivastava, Rakesh Sah, Pankaj Badoni

    Abstract: This paper presents a demonstration of the developed prototype showcasing a way to preserve the Intangible Cultural Heritage of Uttarakhand, India. Aipan is a traditional art form practiced in the Kumaon region in the state of Uttarakhand. It is typically used to decorate floors and walls at places of worship or entrances of homes and is considered auspicious to begin any work or event. This art i…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: Demonstrated at ISMAR 2020

  12. arXiv:2403.08773  [pdf, other]

    cs.CV cs.AI cs.CL cs.MM

    Veagle: Advancements in Multimodal Representation Learning

    Authors: Rajat Chawla, Arkajit Datta, Tushar Verma, Adarsh Jha, Anmol Gautam, Ayush Vatsal, Sukrit Chaterjee, Mukunda NS, Ishaan Bhola

    Abstract: Lately, researchers in artificial intelligence have been really interested in how language and vision come together, giving rise to the development of multimodal models that aim to seamlessly integrate textual and visual information. Multimodal models, an extension of Large Language Models (LLMs), have exhibited remarkable capabilities in addressing a diverse array of tasks, ranging from image cap…

    Submitted 18 January, 2024; originally announced March 2024.

  13. arXiv:2403.08053  [pdf, other]

    cs.CL

    Generating Clarification Questions for Disambiguating Contracts

    Authors: Anmol Singhal, Chirag Jain, Preethu Rose Anish, Arkajyoti Chakraborty, Smita Ghaisas

    Abstract: Enterprises frequently enter into commercial contracts that can serve as vital sources of project-specific requirements. Contractual clauses are obligatory, and the requirements derived from contracts can detail the downstream implementation activities that non-legal stakeholders, including requirement analysts, engineers, and delivery personnel, need to conduct. However, comprehending contracts i…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: 9 pages, 3 figures, accepted to LREC-COLING 2024

  14. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  15. arXiv:2403.03029  [pdf, other]

    cs.CL

    Socratic Reasoning Improves Positive Text Rewriting

    Authors: Anmol Goel, Nico Daheim, Iryna Gurevych

    Abstract: Reframing a negative into a positive thought is at the crux of several cognitive approaches to mental health and psychotherapy that could be made more accessible by large language model-based solutions. Such reframing is typically non-trivial and requires multiple rationalization steps to uncover the underlying issue of a negative thought and transform it to be more positive. However, this rationa…

    Submitted 5 March, 2024; originally announced March 2024.

  16. arXiv:2402.16977  [pdf, other]

    cs.SE cs.CL

    Dealing with Data for RE: Mitigating Challenges while using NLP and Generative AI

    Authors: Smita Ghaisas, Anmol Singhal

    Abstract: Across the dynamic business landscape today, enterprises face an ever-increasing range of challenges. These include the constantly evolving regulatory environment, the growing demand for personalization within software applications, and the heightened emphasis on governance. In response to these multifaceted demands, large enterprises have been adopting automation that spans from the optimization…

    Submitted 28 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: 24 pages, 2 figures, to be published in NLP for Requirements Engineering Book

  17. arXiv:2402.12629  [pdf, other]

    cs.MM cs.CY cs.SI

    Television Discourse Decoded: Comprehensive Multimodal Analytics at Scale

    Authors: Anmol Agarwal, Pratyush Priyadarshi, Shiven Sinha, Shrey Gupta, Hitkul Jangra, Ponnurangam Kumaraguru, Kiran Garimella

    Abstract: In this paper, we tackle the complex task of analyzing televised debates, with a focus on a prime time news debate show from India. Previous methods, which often relied solely on text, fall short in capturing the multimodal essence of these debates. To address this gap, we introduce a comprehensive automated toolkit that employs advanced computer vision and speech-to-text techniques for large-scal…

    Submitted 6 August, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: KDD 2024 [Updates for Camera Ready version]

  18. arXiv:2402.10567  [pdf, other]

    cs.CL cs.AI

    InSaAF: Incorporating Safety through Accuracy and Fairness | Are LLMs ready for the Indian Legal Domain?

    Authors: Yogesh Tripathi, Raghav Donakanti, Sahil Girhepuje, Ishan Kavathekar, Bhaskara Hanuma Vedula, Gokul S Krishnan, Shreya Goyal, Anmol Goel, Balaraman Ravindran, Ponnurangam Kumaraguru

    Abstract: Recent advancements in language technology and Artificial Intelligence have resulted in numerous Language Models being proposed to perform various tasks in the legal domain ranging from predicting judgments to generating summaries. Despite their immense potential, these models have been proven to learn and exhibit societal biases and make unfair predictions. In this study, we explore the ability o…

    Submitted 17 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  19. arXiv:2402.06088  [pdf, other]

    cs.CV

    Animated Stickers: Bringing Stickers to Life with Video Diffusion

    Authors: David Yan, Winnie Zhang, Luxin Zhang, Anmol Kalia, Dingkang Wang, Ankit Ramchandani, Miao Liu, Albert Pumarola, Edgar Schoenfeld, Elliot Blanchard, Krishna Narni, Yaqiao Luo, Lawrence Chen, Guan Pang, Ali Thabet, Peter Vajda, Amy Bearman, Licheng Yu

    Abstract: We introduce animated stickers, a video diffusion model which generates an animation conditioned on a text prompt and static sticker image. Our model is built on top of the state-of-the-art Emu text-to-image model, with the addition of temporal layers to model motion. Due to the domain gap, i.e. differences in visual and motion style, a model which performed well on generating natural videos can n…

    Submitted 8 February, 2024; originally announced February 2024.

  20. arXiv:2402.01576  [pdf, other]

    cs.RO

    Training Adversarial yet Safe Agent to Characterize Safety Performance of Highly Automated Vehicles

    Authors: Minghao Zhu, Anmol Sidhu, Keith A. Redmill

    Abstract: This paper focuses on safety performance testing and characterization of black-box highly automated vehicles (HAV). Existing testing approaches typically obtain the testing outcomes by deploying the HAV into a specific testing environment. Such a testing environment can involve various passively given testing strategies presented by other traffic participants such as (i) the naturalistic driving p…

    Submitted 2 February, 2024; originally announced February 2024.

  21. arXiv:2401.13239  [pdf, other]

    cs.LG cs.HC

    Adaptive Crowdsourcing Via Self-Supervised Learning

    Authors: Anmol Kagrecha, Henrik Marklund, Benjamin Van Roy, Hong Jun Jeon, Richard Zeckhauser

    Abstract: Common crowdsourcing systems average estimates of a latent quantity of interest provided by many crowdworkers to produce a group estimate. We develop a new approach -- predict-each-worker -- that leverages self-supervised learning and a novel aggregation scheme. This approach adapts weights assigned to crowdworkers based on estimates they provided for previous quantities. When skills vary across c…

    Submitted 1 February, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

    Comments: 33 pages, 3 figures

  22. arXiv:2401.09640  [pdf, other]

    eess.SY cs.AI

    Blackout Mitigation via Physics-guided RL

    Authors: Anmol Dwivedi, Santiago Paternain, Ali Tajer

    Abstract: This paper considers the sequential design of remedial control actions in response to system anomalies for the ultimate objective of preventing blackouts. A physics-guided reinforcement learning (RL) framework is designed to identify effective sequences of real-time remedial look-ahead decisions accounting for the long-term impact on the system's stability. The paper considers a space of control a…

    Submitted 31 July, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  23. arXiv:2401.00338  [pdf]

    cs.HC

    A Rapid Scoping Review and Conceptual Analysis of the Educational Metaverse in the Global South: Socio-Technical Perspectives

    Authors: Anmol Srivastava

    Abstract: This paper presents a conceptual insight into the Design of the Metaverse to facilitate educational transformation in selected developing nations within the Global South regions, e.g., India. These regions are often afflicted with socio-economic challenges but rich in cultural diversity. By utilizing a socio-technical design approach, this study explores the specific needs and opportunities presen…

    Submitted 30 December, 2023; originally announced January 2024.

  24. arXiv:2312.17100  [pdf, other]

    cs.LG

    TSPP: A Unified Benchmarking Tool for Time-series Forecasting

    Authors: Jan Bączek, Dmytro Zhylko, Gilberto Titericz, Sajad Darabi, Jean-Francois Puget, Izzy Putterman, Dawid Majchrowski, Anmol Gupta, Kyle Kranen, Pawel Morkisz

    Abstract: While machine learning has witnessed significant advancements, the emphasis has largely been on data acquisition and model creation. However, achieving a comprehensive assessment of machine learning solutions in real-world settings necessitates standardization throughout the entire pipeline. This need is particularly acute in time series forecasting, where diverse settings impede meaningful compar…

    Submitted 8 January, 2024; v1 submitted 28 December, 2023; originally announced December 2023.

  25. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  26. arXiv:2312.01398  [pdf, other]

    cs.CL cs.AI cs.LG

    Towards Mitigating Perceived Unfairness in Contracts from a Non-Legal Stakeholder's Perspective

    Authors: Anmol Singhal, Preethu Rose Anish, Shirish Karande, Smita Ghaisas

    Abstract: Commercial contracts are known to be a valuable source for deriving project-specific requirements. However, contract negotiations mainly occur among the legal counsel of the parties involved. The participation of non-legal stakeholders, including requirement analysts, engineers, and solution architects, whose primary responsibility lies in ensuring the seamless implementation of contractual terms,…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: 9 pages, 2 figures, to be published in Natural Legal Language Processing Workshop at EMNLP 2023

  27. arXiv:2311.16496  [pdf, other]

    cs.LG

    DPOD: Domain-Specific Prompt Tuning for Multimodal Fake News Detection

    Authors: Debarshi Brahma, Amartya Bhattacharya, Suraj Nagaje Mahadev, Anmol Asati, Vikas Verma, Soma Biswas

    Abstract: The spread of fake news using out-of-context images has become widespread and is a relevant problem in this era of information overload. Such out-of-context fake news may arise across different domains like politics, sports, entertainment, etc. In practical scenarios, an inherent problem of imbalance exists among news articles from such widely varying domains, resulting in a few domains with abund…

    Submitted 12 March, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

  28. arXiv:2311.10794  [pdf, other]

    cs.CV

    Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression

    Authors: Animesh Sinha, Bo Sun, Anmol Kalia, Arantxa Casanova, Elliot Blanchard, David Yan, Winnie Zhang, Tony Nelli, Jiahui Chen, Hardik Shah, Licheng Yu, Mitesh Kumar Singh, Ankit Ramchandani, Maziar Sanjabi, Sonal Gupta, Amy Bearman, Dhruv Mahajan

    Abstract: We introduce Style Tailoring, a recipe to finetune Latent Diffusion Models (LDMs) in a distinct domain with high visual quality, prompt alignment and scene diversity. We choose sticker image generation as the target domain, as the images significantly differ from photorealistic samples typically generated by large-scale LDMs. We start with a competent text-to-image model, like Emu, and show that r…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: 10 pages, 5 figures

  29. arXiv:2311.05579  [pdf, other]

    cs.CV cs.LG eess.SP

    SigScatNet: A Siamese + Scattering based Deep Learning Approach for Signature Forgery Detection and Similarity Assessment

    Authors: Anmol Chokshi, Vansh Jain, Rajas Bhope, Sudhir Dhage

    Abstract: The surge in counterfeit signatures has inflicted widespread inconveniences and formidable challenges for both individuals and organizations. This groundbreaking research paper introduces SigScatNet, an innovative solution to combat this issue by harnessing the potential of a Siamese deep learning network, bolstered by Scattering wavelets, to detect signature forgery and assess signature similarit…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: 7 pages, 8 figures

  30. arXiv:2311.05548  [pdf, other]

    cs.CV eess.IV

    L-WaveBlock: A Novel Feature Extractor Leveraging Wavelets for Generative Adversarial Networks

    Authors: Mirat Shah, Vansh Jain, Anmol Chokshi, Guruprasad Parasnis, Pramod Bide

    Abstract: Generative Adversarial Networks (GANs) have risen to prominence in the field of deep learning, facilitating the generation of realistic data from random noise. The effectiveness of GANs often depends on the quality of feature extraction, a critical aspect of their architecture. This paper introduces L-WaveBlock, a novel and robust feature extractor that leverages the capabilities of the Discrete W…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: 12 figures, 8 pages

  31. Towards Effective Paraphrasing for Information Disguise

    Authors: Anmol Agarwal, Shrey Gupta, Vamshi Bonagiri, Manas Gaur, Joseph Reagle, Ponnurangam Kumaraguru

    Abstract: Information Disguise (ID), a part of computational ethics in Natural Language Processing (NLP), is concerned with best practices of textual paraphrasing to prevent the non-consensual use of authors' posts on the Internet. Research on ID becomes important when authors' written online communication pertains to sensitive domains, e.g., mental health. Over time, researchers have utilized AI-based auto…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: Accepted at ECIR 2023

    Journal ref: 45th European Conference on Information Retrieval, ECIR 2023

  32. arXiv:2311.02775  [pdf, other]

    cs.LG cs.AI cs.CL

    AI-TA: Towards an Intelligent Question-Answer Teaching Assistant using Open-Source LLMs

    Authors: Yann Hicke, Anmol Agarwal, Qianou Ma, Paul Denny

    Abstract: Responding to the thousands of student questions on online QA platforms each semester has a considerable human cost, particularly in computing courses with rapidly growing enrollments. To address the challenges of scalable and intelligent question-answering (QA), we introduce an innovative solution that leverages open-source Large Language Models (LLMs) from the LLaMA-2 family to ensure data priva…

    Submitted 18 December, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: Updates for camera-ready submission

    Journal ref: NeurIPS Workshop on Generative AI for Education (GAIED), 2023

  33. arXiv:2310.18590  [pdf, other]

    cs.LG cs.AI

    Using Early Readouts to Mediate Featural Bias in Distillation

    Authors: Rishabh Tiwari, Durga Sivasubramanian, Anmol Mekala, Ganesh Ramakrishnan, Pradeep Shenoy

    Abstract: Deep networks tend to learn spurious feature-label correlations in real-world supervised learning tasks. This vulnerability is aggravated in distillation, where a student model may have lesser representational capacity than the corresponding teacher model. Often, knowledge of specific spurious correlations is used to reweight instances & rebalance the learning process. We propose a novel early rea…

    Submitted 8 November, 2023; v1 submitted 28 October, 2023; originally announced October 2023.

  34. arXiv:2310.06080  [pdf, other]

    eess.IV cs.CV

    Advancing Diagnostic Precision: Leveraging Machine Learning Techniques for Accurate Detection of Covid-19, Pneumonia, and Tuberculosis in Chest X-Ray Images

    Authors: Aditya Kulkarni, Guruprasad Parasnis, Harish Balasubramanian, Vansh Jain, Anmol Chokshi, Reena Sonkusare

    Abstract: Lung diseases such as COVID-19, tuberculosis (TB), and pneumonia continue to be serious global health concerns that affect millions of people worldwide. In medical practice, chest X-ray examinations have emerged as the norm for diagnosing diseases, particularly chest infections such as COVID-19. Paramedics and scientists are working intensively to create a reliable and precise approach for early-s…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: 11 pages, 18 figures, Under review in Discover Artificial Intelligence Journal by Springer Nature

  35. arXiv:2308.13207  [pdf, other]

    cs.CL

    LLM2KB: Constructing Knowledge Bases using instruction tuned context aware Large Language Models

    Authors: Anmol Nayak, Hari Prasad Timmapathini

    Abstract: The advent of Large Language Models (LLM) has revolutionized the field of natural language processing, enabling significant progress in various applications. One key area of interest is the construction of Knowledge Bases (KB) using these powerful models. Knowledge bases serve as repositories of structured information, facilitating information retrieval and inference tasks. Our paper proposes LLM2…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: 16 pages, 1 figure, LM-KBC 2023 Challenge at International Semantic Web Conference 2023

  36. Finding the Perfect Fit: Applying Regression Models to ClimateBench v1.0

    Authors: Anmol Chaure, Ashok Kumar Behera, Sudip Bhattacharya

    Abstract: Climate projection using data-driven machine learning models acting as emulators is one of the prevailing areas of research to enable policy makers to make informed decisions. Use of machine learning emulators as surrogates for computationally heavy GCM simulators reduces time and carbon footprints. In this direction, ClimateBench [1] is a recently curated benchmarking dataset for evaluating the pe…

    Submitted 22 August, 2023; originally announced August 2023.

    Journal ref: International Journal of Computer Applications 185(29):31-39, August 2023

  37. arXiv:2308.05449  [pdf, other]

    eess.IV cs.CV

    Transforming Breast Cancer Diagnosis: Towards Real-Time Ultrasound to Mammogram Conversion for Cost-Effective Diagnosis

    Authors: Sahar Almahfouz Nasser, Ashutosh Sharma, Anmol Saraf, Amruta Mahendra Parulekar, Purvi Haria, Amit Sethi

    Abstract: Ultrasound (US) imaging is better suited for intraoperative settings because it is real-time and more portable than other imaging techniques, such as mammography. However, US images are characterized by lower spatial resolution and noise-like artifacts. This research aims to address these limitations by providing surgeons with mammogram-like image quality in real-time from noisy US images. Unlike prev…

    Submitted 10 August, 2023; originally announced August 2023.

  38. arXiv:2308.03467  [pdf, other]

    cs.CV

    RoadScan: A Novel and Robust Transfer Learning Framework for Autonomous Pothole Detection in Roads

    Authors: Guruprasad Parasnis, Anmol Chokshi, Vansh Jain, Kailas Devadkar

    Abstract: This research paper presents a novel approach to pothole detection using Deep Learning and Image Processing techniques. The proposed system leverages the VGG16 model for feature extraction and utilizes a custom Siamese network with triplet loss, referred to as RoadScan. The system aims to address the critical issue of potholes on roads, which pose significant risks to road users. Accidents due to…

    Submitted 14 October, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

    Comments: 6 pages, 5 figures, Accepted at the IEEE 7th Conference on Communication and Information Technology 2023

  39. arXiv:2306.17674  [pdf, other]

    cs.CL

    X-RiSAWOZ: High-Quality End-to-End Multilingual Dialogue Datasets and Few-shot Agents

    Authors: Mehrad Moradshahi, Tianhao Shen, Kalika Bali, Monojit Choudhury, Gaël de Chalendar, Anmol Goel, Sungkyun Kim, Prashant Kodali, Ponnurangam Kumaraguru, Nasredine Semmar, Sina J. Semnani, Jiwon Seo, Vivek Seshadri, Manish Shrivastava, Michael Sun, Aditya Yadavalli, Chaobin You, Deyi Xiong, Monica S. Lam

    Abstract: Task-oriented dialogue research has mainly focused on a few popular languages like English and Chinese, due to the high dataset creation cost for a new language. To reduce the cost, we apply manual editing to automatically translated data. We create a new multilingual benchmark, X-RiSAWOZ, by translating the Chinese RiSAWOZ to 4 languages: English, French, Hindi, Korean; and a code-mixed English-H…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: Accepted by ACL 2023 Findings

  40. arXiv:2305.14208  [pdf, other]

    cs.CL cs.LG

    Domain Private Transformers for Multi-Domain Dialog Systems

    Authors: Anmol Kabra, Ethan R. Elenberg

    Abstract: Large, general purpose language models have demonstrated impressive performance across many different conversational domains. While multi-domain language models achieve low overall perplexity, their outputs are not guaranteed to stay within the domain of a given input prompt. This paper proposes domain privacy as a novel way to quantify how likely a conditional language model will leak across doma…

    Submitted 7 December, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted to Findings of EMNLP 2023 (short paper). Code available at https://github.com/asappresearch/domain-private-transformers

  41. Instance-Level Semantic Maps for Vision Language Navigation

    Authors: Laksh Nanwani, Anmol Agarwal, Kanishk Jain, Raghav Prabhakar, Aaron Monis, Aditya Mathur, Krishna Murthy, Abdul Hafez, Vineet Gandhi, K. Madhava Krishna

    Abstract: Humans have a natural ability to perform semantic associations with the surrounding objects in the environment. This allows them to create a mental map of the environment, allowing them to navigate on-demand when given linguistic instructions. A natural goal in Vision Language Navigation (VLN) research is to impart autonomous agents with similar capabilities. Recent works take a step towards this…

    Submitted 1 July, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Journal ref: IEEE RO-MAN 2023

  42. arXiv:2304.08504  [pdf]

    cs.ET physics.app-ph

    Schottky Barrier MOSFET Enabled Ultra-Low Power Real-Time Neuron for Neuromorphic Computing

    Authors: Shubham Patil, Jayatika Sakhuja, Ajay Kumar Singh, Anmol Biswas, Vivek Saraswat, Sandeep Kumar, Sandip Lashkare, Udayan Ganguly

    Abstract: Energy-efficient real-time synapses and neurons are essential to enable large-scale neuromorphic computing. In this paper, we propose and demonstrate the Schottky-Barrier MOSFET-based ultra-low power voltage-controlled current source to enable real-time neurons for neuromorphic computing. Schottky-Barrier MOSFET is fabricated on a Silicon-on-insulator platform with polycrystalline Silicon as the c…

    Submitted 17 April, 2023; originally announced April 2023.

  43. arXiv:2304.00763  [pdf]

    cs.CV

    BOLLWM: A real-world dataset for bollworm pest monitoring from cotton fields in India

    Authors: Jerome White, Chandan Agrawal, Anmol Ojha, Apoorv Agnihotri, Makkunda Sharma, Jigar Doshi

    Abstract: This paper presents a dataset of agricultural pest images captured over five years by thousands of small holder farmers and farming extension workers across India. The dataset has been used to support a mobile application that relies on artificial intelligence to assist farmers with pest management decisions. Creation came from a mix of organized data collection, and from mobile application usage…

    Submitted 3 April, 2023; originally announced April 2023.

    Journal ref: ICLR 2023 workshop on Practical Machine Learning for Developing Countries

  44. arXiv:2304.00171  [pdf, other]

    cs.CL cs.SD eess.AS

    Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR

    Authors: Rami Botros, Anmol Gulati, Tara N. Sainath, Krzysztof Choromanski, Ruoming Pang, Trevor Strohman, Weiran Wang, Jiahui Yu

    Abstract: Conformer models maintain a large number of internal states, the vast majority of which are associated with self-attention layers. With limited memory bandwidth, reading these from memory at each inference step can slow down inference. In this paper, we design an optimized conformer that is small enough to meet on-device restrictions and has fast inference on TPUs. We explore various ideas to impr…

    Submitted 31 March, 2023; originally announced April 2023.

  45. arXiv:2303.07247  [pdf]

    cs.CL cs.CY

    Are Models Trained on Indian Legal Data Fair?

    Authors: Sahil Girhepuje, Anmol Goel, Gokul S Krishnan, Shreya Goyal, Satyendra Pandey, Ponnurangam Kumaraguru, Balaraman Ravindran

    Abstract: Recent advances and applications of language technology and artificial intelligence have enabled much success across multiple domains like law, medical and mental health. AI-based Language Models, like Judgement Prediction, have recently been proposed for the legal sector. However, these models are strife with encoded social biases picked up from the training data. While bias and fairness have bee…

    Submitted 14 May, 2024; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: Presented at the Symposium on AI and Law (SAIL) 2023

  46. arXiv:2302.06951  [pdf, other]

    cs.CL

    Few-shot learning approaches for classifying low resource domain specific software requirements

    Authors: Anmol Nayak, Hari Prasad Timmapathini, Vidhya Murali, Atul Anil Gohad

    Abstract: With the advent of strong pre-trained natural language processing models like BERT, DeBERTa, MiniLM, T5, the data requirement for industries to fine-tune these models to their niche use cases has drastically reduced (typically to a few hundred annotated samples for achieving a reasonable performance). However, the availability of even a few hundred annotated samples may not always be guaranteed in…

    Submitted 14 February, 2023; originally announced February 2023.

    Comments: 6 pages, 1 figure

  47. arXiv:2301.01795  [pdf, other]

    cs.CV

    PACO: Parts and Attributes of Common Objects

    Authors: Vignesh Ramanathan, Anmol Kalia, Vladan Petrovic, Yi Wen, Baixue Zheng, Baishan Guo, Rui Wang, Aaron Marquez, Rama Kovvuri, Abhishek Kadian, Amir Mousavi, Yiwen Song, Abhimanyu Dubey, Dhruv Mahajan

    Abstract: Object models are gradually progressing from predicting just category labels to providing detailed descriptions of object instances. This motivates the need for large datasets which go beyond traditional object masks and provide richer annotations such as part masks and attributes. Hence, we introduce PACO: Parts and Attributes of Common Objects. It spans 75 object categories, 456 object-part cate…

    Submitted 4 January, 2023; originally announced January 2023.

  48. arXiv:2211.07514  [pdf, other]

    cs.CL

    CST5: Data Augmentation for Code-Switched Semantic Parsing

    Authors: Anmol Agarwal, Jigar Gupta, Rahul Goel, Shyam Upadhyay, Pankaj Joshi, Rengarajan Aravamudhan

    Abstract: Extending semantic parsers to code-switched input has been a challenging problem, primarily due to a lack of supervised training data. In this work, we introduce CST5, a new data augmentation technique that finetunes a T5 model using a small seed set ($\approx$100 utterances) to generate code-switched utterances from English utterances. We show that CST5 generates high quality code-switched data,…

    Submitted 14 November, 2022; originally announced November 2022.

  49. arXiv:2209.10966  [pdf, other]

    cs.CL

    Adaptation of domain-specific transformer models with text oversampling for sentiment analysis of social media posts on Covid-19 vaccines

    Authors: Anmol Bansal, Arjun Choudhry, Anubhav Sharma, Seba Susan

    Abstract: Covid-19 has spread across the world and several vaccines have been developed to counter its surge. To identify the correct sentiments associated with the vaccines from social media posts, we fine-tune various state-of-the-art pre-trained transformer models on tweets associated with Covid-19 vaccines. Specifically, we use the recently introduced state-of-the-art pre-trained transformer models RoBE…

    Submitted 13 January, 2023; v1 submitted 22 September, 2022; originally announced September 2022.

    Comments: The paper has been accepted for publication in Computer Science journal: http://journals.agh.edu.pl/csci

  50. arXiv:2208.09628  [pdf, other]

    cs.LG cs.AI cs.CY

    Are You Comfortable Now: Deep Learning the Temporal Variation in Thermal Comfort in Winters

    Authors: Betty Lala, Srikant Manas Kala, Anmol Rastogi, Kunal Dahiya, Aya Hagishima

    Abstract: Indoor thermal comfort in smart buildings has a significant impact on the health and performance of occupants. Consequently, machine learning (ML) is increasingly used to solve challenges related to indoor thermal comfort. Temporal variability of thermal comfort perception is an important problem that regulates occupant well-being and energy consumption. However, in most ML-based thermal comfort s…

    Submitted 20 August, 2022; originally announced August 2022.

    Comments: Accepted for publication in IEEE SMC 2022