Showing 1–11 of 11 results for author: Madan, G

  1. arXiv:2407.14346  [pdf, other]

    cs.IR cs.CL

    Improving Retrieval in Sponsored Search by Leveraging Query Context Signals

    Authors: Akash Kumar Mohankumar, Gururaj K, Gagan Madan, Amit Singh

    Abstract: Accurately retrieving relevant bid keywords for user queries is critical in Sponsored Search but remains challenging, particularly for short, ambiguous queries. Existing dense and generative retrieval models often fail to capture nuanced user intent in these cases. To address this, we propose an approach to enhance query understanding by augmenting queries with rich contextual signals derived from…

    Submitted 19 July, 2024; originally announced July 2024.

    Comments: 8 pages, 8 tables, 1 figure

  2. arXiv:2309.14389  [pdf, other]

    cs.CV cs.AI

    Analyzing the Efficacy of an LLM-Only Approach for Image-based Document Question Answering

    Authors: Nidhi Hegde, Sujoy Paul, Gagan Madan, Gaurav Aggarwal

    Abstract: Recent document question answering models consist of two key components: the vision encoder, which captures layout and visual elements in images, and a Large Language Model (LLM) that helps contextualize questions to the image and supplements them with external world knowledge to generate accurate answers. However, the relative contributions of the vision encoder and the language model in these ta…

    Submitted 25 September, 2023; originally announced September 2023.

  3. arXiv:2308.15037  [pdf, other]

    cs.CV

    Is it an i or an l: Test-time Adaptation of Text Line Recognition Models

    Authors: Debapriya Tula, Sujoy Paul, Gagan Madan, Peter Garst, Reeve Ingle, Gaurav Aggarwal

    Abstract: Recognizing text lines from images is a challenging problem, especially for handwritten documents due to large variations in writing styles. While text line recognition models are generally trained on large corpora of real and synthetic data, such models can still make frequent mistakes if the handwriting is inscrutable or the image acquisition process adds corruptions, such as noise, blur, compre…

    Submitted 29 August, 2023; originally announced August 2023.

  4. arXiv:2306.06823  [pdf, other]

    cs.CV cs.CL

    Weakly supervised information extraction from inscrutable handwritten document images

    Authors: Sujoy Paul, Gagan Madan, Akankshya Mishra, Narayan Hegde, Pradeep Kumar, Gaurav Aggarwal

    Abstract: State-of-the-art information extraction methods are limited by OCR errors. They work well for printed text in form-like documents, but unstructured, handwritten documents still remain a challenge. Adapting existing models to domain-specific training data is quite expensive, because of two factors, 1) limited availability of the domain-specific documents (such as handwritten prescriptions, lab note…

    Submitted 11 June, 2023; originally announced June 2023.

    Comments: Accepted at ICDAR 2023

  5. arXiv:2303.17376  [pdf, other]

    cs.CV cs.AI cs.LG

    A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision

    Authors: Lucas Beyer, Bo Wan, Gagan Madan, Filip Pavetic, Andreas Steiner, Alexander Kolesnikov, André Susano Pinto, Emanuele Bugliarello, Xiao Wang, Qihang Yu, Liang-Chieh Chen, Xiaohua Zhai

    Abstract: There has been a recent explosion of computer vision models which perform many tasks and are composed of an image encoder (usually a ViT) and an autoregressive decoder (usually a Transformer). However, most of this work simply presents one system and its results, leaving many questions regarding design decisions and trade-offs of such systems unanswered. In this work, we aim to provide such answer…

    Submitted 30 March, 2023; originally announced March 2023.

  6. arXiv:2210.03505  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Sample-Efficient Personalization: Modeling User Parameters as Low Rank Plus Sparse Components

    Authors: Soumyabrata Pal, Prateek Varshney, Prateek Jain, Abhradeep Guha Thakurta, Gagan Madan, Gaurav Aggarwal, Pradeep Shenoy, Gaurav Srivastava

    Abstract: Personalization of machine learning (ML) predictions for individual users/domains/enterprises is critical for practical recommendation systems. Standard personalization approaches involve learning a user/domain-specific embedding that is fed into a fixed global model, which can be limiting. On the other hand, personalizing/fine-tuning the model itself for each user/domain -- a.k.a. meta-learning -- has…

    Submitted 5 September, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: 104 pages, 7 figures, 2 tables

  7. arXiv:2207.01508  [pdf]

    cs.CY cs.MM

    Understanding misinformation in India: The case for a meaningful regulatory approach for social media platforms

    Authors: Gandharv Dhruv Madan

    Abstract: This paper draws on a wide range of literature covering misinformation, social media and fake news, and the regulation of misinformation and social media platforms, all in the context of India. Studies including thematic analysis of misinformation, brief history on social media and its amplification of misinformation, current and past policy interv…

    Submitted 19 June, 2022; originally announced July 2022.

    Comments: 10 pages

  8. arXiv:2111.15521  [pdf, other]

    cs.LG cs.CR

    Node-Level Differentially Private Graph Neural Networks

    Authors: Ameya Daigavane, Gagan Madan, Aditya Sinha, Abhradeep Guha Thakurta, Gaurav Aggarwal, Prateek Jain

    Abstract: Graph Neural Networks (GNNs) are a popular technique for modelling graph-structured data and computing node-level representations via aggregation of information from the neighborhood of each node. However, this aggregation implies an increased risk of revealing sensitive information, as a node can participate in the inference for multiple nodes. This implies that standard privacy-preserving machin…

    Submitted 26 August, 2022; v1 submitted 23 November, 2021; originally announced November 2021.

    Comments: 20 pages, 4 figures

  9. arXiv:1807.00643  [pdf, other]

    cs.AI

    Block-Value Symmetries in Probabilistic Graphical Models

    Authors: Gagan Madan, Ankit Anand, Mausam, Parag Singla

    Abstract: One popular way for lifted inference in probabilistic graphical models is to first merge symmetric states into a single cluster (orbit) and then use these for downstream inference, via variations of orbital MCMC [Niepert, 2012]. These orbits are represented compactly using permutations over variables, and variable-value (VV) pairs, but they can miss several state symmetries in a domain. We defin…

    Submitted 8 July, 2018; v1 submitted 2 July, 2018; originally announced July 2018.

    Comments: 11 pages, 3 figures, Accepted in UAI 2018 and StaR AI 2018

  10. arXiv:1705.03751  [pdf, other]

    cs.AI cs.CL

    A Survey of Distant Supervision Methods using PGMs

    Authors: Gagan Madan

    Abstract: Relation Extraction refers to the task of populating a database with tuples of the form $r(e_1, e_2)$, where $r$ is a relation and $e_1$, $e_2$ are entities. Distant supervision is one such technique which tries to automatically generate training examples based on an existing KB such as Freebase. This paper is a survey of some of the techniques in distant supervision which primarily rely on Probab…

    Submitted 10 May, 2017; originally announced May 2017.

  11. arXiv:1603.04747  [pdf, ps, other]

    cs.CL

    Topic Modeling Using Distributed Word Embeddings

    Authors: Ramandeep S Randhawa, Parag Jain, Gagan Madan

    Abstract: We propose a new algorithm for topic modeling, Vec2Topic, that identifies the main topics in a corpus using semantic information captured via high-dimensional distributed word embeddings. Our technique is unsupervised and generates a list of topics ranked with respect to importance. We find that it works better than existing topic modeling techniques such as Latent Dirichlet Allocation for identif…

    Submitted 15 March, 2016; originally announced March 2016.