Showing 1–26 of 26 results for author: Srinivasan, B V

  1. arXiv:2406.18893

    cs.CV

    AlignIT: Enhancing Prompt Alignment in Customization of Text-to-Image Models

    Authors: Aishwarya Agarwal, Srikrishna Karanam, Balaji Vasan Srinivasan

    Abstract: We consider the problem of customizing text-to-image diffusion models with user-supplied reference images. Given new prompts, the existing methods can capture the key concept from the reference images but fail to align the generated image with the prompt. In this work, we seek to address this key issue by proposing new methods that can easily be used in conjunction with existing customization meth…

    Submitted 27 June, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: 10 pages, 9 figures

  2. arXiv:2406.06938

    cs.CL

    Post-Hoc Answer Attribution for Grounded and Trustworthy Long Document Comprehension: Task, Insights, and Challenges

    Authors: Abhilasha Sancheti, Koustava Goswami, Balaji Vasan Srinivasan

    Abstract: Attributing answer text to its source document for information-seeking questions is crucial for building trustworthy, reliable, and accountable systems. We formulate a new task of post-hoc answer attribution for long document comprehension (LDC). Owing to the lack of long-form abstractive and information-seeking LDC datasets, we refactor existing datasets to assess the strengths and weaknesses of…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to *SEM 2024

  3. arXiv:2406.04673

    cs.CV cs.AI cs.MM eess.AS

    MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models

    Authors: Sanjoy Chowdhury, Sayan Nag, K J Joseph, Balaji Vasan Srinivasan, Dinesh Manocha

    Abstract: Music is a universal language that can communicate emotions and feelings. It forms an essential part of the whole spectrum of creative media, ranging from movies to social media posts. Machine learning models that can synthesize music are predominantly conditioned on textual descriptions of it. Inspired by how musicians compose music not just from a movie script, but also through visualizations, w…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: Accepted at CVPR 2024 as Highlight paper. Webpage: https://schowdhury671.github.io/melfusion_cvpr2024/

  4. arXiv:2405.17980

    cs.CL

    Peering into the Mind of Language Models: An Approach for Attribution in Contextual Question Answering

    Authors: Anirudh Phukan, Shwetha Somasundaram, Apoorv Saxena, Koustava Goswami, Balaji Vasan Srinivasan

    Abstract: With the enhancement in the field of generative artificial intelligence (AI), contextual question answering has become extremely relevant. Attributing model generations to the input source document is essential to ensure trustworthiness and reliability. We observe that when large language models (LLMs) are used for contextual question answering, the output answer often consists of text copied verb…

    Submitted 28 May, 2024; originally announced May 2024.

  5. arXiv:2401.01637

    cs.CL

    Social Media Ready Caption Generation for Brands

    Authors: Himanshu Maheshwari, Koustava Goswami, Apoorv Saxena, Balaji Vasan Srinivasan

    Abstract: Social media advertisements are key for brand marketing, aiming to attract consumers with captivating captions and pictures or logos. While previous research has focused on generating captions for general images, incorporating brand personalities into social media captioning remains unexplored. Brand personalities are shown to be affecting consumers' behaviours and social interactions and thus are…

    Submitted 3 January, 2024; originally announced January 2024.

  6. arXiv:2311.11919

    cs.CV

    An Image is Worth Multiple Words: Multi-attribute Inversion for Constrained Text-to-Image Synthesis

    Authors: Aishwarya Agarwal, Srikrishna Karanam, Tripti Shukla, Balaji Vasan Srinivasan

    Abstract: We consider the problem of constraining diffusion model outputs with a user-supplied reference image. Our key objective is to extract multiple attributes (e.g., color, object, layout, style) from this single reference image, and then generate new samples with them. One line of existing work proposes to invert the reference images into a single textual conditioning vector, enabling generation of ne…

    Submitted 20 November, 2023; originally announced November 2023.

  7. arXiv:2309.00613

    cs.CV cs.AI cs.LG

    Iterative Multi-granular Image Editing using Diffusion Models

    Authors: K J Joseph, Prateksha Udhayanan, Tripti Shukla, Aishwarya Agarwal, Srikrishna Karanam, Koustava Goswami, Balaji Vasan Srinivasan

    Abstract: Recent advances in text-guided image synthesis has dramatically changed how creative professionals generate artistic and aesthetically pleasing visual assets. To fully support such creative endeavors, the process should possess the ability to: 1) iteratively edit the generations and 2) control the spatial reach of desired changes (global, local or anything in between). We formalize this pragmatic…

    Submitted 28 October, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: Accepted to IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024

  8. arXiv:2308.16649

    cs.CV

    Learning with Multi-modal Gradient Attention for Explainable Composed Image Retrieval

    Authors: Prateksha Udhayanan, Srikrishna Karanam, Balaji Vasan Srinivasan

    Abstract: We consider the problem of composed image retrieval that takes an input query consisting of an image and a modification text indicating the desired changes to be made on the image and retrieves images that match these changes. Current state-of-the-art techniques that address this problem use global features for the retrieval, resulting in incorrect localization of the regions of interest to be mod…

    Submitted 31 August, 2023; originally announced August 2023.

  9. arXiv:2307.00910

    cs.CV cs.AI

    CoPL: Contextual Prompt Learning for Vision-Language Understanding

    Authors: Koustava Goswami, Srikrishna Karanam, Prateksha Udhayanan, K J Joseph, Balaji Vasan Srinivasan

    Abstract: Recent advances in multimodal learning has resulted in powerful vision-language models, whose representations are generalizable across a variety of downstream tasks. Recently, their generalization ability has been further extended by incorporating trainable prompts, borrowed from the natural language processing literature. While such prompt learning techniques have shown impressive results, we ide…

    Submitted 12 December, 2023; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: Accepted at AAAI 2024

  10. arXiv:2306.14603

    cs.CV

    Learning with Difference Attention for Visually Grounded Self-supervised Representations

    Authors: Aishwarya Agarwal, Srikrishna Karanam, Balaji Vasan Srinivasan

    Abstract: Recent works in self-supervised learning have shown impressive results on single-object images, but they struggle to perform well on complex multi-object images as evidenced by their poor visual grounding. To demonstrate this concretely, we propose visual difference attention (VDA) to compute visual attention maps in an unsupervised fashion by comparing an image with its salient-regions-masked-out…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: 15 pages, 14 figures

  11. arXiv:2306.14544

    cs.CV

    A-STAR: Test-time Attention Segregation and Retention for Text-to-image Synthesis

    Authors: Aishwarya Agarwal, Srikrishna Karanam, K J Joseph, Apoorv Saxena, Koustava Goswami, Balaji Vasan Srinivasan

    Abstract: While recent developments in text-to-image generative models have led to a suite of high-performing methods capable of producing creative imagery from free-form text, there are several limitations. By analyzing the cross-attention representations of these models, we notice two key issues. First, for text prompts that contain multiple concepts, there is a significant amount of pixel-space overlap (…

    Submitted 26 June, 2023; originally announced June 2023.

    Comments: 15 pages, 16 figures

  12. arXiv:2212.09825

    cs.CL

    What to Read in a Contract? Party-Specific Summarization of Legal Obligations, Entitlements, and Prohibitions

    Authors: Abhilasha Sancheti, Aparna Garimella, Balaji Vasan Srinivasan, Rachel Rudinger

    Abstract: Reviewing and comprehending key obligations, entitlements, and prohibitions in legal contracts can be a tedious task due to their length and domain-specificity. Furthermore, the key rights and duties requiring review vary for each contracting party. In this work, we propose a new task of party-specific extractive summarization for legal contracts to facilitate faster reviewing and improved compreh…

    Submitted 24 October, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: EMNLP 2023

  13. arXiv:2211.12752

    cs.CL

    Agent-Specific Deontic Modality Detection in Legal Language

    Authors: Abhilasha Sancheti, Aparna Garimella, Balaji Vasan Srinivasan, Rachel Rudinger

    Abstract: Legal documents are typically long and written in legalese, which makes it particularly difficult for laypeople to understand their rights and duties. While natural language understanding technologies can be valuable in supporting such understanding in the legal domain, the limited availability of datasets annotated for deontic modalities in the legal domain, due to the cost of hiring experts and…

    Submitted 23 November, 2022; originally announced November 2022.

    Comments: Accepted at EMNLP 2022

  14. arXiv:2203.10483

    cs.CL

    Entailment Relation Aware Paraphrase Generation

    Authors: Abhilasha Sancheti, Balaji Vasan Srinivasan, Rachel Rudinger

    Abstract: We introduce a new task of entailment relation aware paraphrase generation which aims at generating a paraphrase conforming to a given entailment relation (e.g. equivalent, forward entailing, or reverse entailing) with respect to a given input. We propose a reinforcement learning-based weakly-supervised paraphrasing system, ERAP, that can be trained using existing paraphrase and natural language i…

    Submitted 20 March, 2022; originally announced March 2022.

    Comments: 11 pages, 10 tables, 2 figures

  15. arXiv:2110.15794

    cs.CL cs.AI

    CLAUSEREC: A Clause Recommendation Framework for AI-aided Contract Authoring

    Authors: Vinay Aggarwal, Aparna Garimella, Balaji Vasan Srinivasan, Anandhavelu N, Rajiv Jain

    Abstract: Contracts are a common type of legal document that frequent in several day-to-day business workflows. However, there has been very limited NLP research in processing such documents, and even lesser in generating them. These contracts are made up of clauses, and the unique nature of these clauses calls for specific methods to understand and generate such documents. In this paper, we introduce the t…

    Submitted 26 October, 2021; originally announced October 2021.

  16. arXiv:2104.07000

    cs.CL

    IGA : An Intent-Guided Authoring Assistant

    Authors: Simeng Sun, Wenlong Zhao, Varun Manjunatha, Rajiv Jain, Vlad Morariu, Franck Dernoncourt, Balaji Vasan Srinivasan, Mohit Iyyer

    Abstract: While large-scale pretrained language models have significantly improved writing assistance functionalities such as autocomplete, more complex and controllable writing assistants have yet to be explored. We leverage advances in language modeling to build an interactive writing assistant that generates and rephrases text according to fine-grained author specifications. Users provide input to our In…

    Submitted 19 September, 2021; v1 submitted 14 April, 2021; originally announced April 2021.

    Comments: EMNLP2021

  17. arXiv:2101.11836

    cs.CL cs.AI cs.LG

    DRAG: Director-Generator Language Modelling Framework for Non-Parallel Author Stylized Rewriting

    Authors: Hrituraj Singh, Gaurav Verma, Aparna Garimella, Balaji Vasan Srinivasan

    Abstract: Author stylized rewriting is the task of rewriting an input text in a particular author's style. Recent works in this area have leveraged Transformer-based language models in a denoising autoencoder setup to generate author stylized text without relying on a parallel corpus of data. However, these approaches are limited by the lack of explicit control of target attributes and being entirely data-d…

    Submitted 28 January, 2021; originally announced January 2021.

    Comments: Accepted as Long Paper to EACL 2021

  18. arXiv:2010.11578

    cs.CL

    Multi-Style Transfer with Discriminative Feedback on Disjoint Corpus

    Authors: Navita Goyal, Balaji Vasan Srinivasan, Anandhavelu Natarajan, Abhilasha Sancheti

    Abstract: Style transfer has been widely explored in natural language generation with non-parallel corpus by directly or indirectly extracting a notion of style from source and target domain corpus. A common shortcoming of existing approaches is the prerequisite of joint annotations across all the stylistic dimensions under consideration. Availability of such dataset across a combination of styles limits th…

    Submitted 12 April, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Report number: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3500–3510

  19. arXiv:2010.11553

    cs.CL

    Incorporating Stylistic Lexical Preferences in Generative Language Models

    Authors: Hrituraj Singh, Gaurav Verma, Balaji Vasan Srinivasan

    Abstract: While recent advances in language modeling have resulted in powerful generation models, their generation style remains implicitly dependent on the training data and can not emulate a specific target style. Leveraging the generative capabilities of a transformer-based language models, we present an approach to induce certain target-author attributes by incorporating continuous multi-dimensional lex…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: To Appear in Findings of EMNLP 2020

  20. arXiv:2005.05256

    cs.CL cs.AI cs.LG

    Reinforced Rewards Framework for Text Style Transfer

    Authors: Abhilasha Sancheti, Kundan Krishna, Balaji Vasan Srinivasan, Anandhavelu Natarajan

    Abstract: Style transfer deals with the algorithms to transfer the stylistic properties of a piece of text into that of another while ensuring that the core content is preserved. There has been a lot of interest in the field of text style transfer due to its wide application to tailored text generation. Existing works evaluate the style transfer models based on content preservation and transfer strength. In…

    Submitted 11 May, 2020; originally announced May 2020.

    Comments: ECIR 2020

  21. arXiv:2004.14243

    cs.CL

    Towards Transparent and Explainable Attention Models

    Authors: Akash Kumar Mohankumar, Preksha Nema, Sharan Narasimhan, Mitesh M. Khapra, Balaji Vasan Srinivasan, Balaraman Ravindran

    Abstract: Recent studies on interpretability of attention distributions have led to notions of faithful and plausible explanations for a model's predictions. Attention distributions can be considered a faithful explanation if a higher attention weight implies a greater impact on the model's prediction. They can be considered a plausible explanation if they provide a human-understandable justification for th…

    Submitted 29 April, 2020; originally announced April 2020.

    Comments: Accepted at ACL 2020

  22. arXiv:1912.08492

    cs.CL

    Generating summaries tailored to target characteristics

    Authors: Kushal Chawla, Hrituraj Singh, Arijit Pramanik, Mithlesh Kumar, Balaji Vasan Srinivasan

    Abstract: Recently, research efforts have gained pace to cater to varied user preferences while generating text summaries. While there have been attempts to incorporate a few handpicked characteristics such as length or entities, a holistic view around these preferences is missing and crucial insights on why certain characteristics should be incorporated in a specific manner are absent. With this objective,…

    Submitted 18 December, 2019; originally announced December 2019.

    Comments: Appeared in CiCLing 2019

  23. arXiv:1909.09962

    cs.CL cs.LG

    Adapting Language Models for Non-Parallel Author-Stylized Rewriting

    Authors: Bakhtiyar Syed, Gaurav Verma, Balaji Vasan Srinivasan, Anandhavelu Natarajan, Vasudeva Varma

    Abstract: Given the recent progress in language modeling using Transformer-based neural models and an active interest in generating stylized text, we present an approach to leverage the generalization capabilities of a language model to rewrite an input text in a target author's style. Our proposed approach adapts a pre-trained language model to generate author-stylized text by fine-tuning on the author-spe…

    Submitted 31 October, 2020; v1 submitted 22 September, 2019; originally announced September 2019.

    Comments: Accepted for publication in Main Technical Track at AAAI 20

  24. arXiv:1909.08349

    cs.CL cs.LG

    A Lexical, Syntactic, and Semantic Perspective for Understanding Style in Text

    Authors: Gaurav Verma, Balaji Vasan Srinivasan

    Abstract: With a growing interest in modeling inherent subjectivity in natural language, we present a linguistically-motivated process to understand and analyze the writing style of individuals from three perspectives: lexical, syntactic, and semantic. We discuss the stylistically expressive elements within each of these levels and use existing methods to quantify the linguistic intuitions related to some o…

    Submitted 18 September, 2019; originally announced September 2019.

  25. arXiv:1909.05355

    cs.CL cs.AI

    Let's Ask Again: Refine Network for Automatic Question Generation

    Authors: Preksha Nema, Akash Kumar Mohankumar, Mitesh M. Khapra, Balaji Vasan Srinivasan, Balaraman Ravindran

    Abstract: In this work, we focus on the task of Automatic Question Generation (AQG) where given a passage and an answer the task is to generate the corresponding question. It is desired that the generated question should be (i) grammatically correct (ii) answerable from the passage and (iii) specific to the given answer. An analysis of existing AQG models shows that they produce questions which do not adher…

    Submitted 31 August, 2019; originally announced September 2019.

    Comments: accepted in EMNLP 2019 in Main Conference, (10 pages)

  26. arXiv:1901.11492

    cs.LG cs.CL stat.ML

    Improving generation quality of pointer networks via guided attention

    Authors: Kushal Chawla, Kundan Krishna, Balaji Vasan Srinivasan

    Abstract: Pointer generator networks have been used successfully for abstractive summarization. Along with the capability to generate novel words, it also allows the model to copy from the input text to handle out-of-vocabulary words. In this paper, we point out two key shortcomings of the summaries generated with this framework via manual inspection, statistical analysis and human evaluation. The first sho…

    Submitted 20 January, 2019; originally announced January 2019.

    Comments: In AAAI-19 Workshop on Network Interpretability for Deep Learning