Showing 1–50 of 53 results for author: Majumder, P

Searching in archive cs.
  1. arXiv:2407.01725  [pdf, other]

    cs.CL cs.AI cs.LG

    DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

    Authors: Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal, Bhavana Dalvi Mishra, Abhijeetsingh Meena, Aryan Prakhar, Tirth Vora, Tushar Khot, Ashish Sabharwal, Peter Clark

    Abstract: Can the rapid advances in code generation, function calling, and data analysis using large language models (LLMs) help automate the search and verification of hypotheses purely from a set of provided datasets? To evaluate this question, we present DiscoveryBench, the first comprehensive benchmark that formalizes the multi-step process of data-driven discovery. The benchmark is designed to systemat…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Website: https://github.com/allenai/discoverybench

  2. arXiv:2406.08344  [pdf, other]

    cs.CV

    Blind Image Deblurring using FFT-ReLU with Deep Learning Pipeline Integration

    Authors: Abdul Mohaimen Al Radi, Prothito Shovon Majumder, Syed Mumtahin Mahmud, Mahdi Mohd Hossain Noki, Md. Haider Ali, Md. Mosaddek Khan

    Abstract: Blind image deblurring is the process of deriving a sharp image and a blur kernel from a blurred image. Blurry images are typically modeled as the convolution of a sharp image with a blur kernel, necessitating the estimation of the unknown blur kernel to perform blind image deblurring effectively. Existing approaches primarily focus on domain-specific features of images, such as salient edges, dar…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: 20 pages, 13 figures

  3. arXiv:2406.06769  [pdf, other]

    cs.AI cs.CL

    DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents

    Authors: Peter Jansen, Marc-Alexandre Côté, Tushar Khot, Erin Bransom, Bhavana Dalvi Mishra, Bodhisattwa Prasad Majumder, Oyvind Tafjord, Peter Clark

    Abstract: Automated scientific discovery promises to accelerate progress across scientific domains. However, developing and evaluating an AI agent's capacity for end-to-end scientific reasoning is challenging as running real-world experiments is often prohibitively expensive or infeasible. In this work we introduce DISCOVERYWORLD, the first virtual environment for developing and benchmarking an agent's abil…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 9 pages, 4 figures. Preprint, under review

  4. arXiv:2403.15737  [pdf, other]

    cs.CL

    Few-shot Dialogue Strategy Learning for Motivational Interviewing via Inductive Reasoning

    Authors: Zhouhang Xie, Bodhisattwa Prasad Majumder, Mengjie Zhao, Yoshinori Maeda, Keiichi Yamada, Hiromi Wakaki, Julian McAuley

    Abstract: We consider the task of building a dialogue system that can motivate users to adopt positive lifestyle changes: Motivational Interviewing. Addressing such a task requires a system that can infer how to motivate a user effectively. We propose DIIT, a framework that is capable of learning and applying conversation strategies in the form of natural language inductive rules from expert demons…

    Submitted 23 March, 2024; originally announced March 2024.

  5. arXiv:2403.05535  [pdf, other]

    cs.CV cs.AI cs.CL

    Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos

    Authors: Tarun Kalluri, Bodhisattwa Prasad Majumder, Manmohan Chandraker

    Abstract: We introduce LaGTran, a novel framework that utilizes text supervision to guide robust transfer of discriminative knowledge from labeled source to unlabeled target data with domain gaps. While unsupervised adaptation methods have been established to address this problem, they show limitations in handling challenging domain shifts due to their exclusive operation within the pixel-space. Motivated b…

    Submitted 5 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

    Comments: ICML 2024 Camera-Ready. Project Page and Code: https://tarun005.github.io/lagtran/

  6. arXiv:2402.13610  [pdf, other]

    cs.CL cs.AI cs.LG

    Data-driven Discovery with Large Generative Models

    Authors: Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal, Sanchaita Hazra, Ashish Sabharwal, Peter Clark

    Abstract: With the accumulation of data at an unprecedented rate, its potential to fuel scientific discovery is growing exponentially. This position paper urges the Machine Learning (ML) community to exploit the capabilities of large generative models (LGMs) to develop automated systems for end-to-end data-driven discovery -- a paradigm encompassing the search and verification of hypotheses purely from a se…

    Submitted 21 February, 2024; originally announced February 2024.

  7. arXiv:2402.03244  [pdf, other]

    cs.LG cs.CL

    Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills

    Authors: Kolby Nottingham, Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Sameer Singh, Peter Clark, Roy Fox

    Abstract: Large language models (LLMs) have recently been used for sequential decision making in interactive environments. However, leveraging environment reward signals for continual LLM actor improvement is not straightforward. We propose Skill Set Optimization (SSO) for improving LLM actor performance through constructing and refining sets of transferable skills. SSO constructs skills by extracting commo…

    Submitted 22 June, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria. PMLR 235, 2024

  8. arXiv:2311.09510  [pdf, other]

    cs.CL

    Tailoring with Targeted Precision: Edit-Based Agents for Open-Domain Procedure Customization

    Authors: Yash Kumar Lal, Li Zhang, Faeze Brahman, Bodhisattwa Prasad Majumder, Peter Clark, Niket Tandon

    Abstract: How-to procedures, such as how to plant a garden, are now used by millions of users, but sometimes need customizing to meet a user's specific needs, e.g., planting a garden without pesticides. Our goal is to measure and improve an LLM's ability to perform such customization. Our approach is to test several simple multi-LLM-agent architectures for customization, as well as an end-to-end LLM, using…

    Submitted 30 May, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Camera ready version accepted to Findings of ACL 2024

  9. arXiv:2311.07092  [pdf, other]

    cs.CL cs.AI

    To Tell The Truth: Language of Deception and Language Models

    Authors: Sanchaita Hazra, Bodhisattwa Prasad Majumder

    Abstract: Text-based misinformation permeates online discourses, yet evidence of people's ability to discern truth from such deceptive textual content is scarce. We analyze a novel TV game show data where conversations in a high-stake environment between individuals with conflicting objectives result in lies. We investigate the manifestation of potentially verifiable language cues of deception in the presen…

    Submitted 8 April, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Accepted as a full paper in NAACL 2024 (Main)

  10. arXiv:2311.03374  [pdf, other]

    cs.SE cs.AI cs.IR

    Generative AI for Software Metadata: Overview of the Information Retrieval in Software Engineering Track at FIRE 2023

    Authors: Srijoni Majumdar, Soumen Paul, Debjyoti Paul, Ayan Bandyopadhyay, Samiran Chattopadhyay, Partha Pratim Das, Paul D Clough, Prasenjit Majumder

    Abstract: The Information Retrieval in Software Engineering (IRSE) track aims to develop solutions for automated evaluation of code comments in a machine learning framework based on human and large language model generated labels. In this track, there is a binary classification task to classify comments as useful and not useful. The dataset consists of 9048 code comments and surrounding code snippet pairs e…

    Submitted 27 October, 2023; originally announced November 2023.

    Comments: Overview Paper of the Information Retrieval of Software Engineering Track at the Forum for Information Retrieval, 2023

  11. arXiv:2310.10134  [pdf, other]

    cs.CL cs.AI cs.LG

    CLIN: A Continually Learning Language Agent for Rapid Task Adaptation and Generalization

    Authors: Bodhisattwa Prasad Majumder, Bhavana Dalvi Mishra, Peter Jansen, Oyvind Tafjord, Niket Tandon, Li Zhang, Chris Callison-Burch, Peter Clark

    Abstract: Language agents have shown some ability to interact with an external environment, e.g., a virtual world such as ScienceWorld, to perform complex tasks, e.g., growing a plant, without the startup costs of reinforcement learning. However, despite their zero-shot capabilities, these agents to date do not continually improve over time beyond performance refinement on a specific task. Here we present C…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Project page: https://allenai.github.io/clin/

  12. arXiv:2310.05746  [pdf, other]

    cs.CL cs.AI

    Put Your Money Where Your Mouth Is: Evaluating Strategic Planning and Execution of LLM Agents in an Auction Arena

    Authors: Jiangjie Chen, Siyu Yuan, Rong Ye, Bodhisattwa Prasad Majumder, Kyle Richardson

    Abstract: Recent advancements in Large Language Models (LLMs) showcase advanced reasoning, yet NLP evaluations often depend on static benchmarks. Evaluating this necessitates environments that test strategic reasoning in dynamic, competitive scenarios requiring long-term planning. We introduce AucArena, a novel evaluation suite that simulates auctions, a setting chosen for being highly unpredictable and inv…

    Submitted 25 August, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Project page: https://auction-arena.github.io

  13. Large Language Models as Zero-Shot Conversational Recommenders

    Authors: Zhankui He, Zhouhang Xie, Rahul Jha, Harald Steck, Dawen Liang, Yesu Feng, Bodhisattwa Prasad Majumder, Nathan Kallus, Julian McAuley

    Abstract: In this paper, we present empirical studies on conversational recommendation tasks using representative large language models in a zero-shot setting with three primary contributions. (1) Data: To gain insights into model behavior in "in-the-wild" conversational recommendation scenarios, we construct a new dataset of recommendation-related conversations by scraping a popular discussion website. Thi…

    Submitted 19 August, 2023; originally announced August 2023.

    Comments: Accepted as CIKM 2023 long paper. Longer version is coming soon (e.g., more details about dataset)

  14. arXiv:2306.02980  [pdf, other]

    cs.CL cs.AI

    KNOW How to Make Up Your Mind! Adversarially Detecting and Alleviating Inconsistencies in Natural Language Explanations

    Authors: Myeongjun Jang, Bodhisattwa Prasad Majumder, Julian McAuley, Thomas Lukasiewicz, Oana-Maria Camburu

    Abstract: While recent works have been considerably improving the quality of the natural language explanations (NLEs) generated by a model to justify its predictions, there is very limited research in detecting and alleviating inconsistencies among generated NLEs. In this work, we leverage external knowledge bases to significantly improve on an existing adversarial attack for detecting inconsistent NLEs. We…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Short paper, ACL 2023

    Journal ref: The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023)

  15. arXiv:2305.14929  [pdf, other]

    cs.CL

    Aligning Language Models to User Opinions

    Authors: EunJeong Hwang, Bodhisattwa Prasad Majumder, Niket Tandon

    Abstract: An important aspect of developing LLMs that interact with humans is to align models' behavior to their users. It is possible to prompt an LLM into behaving as a certain persona, especially a user group or ideological persona the model captured during its pretraining stage. But how to best align an LLM with a specific user and not a demographic or ideological group remains an open question. Mining…

    Submitted 24 May, 2023; originally announced May 2023.

  16. arXiv:2303.17651  [pdf, other]

    cs.CL cs.AI cs.LG

    Self-Refine: Iterative Refinement with Self-Feedback

    Authors: Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark

    Abstract: Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLM; then, the same LLM provides feedback for its output and uses it…

    Submitted 25 May, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: Code, data, and demo at https://selfrefine.info/

  17. arXiv:2303.12795  [pdf, other]

    cs.CL cs.AI cs.LG

    Named Entity Recognition Based Automatic Generation of Research Highlights

    Authors: Tohida Rehman, Debarshi Kumar Sanyal, Prasenjit Majumder, Samiran Chattopadhyay

    Abstract: A scientific paper is traditionally prefaced by an abstract that summarizes the paper. Recently, research highlights that focus on the main findings of the paper have emerged as a complementary summary in addition to an abstract. However, highlights are not yet as common as abstracts, and are absent in many papers. In this paper, we aim to automatically generate research highlights using different…

    Submitted 25 February, 2023; originally announced March 2023.

    Comments: 7 Pages, 3 Figures, 2 Tables

    Journal ref: https://aclanthology.org/2022.sdp-1.18

  18. arXiv:2210.07455  [pdf, other]

    cs.CL

    Controlling Bias Exposure for Fair Interpretable Predictions

    Authors: Zexue He, Yu Wang, Julian McAuley, Bodhisattwa Prasad Majumder

    Abstract: Recent work on reducing bias in NLP models usually focuses on protecting or isolating information related to a sensitive attribute (like gender or race). However, when sensitive information is semantically entangled with the task information of the input, e.g., gender information is predictive for a profession, a fair trade-off between task performance and bias mitigation is difficult to achieve.…

    Submitted 22 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP-2022 Findings

  19. arXiv:2210.07440  [pdf, other]

    cs.CL

    InterFair: Debiasing with Natural Language Feedback for Fair Interpretable Predictions

    Authors: Bodhisattwa Prasad Majumder, Zexue He, Julian McAuley

    Abstract: Debiasing methods in NLP models traditionally focus on isolating information related to a sensitive attribute (e.g., gender or race). We instead argue that a favorable debiasing method should use sensitive information 'fairly,' with explanations, rather than blindly eliminating it. This fair balance is often subjective and can be challenging to achieve algorithmically. We explore two interactive s…

    Submitted 23 October, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted in EMNLP 2023 (Main)

  20. arXiv:2209.12613  [pdf, other]

    cs.CL cs.AI cs.LG

    Factual and Informative Review Generation for Explainable Recommendation

    Authors: Zhouhang Xie, Sameer Singh, Julian McAuley, Bodhisattwa Prasad Majumder

    Abstract: Recent models can generate fluent and grammatical synthetic reviews while accurately predicting user ratings. The generated reviews, expressing users' estimated opinions towards related products, are often viewed as natural language 'rationales' for the jointly predicted rating. However, previous studies found that existing models often generate repetitive, universally applicable, and generic expl…

    Submitted 28 September, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

    Comments: Typo in footnote 2023->2022; updated bolding/underline in table 1

  21. arXiv:2209.05409  [pdf, other]

    cs.IR cs.AI cs.CL

    On Faithfulness and Coherence of Language Explanations for Recommendation Systems

    Authors: Zhouhang Xie, Julian McAuley, Bodhisattwa Prasad Majumder

    Abstract: Reviews contain rich information about product characteristics and user interests and thus are commonly used to boost recommender system performance. Specifically, previous work shows that jointly learning to perform review generation improves rating prediction performance. Meanwhile, these model-produced reviews serve as recommendation explanations, providing the user with insights on predicted ra…

    Submitted 12 September, 2022; originally announced September 2022.

  22. arXiv:2203.11399  [pdf, other]

    cs.CL

    Achieving Conversational Goals with Unsupervised Post-hoc Knowledge Injection

    Authors: Bodhisattwa Prasad Majumder, Harsh Jhamtani, Taylor Berg-Kirkpatrick, Julian McAuley

    Abstract: A limitation of current neural dialog models is that they tend to suffer from a lack of specificity and informativeness in generated responses, primarily due to dependence on training data that covers a limited variety of scenarios and conveys limited knowledge. One way to alleviate this issue is to extract relevant knowledge from external sources at decoding time and incorporate it into the dialo…

    Submitted 21 March, 2022; originally announced March 2022.

    Comments: Accepted at ACL 2022 main conference

  23. arXiv:2112.09301  [pdf]

    cs.CL cs.AI cs.SI

    Overview of the HASOC Subtrack at FIRE 2021: Hate Speech and Offensive Content Identification in English and Indo-Aryan Languages

    Authors: Thomas Mandl, Sandip Modha, Gautam Kishore Shahi, Hiren Madhu, Shrey Satapara, Prasenjit Majumder, Johannes Schaefer, Tharindu Ranasinghe, Marcos Zampieri, Durgesh Nandini, Amit Kumar Jaiswal

    Abstract: The spread of offensive content online, such as hate speech, poses a growing societal problem. AI tools are necessary for supporting the moderation process at online platforms. For the evaluation of these identification tools, continuous experimentation with data sets in different languages is necessary. The HASOC track (Hate Speech and Offensive Content Identification) is dedicated to develop…

    Submitted 16 December, 2021; originally announced December 2021.

  24. arXiv:2112.05197  [pdf, other]

    cs.CL cs.IR

    Self-Supervised Bot Play for Conversational Recommendation with Justifications

    Authors: Shuyang Li, Bodhisattwa Prasad Majumder, Julian McAuley

    Abstract: Conversational recommender systems offer the promise of interactive, engaging ways for users to find items they enjoy. We seek to improve conversational recommendation via three dimensions: 1) We aim to mimic a common mode of human interaction for recommendation: experts justify their suggestions, a seeker explains why they don't like the item, and both parties iterate through the dialog to find a…

    Submitted 9 December, 2021; originally announced December 2021.

  25. arXiv:2110.01188  [pdf, other]

    cs.CL cs.AI cs.IR

    LawSum: A weakly supervised approach for Indian Legal Document Summarization

    Authors: Vedant Parikh, Vidit Mathur, Parth Mehta, Namita Mittal, Prasenjit Majumder

    Abstract: Unlike the courts in western countries, public records of the Indian judiciary are completely unstructured and noisy. No large-scale publicly available annotated datasets of Indian legal documents exist to date. This limits the scope for legal analytics research. In this work, we propose a new dataset consisting of over 10,000 judgements delivered by the Supreme Court of India and their correspondin…

    Submitted 23 October, 2021; v1 submitted 4 October, 2021; originally announced October 2021.

  26. arXiv:2109.11708  [pdf, other]

    cs.CL

    Detect and Perturb: Neutral Rewriting of Biased and Sensitive Text via Gradient-based Decoding

    Authors: Zexue He, Bodhisattwa Prasad Majumder, Julian McAuley

    Abstract: Written language carries explicit and implicit biases that can distract from meaningful signals. For example, letters of reference may describe male and female candidates differently, or their writing style may indirectly reveal demographic characteristics. At best, such biases distract from the meaningful content of the text; at worst they can lead to unfair outcomes. We investigate the challenge…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: To appear at EMNLP-2021 as Findings

  27. arXiv:2108.05927  [pdf]

    cs.CL cs.CY

    Overview of the HASOC track at FIRE 2020: Hate Speech and Offensive Content Identification in Indo-European Languages

    Authors: Thomas Mandla, Sandip Modha, Gautam Kishore Shahi, Amit Kumar Jaiswal, Durgesh Nandini, Daksh Patel, Prasenjit Majumder, Johannes Schäfer

    Abstract: With the growth of social media, the spread of hate speech is also increasing rapidly. Social media are widely used in many countries, and hate speech is spreading in these countries as well. This brings a need for multilingual hate speech detection algorithms. Much research in this area is dedicated to English at the moment. The HASOC track intends to provide a platform to develop and optimize Hate Spee…

    Submitted 12 August, 2021; originally announced August 2021.

    Comments: 25 pages

  28. arXiv:2106.13876  [pdf, other]

    cs.CL cs.AI cs.LG

    Knowledge-Grounded Self-Rationalization via Extractive and Natural Language Explanations

    Authors: Bodhisattwa Prasad Majumder, Oana-Maria Camburu, Thomas Lukasiewicz, Julian McAuley

    Abstract: Models that generate extractive rationales (i.e., subsets of features) or natural language explanations (NLEs) for their predictions are important for explainable AI. While an extractive rationale provides a quick view of the features most responsible for a prediction, an NLE allows for a comprehensive description of the decision-making process behind a prediction. However, current models that gen…

    Submitted 16 September, 2022; v1 submitted 25 June, 2021; originally announced June 2021.

    Comments: Accepted in ICML 2022 as a spotlight

  29. arXiv:2106.08364  [pdf, other]

    cs.CL

    Unsupervised Enrichment of Persona-grounded Dialog with Background Stories

    Authors: Bodhisattwa Prasad Majumder, Taylor Berg-Kirkpatrick, Julian McAuley, Harsh Jhamtani

    Abstract: Humans often refer to personal narratives, life experiences, and events to make a conversation more engaging and rich. While persona-grounded dialog models are able to generate responses that follow a given persona, they often miss out on stating detailed experiences or events related to a persona, often leaving conversations shallow and dull. In this work, we equip dialog models with 'background…

    Submitted 15 June, 2021; originally announced June 2021.

    Comments: Accepted at ACL 2021 for oral presentation

  30. arXiv:2104.13671  [pdf, other]

    cs.AR cs.LG cs.NI cs.OS

    Continual Learning Approach for Improving the Data and Computation Mapping in Near-Memory Processing System

    Authors: Pritam Majumder, Jiayi Huang, Sungkeun Kim, Abdullah Muzahid, Dylan Siegers, Chia-Che Tsai, Eun Jung Kim

    Abstract: The resurgence of near-memory processing (NMP) with the advent of big data has shifted the computation paradigm from processor-centric to memory-centric computing. To meet the bandwidth and capacity demands of memory-centric computing, 3D memory has been adopted to form a scalable memory-cube network. Along with NMP and memory system development, the mapping for placing data and guiding computatio…

    Submitted 28 April, 2021; originally announced April 2021.

  31. arXiv:2104.06828  [pdf, other]

    cs.CL

    Ask what's missing and what's useful: Improving Clarification Question Generation using Global Knowledge

    Authors: Bodhisattwa Prasad Majumder, Sudha Rao, Michel Galley, Julian McAuley

    Abstract: The ability to generate clarification questions i.e., questions that identify useful missing information in a given context, is important in reducing ambiguity. Humans use previous experience with similar contexts to form a global view and compare it to the given context to ascertain what is missing and what is useful in the context. Inspired by this, we propose a model for clarification question…

    Submitted 14 April, 2021; originally announced April 2021.

    Comments: Accepted in NAACL 2021, Code is available at https://github.com/microsoft/clarification-qgen-globalinfo

  32. arXiv:2102.01672  [pdf, other]

    cs.CL cs.AI cs.LG

    The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics

    Authors: Sebastian Gehrmann, Tosin Adewumi, Karmanya Aggarwal, Pawan Sasanka Ammanamanchi, Aremu Anuoluwapo, Antoine Bosselut, Khyathi Raghavi Chandu, Miruna Clinciu, Dipanjan Das, Kaustubh D. Dhole, Wanyu Du, Esin Durmus, Ondřej Dušek, Chris Emezue, Varun Gangal, Cristina Garbacea, Tatsunori Hashimoto, Yufang Hou, Yacine Jernite, Harsh Jhamtani, Yangfeng Ji, Shailza Jolly, Mihir Kale, Dhruv Kumar, Faisal Ladhak, et al. (31 additional authors not shown)

    Abstract: We introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. Measuring progress in NLG relies on a constantly evolving ecosystem of automated metrics, datasets, and human evaluation standards. Due to this moving target, new models often still evaluate on divergent anglo-centric corpora with well-established, but flawed, metrics. This disconnect makes it…

    Submitted 1 April, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

  33. arXiv:2010.03205  [pdf, other]

    cs.CL cs.AI

    Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions

    Authors: Bodhisattwa Prasad Majumder, Harsh Jhamtani, Taylor Berg-Kirkpatrick, Julian McAuley

    Abstract: Existing persona-grounded dialog models often fail to capture simple implications of given persona descriptions, something which humans are able to do seamlessly. For example, state-of-the-art models cannot infer that interest in hiking might imply love for nature or longing for a break. In this paper, we propose to expand available persona sentences using existing commonsense knowledge bases and…

    Submitted 7 October, 2020; originally announced October 2020.

    Comments: Accepted in EMNLP 2020

  34. arXiv:2004.03090  [pdf, other]

    cs.CL

    Interview: A Large-Scale Open-Source Corpus of Media Dialog

    Authors: Bodhisattwa Prasad Majumder, Shuyang Li, Jianmo Ni, Julian McAuley

    Abstract: Existing conversational datasets consist either of written proxies for dialog or small-scale transcriptions of natural speech. We introduce 'Interview': a large-scale (105K conversations) media dialog dataset collected from news interview transcripts. Compared to existing large-scale proxies for conversational data, language models trained on our dataset exhibit better zero-shot out-of-domain perf…

    Submitted 6 April, 2020; originally announced April 2020.

  35. arXiv:2003.04887  [pdf, other]

    cs.LG cs.CL stat.ML

    ReZero is All You Need: Fast Convergence at Large Depth

    Authors: Thomas Bachlechner, Bodhisattwa Prasad Majumder, Huanru Henry Mao, Garrison W. Cottrell, Julian McAuley

    Abstract: Deep networks often suffer from vanishing or exploding gradients due to inefficient signal propagation, leading to long training times or convergence difficulties. Various architecture designs, sophisticated residual-style networks, and initialization schemes have been shown to improve deep signal propagation. Recently, Pennington et al. used free probability theory to show that dynamical isometry…

    Submitted 24 June, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

  36. arXiv:1910.04882  [pdf, other]

    cs.AR cs.NI

    Remote Control: A Simple Deadlock Avoidance Scheme for Modular System on Chip

    Authors: Pritam Majumder, Sungkeun Kim, Jiayi Huang, Ki Hwan Yum, Eun Jung Kim

    Abstract: The increase in design cost and complexity has motivated designers to adopt modular design of System on Chip (SoC) by integrating independently designed small chiplets. However, it introduces new challenges for correctness validation, increasing chances of forming deadlock in the system involving multiple chiplets. Although there have been many solutions available for deadlock freedom in flat net…

    Submitted 10 October, 2019; originally announced October 2019.

  37. arXiv:1909.00105  [pdf, other]

    cs.CL cs.AI cs.LG

    Generating Personalized Recipes from Historical User Preferences

    Authors: Bodhisattwa Prasad Majumder, Shuyang Li, Jianmo Ni, Julian McAuley

    Abstract: Existing approaches to recipe generation are unable to create recipes for users with culinary preferences but incomplete knowledge of ingredients in specific dishes. We propose a new task of personalized recipe generation to help these users: expanding a name and incomplete ingredient details into complete natural-text instructions aligned with the user's historical preferences. We attend on techn…

    Submitted 30 August, 2019; originally announced September 2019.

    Comments: Accepted in EMNLP 2019. Data and codes are available at https://github.com/majumderb/recipe-personalization

  38. arXiv:1908.09451  [pdf, ps, other]

    cs.LG cs.CL stat.ML

    Improving Neural Story Generation by Targeted Common Sense Grounding

    Authors: Huanru Henry Mao, Bodhisattwa Prasad Majumder, Julian McAuley, Garrison W. Cottrell

    Abstract: Stories generated with neural language models have shown promise in grammatical and stylistic consistency. However, the generated stories are still lacking in common sense reasoning, e.g., they often contain sentences deprived of world knowledge. We propose a simple multi-task learning scheme to achieve quantitatively better common sense reasoning in language models by leveraging auxiliary trainin…

    Submitted 27 February, 2020; v1 submitted 25 August, 2019; originally announced August 2019.

  39. arXiv:1904.08770  [pdf, other]

    cs.IR cs.CL cs.SI

    An Empirical Evaluation of Text Representation Schemes on Multilingual Social Web to Filter the Textual Aggression

    Authors: Sandip Modha, Prasenjit Majumder

    Abstract: This paper attempts to study the effectiveness of text representation schemes on two tasks, namely User Aggression and Fact Detection from social media content. In User Aggression detection, the aim is to identify the level of aggression from content generated on social media and written in English, Devanagari Hindi, and Romanized Hindi. Aggression levels are categorized into three…

    Submitted 16 April, 2019; originally announced April 2019.

    Comments: 21 pages, 2 figures

    ACM Class: I.2.7; I.2.6

  40. arXiv:1809.02343  [pdf, other]

    cs.IR cs.AI cs.CL

    Exploiting local and global performance of candidate systems for aggregation of summarization techniques

    Authors: Parth Mehta, Prasenjit Majumder

    Abstract: With an ever-growing number of extractive summarization techniques being proposed, there is less clarity than ever about how good each system is compared to the rest. Several studies highlight the variance in performance of these systems with change in datasets or even across documents within the same corpus. An effective way to counter this variance and to make the systems more robust could be to…

    Submitted 7 September, 2018; originally announced September 2018.

  41. arXiv:1809.02147  [pdf, other]

    cs.CL

    Upcycle Your OCR: Reusing OCRs for Post-OCR Text Correction in Romanised Sanskrit

    Authors: Amrith Krishna, Bodhisattwa Prasad Majumder, Rajesh Shreedhar Bhat, Pawan Goyal

    Abstract: We propose a post-OCR text correction approach for digitising texts in Romanised Sanskrit. Owing to the lack of resources, our approach uses OCR models trained for other languages written in Roman. Currently, there exists no dataset available for Romanised Sanskrit OCR. So, we bootstrap a dataset of 430 images, scanned in two different settings and their corresponding ground truth. For training, we…

    Submitted 6 September, 2018; originally announced September 2018.

    Comments: This paper has been accepted as a full paper in the SIGNLL Conference on Computational Natural Language Learning (CoNLL), 2018. The code, data, and supplementary material are available at https://github.com/majumderb/sanskrit-ocr

  42. arXiv:1803.11284  [pdf, ps, other]

    cs.CL cs.AI

    Deep Recurrent Neural Networks for Product Attribute Extraction in eCommerce

    Authors: Bodhisattwa Prasad Majumder, Aditya Subramanian, Abhinandan Krishnan, Shreyansh Gandhi, Ajinkya More

    Abstract: Extracting accurate attribute qualities from product titles is a vital component in delivering eCommerce customers with a rewarding online shopping experience via an enriched faceted search. We demonstrate the potential of Deep Recurrent Networks in this domain, primarily models such as Bidirectional LSTMs and Bidirectional LSTM-CRF with or without an attention mechanism. These have improved overa…

    Submitted 29 March, 2018; originally announced March 2018.

  43. arXiv:1802.04675  [pdf, other]

    cs.IR cs.AI cs.CL

    Attention based Sentence Extraction from Scientific Articles using Pseudo-Labeled data

    Authors: Parth Mehta, Gaurav Arora, Prasenjit Majumder

    Abstract: In this work, we present a weakly supervised sentence extraction technique for identifying important sentences in scientific papers that are worthy of inclusion in the abstract. We propose a new attention based deep learning architecture that jointly learns to identify important content, as well as the cue phrases that are indicative of summary worthy sentences. We propose a new context embedding…

    Submitted 13 February, 2018; originally announced February 2018.

  44. arXiv:1802.00946  [pdf, ps, other]

    cs.IR cs.CL

    Content based Weighted Consensus Summarization

    Authors: Parth Mehta, Prasenjit Majumder

    Abstract: Multi-document summarization has received a great deal of attention in the past couple of decades. Several approaches have been proposed, many of which perform equally well, and it is becoming increasingly difficult to choose one particular system over another. An ensemble of such systems that is able to leverage the strengths of each individual system can build a better and more robust summary.…

    Submitted 3 February, 2018; originally announced February 2018.

  45. arXiv:1710.09085  [pdf, other]

    cs.IR cs.CL

    Re-evaluating the need for Modelling Term-Dependence in Text Classification Problems

    Authors: Sounak Banerjee, Prasenjit Majumder, Mandar Mitra

    Abstract: A substantial amount of research has been carried out in developing machine learning algorithms that account for term dependence in text classification. These algorithms offer acceptable performance in most cases but they are associated with a substantial cost. They require significantly greater resources to operate. This paper argues against the justification of the higher costs of these algorith…

    Submitted 25 October, 2017; originally announced October 2017.

    Comments: 23 pages, 16 figures, 3 tables; some figures appear at the end of the document because of limitations of the LaTeX format

    MSC Class: 68P20

  46. arXiv:1610.04872  [pdf]

    cs.AI cs.DC

    Fault Detection Engine in Intelligent Predictive Analytics Platform for DCIM

    Authors: Bodhisattwa Prasad Majumder, Ayan Sengupta, Sajal jain, Parikshit Bhaduri

    Abstract: With the advancement of huge data generation and data handling capability, machine learning and probabilistic modelling enable an immense opportunity to employ predictive analytics platforms in high-security critical industries, namely data centers, electricity grids, utilities, airports, etc., where downtime minimization is one of the primary objectives. This paper proposes a novel, complete architec…

    Submitted 16 October, 2016; originally announced October 2016.

    Comments: Accepted in 4th International Conference on Business Analytics and Intelligence (ICBAI 2016)

  47. arXiv:1603.08564  [pdf]

    cs.CV stat.ML

    Kernelized Weighted SUSAN based Fuzzy C-Means Clustering for Noisy Image Segmentation

    Authors: Satrajit Mukherjee, Bodhisattwa Prasad Majumder, Aritran Piplai, Swagatam Das

    Abstract: The paper proposes a novel Kernelized image segmentation scheme for noisy images that utilizes the concept of Smallest Univalue Segment Assimilating Nucleus (SUSAN) and incorporates spatial constraints by computing circular colour map induced weights. Fuzzy damping coefficients are obtained for each nucleus or center pixel on the basis of the corresponding weighted SUSAN area values, the weights b…

    Submitted 28 March, 2016; originally announced March 2016.

    Comments: Journal Version

  48. arXiv:1505.01606  [pdf]

    cs.IR

    A comparative study of approaches in user-centered health information retrieval

    Authors: Harsh Thakkar, Ganesh Iyer, Prasenjit Majumder

    Abstract: In this paper, we survey various user-centered or context-based biomedical health information retrieval systems. We present and discuss the performance of systems submitted in CLEF eHealth 2014 Task 3 for this purpose. We classify and focus on comparing the two most prevalent retrieval models in biomedical information retrieval, namely the Language Model (LM) and the Vector Space Model (VSM). We also repo…

    Submitted 7 May, 2015; originally announced May 2015.

    Comments: 6 pages, 2 figures, 1 table

  49. arXiv:1312.6947   

    cs.CL cs.AI

    Formal Ontology Learning on Factual IS-A Corpus in English using Description Logics

    Authors: Sourish Dasgupta, Ankur Padia, Kushal Shah, Prasenjit Majumder

    Abstract: Ontology Learning (OL) is the computational task of generating a knowledge base in the form of an ontology given an unstructured corpus whose content is in natural language (NL). Several works can be found in this area, most of which are limited to statistical and lexico-syntactic pattern-matching techniques (Light-Weight OL). These techniques do not lead to very accurate learning mostly becaus…

    Submitted 8 March, 2016; v1 submitted 25 December, 2013; originally announced December 2013.

    Comments: This paper has been withdrawn by the author because the results require re-evaluation

  50. arXiv:1303.5929  [pdf]

    cs.AI

    DLOLIS-A: Description Logic based Text Ontology Learning

    Authors: Sourish Dasgupta, Ankur Padia, Kushal Shah, Rupali KaPatel, Prasenjit Majumder

    Abstract: Ontology Learning has been the subject of intensive study for the past decade. Researchers in this field have been motivated by the possibility of automatically building a knowledge base on top of text documents so as to support reasoning based knowledge extraction. While most works in this field have been primarily statistical (known as light-weight Ontology Learning), not much attempt has been ma…

    Submitted 24 March, 2013; originally announced March 2013.

    Comments: 11 pages