Zum Hauptinhalt springen

Showing 1–33 of 33 results for author: Prabhumoye, S

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.06380  [pdf, other

    cs.CL

    Data, Data Everywhere: A Guide for Pretraining Dataset Construction

    Authors: Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Bo Liu, Aastha Jhunjhunwala, Zhilin Wang, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro

    Abstract: The impressive capabilities of recent language models can be largely attributed to the multi-trillion token pretraining datasets that they are trained on. However, model developers fail to disclose their construction methodology which has lead to a lack of open information on how to develop effective pretraining sets. To address this issue, we perform the first systematic study across the entire p… ▽ More

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: Preprint. Under review

  2. arXiv:2406.11704  [pdf, other

    cs.CL cs.AI cs.LG

    Nemotron-4 340B Technical Report

    Authors: Nvidia, :, Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek , et al. (58 additional authors not shown)

    Abstract: We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward. Our models are open access under the NVIDIA Open Model License Agreement, a permissive model license that allows distribution, modification, and use of the models and its outputs. These models perform competitively to open access models on a wide range of evaluation be… ▽ More

    Submitted 6 August, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  3. arXiv:2404.11483  [pdf, other

    cs.AI cs.LG

    AgentKit: Structured LLM Reasoning with Dynamic Graphs

    Authors: Yue Wu, Yewen Fan, So Yeon Min, Shrimai Prabhumoye, Stephen McAleer, Yonatan Bisk, Ruslan Salakhutdinov, Yuanzhi Li, Tom Mitchell

    Abstract: We propose an intuitive LLM prompting framework (AgentKit) for multifunctional agents. AgentKit offers a unified framework for explicitly constructing a complex "thought process" from simple natural language prompts. The basic building block in AgentKit is a node, containing a natural language prompt for a specific subtask. The user then puts together chains of nodes, like stacking LEGO pieces. Th… ▽ More

    Submitted 24 July, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

  4. arXiv:2402.16819  [pdf, other

    cs.CL cs.AI cs.LG

    Nemotron-4 15B Technical Report

    Authors: Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Mostofa Patwary, Sandeep Subramanian, Dan Su, Chen Zhu, Deepak Narayanan, Aastha Jhunjhunwala, Ayush Dattagupta, Vibhu Jawa, Jiwei Liu, Ameya Mahabaleshwarkar, Osvald Nitski, Annika Brundyn, James Maki, Miguel Martinez, Jiaxuan You, John Kamalu, Patrick LeGresley, Denys Fridman, Jared Casper, Ashwath Aithal, Oleksii Kuchaiev, Mohammad Shoeybi , et al. (2 additional authors not shown)

    Abstract: We introduce Nemotron-4 15B, a 15-billion-parameter large multilingual language model trained on 8 trillion text tokens. Nemotron-4 15B demonstrates strong performance when assessed on English, multilingual, and coding tasks: it outperforms all existing similarly-sized open models on 4 out of 7 downstream evaluation areas and achieves competitive performance to the leading open models in the remai… ▽ More

    Submitted 27 February, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

  5. arXiv:2305.15486  [pdf, other

    cs.AI cs.LG

    SPRING: Studying the Paper and Reasoning to Play Games

    Authors: Yue Wu, Shrimai Prabhumoye, So Yeon Min, Yonatan Bisk, Ruslan Salakhutdinov, Amos Azaria, Tom Mitchell, Yuanzhi Li

    Abstract: Open-world survival games pose significant challenges for AI algorithms due to their multi-tasking, deep exploration, and goal prioritization requirements. Despite reinforcement learning (RL) being popular for solving games, its high sample complexity limits its effectiveness in complex open-world games like Crafter or Minecraft. We propose a novel approach, SPRING, to read the game's original aca… ▽ More

    Submitted 11 December, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

  6. arXiv:2305.02412  [pdf, other

    cs.CL cs.AI cs.LG

    Plan, Eliminate, and Track -- Language Models are Good Teachers for Embodied Agents

    Authors: Yue Wu, So Yeon Min, Yonatan Bisk, Ruslan Salakhutdinov, Amos Azaria, Yuanzhi Li, Tom Mitchell, Shrimai Prabhumoye

    Abstract: Pre-trained large language models (LLMs) capture procedural knowledge about the world. Recent work has leveraged LLM's ability to generate abstract plans to simplify challenging control tasks, either by action scoring, or action modeling (fine-tuning). However, the transformer architecture inherits several constraints that make it difficult for the LLM to directly serve as the agent: e.g. limited… ▽ More

    Submitted 7 May, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

  7. arXiv:2303.17651  [pdf, other

    cs.CL cs.AI cs.LG

    Self-Refine: Iterative Refinement with Self-Feedback

    Authors: Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark

    Abstract: Like humans, large language models (LLMs) do not always generate the best output on their first try. Motivated by how humans refine their written text, we introduce Self-Refine, an approach for improving initial outputs from LLMs through iterative feedback and refinement. The main idea is to generate an initial output using an LLMs; then, the same LLMs provides feedback for its output and uses it… ▽ More

    Submitted 25 May, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: Code, data, and demo at https://selfrefine.info/

  8. arXiv:2302.07388  [pdf, other

    cs.CL cs.AI

    Adding Instructions during Pretraining: Effective Way of Controlling Toxicity in Language Models

    Authors: Shrimai Prabhumoye, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro

    Abstract: Pretrained large language models have become indispensable for solving various natural language processing (NLP) tasks. However, safely deploying them in real world applications is challenging because they generate toxic content. To address this challenge, we propose two novel pretraining data augmentation strategies that significantly reduce model toxicity without compromising its utility. Our tw… ▽ More

    Submitted 14 February, 2023; originally announced February 2023.

    Comments: This paper will be presented at EACL 2023

  9. arXiv:2302.07371  [pdf, other

    cs.CL cs.CY

    BiasTestGPT: Using ChatGPT for Social Bias Testing of Language Models

    Authors: Rafal Kocielnik, Shrimai Prabhumoye, Vivian Zhang, Roy Jiang, R. Michael Alvarez, Anima Anandkumar

    Abstract: Pretrained Language Models (PLMs) harbor inherent social biases that can result in harmful real-world implications. Such social biases are measured through the probability values that PLMs output for different social groups and attributes appearing in a set of test sentences. However, bias testing is currently cumbersome since the test sentences are generated either from a limited set of manual te… ▽ More

    Submitted 6 December, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    MSC Class: 68T50 ACM Class: I.2.7; J.5; K.4.1

  10. arXiv:2211.11798  [pdf, other

    cs.CL cs.AI

    Can You Label Less by Using Out-of-Domain Data? Active & Transfer Learning with Few-shot Instructions

    Authors: Rafal Kocielnik, Sara Kangaslahti, Shrimai Prabhumoye, Meena Hari, R. Michael Alvarez, Anima Anandkumar

    Abstract: Labeling social-media data for custom dimensions of toxicity and social bias is challenging and labor-intensive. Existing transfer and active learning approaches meant to reduce annotation effort require fine-tuning, which suffers from over-fitting to noise and can cause domain shift with small sample sizes. In this work, we propose a novel Active Transfer Few-shot Instructions (ATF) approach whic… ▽ More

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted to NeurIPS Workshop on Transfer Learning for Natural Language Processing, 2022, New Orleans

  11. arXiv:2210.13673  [pdf, other

    cs.CL

    Evaluating Parameter Efficient Learning for Generation

    Authors: Peng Xu, Mostofa Patwary, Shrimai Prabhumoye, Virginia Adams, Ryan J. Prenger, Wei Ping, Nayeon Lee, Mohammad Shoeybi, Bryan Catanzaro

    Abstract: Parameter efficient learning methods (PERMs) have recently gained significant attention as they provide an efficient way for pre-trained language models (PLMs) to adapt to a downstream task. However, these conclusions are mostly drawn from in-domain evaluations over the full training set. In this paper, we present comparisons between PERMs and finetuning from three new perspectives: (1) the effect… ▽ More

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted to EMNLP 2022 main conference

  12. arXiv:2210.06349  [pdf, other

    cs.CL cs.AI

    Context Generation Improves Open Domain Question Answering

    Authors: Dan Su, Mostofa Patwary, Shrimai Prabhumoye, Peng Xu, Ryan Prenger, Mohammad Shoeybi, Pascale Fung, Anima Anandkumar, Bryan Catanzaro

    Abstract: Closed-book question answering (QA) requires a model to directly answer an open-domain question without access to any external knowledge. Prior work on closed-book QA either directly finetunes or prompts a pretrained language model (LM) to leverage the stored knowledge. However, they do not fully exploit the parameterized knowledge. To address this issue, we propose a two-stage, closed-book QA fra… ▽ More

    Submitted 27 April, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: 8 pages; Accepted at EACL2023

  13. arXiv:2203.08745  [pdf, other

    cs.CL cs.AI

    Multi-Stage Prompting for Knowledgeable Dialogue Generation

    Authors: Zihan Liu, Mostofa Patwary, Ryan Prenger, Shrimai Prabhumoye, Wei Ping, Mohammad Shoeybi, Bryan Catanzaro

    Abstract: Existing knowledge-grounded dialogue systems typically use finetuned versions of a pretrained language model (LM) and large-scale knowledge bases. These models typically fail to generalize on topics outside of the knowledge base, and require maintaining separate potentially large checkpoints each time finetuning is needed. In this paper, we aim to address these limitations by leveraging the inhere… ▽ More

    Submitted 16 March, 2022; originally announced March 2022.

  14. arXiv:2201.11990  [pdf, other

    cs.CL

    Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model

    Authors: Shaden Smith, Mostofa Patwary, Brandon Norick, Patrick LeGresley, Samyam Rajbhandari, Jared Casper, Zhun Liu, Shrimai Prabhumoye, George Zerveas, Vijay Korthikanti, Elton Zhang, Rewon Child, Reza Yazdani Aminabadi, Julie Bernauer, Xia Song, Mohammad Shoeybi, Yuxiong He, Michael Houston, Saurabh Tiwary, Bryan Catanzaro

    Abstract: Pretrained general-purpose language models can achieve state-of-the-art accuracies in various natural language processing domains by adapting to downstream tasks via zero-shot, few-shot and fine-tuning techniques. Because of their success, the size of these models has increased rapidly, requiring high-performance hardware, software, and algorithmic techniques to enable training such large models.… ▽ More

    Submitted 4 February, 2022; v1 submitted 28 January, 2022; originally announced January 2022.

    Comments: Shaden Smith and Mostofa Patwary contributed equally

  15. arXiv:2112.07868  [pdf, other

    cs.CL cs.AI

    Few-shot Instruction Prompts for Pretrained Language Models to Detect Social Biases

    Authors: Shrimai Prabhumoye, Rafal Kocielnik, Mohammad Shoeybi, Anima Anandkumar, Bryan Catanzaro

    Abstract: Detecting social bias in text is challenging due to nuance, subjectivity, and difficulty in obtaining good quality labeled datasets at scale, especially given the evolving nature of social biases and society. To address these challenges, we propose a few-shot instruction-based method for prompting pre-trained language models (LMs). We select a few class-balanced exemplars from a small support repo… ▽ More

    Submitted 15 April, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

    Comments: Submission revised with new results

  16. arXiv:2104.12714  [pdf, other

    cs.CL

    Focused Attention Improves Document-Grounded Generation

    Authors: Shrimai Prabhumoye, Kazuma Hashimoto, Yingbo Zhou, Alan W Black, Ruslan Salakhutdinov

    Abstract: Document grounded generation is the task of using the information provided in a document to improve text generation. This work focuses on two different document grounded generation tasks: Wikipedia Update Generation task and Dialogue response generation. Our work introduces two novel adaptations of large scale pre-trained encoder-decoder models focusing on building context driven representation of… ▽ More

    Submitted 26 April, 2021; originally announced April 2021.

    Comments: Accepted at North American Chapter of the Association for Computational Linguistics (NAACL) 2021

  17. arXiv:2104.00814  [pdf, other

    cs.CL

    CURIE: An Iterative Querying Approach for Reasoning About Situations

    Authors: Dheeraj Rajagopal, Aman Madaan, Niket Tandon, Yiming Yang, Shrimai Prabhumoye, Abhilasha Ravichander, Peter Clark, Eduard Hovy

    Abstract: Recently, models have been shown to predict the effects of unexpected situations, e.g., would cloudy skies help or hinder plant growth? Given a context, the goal of such situational reasoning is to elicit the consequences of a new situation (st) that arises in that context. We propose a method to iteratively build a graph of relevant consequences explicitly in a structured situational graph (st-gr… ▽ More

    Submitted 5 April, 2021; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: This paper builds upon EIGEN (arXiv:2010.11764) and proposes a general framework for situational reasoning

  18. arXiv:2010.11764  [pdf, other

    cs.CL

    EIGEN: Event Influence GENeration using Pre-trained Language Models

    Authors: Aman Madaan, Dheeraj Rajagopal, Yiming Yang, Abhilasha Ravichander, Eduard Hovy, Shrimai Prabhumoye

    Abstract: Reasoning about events and tracking their influences is fundamental to understanding processes. In this paper, we present EIGEN - a method to leverage pre-trained language models to generate event influences conditioned on a context, nature of their influence, and the distance in a reasoning chain. We also derive a new dataset for research and evaluation of methods for event influence generation.… ▽ More

    Submitted 22 October, 2020; originally announced October 2020.

  19. arXiv:2010.04658  [pdf, other

    cs.CL

    Case Study: Deontological Ethics in NLP

    Authors: Shrimai Prabhumoye, Brendon Boldt, Ruslan Salakhutdinov, Alan W Black

    Abstract: Recent work in natural language processing (NLP) has focused on ethical challenges such as understanding and mitigating bias in data and algorithms; identifying objectionable content like hate speech, stereotypes and offensive language; and building frameworks for better system design and data handling practices. However, there has been little discussion about the ethical foundations that underlie… ▽ More

    Submitted 12 April, 2021; v1 submitted 9 October, 2020; originally announced October 2020.

    Comments: Accepted at North American Chapter of the Association for Computational Linguistics (NAACL) 2021

  20. arXiv:2005.01822  [pdf, other

    cs.CL

    Exploring Controllable Text Generation Techniques

    Authors: Shrimai Prabhumoye, Alan W Black, Ruslan Salakhutdinov

    Abstract: Neural controllable text generation is an important area gaining attention due to its plethora of applications. Although there is a large body of prior work in controllable text generation, there is no unifying theme. In this work, we provide a new schema of the pipeline of the generation process by classifying it into five modules. The control of attributes in the generation process requires modi… ▽ More

    Submitted 30 October, 2020; v1 submitted 4 May, 2020; originally announced May 2020.

    Comments: Will be published at COLING 2020

  21. arXiv:2005.00432  [pdf, ps, other

    cs.CL

    Topological Sort for Sentence Ordering

    Authors: Shrimai Prabhumoye, Ruslan Salakhutdinov, Alan W Black

    Abstract: Sentence ordering is the task of arranging the sentences of a given text in the correct order. Recent work using deep neural networks for this task has framed it as a sequence prediction problem. In this paper, we propose a new framing of this task as a constraint solving problem and introduce a new technique to solve it. Additionally, we propose a human evaluation for this task. The results on bo… ▽ More

    Submitted 1 May, 2020; originally announced May 2020.

    Comments: Will be published at the Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL) 2020

  22. arXiv:2004.14257  [pdf, other

    cs.CL

    Politeness Transfer: A Tag and Generate Approach

    Authors: Aman Madaan, Amrith Setlur, Tanmay Parekh, Barnabas Poczos, Graham Neubig, Yiming Yang, Ruslan Salakhutdinov, Alan W Black, Shrimai Prabhumoye

    Abstract: This paper introduces a new task of politeness transfer which involves converting non-polite sentences to polite sentences while preserving the meaning. We also provide a dataset of more than 1.39 instances automatically labeled for politeness to encourage benchmark evaluations on this new task. We design a tag and generate pipeline that identifies stylistic attributes and subsequently generates a… ▽ More

    Submitted 1 May, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

    Comments: To appear at ACL 2020

  23. arXiv:2002.02878  [pdf, other

    cs.AI cs.CL stat.ML

    I love your chain mail! Making knights smile in a fantasy game world: Open-domain goal-oriented dialogue agents

    Authors: Shrimai Prabhumoye, Margaret Li, Jack Urbanek, Emily Dinan, Douwe Kiela, Jason Weston, Arthur Szlam

    Abstract: Dialogue research tends to distinguish between chit-chat and goal-oriented tasks. While the former is arguably more naturalistic and has a wider use of language, the latter has clearer metrics and a straightforward learning signal. Humans effortlessly combine the two, for example engaging in chit-chat with the goal of exchanging information or eliciting a specific response. Here, we bridge the div… ▽ More

    Submitted 10 February, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

  24. arXiv:2001.04980  [pdf, other

    cs.IR cs.CL cs.LG stat.ML

    Modeling Product Search Relevance in e-Commerce

    Authors: Rahul Radhakrishnan Iyer, Rohan Kohli, Shrimai Prabhumoye

    Abstract: With the rapid growth of e-Commerce, online product search has emerged as a popular and effective paradigm for customers to find desired products and engage in online shopping. However, there is still a big gap between the products that customers really desire to purchase and relevance of products that are suggested in response to a query from the customer. In this paper, we propose a robust way o… ▽ More

    Submitted 14 January, 2020; originally announced January 2020.

    Comments: 11 pages, 3 figures

  25. arXiv:1911.09194  [pdf, other

    cs.AI cs.CL cs.LG

    Generating Interactive Worlds with Text

    Authors: Angela Fan, Jack Urbanek, Pratik Ringshia, Emily Dinan, Emma Qian, Siddharth Karamcheti, Shrimai Prabhumoye, Douwe Kiela, Tim Rocktaschel, Arthur Szlam, Jason Weston

    Abstract: Procedurally generating cohesive and interesting game environments is challenging and time-consuming. In order for the relationships between the game elements to be natural, common-sense has to be encoded into arrangement of the elements. In this work, we investigate a machine learning approach for world creation using content from the multi-player text adventure game environment LIGHT. We introdu… ▽ More

    Submitted 4 December, 2019; v1 submitted 20 November, 2019; originally announced November 2019.

  26. arXiv:1906.06425  [pdf, other

    cs.CL cs.AI

    Principled Frameworks for Evaluating Ethics in NLP Systems

    Authors: Shrimai Prabhumoye, Elijah Mayfield, Alan W Black

    Abstract: We critique recent work on ethics in natural language processing. Those discussions have focused on data collection, experimental design, and interventions in modeling. But we argue that we ought to first understand the frameworks of ethics that are being used to evaluate the fairness and justice of algorithmic systems. Here, we begin that discussion by outlining deontological ethics, and envision… ▽ More

    Submitted 14 June, 2019; originally announced June 2019.

    Journal ref: Widening NLP Workshop at ACL 2019

  27. arXiv:1906.06401  [pdf, other

    cs.CL

    "My Way of Telling a Story": Persona based Grounded Story Generation

    Authors: Shrimai Prabhumoye, Khyathi Raghavi Chandu, Ruslan Salakhutdinov, Alan W Black

    Abstract: Visual storytelling is the task of generating stories based on a sequence of images. Inspired by the recent works in neural generation focusing on controlling the form of text, this paper explores the idea of generating these stories in different personas. However, one of the main challenges of performing this task is the lack of a dataset of visual stories in different personas. Having said that,… ▽ More

    Submitted 14 June, 2019; originally announced June 2019.

    Journal ref: Storytelling Workshop at ACL 2019

  28. arXiv:1905.05293  [pdf, other

    cs.CL

    Towards Content Transfer through Grounded Text Generation

    Authors: Shrimai Prabhumoye, Chris Quirk, Michel Galley

    Abstract: Recent work in neural generation has attracted significant interest in controlling the form of text, such as style, persona, and politeness. However, there has been less work on controlling neural text generation for content. This paper introduces the notion of Content Transfer for long-form text generation, where the task is to generate a next sentence in a document that both fits its context and… ▽ More

    Submitted 13 May, 2019; originally announced May 2019.

    Journal ref: Proc. NAACL 2019

  29. arXiv:1902.00098  [pdf, other

    cs.AI cs.CL cs.HC

    The Second Conversational Intelligence Challenge (ConvAI2)

    Authors: Emily Dinan, Varvara Logacheva, Valentin Malykh, Alexander Miller, Kurt Shuster, Jack Urbanek, Douwe Kiela, Arthur Szlam, Iulian Serban, Ryan Lowe, Shrimai Prabhumoye, Alan W Black, Alexander Rudnicky, Jason Williams, Joelle Pineau, Mikhail Burtsev, Jason Weston

    Abstract: We describe the setting and results of the ConvAI2 NeurIPS competition that aims to further the state-of-the-art in open-domain chatbots. Some key takeaways from the competition are: (i) pretrained Transformer variants are currently the best performing models on this task, (ii) but to improve performance on multi-turn conversations with humans, future systems must go beyond single word metrics lik… ▽ More

    Submitted 31 January, 2019; originally announced February 2019.

  30. arXiv:1809.07358  [pdf, ps, other

    cs.CL

    A Dataset for Document Grounded Conversations

    Authors: Kangyan Zhou, Shrimai Prabhumoye, Alan W Black

    Abstract: This paper introduces a document grounded dataset for text conversations. We define "Document Grounded Conversations" as conversations that are about the contents of a specified document. In this dataset the specified documents were Wikipedia articles about popular movies. The dataset contains 4112 conversations with an average of 21.43 turns per conversation. This positions this dataset to not on… ▽ More

    Submitted 19 September, 2018; originally announced September 2018.

    Journal ref: Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018

  31. arXiv:1809.06284  [pdf, other

    cs.CL

    Style Transfer Through Multilingual and Feedback-Based Back-Translation

    Authors: Shrimai Prabhumoye, Yulia Tsvetkov, Alan W Black, Ruslan Salakhutdinov

    Abstract: Style transfer is the task of transferring an attribute of a sentence (e.g., formality) while maintaining its semantic content. The key challenge in style transfer is to strike a balance between the competing goals, one to preserve meaning and the other to improve the style transfer accuracy. Prior research has identified that the task of meaning preservation is generally harder to attain and eval… ▽ More

    Submitted 17 September, 2018; originally announced September 2018.

  32. arXiv:1804.09000  [pdf, other

    cs.CL

    Style Transfer Through Back-Translation

    Authors: Shrimai Prabhumoye, Yulia Tsvetkov, Ruslan Salakhutdinov, Alan W Black

    Abstract: Style transfer is the task of rephrasing the text to contain specific stylistic properties without changing the intent or affect within the context. This paper introduces a new method for automatic style transfer. We first learn a latent representation of the input sentence which is grounded in a language translation model in order to better preserve the meaning of the sentence while reducing styl… ▽ More

    Submitted 24 May, 2018; v1 submitted 24 April, 2018; originally announced April 2018.

    Comments: Accepted at ACL 2018

  33. arXiv:1707.04546  [pdf, other

    cs.CL cs.SI

    Linguistic Markers of Influence in Informal Interactions

    Authors: Shrimai Prabhumoye, Samridhi Choudhary, Evangelia Spiliopoulou, Christopher Bogart, Carolyn Penstein Rose, Alan W Black

    Abstract: There has been a long standing interest in understanding `Social Influence' both in Social Sciences and in Computational Linguistics. In this paper, we present a novel approach to study and measure interpersonal influence in daily interactions. Motivated by the basic principles of influence, we attempt to identify indicative linguistic features of the posts in an online knitting community. We pres… ▽ More

    Submitted 14 July, 2017; originally announced July 2017.

    Comments: 10 pages, Accepted in NLP+CSS workshop for ACL (Association for Computational Linguistics) 2017