Showing 51–100 of 110 results for author: Mihalcea, R

  1. arXiv:2210.01478  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

    Authors: Zhijing Jin, Sydney Levine, Fernando Gonzalez, Ojasv Kamal, Maarten Sap, Mrinmaya Sachan, Rada Mihalcea, Josh Tenenbaum, Bernhard Schölkopf

    Abstract: AI systems are becoming increasingly intertwined with human life. In order to effectively collaborate with humans and ensure safety, AI systems need to be able to understand, interpret and predict human moral judgments and decisions. Human moral judgments are often guided by rules, but not always. A central challenge for AI safety is capturing the flexibility of the human moral mind -- the ability…

    Submitted 27 October, 2022; v1 submitted 4 October, 2022; originally announced October 2022.

    Comments: NeurIPS 2022 Oral

  2. arXiv:2209.06650  [pdf, other]

    cs.CV cs.CL

    WildQA: In-the-Wild Video Question Answering

    Authors: Santiago Castro, Naihao Deng, Pingxuan Huang, Mihai Burzo, Rada Mihalcea

    Abstract: Existing video understanding datasets mostly focus on human interactions, with little attention being paid to the "in the wild" settings, where the videos are recorded outdoors. We propose WILDQA, a video understanding dataset of videos recorded in outside settings. In addition to video question answering (Video QA), we also introduce the new task of identifying visual support for a given question…

    Submitted 14 September, 2022; originally announced September 2022.

    Comments: *: Equal contribution; COLING 2022 oral; project webpage: https://lit.eecs.umich.edu/wildqa/

  3. arXiv:2208.10766  [pdf, other]

    cs.SI cs.CL

    We Are in This Together: Quantifying Community Subjective Wellbeing and Resilience

    Authors: MeiXing Dong, Ruixuan Sun, Laura Biester, Rada Mihalcea

    Abstract: The COVID-19 pandemic disrupted everyone's life across the world. In this work, we characterize the subjective wellbeing patterns of 112 cities across the United States during the pandemic prior to vaccine availability, as exhibited in subreddits corresponding to the cities. We quantify subjective wellbeing using positive and negative affect. We then measure the pandemic's impact by comparing a co…

    Submitted 23 August, 2022; originally announced August 2022.

    Comments: ICWSM'23 full paper, 12 pages

  4. arXiv:2207.05553  [pdf, other]

    cs.CL

    Using Paraphrases to Study Properties of Contextual Embeddings

    Authors: Laura Burdick, Jonathan K. Kummerfeld, Rada Mihalcea

    Abstract: We use paraphrases as a unique source of data to analyze contextualized embeddings, with a particular focus on BERT. Because paraphrases naturally encode consistent word and phrase semantics, they provide a unique lens for investigating properties of embeddings. Using the Paraphrase Database's alignments, we study words within paraphrases as well as phrase representations. We find that contextual…

    Submitted 12 July, 2022; originally announced July 2022.

    Comments: Published at NAACL 2022

  5. arXiv:2203.13926  [pdf, other]

    cs.CL cs.AI

    CICERO: A Dataset for Contextualized Commonsense Inference in Dialogues

    Authors: Deepanway Ghosal, Siqi Shen, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: This paper addresses the problem of dialogue reasoning with contextualized commonsense inference. We curate CICERO, a dataset of dyadic conversations with five types of utterance-level reasoning-based inferences: cause, subsequent event, prerequisite, motivation, and emotional reaction. The dataset contains 53,105 of such inferences from 5,672 dialogues. We use this dataset to solve relevant gener…

    Submitted 6 April, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: ACL 2022

  6. arXiv:2202.13758  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG cs.LO

    Logical Fallacy Detection

    Authors: Zhijing Jin, Abhinav Lalwani, Tejas Vaidhya, Xiaoyu Shen, Yiwen Ding, Zhiheng Lyu, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf

    Abstract: Reasoning is central to human intelligence. However, fallacious arguments are common, and some exacerbate problems such as spreading misinformation about climate change. In this paper, we propose the task of logical fallacy detection, and provide a new dataset (Logic) of logical fallacies generally found in text, together with an additional challenge set for detecting logical fallacies in climate…

    Submitted 11 December, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: EMNLP 2022 Findings

  7. arXiv:2202.08138  [pdf, other]

    cs.CV cs.CL

    When Did It Happen? Duration-informed Temporal Localization of Narrated Actions in Vlogs

    Authors: Oana Ignat, Santiago Castro, Yuhang Zhou, Jiajun Bao, Dandan Shan, Rada Mihalcea

    Abstract: We consider the task of temporal human action localization in lifestyle vlogs. We introduce a novel dataset consisting of manual annotations of temporal localization for 13,000 narrated actions in 1,200 video clips. We present an extensive analysis of this data, which allows us to better understand how the language and visual modalities interact throughout the videos. We propose a simple yet effec…

    Submitted 21 February, 2022; v1 submitted 16 February, 2022; originally announced February 2022.

    Comments: arXiv admin note: text overlap with arXiv:1906.04236

  8. arXiv:2202.07094  [pdf, other]

    cs.CL

    Matching Tweets With Applicable Fact-Checks Across Languages

    Authors: Ashkan Kazemi, Zehua Li, Verónica Pérez-Rosas, Scott A. Hale, Rada Mihalcea

    Abstract: An important challenge for news fact-checking is the effective dissemination of existing fact-checks. This in turn brings the need for reliable methods to detect previously fact-checked claims. In this paper, we focus on automatically finding existing fact-checks for claims made in social media posts (tweets). We conduct both classification and retrieval experiments, in monolingual (English only),…

    Submitted 12 June, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

    Comments: Accepted to De-Factify Workshop at AAAI 2022

  9. arXiv:2110.08445  [pdf, other]

    cs.CL

    How Well Do You Know Your Audience? Toward Socially-aware Question Generation

    Authors: Ian Stewart, Rada Mihalcea

    Abstract: When writing, a person may need to anticipate questions from their audience, but different social groups may ask very different types of questions. If someone is writing about a problem they want to resolve, what kind of follow-up question will a domain expert ask, and could the writer better address the expert's information needs by rewriting their original post? In this paper, we explore the tas…

    Submitted 24 July, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: SIGDIAL 2022

    ACM Class: I.7

  10. arXiv:2109.13770  [pdf, other]

    cs.CL

    Micromodels for Efficient, Explainable, and Reusable Systems: A Case Study on Mental Health

    Authors: Andrew Lee, Jonathan K. Kummerfeld, Lawrence C. An, Rada Mihalcea

    Abstract: Many statistical models have high accuracy on test benchmarks, but are not explainable, struggle in low-resource scenarios, cannot be reused for multiple tasks, and cannot easily integrate domain expertise. These factors limit their use, particularly in settings such as mental health, where it is difficult to annotate datasets and model outputs have significant impact. We introduce a micromodel ar…

    Submitted 28 September, 2021; originally announced September 2021.

    Comments: To appear in Findings of EMNLP 2021

  11. arXiv:2109.02747  [pdf, other]

    cs.CV cs.CL

    WhyAct: Identifying Action Reasons in Lifestyle Vlogs

    Authors: Oana Ignat, Santiago Castro, Hanwen Miao, Weiji Li, Rada Mihalcea

    Abstract: We aim to automatically identify human action reasons in online videos. We focus on the widespread genre of lifestyle vlogs, in which people perform actions while verbally describing them. We introduce and make publicly available the WhyAct dataset, consisting of 1,077 visual actions manually annotated with their reasons. We describe a multimodal model that leverages visual and textual information…

    Submitted 9 September, 2021; v1 submitted 6 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021

  12. arXiv:2109.02247  [pdf, other]

    cs.CL cs.AI

    STaCK: Sentence Ordering with Temporal Commonsense Knowledge

    Authors: Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: Sentence order prediction is the task of finding the correct order of sentences in a randomly ordered document. Correctly ordering the sentences requires an understanding of coherence with respect to the chronological sequence of events described in the text. Document-level contextual understanding and commonsense knowledge centered around these events are often essential in uncovering this cohere…

    Submitted 6 September, 2021; originally announced September 2021.

    Comments: Accepted as a full paper at EMNLP 2021

  13. arXiv:2106.12976  [pdf, other]

    cs.CL

    Exploring Self-Identified Counseling Expertise in Online Support Forums

    Authors: Allison Lahnala, Yuntian Zhao, Charles Welch, Jonathan K. Kummerfeld, Lawrence An, Kenneth Resnicow, Rada Mihalcea, Verónica Pérez-Rosas

    Abstract: A growing number of people engage in online health forums, making it important to understand the quality of the advice they receive. In this paper, we explore the role of expertise in responses provided to help-seeking posts regarding mental health. We study the differences between (1) interactions with peers; and (2) interactions with self-identified mental health professionals. First, we show th…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: Accepted to Findings of ACL 2021

  14. arXiv:2106.11791  [pdf, other]

    cs.CL cs.AI

    Exemplars-guided Empathetic Response Generation Controlled by the Elements of Human Communication

    Authors: Navonil Majumder, Deepanway Ghosal, Devamanyu Hazarika, Alexander Gelbukh, Rada Mihalcea, Soujanya Poria

    Abstract: The majority of existing methods for empathetic response generation rely on the emotion of the context to generate empathetic responses. However, empathy is much more than generating responses with an appropriate emotion. It also often entails subtle expressions of understanding and personal resonance with the situation of the other interlocutor. Unfortunately, such qualities are difficult to quan…

    Submitted 4 August, 2021; v1 submitted 22 June, 2021; originally announced June 2021.

  15. arXiv:2106.02359  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    How Good Is NLP? A Sober Look at NLP Tasks through the Lens of Social Impact

    Authors: Zhijing Jin, Geeticka Chauhan, Brian Tse, Mrinmaya Sachan, Rada Mihalcea

    Abstract: Recent years have seen many breakthroughs in natural language processing (NLP), transitioning it from a mostly theoretical field to one with many real-world applications. Noting the rising number of applications of other machine learning and AI techniques with pervasive societal impact, we anticipate the rising importance of developing NLP technologies for social good. Inspired by theories in mora…

    Submitted 17 January, 2023; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: Findings of ACL 2021; also accepted at the NLP for Positive Impact workshop@ACL 2021

  16. arXiv:2106.00510  [pdf, other]

    cs.CL cs.AI cs.LG

    CIDER: Commonsense Inference for Dialogue Explanation and Reasoning

    Authors: Deepanway Ghosal, Pengfei Hong, Siqi Shen, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: Commonsense inference to understand and explain human language is a fundamental research problem in natural language processing. Explaining human conversations poses a great challenge as it requires contextual understanding, planning, inference, and several aspects of reasoning including causal, temporal, and commonsense reasoning. In this work, we introduce CIDER -- a manually curated dataset tha…

    Submitted 29 June, 2021; v1 submitted 1 June, 2021; originally announced June 2021.

    Comments: SIGDIAL 2021

  17. arXiv:2105.08146  [pdf, other]

    cs.CL cs.SD eess.AS

    MUSER: MUltimodal Stress Detection using Emotion Recognition as an Auxiliary Task

    Authors: Yiqun Yao, Michalis Papakostas, Mihai Burzo, Mohamed Abouelenien, Rada Mihalcea

    Abstract: The capability to automatically detect human stress can benefit artificial intelligent agents involved in affective computing and human-computer interaction. Stress and emotion are both human affective states, and stress has proven to have important implications on the regulation and expression of emotion. Although a series of methods have been established for multimodal stress detection, limited…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: NAACL 2021 accepted

  18. arXiv:2105.08031  [pdf, other]

    cs.CL

    Room to Grow: Understanding Personal Characteristics Behind Self Improvement Using Social Media

    Authors: MeiXing Dong, Xueming Xu, Yiwei Zhang, Ian Stewart, Rada Mihalcea

    Abstract: Many people aim for change, but not everyone succeeds. While there are a number of social psychology theories that propose motivation-related characteristics of those who persist with change, few computational studies have explored the motivational stage of personal change. In this paper, we investigate a new dataset consisting of the writings of people who manifest intention to change, some of wh…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: 10 pages, Accepted to be published at SocialNLP at NAACL'21

  19. arXiv:2104.12918  [pdf, other]

    cs.CL

    Extractive and Abstractive Explanations for Fact-Checking and Evaluation of News

    Authors: Ashkan Kazemi, Zehua Li, Verónica Pérez-Rosas, Rada Mihalcea

    Abstract: In this paper, we explore the construction of natural language explanations for news claims, with the goal of assisting fact-checking and news evaluation applications. We experiment with two methods: (1) an extractive method based on Biased TextRank -- a resource-effective unsupervised graph-based algorithm for content extraction; and (2) an abstractive method based on the GPT-2 language model. We…

    Submitted 26 April, 2021; originally announced April 2021.

    Comments: Accepted to NLP for Internet Freedom Workshop at NAACL 2021

  20. arXiv:2104.04182  [pdf, other]

    cs.CV

    FIBER: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework

    Authors: Santiago Castro, Ruoyao Wang, Pingxuan Huang, Ian Stewart, Oana Ignat, Nan Liu, Jonathan C. Stroud, Rada Mihalcea

    Abstract: We propose fill-in-the-blanks as a video understanding evaluation framework and introduce FIBER -- a novel dataset consisting of 28,000 videos and descriptions in support of this evaluation framework. The fill-in-the-blanks setting tests a model's understanding of a video by requiring it to predict a masked noun phrase in the caption of the video, given the video and the surrounding text. The FIBE…

    Submitted 22 March, 2022; v1 submitted 9 April, 2021; originally announced April 2021.

    Comments: Accepted at ACL 2022 Main conference. Camera-ready version

  21. arXiv:2102.02917  [pdf, other]

    cs.SD cs.AI cs.CL

    Chord Embeddings: Analyzing What They Capture and Their Role for Next Chord Prediction and Artist Attribute Prediction

    Authors: Allison Lahnala, Gauri Kambhatla, Jiajun Peng, Matthew Whitehead, Gillian Minnehan, Eric Guldan, Jonathan K. Kummerfeld, Anıl Çamcı, Rada Mihalcea

    Abstract: Natural language processing methods have been applied in a variety of music studies, drawing the connection between music and language. In this paper, we expand those approaches by investigating chord embeddings, which we apply in two case studies to address two key questions: (1) what musical information do chord embeddings capture?; and (2) how might musical applications benefit from th…

    Submitted 4 February, 2021; originally announced February 2021.

    Comments: 16 pages, accepted to EvoMUSART

    Journal ref: Computational Intelligence in Music, Sound, Art and Design, 10th International Conference, EvoMUSART 2021

  22. arXiv:2101.10894  [pdf]

    cs.CV cs.CY

    White Paper: Challenges and Considerations for the Creation of a Large Labelled Repository of Online Videos with Questionable Content

    Authors: Thamar Solorio, Mahsa Shafaei, Christos Smailis, Mona Diab, Theodore Giannakopoulos, Heng Ji, Yang Liu, Rada Mihalcea, Smaranda Muresan, Ioannis Kakadiaris

    Abstract: This white paper presents a summary of the discussions regarding critical considerations to develop an extensive repository of online videos annotated with labels indicating questionable content. The main discussion points include: 1) the type of appropriate labels that will result in a valuable repository for the larger AI community; 2) how to design the collection and annotation process, as well…

    Submitted 25 January, 2021; originally announced January 2021.

  23. arXiv:2012.11820  [pdf, other]

    cs.CL

    Recognizing Emotion Cause in Conversations

    Authors: Soujanya Poria, Navonil Majumder, Devamanyu Hazarika, Deepanway Ghosal, Rishabh Bhardwaj, Samson Yu Bai Jian, Pengfei Hong, Romila Ghosh, Abhinaba Roy, Niyati Chhaya, Alexander Gelbukh, Rada Mihalcea

    Abstract: We address the problem of recognizing emotion cause in conversations, define two novel sub-tasks of this problem, and provide a corresponding dialogue-level dataset, along with strong Transformer-based baselines. The dataset is available at https://github.com/declare-lab/RECCON. Introduction: Recognizing the cause behind emotions in text is a fundamental yet under-explored area of research in NL…

    Submitted 28 July, 2021; v1 submitted 21 December, 2020; originally announced December 2020.

    Comments: https://github.com/declare-lab/RECCON, Accepted at Cognitive Computation

  24. arXiv:2012.06236  [pdf, other]

    cs.CL

    Improving Zero Shot Learning Baselines with Commonsense Knowledge

    Authors: Abhinaba Roy, Deepanway Ghosal, Erik Cambria, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: Zero shot learning -- the problem of training and testing on a completely disjoint set of classes -- relies greatly on its ability to transfer knowledge from train classes to test classes. Traditionally semantic embeddings consisting of human defined attributes (HA) or distributed word embeddings (DWE) are used to facilitate this transfer by improving the association between visual and semantic em…

    Submitted 11 December, 2020; originally announced December 2020.

  25. arXiv:2011.06057  [pdf, other]

    cs.CL

    Exploring the Value of Personalized Word Embeddings

    Authors: Charles Welch, Jonathan K. Kummerfeld, Verónica Pérez-Rosas, Rada Mihalcea

    Abstract: In this paper, we introduce personalized word embeddings, and examine their value for language modeling. We compare the performance of our proposed prediction model when using personalized versus generic word representations, and study how these representations can be leveraged for improved performance. We provide insight into what types of words can be more accurately predicted when building pers…

    Submitted 11 November, 2020; originally announced November 2020.

    Comments: COLING 2020

  26. arXiv:2011.01026  [pdf, other]

    cs.CL

    Biased TextRank: Unsupervised Graph-Based Content Extraction

    Authors: Ashkan Kazemi, Verónica Pérez-Rosas, Rada Mihalcea

    Abstract: We introduce Biased TextRank, a graph-based content extraction method inspired by the popular TextRank algorithm that ranks text spans according to their importance for language processing tasks and according to their relevance to an input "focus." Biased TextRank enables focused content extraction for text by modifying the random restarts in the execution of TextRank. The random restart probabili…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: Accepted to COLING 2020

  27. arXiv:2011.00416  [pdf, other]

    cs.CL cs.AI cs.LG

    Deep Learning for Text Style Transfer: A Survey

    Authors: Di Jin, Zhijing Jin, Zhiting Hu, Olga Vechtomova, Rada Mihalcea

    Abstract: Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text, such as politeness, emotion, humor, and many others. It has a long history in the field of natural language processing, and recently has re-gained significant attention thanks to the promising performance brought by deep neural models. In this paper, we present a…

    Submitted 16 December, 2021; v1 submitted 1 November, 2020; originally announced November 2020.

    Comments: Computational Linguistics Journal 2022

  28. arXiv:2010.02986  [pdf, other]

    cs.CL cs.AI cs.LG

    Compositional Demographic Word Embeddings

    Authors: Charles Welch, Jonathan K. Kummerfeld, Verónica Pérez-Rosas, Rada Mihalcea

    Abstract: Word embeddings are usually derived from corpora containing text from many individuals, thus leading to general purpose representations rather than individually personalized representations. While personalized embeddings can be useful to improve language model performance and other language processing tasks, they can only be computed for people with a large amount of longitudinal data, which is no…

    Submitted 29 October, 2020; v1 submitted 6 October, 2020; originally announced October 2020.

    Comments: To appear at EMNLP 2020

  29. arXiv:2010.02795  [pdf, other]

    cs.CL

    COSMIC: COmmonSense knowledge for eMotion Identification in Conversations

    Authors: Deepanway Ghosal, Navonil Majumder, Alexander Gelbukh, Rada Mihalcea, Soujanya Poria

    Abstract: In this paper, we address the task of utterance level emotion recognition in conversations using commonsense knowledge. We propose COSMIC, a new framework that incorporates different elements of commonsense such as mental states, events, and causal relations, and build upon them to learn interactions between interlocutors participating in a conversation. Current state-of-the-art methods often enco…

    Submitted 6 October, 2020; originally announced October 2020.

  30. arXiv:2010.01454  [pdf, other]

    cs.CL

    MIME: MIMicking Emotions for Empathetic Response Generation

    Authors: Navonil Majumder, Pengfei Hong, Shanshan Peng, Jiankun Lu, Deepanway Ghosal, Alexander Gelbukh, Rada Mihalcea, Soujanya Poria

    Abstract: Current approaches to empathetic response generation view the set of emotions expressed in the input text as a flat structure, where all the emotions are treated uniformly. We argue that empathetic responses often mimic the emotion of the user to a varying degree, depending on its positivity or negativity and content. We show that the consideration of these polarity-based emotion clusters and emoti…

    Submitted 3 October, 2020; originally announced October 2020.

    Comments: EMNLP 2020

  31. arXiv:2009.14109  [pdf, other]

    cs.CL

    Improving Low Compute Language Modeling with In-Domain Embedding Initialisation

    Authors: Charles Welch, Rada Mihalcea, Jonathan K. Kummerfeld

    Abstract: Many NLP applications, such as biomedical data and technical support, have 10-100 million tokens of in-domain data and limited computational resources for learning from it. How should we train a language model in this scenario? Most language modeling research considers either a small dataset with a closed vocabulary (like the standard 1 million token Penn Treebank), or the whole web with byte-pair…

    Submitted 30 September, 2020; v1 submitted 29 September, 2020; originally announced September 2020.

    Comments: To appear at EMNLP 2020

    ACM Class: I.2.7

  32. arXiv:2009.13902  [pdf, other]

    cs.CL

    Utterance-level Dialogue Understanding: An Empirical Study

    Authors: Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: The recent abundance of conversational data on the Web and elsewhere calls for effective NLP systems for dialog understanding. Complete utterance-level understanding often requires context understanding, defined by nearby utterances. In recent years, a number of approaches have been proposed for various utterance-level dialogue understanding tasks. Most of these approaches account for the context…

    Submitted 22 October, 2020; v1 submitted 29 September, 2020; originally announced September 2020.

  33. arXiv:2009.04008  [pdf, other]

    cs.CL cs.SI

    Quantifying the Effects of COVID-19 on Mental Health Support Forums

    Authors: Laura Biester, Katie Matton, Janarthanan Rajendran, Emily Mower Provost, Rada Mihalcea

    Abstract: The COVID-19 pandemic, like many of the disease outbreaks that have preceded it, is likely to have a profound effect on mental health. Understanding its impact can inform strategies for mitigating negative consequences. In this work, we seek to better understand the effects of COVID-19 on mental health by examining discussions within mental health support communities on Reddit. First, we quantify…

    Submitted 8 September, 2020; originally announced September 2020.

  34. arXiv:2007.03819  [pdf, other]

    cs.HC cs.CL cs.CY

    Expressive Interviewing: A Conversational System for Coping with COVID-19

    Authors: Charles Welch, Allison Lahnala, Verónica Pérez-Rosas, Siqi Shen, Sarah Seraj, Larry An, Kenneth Resnicow, James Pennebaker, Rada Mihalcea

    Abstract: The ongoing COVID-19 pandemic has raised concerns for many regarding personal and public health implications, financial security and economic stability. Alongside many other unprecedented challenges, there are increasing concerns over social isolation and mental health. We introduce Expressive Interviewing, an interview-style conversational system that draws on ideas from motivational int…

    Submitted 7 July, 2020; originally announced July 2020.

  35. arXiv:2006.00578  [pdf, other]

    cs.CL

    "Judge me by my size (noun), do you?" YodaLib: A Demographic-Aware Humor Generation Framework

    Authors: Aparna Garimella, Carmen Banea, Nabil Hossain, Rada Mihalcea

    Abstract: The subjective nature of humor makes computerized humor generation a challenging task. We propose an automatic humor generation framework for filling the blanks in Mad Libs stories, while accounting for the demographic backgrounds of the desired audience. We collect a dataset consisting of such stories, which are filled in and judged by carefully selected workers on Amazon Mechanical Turk. We buil…

    Submitted 31 May, 2020; originally announced June 2020.

  36. arXiv:2005.00791  [pdf, other]

    cs.CL

    KinGDOM: Knowledge-Guided DOMain adaptation for sentiment analysis

    Authors: Deepanway Ghosal, Devamanyu Hazarika, Abhinaba Roy, Navonil Majumder, Rada Mihalcea, Soujanya Poria

    Abstract: Cross-domain sentiment analysis has received significant attention in recent years, prompted by the need to combat the domain gap between different applications that make use of sentiment analysis. In this paper, we take a novel perspective on this task by exploring the role of external commonsense knowledge. We introduce a new framework, KinGDOM, which utilizes the ConceptNet knowledge graph to e…

    Submitted 11 May, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

  37. arXiv:2005.00357  [pdf, other]

    cs.CL cs.IR

    Beneath the Tip of the Iceberg: Current Challenges and New Directions in Sentiment Analysis Research

    Authors: Soujanya Poria, Devamanyu Hazarika, Navonil Majumder, Rada Mihalcea

    Abstract: Sentiment analysis as a field has come a long way since it was first introduced as a task nearly 20 years ago. It has widespread commercial applications in various domains like marketing, risk management, market research, and politics, to name a few. Given its saturation in specific subtasks -- such as sentiment polarity classification -- and datasets, there is an underlying perception that this f…

    Submitted 16 November, 2020; v1 submitted 1 May, 2020; originally announced May 2020.

    Comments: Published in the IEEE Transactions on Affective Computing (TAFFC)

  38. arXiv:2004.14876  [pdf, other]

    cs.CL

    Analyzing the Surprising Variability in Word Embedding Stability Across Languages

    Authors: Laura Burdick, Jonathan K. Kummerfeld, Rada Mihalcea

    Abstract: Word embeddings are powerful representations that form the foundation of many natural language processing architectures, both in English and in other languages. To gain further insight into word embeddings, we explore their stability (e.g., overlap between the nearest neighbors of a word in different embedding spaces) in diverse languages. We discuss linguistic properties that are related to stabi…

    Submitted 9 September, 2021; v1 submitted 30 April, 2020; originally announced April 2020.

    Comments: Accepted to EMNLP 2021

  39. arXiv:1912.02256  [pdf, other]

    cs.CV

    Compositional Temporal Visual Grounding of Natural Language Event Descriptions

    Authors: Jonathan C. Stroud, Ryan McCaffrey, Rada Mihalcea, Jia Deng, Olga Russakovsky

    Abstract: Temporal grounding entails establishing a correspondence between natural language event descriptions and their visual depictions. Compositional modeling becomes central: we first ground atomic descriptions "girl eating an apple," "batter hitting the ball" to short video segments, and then establish the temporal relationships between the segments. This compositional structure enables models to reco…

    Submitted 4 December, 2019; originally announced December 2019.

    Comments: Project page: jonathancstroud.com/ctg

  40. arXiv:1910.04980  [pdf, other]

    cs.CL

    Conversational Transfer Learning for Emotion Recognition

    Authors: Devamanyu Hazarika, Soujanya Poria, Roger Zimmermann, Rada Mihalcea

    Abstract: Recognizing emotions in conversations is a challenging task due to the presence of contextual dependencies governed by self- and inter-personal influences. Recent approaches have focused on modeling these dependencies primarily via supervised learning. However, purely supervised strategies demand large amounts of annotated data, which is lacking in most of the available corpora in this task. To ta…

    Submitted 19 May, 2020; v1 submitted 11 October, 2019; originally announced October 2019.

    Comments: Information Fusion

  41. arXiv:1909.01543  [pdf, other]

    cs.LG stat.ML

    Towards Automatic Detection of Misinformation in Online Medical Videos

    Authors: Rui Hou, Verónica Pérez-Rosas, Stacy Loeb, Rada Mihalcea

    Abstract: Recent years have witnessed a significant increase in the online sharing of medical information, with videos representing a large fraction of such online sources. Previous studies have however shown that more than half of the health-related videos on platforms such as YouTube contain misleading information and biases. Hence, it is crucial to build computational tools that can help evaluate the qua…

    Submitted 3 September, 2019; originally announced September 2019.

  42. arXiv:1908.06008  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Variational Fusion for Multimodal Sentiment Analysis

    Authors: Navonil Majumder, Soujanya Poria, Gangeshwar Krishnamurthy, Niyati Chhaya, Rada Mihalcea, Alexander Gelbukh

    Abstract: Multimodal fusion is considered a key step in multimodal tasks such as sentiment analysis, emotion detection, question answering, and others. Most of the recent work on multimodal fusion does not guarantee the fidelity of the multimodal representation with respect to the unimodal representations. In this paper, we propose a variational autoencoder-based approach for modality fusion that minimizes…

    Submitted 13 August, 2019; originally announced August 2019.

  43. arXiv:1907.08540  [pdf, other]

    cs.CL

    Predicting Human Activities from User-Generated Content

    Authors: Steven R. Wilson, Rada Mihalcea

    Abstract: The activities we do are linked to our interests, personality, political preferences, and decisions we make about the future. In this paper, we explore the task of predicting human activities from user-generated content. We collect a dataset containing instances of social media users writing about a range of everyday activities. We then use a state-of-the-art sentence embedding framework tailored…

    Submitted 19 July, 2019; originally announced July 2019.

    Comments: ACL 2019

  44. Identifying Visible Actions in Lifestyle Vlogs

    Authors: Oana Ignat, Laura Burdick, Jia Deng, Rada Mihalcea

    Abstract: We consider the task of identifying human actions visible in online videos. We focus on the widely spread genre of lifestyle vlogs, which consist of videos of people performing actions while verbally describing them. Our goal is to identify if actions mentioned in the speech description of a video are visually present. We construct a dataset with crowdsourced manual annotations of visible actions,…

    Submitted 10 June, 2019; originally announced June 2019.

    Comments: Accepted at ACL 2019

  45. arXiv:1906.01815  [pdf, other]

    cs.CL cs.CV

    Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper)

    Authors: Santiago Castro, Devamanyu Hazarika, Verónica Pérez-Rosas, Roger Zimmermann, Rada Mihalcea, Soujanya Poria

    Abstract: Sarcasm is often expressed through several verbal and non-verbal cues, e.g., a change of tone, overemphasis in a word, a drawn-out syllable, or a straight looking face. Most of the recent work in sarcasm detection has been carried out on textual data. In this paper, we argue that incorporating multimodal cues can improve the automatic classification of sarcasm. As a first step towards enabling the…

    Submitted 5 June, 2019; originally announced June 2019.

    Comments: Accepted at ACL 2019

  46. arXiv:1905.02947  [pdf, other]

    cs.CL cs.AI

    Emotion Recognition in Conversation: Research Challenges, Datasets, and Recent Advances

    Authors: Soujanya Poria, Navonil Majumder, Rada Mihalcea, Eduard Hovy

    Abstract: Emotion is intrinsic to humans and consequently emotion understanding is a key part of human-like artificial intelligence (AI). Emotion recognition in conversation (ERC) is becoming increasingly popular as a new research frontier in natural language processing (NLP) due to its ability to mine opinions from the plethora of publicly available conversational data in platforms such as Facebook, Youtub…

    Submitted 8 May, 2019; originally announced May 2019.

  47. arXiv:1904.11610  [pdf, other]

    cs.CL cs.AI

    Look Who's Talking: Inferring Speaker Attributes from Personal Longitudinal Dialog

    Authors: Charles Welch, Verónica Pérez-Rosas, Jonathan K. Kummerfeld, Rada Mihalcea

    Abstract: We examine a large dialog corpus obtained from the conversation history of a single individual with 104 conversation partners. The corpus consists of half a million instant messages, across several messaging platforms. We focus our analyses on seven speaker attributes, each of which partitions the set of speakers, namely: gender; relative age; family member; romantic partner; classmate; co-worker;…

    Submitted 25 April, 2019; originally announced April 2019.

    Comments: 15 pages, accepted to CICLing 2019

    Journal ref: Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing 2019)

  48. arXiv:1903.11672  [pdf, other]

    cs.SD cs.HC cs.LG eess.AS

    MuSE-ing on the Impact of Utterance Ordering On Crowdsourced Emotion Annotations

    Authors: Mimansa Jaiswal, Zakaria Aldeneh, Cristian-Paul Bara, Yuanhang Luo, Mihai Burzo, Rada Mihalcea, Emily Mower Provost

    Abstract: Emotion recognition algorithms rely on data annotated with high quality labels. However, emotion expression and perception are inherently subjective. There is generally not a single annotation that can be unambiguously declared "correct". As a result, annotations are colored by the manner in which they were collected. In this paper, we conduct crowdsourcing experiments to investigate this impact o…

    Submitted 27 March, 2019; originally announced March 2019.

    Comments: 5 pages, ICASSP 2019

  49. arXiv:1811.07497  [pdf, other]

    cs.CL

    A Comparative Analysis of Content-based Geolocation in Blogs and Tweets

    Authors: Konstantinos Pappas, Mahmoud Azab, Rada Mihalcea

    Abstract: The geolocation of online information is an essential component in any geospatial application. While most of the previous work on geolocation has focused on Twitter, in this paper we quantify and compare the performance of text-based geolocation methods on social media data drawn from both Blogger and Twitter. We introduce a novel set of location specific features that are both highly informative…

    Submitted 18 November, 2018; originally announced November 2018.

    Comments: 31 pages, 6 figures, 8 tables

    ACM Class: I.2.7

  50. arXiv:1811.00405  [pdf, other]

    cs.CL

    DialogueRNN: An Attentive RNN for Emotion Detection in Conversations

    Authors: Navonil Majumder, Soujanya Poria, Devamanyu Hazarika, Rada Mihalcea, Alexander Gelbukh, Erik Cambria

    Abstract: Emotion detection in conversations is a necessary step for a number of applications, including opinion mining over chat history, social media threads, debates, argumentation mining, understanding consumer feedback in live conversations, etc. Currently, systems do not treat the parties in the conversation individually by adapting to the speaker of each utterance. In this paper, we describe a new me…

    Submitted 25 May, 2019; v1 submitted 1 November, 2018; originally announced November 2018.

    Comments: AAAI 2019