
Showing 1–21 of 21 results for author: Ziems, C

  1. arXiv:2406.06840

    cs.CL cs.LG

    Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles

    Authors: Julia Kruk, Michela Marchini, Rijul Magu, Caleb Ziems, David Muchlinski, Diyi Yang

    Abstract: A dog whistle is a form of coded communication that carries a secondary meaning to specific audiences and is often weaponized for racial and socioeconomic discrimination. Dog whistling historically originated from United States politics, but in recent years has taken root in social media as a means of evading hate speech detection systems and maintaining plausible deniability. In this paper, we pr…

    Submitted 18 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

    Comments: ACL 2024

    ACM Class: J.4; K.4.1; K.4.2

  2. arXiv:2406.04298

    cs.IR cs.CL

    Measuring and Addressing Indexical Bias in Information Retrieval

    Authors: Caleb Ziems, William Held, Jane Dwivedi-Yu, Diyi Yang

    Abstract: Information Retrieval (IR) systems are designed to deliver relevant content, but traditional systems may not optimize rankings for fairness, neutrality, or the balance of ideas. Consequently, IR can often introduce indexical biases, or biases in the positional order of documents. Although indexical bias can demonstrably affect people's opinion, voting patterns, and other behaviors, these issues re…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: ACL 2024

  3. arXiv:2404.15238

    cs.CL cs.AI

    CultureBank: An Online Community-Driven Knowledge Base Towards Culturally Aware Language Technologies

    Authors: Weiyan Shi, Ryan Li, Yutong Zhang, Caleb Ziems, Chunhua yu, Raya Horesh, Rogério Abreu de Paula, Diyi Yang

    Abstract: To enhance language models' cultural awareness, we design a generalizable pipeline to construct cultural knowledge bases from different online communities on a massive scale. With the pipeline, we construct CultureBank, a knowledge base built upon users' self-narratives with 12K cultural descriptors sourced from TikTok and 11K from Reddit. Unlike previous cultural knowledge resources, CultureBank…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 32 pages, 7 figures, preprint

  4. arXiv:2404.04204

    cs.CL cs.HC

    Social Skill Training with Large Language Models

    Authors: Diyi Yang, Caleb Ziems, William Held, Omar Shaikh, Michael S. Bernstein, John Mitchell

    Abstract: People rely on social skills like conflict resolution to communicate effectively and to thrive in both work and personal life. However, practice environments for social skills are typically out of reach for most people. How can we make social skill training more available, accessible, and inviting? Drawing upon interdisciplinary research from communication and psychology, this perspective paper id…

    Submitted 5 April, 2024; originally announced April 2024.

  5. arXiv:2403.14659

    cs.CY cs.AI cs.CL

    Social Intelligence Data Infrastructure: Structuring the Present and Navigating the Future

    Authors: Minzhi Li, Weiyan Shi, Caleb Ziems, Diyi Yang

    Abstract: As Natural Language Processing (NLP) systems become increasingly integrated into human social life, these technologies will need to increasingly rely on social intelligence. Although there are many valuable datasets that benchmark isolated dimensions of social intelligence, there does not yet exist any body of work to join these threads into a cohesive subfield in which researchers can quickly ide…

    Submitted 27 February, 2024; originally announced March 2024.

  6. arXiv:2310.17887

    cs.CV cs.LG

    Impressions: Understanding Visual Semiotics and Aesthetic Impact

    Authors: Julia Kruk, Caleb Ziems, Diyi Yang

    Abstract: Is aesthetic impact different from beauty? Is visual salience a reflection of its capacity for effective communication? We present Impressions, a novel dataset through which to investigate the semiotics of images, and how specific visual features and design choices can elicit specific emotions, thoughts and beliefs. We posit that the impactfulness of an image extends beyond formal definitions of a…

    Submitted 27 October, 2023; originally announced October 2023.

    Comments: To be published in EMNLP 2023

  7. CoAnnotating: Uncertainty-Guided Work Allocation between Human and Large Language Models for Data Annotation

    Authors: Minzhi Li, Taiwei Shi, Caleb Ziems, Min-Yen Kan, Nancy F. Chen, Zhengyuan Liu, Diyi Yang

    Abstract: Annotated data plays a critical role in Natural Language Processing (NLP) in training models and evaluating their performance. Given recent developments in Large Language Models (LLMs), models such as ChatGPT demonstrate zero-shot capability on many text-annotation tasks, comparable with or even exceeding human annotators. Such LLMs can serve as alternatives for manual annotation, due to lower cos…

    Submitted 24 October, 2023; originally announced October 2023.

  8. arXiv:2306.02475

    cs.CL

    Modeling Cross-Cultural Pragmatic Inference with Codenames Duet

    Authors: Omar Shaikh, Caleb Ziems, William Held, Aryan J. Pariani, Fred Morstatter, Diyi Yang

    Abstract: Pragmatic reference enables efficient interpersonal communication. Prior work uses simple reference games to test models of pragmatic reasoning, often with unidentified speakers and listeners. In practice, however, speakers' sociocultural background shapes their pragmatic assumptions. For example, readers of this paper assume NLP refers to "Natural Language Processing," and not "Neuro-linguistic P…

    Submitted 4 June, 2023; originally announced June 2023.

    Comments: ACL 2023 Findings

  9. arXiv:2305.17008

    cs.CL

    NormBank: A Knowledge Bank of Situational Social Norms

    Authors: Caleb Ziems, Jane Dwivedi-Yu, Yi-Chia Wang, Alon Halevy, Diyi Yang

    Abstract: We present NormBank, a knowledge bank of 155k situational norms. This resource is designed to ground flexible normative reasoning for interactive, assistive, and collaborative AI systems. Unlike prior commonsense resources, NormBank grounds each inference within a multivalent sociocultural frame, which includes the setting (e.g., restaurant), the agents' contingent roles (waiter, customer), their…

    Submitted 24 July, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

  10. TADA: Task-Agnostic Dialect Adapters for English

    Authors: Will Held, Caleb Ziems, Diyi Yang

    Abstract: Large Language Models, the dominant starting point for Natural Language Processing (NLP) applications, fail at a higher rate for speakers of English dialects other than Standard American English (SAE). Prior work addresses this using task-specific data or synthetic data augmentation, both of which require intervention for each dialect and task pair. This poses a scalability issue that prevents the…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 5 Pages; ACL Findings Paper 2023

  11. arXiv:2305.03514

    cs.CL cs.LG

    Can Large Language Models Transform Computational Social Science?

    Authors: Caleb Ziems, William Held, Omar Shaikh, Jiaao Chen, Zhehao Zhang, Diyi Yang

    Abstract: Large Language Models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify and explain social phenomena like persuasiveness and political ideology, then LLMs could augment the Computational Social Science (CSS) pipeline in important ways. This work provides a road map for using LLMs as CSS tools…

    Submitted 26 February, 2024; v1 submitted 12 April, 2023; originally announced May 2023.

    Comments: To appear in "Computational Linguistics" (CL)

  12. arXiv:2212.08011

    cs.CL

    Multi-VALUE: A Framework for Cross-Dialectal English NLP

    Authors: Caleb Ziems, William Held, Jingfeng Yang, Jwala Dhamala, Rahul Gupta, Diyi Yang

    Abstract: Dialect differences caused by regional, social, and economic factors cause performance discrepancies for many groups of language technology users. Inclusive and equitable language technology must critically be dialect invariant, meaning that performance remains constant over dialectal shifts. Current systems often fall short of this ideal since they are designed and tested on a single dialect: Sta…

    Submitted 29 May, 2023; v1 submitted 15 December, 2022; originally announced December 2022.

    Comments: ACL 2023

  13. arXiv:2204.03031

    cs.CL

    VALUE: Understanding Dialect Disparity in NLU

    Authors: Caleb Ziems, Jiaao Chen, Camille Harris, Jessica Anderson, Diyi Yang

    Abstract: English Natural Language Understanding (NLU) systems have achieved great performances and even outperformed humans on benchmarks like GLUE and SuperGLUE. However, these benchmarks contain only textbook Standard American English (SAE). Other dialects have been largely overlooked in the NLP community. This leads to biased and inequitable NLU systems that serve only a sub-population of speakers. To u…

    Submitted 13 September, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

    Comments: ACL 2022 main conference

  14. arXiv:2204.03021

    cs.CL

    The Moral Integrity Corpus: A Benchmark for Ethical Dialogue Systems

    Authors: Caleb Ziems, Jane A. Yu, Yi-Chia Wang, Alon Halevy, Diyi Yang

    Abstract: Conversational agents have come increasingly closer to human competence in open-domain dialogue settings; however, such models can reflect insensitive, hurtful, or entirely incoherent viewpoints that erode a user's trust in the moral integrity of the system. Moral deviations are difficult to mitigate because moral judgments are not universal, and there may be multiple competing judgments that appl…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: ACL 2022 main conference

  15. arXiv:2204.02952

    cs.CL

    Inducing Positive Perspectives with Text Reframing

    Authors: Caleb Ziems, Minzhi Li, Anthony Zhang, Diyi Yang

    Abstract: Sentiment transfer is one popular example of a text style transfer task, where the goal is to reverse the sentiment polarity of a text. With a sentiment reversal comes also a reversal in meaning. We introduce a different but related task called positive reframing in which we neutralize a negative point of view and generate a more positive perspective for the author without contradicting the origin…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: ACL 2022 main conference

  16. arXiv:2109.05325

    cs.CL

    To Protect and To Serve? Analyzing Entity-Centric Framing of Police Violence

    Authors: Caleb Ziems, Diyi Yang

    Abstract: Framing has significant but subtle effects on public opinion and policy. We propose an NLP framework to measure entity-centric frames. We use it to understand media coverage on police violence in the United States in a new Police Violence Frames Corpus of 82k news articles spanning 7k police killings. Our work uncovers more than a dozen framing devices and reveals significant differences in the wa…

    Submitted 11 September, 2021; originally announced September 2021.

    Comments: Findings of EMNLP 2021

  17. arXiv:2109.05322

    cs.CL cs.SI

    Latent Hatred: A Benchmark for Understanding Implicit Hate Speech

    Authors: Mai ElSherief, Caleb Ziems, David Muchlinski, Vaishnavi Anupindi, Jordyn Seybolt, Munmun De Choudhury, Diyi Yang

    Abstract: Hate speech has grown significantly on social media, causing serious consequences for victims of all demographics. Despite much attention being paid to characterize and detect discriminatory speech, most work has focused on explicit or overt hate speech, failing to address a more pervasive form based on coded or indirect language. To fill this gap, this work introduces a theoretically-justified ta…

    Submitted 11 September, 2021; originally announced September 2021.

    Comments: EMNLP 2021 main conference

  18. arXiv:2106.11846

    econ.GN cs.IR

    Quantifying the Impact of Human Capital, Job History, and Language Factors on Job Seniority with a Large-scale Analysis of Resumes

    Authors: Austin P Wright, Caleb Ziems, Haekyu Park, Jon Saad-Falcon, Duen Horng Chau, Diyi Yang, Maria Tomprou

    Abstract: As job markets worldwide have become more competitive and applicant selection criteria have become more opaque, and different (and sometimes contradictory) information and advice is available for job seekers wishing to progress in their careers, it has never been more difficult to determine which factors in a résumé most effectively help career progression. In this work we present a novel, large s…

    Submitted 15 June, 2021; originally announced June 2021.

    Comments: 9 Pages, 5 Figures, 8 Tables

  19. arXiv:2005.12423

    cs.SI cs.CL cs.CY cs.IR physics.soc-ph

    Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis

    Authors: Bing He, Caleb Ziems, Sandeep Soni, Naren Ramakrishnan, Diyi Yang, Srijan Kumar

    Abstract: The spread of COVID-19 has sparked racism and hate on social media targeted towards Asian communities. However, little is known about how racial hate spreads during a pandemic and the role of counterspeech in mitigating this spread. In this work, we study the evolution and spread of anti-Asian hate speech through the lens of Twitter. We create COVID-HATE, the largest dataset of anti-Asian hate and…

    Submitted 10 November, 2021; v1 submitted 25 May, 2020; originally announced May 2020.

    Comments: ASONAM 2021. The COVID-HATE dataset, annotations, and code are at http://claws.cc.gatech.edu/covid

  20. arXiv:2004.01820

    cs.SI cs.CL

    Aggressive, Repetitive, Intentional, Visible, and Imbalanced: Refining Representations for Cyberbullying Classification

    Authors: Caleb Ziems, Ymir Vigfusson, Fred Morstatter

    Abstract: Cyberbullying is a pervasive problem in online communities. To identify cyberbullying cases in large-scale social networks, content moderators depend on machine learning classifiers for automatic cyberbullying detection. However, existing models remain unfit for real-world applications, largely due to a shortage of publicly available training data and a lack of standard criteria for assigning grou…

    Submitted 3 April, 2020; originally announced April 2020.

    Comments: 12 pages, 5 figures, 22 tables, Accepted to the 14th International AAAI Conference on Web and Social Media, ICWSM'20

  21. arXiv:1911.00776

    cs.LG stat.ML

    Ten-year Survival Prediction for Breast Cancer Patients

    Authors: Changmao Li, Han He, Yunze Hao, Caleb Ziems

    Abstract: This report assesses different machine learning approaches to 10-year survival prediction of breast cancer patients.

    Submitted 2 November, 2019; originally announced November 2019.