Showing 1–29 of 29 results for author: Hemphill, L

  1. arXiv:2408.11962

    cs.SI cs.CL

    Characterizing Online Toxicity During the 2022 Mpox Outbreak: A Computational Analysis of Topical and Network Dynamics

    Authors: Lizhou Fan, Lingyao Li, Libby Hemphill

    Abstract: Background: Online toxicity, encompassing behaviors such as harassment, bullying, hate speech, and the dissemination of misinformation, has become a pressing social concern in the digital age. The 2022 Mpox outbreak, initially termed "Monkeypox" but subsequently renamed to mitigate associated stigmas and societal concerns, serves as a poignant backdrop to this issue. Objective: In this research, w…

    Submitted 21 August, 2024; originally announced August 2024.

    Comments: 36 pages, 8 figures, and 12 tables

  2. arXiv:2408.04163

    cs.SI

    Academic collaboration on large language model studies increases overall but varies across disciplines

    Authors: Lingyao Li, Ly Dinh, Songhua Hu, Libby Hemphill

    Abstract: Interdisciplinary collaboration is crucial for addressing complex scientific challenges. Recent advancements in large language models (LLMs) have shown significant potential in benefiting researchers across various fields. To explore the application of LLMs in scientific disciplines and their implications for interdisciplinary collaboration, we collect and analyze 50,391 papers from OpenAlex, an o…

    Submitted 7 August, 2024; originally announced August 2024.

  3. arXiv:2407.05104

    cs.CY

    Crowdsourced reviews reveal substantial disparities in public perceptions of parking

    Authors: Lingyao Li, Songhua Hu, Ly Dinh, Libby Hemphill

    Abstract: Due to increased reliance on private vehicles and growing travel demand, parking remains a longstanding urban challenge globally. Quantifying parking perceptions is paramount as it enables decision-makers to identify problematic areas and make informed decisions on parking management. This study introduces a cost-effective and widely accessible data source, crowdsourced online reviews, to investig…

    Submitted 6 July, 2024; originally announced July 2024.

  4. arXiv:2406.11980

    cs.AI cs.CY

    Prompt Design Matters for Computational Social Science Tasks but in Unpredictable Ways

    Authors: Shubham Atreja, Joshua Ashkinaze, Lingyao Li, Julia Mendelsohn, Libby Hemphill

    Abstract: Manually annotating data for computational social science tasks can be costly, time-consuming, and emotionally draining. While recent work suggests that LLMs can perform such annotation tasks in zero-shot settings, little is known about how prompt design impacts LLMs' compliance and accuracy. We conduct a large-scale multi-prompt experiment to test how model selection (ChatGPT, PaLM2, and Falcon7b…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Under review

  5. arXiv:2405.03066

    cs.ET

    A scoping review of using Large Language Models (LLMs) to investigate Electronic Health Records (EHRs)

    Authors: Lingyao Li, Jiayan Zhou, Zhenxiang Gao, Wenyue Hua, Lizhou Fan, Huizi Yu, Loni Hagen, Yongfeng Zhang, Themistocles L. Assimes, Libby Hemphill, Siyuan Ma

    Abstract: Electronic Health Records (EHRs) play an important role in the healthcare system. However, their complexity and vast volume pose significant challenges to data interpretation and analysis. Recent advancements in Artificial Intelligence (AI), particularly the development of Large Language Models (LLMs), open up new opportunities for researchers in this domain. Although prior studies have demonstrat…

    Submitted 22 May, 2024; v1 submitted 5 May, 2024; originally announced May 2024.

  6. arXiv:2404.13156

    cs.SI

    Crowdsourcing public attitudes toward local services through the lens of Google Maps reviews: An urban density-based perspective

    Authors: Lingyao Li, Songhua Hu, Atiyya Shaw, Libby Hemphill

    Abstract: Understanding how urban density impacts public perceptions of urban service is important for informing livable, accessible, and equitable urban planning. Conventional methods such as surveys are limited by their sampling scope, time efficiency, and expense. On the other hand, crowdsourcing through online platforms presents an opportunity for decision-makers to tap into a user-generated source of i…

    Submitted 19 April, 2024; originally announced April 2024.

  7. arXiv:2401.08899

    cs.CY

    Landscape of Generative AI in Global News: Topics, Sentiments, and Spatiotemporal Analysis

    Authors: Lu Xian, Lingyao Li, Yiwei Xu, Ben Zefeng Zhang, Libby Hemphill

    Abstract: Generative AI has exhibited considerable potential to transform various industries and public life. The role of news media coverage of generative AI is pivotal in shaping public perceptions and judgments about this significant technological innovation. This paper provides in-depth analysis and rich insights into the temporal and spatial distribution of topics, sentiment, and substantive themes wit…

    Submitted 16 January, 2024; originally announced January 2024.

  8. arXiv:2311.17227

    cs.AI cs.CL cs.CY

    War and Peace (WarAgent): Large Language Model-based Multi-Agent Simulation of World Wars

    Authors: Wenyue Hua, Lizhou Fan, Lingyao Li, Kai Mei, Jianchao Ji, Yingqiang Ge, Libby Hemphill, Yongfeng Zhang

    Abstract: Can we avoid wars at the crossroads of history? This question has been pursued by individuals, scholars, policymakers, and organizations throughout human history. In this research, we attempt to answer the question based on the recent advances of Artificial Intelligence (AI) and Large Language Models (LLMs). We propose WarAgent, an LLM-powered multi-agent AI system, to simulate the partic…

    Submitted 30 January, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Comments: 47 pages, 9 figures, 5 tables

  9. arXiv:2309.15827

    cs.CL cs.CY

    How We Define Harm Impacts Data Annotations: Explaining How Annotators Distinguish Hateful, Offensive, and Toxic Comments

    Authors: Angela Schöpke-Gonzalez, Siqi Wu, Sagar Kumar, Paul J. Resnick, Libby Hemphill

    Abstract: Computational social science research has made advances in machine learning and natural language processing that support content moderators in detecting harmful content. These advances often rely on training datasets annotated by crowdworkers for harmful content. In designing instructions for annotation tasks to generate training data for these algorithms, researchers often treat the harm concepts…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: 29 pages, 1 figure, 9 tables

    ACM Class: K.4.1

  10. arXiv:2308.05281

    cs.SI cs.CL cs.IR cs.LG

    Investigating disaster response through social media data and the Susceptible-Infected-Recovered (SIR) model: A case study of 2020 Western U.S. wildfire season

    Authors: Zihui Ma, Lingyao Li, Libby Hemphill, Gregory B. Baecher, Yubai Yuan

    Abstract: Effective disaster response is critical for affected communities. Responders and decision-makers would benefit from reliable, timely measures of the issues impacting their communities during a disaster, and social media offers a potentially rich data source. Social media can reflect public concerns and demands during a disaster, offering valuable insights for decision-makers to understand evolving…

    Submitted 9 January, 2024; v1 submitted 9 August, 2023; originally announced August 2023.

  11. arXiv:2305.18358

    cs.IR cs.HC

    DataChat: Prototyping a Conversational Agent for Dataset Search and Visualization

    Authors: Lizhou Fan, Sara Lafia, Lingyao Li, Fangyuan Yang, Libby Hemphill

    Abstract: Data users need relevant context and research expertise to effectively search for and identify relevant datasets. Leading data providers, such as the Inter-university Consortium for Political and Social Research (ICPSR), offer standardized metadata and search tools to support data search. Metadata standards emphasize the machine-readability of data and its documentation. There are opportunities to…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 6 pages, 2 figures, and 1 table. Accepted to the 86th Annual Meeting of the Association for Information Science & Technology

  12. arXiv:2305.02201

    cs.CY

    ChatGPT in education: A discourse analysis of worries and concerns on social media

    Authors: Lingyao Li, Zihui Ma, Lizhou Fan, Sanggyu Lee, Huizi Yu, Libby Hemphill

    Abstract: The rapid advancements in generative AI models present new opportunities in the education sector. However, it is imperative to acknowledge and address the potential risks and concerns that may arise with their use. We analyzed Twitter data to identify key concerns related to the use of ChatGPT in education. We employed BERT-based topic modeling to conduct a discourse analysis and social network an…

    Submitted 29 April, 2023; originally announced May 2023.

  13. arXiv:2304.10619

    cs.CL cs.AI cs.HC

    "HOT" ChatGPT: The promise of ChatGPT in detecting and discriminating hateful, offensive, and toxic comments on social media

    Authors: Lingyao Li, Lizhou Fan, Shubham Atreja, Libby Hemphill

    Abstract: Harmful content is pervasive on social media, poisoning online communities and negatively impacting participation. A common approach to address this issue is to develop detection models that rely on human annotations. However, the tasks required to build such models expose annotators to harmful and offensive content and may require significant time and cost to complete. Generative AI models have t…

    Submitted 20 April, 2023; originally announced April 2023.

  14. arXiv:2304.02020

    cs.DL cs.CL cs.CY cs.SI

    A Bibliometric Review of Large Language Models Research from 2017 to 2023

    Authors: Lizhou Fan, Lingyao Li, Zihui Ma, Sanggyu Lee, Huizi Yu, Libby Hemphill

    Abstract: Large language models (LLMs) are a class of language models that have demonstrated outstanding performance across a range of natural language processing (NLP) tasks and have become a highly sought-after research area, because of their ability to generate human-like language and their potential to revolutionize science and technology. In this study, we conduct bibliometric and discourse analyses of…

    Submitted 3 April, 2023; originally announced April 2023.

    Comments: 36 pages, 9 figures, and 4 tables

  15. How and Why do Researchers Reference Data? A Study of Rhetorical Features and Functions of Data References in Academic Articles

    Authors: Sara Lafia, Andrea Thomer, Elizabeth Moss, David Bleckley, Libby Hemphill

    Abstract: Data reuse is a common practice in the social sciences. While published data play an essential role in the production of social science research, they are not consistently cited, which makes it difficult to assess their full scholarly impact and give credit to the original data producers. Furthermore, it can be challenging to understand researchers' motivations for referencing data. Like reference…

    Submitted 16 February, 2023; originally announced February 2023.

    Comments: 35 pages, 2 appendices, 1 table

  16. arXiv:2301.07163

    cs.CY cs.HC

    AppealMod: Inducing Friction to Reduce Moderator Workload of Handling User Appeals

    Authors: Shubham Atreja, Jane Im, Paul Resnick, Libby Hemphill

    Abstract: As content moderation becomes a central aspect of all social media platforms and online communities, interest has grown in how to make moderation decisions contestable. On social media platforms where individual communities moderate their own activities, the responsibility to address user appeals falls on volunteers from within the community. While there is a growing body of work devoted to unders…

    Submitted 9 January, 2024; v1 submitted 17 January, 2023; originally announced January 2023.

    Comments: Accepted at CSCW'24

  17. arXiv:2205.11651

    cs.DL cs.CL cs.LG

    A Natural Language Processing Pipeline for Detecting Informal Data References in Academic Literature

    Authors: Sara Lafia, Lizhou Fan, Libby Hemphill

    Abstract: Discovering authoritative links between publications and the datasets that they use can be a labor-intensive process. We introduce a natural language processing pipeline that retrieves and reviews publications for informal references to research datasets, which complements the work of data librarians. We first describe the components of the pipeline and then apply it to expand an authoritative bib…

    Submitted 23 May, 2022; originally announced May 2022.

    Comments: 13 pages, 7 figures, 3 tables

  18. Subdivisions and Crossroads: Identifying Hidden Community Structures in a Data Archive's Citation Network

    Authors: Sara Lafia, Lizhou Fan, Andrea Thomer, Libby Hemphill

    Abstract: Data archives are an important source of high quality data in many fields, making them ideal sites to study data reuse. By studying data reuse through citation networks, we are able to learn how hidden research communities - those that use the same scientific datasets - are organized. This paper analyzes the community structure of an authoritative network of datasets cited in academic publications…

    Submitted 17 May, 2022; originally announced May 2022.

    Comments: 30 pages, 7 tables, 4 figures

  19. arXiv:2204.01790

    cs.SI cs.IR

    Leaders or Followers? A Temporal Analysis of Tweets from IRA Trolls

    Authors: Siva K. Balasubramanian, Mustafa Bilgic, Aron Culotta, Libby Hemphill, Anita Nikolich, Matthew A. Shapiro

    Abstract: The Internet Research Agency (IRA) influences online political conversations in the United States, exacerbating existing partisan divides and sowing discord. In this paper we investigate the IRA's communication strategies by analyzing trending terms on Twitter to identify cases in which the IRA leads or follows other users. Our analysis focuses on over 38M tweets posted between 2016 and 2017 from…

    Submitted 4 April, 2022; originally announced April 2022.

    Comments: ICWSM 2022

  20. arXiv:2203.05112

    cs.DL cs.CL cs.HC cs.LG

    Librarian-in-the-Loop: A Natural Language Processing Paradigm for Detecting Informal Mentions of Research Data in Academic Literature

    Authors: Lizhou Fan, Sara Lafia, David Bleckley, Elizabeth Moss, Andrea Thomer, Libby Hemphill

    Abstract: Data citations provide a foundation for studying research data impact. Collecting and managing data citations is a new frontier in archival science and scholarly communication. However, the discovery and curation of research data citations is labor intensive. Data citations that reference unique identifiers (i.e. DOIs) are readily findable; however, informal mentions made to research data are more…

    Submitted 9 March, 2022; originally announced March 2022.

  21. arXiv:2202.04560

    cs.DL cs.DB cs.HC

    The craft and coordination of data curation: complicating "workflow" views of data science

    Authors: Andrea K. Thomer, Dharma Akmon, Jeremy York, Allison R. B. Tyler, Faye Polasek, Sara Lafia, Libby Hemphill, Elizabeth Yakel

    Abstract: Data curation is the process of making a dataset fit-for-use and archiveable. It is critical to data-intensive science because it makes complex data pipelines possible, makes studies reproducible, and makes data (re)usable. Yet the complexities of the hands-on, technical and intellectual work of data curation are frequently overlooked or downplayed. Obscuring the work of data curation not only rend…

    Submitted 9 February, 2022; originally announced February 2022.

    Comments: Submitted to CSCW 2022 (Feb revision cycle)

  22. arXiv:2202.00799

    cs.CY cs.HC

    Remove, Reduce, Inform: What Actions do People Want Social Media Platforms to Take on Potentially Misleading Content?

    Authors: Shubham Atreja, Libby Hemphill, Paul Resnick

    Abstract: To reduce the spread of misinformation, social media platforms may take enforcement actions against offending content, such as adding informational warning labels, reducing distribution, or removing content entirely. However, both their actions and their inactions have been controversial and plagued by allegations of partisan bias. When it comes to specific content items, surprisingly little is kn…

    Submitted 12 September, 2023; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: Accepted at CSCW 2023

  23. Leveraging Machine Learning to Detect Data Curation Activities

    Authors: Sara Lafia, Andrea Thomer, David Bleckley, Dharma Akmon, Libby Hemphill

    Abstract: This paper describes a machine learning approach for annotating and analyzing data curation work logs at ICPSR, a large social sciences data archive. The systems we studied track curation work and coordinate team decision-making at ICPSR. Repository staff use these systems to organize, prioritize, and document curation work done on datasets, making them promising resources for studying curation wo…

    Submitted 30 April, 2021; originally announced May 2021.

    Comments: 10 pages, 4 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  24. arXiv:1909.08189

    cs.SI cs.LG stat.ML

    Two Computational Models for Analyzing Political Attention in Social Media

    Authors: Libby Hemphill, Angela M. Schöpke-Gonzalez

    Abstract: Understanding how political attention is divided and over what subjects is crucial for research on areas such as agenda setting, framing, and political rhetoric. Existing methods for measuring attention, such as manual labeling according to established codebooks, are expensive and can be restrictive. We describe two computational models that automatically distinguish topics in politicians' social…

    Submitted 17 September, 2019; originally announced September 2019.

    Comments: Accepted for publication in the International AAAI Conference on Web and Social Media (ICWSM 2020)

  25. arXiv:1906.01738

    cs.SI cs.CL cs.CY

    A Just and Comprehensive Strategy for Using NLP to Address Online Abuse

    Authors: David Jurgens, Eshwar Chandrasekharan, Libby Hemphill

    Abstract: Online abusive behavior affects millions and the NLP community has attempted to mitigate this problem by developing technologies to detect abuse. However, current methods have largely focused on a narrow definition of abuse to the detriment of victims who seek both validation and solutions. In this position paper, we argue that the community needs to make three substantive changes: (1) expanding our s…

    Submitted 6 June, 2019; v1 submitted 4 June, 2019; originally announced June 2019.

    Comments: 9 pages; Accepted to be published at ACL 2019

  26. arXiv:1901.11162

    cs.SI cs.CY

    Still out there: Modeling and Identifying Russian Troll Accounts on Twitter

    Authors: Jane Im, Eshwar Chandrasekharan, Jackson Sargent, Paige Lighthammer, Taylor Denby, Ankit Bhargava, Libby Hemphill, David Jurgens, Eric Gilbert

    Abstract: There is evidence that Russia's Internet Research Agency attempted to interfere with the 2016 U.S. election by running fake accounts on Twitter - often referred to as "Russian trolls". In this work, we: 1) develop machine learning models that predict whether a Twitter account is a Russian troll within a set of 170K control accounts; and, 2) demonstrate that it is possible to use this model to find…

    Submitted 30 January, 2019; originally announced January 2019.

  27. arXiv:1808.09488

    cs.CY cs.HC

    Crafting Moral Infrastructures: How Nonprofits Use Facebook to Survive

    Authors: Libby Hemphill, A. J. Million, Ingrid Erickson

    Abstract: We present findings from interviews with 23 individuals affiliated with non-profit organizations (NPOs) to understand how they deploy information and communication technologies (ICTs) in civic engagement efforts. Existing research about NPO ICT use is largely critical, but we did not find evidence that NPOs fail to use tools effectively. Rather, we detail how various ICT use on the part of NPOs in…

    Submitted 28 August, 2018; originally announced August 2018.

  28. arXiv:1804.06759

    cs.CL cs.SI

    Forecasting the presence and intensity of hostility on Instagram using linguistic and social features

    Authors: Ping Liu, Joshua Guberman, Libby Hemphill, Aron Culotta

    Abstract: Online antisocial behavior, such as cyberbullying, harassment, and trolling, is a widespread problem that threatens free discussion and has negative physical and mental health consequences for victims and communities. While prior work has proposed automated methods to identify hostile comments in online discussions, these methods work retrospectively on comments that have already been posted, maki…

    Submitted 18 April, 2018; originally announced April 2018.

    Comments: ICWSM'18

  29. arXiv:1802.08612

    cs.SI cs.CY cs.HC

    More Specificity, More Attention to Social Context: Reframing How We Address "Bad Actors"

    Authors: Libby Hemphill

    Abstract: To address "bad actors" online, I argue for more specific definitions of acceptable and unacceptable behaviors and explicit attention to the social structures in which behaviors occur.

    Submitted 23 February, 2018; originally announced February 2018.

    Comments: Workshop paper submitted to CHI 2018: Understanding "Bad Actors" Online

    ACM Class: H.5.m