Showing 1–22 of 22 results for author: Preum, S

  1. arXiv:2406.14695  [pdf, other]

    cs.CL cs.AI

    Depth $F_1$: Improving Evaluation of Cross-Domain Text Classification by Measuring Semantic Generalizability

    Authors: Parker Seegmiller, Joseph Gatto, Sarah Masud Preum

    Abstract: Recent evaluations of cross-domain text classification models aim to measure the ability of a model to obtain domain-invariant performance in a target domain given labeled samples in a source domain. The primary strategy for this evaluation relies on assumed differences between source domain samples and target domain samples in benchmark datasets. This evaluation strategy fails to account for the…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2404.01147  [pdf, other]

    cs.CL cs.LG

    Do LLMs Find Human Answers To Fact-Driven Questions Perplexing? A Case Study on Reddit

    Authors: Parker Seegmiller, Joseph Gatto, Omar Sharif, Madhusudan Basak, Sarah Masud Preum

    Abstract: Large language models (LLMs) have been shown to be proficient in correctly answering questions in the context of online discourse. However, the study of using LLMs to model human-like answers to fact-driven social media questions is still under-explored. In this work, we investigate how LLMs model the wide variety of human answers to fact-driven questions posed on several topic-specific Reddit com…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 4 pages, 2 figures

  3. arXiv:2403.10829  [pdf, other]

    cs.CL

    Deciphering Hate: Identifying Hateful Memes and Their Targets

    Authors: Eftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque, Sarah M. Preum

    Abstract: Internet memes have become a powerful means for individuals to express emotions, thoughts, and perspectives on social media. While often considered as a source of humor and entertainment, memes can also disseminate hateful content targeting individuals or communities. Most existing research focuses on the negative aspects of memes in high-resource languages, overlooking the distinctive challenges…

    Submitted 16 March, 2024; originally announced March 2024.

  4. arXiv:2403.03336  [pdf, other]

    cs.CL cs.SI

    Scope of Large Language Models for Mining Emerging Opinions in Online Health Discourse

    Authors: Joseph Gatto, Madhusudan Basak, Yash Srivastava, Philip Bohlman, Sarah M. Preum

    Abstract: In this paper, we develop an LLM-powered framework for the curation and evaluation of emerging opinion mining in online health communities. We formulate emerging opinion mining as a pairwise stance detection problem between (title, comment) pairs sourced from Reddit, where post titles contain emerging health-related claims on a topic that is not predefined. The claims are either explicitly or impl…

    Submitted 5 March, 2024; originally announced March 2024.

  5. arXiv:2403.03304  [pdf, other]

    cs.CL cs.LG

    Large Language Models for Document-Level Event-Argument Data Augmentation for Challenging Role Types

    Authors: Joseph Gatto, Parker Seegmiller, Omar Sharif, Sarah M. Preum

    Abstract: Event Argument Extraction (EAE) is an extremely difficult information extraction problem -- with significant limitations in few-shot cross-domain (FSCD) settings. A common solution to FSCD modeling is data augmentation. Unfortunately, existing augmentation methods are not well-suited to a variety of real-world EAE contexts including (i) The need to model long documents (10+ sentences) (ii) The nee…

    Submitted 12 June, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Paper in submission (8 pages)

  6. Sketching AI Concepts with Capabilities and Examples: AI Innovation in the Intensive Care Unit

    Authors: Nur Yildirim, Susanna Zlotnikov, Deniz Sayar, Jeremy M. Kahn, Leigh A. Bukowski, Sher Shah Amin, Kathryn A. Riman, Billie S. Davis, John S. Minturn, Andrew J. King, Dan Ricketts, Lu Tang, Venkatesh Sivaraman, Adam Perer, Sarah M. Preum, James McCann, John Zimmerman

    Abstract: Advances in artificial intelligence (AI) have enabled unprecedented capabilities, yet innovation teams struggle when envisioning AI concepts. Data science teams think of innovations users do not want, while domain experts think of innovations that cannot be built. A lack of effective ideation seems to be a breakdown point. How might multidisciplinary teams identify buildable and desirable use case…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: to appear at CHI 2024

  7. arXiv:2402.09738  [pdf, other]

    cs.CL

    Align before Attend: Aligning Visual and Textual Features for Multimodal Hateful Content Detection

    Authors: Eftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque, Sarah M. Preum

    Abstract: Multimodal hateful content detection is a challenging task that requires complex reasoning across visual and textual modalities. Therefore, creating a meaningful multimodal representation that effectively captures the interplay between visual and textual features through intermediate fusion is critical. Conventional fusion techniques are unable to attend to the modality-specific features effective…

    Submitted 15 February, 2024; originally announced February 2024.

    Comments: Accepted to EACL-SRW, 2024

  8. arXiv:2310.19750  [pdf, other]

    cs.CL

    Chain-of-Thought Embeddings for Stance Detection on Social Media

    Authors: Joseph Gatto, Omar Sharif, Sarah Masud Preum

    Abstract: Stance detection on social media is challenging for Large Language Models (LLMs), as emerging slang and colloquial language in online conversations often contain deeply implicit stance labels. Chain-of-Thought (COT) prompting has recently been shown to improve performance on stance detection tasks -- alleviating some of these issues. However, COT prompting still struggles with implicit stance iden…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP-2023, 8 pages

  9. arXiv:2310.15010  [pdf, other]

    cs.CL

    Statistical Depth for Ranking and Characterizing Transformer-Based Text Embeddings

    Authors: Parker Seegmiller, Sarah Masud Preum

    Abstract: The popularity of transformer-based text embeddings calls for better statistical tools for measuring distributions of such embeddings. One such tool would be a method for ranking texts within a corpus by centrality, i.e. assigning each text a number signifying how representative that text is of the corpus as a whole. However, an intrinsic center-outward ordering of high-dimensional text representa…

    Submitted 23 October, 2023; originally announced October 2023.

  10. arXiv:2309.09877  [pdf, other]

    cs.CL

    Not Enough Labeled Data? Just Add Semantics: A Data-Efficient Method for Inferring Online Health Texts

    Authors: Joseph Gatto, Sarah M. Preum

    Abstract: User-generated texts available on the web and social platforms are often long and semantically challenging, making them difficult to annotate. Obtaining human annotation becomes increasingly difficult as problem domains become more specialized. For example, many health NLP problems require domain experts to be a part of the annotation pipeline. Thus, it is crucial that we develop low-resource NLP…

    Submitted 18 September, 2023; originally announced September 2023.

  11. arXiv:2309.06541  [pdf, other]

    cs.CL

    Text Encoders Lack Knowledge: Leveraging Generative LLMs for Domain-Specific Semantic Textual Similarity

    Authors: Joseph Gatto, Omar Sharif, Parker Seegmiller, Philip Bohlman, Sarah Masud Preum

    Abstract: Amidst the sharp rise in the evaluation of large language models (LLMs) on various tasks, we find that semantic textual similarity (STS) has been under-explored. In this study, we show that STS can be cast as a text generation problem while maintaining strong performance on multiple STS benchmarks. Additionally, we show generative LLMs significantly outperform existing encoder-based STS models whe…

    Submitted 12 September, 2023; originally announced September 2023.

    Comments: Under review GEM@EMNLP-2023, 12 pages

  12. arXiv:2308.09156  [pdf, other]

    cs.CL

    Characterizing Information Seeking Events in Health-Related Social Discourse

    Authors: Omar Sharif, Madhusudan Basak, Tanzia Parvin, Ava Scharfstein, Alphonso Bradham, Jacob T. Borodovsky, Sarah E. Lord, Sarah M. Preum

    Abstract: Social media sites have become a popular platform for individuals to seek and share health information. Despite the progress in natural language processing for social media mining, a gap remains in analyzing health-related texts on social discourse in the context of events. Event-driven analysis can offer insights into different facets of healthcare at an individual and collective level, including…

    Submitted 19 December, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: Accepted at AAAI-2024. 9 pages, 6 tables, 2 figures

  13. arXiv:2303.09366  [pdf, other]

    cs.CL cs.LG

    The Scope of In-Context Learning for the Extraction of Medical Temporal Constraints

    Authors: Parker Seegmiller, Joseph Gatto, Madhusudan Basak, Diane Cook, Hassan Ghasemzadeh, John Stankovic, Sarah Preum

    Abstract: Medications often impose temporal constraints on everyday patient activity. Violations of such medical temporal constraints (MTCs) lead to a lack of treatment adherence, in addition to poor health outcomes and increased healthcare expenses. These MTCs are found in drug usage guidelines (DUGs) in both patient education materials and clinical texts. Computationally representing MTCs in DUGs will adv…

    Submitted 16 October, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

  14. arXiv:2302.09665  [pdf, other]

    cs.AI

    CitySpec with Shield: A Secure Intelligent Assistant for Requirement Formalization

    Authors: Zirong Chen, Issa Li, Haoxiang Zhang, Sarah Preum, John A. Stankovic, Meiyi Ma

    Abstract: An increasing number of monitoring systems have been developed in smart cities to ensure that the real-time operations of a city satisfy safety and performance requirements. However, many existing city requirements are written in English with missing, inaccurate, or ambiguous information. There is a high demand for assisting city policymakers in converting human-specified requirements to machine-u…

    Submitted 30 March, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2206.03132

  15. arXiv:2301.11508  [pdf, other]

    cs.CL

    Theme-driven Keyphrase Extraction to Analyze Social Media Discourse

    Authors: William Romano, Omar Sharif, Madhusudan Basak, Joseph Gatto, Sarah Preum

    Abstract: Social media platforms are vital resources for sharing self-reported health experiences, offering rich data on various health topics. Despite advancements in Natural Language Processing (NLP) enabling large-scale social media data analysis, a gap remains in applying keyphrase extraction to health-related content. Keyphrase extraction is used to identify salient concepts in social media discourse w…

    Submitted 28 May, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

    Comments: 11 pages, 2 figures, submitted to ICWSM. This version represents a substantial expansion and refocus of the previous manuscript, including new experiments, expanded data analysis, and comprehensive discussions

  16. arXiv:2301.07051  [pdf, other]

    cs.LG

    ActSafe: Predicting Violations of Medical Temporal Constraints for Medication Adherence

    Authors: Parker Seegmiller, Joseph Gatto, Abdullah Mamun, Hassan Ghasemzadeh, Diane Cook, John Stankovic, Sarah Masud Preum

    Abstract: Prescription medications often impose temporal constraints on regular health behaviors (RHBs) of patients, e.g., eating before taking medication. Violations of such medical temporal constraints (MTCs) can result in adverse effects. Detecting and predicting such violations before they occur can help alert the patient. We formulate the problem of modeling MTCs and develop a proof-of-concept solution…

    Submitted 17 January, 2023; originally announced January 2023.

  17. arXiv:2210.03246  [pdf, other]

    cs.CL

    HealthE: Classifying Entities in Online Textual Health Advice

    Authors: Joseph Gatto, Parker Seegmiller, Garrett Johnston, Sarah M. Preum

    Abstract: The processing of entities in natural language is essential to many medical NLP systems. Unfortunately, existing datasets vastly under-represent the entities required to model public health relevant texts such as health advice often found on sites like WebMD. People rely on such information for personal health management and clinically relevant decision making. In this work, we release a new annot…

    Submitted 6 October, 2022; originally announced October 2022.

  18. arXiv:2209.11102  [pdf, other]

    cs.CL

    Scope of Pre-trained Language Models for Detecting Conflicting Health Information

    Authors: Joseph Gatto, Madhusudan Basak, Sarah M. Preum

    Abstract: An increasing number of people now rely on online platforms to meet their health information needs. Thus identifying inconsistent or conflicting textual health information has become a safety-critical task. Health advice data poses a unique challenge where information that is accurate in the context of one diagnosis can be conflicting in the context of another. For example, people suffering from d…

    Submitted 22 September, 2022; originally announced September 2022.

  19. arXiv:2206.07152  [pdf, other]

    cs.AI cs.FL cs.LG

    An Intelligent Assistant for Converting City Requirements to Formal Specification

    Authors: Zirong Chen, Isaac Li, Haoxiang Zhang, Sarah Preum, John Stankovic, Meiyi Ma

    Abstract: As more and more monitoring systems have been deployed to smart cities, there comes a higher demand for converting new human-specified requirements to machine-understandable formal specifications automatically. However, these human-specific requirements are often written in English and bring missing, inaccurate, or ambiguous information. In this paper, we present CitySpec, an intelligent assistant…

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: This demo paper is accepted by SMARTCOMP 2022

  20. arXiv:2206.03132  [pdf, other]

    cs.AI cs.CL cs.LG cs.SE

    CitySpec: An Intelligent Assistant System for Requirement Specification in Smart Cities

    Authors: Zirong Chen, Isaac Li, Haoxiang Zhang, Sarah Preum, John A. Stankovic, Meiyi Ma

    Abstract: An increasing number of monitoring systems have been developed in smart cities to ensure that real-time operations of a city satisfy safety and performance requirements. However, many existing city requirements are written in English with missing, inaccurate, or ambiguous information. There is a high demand for assisting city policy makers in converting human-specified requirements to machine-unde…

    Submitted 14 June, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: This paper is accepted by SMARTCOMP 2022

  21. arXiv:2007.05831  [pdf, other]

    cs.CY

    MFED: A System for Monitoring Family Eating Dynamics

    Authors: Md Abu Sayeed Mondol, Brooke Bell, Meiyi Ma, Ridwan Alam, Ifat Emi, Sarah Masud Preum, Kayla de la Haye, Donna Spruijt-Metz, John C. Lach, John A. Stankovic

    Abstract: Obesity is a risk factor for many health issues, including heart disease, diabetes, osteoarthritis, and certain cancers. One of the primary behavioral causes, dietary intake, has proven particularly challenging to measure and track. Current behavioral science suggests that family eating dynamics (FED) have high potential to impact child and parent dietary intake, and ultimately the risk of obesity…

    Submitted 11 July, 2020; originally announced July 2020.

  22. arXiv:1910.12444  [pdf]

    cs.HC cs.CY

    Information Seeking and Information Processing Behaviors Among Type 2 Diabetics

    Authors: Sarah Masud Preum, Kate Clark, Ashley Davis, Konstantine Khutsishvilli, Rupa S Valdez

    Abstract: Effective patient education is critical for managing Type 2 Diabetes Mellitus (T2DM), one of the most common chronic diseases in the United States. While some studies focus on the information-seeking behavior of T2DM patients, other self-education behaviors including information processing and utilization are rarely explored in the context of T2DM. This study sought to assess two self-education be…

    Submitted 28 October, 2019; originally announced October 2019.