Showing 1–7 of 7 results for author: Basak, M

  1. arXiv:2404.01147

    cs.CL cs.LG

    Do LLMs Find Human Answers To Fact-Driven Questions Perplexing? A Case Study on Reddit

    Authors: Parker Seegmiller, Joseph Gatto, Omar Sharif, Madhusudan Basak, Sarah Masud Preum

    Abstract: Large language models (LLMs) have been shown to be proficient in correctly answering questions in the context of online discourse. However, the study of using LLMs to model human-like answers to fact-driven social media questions is still under-explored. In this work, we investigate how LLMs model the wide variety of human answers to fact-driven questions posed on several topic-specific Reddit com…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 4 pages, 2 figures

  2. arXiv:2403.03336

    cs.CL cs.SI

    Scope of Large Language Models for Mining Emerging Opinions in Online Health Discourse

    Authors: Joseph Gatto, Madhusudan Basak, Yash Srivastava, Philip Bohlman, Sarah M. Preum

    Abstract: In this paper, we develop an LLM-powered framework for the curation and evaluation of emerging opinion mining in online health communities. We formulate emerging opinion mining as a pairwise stance detection problem between (title, comment) pairs sourced from Reddit, where post titles contain emerging health-related claims on a topic that is not predefined. The claims are either explicitly or impl…

    Submitted 5 March, 2024; originally announced March 2024.

  3. arXiv:2308.09156

    cs.CL

    Characterizing Information Seeking Events in Health-Related Social Discourse

    Authors: Omar Sharif, Madhusudan Basak, Tanzia Parvin, Ava Scharfstein, Alphonso Bradham, Jacob T. Borodovsky, Sarah E. Lord, Sarah M. Preum

    Abstract: Social media sites have become a popular platform for individuals to seek and share health information. Despite the progress in natural language processing for social media mining, a gap remains in analyzing health-related texts on social discourse in the context of events. Event-driven analysis can offer insights into different facets of healthcare at an individual and collective level, including…

    Submitted 19 December, 2023; v1 submitted 17 August, 2023; originally announced August 2023.

    Comments: Accepted at AAAI-2024. 9 pages, 6 tables, 2 figures

  4. arXiv:2303.09366

    cs.CL cs.LG

    The Scope of In-Context Learning for the Extraction of Medical Temporal Constraints

    Authors: Parker Seegmiller, Joseph Gatto, Madhusudan Basak, Diane Cook, Hassan Ghasemzadeh, John Stankovic, Sarah Preum

    Abstract: Medications often impose temporal constraints on everyday patient activity. Violations of such medical temporal constraints (MTCs) lead to a lack of treatment adherence, in addition to poor health outcomes and increased healthcare expenses. These MTCs are found in drug usage guidelines (DUGs) in both patient education materials and clinical texts. Computationally representing MTCs in DUGs will adv…

    Submitted 16 October, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

  5. arXiv:2301.11508

    cs.CL

    Theme-driven Keyphrase Extraction to Analyze Social Media Discourse

    Authors: William Romano, Omar Sharif, Madhusudan Basak, Joseph Gatto, Sarah Preum

    Abstract: Social media platforms are vital resources for sharing self-reported health experiences, offering rich data on various health topics. Despite advancements in Natural Language Processing (NLP) enabling large-scale social media data analysis, a gap remains in applying keyphrase extraction to health-related content. Keyphrase extraction is used to identify salient concepts in social media discourse w…

    Submitted 28 May, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

    Comments: 11 pages, 2 figures, submitted to ICWSM. This version represents a substantial expansion and refocus of the previous manuscript, including new experiments, expanded data analysis, and comprehensive discussions

  6. arXiv:2209.11102

    cs.CL

    Scope of Pre-trained Language Models for Detecting Conflicting Health Information

    Authors: Joseph Gatto, Madhusudan Basak, Sarah M. Preum

    Abstract: An increasing number of people now rely on online platforms to meet their health information needs. Thus identifying inconsistent or conflicting textual health information has become a safety-critical task. Health advice data poses a unique challenge where information that is accurate in the context of one diagnosis can be conflicting in the context of another. For example, people suffering from d…

    Submitted 22 September, 2022; originally announced September 2022.

  7. arXiv:2009.09359

    cs.CL

    Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation

    Authors: Tahmid Hasan, Abhik Bhattacharjee, Kazi Samin, Masum Hasan, Madhusudan Basak, M. Sohel Rahman, Rifat Shahriyar

    Abstract: Despite being the seventh most widely spoken language in the world, Bengali has received much less attention in machine translation literature due to being low in resources. Most publicly available parallel corpora for Bengali are not large enough; and have rather poor quality, mostly because of incorrect sentence alignments resulting from erroneous sentence segmentation, and also because of a hig…

    Submitted 7 October, 2020; v1 submitted 20 September, 2020; originally announced September 2020.

    Comments: EMNLP 2020