
Showing 1–16 of 16 results for author: Small, K

  1. arXiv:2404.01652

    cs.CL cs.AI

    Towards Better Generalization in Open-Domain Question Answering by Mitigating Context Memorization

    Authors: Zixuan Zhang, Revanth Gangi Reddy, Kevin Small, Tong Zhang, Heng Ji

    Abstract: Open-domain Question Answering (OpenQA) aims at answering factual questions with an external large-scale knowledge corpus. However, real-world knowledge is not static; it updates and evolves continually. Such a dynamic characteristic of knowledge poses a vital challenge for these models, as the trained models need to constantly adapt to the latest information to make sure that the answers remain a…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted to NAACL 2024 Findings

  2. arXiv:2402.11060

    cs.CL cs.AI cs.IR

    Persona-DB: Efficient Large Language Model Personalization for Response Prediction with Collaborative Data Refinement

    Authors: Chenkai Sun, Ke Yang, Revanth Gangi Reddy, Yi R. Fung, Hou Pong Chan, Kevin Small, ChengXiang Zhai, Heng Ji

    Abstract: The increasing demand for personalized interactions with large language models (LLMs) calls for methodologies capable of accurately and efficiently identifying user opinions and preferences. Retrieval augmentation emerges as an effective strategy, as it can accommodate a vast number of users without the costs from fine-tuning. Existing research, however, has largely focused on enhancing the retrie…

    Submitted 20 August, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  3. arXiv:2310.16197

    cs.CL

    Background Summarization of Event Timelines

    Authors: Adithya Pratapa, Kevin Small, Markus Dreyer

    Abstract: Generating concise summaries of news events is a challenging natural language processing task. While journalists often curate timelines to highlight key sub-events, newcomers to a news event face challenges in catching up on its historical context. In this paper, we address this need by introducing the task of background news summarization, which complements each timeline update with a background…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 camera-ready

  4. arXiv:2301.12473

    cs.CL cs.LG

    Large Language Models for Biomedical Knowledge Graph Construction: Information extraction from EMR notes

    Authors: Vahan Arsenyan, Spartak Bughdaryan, Fadi Shaya, Kent Small, Davit Shahnazaryan

    Abstract: The automatic construction of knowledge graphs (KGs) is an important research area in medicine, with far-reaching applications spanning drug discovery and clinical trial design. These applications hinge on the accurate identification of interactions among medical and biological entities. In this study, we propose an end-to-end machine learning solution based on large language models (LLMs) that ut…

    Submitted 9 December, 2023; v1 submitted 29 January, 2023; originally announced January 2023.

  5. arXiv:2212.01146

    cs.CL

    SumREN: Summarizing Reported Speech about Events in News

    Authors: Revanth Gangi Reddy, Heba Elfardy, Hou Pong Chan, Kevin Small, Heng Ji

    Abstract: A primary objective of news articles is to establish the factual record for an event, frequently achieved by conveying both the details of the specified event (i.e., the 5 Ws; Who, What, Where, When and Why regarding the event) and how people reacted to it (i.e., reported statements). However, existing work on news summarization almost exclusively focuses on the event details. In this work, we pro…

    Submitted 7 March, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: Accepted at AAAI 2023

  6. arXiv:2205.12386

    cs.CL cs.IR cs.LG

    PLAtE: A Large-scale Dataset for List Page Web Extraction

    Authors: Aidan San, Yuan Zhuang, Jan Bakus, Colin Lockard, David Ciemiewicz, Sandeep Atluri, Yangfeng Ji, Kevin Small, Heba Elfardy

    Abstract: Recently, neural models have been leveraged to significantly improve the performance of information extraction from semi-structured websites. However, a barrier for continued progress is the small number of datasets large enough to train these models. In this work, we introduce the PLAtE (Pages of Lists Attribute Extraction) benchmark dataset as a challenging new web extraction task. PLAtE focuses…

    Submitted 15 June, 2023; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: Accepted to ACL Industry Track 2023

  7. arXiv:2205.00042

    cs.CL

    Answer Consolidation: Formulation and Benchmarking

    Authors: Wenxuan Zhou, Qiang Ning, Heba Elfardy, Kevin Small, Muhao Chen

    Abstract: Current question answering (QA) systems primarily consider the single-answer scenario, where each question is assumed to be paired with one correct answer. However, in many real-world QA applications, multiple answer scenarios arise where consolidating answers into a comprehensive and non-redundant set of answers is a more efficient user interface. In this paper, we formulate the problem of answer…

    Submitted 29 April, 2022; originally announced May 2022.

    Comments: NAACL 2022

  8. arXiv:2112.08544

    cs.CL cs.AI

    NewsClaims: A New Benchmark for Claim Detection from News with Attribute Knowledge

    Authors: Revanth Gangi Reddy, Sai Chetan, Zhenhailong Wang, Yi R. Fung, Kathryn Conger, Ahmed Elsayed, Martha Palmer, Preslav Nakov, Eduard Hovy, Kevin Small, Heng Ji

    Abstract: Claim detection and verification are crucial for news understanding and have emerged as promising technologies for mitigating misinformation and disinformation in the news. However, most existing work has focused on claim sentence analysis while overlooking additional crucial attributes (e.g., the claimer and the main object associated with the claim). In this work, we present NewsClaims, a new be…

    Submitted 23 November, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: Accepted at EMNLP 2022

  9. arXiv:2109.04689

    cs.CL cs.AI cs.LG

    Generating Self-Contained and Summary-Centric Question Answer Pairs via Differentiable Reward Imitation Learning

    Authors: Li Zhou, Kevin Small, Yong Zhang, Sandeep Atluri

    Abstract: Motivated by suggested question generation in conversational news recommendation systems, we propose a model for generating question-answer pairs (QA pairs) with self-contained, summary-centric questions and length-constrained, article-summarizing answers. We begin by collecting a new dataset of news articles with questions as titles and pairing them with summaries of varying length. This dataset…

    Submitted 10 September, 2021; originally announced September 2021.

    Comments: To appear in Proceedings of EMNLP 2021

  10. arXiv:2011.07353

    eess.IV cs.CV

    Pneumothorax and chest tube classification on chest x-rays for detection of missed pneumothorax

    Authors: Benedikt Graf, Arkadiusz Sitek, Amin Katouzian, Yen-Fu Lu, Arun Krishnan, Justin Rafael, Kirstin Small, Yiting Xie

    Abstract: Chest x-ray imaging is widely used for the diagnosis of pneumothorax and there has been significant interest in developing automated methods to assist in image interpretation. We present an image classification pipeline which detects pneumothorax as well as the various types of chest tubes that are commonly used to treat pneumothorax. Our multi-stage algorithm is based on lung segmentation followe…

    Submitted 14 November, 2020; originally announced November 2020.

  11. Summary-Oriented Question Generation for Informational Queries

    Authors: Xusen Yin, Li Zhou, Kevin Small, Jonathan May

    Abstract: Users frequently ask simple factoid questions for question answering (QA) systems, attenuating the impact of myriad recent works that support more complex questions. Prompting users with automatically generated suggested questions (SQs) can improve user understanding of QA system capabilities and thus facilitate more effective use. We aim to produce self-explanatory questions that focus on main do…

    Submitted 9 July, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

    Comments: 17 pages

  12. arXiv:2008.06924

    cs.LG cs.AI

    Inverse Reinforcement Learning with Natural Language Goals

    Authors: Li Zhou, Kevin Small

    Abstract: Humans generally use natural language to communicate task requirements to each other. Ideally, natural language should also be usable for communicating goals to autonomous machines (e.g., robots) to minimize friction in task specification. However, understanding and mapping natural language goals to sequences of states and actions is challenging. Specifically, existing work along these lines has e…

    Submitted 15 December, 2020; v1 submitted 16 August, 2020; originally announced August 2020.

    Comments: To appear in Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI 2021)

  13. arXiv:2005.10464

    cs.CL

    Fluent Response Generation for Conversational Question Answering

    Authors: Ashutosh Baheti, Alan Ritter, Kevin Small

    Abstract: Question answering (QA) is an important aspect of open-domain conversational agents, garnering specific research focus in the conversational QA (ConvQA) subtask. One notable limitation of recent ConvQA efforts is the response being answer span extraction from the target corpus, thus ignoring the natural language generation (NLG) aspect of high-quality conversational agents. In this work, we propos…

    Submitted 16 December, 2020; v1 submitted 21 May, 2020; originally announced May 2020.

    Comments: 2020 Annual Conference of the Association for Computational Linguistics

  14. arXiv:1911.06192

    cs.CL cs.AI cs.LG stat.ML

    Multi-domain Dialogue State Tracking as Dynamic Knowledge Graph Enhanced Question Answering

    Authors: Li Zhou, Kevin Small

    Abstract: Multi-domain dialogue state tracking (DST) is a critical component for conversational AI systems. The domain ontology (i.e., specification of domains, slots, and values) of a conversational AI system is generally incomplete, making the capability for DST models to generalize to new slots, values, and domains during inference imperative. In this paper, we propose to model multi-domain DST as a ques…

    Submitted 20 June, 2020; v1 submitted 7 November, 2019; originally announced November 2019.

  15. arXiv:1811.12591

    cs.LG stat.ML

    Active Learning in Recommendation Systems with Multi-level User Preferences

    Authors: Yuheng Bu, Kevin Small

    Abstract: While recommendation systems generally observe user behavior passively, there has been an increased interest in directly querying users to learn their specific preferences. In such settings, considering queries at different levels of granularity to optimize user information acquisition is crucial to efficiently providing a good user experience. In this work, we study the active learning problem wi…

    Submitted 29 November, 2018; originally announced November 2018.

  16. arXiv:1712.02838

    cs.AI cs.CL cs.LG

    End-to-End Offline Goal-Oriented Dialog Policy Learning via Policy Gradient

    Authors: Li Zhou, Kevin Small, Oleg Rokhlenko, Charles Elkan

    Abstract: Learning a goal-oriented dialog policy is generally performed offline with supervised learning algorithms or online with reinforcement learning (RL). Additionally, as companies accumulate massive quantities of dialog transcripts between customers and trained human agents, encoder-decoder methods have gained popularity as agent utterances can be directly treated as supervision without the need for…

    Submitted 7 December, 2017; originally announced December 2017.

    Comments: Workshop on Conversational AI, NIPS 2017, Long Beach, CA, USA