Showing 1–10 of 10 results for author: Truong, T H

  1. arXiv:2404.02421

    cs.CL

    Revisiting subword tokenization: A case study on affixal negation in large language models

    Authors: Thinh Hung Truong, Yulia Otmakhova, Karin Verspoor, Trevor Cohn, Timothy Baldwin

    Abstract: In this work, we measure the impact of affixal negation on modern English large language models (LLMs). In affixal negation, the negated meaning is expressed through a negative morpheme, which is potentially challenging for LLMs as their tokenizers are often not morphologically plausible. We conduct extensive experiments using LLMs with different subword tokenization methods, which lead to several…

    Submitted 4 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

    Comments: NAACL 2024

  2. arXiv:2306.08189

    cs.CL

    Language models are not naysayers: An analysis of language models on negation benchmarks

    Authors: Thinh Hung Truong, Timothy Baldwin, Karin Verspoor, Trevor Cohn

    Abstract: Negation has been shown to be a major bottleneck for masked language models, such as BERT. However, whether this finding still holds for larger-sized auto-regressive language models ("LLMs") has not been studied comprehensively. With the ever-increasing volume of research and applications of LLMs, we take a step back to evaluate the ability of current-generation LLMs to handle negation, a fundam…

    Submitted 13 June, 2023; originally announced June 2023.

  3. arXiv:2305.13693

    cs.CL

    Automated Metrics for Medical Multi-Document Summarization Disagree with Human Evaluations

    Authors: Lucy Lu Wang, Yulia Otmakhova, Jay DeYoung, Thinh Hung Truong, Bailey E. Kuehl, Erin Bransom, Byron C. Wallace

    Abstract: Evaluating multi-document summarization (MDS) quality is difficult. This is especially true in the case of MDS for biomedical literature reviews, where models must synthesize contradicting evidence reported across different documents. Prior work has shown that rather than performing the task, models may exploit shortcuts that are difficult to detect using standard n-gram similarity metrics such as…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: ACL 2023; GitHub: https://github.com/allenai/mslr-annotated-dataset

  4. arXiv:2210.03256

    cs.CL

    Not another Negation Benchmark: The NaN-NLI Test Suite for Sub-clausal Negation

    Authors: Thinh Hung Truong, Yulia Otmakhova, Timothy Baldwin, Trevor Cohn, Jey Han Lau, Karin Verspoor

    Abstract: Negation is poorly captured by current language models, although the extent of this problem is not widely understood. We introduce a natural language inference (NLI) test suite to enable probing the capabilities of NLP methods, with the aim of understanding sub-clausal negation. The test suite contains premise–hypothesis pairs where the premise contains sub-clausal negation and the hypothesis is…

    Submitted 13 October, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: AACL-IJCNLP 2022

  5. From Disfluency Detection to Intent Detection and Slot Filling

    Authors: Mai Hoang Dao, Thinh Hung Truong, Dat Quoc Nguyen

    Abstract: We present the first empirical study investigating the influence of disfluency detection on downstream tasks of intent detection and slot filling. We perform this study for Vietnamese, a low-resource language that has no previous study as well as no public dataset available for disfluency detection. First, we extend the fluent Vietnamese intent detection and slot filling dataset PhoATIS by manua…

    Submitted 17 September, 2022; originally announced September 2022.

    Comments: In Proceedings of INTERSPEECH 2022

  6. arXiv:2205.04012

    cs.CL

    Improving negation detection with negation-focused pre-training

    Authors: Thinh Hung Truong, Timothy Baldwin, Trevor Cohn, Karin Verspoor

    Abstract: Negation is a common linguistic feature that is crucial in many language understanding tasks, yet it remains a hard problem due to diversity in its expression in different types of text. Recent work has shown that state-of-the-art NLP models underperform on samples containing negation in various tasks, and that negation detection models do not transfer well across domains. We propose a new negatio…

    Submitted 8 May, 2022; originally announced May 2022.

  7. arXiv:2202.07858

    cs.CL cs.IR

    ITTC @ TREC 2021 Clinical Trials Track

    Authors: Thinh Hung Truong, Yulia Otmakhova, Rahmad Mahendra, Timothy Baldwin, Jey Han Lau, Trevor Cohn, Lawrence Cavedon, Damiano Spina, Karin Verspoor

    Abstract: This paper describes the submissions of the Natural Language Processing (NLP) team from the Australian Research Council Industrial Transformation Training Centre (ITTC) for Cognitive Computing in Medical Technologies to the TREC 2021 Clinical Trials Track. The task focuses on the problem of matching eligible clinical trials to topics constituting a summary of a patient's admission notes. We explor…

    Submitted 15 February, 2022; originally announced February 2022.

    Comments: 7 pages

  8. arXiv:2104.03879

    cs.CL

    COVID-19 Named Entity Recognition for Vietnamese

    Authors: Thinh Hung Truong, Mai Hoang Dao, Dat Quoc Nguyen

    Abstract: The current COVID-19 pandemic has led to the creation of many corpora that facilitate NLP research and downstream applications to help fight the pandemic. However, most of these corpora are exclusively for English. As the pandemic is a global problem, it is worth creating COVID-19 related datasets for languages other than English. In this paper, we present the first manually-annotated COVID-19 do…

    Submitted 8 April, 2021; originally announced April 2021.

    Comments: To appear in Proceedings of NAACL 2021

  9. arXiv:2104.02021

    cs.CL

    Intent Detection and Slot Filling for Vietnamese

    Authors: Mai Hoang Dao, Thinh Hung Truong, Dat Quoc Nguyen

    Abstract: Intent detection and slot filling are important tasks in spoken and natural language understanding. However, Vietnamese is a low-resource language in these research topics. In this paper, we present the first public intent detection and slot filling dataset for Vietnamese. In addition, we propose a joint model for intent detection and slot filling that extends the recent state-of-the-art Joi…

    Submitted 9 June, 2021; v1 submitted 5 April, 2021; originally announced April 2021.

    Comments: To appear in Proceedings of INTERSPEECH 2021; The first two authors contributed equally to this work

  10. arXiv:1908.09766

    cs.NI eess.SY

    A Hybrid of Adaptation and Dynamic Routing based on SDN for Improving QoE in HTTP Adaptive VBR Video Streaming

    Authors: Hong Thinh Pham, Ngoc Nam Pham, Huu Thanh Nguyen, Alan Marshall, Thu Huong Truong

    Abstract: Recently, HTTP Adaptive Streaming (HAS) has received significant attention from both industry and academia based on its ability to enhance media streaming services over the Internet. Recent research solutions that have tried to improve HAS by adaptation at the client side only may not be completely effective without interacting with routing decisions in the upper layers. In this paper, we address…

    Submitted 26 August, 2019; originally announced August 2019.

    Comments: 14 pages, 17 figures, IJCSNS International Journal of Computer Science and Network Security, http://paper.ijcsns.org/07_book/201907/20190708.pdf

    Journal ref: Vol. 19, No. 7, July 2019