
Showing 1–4 of 4 results for author: Vu, D N L

  1. arXiv:2408.00122  [pdf, other]

    cs.CL

    A Course Shared Task on Evaluating LLM Output for Clinical Questions

    Authors: Yufang Hou, Thy Thy Tran, Doan Nam Long Vu, Yiwen Cao, Kai Li, Lukas Rohde, Iryna Gurevych

    Abstract: This paper presents a shared task that we organized at the Foundations of Language Technology (FoLT) course in 2023/2024 at the Technical University of Darmstadt, which focuses on evaluating the output of Large Language Models (LLMs) in generating harmful answers to health-related clinical questions. We describe the task design considerations and report the feedback we received from the students.…

    Submitted 31 July, 2024; originally announced August 2024.

    Comments: accepted at the sixth Workshop on Teaching NLP (co-located with ACL 2024)

  2. arXiv:2407.18789  [pdf, other]

    cs.CL

    Granularity is crucial when applying differential privacy to text: An investigation for neural machine translation

    Authors: Doan Nam Long Vu, Timour Igamberdiev, Ivan Habernal

    Abstract: Applying differential privacy (DP) by means of the DP-SGD algorithm to protect individual data points during training is becoming increasingly popular in NLP. However, the choice of granularity at which DP is applied is often neglected. For example, neural machine translation (NMT) typically operates on the sentence-level granularity. From the perspective of DP, this setup assumes that each senten…

    Submitted 26 July, 2024; originally announced July 2024.

  3. arXiv:2311.14465  [pdf, other]

    cs.CL

    DP-NMT: Scalable Differentially-Private Machine Translation

    Authors: Timour Igamberdiev, Doan Nam Long Vu, Felix Künnecke, Zhuo Yu, Jannik Holmer, Ivan Habernal

    Abstract: Neural machine translation (NMT) is a widely popular text generation task, yet there is a considerable research gap in the development of privacy-preserving NMT models, despite significant data privacy concerns for NMT systems. Differentially private stochastic gradient descent (DP-SGD) is a popular method for training machine learning models with concrete privacy guarantees; however, the implemen…

    Submitted 24 April, 2024; v1 submitted 24 November, 2023; originally announced November 2023.

    Comments: Accepted at EACL 2024

  4. arXiv:2209.02317  [pdf, other]

    cs.CL

    Layer or Representation Space: What makes BERT-based Evaluation Metrics Robust?

    Authors: Doan Nam Long Vu, Nafise Sadat Moosavi, Steffen Eger

    Abstract: The evaluation of recent embedding-based evaluation metrics for text generation is primarily based on measuring their correlation with human evaluations on standard benchmarks. However, these benchmarks are mostly from similar domains to those used for pretraining word embeddings. This raises concerns about the (lack of) generalization of embedding-based metrics to new and noisy domains that conta…

    Submitted 7 September, 2022; v1 submitted 6 September, 2022; originally announced September 2022.

    Comments: COLING 2022 camera-ready version