Showing 1–2 of 2 results for author: Tzou, N

  1. STEER: Semantic Turn Extension-Expansion Recognition for Voice Assistants

    Authors: Leon Liyang Zhang, Jiarui Lu, Joel Ruben Antony Moniz, Aditya Kulkarni, Dhivya Piraviperumal, Tien Dung Tran, Nicholas Tzou, Hong Yu

    Abstract: In the context of a voice assistant system, steering refers to the phenomenon in which a user issues a follow-up command attempting to direct or clarify a previous turn. We propose STEER, a steering detection model that predicts whether a follow-up turn is a user's attempt to steer the previous command. Constructing a training dataset for steering use cases poses challenges due to the cold-start p…

    Submitted 25 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Industry Track

  2. arXiv:2110.06416

    cs.CV cs.LG

    MMIU: Dataset for Visual Intent Understanding in Multimodal Assistants

    Authors: Alkesh Patel, Joel Ruben Antony Moniz, Roman Nguyen, Nick Tzou, Hadas Kotek, Vincent Renkens

    Abstract: In multimodal assistant, where vision is also one of the input modalities, the identification of user intent becomes a challenging task as visual input can influence the outcome. Current digital assistants take spoken input and try to determine the user intent from conversational or device context. So, a dataset, which includes visual input (i.e. images or videos for the corresponding questions ta…

    Submitted 30 October, 2021; v1 submitted 12 October, 2021; originally announced October 2021.

    Comments: Extended abstract accepted for WeCNLP 2021