Search | arXiv e-print repository

arXiv:2401.15966

Response Generation for Cognitive Behavioral Therapy with Large Language Models: Comparative Study with Socratic Questioning

Authors: Kenta Izumi, Hiroki Tanaka, Kazuhiro Shidara, Hiroyoshi Adachi, Daisuke Kanayama, Takashi Kudo, Satoshi Nakamura

Abstract: Dialogue systems controlled by predefined or rule-based scenarios derived from counseling techniques, such as cognitive behavioral therapy (CBT), play an important role in mental health apps. Despite the need for responsible responses, it is conceivable that using the newly emerging LLMs to generate contextually relevant utterances will enhance these apps. In this study, we construct dialogue modules based on a CBT scenario focused on conventional Socratic questioning using two kinds of LLMs: a Transformer-based dialogue model further trained with a social media empathetic counseling dataset, provided by Osaka Prefecture (OsakaED), and GPT-4, a state-of-the-art LLM created by OpenAI. By comparing systems that use LLM-generated responses with those that do not, we investigate the impact of generated responses on subjective evaluations such as mood change, cognitive change, and dialogue quality (e.g., empathy). As a result, no notable improvements are observed when using the OsakaED model. When using GPT-4, the amount of mood change, empathy, and other dialogue qualities improve significantly. Results suggest that GPT-4 possesses a high counseling ability. However, they also indicate that even when using a dialogue model trained with a human counseling dataset, it does not necessarily yield better outcomes compared to scenario-based dialogues. While presenting LLM-generated responses, including GPT-4, and having them interact directly with users in real-life mental health care services may raise ethical issues, it is still possible for human professionals to produce example responses or response templates using LLMs in advance in systems that use rules, scenarios, or example responses.

Submitted 29 January, 2024; originally announced January 2024.

Comments: Accepted by IWSDS2024

arXiv:2304.13693

Requirements Engineering, Software Testing and Education: A Systematic Mapping

Authors: Thalia S. Santana, Taciana N. Kudo, Renato F. Bulcão-Neto

Abstract: The activities of requirements engineering and software testing are intrinsically related to each other, as these two areas are linked when seeking to specify and also ensure the expectations of a software product, with quality and on time. This systematic mapping study aims to verify how requirements and testing are being addressed together in the educational context.

Submitted 26 April, 2023; originally announced April 2023.

Comments: 20 pages, in Portuguese language

arXiv:2008.09459

Metamodel Quality Requirements and Evaluation (MQuaRE)

Authors: Taciana Novo Kudo, Renato F. Bulcão-Neto, Auri Marcelo Rizzo Vincenzi

Abstract: Models are the primary artifacts of model-driven software engineering (MDSD) [1], and a terminal model is a representation that conforms to a given software metamodel [2, 3]. As the quality of a software metamodel directly impacts the quality of terminal models, software metamodel quality is an essential aspect of MDSD. However, the literature reports a few proposals for metamodel quality evaluation, but most lack a general solution for the quality issue. Some efforts focus on quality measures [4], a quality evaluation model [5], or a quality evaluation model with structural measures borrowed from OO design [6, 7, 8]. Thus, we support there is a need for a more thorough solution for metamodel quality evaluation, with potential benefits to MDSD in general. This document describes a metamodel quality evaluation framework called MQuaRE (Metamodel Quality Requirements and Evaluation). MQuaRE is an integrated framework composed of metamodel quality requirements, a metamodel quality model, metamodel quality measures, and an evaluation process, with a great contribution of the ISO/IEC 25000 series [9] for software product quality evaluation.

Submitted 9 September, 2020; v1 submitted 19 August, 2020; originally announced August 2020.

arXiv:1808.06226

SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing

Authors: Taku Kudo, John Richardson

Abstract: This paper describes SentencePiece, a language-independent subword tokenizer and detokenizer designed for Neural-based text processing, including Neural Machine Translation. It provides open-source C++ and Python implementations for subword units. While existing subword segmentation tools assume that the input is pre-tokenized into word sequences, SentencePiece can train subword models directly from raw sentences, which allows us to make a purely end-to-end and language independent system. We perform a validation experiment of NMT on English-Japanese machine translation, and find that it is possible to achieve comparable accuracy to direct subword training from raw sentences. We also compare the performance of subword training and segmentation with various configurations. SentencePiece is available under the Apache 2 license at https://github.com/google/sentencepiece.

Submitted 19 August, 2018; originally announced August 2018.

Comments: Accepted as a demo paper at EMNLP2018
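
As a rough illustration of what tokenizing and detokenizing "directly from raw sentences" can look like, the sketch below treats whitespace as an ordinary symbol, so that detokenization is plain string concatenation and needs no language-specific rules. The marker character, the toy vocabulary, and the greedy longest-match segmenter are assumptions made for this example only; this is not the SentencePiece training or segmentation algorithm, and it does not handle input that already contains the marker.

```python
SPACE_MARK = "\u2581"  # assumed visible stand-in for a space character


def encode(text, vocab):
    """Greedy longest-match split of raw text; unknown characters become single pieces."""
    marked = SPACE_MARK + text.replace(" ", SPACE_MARK)
    pieces, i = [], 0
    while i < len(marked):
        # Try the longest candidate first; a single character always succeeds.
        for j in range(len(marked), i, -1):
            if marked[i:j] in vocab or j == i + 1:
                pieces.append(marked[i:j])
                i = j
                break
    return pieces


def decode(pieces):
    """Invert encode(): concatenate, restore spaces, drop the leading space."""
    return "".join(pieces).replace(SPACE_MARK, " ")[1:]


# Hypothetical vocabulary, chosen only to make the example readable.
TOY_VOCAB = {SPACE_MARK + "hello", SPACE_MARK + "wor", "ld"}
pieces = encode("hello world", TOY_VOCAB)
restored = decode(pieces)
```

Because spaces are carried inside the pieces, `decode(encode(text, vocab))` returns the original text for any input that does not itself contain the marker, including text with repeated spaces.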

arXiv:1804.10959

Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates

Authors: Taku Kudo

Abstract: Subword units are an effective way to alleviate the open vocabulary problems in neural machine translation (NMT). While sentences are usually converted into unique subword sequences, subword segmentation is potentially ambiguous and multiple segmentations are possible even with the same vocabulary. The question addressed in this paper is whether it is possible to harness the segmentation ambiguity as a noise to improve the robustness of NMT. We present a simple regularization method, subword regularization, which trains the model with multiple subword segmentations probabilistically sampled during training. In addition, for better subword sampling, we propose a new subword segmentation algorithm based on a unigram language model. We experiment with multiple corpora and report consistent improvements especially on low resource and out-of-domain settings.

Submitted 29 April, 2018; originally announced April 2018.

Comments: Accepted as a long paper at ACL2018
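
The idea in the abstract above, that one vocabulary admits several segmentations and that a segmentation can be sampled in proportion to its unigram probability, can be sketched in a few lines. The piece probabilities below are invented for illustration, and the exhaustive enumeration is an assumption that only suits short strings; it is not the sampling procedure used in the paper.

```python
import random


def segmentations(word, probs):
    """Yield every split of `word` into pieces that appear in `probs`."""
    if not word:
        yield []
        return
    for k in range(1, len(word) + 1):
        head = word[:k]
        if head in probs:
            for rest in segmentations(word[k:], probs):
                yield [head] + rest


def unigram_score(pieces, probs):
    """Probability of a segmentation under a unigram model: product of piece probabilities."""
    score = 1.0
    for piece in pieces:
        score *= probs[piece]
    return score


def sample_segmentation(word, probs, rng):
    """Draw one segmentation with probability proportional to its unigram score."""
    candidates = list(segmentations(word, probs))
    weights = [unigram_score(c, probs) for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Hypothetical piece probabilities (not taken from any trained model).
TOY_PROBS = {"h": 0.05, "e": 0.05, "l": 0.05, "o": 0.05,
             "he": 0.2, "ll": 0.2, "hello": 0.4}
rng = random.Random(0)
samples = [sample_segmentation("hello", TOY_PROBS, rng) for _ in range(5)]
```

Each call may return a different split of the same word, which is the noise that the training procedure exploits; every sampled split still concatenates back to the original string.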

arXiv:1609.08144

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation

Authors: Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, et al. (6 additional authors not shown)

Abstract: Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation, with the potential to overcome many of the weaknesses of conventional phrase-based translation systems. Unfortunately, NMT systems are known to be computationally expensive both in training and in translation inference. Also, most NMT systems have difficulty with rare words. These issues have hindered NMT's use in practical deployments and services, where both accuracy and speed are essential. In this work, we present GNMT, Google's Neural Machine Translation system, which attempts to address many of these issues. Our model consists of a deep LSTM network with 8 encoder and 8 decoder layers using attention and residual connections. To improve parallelism and therefore decrease training time, our attention mechanism connects the bottom layer of the decoder to the top layer of the encoder. To accelerate the final translation speed, we employ low-precision arithmetic during inference computations. To improve handling of rare words, we divide words into a limited set of common sub-word units ("wordpieces") for both input and output. This method provides a good balance between the flexibility of "character"-delimited models and the efficiency of "word"-delimited models, naturally handles translation of rare words, and ultimately improves the overall accuracy of the system. Our beam search technique employs a length-normalization procedure and uses a coverage penalty, which encourages generation of an output sentence that is most likely to cover all the words in the source sentence. On the WMT'14 English-to-French and English-to-German benchmarks, GNMT achieves competitive results to state-of-the-art. Using a human side-by-side evaluation on a set of isolated simple sentences, it reduces translation errors by an average of 60% compared to Google's phrase-based production system.

Submitted 8 October, 2016; v1 submitted 26 September, 2016; originally announced September 2016.
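
The abstract above mentions a length-normalization procedure and a coverage penalty in beam search but does not give formulas. The sketch below shows one plausible form of such a rescoring step: the log-probability is divided by a length term, and a penalty is added for source positions that received little total attention. The constants, the default `alpha` and `beta` values, the clamping floor, and all function names are assumptions for illustration and should be checked against the paper itself.

```python
import math


def length_penalty(length, alpha):
    """Length term that grows with output length; equals 1.0 for a one-token output."""
    return ((5.0 + length) / 6.0) ** alpha


def coverage_penalty(attention, beta, floor=1e-9):
    """Penalize source positions whose summed attention stays below 1.

    attention[j][i] is the attention weight of target step j on source position i.
    `floor` (an assumption) avoids log(0) for positions that were never attended to.
    """
    n_source = len(attention[0])
    total = 0.0
    for i in range(n_source):
        covered = sum(step[i] for step in attention)
        total += math.log(min(max(covered, floor), 1.0))
    return beta * total


def rescore(log_prob, attention, alpha=0.2, beta=0.2):
    """Combine a hypothesis log-probability with both terms (higher is better)."""
    n_target = len(attention)
    return log_prob / length_penalty(n_target, alpha) + coverage_penalty(attention, beta)


# Two hypothetical two-token hypotheses over a two-word source sentence.
full_cover = [[1.0, 0.0], [0.0, 1.0]]     # each source word attended once
partial_cover = [[1.0, 0.0], [0.9, 0.1]]  # second source word mostly ignored
score_full = rescore(-1.0, full_cover)
score_partial = rescore(-1.0, partial_cover)
```

With equal log-probabilities, the hypothesis that attends to every source word keeps a zero coverage penalty and therefore ranks above the one that largely ignores the second source word.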

Showing 1–6 of 6 results for author: Kudo, T