Search | arXiv e-print repository

Jill Watson: A Virtual Teaching Assistant powered by ChatGPT

Authors: Karan Taneja, Pratyusha Maiti, Sandeep Kakar, Pranav Guruprasad, Sanjeev Rao, Ashok K. Goel

Abstract: Conversational AI agents often require extensive datasets for training that are not publicly released, are limited to social chit-chat or handling a specific domain, and may not be easily extended to accommodate the latest advances in AI technologies. This paper introduces Jill Watson, a conversational Virtual Teaching Assistant (VTA) leveraging the capabilities of ChatGPT. Jill Watson based on ChatGPT requires no prior training and uses a modular design to allow the integration of new APIs using a skill-based architecture inspired by XiaoIce. Jill Watson is also well-suited for intelligent textbooks as it can process and converse using multiple large documents. We exclusively utilize publicly available resources for reproducibility and extensibility. Comparative analysis shows that our system outperforms the legacy knowledge-based Jill Watson as well as the OpenAI Assistants service. We employ many safety measures that reduce instances of hallucinations and toxicity. The paper also includes real-world examples from a classroom setting that demonstrate different features of Jill Watson and its effectiveness.

Submitted 17 May, 2024; originally announced May 2024.

arXiv:2401.05467 [pdf, other]

Active Label Correction for Building LLM-based Modular AI Systems

Authors: Karan Taneja, Ashok Goel

Abstract: Large Language Models (LLMs) have been used to build modular AI systems such as HuggingGPT, Microsoft Bing Chat, and more. To improve such systems after deployment using the data collected from human interactions, each module can be replaced by a fine-tuned model but the annotations received from LLMs are low quality. We propose that active label correction can be used to improve the data quality by only examining a fraction of the dataset. In this paper, we analyze the noise in datasets annotated by ChatGPT and study denoising it with human feedback. Our results show that active label correction can lead to oracle performance with feedback on fewer examples than the number of noisy examples in the dataset across three different NLP tasks.

Submitted 17 May, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

arXiv:2401.05199 [pdf, other]

Monte Carlo Tree Search for Recipe Generation using GPT-2

Authors: Karan Taneja, Richard Segal, Richard Goodwin

Abstract: Automatic food recipe generation methods provide a creative tool for chefs to explore and to create new, and interesting culinary delights. Given the recent success of large language models (LLMs), they have the potential to create new recipes that can meet individual preferences, dietary constraints, and adapt to what is in your refrigerator. Existing research on using LLMs to generate recipes has shown that LLMs can be finetuned to generate realistic-sounding recipes. However, on close examination, these generated recipes often fail to meet basic requirements like including chicken as an ingredient in chicken dishes. In this paper, we propose RecipeMC, a text generation method using GPT-2 that relies on Monte Carlo Tree Search (MCTS). RecipeMC allows us to define reward functions to put soft constraints on text generation and thus improve the credibility of the generated recipes. Our results show that human evaluators prefer recipes generated with RecipeMC more often than recipes generated with other baseline methods when compared with real recipes.

Submitted 10 January, 2024; originally announced January 2024.

Comments: 10 pages, 1 figure, ICCC 2023

arXiv:2305.16593 [pdf]

A Multi-Resolution Physics-Informed Recurrent Neural Network: Formulation and Application to Musculoskeletal Systems

Authors: Karan Taneja, Xiaolong He, Qizhi He, J. S. Chen

Abstract: This work presents a multi-resolution physics-informed recurrent neural network (MR PI-RNN), for simultaneous prediction of musculoskeletal (MSK) motion and parameter identification of the MSK systems. The MSK application was selected as the model problem due to its challenging nature in mapping the high-frequency surface electromyography (sEMG) signals to the low-frequency body joint motion controlled by the MSK and muscle contraction dynamics. The proposed method utilizes the fast wavelet transform to decompose the mixed frequency input sEMG and output joint motion signals into nested multi-resolution signals. The prediction model is subsequently trained on coarser-scale input-output signals using a gated recurrent unit (GRU), and then the trained parameters are transferred to the next level of training with finer-scale signals. These training processes are repeated recursively under a transfer-learning fashion until the full-scale training (i.e., with unfiltered signals) is achieved, while satisfying the underlying dynamic equilibrium. Numerical examples on recorded subject data demonstrate the effectiveness of the proposed framework in generating a physics-informed forward-dynamics surrogate, which yields higher accuracy in motion predictions of elbow flexion-extension of an MSK system compared to the case with single-scale training. The framework is also capable of identifying muscle parameters that are physiologically consistent with the subject's kinematics data.

Submitted 25 May, 2023; originally announced May 2023.

Comments: 40 pages, 11 figures, 5 tables

arXiv:2206.05182 [pdf, other]

Human-AI Interaction Design in Machine Teaching

Authors: Karan Taneja, Harshvardhan Sikka, Ashok Goel

Abstract: Machine Teaching (MT) is an interactive process where a human and a machine interact with the goal of training a machine learning model (ML) for a specified task. The human teacher communicates their task expertise and the machine student gathers the required data and knowledge to produce an ML model. MT systems are developed to jointly minimize the time spent on teaching and the learner's error rate. The design of human-AI interaction in an MT system not only impacts the teaching efficiency, but also indirectly influences the ML performance by affecting the teaching quality. In this paper, we build upon our previous work where we proposed an MT framework with three components, viz., the teaching interface, the machine learner, and the knowledge base, and focus on the human-AI interaction design involved in realizing the teaching interface. We outline design decisions that need to be addressed in developing an MT system beginning from an ML task. The paper follows the Socratic method entailing a dialogue between a curious student and a wise teacher.

Submitted 10 June, 2022; originally announced June 2022.

Comments: 7 pages, 4 figures

arXiv:2204.10357 [pdf, other]

A Framework for Interactive Knowledge-Aided Machine Teaching

Authors: Karan Taneja, Harshvardhan Sikka, Ashok Goel

Abstract: Machine Teaching (MT) is an interactive process where humans train a machine learning model by playing the role of a teacher. The process of designing an MT system involves decisions that can impact both efficiency of human teachers and performance of machine learners. Previous research has proposed and evaluated specific MT systems but there is limited discussion on a general framework for designing them. We propose a framework for designing MT systems and also detail a system for the text classification problem as a specific instance. Our framework focuses on three components i.e. teaching interface, machine learner, and knowledge base; and their relations describe how each component can benefit the others. Our preliminary experiments show how MT systems can reduce both human teaching time and machine learner error rate.

Submitted 21 April, 2022; originally announced April 2022.

Comments: 8 pages, 4 figures

arXiv:2010.05549 [pdf, ps, other]

Improving Low Resource Code-switched ASR using Augmented Code-switched TTS

Authors: Yash Sharma, Basil Abraham, Karan Taneja, Preethi Jyothi

Abstract: Building Automatic Speech Recognition (ASR) systems for code-switched speech has recently gained renewed attention due to the widespread use of speech technologies in multilingual communities worldwide. End-to-end ASR systems are a natural modeling choice due to their ease of use and superior performance in monolingual settings. However, it is well known that end-to-end systems require large amounts of labeled speech. In this work, we investigate improving code-switched ASR in low resource settings via data augmentation using code-switched text-to-speech (TTS) synthesis. We propose two targeted techniques to effectively leverage TTS speech samples: 1) Mixup, an existing technique to create new training samples via linear interpolation of existing samples, applied to TTS and real speech samples, and 2) a new loss function, used in conjunction with TTS samples, to encourage code-switched predictions. We report significant improvements in ASR performance achieving absolute word error rate (WER) reductions of up to 5%, and measurable improvement in code switching using our proposed techniques on a Hindi-English code-switched ASR task.

Submitted 12 October, 2020; originally announced October 2020.

Comments: Interspeech 2020, 5 pages

arXiv:1005.5130 [pdf]

Exploring Selfish Trends of Malicious Mobile Devices in MANET

Authors: P. K. Suri, Kavita Taneja

Abstract: The research effort on mobile computing has focused mainly on routing and usually assumes that all mobile devices (MDs) are cooperative. These assumptions hold on military or search and rescue operations, where all hosts are from the same authority and their users have common goals. The application of mobile ad hoc networks (MANETs) as open networks has emerged recently but proliferated exponentially. Energy is a valuable commodity in MANETs due to the limited battery of the portable devices. Batteries typically cannot be replaced in MANETs, making their lifetime limited. Diverse users, with unlike goals, share the resources of their devices and ensuring global connectivity comes very low in their priority. This sort of communities can already be found in wired networks, namely on peer-to-peer networks. In this scenario, open MANETs will likely resemble social environments. A group of persons can provide benefits to each of its members as long as everyone provides his contribution. For our particular case, each element of a MANET will be called to forward messages and to participate on routing protocols. A selfish behavior threatens the entire community and also this behavior is infectious as, other MDs may also start to perform in the same way. In the extreme, this can take to the complete sabotage of the network. This paper investigates the prevalent malicious attacks in MANET and analyzes recent selfish trends in MANET. We analyzed the respective strengths and vulnerabilities of the existing selfish behaviour prevention scheme.

Submitted 1 June, 2010; v1 submitted 27 May, 2010; originally announced May 2010.

Journal ref: Journal of Telecommunications, Volume 2, Issue 2, pp. 25-30, May 2010

Showing 1–8 of 8 results for author: Taneja, K