Evaluation Metrics of Language Generation Models for Synthetic Traffic Generation Tasks

Authors: Simone Filice, Jason Ingyu Choi, Giuseppe Castellucci, Eugene Agichtein, Oleg Rokhlenko

Abstract: Many Natural Language Generation (NLG) tasks aim to generate a single output text given an input prompt. Other settings require the generation of multiple texts, e.g., for Synthetic Traffic Generation (STG). This generation task is crucial for training and evaluating QA systems as well as conversational agents, where the goal is to generate multiple questions or utterances resembling the linguistic variability of real users. In this paper, we show that common NLG metrics, like BLEU, are not suitable for evaluating STG. We propose and evaluate several metrics designed to compare the generated traffic to the distribution of real user texts. We validate our metrics with an automatic procedure to verify whether they capture different types of quality issues of generated data; we also run human annotations to verify the correlation with human judgements. Experiments on three tasks, i.e., Shopping Utterance Generation, Product Question Generation and Query Auto Completion, demonstrate that our metrics are effective for evaluating STG tasks, and improve the agreement with human judgement up to 20% with respect to common NLG metrics. We believe these findings can pave the way towards better solutions for estimating the representativeness of synthetic text data.

Submitted 21 November, 2023; originally announced November 2023.

arXiv:2303.02512 [pdf, other]

Visual Saliency-Guided Channel Pruning for Deep Visual Detectors in Autonomous Driving

Authors: Jung Im Choi, Qing Tian

Abstract: Deep neural network (DNN) pruning has become a de facto component for deploying on resource-constrained devices since it can reduce memory requirements and computation costs during inference. In particular, channel pruning gained more popularity due to its structured nature and direct savings on general hardware. However, most existing pruning approaches utilize importance measures that are not directly related to the task utility. Moreover, few in the literature focus on visual detection models. To fill these gaps, we propose a novel gradient-based saliency measure for visual detection and use it to guide our channel pruning. Experiments on the KITTI and COCO traffic datasets demonstrate our pruning method's efficacy and superiority over state-of-the-art competing approaches. It can even achieve better performance with fewer parameters than the original model. Our pruning also demonstrates great potential in handling small-scale objects.

Submitted 4 March, 2023; originally announced March 2023.

Comments: 6 pages, 4 figures

arXiv:2202.04781 [pdf, other]

Adversarial Attack and Defense of YOLO Detectors in Autonomous Driving Scenarios

Authors: Jung Im Choi, Qing Tian

Abstract: Visual detection is a key task in autonomous driving, and it serves as a crucial foundation for self-driving planning and control. Deep neural networks have achieved promising results in various visual tasks, but they are known to be vulnerable to adversarial attacks. A comprehensive understanding of deep visual detectors' vulnerability is required before people can improve their robustness. However, only a few adversarial attack/defense works have focused on object detection, and most of them employed only classification and/or localization losses, ignoring the objectness aspect. In this paper, we identify a serious objectness-related adversarial vulnerability in YOLO detectors and present an effective attack strategy targeting the objectness aspect of visual detection in autonomous vehicles. Furthermore, to address such vulnerability, we propose a new objectness-aware adversarial training approach for visual detection. Experiments show that the proposed attack targeting the objectness aspect is 45.17% and 43.50% more effective than those generated from classification and/or localization losses on the KITTI and COCO traffic datasets, respectively. Also, the proposed adversarial defense approach can improve the detectors' robustness against objectness-oriented attacks by up to 21% and 12% mAP on KITTI and COCO traffic, respectively.

Submitted 3 July, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

Comments: Accepted by 2022 IEEE Intelligent Vehicles Symposium (IV 2022)

arXiv:2008.08180 [pdf, other]

Semantic Product Search for Matching Structured Product Catalogs in E-Commerce

Authors: Jason Ingyu Choi, Surya Kallumadi, Bhaskar Mitra, Eugene Agichtein, Faizan Javed

Abstract: Retrieving all semantically relevant products from the product catalog is an important problem in E-commerce. Compared to web documents, product catalogs are more structured and sparse due to multi-instance fields that encode heterogeneous aspects of products (e.g. brand name and product dimensions). In this paper, we propose a new semantic product search algorithm that learns to represent and aggregate multi-instance fields into a document representation using state of the art transformers as encoders. Our experiments investigate two aspects of the proposed approach: (1) effectiveness of field representations and structured matching; (2) effectiveness of adding lexical features to semantic search. After training our models using user click logs from a well-known E-commerce platform, we show that our results provide useful insights for improving product search. Lastly, we present a detailed error analysis to show which types of queries benefited the most by fielded representations and structured matching.

Submitted 18 August, 2020; originally announced August 2020.

Comments: 4 pages

arXiv:2006.01926 [pdf, ps, other]

doi 10.1145/3343413.3378013

Would You Like to Hear the News? Investigating Voice-Based Suggestions for Conversational News Recommendation

Authors: Harshita Sahijwani, Jason Ingyu Choi, Eugene Agichtein

Abstract: One of the key benefits of voice-based personal assistants is the potential to proactively recommend relevant and interesting information. One of the most valuable sources of such information is the News. However, in order for the user to hear the news that is useful and relevant to them, it must be recommended in an interesting and informative way. However, to the best of our knowledge, how to present a news item for a voice-based recommendation remains an open question. In this paper, we empirically compare different ways of recommending news, or specific news items, in a voice-based conversational setting. Specifically, we study the user engagement and satisfaction with five different variants of presenting news recommendations: (1) a generic news briefing; (2) news about a specific entity relevant to the current conversation; (3) news about an entity from a past conversation; (4) news on a trending news topic; and (5) the default - a suggestion to talk about news in general. Our results show that entity-based news recommendations exhibit 29% higher acceptance compared to briefing recommendations, and almost 100% higher acceptance compared to recommending generic or trending news. Our investigation into the presentation of news recommendations and the resulting insights could make voice assistants more informative and engaging.

Submitted 2 June, 2020; originally announced June 2020.

arXiv:2006.01921 [pdf, other]

doi 10.1145/3357384.3358047

Offline and Online Satisfaction Prediction in Open-Domain Conversational Systems

Authors: Jason Ingyu Choi, Ali Ahmadvand, Eugene Agichtein

Abstract: Predicting user satisfaction in conversational systems has become critical, as spoken conversational assistants operate in increasingly complex domains. Online satisfaction prediction (i.e., predicting satisfaction of the user with the system after each turn) could be used as a new proxy for implicit user feedback, and offers promising opportunities to create more responsive and effective conversational agents, which adapt to the user's engagement with the agent. To accomplish this goal, we propose a conversational satisfaction prediction model specifically designed for open-domain spoken conversational agents, called ConvSAT. To operate robustly across domains, ConvSAT aggregates multiple representations of the conversation, namely the conversation history, utterance and response content, and system- and user-oriented behavioral signals. We first calibrate ConvSAT performance against state of the art methods on a standard dataset (Dialogue Breakdown Detection Challenge) in an online regime, and then evaluate ConvSAT on a large dataset of conversations with real users, collected as part of the Alexa Prize competition. Our experimental results show that ConvSAT significantly improves satisfaction prediction for both offline and online setting on both datasets, compared to the previously reported state-of-the-art approaches. The insights from our study can enable more intelligent conversational systems, which could adapt in real-time to the inferred user satisfaction and engagement.

Submitted 2 June, 2020; originally announced June 2020.

Comments: Published in CIKM '19, 10 pages

arXiv:2006.01916 [pdf, other]

doi 10.1145/3343413.3378009

Quantifying the Effects of Prosody Modulation on User Engagement and Satisfaction in Conversational Systems

Authors: Jason Ingyu Choi, Eugene Agichtein

Abstract: As voice-based assistants such as Alexa, Siri, and Google Assistant become ubiquitous, users increasingly expect to maintain natural and informative conversations with such systems. However, for an open-domain conversational system to be coherent and engaging, it must be able to maintain the user's interest for extended periods, without sounding boring or annoying. In this paper, we investigate one natural approach to this problem, of modulating response prosody, i.e., changing the pitch and cadence of the response to indicate delight, sadness or other common emotions, as well as using pre-recorded interjections. Intuitively, this approach should improve the naturalness of the conversation, but attempts to quantify the effects of prosodic modulation on user satisfaction and engagement remain challenging. To accomplish this, we report results obtained from a large-scale empirical study that measures the effects of prosodic modulation on user behavior and engagement across multiple conversation domains, both immediately after each turn, and at the overall conversation level. Our results indicate that the prosody modulation significantly increases both immediate and overall user satisfaction. However, since the effects vary across different domains, we verify that prosody modulations do not substitute for coherent, informative content of the responses. Together, our results provide useful tools and insights for improving the naturalness of responses in conversational systems.

Submitted 2 June, 2020; originally announced June 2020.

Comments: Published in CHIIR 2020, 4 pages

arXiv:2005.13804 [pdf, other]

Contextual Dialogue Act Classification for Open-Domain Conversational Agents

Authors: Ali Ahmadvand, Jason Ingyu Choi, Eugene Agichtein

Abstract: Classifying the general intent of the user utterance in a conversation, also known as Dialogue Act (DA), e.g., open-ended question, statement of opinion, or request for an opinion, is a key step in Natural Language Understanding (NLU) for conversational agents. While DA classification has been extensively studied in human-human conversations, it has not been sufficiently explored for the emerging open-domain automated conversational agents. Moreover, despite significant advances in utterance-level DA classification, full understanding of dialogue utterances requires conversational context. Another challenge is the lack of available labeled data for open-domain human-machine conversations. To address these problems, we propose a novel method, CDAC (Contextual Dialogue Act Classifier), a simple yet effective deep learning approach for contextual dialogue act classification. Specifically, we use transfer learning to adapt models trained on human-human conversations to predict dialogue acts in human-machine dialogues. To investigate the effectiveness of our method, we train our model on the well-known Switchboard human-human dialogue dataset, and fine-tune it for predicting dialogue acts in human-machine conversation data, collected as part of the Amazon Alexa Prize 2018 competition. The results show that the CDAC model outperforms an utterance-level state of the art baseline by 8.0% on the Switchboard dataset, and is comparable to the latest reported state-of-the-art contextual DA classification results. Furthermore, our results show that fine-tuning the CDAC model on a small sample of manually labeled human-machine conversations allows CDAC to more accurately predict dialogue acts in real users' conversations, suggesting a promising direction for future improvements.

Submitted 28 May, 2020; originally announced May 2020.

Comments: SIGIR 2019

arXiv:2005.13798 [pdf, other]

ConCET: Entity-Aware Topic Classification for Open-Domain Conversational Agents

Authors: Ali Ahmadvand, Harshita Sahijwani, Jason Ingyu Choi, Eugene Agichtein

Abstract: Identifying the topic (domain) of each user's utterance in open-domain conversational systems is a crucial step for all subsequent language understanding and response tasks. In particular, for complex domains, an utterance is often routed to a single component responsible for that domain. Thus, correctly mapping a user utterance to the right domain is critical. To address this problem, we introduce ConCET: a Concurrent Entity-aware conversational Topic classifier, which incorporates entity-type information together with the utterance content features. Specifically, ConCET utilizes entity information to enrich the utterance representation, combining character, word, and entity-type embeddings into a single representation. However, for rich domains with millions of available entities, unrealistic amounts of labeled training data would be required. To complement our model, we propose a simple and effective method for generating synthetic training data, to augment the typically limited amounts of labeled training data, using commonly available knowledge bases to generate additional labeled utterances. We extensively evaluate ConCET and our proposed training method first on an openly available human-human conversational dataset called Self-Dialogue, to calibrate our approach against previous state-of-the-art methods; second, we evaluate ConCET on a large dataset of human-machine conversations with real users, collected as part of the Amazon Alexa Prize. Our results show that ConCET significantly improves topic classification performance on both datasets, including 8-10% improvements over state-of-the-art deep learning methods. We complement our quantitative results with detailed analysis of system performance, which could be used for further improvements of conversational agents.

Submitted 28 May, 2020; originally announced May 2020.

Comments: CIKM 2019

arXiv:1907.00935 [pdf, other]

One-Time Programs made Practical

Authors: Lianying Zhao, Joseph I. Choi, Didem Demirag, Kevin R. B. Butler, Mohammad Mannan, Erman Ayday, Jeremy Clark

Abstract: A one-time program (OTP) works as follows: Alice provides Bob with the implementation of some function. Bob can have the function evaluated exclusively on a single input of his choosing. Once executed, the program will fail to evaluate on any other input. State-of-the-art one-time programs have remained theoretical, requiring custom hardware that is cost-ineffective/unavailable, or confined to adhoc/unrealistic assumptions. To bridge this gap, we explore how the Trusted Execution Environment (TEE) of modern CPUs can realize the OTP functionality. Specifically, we build two flavours of such a system: in the first, the TEE directly enforces the one-timeness of the program; in the second, the program is represented with a garbled circuit and the TEE ensures Bob's input can only be wired into the circuit once, equivalent to a smaller cryptographic primitive called one-time memory. These have different performance profiles: the first is best when Alice's input is small and Bob's is large, and the second for the converse.

Submitted 1 July, 2019; originally announced July 2019.

arXiv:1905.01233 [pdf, other]

doi 10.1145/3321705.3329835

A Hybrid Approach to Secure Function Evaluation Using SGX

Authors: Joseph I. Choi, Dave 'Jing' Tian, Grant Hernandez, Christopher Patton, Benjamin Mood, Thomas Shrimpton, Kevin R. B. Butler, Patrick Traynor

Abstract: A protocol for two-party secure function evaluation (2P-SFE) aims to allow the parties to learn the output of function $f$ of their private inputs, while leaking nothing more. In a sense, such a protocol realizes a trusted oracle that computes $f$ and returns the result to both parties. There have been tremendous strides in efficiency over the past ten years, yet 2P-SFE protocols remain impractical for most real-time, online computations, particularly on modestly provisioned devices. Intel's Software Guard Extensions (SGX) provides hardware-protected execution environments, called enclaves, that may be viewed as trusted computation oracles. While SGX provides native CPU speed for secure computation, previous side-channel and micro-architecture attacks have demonstrated how security guarantees of enclaves can be compromised. In this paper, we explore a balanced approach to 2P-SFE on SGX-enabled processors by constructing a protocol for evaluating $f$ relative to a partitioning of $f$. This approach alleviates the burden of trust on the enclave by allowing the protocol designer to choose which components should be evaluated within the enclave, and which via standard cryptographic techniques. We describe SGX-enabled SFE protocols (modeling the enclave as an oracle), and formalize the strongest-possible notion of 2P-SFE for our setting. We prove our protocol meets this notion when properly realized. We implement the protocol and apply it to two practical problems: privacy-preserving queries to a database, and a version of Dijkstra's algorithm for privacy-preserving navigation. Our evaluation shows that our SGX-enabled SFE scheme enjoys a 38x increase in performance over garbled-circuit-based SFE. Finally, we justify modeling of the enclave as an oracle by implementing protections against known side-channels.

Submitted 6 May, 2019; v1 submitted 3 May, 2019; originally announced May 2019.

Comments: Full version, with proofs, of conference paper at AsiaCCS 2019; updated to include copyright information

arXiv:1810.00024 [pdf, other]

Explainable Black-Box Attacks Against Model-based Authentication

Authors: Washington Garcia, Joseph I. Choi, Suman K. Adari, Somesh Jha, Kevin R. B. Butler

Abstract: Establishing unique identities for both humans and end systems has been an active research problem in the security community, giving rise to innovative machine learning-based authentication techniques. Although such techniques offer an automated method to establish identity, they have not been vetted against sophisticated attacks that target their core machine learning technique. This paper demonstrates that mimicking the unique signatures generated by host fingerprinting and biometric authentication systems is possible. We expose the ineffectiveness of underlying machine learning classification models by constructing a blind attack based around the query synthesis framework and utilizing Explainable-AI (XAI) techniques. We launch an attack in under 130 queries on a state-of-the-art face authentication system, and under 100 queries on a host authentication system. We examine how these attacks can be defended against and explore their limitations. XAI provides an effective means for adversaries to infer decision boundaries and provides a new way forward in constructing attacks against systems using machine learning models for authentication.

Submitted 28 September, 2018; originally announced October 2018.

Showing 1–12 of 12 results for author: Choi, J I