Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference

Authors: Anton Xue, Avishree Khare, Rajeev Alur, Surbhi Goel, Eric Wong

Abstract: We study how to subvert language models from following the rules. We model rule-following as inference in propositional Horn logic, a mathematical system in which rules have the form "if $P$ and $Q$, then $R$" for some propositions $P$, $Q$, and $R$. We prove that although transformers can faithfully abide by such rules, maliciously crafted prompts can nevertheless mislead even theoretically constructed models. Empirically, we find that attacks on our theoretical models mirror popular attacks on large language models. Our work suggests that studying smaller theoretical models can help understand the behavior of large language models in rule-based settings like logical reasoning and jailbreak attacks.

Submitted 21 June, 2024; originally announced July 2024.

arXiv:2406.18534 [pdf, other]

Towards Compositionality in Concept Learning

Authors: Adam Stein, Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong

Abstract: Concept-based interpretability methods offer a lens into the internals of foundation models by decomposing their embeddings into high-level concepts. These concept representations are most useful when they are compositional, meaning that the individual concepts compose to explain the full sample. We show that existing unsupervised concept extraction methods find concepts which are not compositional. To automatically discover compositional concept representations, we identify two salient properties of such representations, and propose Compositional Concept Extraction (CCE) for finding concepts which obey these properties. We evaluate CCE on five different datasets over image and text data. Our evaluation shows that CCE finds more compositional concept representations than baselines and yields better accuracy on four downstream classification tasks. Code and data are available at https://github.com/adaminsky/compositional_concepts.

Submitted 26 June, 2024; originally announced June 2024.

Comments: Accepted at ICML 2024. 26 pages, 10 figures

arXiv:2406.10952 [pdf, other]

Avoiding Copyright Infringement via Machine Unlearning

Authors: Guangyao Dou, Zheyuan Liu, Qing Lyu, Kaize Ding, Eric Wong

Abstract: Pre-trained Large Language Models (LLMs) have demonstrated remarkable capabilities but also pose risks by learning and generating copyrighted material, leading to significant legal and ethical concerns. To address these issues, it is critical for model owners to be able to unlearn copyrighted content at various time steps. We explore the setting of sequential unlearning, where copyrighted content is removed over multiple time steps - a scenario that has not been rigorously addressed. To tackle this challenge, we propose Stable Sequential Unlearning (SSU), a novel unlearning framework for LLMs, designed to have a more stable process to remove copyrighted content from LLMs throughout different time steps using task vectors, by incorporating additional random labeling loss and applying gradient-based weight saliency mapping. Experiments demonstrate that SSU finds a good balance between unlearning efficacy and maintaining the model's general knowledge compared to existing baselines.

Submitted 16 June, 2024; originally announced June 2024.

arXiv:2406.06246 [pdf, other]

Data-Efficient Learning with Neural Programs

Authors: Alaia Solko-Breslin, Seewon Choi, Ziyang Li, Neelay Velingker, Rajeev Alur, Mayur Naik, Eric Wong

Abstract: Many computational tasks can be naturally expressed as a composition of a DNN followed by a program written in a traditional programming language or an API call to an LLM. We call such composites "neural programs" and focus on the problem of learning the DNN parameters when the training data consist of end-to-end input-output labels for the composite. When the program is written in a differentiable logic programming language, techniques from neurosymbolic learning are applicable, but in general, the learning for neural programs requires estimating the gradients of black-box components. We present an algorithm for learning neural programs, called ISED, that only relies on input-output samples of black-box components. For evaluation, we introduce new benchmarks that involve calls to modern LLMs such as GPT-4 and also consider benchmarks from the neurosymbolic learning literature. Our evaluation shows that for the latter benchmarks, ISED has comparable performance to state-of-the-art neurosymbolic frameworks. For the former, we use adaptations of prior work on gradient approximations of black-box components as a baseline, and show that ISED achieves comparable accuracy but in a more data- and sample-efficient manner.

Submitted 10 June, 2024; originally announced June 2024.

arXiv:2406.00611 [pdf, other]

DISCRET: Synthesizing Faithful Explanations For Treatment Effect Estimation

Authors: Yinjun Wu, Mayank Keoliya, Kan Chen, Neelay Velingker, Ziyang Li, Emily J Getzen, Qi Long, Mayur Naik, Ravi B Parikh, Eric Wong

Abstract: Designing faithful yet accurate AI models is challenging, particularly in the field of individual treatment effect estimation (ITE). ITE prediction models deployed in critical settings such as healthcare should ideally be (i) accurate, and (ii) provide faithful explanations. However, current solutions are inadequate: state-of-the-art black-box models do not supply explanations, post-hoc explainers for black-box models lack faithfulness guarantees, and self-interpretable models greatly compromise accuracy. To address these issues, we propose DISCRET, a self-interpretable ITE framework that synthesizes faithful, rule-based explanations for each sample. A key insight behind DISCRET is that explanations can serve dually as database queries to identify similar subgroups of samples. We provide a novel RL algorithm to efficiently synthesize these explanations from a large search space. We evaluate DISCRET on diverse tasks involving tabular, image, and text data. DISCRET outperforms the best self-interpretable models and has accuracy comparable to the best black-box models while providing faithful explanations. DISCRET is available at https://github.com/wuyinjun-1993/DISCRET-ICML2024.

Submitted 2 June, 2024; originally announced June 2024.

Comments: Accepted at ICML 2024. 22 pages, 5 figures

arXiv:2405.06692 [pdf, ps, other]

Analyzing Language Bias Between French and English in Conventional Multilingual Sentiment Analysis Models

Authors: Ethan Parker Wong, Faten M'hiri

Abstract: Inspired by the 'Bias Considerations in Bilingual Natural Language Processing' report by Statistics Canada, this study delves into potential biases in multilingual sentiment analysis between English and French. Given a 50-50 dataset of French and English, we aim to determine if there exists a language bias and explore how the incorporation of more diverse datasets in the future might affect the equity of multilingual Natural Language Processing (NLP) systems. By employing Support Vector Machine (SVM) and Naive Bayes models on three balanced datasets, we reveal potential biases in multilingual sentiment classification. Utilizing Fairlearn, a tool for assessing bias in machine learning models, our findings indicate nuanced outcomes: French data outperforms English across accuracy, recall, and F1 score metrics in both models, hinting at a language bias favoring French. However, Fairlearn's metrics suggest that the SVM approaches equitable levels with a demographic parity ratio of 0.963, 0.989, and 0.985 for the three separate datasets, indicating near-equitable treatment across languages. In contrast, Naive Bayes demonstrates greater disparities, evidenced by a demographic parity ratio of 0.813, 0.908, and 0.961. These findings reveal the importance of developing equitable multilingual NLP systems, particularly as we anticipate the inclusion of more datasets in various languages in the future.

Submitted 7 May, 2024; originally announced May 2024.

Comments: Undergraduate Research Project

arXiv:2405.04873 [pdf, other]

doi 10.1145/3613904.3642502

Practice-informed Patterns for Organising Large Groups in Distributed Mixed Reality Collaboration

Authors: Emily Wong, Juan Sánchez Esquivel, Jens Emil Grønbæk, Germán Leiva, Eduardo Velloso

Abstract: Collaborating across dissimilar, distributed spaces presents numerous challenges for computer-aided spatial communication. Mixed reality (MR) can blend selected surfaces, allowing collaborators to work in blended f-formations (facing formations), even when their workstations are physically misaligned. Since collaboration often involves more than just participant pairs, this research examines how we might scale MR experiences for large-group collaboration. To do so, this study recruited collaboration designers (CDs) to evaluate and reimagine MR for large-scale collaboration. These CDs were engaged in a four-part user study that involved a technology probe, a semi-structured interview, a speculative low-fidelity prototyping activity and a validation session. The outcomes of this paper contribute (1) a set of collaboration design principles to inspire future computer-supported collaborative work, (2) eight collaboration patterns for blended f-formations and collaboration at scale and (3) theoretical implications for f-formations and space-place relationships. As a result, this work creates a blueprint for scaling collaboration across distributed spaces.

Submitted 9 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

Journal ref: CHI '24, Proceedings of the CHI Conference on Human Factors in Computing Systems, May 11-16 2024, Honolulu, HI, USA

arXiv:2404.14299 [pdf, other]

A Cross-Platform Execution Engine for the Quantum Intermediate Representation

Authors: Elaine Wong, Vicente Leyton Ortega, Daniel Claudino, Seth Johnson, Sharmin Afrose, Meenambika Gowrishankar, Anthony M. Cabrera, Travis S. Humble

Abstract: Hybrid languages like the Quantum Intermediate Representation (QIR) are essential for programming systems that mix quantum and conventional computing models, while execution of these programs is often deferred to a system-specific implementation. Here, we describe and demonstrate the QIR Execution Engine (QIR-EE) for parsing, interpreting, and executing QIR across multiple hardware platforms. QIR-EE uses LLVM to execute hybrid instructions specifying quantum programs and, by design, presents extension points that support customized runtime and hardware environments. We demonstrate an implementation that uses the XACC quantum hardware-accelerator library to dispatch prototypical quantum programs on different commercial quantum platforms and numerical simulators, and we validate execution of QIR-EE on the IonQ Harmony and Quantinuum H1-1 hardware. Our results highlight the efficiency of hybrid executable architectures for handling mixed instructions, managing mixed data, and integrating with quantum computing frameworks to realize cross-platform execution.

Submitted 22 April, 2024; originally announced April 2024.

arXiv:2404.01318 [pdf, other]

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

Authors: Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramer, Hamed Hassani, Eric Wong

Abstract: Jailbreak attacks cause large language models (LLMs) to generate harmful, unethical, or otherwise objectionable content. Evaluating these attacks presents a number of challenges, which the current collection of benchmarks and evaluation techniques do not adequately address. First, there is no clear standard of practice regarding jailbreaking evaluation. Second, existing works compute costs and success rates in incomparable ways. And third, numerous works are not reproducible, as they withhold adversarial prompts, involve closed-source code, or rely on evolving proprietary APIs. To address these challenges, we introduce JailbreakBench, an open-sourced benchmark with the following components: (1) an evolving repository of state-of-the-art adversarial prompts, which we refer to as jailbreak artifacts; (2) a jailbreaking dataset comprising 100 behaviors -- both original and sourced from prior work (Zou et al., 2023; Mazeika et al., 2023, 2024) -- which align with OpenAI's usage policies; (3) a standardized evaluation framework at https://github.com/JailbreakBench/jailbreakbench that includes a clearly defined threat model, system prompts, chat templates, and scoring functions; and (4) a leaderboard at https://jailbreakbench.github.io/ that tracks the performance of attacks and defenses for various LLMs. We have carefully considered the potential ethical implications of releasing this benchmark, and believe that it will be a net positive for the community.

Submitted 16 July, 2024; v1 submitted 27 March, 2024; originally announced April 2024.

Comments: JailbreakBench v1.0: more attack artifacts, more test-time defenses, a more accurate jailbreak judge (Llama-3-70B with a custom prompt), a larger dataset of human preferences for selecting a jailbreak judge (300 examples), an over-refusal evaluation dataset (100 benign/borderline behaviors), a semantic refusal judge based on Llama-3-8B

arXiv:2402.16192 [pdf, other]

Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing

Authors: Jiabao Ji, Bairu Hou, Alexander Robey, George J. Pappas, Hamed Hassani, Yang Zhang, Eric Wong, Shiyu Chang

Abstract: Aligned large language models (LLMs) are vulnerable to jailbreaking attacks, which bypass the safeguards of targeted LLMs and fool them into generating objectionable content. While initial defenses show promise against token-based threat models, there do not exist defenses that provide robustness against semantic attacks and avoid unfavorable trade-offs between robustness and nominal performance. To meet this need, we propose SEMANTICSMOOTH, a smoothing-based defense that aggregates the predictions of multiple semantically transformed copies of a given input prompt. Experimental results demonstrate that SEMANTICSMOOTH achieves state-of-the-art robustness against GCG, PAIR, and AutoDAN attacks while maintaining strong nominal performance on instruction following benchmarks such as InstructionFollowing and AlpacaEval. The codes will be publicly available at https://github.com/UCSB-NLP-Chang/SemanticSmooth.

Submitted 28 February, 2024; v1 submitted 25 February, 2024; originally announced February 2024.

Comments: 37 pages

arXiv:2401.13650 [pdf, other]

Tyche: Stochastic In-Context Learning for Medical Image Segmentation

Authors: Marianne Rakic, Hallee E. Wong, Jose Javier Gonzalez Ortiz, Beth Cimini, John Guttag, Adrian V. Dalca

Abstract: Existing learning-based solutions to medical image segmentation have two important shortcomings. First, for most new segmentation tasks, a new model has to be trained or fine-tuned. This requires extensive resources and machine learning expertise, and is therefore often infeasible for medical researchers and clinicians. Second, most existing segmentation methods produce a single deterministic segmentation mask for a given image. In practice, however, there is often considerable uncertainty about what constitutes the correct segmentation, and different expert annotators will often segment the same image differently. We tackle both of these problems with Tyche, a model that uses a context set to generate stochastic predictions for previously unseen tasks without the need to retrain. Tyche differs from other in-context segmentation methods in two important ways. (1) We introduce a novel convolution block architecture that enables interactions among predictions. (2) We introduce in-context test-time augmentation, a new mechanism to provide prediction stochasticity. When combined with appropriate model design and loss functions, Tyche can predict a set of plausible diverse segmentation candidates for new or unseen medical images and segmentation tasks without the need to retrain.

Submitted 24 January, 2024; originally announced January 2024.

arXiv:2312.07381 [pdf, other]

ScribblePrompt: Fast and Flexible Interactive Segmentation for Any Biomedical Image

Authors: Hallee E. Wong, Marianne Rakic, John Guttag, Adrian V. Dalca

Abstract: Biomedical image segmentation is a crucial part of both scientific research and clinical care. With enough labelled data, deep learning models can be trained to accurately automate specific biomedical image segmentation tasks. However, manually segmenting images to create training data is highly labor intensive and requires domain expertise. We present ScribblePrompt, a flexible neural network based interactive segmentation tool for biomedical imaging that enables human annotators to segment previously unseen structures using scribbles, clicks, and bounding boxes. Through rigorous quantitative experiments, we demonstrate that given comparable amounts of interaction, ScribblePrompt produces more accurate segmentations than previous methods on datasets unseen during training. In a user study with domain experts, ScribblePrompt reduced annotation time by 28% while improving Dice by 15% compared to the next best method. ScribblePrompt's success rests on a set of careful design decisions. These include a training strategy that incorporates both a highly diverse set of images and tasks, novel algorithms for simulated user interactions and labels, and a network that enables fast inference. We showcase ScribblePrompt in an interactive demo, provide code, and release a dataset of scribble annotations at https://scribbleprompt.csail.mit.edu

Submitted 16 July, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

Comments: Accepted by ECCV 2024. Project Website: https://scribbleprompt.csail.mit.edu Keywords: Interactive Segmentation, Medical Imaging, Segment Anything Model, SAM, Scribble Annotations, Prompt

arXiv:2312.05716 [pdf, other]

Initialization Matters for Adversarial Transfer Learning

Authors: Andong Hua, Jindong Gu, Zhiyu Xue, Nicholas Carlini, Eric Wong, Yao Qin

Abstract: With the prevalence of the Pretraining-Finetuning paradigm in transfer learning, the robustness of downstream tasks has become a critical concern. In this work, we delve into adversarial robustness in transfer learning and reveal the critical role of initialization, including both the pretrained model and the linear head. First, we discover the necessity of an adversarially robust pretrained model. Specifically, we reveal that with a standard pretrained model, Parameter-Efficient Finetuning (PEFT) methods either fail to be adversarially robust or continue to exhibit significantly degraded adversarial robustness on downstream tasks, even with adversarial training during finetuning. Leveraging a robust pretrained model, surprisingly, we observe that a simple linear probing can outperform full finetuning and other PEFT methods with random initialization on certain datasets. We further identify that linear probing excels in preserving robustness from the robust pretraining. Based on this, we propose Robust Linear Initialization (RoLI) for adversarial finetuning, which initializes the linear head with the weights obtained by adversarial linear probing to maximally inherit the robustness from pretraining. Across five different image classification datasets, we demonstrate the effectiveness of RoLI and achieve new state-of-the-art results. Our code is available at https://github.com/DongXzz/RoLI.

Submitted 30 March, 2024; v1 submitted 9 December, 2023; originally announced December 2023.

Comments: CVPR 2024

arXiv:2312.03231 [pdf, other]

Deep Multimodal Fusion for Surgical Feedback Classification

Authors: Rafal Kocielnik, Elyssa Y. Wong, Timothy N. Chu, Lydia Lin, De-An Huang, Jiayun Wang, Anima Anandkumar, Andrew J. Hung

Abstract: Quantification of real-time informal feedback delivered by an experienced surgeon to a trainee during surgery is important for skill improvements in surgical training. Such feedback in the live operating room is inherently multimodal, consisting of verbal conversations (e.g., questions and answers) as well as non-verbal elements (e.g., through visual cues like pointing to anatomic elements). In this work, we leverage a clinically-validated five-category classification of surgical feedback: "Anatomic", "Technical", "Procedural", "Praise" and "Visual Aid". We then develop a multi-label machine learning model to classify these five categories of surgical feedback from inputs of text, audio, and video modalities. The ultimate goal of our work is to help automate the annotation of real-time contextual surgical feedback at scale. Our automated classification of surgical feedback achieves AUCs ranging from 71.5 to 77.6 with the fusion improving performance by 3.1%. We also show that high-quality manual transcriptions of feedback audio from experts improve AUCs to between 76.5 and 96.2, which demonstrates a clear path toward future improvements. Empirically, we find that the Staged training strategy, which first pre-trains each modality separately and then trains them jointly, is more effective than training the different modalities all together. We also present intuitive findings on the importance of modalities for different feedback categories. This work offers an important first look at the feasibility of automated classification of real-world live surgical feedback based on text, audio, and video modalities.

Submitted 5 December, 2023; originally announced December 2023.

Journal ref: Published in Proceedings of Machine Learning for Health 2024

arXiv:2310.16316 [pdf, other]

Sum-of-Parts Models: Faithful Attributions for Groups of Features

Authors: Weiqiu You, Helen Qu, Marco Gatti, Bhuvnesh Jain, Eric Wong

Abstract: An explanation of a machine learning model is considered "faithful" if it accurately reflects the model's decision-making process. However, explanations such as feature attributions for deep learning are not guaranteed to be faithful, and can produce potentially misleading interpretations. In this work, we develop Sum-of-Parts (SOP), a class of models whose predictions come with grouped feature attributions that are faithful-by-construction. This model decomposes a prediction into an interpretable sum of scores, each of which is directly attributable to a sparse group of features. We evaluate SOP on benchmarks with standard interpretability metrics, and in a case study, we use the faithful explanations from SOP to help astrophysicists discover new knowledge about galaxy formation.

Submitted 24 October, 2023; originally announced October 2023.

arXiv:2310.12508 [pdf, other]

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

Authors: Chongyu Fan, Jiancheng Liu, Yihua Zhang, Eric Wong, Dennis Wei, Sijia Liu

Abstract: With evolving data regulations, machine unlearning (MU) has become an important tool for fostering trust and safety in today's AI models. However, existing MU methods focusing on data and/or weight perspectives often suffer limitations in unlearning accuracy, stability, and cross-domain applicability. To address these challenges, we introduce the concept of 'weight saliency' for MU, drawing parallels with input saliency in model explanation. This innovation directs MU's attention toward specific model weights rather than the entire model, improving effectiveness and efficiency. The resultant method that we call saliency unlearning (SalUn) narrows the performance gap with 'exact' unlearning (model retraining from scratch after removing the forgetting data points). To the best of our knowledge, SalUn is the first principled MU approach that can effectively erase the influence of forgetting data, classes, or concepts in both image classification and generation tasks. For example, SalUn yields a stability advantage in high-variance random data forgetting, e.g., with a 0.2% gap compared to exact unlearning on the CIFAR-10 dataset. Moreover, in preventing conditional diffusion models from generating harmful images, SalUn achieves nearly 100% unlearning accuracy, outperforming current state-of-the-art baselines like Erased Stable Diffusion and Forget-Me-Not. Codes are available at https://github.com/OPTML-Group/Unlearn-Saliency. (WARNING: This paper contains model outputs that may be offensive in nature.)

Submitted 4 April, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

Comments: Accepted by ICLR 2024 as a Spotlight paper

arXiv:2310.08419 [pdf, other]

Jailbreaking Black Box Large Language Models in Twenty Queries

Authors: Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J. Pappas, Eric Wong

Abstract: There is growing interest in ensuring that large language models (LLMs) align with human values. However, the alignment of such models is vulnerable to adversarial jailbreaks, which coax LLMs into overriding their safety guardrails. The identification of these vulnerabilities is therefore instrumental in understanding inherent weaknesses and preventing future misuse. To this end, we propose Prompt Automatic Iterative Refinement (PAIR), an algorithm that generates semantic jailbreaks with only black-box access to an LLM. PAIR -- which is inspired by social engineering attacks -- uses an attacker LLM to automatically generate jailbreaks for a separate targeted LLM without human intervention. In this way, the attacker LLM iteratively queries the target LLM to update and refine a candidate jailbreak. Empirically, PAIR often requires fewer than twenty queries to produce a jailbreak, which is orders of magnitude more efficient than existing algorithms. PAIR also achieves competitive jailbreaking success rates and transferability on open and closed-source LLMs, including GPT-3.5/4, Vicuna, and Gemini.

Submitted 18 July, 2024; v1 submitted 12 October, 2023; originally announced October 2023.

arXiv:2310.07135 [pdf, other]

Comparing Styles across Languages

Authors: Shreya Havaldar, Matthew Pressimone, Eric Wong, Lyle Ungar

Abstract: Understanding how styles differ across languages is advantageous for training both humans and computers to generate culturally appropriate text. We introduce an explanation framework to extract stylistic differences from multilingual LMs and compare styles across languages. Our framework (1) generates comprehensive style lexica in any language and (2) consolidates feature importances from LMs into comparable lexical categories. We apply this framework to compare politeness, creating the first holistic multilingual politeness dataset and exploring how politeness varies across four languages. Our approach enables an effective evaluation of how distinct linguistic categories contribute to stylistic variations and provides interpretable insights into how people communicate differently around the world.

Submitted 4 December, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

Comments: Accepted to EMNLP 2023

arXiv:2310.03684 [pdf, other]

SmoothLLM: Defending Large Language Models Against Jailbreaking Attacks

Authors: Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas

Abstract: Despite efforts to align large language models (LLMs) with human intentions, widely-used LLMs such as GPT, Llama, and Claude are susceptible to jailbreaking attacks, wherein an adversary fools a targeted LLM into generating objectionable content. To address this vulnerability, we propose SmoothLLM, the first algorithm designed to mitigate jailbreaking attacks. Based on our finding that adversarially-generated prompts are brittle to character-level changes, our defense randomly perturbs multiple copies of a given input prompt, and then aggregates the corresponding predictions to detect adversarial inputs. Across a range of popular LLMs, SmoothLLM sets the state-of-the-art for robustness against the GCG, PAIR, RandomSearch, and AmpleGCG jailbreaks. SmoothLLM is also resistant against adaptive GCG attacks, exhibits a small, though non-negligible trade-off between robustness and nominal performance, and is compatible with any LLM. Our code is publicly available at https://github.com/arobey1/smooth-llm.

Submitted 11 June, 2024; v1 submitted 5 October, 2023; originally announced October 2023.

arXiv:2308.06686 [pdf, other]

TorchQL: A Programming Framework for Integrity Constraints in Machine Learning

Authors: Aaditya Naik, Adam Stein, Yinjun Wu, Mayur Naik, Eric Wong

Abstract: Finding errors in machine learning applications requires a thorough exploration of their behavior over data. Existing approaches used by practitioners are often ad-hoc and lack the abstractions needed to scale this process. We present TorchQL, a programming framework to evaluate and improve the correctness of machine learning applications. TorchQL allows users to write queries to specify and check integrity constraints over machine learning models and datasets. It seamlessly integrates relational algebra with functional programming to allow for highly expressive queries using only eight intuitive operators. We evaluate TorchQL on diverse use-cases including finding critical temporal inconsistencies in objects detected across video frames in autonomous driving, finding data imputation errors in time-series medical records, finding data labeling errors in real-world images, and evaluating biases and constraining outputs of language models. Our experiments show that TorchQL enables up to 13x faster query executions than baselines like Pandas and MongoDB, and up to 40% shorter queries than native Python. We also conduct a user study and find that TorchQL is natural enough for developers familiar with Python to specify complex integrity constraints.

Submitted 14 February, 2024; v1 submitted 13 August, 2023; originally announced August 2023.

arXiv:2307.05902 [pdf, other]

Stability Guarantees for Feature Attributions with Multiplicative Smoothing

Authors: Anton Xue, Rajeev Alur, Eric Wong

Abstract: Explanation methods for machine learning models tend not to provide any formal guarantees and may not reflect the underlying decision-making process. In this work, we analyze stability as a property for reliable feature attribution methods. We prove that relaxed variants of stability are guaranteed if the model is sufficiently Lipschitz with respect to the masking of features. We develop a smoothing method called Multiplicative Smoothing (MuS) to achieve such a model. We show that MuS overcomes the theoretical limitations of standard smoothing techniques and can be integrated with any classifier and feature attribution method. We evaluate MuS on vision and language models with various feature attribution methods, such as LIME and SHAP, and demonstrate that MuS endows feature attributions with non-trivial stability guarantees.

Submitted 26 October, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

arXiv:2306.14414 [pdf, ps, other]

Rationality of Four-Valued Families of Weil Sums of Binomials

Authors: Daniel J. Katz, Allison E. Wong

Abstract: We investigate the rationality of Weil sums of binomials of the form $W^{K,s}_u=\sum_{x \in K} \psi(x^s - u x)$, where $K$ is a finite field whose canonical additive character is $\psi$, and where $u$ is an element of $K^{\times}$ and $s$ is a positive integer relatively prime to $|K^\times|$, so that $x \mapsto x^s$ is a permutation of $K$. The Weil spectrum for $K$ and $s$, which is the family of values $W^{K,s}_u$ as $u$ runs through $K^\times$, is of interest in arithmetic geometry and in several information-theoretic applications. The Weil spectrum always contains at least three distinct values if $s$ is nondegenerate (i.e., if $s$ is not a power of $p$ modulo $|K^\times|$, where $p$ is the characteristic of $K$). It is already known that if the Weil spectrum contains precisely three distinct values, then they must all be rational integers. We show that if the Weil spectrum contains precisely four distinct values, then they must all be rational integers, with the sole exception of the case where $|K|=5$ and $s \equiv 3 \pmod{4}$.

Submitted 6 April, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

Comments: 33 pages

MSC Class: 11T24; 11L05; 11L40; 11T22; 11G25; 11T71; 94A55; 94A60; 94B15

arXiv:2306.00976 [pdf, other]

TopEx: Topic-based Explanations for Model Comparison

Authors: Shreya Havaldar, Adam Stein, Eric Wong, Lyle Ungar

Abstract: Meaningfully comparing language models is challenging with current explanation methods. Current explanations are overwhelming for humans due to large vocabularies or incomparable across models. We present TopEx, an explanation method that enables a level playing field for comparing language models via model-agnostic topics. We demonstrate how TopEx can identify similarities and differences between DistilRoBERTa and GPT-2 on a variety of NLP tasks.

Submitted 1 June, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: Accepted to ICLR 2023, Tiny Papers Track

arXiv:2305.16308 [pdf, other]

Rectifying Group Irregularities in Explanations for Distribution Shift

Authors: Adam Stein, Yinjun Wu, Eric Wong, Mayur Naik

Abstract: It is well-known that real-world changes constituting distribution shift adversely affect model performance. How to characterize those changes in an interpretable manner is poorly understood. Existing techniques to address this problem take the form of shift explanations that elucidate how to map samples from the original distribution toward the shifted one by reducing the disparity between these two distributions. However, these methods can introduce group irregularities, leading to explanations that are less feasible and robust. To address these issues, we propose Group-aware Shift Explanations (GSE), a method that produces interpretable explanations by leveraging worst-group optimization to rectify group irregularities. We demonstrate how GSE not only maintains group structures, such as demographic and hierarchical subpopulations, but also enhances feasibility and robustness in the resulting explanations in a wide range of tabular, language, and image settings.

Submitted 25 May, 2023; originally announced May 2023.

Comments: 19 pages, 5 figures

arXiv:2303.09603 [pdf, ps, other]

Rigorous Analytic Combinatorics in Several Variables in SageMath

Authors: Benjamin Hackl, Andrew Luo, Stephen Melczer, Jesse Selover, Elaine Wong

Abstract: We introduce the new sage_acsv package for the SageMath computer algebra system, allowing users to rigorously compute asymptotics for a large variety of multivariate sequences with rational generating functions. Using Sage's support for exact computations over the algebraic number field, this package provides the first rigorous implementation of algorithms from the theory of analytic combinatorics in several variables.

Submitted 31 August, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

Comments: 8 pages; Package: https://pypi.org/project/sage-acsv/

Journal ref: Séminaire Lotharingien de Combinatoire 89B (2023): Proceedings of the 35th FPSAC Conference, Article #90, 12 pp.

arXiv:2303.01433 [pdf, other]

Do Machine Learning Models Learn Statistical Rules Inferred from Data?

Authors: Aaditya Naik, Yinjun Wu, Mayur Naik, Eric Wong

Abstract: Machine learning models can make critical errors that are easily hidden within vast amounts of data. Such errors often run counter to rules based on human intuition. However, rules based on human knowledge are challenging to scale or to even formalize. We thereby seek to infer statistical rules from the data and quantify the extent to which a model has learned them. We propose a framework SQRL that integrates logic-based methods with statistical inference to derive these rules from a model's training data without supervision. We further show how to adapt models at test time to reduce rule violations and produce more coherent predictions. SQRL generates up to 300K rules over datasets from vision, tabular, and language settings. We uncover up to 158K violations of those rules by state-of-the-art models for classification, object detection, and data imputation. Test-time adaptation reduces these violations by up to 68.7% with relative performance improvement up to 32%. SQRL is available at https://github.com/DebugML/sqrl.

Submitted 6 June, 2023; v1 submitted 2 March, 2023; originally announced March 2023.

arXiv:2302.11042 [pdf, other]

In-context Example Selection with Influences

Authors: Tai Nguyen, Eric Wong

Abstract: In-context learning (ICL) is a powerful paradigm that has emerged from large language models (LLMs). Despite its promise, ICL performance is known to be highly sensitive to input examples. In this work, we use $\textit{in-context influences}$ to analyze few-shot ICL performance directly from the in-context examples. Our proposed influence-based example selection method can identify both positive and negative examples, outperforming several baselines when evaluated on 9 SuperGLUE tasks. Our analysis uncovers up to a $16.3\%$ performance gap between using the most negative in-context examples and using the most positive. In a case study, we apply our influence-based framework to quantify the phenomenon of recency bias in example ordering for few-shot ICL.

Submitted 5 June, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

arXiv:2302.04237 [pdf, other]

Black Box Adversarial Prompting for Foundation Models

Authors: Natalie Maus, Patrick Chao, Eric Wong, Jacob Gardner

Abstract: Prompting interfaces allow users to quickly adjust the output of generative models in both vision and language. However, small changes and design choices in the prompt can lead to significant differences in the output. In this work, we develop a black-box framework for generating adversarial prompts for unstructured image and text generation. These prompts, which can be standalone or prepended to benign prompts, induce specific behaviors into the generative process, such as generating images of a particular object or generating high perplexity text.

Submitted 29 May, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

arXiv:2302.04067 [pdf, other]

doi 10.1145/3597066.3597113

A Unified Approach to Unimodality of Gaussian Polynomials

Authors: Christoph Koutschan, Ali K. Uncu, Elaine Wong

Abstract: In 2013, Pak and Panova proved the strict unimodality property of $q$-binomial coefficients $\binom{\ell+m}{m}_q$ (as polynomials in $q$) based on the combinatorics of Young tableaux and the semigroup property of Kronecker coefficients. They showed it to be true for all $\ell,m\geq 8$ and a few other cases. We propose a different approach to this problem based on computer algebra, where we establish a closed form for the coefficients of these polynomials and then use cylindrical algebraic decomposition to identify exactly the range of coefficients where strict unimodality holds. This strategy allows us to tackle generalizations of the problem, e.g., to show unimodality with larger gaps or unimodality of related sequences. In particular, we present proofs of two additional cases of a conjecture by Stanley and Zanello.

Submitted 31 August, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

Comments: Supplementary material at https://wongey.github.io/unimodality

Journal ref: ISSAC 2023: Proceedings of the 2023 International Symposium on Symbolic and Algebraic Computation, July 2023, Pages 434-442

arXiv:2301.13379 [pdf, other]

Faithful Chain-of-Thought Reasoning

Authors: Qing Lyu, Shreya Havaldar, Adam Stein, Li Zhang, Delip Rao, Eric Wong, Marianna Apidianaki, Chris Callison-Burch

Abstract: While Chain-of-Thought (CoT) prompting boosts Language Models' (LM) performance on a gamut of complex reasoning tasks, the generated reasoning chain does not necessarily reflect how the model arrives at the answer (a.k.a. faithfulness). We propose Faithful CoT, a reasoning framework involving two stages: Translation (Natural Language query $\rightarrow$ symbolic reasoning chain) and Problem Solving (reasoning chain $\rightarrow$ answer), using an LM and a deterministic solver respectively. This guarantees that the reasoning chain provides a faithful explanation of the final answer. Aside from interpretability, Faithful CoT also improves empirical performance: it outperforms standard CoT on 9 of 10 benchmarks from 4 diverse domains, with a relative accuracy gain of 6.3% on Math Word Problems (MWP), 3.4% on Planning, 5.5% on Multi-hop Question Answering (QA), and 21.4% on Relational Inference. Furthermore, with GPT-4 and Codex, it sets the new state-of-the-art few-shot performance on 7 datasets (with 95.0+ accuracy on 6 of them), showing a strong synergy between faithfulness and accuracy.

Submitted 20 September, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: IJCNLP-AACL 2023 camera-ready version

arXiv:2211.08624 [pdf, ps, other]

Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-channel Speech Enhancement

Authors: Kuan-Lin Chen, Daniel D. E. Wong, Ke Tan, Buye Xu, Anurag Kumar, Vamsi Krishna Ithapu

Abstract: Most speech enhancement (SE) models learn a point estimate and do not make use of uncertainty estimation in the learning process. In this paper, we show that modeling heteroscedastic uncertainty by minimizing a multivariate Gaussian negative log-likelihood (NLL) improves SE performance at no extra cost. During training, our approach augments a model learning complex spectral mapping with a temporary submodel to predict the covariance of the enhancement error at each time-frequency bin. Due to unrestricted heteroscedastic uncertainty, the covariance introduces an undersampling effect, detrimental to SE performance. To mitigate undersampling, our approach inflates the uncertainty lower bound and weights each loss component with their uncertainty, effectively compensating severely undersampled components with more penalties. Our multivariate setting reveals common covariance assumptions such as scalar and diagonal matrices. By weakening these assumptions, we show that the NLL achieves superior performance compared to popular loss functions including the mean squared error (MSE), mean absolute error (MAE), and scale-invariant signal-to-distortion ratio (SI-SDR).

Submitted 8 March, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

Comments: 5 pages. Accepted at ICASSP 2023

arXiv:2209.08422 [pdf]

Computed Decision Weights and a New Learning Algorithm for Neural Classifiers

Authors: Eugene Wong

Abstract: In this paper we consider the possibility of computing rather than training the decision layer weights of a neural classifier. Such a possibility arises in two ways: from making an appropriate choice of loss function and by solving a problem of constrained optimization. The latter formulation leads to a promising new learning process for pre-decision weights with both simplicity and efficacy.

Submitted 17 September, 2022; originally announced September 2022.

arXiv:2209.02446 [pdf, other]

Web3 Challenges and Opportunities for the Market

Authors: Dan Sheridan, James Harris, Frank Wear, Jerry Cowell Jr, Easton Wong, Abbas Yazdinejad

Abstract: The inability of a computer to think has been a limiter in its usefulness and a point of reassurance for humanity since the first computers were created. The semantic web is the first step toward removing that barrier, enabling computers to operate based on conceptual understanding, and AI and ML are the second. Both semantic knowledge and the ability to learn are fundamental to Web3, as are blockchain, decentralization, transactional transparency, and ownership. Web3 is the next generational step in the information age, where the web evolves into a more digestible medium for users and machines to browse knowledge. The slow introduction of Web3 across the global software ecosystem will impact the people who enable the current iteration. This evolution of the internet space will expand the way knowledge is shared, consumed, and owned, which will lessen the requirement for a global standard and allow data to interact efficiently, no matter the construction of the knowledge. At its heart, this paper examines: 1) the enablement of Web3 across the digital ecosystem; 2) what a Web3 developer will look like; and 3) how this alteration will evolve the market around software and knowledge in general.

Submitted 6 September, 2022; originally announced September 2022.

arXiv:2207.05739 [pdf, other]

A Data-Based Perspective on Transfer Learning

Authors: Saachi Jain, Hadi Salman, Alaa Khaddaj, Eric Wong, Sung Min Park, Aleksander Madry

Abstract: It is commonly believed that in transfer learning including more pre-training data translates into better performance. However, recent evidence suggests that removing data from the source dataset can actually help too. In this work, we take a closer look at the role of the source dataset's composition in transfer learning and present a framework for probing its impact on downstream performance. Our framework gives rise to new capabilities such as pinpointing transfer learning brittleness as well as detecting pathologies such as data-leakage and the presence of misleading examples in the source dataset. In particular, we demonstrate that removing detrimental datapoints identified by our framework improves transfer learning performance from ImageNet on a variety of target tasks. Code is available at https://github.com/MadryLab/data-transfer

Submitted 12 July, 2022; originally announced July 2022.

arXiv:2207.02842 [pdf, other]

When does Bias Transfer in Transfer Learning?

Authors: Hadi Salman, Saachi Jain, Andrew Ilyas, Logan Engstrom, Eric Wong, Aleksander Madry

Abstract: Using transfer learning to adapt a pre-trained "source model" to a downstream "target task" can dramatically increase performance with seemingly no downside. In this work, we demonstrate that there can exist a downside after all: bias transfer, or the tendency for biases of the source model to persist even after adapting the model to the target class. Through a combination of synthetic and natural experiments, we show that bias transfer both (a) arises in realistic settings (such as when pre-training on ImageNet or other standard datasets) and (b) can occur even when the target dataset is explicitly de-biased. As transfer-learned models are increasingly deployed in the real world, our work highlights the importance of understanding the limitations of pre-trained source models. Code is available at https://github.com/MadryLab/bias-transfer

Submitted 6 July, 2022; originally announced July 2022.

arXiv:2204.08945 [pdf, other]

Missingness Bias in Model Debugging

Authors: Saachi Jain, Hadi Salman, Eric Wong, Pengchuan Zhang, Vibhav Vineet, Sai Vemprala, Aleksander Madry

Abstract: Missingness, or the absence of features from an input, is a concept fundamental to many model debugging tools. However, in computer vision, pixels cannot simply be removed from an image. One thus tends to resort to heuristics such as blacking out pixels, which may in turn introduce bias into the debugging process. We study such biases and, in particular, show how transformer-based architectures can enable a more natural implementation of missingness, which side-steps these issues and improves the reliability of model debugging in practice. Our code is available at https://github.com/madrylab/missingness

Submitted 13 June, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

Comments: Published at ICLR 2022

arXiv:2202.13898

DistAD: Software Anomaly Detection Based on Execution Trace Distribution

Authors: Shiyi Kong, Jun Ai, Minyan Lu, Shuguang Wang, W. Eric Wong

Abstract: Modern software systems have become increasingly complex, which makes them difficult to test and validate. Detecting partial software anomalies in complex systems at runtime can assist with handling unintended software behaviors, avoiding catastrophic software failures, and improving software runtime availability. These detection techniques aim to identify the manifestation of faults (anomalies) before they ultimately lead to unavoidable failures, thus supporting subsequent runtime fault-tolerance techniques. In this work, we propose a novel anomaly detection method named DistAD, which is based on the distribution of software runtime dynamic execution traces. Unlike existing works that use key performance indicators, the execution trace is collected during runtime via intrusive instrumentation. Instrumentation is controlled by a sampling mechanism to avoid excessive overheads. Bi-directional Long Short-Term Memory (Bi-LSTM), a Recurrent Neural Network (RNN) architecture, is used to perform the anomaly detection. The whole framework is constructed under a One-Class Neural Network (OCNN) learning mode, which can help mitigate the lack of sufficient labeled samples and data imbalance issues. A series of controlled experiments is conducted on a widely used database system named Cassandra to demonstrate the validity and feasibility of the proposed method. Overheads introduced by the intrusive probing are also evaluated. The results show that DistAD can achieve more than 70% accuracy and 90% recall (in normal states) with no more than 2 times the overhead of unmonitored executions.

Submitted 26 April, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

Comments: need modification, the experiment results need carefully check

arXiv:2202.12778 [pdf, other]

doi 10.1109/INFOCOM.2018.8485846

CCOMPASSION: A Hybrid Cloudlet Placement Framework over Passive Optical Access Networks

Authors: Sourav Mondal, Goutam Das, Elaine Wong

Abstract: Cloud-based computing technology is one of the most significant technical advents of the last decade and extension of this facility towards access networks by aggregation of cloudlets is a step further. To fulfill the ravenous demand for computational resources entangled with the stringent latency requirements of computationally-heavy applications related to augmented reality, cognitive assistance and context-aware computation, installation of cloudlets near the access segment is a very promising solution because of its support for wide geographical network distribution, low latency, mobility and heterogeneity. In this paper, we propose a novel framework, Cloudlet Cost OptiMization over PASSIve Optical Network (CCOMPASSION), and formulate a nonlinear mixed-integer program to identify optimal cloudlet placement locations such that installation cost is minimized whilst meeting the capacity and latency constraints. Considering urban, suburban and rural scenarios as commonly-used network deployment models, we investigate the feasibility of the proposed model over them and provide guidance on the overall cloudlet facility installation over optical access network. We also study the percentage of incremental energy budget in the presence of cloudlets of the existing network. The final results from our proposed model can be considered as fundamental cornerstones for network planning with hybrid cloudlet network architectures.

Submitted 25 February, 2022; originally announced February 2022.

Comments: This paper is published in 2018 IEEE Conference on Computer Communications (INFOCOM). Copyright @ IEEE

Report number: 18150397

Journal ref: IEEE INFOCOM 2018 - IEEE Conference on Computer Communications

arXiv:2110.07719 [pdf, other]

Certified Patch Robustness via Smoothed Vision Transformers

Authors: Hadi Salman, Saachi Jain, Eric Wong, Aleksander Mądry

Abstract: Certified patch defenses can guarantee robustness of an image classifier to arbitrary changes within a bounded contiguous region. But, currently, this robustness comes at a cost of degraded standard accuracies and slower inference times. We demonstrate how using vision transformers enables significantly better certified patch robustness that is also more computationally efficient and does not incur a substantial drop in standard accuracy. These improvements stem from the inherent ability of the vision transformer to gracefully handle largely masked images. Our code is available at https://github.com/MadryLab/smoothed-vit.

Submitted 11 October, 2021; originally announced October 2021.

arXiv:2106.12041 [pdf, other]

doi 10.5194/ascmo-8-117-2022

Analysis of the Evolution of Parametric Drivers of High-End Sea-Level Hazards

Authors: Alana Hough, Tony E. Wong

Abstract: Climate models are critical tools for developing strategies to manage the risks posed by sea-level rise to coastal communities. While these models are necessary for understanding climate risks, there is a level of uncertainty inherent in each parameter in the models. This model parametric uncertainty leads to uncertainty in future climate risks. Consequently, there is a need to understand how those parameter uncertainties impact our assessment of future climate risks and the efficacy of strategies to manage them. Here, we use random forests to examine the parametric drivers of future climate risk and how the relative importances of those drivers change over time. We find that the equilibrium climate sensitivity and a factor that scales the effect of aerosols on radiative forcing are consistently the most important climate model parametric uncertainties throughout the 2020 to 2150 interval for both low and high radiative forcing scenarios. The near-term hazards of high-end sea-level rise are driven primarily by thermal expansion, while the longer-term hazards are associated with mass loss from the Antarctic and Greenland ice sheets. Our results highlight the practical importance of considering time-evolving parametric uncertainties when developing strategies to manage future climate risks.

Submitted 10 June, 2021; originally announced June 2021.

arXiv:2106.09117 [pdf, other]

DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting

Authors: Shaoru Chen, Eric Wong, J. Zico Kolter, Mahyar Fazlyab

Abstract: Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem, for which several past works have proposed convex relaxations as a promising alternative. However, even for reasonably-sized neural networks, these relaxations are not tractable, and so must be replaced by even weaker relaxations in practice. In this work, we propose a novel operator splitting method that can directly solve a convex relaxation of the problem to high accuracy, by splitting it into smaller sub-problems that often have analytical solutions. The method is modular, scales to very large problem instances, and comprises operations that are amenable to fast parallelization with GPU acceleration. We demonstrate our method in bounding the worst-case performance of large convolutional networks in image classification and reinforcement learning settings, and in reachability analysis of neural network dynamical systems.

Submitted 8 July, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

Comments: Published in IEEE Open Journal of Control Systems

arXiv:2105.08539 [pdf, other]

doi 10.1016/j.ejc.2021.103437

Binomial Determinants for Tiling Problems Yield to the Holonomic Ansatz

Authors: Hao Du, Christoph Koutschan, Thotsaporn Thanatipanonda, Elaine Wong

Abstract: We present and prove closed form expressions for some families of binomial determinants with signed Kronecker deltas that are located along an arbitrary diagonal in the corresponding matrix. They count cyclically symmetric rhombus tilings of hexagonal regions with triangular holes. We extend a previous systematic study of these families, where the locations of the Kronecker deltas depended on an additional parameter, to families with negative Kronecker deltas. By adapting Zeilberger's holonomic ansatz to make it work for our problems, we can take full advantage of computer algebra tools for symbolic summation. This, together with the combinatorial interpretation, allows us to realize some new determinantal relationships. From there, we are able to resolve all remaining open conjectures related to these determinants, including one from 2005 due to Lascoux and Krattenthaler.

Submitted 21 September, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

Comments: 45 pages; Supplementary material at https://wongey.github.io/binom-det

Journal ref: European Journal of Combinatorics, Volume 99, January 2022, 103437

arXiv:2105.04857 [pdf, other]

Leveraging Sparse Linear Layers for Debuggable Deep Networks

Authors: Eric Wong, Shibani Santurkar, Aleksander Mądry

Abstract: We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks. These networks remain highly accurate while also being more amenable to human interpretation, as we demonstrate quantitatively via numerical and human experiments. We further illustrate how the resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks. The code for our toolkit can be found at https://github.com/madrylab/debuggabledeepnetworks.

Submitted 11 May, 2021; originally announced May 2021.

arXiv:2010.08889 [pdf, other]

doi 10.1007/s11786-021-00514-3

Creative Telescoping on Multiple Sums

Authors: Christoph Koutschan, Elaine Wong

Abstract: We showcase a collection of practical strategies to deal with a problem arising from an analysis of integral estimators derived via quasi-Monte Carlo methods. The problem reduces to a triple binomial sum, thereby enabling us to open up the holonomic toolkit, which contains tools such as creative telescoping that can be used to deduce a recurrence satisfied by the sum. While applying these techniques, a host of issues arose that partly needed to be resolved by hand. In other words, no creative telescoping implementation currently exists that can resolve all these issues automatically. Thus, we felt the need to compile the different strategies we tried and the difficulties that we encountered along the way. In particular, we highlight the necessity of the certificate in these computations and how its complexity can greatly influence the computation time.

Submitted 17 March, 2021; v1 submitted 17 October, 2020; originally announced October 2020.

Comments: 22 pages; Supplementary material at https://wongey.github.io/digital-nets-walsh/

Journal ref: Mathematics in Computer Science, Vol. 15(3), Pages 483-498 (2021)

arXiv:2007.08450 [pdf, other]

Learning perturbation sets for robust machine learning

Authors: Eric Wong, J. Zico Kolter

Abstract: Although much progress has been made towards robust deep learning, a significant gap in robustness remains between real-world perturbations and more narrowly defined sets typically studied in adversarial defenses. In this paper, we aim to bridge this gap by learning perturbation sets from data, in order to characterize real-world effects for robust training and evaluation. Specifically, we use a conditional generator that defines the perturbation set over a constrained region of the latent space. We formulate desirable properties that measure the quality of a learned perturbation set, and theoretically prove that a conditional variational autoencoder naturally satisfies these criteria. Using this framework, our approach can generate a variety of perturbations at different complexities and scales, ranging from baseline spatial transformations, through common image corruptions, to lighting variations. We measure the quality of our learned perturbation sets both quantitatively and qualitatively, finding that our models are capable of producing a diverse set of meaningful perturbations beyond the limited data seen during training. Finally, we leverage our learned perturbation sets to train models which are empirically and certifiably robust to adversarial image corruptions and adversarial lighting variations, while improving generalization on non-adversarial data. All code and configuration files for reproducing the experiments as well as pretrained model weights can be found at https://github.com/locuslab/perturbation_learning.

Submitted 8 October, 2020; v1 submitted 16 July, 2020; originally announced July 2020.

arXiv:2007.00147 [pdf, other]

Neural Network Virtual Sensors for Fuel Injection Quantities with Provable Performance Specifications

Authors: Eric Wong, Tim Schneider, Joerg Schmitt, Frank R. Schmidt, J. Zico Kolter

Abstract: Recent work has shown that it is possible to learn neural networks with provable guarantees on the output of the model when subject to input perturbations; however, these works have focused primarily on defending against adversarial examples for image classifiers. In this paper, we study how these provable guarantees can be naturally applied to other real-world settings, namely getting performance specifications for robust virtual sensors measuring fuel injection quantities within an engine. We first demonstrate that, in this setting, even simple neural network models are highly susceptible to reasonable levels of adversarial sensor noise, which are capable of increasing the mean relative error of a standard neural network from 6.6% to 43.8%. We then leverage methods for learning provably robust networks and verifying robustness properties, resulting in a robust model which we can provably guarantee has at most 16.5% mean relative error under any sensor noise. Additionally, we show how specific intervals of fuel injection quantities can be targeted to maximize robustness for certain ranges, allowing us to train a virtual sensor for fuel injection which is provably guaranteed to have at most 10.69% relative error under noise while maintaining 3% relative error on non-adversarial data within normalized fuel injection ranges of 0.6 to 1.0.

Submitted 30 June, 2020; originally announced July 2020.

arXiv:2006.06225 [pdf, other]

doi 10.1016/j.matcom.2020.10.026

Walsh functions, scrambled $(0,m,s)$-nets, and negative covariance: applying symbolic computation to quasi-Monte Carlo integration

Authors: Jaspar Wiart, Elaine Wong

Abstract: We investigate base $b$ Walsh functions for which the variance of the integral estimator based on a scrambled $(0,m,s)$-net in base $b$ is less than or equal to that of the Monte-Carlo estimator based on the same number of points. First we compute the Walsh decomposition for the joint probability density function of two distinct points randomly chosen from a scrambled $(t,m,s)$-net in base $b$ in terms of certain counting numbers and simplify it in the special case where $t$ is zero. Using this, we obtain an expression for the covariance of the integral estimator in terms of the Walsh coefficients of the function. Finally, we prove that the covariance of the integral estimator is negative when the Walsh coefficients of the function satisfy a certain decay condition. To do this, we use creative telescoping and recurrence solving algorithms from symbolic computation to find a sign equivalent closed form expression for the covariance term.

Submitted 11 June, 2020; originally announced June 2020.

Comments: 27 pages; Supplementary material at https://wongey.github.io/digital-nets-walsh/

Journal ref: Mathematics and Computers in Simulation, Volume 182, April 2021, Pages 277-295

arXiv:2006.00390 [pdf, other]

doi 10.1016/j.comnet.2022.108847

Centralized and Decentralized Non-Cooperative Load-Balancing Games among Federated Cloudlets

Authors: Sourav Mondal, Goutam Das, Elaine Wong

Abstract: Edge computing servers like cloudlets from different service providers, which compensate for the scarce computational, memory, and energy resources of mobile devices, are distributed across access networks. However, depending on the mobility pattern and dynamically varying computational requirements of associated mobile devices, cloudlets at different parts of the network become either overloaded or under-loaded. Hence, load balancing among neighboring cloudlets appears to be an essential research problem. Nonetheless, the existing load balancing frameworks are unsuitable for low-latency applications. Thus, in this paper, we propose an economic and non-cooperative load balancing game for low-latency applications among federated neighboring cloudlets from the same as well as different service providers and heterogeneous classes of job requests. Firstly, we propose a centralized incentive mechanism to compute the pure strategy Nash equilibrium load balancing strategies of the cloudlets under the supervision of a neutral mediator. With this mechanism, we ensure that the truthful revelation of private information to the mediator is a weakly-dominant strategy for all the federated cloudlets. Secondly, we propose a continuous-action reinforcement learning automata-based algorithm, which allows each cloudlet to independently compute the Nash equilibrium in a completely distributed network setting. We critically study the convergence properties of the designed learning algorithm, scaffolding our understanding of the underlying load balancing game for faster convergence. Furthermore, through extensive simulations, we study the impacts of exploration and exploitation on learning accuracy. This is the first study to show the effectiveness of reinforcement learning algorithms for load balancing games among neighboring cloudlets.

Submitted 5 May, 2021; v1 submitted 30 May, 2020; originally announced June 2020.

Report number: 108847

arXiv:2002.11569 [pdf, other]

Overfitting in adversarially robust deep learning

Authors: Leslie Rice, Eric Wong, J. Zico Kolter

Abstract: It is common practice in deep learning to use overparameterized networks and train for as long as possible; there are numerous studies that show, both theoretically and empirically, that such practices surprisingly do not unduly harm the generalization performance of the classifier. In this paper, we empirically study this phenomenon in the setting of adversarially trained deep networks, which are trained to minimize the loss under worst-case adversarial perturbations. We find that overfitting to the training set does in fact harm robust performance to a very large degree in adversarially robust training across multiple datasets (SVHN, CIFAR-10, CIFAR-100, and ImageNet) and perturbation models ($\ell_\infty$ and $\ell_2$). Based upon this observed effect, we show that the performance gains of virtually all recent algorithmic improvements upon adversarial training can be matched by simply using early stopping. We also show that effects such as the double descent curve do still occur in adversarially trained models, yet fail to explain the observed overfitting. Finally, we study several classical and modern deep learning remedies for overfitting, including regularization and data augmentation, and find that no approach in isolation improves significantly upon the gains achieved by early stopping. All code for reproducing the experiments as well as pretrained model weights and training logs can be found at https://github.com/locuslab/robust_overfitting.

Submitted 4 March, 2020; v1 submitted 26 February, 2020; originally announced February 2020.

arXiv:2002.02355 [pdf, ps, other]

doi 10.1145/3373207.3404025

An Additive Decomposition in S-Primitive Towers

Authors: Hao Du, Jing Guo, Ziming Li, Elaine Wong

Abstract: We consider the additive decomposition problem in primitive towers and present an algorithm to decompose a function in an S-primitive tower as a sum of a derivative in the tower and a remainder which is minimal in some sense. Special instances of S-primitive towers include differential fields generated by finitely many logarithmic functions and logarithmic integrals. A function in an S-primitive tower is integrable in the tower if and only if the remainder is equal to zero. The additive decomposition is achieved by viewing our towers not as a traditional chain of extension fields, but rather as a direct sum of certain subrings. Furthermore, we can determine whether or not a function in an S-primitive tower has an elementary integral without solving any differential equations. We also show that a kind of S-primitive towers, known as logarithmic towers, can be embedded into a particular extension where we can obtain a finer remainder.

Submitted 6 February, 2020; originally announced February 2020.

Comments: This article has been submitted to ISSAC2020 for review. Supplementary material at https://wongey.github.io/add-decomp-sprimitive/

Journal ref: ISSAC 2020: Proceedings of the 45th International Symposium on Symbolic and Algebraic Computation, July 2020, Pages 146-153

Showing 1–50 of 62 results for author: Wong, E