Showing 1–17 of 17 results for author: Wu, C H

Searching in archive cs.
  1. arXiv:2406.12814

    cs.LG cs.CL cs.CR cs.CV

    Adversarial Attacks on Multimodal Agents

    Authors: Chen Henry Wu, Jing Yu Koh, Ruslan Salakhutdinov, Daniel Fried, Aditi Raghunathan

    Abstract: Vision-enabled language models (VLMs) are now used to build autonomous multimodal agents capable of taking actions in real environments. In this paper, we show that multimodal agents raise new safety risks, even though attacking agents is more challenging than prior attacks due to limited access to and knowledge about the environment. Our attacks use adversarial text strings to guide gradient-base…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 19 pages

  2. arXiv:2312.03556

    cs.CV cs.LG

    Personalized Face Inpainting with Diffusion Models by Parallel Visual Attention

    Authors: Jianjin Xu, Saman Motamed, Praneetha Vaddamanu, Chen Henry Wu, Christian Haene, Jean-Charles Bazin, Fernando de la Torre

    Abstract: Face inpainting is important in various applications, such as photo restoration, image editing, and virtual reality. Despite the significant advances in face generative models, ensuring that a person's unique facial identity is maintained during the inpainting process is still an elusive goal. Current state-of-the-art techniques, exemplified by MyStyle, necessitate resource-intensive fine-tuning a…

    Submitted 6 December, 2023; originally announced December 2023.

  3. arXiv:2309.11489

    cs.LG cs.AI cs.CL cs.RO

    Text2Reward: Reward Shaping with Language Models for Reinforcement Learning

    Authors: Tianbao Xie, Siheng Zhao, Chen Henry Wu, Yitao Liu, Qian Luo, Victor Zhong, Yanchao Yang, Tao Yu

    Abstract: Designing reward functions is a longstanding challenge in reinforcement learning (RL); it requires specialized knowledge or domain data, leading to high costs for development. To address this, we introduce Text2Reward, a data-free framework that automates the generation and shaping of dense reward functions based on large language models (LLMs). Given a goal described in natural language, Text2Rew…

    Submitted 25 May, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: ICLR 2024 camera ready, 37 pages, 12 figures

  4. arXiv:2309.05569

    cs.CV cs.AI cs.CL cs.LG

    ITI-GEN: Inclusive Text-to-Image Generation

    Authors: Cheng Zhang, Xuanbai Chen, Siqi Chai, Chen Henry Wu, Dmitry Lagun, Thabo Beeler, Fernando De la Torre

    Abstract: Text-to-image generative models often reflect the biases of the training data, leading to unequal representations of underrepresented groups. This study investigates inclusive text-to-image generative models that generate images based on human-written prompts and ensure the resulting images are uniformly distributed across attributes of interest. Unfortunately, directly expressing the desired attr…

    Submitted 11 September, 2023; originally announced September 2023.

    Comments: Accepted to ICCV 2023 (Oral Presentation)

  5. arXiv:2304.06107

    cs.CV cs.LG

    PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face Inpainting

    Authors: Saman Motamed, Jianjin Xu, Chen Henry Wu, Fernando De la Torre

    Abstract: Generative models such as StyleGAN2 and Stable Diffusion have achieved state-of-the-art performance in computer vision tasks such as image synthesis, inpainting, and de-noising. However, current generative models for face inpainting often fail to preserve fine facial details and the identity of the person, despite creating aesthetically convincing image structures and textures. In this work, we pr…

    Submitted 12 April, 2023; originally announced April 2023.

  6. arXiv:2303.15441

    cs.CV cs.AI cs.CL cs.LG

    Zero-shot Model Diagnosis

    Authors: Jinqi Luo, Zhaoning Wang, Chen Henry Wu, Dong Huang, Fernando De la Torre

    Abstract: When it comes to deploying deep vision models, the behavior of these systems must be explicable to ensure confidence in their reliability and fairness. A common approach to evaluate deep learning models is to build a labeled test set with attributes of interest and assess how well it performs. However, creating a balanced test set (i.e., one that is uniformly sampled over all the important traits)…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: Accepted in CVPR 2023

  7. arXiv:2303.13010

    cs.CV cs.AI cs.LG

    Semantic Image Attack for Visual Model Diagnosis

    Authors: Jinqi Luo, Zhaoning Wang, Chen Henry Wu, Dong Huang, Fernando De la Torre

    Abstract: In practice, metric analysis on a specific train and test dataset does not guarantee reliable or fair ML models. This is partially due to the fact that obtaining a balanced, diverse, and perfectly labeled dataset is typically expensive, time-consuming, and error-prone. Rather than relying on a carefully designed test set to assess ML models' failures, fairness, or robustness, this paper proposes S…

    Submitted 22 March, 2023; originally announced March 2023.

    Comments: Initial version submitted to NeurIPS 2022

  8. arXiv:2210.05559

    cs.CV cs.GR cs.LG

    Unifying Diffusion Models' Latent Space, with Applications to CycleDiffusion and Guidance

    Authors: Chen Henry Wu, Fernando De la Torre

    Abstract: Diffusion models have achieved unprecedented performance in generative modeling. The commonly-adopted formulation of the latent code of diffusion models is a sequence of gradually denoised samples, as opposed to the simpler (e.g., Gaussian) latent space of GANs, VAEs, and normalizing flows. This paper provides an alternative, Gaussian formulation of the latent space of various diffusion models, as…

    Submitted 6 December, 2022; v1 submitted 11 October, 2022; originally announced October 2022.

  9. arXiv:2209.06970

    cs.CV cs.GR cs.LG

    Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models

    Authors: Chen Henry Wu, Saman Motamed, Shaunak Srivastava, Fernando De la Torre

    Abstract: Generative models (e.g., GANs, diffusion models) learn the underlying data distribution in an unsupervised manner. However, many applications of interest require sampling from a particular region of the output space or sampling evenly over a range of characteristics. For efficient sampling in these scenarios, we propose Generative Visual Prompt (PromptGen), a framework for distributional control o…

    Submitted 17 October, 2022; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: NeurIPS 2022

  10. arXiv:2209.01975

    cs.CL

    Selective Annotation Makes Language Models Better Few-Shot Learners

    Authors: Hongjin Su, Jungo Kasai, Chen Henry Wu, Weijia Shi, Tianlu Wang, Jiayi Xin, Rui Zhang, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu

    Abstract: Many recent approaches to natural language tasks are built on the remarkable abilities of large language models. Large language models can perform in-context learning, where they learn a new task from a few task demonstrations, without any parameter updates. This work examines the implications of in-context learning for the creation of datasets for new natural language tasks. Departing from recent…

    Submitted 5 September, 2022; originally announced September 2022.

  11. arXiv:2206.02290

    q-bio.QM cs.AI q-bio.MN

    A knowledge graph representation learning approach to predict novel kinase-substrate interactions

    Authors: Sachin Gavali, Karen Ross, Chuming Chen, Julie Cowart, Cathy H. Wu

    Abstract: The human proteome contains a vast network of interacting kinases and substrates. Even though some kinases have proven to be immensely useful as therapeutic targets, a majority are still understudied. In this work, we present a novel knowledge graph representation learning approach to predict novel interaction partners for understudied kinases. Our approach uses a phosphoproteomic knowledge graph…

    Submitted 9 June, 2022; v1 submitted 5 June, 2022; originally announced June 2022.

  12. arXiv:2201.05966

    cs.CL

    UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models

    Authors: Tianbao Xie, Chen Henry Wu, Peng Shi, Ruiqi Zhong, Torsten Scholak, Michihiro Yasunaga, Chien-Sheng Wu, Ming Zhong, Pengcheng Yin, Sida I. Wang, Victor Zhong, Bailin Wang, Chengzu Li, Connor Boyle, Ansong Ni, Ziyu Yao, Dragomir Radev, Caiming Xiong, Lingpeng Kong, Rui Zhang, Noah A. Smith, Luke Zettlemoyer, Tao Yu

    Abstract: Structured knowledge grounding (SKG) leverages structured knowledge to complete user requests, such as semantic parsing over databases and question answering over knowledge bases. Since the inputs and outputs of SKG tasks are heterogeneous, they have been studied separately by different communities, which limits systematic and compatible research on SKG. In this paper, we overcome this limitation…

    Submitted 18 October, 2022; v1 submitted 15 January, 2022; originally announced January 2022.

    Comments: EMNLP 2022

  13. arXiv:2110.10150

    cs.CL

    Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents

    Authors: Yusen Zhang, Ansong Ni, Ziming Mao, Chen Henry Wu, Chenguang Zhu, Budhaditya Deb, Ahmed H. Awadallah, Dragomir Radev, Rui Zhang

    Abstract: Text summarization helps readers capture salient information from documents, news, interviews, and meetings. However, most state-of-the-art pretrained language models (LM) are unable to efficiently process long text for many summarization tasks. In this paper, we propose Summ$^N$, a simple, flexible, and effective multi-stage framework for input texts that are longer than the maximum context lengt…

    Submitted 13 April, 2022; v1 submitted 16 October, 2021; originally announced October 2021.

    Comments: ACL 2022

  14. arXiv:2110.08168

    cs.CL

    DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization

    Authors: Ziming Mao, Chen Henry Wu, Ansong Ni, Yusen Zhang, Rui Zhang, Tao Yu, Budhaditya Deb, Chenguang Zhu, Ahmed H. Awadallah, Dragomir Radev

    Abstract: Transformer-based models have achieved state-of-the-art performance on short-input summarization. However, they still struggle with summarizing longer text. In this paper, we present DYLE, a novel dynamic latent extraction approach for abstractive long-input summarization. DYLE jointly trains an extractor and a generator and treats the extracted text snippets as the latent variable, allowing dynam…

    Submitted 24 April, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ACL 2022

  15. arXiv:2109.07713

    cs.CL cs.AI cs.LG

    Transferable Persona-Grounded Dialogues via Grounded Minimal Edits

    Authors: Chen Henry Wu, Yinhe Zheng, Xiaoxi Mao, Minlie Huang

    Abstract: Grounded dialogue models generate responses that are grounded on certain concepts. Limited by the distribution of grounded dialogue data, models trained on such data face the transferability challenges in terms of the data distribution and the type of grounded concepts. To address the challenges, we propose the grounded minimal editing framework, which minimally edits existing responses to be grou…

    Submitted 16 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP 2021

  16. arXiv:2108.01547

    cs.CL cs.AI

    EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative Pre-Training

    Authors: Hao Zhou, Pei Ke, Zheng Zhang, Yuxian Gu, Yinhe Zheng, Chujie Zheng, Yida Wang, Chen Henry Wu, Hao Sun, Xiaocong Yang, Bosi Wen, Xiaoyan Zhu, Minlie Huang, Jie Tang

    Abstract: Although pre-trained language models have remarkably enhanced the generation ability of dialogue systems, open-domain Chinese dialogue systems are still limited by the dialogue data and the model size compared with English ones. In this paper, we propose EVA, a Chinese dialogue system that contains the largest Chinese pre-trained dialogue model with 2.8B parameters. To build this model, we collect…

    Submitted 3 August, 2021; originally announced August 2021.

    Comments: 8 pages, 4 figures

  17. arXiv:2106.02210

    cs.CL

    NAST: A Non-Autoregressive Generator with Word Alignment for Unsupervised Text Style Transfer

    Authors: Fei Huang, Zikai Chen, Chen Henry Wu, Qihan Guo, Xiaoyan Zhu, Minlie Huang

    Abstract: Autoregressive models have been widely used in unsupervised text style transfer. Despite their success, these models still suffer from the content preservation problem that they usually ignore part of the source sentence and generate some irrelevant words with strong styles. In this paper, we propose a Non-Autoregressive generator for unsupervised text Style Transfer (NAST), which alleviates the p…

    Submitted 3 June, 2021; originally announced June 2021.

    Comments: Accepted by ACL 2021: Findings (long paper)