Zum Hauptinhalt springen

Showing 1–50 of 101 results for author: Pang, R

Searching in archive cs. Search in all archives.
.
  1. arXiv:2408.04682  [pdf, other

    cs.CL cs.AI cs.LG

    ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities

    Authors: Jiarui Lu, Thomas Holleis, Yizhe Zhang, Bernhard Aumayer, Feng Nan, Felix Bai, Shuang Ma, Shen Ma, Mengyu Li, Guoli Yin, Zirui Wang, Ruoming Pang

    Abstract: Recent large language models (LLMs) advancements sparked a growing research interest in tool assisted LLMs solving real-world challenges, which calls for comprehensive evaluation of tool-use capabilities. While previous works focused on either evaluating over stateless web services (RESTful API), based on a single turn user prompt, or an off-policy dialog trajectory, ToolSandbox includes stateful… ▽ More

    Submitted 8 August, 2024; originally announced August 2024.

  2. arXiv:2408.02666  [pdf, other

    cs.CL cs.AI

    Self-Taught Evaluators

    Authors: Tianlu Wang, Ilia Kulikov, Olga Golovneva, Ping Yu, Weizhe Yuan, Jane Dwivedi-Yu, Richard Yuanzhe Pang, Maryam Fazel-Zarandi, Jason Weston, Xian Li

    Abstract: Model-based evaluation is at the heart of successful model development -- as a reward model for training, and as a replacement for human evaluation. To train such evaluators, the standard approach is to collect a large amount of human preference judgments over model responses, which is costly and the data becomes stale as models improve. In this work, we present an approach that aims to im-prove e… ▽ More

    Submitted 8 August, 2024; v1 submitted 5 August, 2024; originally announced August 2024.

  3. arXiv:2407.21075  [pdf, other

    cs.AI cs.CL cs.LG

    Apple Intelligence Foundation Language Models

    Authors: Tom Gunter, Zirui Wang, Chong Wang, Ruoming Pang, Andy Narayanan, Aonan Zhang, Bowen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek , et al. (130 additional authors not shown)

    Abstract: We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, the data used… ▽ More

    Submitted 29 July, 2024; originally announced July 2024.

  4. arXiv:2407.18961  [pdf, other

    cs.AI

    MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains

    Authors: Guoli Yin, Haoping Bai, Shuang Ma, Feng Nan, Yanchao Sun, Zhaoyang Xu, Shen Ma, Jiarui Lu, Xiang Kong, Aonan Zhang, Dian Ang Yap, Yizhe zhang, Karsten Ahnert, Vik Kamath, Mathias Berglund, Dominic Walsh, Tobias Gindele, Juergen Wiest, Zhengfeng Lai, Xiaoming Wang, Jiulong Shan, Meng Cao, Ruoming Pang, Zirui Wang

    Abstract: Recent advances in large language models (LLMs) have increased the demand for comprehensive benchmarks to evaluate their capabilities as human-like agents. Existing benchmarks, while useful, often focus on specific application scenarios, emphasizing task completion but failing to dissect the underlying skills that drive these outcomes. This lack of granularity makes it difficult to deeply discern… ▽ More

    Submitted 15 August, 2024; v1 submitted 17 July, 2024; originally announced July 2024.

  5. arXiv:2406.13853  [pdf, other

    cs.HC

    AltGeoViz: Facilitating Accessible Geovisualization

    Authors: Chu Li, Rock Yuren Pang, Ather Sharif, Arnavi Chheda-Kothary, Jeffrey Heer, Jon E. Froehlich

    Abstract: Geovisualizations are powerful tools for exploratory spatial analysis, enabling sighted users to discern patterns, trends, and relationships within geographic data. However, these visual tools have remained largely inaccessible to screen-reader users. We present AltGeoViz, a new system we designed to facilitate geovisualization exploration for these users. AltGeoViz dynamically generates alt-text… ▽ More

    Submitted 21 June, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

  6. arXiv:2406.09669  [pdf, other

    cs.CR

    Watch the Watcher! Backdoor Attacks on Security-Enhancing Diffusion Models

    Authors: Changjiang Li, Ren Pang, Bochuan Cao, Jinghui Chen, Fenglong Ma, Shouling Ji, Ting Wang

    Abstract: Thanks to their remarkable denoising capabilities, diffusion models are increasingly being employed as defensive tools to reinforce the security of other models, notably in purifying adversarial examples and certifying adversarial robustness. However, the security risks of these practices themselves remain largely unexplored, which is highly concerning. To bridge this gap, this work investigates t… ▽ More

    Submitted 13 June, 2024; originally announced June 2024.

  7. arXiv:2406.04638  [pdf, other

    cs.CL

    Large Language Model-guided Document Selection

    Authors: Xiang Kong, Tom Gunter, Ruoming Pang

    Abstract: Large Language Model (LLM) pre-training exhausts an ever growing compute budget, yet recent research has demonstrated that careful document selection enables comparable model quality with only a fraction of the FLOPs. Inspired by efforts suggesting that domain-specific training document selection is in fact an interpretable process [Gunasekar et al., 2023], as well as research showing that instruc… ▽ More

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 9 pages

  8. arXiv:2405.17247  [pdf, other

    cs.LG

    An Introduction to Vision-Language Modeling

    Authors: Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, Alexander C. Li, Adrien Bardes, Suzanne Petryk, Oscar Mañas, Zhiqiu Lin, Anas Mahmoud, Bargav Jayaraman, Mark Ibrahim, Melissa Hall, Yunyang Xiong, Jonathan Lebensold, Candace Ross, Srihari Jayakumar, Chuan Guo, Diane Bouchacourt, Haider Al-Tahan, Karthik Padthe, Vasu Sharma, Hu Xu, Xiaoqing Ellen Tan, Megan Richards, Samuel Lavoie , et al. (16 additional authors not shown)

    Abstract: Following the recent popularity of Large Language Models (LLMs), several attempts have been made to extend them to the visual domain. From having a visual assistant that could guide us through unfamiliar environments to generative models that produce images using only a high-level text description, the vision-language model (VLM) applications will significantly impact our relationship with technol… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  9. arXiv:2405.15052  [pdf, other

    cs.LG cs.AI

    Revisiting MoE and Dense Speed-Accuracy Comparisons for LLM Training

    Authors: Xianzhi Du, Tom Gunter, Xiang Kong, Mark Lee, Zirui Wang, Aonan Zhang, Nan Du, Ruoming Pang

    Abstract: Mixture-of-Experts (MoE) enjoys performance gain by increasing model capacity while keeping computation cost constant. When comparing MoE to dense models, prior work typically adopt the following setting: 1) use FLOPs or activated parameters as a measure of model complexity; 2) train all models to the same number of tokens. We argue that this setting favors MoE as FLOPs and activated parameters do… ▽ More

    Submitted 28 June, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

    Comments: 8 pages

  10. arXiv:2405.06783  [pdf, other

    cs.HC cs.AI cs.CY

    BLIP: Facilitating the Exploration of Undesirable Consequences of Digital Technologies

    Authors: Rock Yuren Pang, Sebastin Santy, René Just, Katharina Reinecke

    Abstract: Digital technologies have positively transformed society, but they have also led to undesirable consequences not anticipated at the time of design or development. We posit that insights into past undesirable consequences can help researchers and practitioners gain awareness and anticipate potential adverse effects. To test this assumption, we introduce BLIP, a system that extracts real-world undes… ▽ More

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: To appear in the Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24), May 11--16, 2024, Honolulu, HI, USA

  11. arXiv:2404.19733  [pdf, other

    cs.CL cs.AI

    Iterative Reasoning Preference Optimization

    Authors: Richard Yuanzhe Pang, Weizhe Yuan, Kyunghyun Cho, He He, Sainbayar Sukhbaatar, Jason Weston

    Abstract: Iterative preference optimization methods have recently been shown to perform well for general instruction tuning tasks, but typically make little improvement on reasoning tasks (Yuan et al., 2024, Chen et al., 2024). In this work we develop an iterative approach that optimizes the preference between competing generated Chain-of-Thought (CoT) candidates by optimizing for winning vs. losing reasoni… ▽ More

    Submitted 25 June, 2024; v1 submitted 30 April, 2024; originally announced April 2024.

  12. arXiv:2404.12450  [pdf, other

    cs.CV cs.AI cs.LG

    Enhancing AI Diagnostics: Autonomous Lesion Masking via Semi-Supervised Deep Learning

    Authors: Ting-Ruen Wei, Michele Hell, Dang Bich Thuy Le, Aren Vierra, Ran Pang, Mahesh Patel, Young Kang, Yuling Yan

    Abstract: This study presents an unsupervised domain adaptation method aimed at autonomously generating image masks outlining regions of interest (ROIs) for differentiating breast lesions in breast ultrasound (US) imaging. Our semi-supervised learning approach utilizes a primitive model trained on a small public breast US dataset with true annotations. This model is then iteratively refined for the domain a… ▽ More

    Submitted 18 April, 2024; originally announced April 2024.

  13. arXiv:2403.09611  [pdf, other

    cs.CV cs.CL cs.LG

    MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training

    Authors: Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, Sam Dodge, Bowen Zhang, Philipp Dufter, Dhruti Shah, Xianzhi Du, Futang Peng, Floris Weers, Anton Belyi, Haotian Zhang, Karanjeet Singh, Doug Kang, Ankur Jain, Hongyu Hè, Max Schwarzer, Tom Gunter, Xiang Kong, Aonan Zhang, Jianyu Wang, Chong Wang, Nan Du, Tao Lei, Sam Wiseman , et al. (7 additional authors not shown)

    Abstract: In this work, we discuss building performant Multimodal Large Language Models (MLLMs). In particular, we study the importance of various architecture components and data choices. Through careful and comprehensive ablations of the image encoder, the vision language connector, and various pre-training data choices, we identified several crucial design lessons. For example, we demonstrate that for la… ▽ More

    Submitted 18 April, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  14. arXiv:2401.10020  [pdf, other

    cs.CL cs.AI

    Self-Rewarding Language Models

    Authors: Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston

    Abstract: We posit that to achieve superhuman agents, future models require superhuman feedback in order to provide an adequate training signal. Current approaches commonly train reward models from human preferences, which may then be bottlenecked by human performance level, and secondly these separate frozen reward models cannot then learn to improve during LLM training. In this work, we study Self-Rewardi… ▽ More

    Submitted 8 February, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

  15. arXiv:2401.07475  [pdf, other

    cs.CL

    GWPT: A Green Word-Embedding-based POS Tagger

    Authors: Chengwei Wei, Runqi Pang, C. -C. Jay Kuo

    Abstract: As a fundamental tool for natural language processing (NLP), the part-of-speech (POS) tagger assigns the POS label to each word in a sentence. A novel lightweight POS tagger based on word embeddings is proposed and named GWPT (green word-embedding-based POS tagger) in this work. Following the green learning (GL) methodology, GWPT contains three modules in cascade: 1) representation learning, 2) fe… ▽ More

    Submitted 15 January, 2024; originally announced January 2024.

  16. arXiv:2312.09057  [pdf, other

    cs.CR cs.AI cs.CV

    On the Difficulty of Defending Contrastive Learning against Backdoor Attacks

    Authors: Changjiang Li, Ren Pang, Bochuan Cao, Zhaohan Xi, Jinghui Chen, Shouling Ji, Ting Wang

    Abstract: Recent studies have shown that contrastive learning, like supervised learning, is highly vulnerable to backdoor attacks wherein malicious functions are injected into target models, only to be activated by specific triggers. However, thus far it remains under-explored how contrastive backdoor attacks fundamentally differ from their supervised counterparts, which impedes the development of effective… ▽ More

    Submitted 14 December, 2023; originally announced December 2023.

    Comments: USENIX Security 24

  17. arXiv:2312.05386  [pdf, other

    cs.LG cs.CR

    Model Extraction Attacks Revisited

    Authors: Jiacheng Liang, Ren Pang, Changjiang Li, Ting Wang

    Abstract: Model extraction (ME) attacks represent one major threat to Machine-Learning-as-a-Service (MLaaS) platforms by ``stealing'' the functionality of confidential machine-learning models through querying black-box APIs. Over seven years have passed since ME attacks were first conceptualized in the seminal work. During this period, substantial advances have been made in both ME attacks and MLaaS platfor… ▽ More

    Submitted 8 December, 2023; originally announced December 2023.

    Comments: Under Review

  18. arXiv:2311.12022  [pdf, other

    cs.AI cs.CL

    GPQA: A Graduate-Level Google-Proof Q&A Benchmark

    Authors: David Rein, Betty Li Hou, Asa Cooper Stickland, Jackson Petty, Richard Yuanzhe Pang, Julien Dirani, Julian Michael, Samuel R. Bowman

    Abstract: We present GPQA, a challenging dataset of 448 multiple-choice questions written by domain experts in biology, physics, and chemistry. We ensure that the questions are high-quality and extremely difficult: experts who have or are pursuing PhDs in the corresponding domains reach 65% accuracy (74% when discounting clear mistakes the experts identified in retrospect), while highly skilled non-expert v… ▽ More

    Submitted 20 November, 2023; originally announced November 2023.

    Comments: 28 pages, 5 figures, 7 tables

  19. arXiv:2309.13256  [pdf, other

    cs.LG cs.AI

    Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks

    Authors: Zhaohan Xi, Tianyu Du, Changjiang Li, Ren Pang, Shouling Ji, Jinghui Chen, Fenglong Ma, Ting Wang

    Abstract: Pre-trained language models (PLMs) have demonstrated remarkable performance as few-shot learners. However, their security risks under such settings are largely unexplored. In this work, we conduct a pilot study showing that PLMs as few-shot learners are highly vulnerable to backdoor attacks while existing defenses are inadequate due to the unique challenges of few-shot scenarios. To address such c… ▽ More

    Submitted 23 September, 2023; originally announced September 2023.

    Comments: Accepted by NeurIPS'23

  20. arXiv:2309.09843  [pdf, other

    cs.CL cs.LG cs.SD eess.AS

    Instruction-Following Speech Recognition

    Authors: Cheng-I Jeff Lai, Zhiyun Lu, Liangliang Cao, Ruoming Pang

    Abstract: Conventional end-to-end Automatic Speech Recognition (ASR) models primarily focus on exact transcription tasks, lacking flexibility for nuanced user interactions. With the advent of Large Language Models (LLMs) in speech processing, more organic, text-prompt-based interactions have become possible. However, the mechanisms behind these models' speech understanding and "reasoning" capabilities remai… ▽ More

    Submitted 18 September, 2023; originally announced September 2023.

  21. arXiv:2309.04456  [pdf, other

    cs.CY cs.HC

    The Case for Anticipating Undesirable Consequences of Computing Innovations Early, Often, and Across Computer Science

    Authors: Rock Yuren Pang, Dan Grossman, Tadayoshi Kohno, Katharina Reinecke

    Abstract: From smart sensors that infringe on our privacy to neural nets that portray realistic imposter deepfakes, our society increasingly bears the burden of negative, if unintended, consequences of computing innovations. As the experts in the technology we create, Computer Science (CS) researchers must do better at anticipating and addressing these undesirable consequences proactively. Our prior work sh… ▽ More

    Submitted 8 September, 2023; originally announced September 2023.

    Comments: More details at NSF #2315937: https://www.nsf.gov/awardsearch/showAward?AWD_ID=2315937&HistoricalAwards=false

  22. arXiv:2309.04354  [pdf, other

    cs.CV cs.LG stat.ML

    Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts

    Authors: Erik Daxberger, Floris Weers, Bowen Zhang, Tom Gunter, Ruoming Pang, Marcin Eichner, Michael Emmersberger, Yinfei Yang, Alexander Toshev, Xianzhi Du

    Abstract: Sparse Mixture-of-Experts models (MoEs) have recently gained popularity due to their ability to decouple model size from inference efficiency by only activating a small subset of the model parameters for any given input token. As such, sparse MoEs have enabled unprecedented scalability, resulting in tremendous successes across domains such as natural language processing and computer vision. In thi… ▽ More

    Submitted 8 September, 2023; originally announced September 2023.

  23. arXiv:2307.14117  [pdf, other

    cs.CL

    Leveraging Implicit Feedback from Deployment Data in Dialogue

    Authors: Richard Yuanzhe Pang, Stephen Roller, Kyunghyun Cho, He He, Jason Weston

    Abstract: We study improving social conversational agents by learning from natural dialogue between users and a deployed model, without extra annotations. To implicitly measure the quality of a machine-generated utterance, we leverage signals like user response length, sentiment and reaction of the future human utterances in the collected dialogue episodes. Our experiments use the publicly released deployme… ▽ More

    Submitted 31 January, 2024; v1 submitted 26 July, 2023; originally announced July 2023.

    Comments: EACL 2024

  24. arXiv:2305.15269  [pdf, other

    cs.CL cs.AI

    Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples

    Authors: Abulhair Saparov, Richard Yuanzhe Pang, Vishakh Padmakumar, Nitish Joshi, Seyed Mehran Kazemi, Najoung Kim, He He

    Abstract: Given the intractably large size of the space of proofs, any model that is capable of general deductive reasoning must generalize to proofs of greater complexity. Recent studies have shown that large language models (LLMs) possess some abstract deductive reasoning ability given chain-of-thought prompts. However, they have primarily been tested on proofs using modus ponens or of a specific size, an… ▽ More

    Submitted 3 November, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

    Comments: Published as a conference paper at NeurIPS 2023

  25. Auditing Cross-Cultural Consistency of Human-Annotated Labels for Recommendation Systems

    Authors: Rock Yuren Pang, Jack Cenatempo, Franklyn Graham, Bridgette Kuehn, Maddy Whisenant, Portia Botchway, Katie Stone Perez, Allison Koenecke

    Abstract: Recommendation systems increasingly depend on massive human-labeled datasets; however, the human annotators hired to generate these labels increasingly come from homogeneous backgrounds. This poses an issue when downstream predictive models -- based on these labels -- are applied globally to a heterogeneous set of users. We study this disconnect with respect to the labels themselves, asking whethe… ▽ More

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: Accepted at FAccT 2023

  26. arXiv:2305.02383  [pdf, other

    cs.CR cs.AI

    On the Security Risks of Knowledge Graph Reasoning

    Authors: Zhaohan Xi, Tianyu Du, Changjiang Li, Ren Pang, Shouling Ji, Xiapu Luo, Xusheng Xiao, Fenglong Ma, Ting Wang

    Abstract: Knowledge graph reasoning (KGR) -- answering complex logical queries over large knowledge graphs -- represents an important artificial intelligence task, entailing a range of applications (e.g., cyber threat hunting). However, despite its surging popularity, the potential security risks of KGR are largely unexplored, which is concerning, given the increasing use of such capability in security-crit… ▽ More

    Submitted 22 June, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: In proceedings of USENIX Security'23. Codes: https://github.com/HarrialX/security-risk-KG-reasoning

  27. arXiv:2304.05687  [pdf, ps, other

    cs.HC

    Anticipating Unintended Consequences of Technology Using Insights from Creativity Support Tools

    Authors: Rock Yuren Pang, Katharina Reinecke

    Abstract: Our society has been increasingly witnessing a number of negative, unintended consequences of digital technologies. While post-hoc policy regulation is crucial in addressing these issues, reasonably anticipating the consequences before deploying technology can help mitigate potential harm to society in the first place. Yet, the quest to anticipate potential harms can be difficult without seeing di… ▽ More

    Submitted 12 April, 2023; originally announced April 2023.

    Comments: In CHI '23 Workshop on Designing Technology and Policy Simultaneously: Towards A Research Agenda and New Practice, April 23, 2023

  28. arXiv:2304.00171  [pdf, other

    cs.CL cs.SD eess.AS

    Practical Conformer: Optimizing size, speed and flops of Conformer for on-Device and cloud ASR

    Authors: Rami Botros, Anmol Gulati, Tara N. Sainath, Krzysztof Choromanski, Ruoming Pang, Trevor Strohman, Weiran Wang, Jiahui Yu

    Abstract: Conformer models maintain a large number of internal states, the vast majority of which are associated with self-attention layers. With limited memory bandwidth, reading these from memory at each inference step can slow down inference. In this paper, we design an optimized conformer that is small enough to meet on-device restrictions and has fast inference on TPUs. We explore various ideas to impr… ▽ More

    Submitted 31 March, 2023; originally announced April 2023.

  29. "That's important, but...": How Computer Science Researchers Anticipate Unintended Consequences of Their Research Innovations

    Authors: Kimberly Do, Rock Yuren Pang, Jiachen Jiang, Katharina Reinecke

    Abstract: Computer science research has led to many breakthrough innovations but has also been scrutinized for enabling technology that has negative, unintended consequences for society. Given the increasing discussions of ethics in the news and among researchers, we interviewed 20 researchers in various CS sub-disciplines to identify whether and how they consider potential unintended consequences of their… ▽ More

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: Corresponding author: Rock Yuren Pang, email provided below. Kimberly Do and Rock Yuren Pang contributed equally to this research. The author order is listed alphabetically. To appear in CHI Conference on Human Factors in Computing Systems (CHI '23), April 23-April 28, 2023, Hamburg, Germany. ACM, New York, NY, USA, 16 pages

  30. arXiv:2303.04562  [pdf, other

    cs.LG cs.CL q-bio.QM

    Extrapolative Controlled Sequence Generation via Iterative Refinement

    Authors: Vishakh Padmakumar, Richard Yuanzhe Pang, He He, Ankur P. Parikh

    Abstract: We study the problem of extrapolative controlled generation, i.e., generating sequences with attribute values beyond the range seen in training. This task is of significant importance in automated design, especially drug discovery, where the goal is to design novel proteins that are \textit{better} (e.g., more stable) than existing sequences. Thus, by definition, the target sequences and their att… ▽ More

    Submitted 7 June, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: ICML 2023 - Camera Ready Version

  31. arXiv:2301.13081  [pdf, other

    cs.CV

    STAIR: Learning Sparse Text and Image Representation in Grounded Tokens

    Authors: Chen Chen, Bowen Zhang, Liangliang Cao, Jiguang Shen, Tom Gunter, Albin Madappally Jose, Alexander Toshev, Jonathon Shlens, Ruoming Pang, Yinfei Yang

    Abstract: Image and text retrieval is one of the foundational tasks in the vision and language domain with multiple real-world applications. State-of-the-art approaches, e.g. CLIP, ALIGN, represent images and texts as dense embeddings and calculate the similarity in the dense embedding space as the matching score. On the other hand, sparse semantic features like bag-of-words models are more interpretable, b… ▽ More

    Submitted 7 February, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

  32. arXiv:2212.14160  [pdf

    physics.ao-ph cs.AI cs.CV

    A Deep Learning Method for Real-time Bias Correction of Wind Field Forecasts in the Western North Pacific

    Authors: Wei Zhang, Yueyue Jiang, Junyu Dong, Xiaojiang Song, Renbo Pang, Boyu Guoan, Hui Yu

    Abstract: Forecasts by the European Centre for Medium-Range Weather Forecasts (ECMWF; EC for short) can provide a basis for the establishment of maritime-disaster warning systems, but they contain some systematic biases.The fifth-generation EC atmospheric reanalysis (ERA5) data have high accuracy, but are delayed by about 5 days. To overcome this issue, a spatiotemporal deep-learning method could be used fo… ▽ More

    Submitted 28 December, 2022; originally announced December 2022.

    Comments: 18 pages

    Journal ref: Atmospheric Research, Volume 284, 15 March, 2023

  33. arXiv:2211.08714  [pdf, other

    cs.CL cs.AI cs.LG

    Reward Gaming in Conditional Text Generation

    Authors: Richard Yuanzhe Pang, Vishakh Padmakumar, Thibault Sellam, Ankur P. Parikh, He He

    Abstract: To align conditional text generation model outputs with desired behaviors, there has been an increasing focus on training the model using reinforcement learning (RL) with reward functions learned from human annotations. Under this framework, we identify three common cases where high rewards are incorrectly assigned to undesirable patterns: noise-induced spurious correlation, naturally occurring sp… ▽ More

    Submitted 1 June, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

    Comments: ACL 2023

  34. arXiv:2210.12179  [pdf, other

    cs.CR cs.LG

    Neural Architectural Backdoors

    Authors: Ren Pang, Changjiang Li, Zhaohan Xi, Shouling Ji, Ting Wang

    Abstract: This paper asks the intriguing question: is it possible to exploit neural architecture search (NAS) as a new attack vector to launch previously improbable attacks? Specifically, we present EVAS, a new attack that leverages NAS to find neural architectures with inherent backdoors and exploits such vulnerability using input-aware triggers. Compared with existing attacks, EVAS demonstrates many inter… ▽ More

    Submitted 7 November, 2022; v1 submitted 21 October, 2022; originally announced October 2022.

  35. arXiv:2210.07346  [pdf, other

    cs.CR cs.CV cs.LG

    An Embarrassingly Simple Backdoor Attack on Self-supervised Learning

    Authors: Changjiang Li, Ren Pang, Zhaohan Xi, Tianyu Du, Shouling Ji, Yuan Yao, Ting Wang

    Abstract: As a new paradigm in machine learning, self-supervised learning (SSL) is capable of learning high-quality representations of complex data without relying on labels. In addition to eliminating the need for labeled data, research has found that SSL improves the adversarial robustness over supervised learning since lacking labels makes it more challenging for adversaries to manipulate model predictio… ▽ More

    Submitted 13 August, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: The 2023 International Conference on Computer Vision (ICCV '23)

  36. arXiv:2210.03305  [pdf, other

    cs.HC

    How Do Data Science Workers Communicate Intermediate Results?

    Authors: Rock Yuren Pang, Ruotong Wang, Joely Nelson, Leilani Battle

    Abstract: Data science workers increasingly collaborate on large-scale projects before communicating insights to a broader audience in the form of visualization. While prior work has modeled how data science teams, oftentimes with distinct roles and work processes, communicate knowledge to outside stakeholders, we have little knowledge of how data science workers communicate intermediately before delivering… ▽ More

    Submitted 6 October, 2022; originally announced October 2022.

    Comments: This paper was accepted for presentation as part of the eighth Symposium on Visualization in Data Science (VDS) at ACM KDD 2022 as well as IEEE VIS 2022. http://www.visualdatascience.org/2022/index.html

  37. arXiv:2209.13702  [pdf, other

    cs.AI cs.LG

    Reasoning over Multi-view Knowledge Graphs

    Authors: Zhaohan Xi, Ren Pang, Changjiang Li, Tianyu Du, Shouling Ji, Fenglong Ma, Ting Wang

    Abstract: Recently, knowledge representation learning (KRL) is emerging as the state-of-the-art approach to process queries over knowledge graphs (KGs), wherein KG entities and the query are embedded into a latent space such that entities that answer the query are embedded close to the query. Yet, despite the intensive research on KRL, most existing studies either focus on homogenous KGs or assume KG comple… ▽ More

    Submitted 27 September, 2022; originally announced September 2022.

  38. arXiv:2208.13916  [pdf, other

    eess.AS cs.CL cs.SD

    A Language Agnostic Multilingual Streaming On-Device ASR System

    Authors: Bo Li, Tara N. Sainath, Ruoming Pang, Shuo-yiin Chang, Qiumin Xu, Trevor Strohman, Vince Chen, Qiao Liang, Heguang Liu, Yanzhang He, Parisa Haghani, Sameer Bidichandani

    Abstract: On-device end-to-end (E2E) models have shown improvements over a conventional model on English Voice Search tasks in both quality and latency. E2E models have also shown promising results for multilingual automatic speech recognition (ASR). In this paper, we extend our previous capacity solution to streaming applications and present a streaming multilingual E2E ASR system that runs fully on device… ▽ More

    Submitted 29 August, 2022; originally announced August 2022.

    Comments: Accepted in Interspeech 2022

  39. arXiv:2208.12852  [pdf, other

    cs.CL cs.AI

    What Do NLP Researchers Believe? Results of the NLP Community Metasurvey

    Authors: Julian Michael, Ari Holtzman, Alicia Parrish, Aaron Mueller, Alex Wang, Angelica Chen, Divyam Madaan, Nikita Nangia, Richard Yuanzhe Pang, Jason Phang, Samuel R. Bowman

    Abstract: We present the results of the NLP Community Metasurvey. Run from May to June 2022, the survey elicited opinions on controversial issues, including industry influence in the field, concerns about AGI, and ethics. Our results put concrete numbers to several controversies: For example, respondents are split almost exactly in half on questions about the importance of artificial general intelligence, w… ▽ More

    Submitted 26 August, 2022; originally announced August 2022.

    Comments: 31 pages, 19 figures, 3 tables; more information at https://nlpsurvey.net

    ACM Class: I.2.7

  40. arXiv:2205.11465  [pdf, ps, other

    cs.CL cs.AI

    SQuALITY: Building a Long-Document Summarization Dataset the Hard Way

    Authors: Alex Wang, Richard Yuanzhe Pang, Angelica Chen, Jason Phang, Samuel R. Bowman

    Abstract: Summarization datasets are often assembled either by scraping naturally occurring public-domain summaries -- which are nearly always in difficult-to-work-with technical domains -- or by using approximate heuristics to extract them from everyday text -- which frequently yields unfaithful summaries. In this work, we turn to a slower but more straightforward approach to developing summarization bench… ▽ More

    Submitted 23 May, 2022; originally announced May 2022.

  41. arXiv:2203.13240  [pdf, other

    cs.CL cs.LG

    Token Dropping for Efficient BERT Pretraining

    Authors: Le Hou, Richard Yuanzhe Pang, Tianyi Zhou, Yuexin Wu, Xinying Song, Xiaodan Song, Denny Zhou

    Abstract: Transformer-based models generally allocate the same amount of computation for each token in a given sequence. We develop a simple but effective "token dropping" method to accelerate the pretraining of transformer models, such as BERT, without degrading its performance on downstream tasks. In short, we drop unimportant tokens starting from an intermediate layer in the model to make the model focus… ▽ More

    Submitted 24 March, 2022; originally announced March 2022.

    Comments: ACL 2022

  42. arXiv:2203.12533  [pdf, other

    cs.DC cs.LG

    Pathways: Asynchronous Distributed Dataflow for ML

    Authors: Paul Barham, Aakanksha Chowdhery, Jeff Dean, Sanjay Ghemawat, Steven Hand, Dan Hurt, Michael Isard, Hyeontaek Lim, Ruoming Pang, Sudip Roy, Brennan Saeta, Parker Schuh, Ryan Sepassi, Laurent El Shafey, Chandramohan A. Thekkath, Yonghui Wu

    Abstract: We present the design of a new large scale orchestration layer for accelerators. Our system, Pathways, is explicitly designed to enable exploration of new systems and ML research ideas, while retaining state of the art performance for current models. Pathways uses a sharded dataflow graph of asynchronous operators that consume and produce futures, and efficiently gang-schedules heterogeneous paral… ▽ More

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: MLSys 2022

  43. arXiv:2203.05008  [pdf, other

    cs.CL cs.LG cs.SD eess.AS

    Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition

    Authors: W. Ronny Huang, Cal Peyser, Tara N. Sainath, Ruoming Pang, Trevor Strohman, Shankar Kumar

    Abstract: Language model fusion helps smart assistants recognize words which are rare in acoustic data but abundant in text-only corpora (typed search logs). However, such corpora have properties that hinder downstream performance, including being (1) too large, (2) beset with domain-mismatched content, and (3) heavy-headed rather than heavy-tailed (excessively many duplicate search queries such as "weather… ▽ More

    Submitted 15 June, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

    Comments: Interspeech 2022

  44. arXiv:2112.08670  [pdf, other

    cs.CL cs.LG

    Amortized Noisy Channel Neural Machine Translation

    Authors: Richard Yuanzhe Pang, He He, Kyunghyun Cho

    Abstract: Noisy channel models have been especially effective in neural machine translation (NMT). However, recent approaches like "beam search and rerank" (BSR) incur significant computation overhead during inference, making real-world application infeasible. We aim to study if it is possible to build an amortized noisy channel NMT model such that when we do greedy decoding during inference, the translatio… ▽ More

    Submitted 18 July, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: INLG 2022

  45. arXiv:2112.08608  [pdf, other

    cs.CL

    QuALITY: Question Answering with Long Input Texts, Yes!

    Authors: Richard Yuanzhe Pang, Alicia Parrish, Nitish Joshi, Nikita Nangia, Jason Phang, Angelica Chen, Vishakh Padmakumar, Johnny Ma, Jana Thompson, He He, Samuel R. Bowman

    Abstract: To enable building and testing models on long-document comprehension, we introduce QuALITY, a multiple-choice QA dataset with context passages in English that have an average length of about 5,000 tokens, much longer than typical current models can process. Unlike in prior work with passages, our questions are written and validated by contributors who have read the entire passage, rather than rely… ▽ More

    Submitted 11 May, 2022; v1 submitted 15 December, 2021; originally announced December 2021.

    Comments: NAACL 2022

  46. arXiv:2112.07175  [pdf, other

    cs.CV

    Co-training Transformer with Videos and Images Improves Action Recognition

    Authors: Bowen Zhang, Jiahui Yu, Christopher Fifty, Wei Han, Andrew M. Dai, Ruoming Pang, Fei Sha

    Abstract: In learning action recognition, models are typically pre-trained on object recognition with images, such as ImageNet, and later fine-tuned on target action recognition with videos. This approach has achieved good empirical performance especially with recent transformer-based video architectures. While recently many works aim to design more advanced transformer architectures for action recognition,… ▽ More

    Submitted 14 December, 2021; originally announced December 2021.

  47. arXiv:2110.14693  [pdf, other

    cs.CR

    Towards Robust Reasoning over Knowledge Graphs

    Authors: Zhaohan Xi, Ren Pang, Changjiang Li, Shouling Ji, Xiapu Luo, Xusheng Xiao, Ting Wang

    Abstract: Answering complex logical queries over large-scale knowledge graphs (KGs) represents an important artificial intelligence task, entailing a range of applications. Recently, knowledge representation learning (KRL) has emerged as the state-of-the-art approach, wherein KG entities and the query are embedded into a latent space such that entities that answer the query are embedded close to the query.… ▽ More

    Submitted 31 October, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

  48. arXiv:2110.06018  [pdf, other

    cs.LG cs.CR cs.CV

    On the Security Risks of AutoML

    Authors: Ren Pang, Zhaohan Xi, Shouling Ji, Xiapu Luo, Ting Wang

    Abstract: Neural Architecture Search (NAS) represents an emerging machine learning (ML) paradigm that automatically searches for models tailored to given tasks, which greatly simplifies the development of ML systems and propels the trend of ML democratization. Yet, little is known about the potential security risks incurred by NAS, which is concerning given the increasing use of NAS-generated models in crit… ▽ More

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: Accepted as a full paper at USENIX Security '22

  49. arXiv:2110.04627  [pdf, other

    cs.CV cs.LG

    Vector-quantized Image Modeling with Improved VQGAN

    Authors: Jiahui Yu, Xin Li, Jing Yu Koh, Han Zhang, Ruoming Pang, James Qin, Alexander Ku, Yuanzhong Xu, Jason Baldridge, Yonghui Wu

    Abstract: Pretraining language models with next-token prediction on massive text corpora has delivered phenomenal zero-shot, few-shot, transfer learning and multi-tasking capabilities on both generative and discriminative language tasks. Motivated by this success, we explore a Vector-quantized Image Modeling (VIM) approach that involves pretraining a Transformer to predict rasterized image tokens autoregres… ▽ More

    Submitted 4 June, 2022; v1 submitted 9 October, 2021; originally announced October 2021.

    Comments: Accepted in ICLR 2022

  50. arXiv:2109.13226  [pdf, other

    eess.AS cs.CL cs.LG cs.SD

    BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition

    Authors: Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang , et al. (1 additional authors not shown)

    Abstract: We summarize the results of a host of efforts using giant automatic speech recognition (ASR) models pre-trained using large, diverse unlabeled datasets containing approximately a million hours of audio. We find that the combination of pre-training, self-training and scaling up model size greatly increases data efficiency, even for extremely large tasks with tens of thousands of hours of labeled da… ▽ More

    Submitted 21 July, 2022; v1 submitted 27 September, 2021; originally announced September 2021.

    Comments: 14 pages, 7 figures, 13 tables; v2: minor corrections, reference baselines and bibliography updated; v3: corrections based on reviewer feedback, bibliography updated