Showing 1–14 of 14 results for author: Zhou, J P

Searching in archive cs.
  1. arXiv:2407.06172  [pdf, other]

    cs.AI cs.CL

    On Speeding Up Language Model Evaluation

    Authors: Jin Peng Zhou, Christian K. Belardi, Ruihan Wu, Travis Zhang, Carla P. Gomes, Wen Sun, Kilian Q. Weinberger

    Abstract: Large language models (LLMs) currently dominate the field of natural language processing (NLP), representing the state-of-the-art across a diverse array of tasks. Developing a model of this nature, from training to inference, requires making numerous decisions which define a combinatorial search problem. For example, selecting the optimal pre-trained LLM, prompt, or hyperparameters to attain the b…

    Submitted 8 July, 2024; originally announced July 2024.

  2. arXiv:2407.04181  [pdf, other]

    cs.AI cs.CL

    Orchestrating LLMs with Different Personalizations

    Authors: Jin Peng Zhou, Katie Z Luo, Jingwen Gu, Jason Yuan, Kilian Q. Weinberger, Wen Sun

    Abstract: This paper presents a novel approach to aligning large language models (LLMs) with individual human preferences, sometimes referred to as Reinforcement Learning from Personalized Human Feedback (RLPHF). Given stated preferences along multiple dimensions, such as helpfulness, conciseness, or humor, the goal is to create an LLM without re-training that best adheres to this specification. St…

    Submitted 4 July, 2024; originally announced July 2024.

  3. arXiv:2405.17503  [pdf, other]

    cs.SE cs.AI cs.CL cs.PL

    Code Repair with LLMs gives an Exploration-Exploitation Tradeoff

    Authors: Hao Tang, Keya Hu, Jin Peng Zhou, Sicheng Zhong, Wei-Long Zheng, Xujie Si, Kevin Ellis

    Abstract: Iteratively improving and repairing source code with large language models (LLMs), known as refinement, has emerged as a popular way of generating programs that would be too complex to construct in one shot. Given a bank of test cases, together with a candidate program, an LLM can improve that program by being prompted with failed test cases. But it remains an open question how to best iteratively…

    Submitted 30 May, 2024; v1 submitted 26 May, 2024; originally announced May 2024.

  4. arXiv:2403.18120  [pdf, other]

    cs.AI cs.CL cs.LG

    Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization

    Authors: Jin Peng Zhou, Charles Staats, Wenda Li, Christian Szegedy, Kilian Q. Weinberger, Yuhuai Wu

    Abstract: Large language models (LLMs), such as Google's Minerva and OpenAI's GPT families, are becoming increasingly capable of solving mathematical quantitative reasoning problems. However, they still make unjustified logical and computational errors in their reasoning steps and answers. In this paper, we leverage the fact that if the training corpus of LLMs contained sufficiently many examples of formal m…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: ICLR 2024

  5. arXiv:2402.17032  [pdf, other]

    cs.AI cs.LG

    REFACTOR: Learning to Extract Theorems from Proofs

    Authors: Jin Peng Zhou, Yuhuai Wu, Qiyang Li, Roger Grosse

    Abstract: Human mathematicians are often good at recognizing modular and reusable theorems that make complex mathematical results within reach. In this paper, we propose a novel method called theoREm-from-prooF extrACTOR (REFACTOR) for training neural networks to mimic this ability in formal mathematical theorem proving. We show on a set of unseen proofs, REFACTOR is able to extract 19.6% of the theorems th…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: ICLR 2024

  6. arXiv:2402.03292  [pdf, other]

    cs.LG cs.CV

    Zero-shot Object-Level OOD Detection with Context-Aware Inpainting

    Authors: Quang-Huy Nguyen, Jin Peng Zhou, Zhenzhen Liu, Khanh-Huyen Bui, Kilian Q. Weinberger, Dung D. Le

    Abstract: Machine learning algorithms are increasingly provided as black-box cloud services or pre-trained models, without access to their training data. This motivates the problem of zero-shot out-of-distribution (OOD) detection. Concretely, we aim to detect OOD objects that do not belong to the classifier's label set but are erroneously classified as in-distribution (ID) objects. Our approach, RONIN, uses…

    Submitted 6 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  7. arXiv:2310.16176  [pdf, other]

    cs.CL cs.AI

    Correction with Backtracking Reduces Hallucination in Summarization

    Authors: Zhenzhen Liu, Chao Wan, Varsha Kishore, Jin Peng Zhou, Minmin Chen, Kilian Q. Weinberger

    Abstract: Abstractive summarization aims at generating natural language summaries of a source document that are succinct while preserving the important elements. Despite recent advances, neural text summarization models are known to be susceptible to hallucinating (or more correctly confabulating), that is to produce summaries with details that are not grounded in the source document. In this paper, we intr…

    Submitted 31 October, 2023; v1 submitted 24 October, 2023; originally announced October 2023.

  8. arXiv:2303.04488  [pdf, other]

    cs.LG cs.AI cs.LO

    Magnushammer: A Transformer-Based Approach to Premise Selection

    Authors: Maciej Mikuła, Szymon Tworkowski, Szymon Antoniak, Bartosz Piotrowski, Albert Qiaochu Jiang, Jin Peng Zhou, Christian Szegedy, Łukasz Kuciński, Piotr Miłoś, Yuhuai Wu

    Abstract: This paper presents a novel approach to premise selection, a crucial reasoning task in automated theorem proving. Traditionally, symbolic methods that rely on extensive domain knowledge and engineering effort are applied to this task. In contrast, this work demonstrates that contrastive training with the transformer architecture can achieve higher-quality retrieval of relevant premises, without th…

    Submitted 18 March, 2024; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: ICLR 2024

  9. arXiv:2302.10326  [pdf, other]

    cs.CV cs.LG

    Unsupervised Out-of-Distribution Detection with Diffusion Inpainting

    Authors: Zhenzhen Liu, Jin Peng Zhou, Yufan Wang, Kilian Q. Weinberger

    Abstract: Unsupervised out-of-distribution detection (OOD) seeks to identify out-of-domain data by learning only from unlabeled in-domain data. We present a novel approach for this task - Lift, Map, Detect (LMD) - that leverages recent advancement in diffusion models. Diffusion models are one type of generative models. At their core, they learn an iterative denoising process that gradually maps a noisy imag…

    Submitted 16 August, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: ICML 2023

  10. arXiv:2212.10318  [pdf, other]

    cs.CR cs.LG

    Learned Systems Security

    Authors: Roei Schuster, Jin Peng Zhou, Thorsten Eisenhofer, Paul Grubbs, Nicolas Papernot

    Abstract: A learned system uses machine learning (ML) internally to improve performance. We can expect such systems to be vulnerable to some adversarial-ML attacks. Often, the learned component is shared between mutually-distrusting users or processes, much like microarchitectural resources such as caches, potentially giving rise to highly-realistic attacker models. However, compared to attacks on other ML-…

    Submitted 10 January, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  11. arXiv:2210.12283  [pdf, other]

    cs.AI cs.LG

    Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs

    Authors: Albert Q. Jiang, Sean Welleck, Jin Peng Zhou, Wenda Li, Jiacheng Liu, Mateja Jamnik, Timothée Lacroix, Yuhuai Wu, Guillaume Lample

    Abstract: The formalization of existing mathematical proofs is a notoriously difficult process. Despite decades of research on automation and proof assistants, writing formal proofs remains arduous and only accessible to a few experts. While previous studies to automate formalization focused on powerful search algorithms, no attempts were made to take advantage of available informal proofs. In this work, we…

    Submitted 20 February, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

  12. arXiv:2202.12968  [pdf, other]

    cs.LG

    Does Label Differential Privacy Prevent Label Inference Attacks?

    Authors: Ruihan Wu, Jin Peng Zhou, Kilian Q. Weinberger, Chuan Guo

    Abstract: Label differential privacy (label-DP) is a popular framework for training private ML models on datasets with public features and sensitive private labels. Despite its rigorous privacy guarantee, it has been observed that in practice label-DP does not preclude label inference attacks (LIAs): Models trained with label-DP can be evaluated on the public training features to recover, with high accuracy…

    Submitted 3 June, 2023; v1 submitted 25 February, 2022; originally announced February 2022.

  13. arXiv:2008.09194  [pdf, other]

    cs.LG cs.CR cs.CV cs.CY

    On Attribution of Deepfakes

    Authors: Baiwu Zhang, Jin Peng Zhou, Ilia Shumailov, Nicolas Papernot

    Abstract: Progress in generative modelling, especially generative adversarial networks, has made it possible to efficiently synthesize and alter media at scale. Malicious individuals now rely on these machine-generated media, or deepfakes, to manipulate social discourse. In order to ensure media authenticity, existing research is focused on deepfake detection. Yet, the adversarial nature of frameworks used…

    Submitted 3 March, 2021; v1 submitted 20 August, 2020; originally announced August 2020.

  14. arXiv:2008.01246  [pdf, other]

    cs.IR

    Noise Contrastive Estimation for Autoencoding-based One-Class Collaborative Filtering

    Authors: Jin Peng Zhou, Ga Wu, Zheda Mai, Scott Sanner

    Abstract: One-class collaborative filtering (OC-CF) is a common class of recommendation problem where only the positive class is explicitly observed (e.g., purchases, clicks). Autoencoder based recommenders such as AutoRec and variants demonstrate strong performance on many OC-CF benchmarks, but also empirically suffer from a strong popularity bias. While a careful choice of negative samples in the OC-CF se…

    Submitted 5 August, 2020; v1 submitted 3 August, 2020; originally announced August 2020.

    Comments: 10 pages, 7 figures