Showing 1–40 of 40 results for author: Ke, P

Searching in archive cs.
  1. arXiv:2407.03978  [pdf, other]

    cs.CL cs.AI

    Benchmarking Complex Instruction-Following with Multiple Constraints Composition

    Authors: Bosi Wen, Pei Ke, Xiaotao Gu, Lindong Wu, Hao Huang, Jinfeng Zhou, Wenchuang Li, Binxin Hu, Wendy Gao, Jiaxin Xu, Yiming Liu, Jie Tang, Hongning Wang, Minlie Huang

    Abstract: Instruction following is one of the fundamental capabilities of large language models (LLMs). As the ability of LLMs is constantly improving, they have been increasingly applied to deal with complex human instructions in real-world scenarios. Therefore, how to evaluate the ability of complex instruction-following of LLMs has become a critical research problem. Existing benchmarks mainly focus on m…

    Submitted 11 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

    Comments: 20 pages, 7 figures

  2. arXiv:2407.02855  [pdf, other]

    cs.CR cs.CL cs.LG

    Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks

    Authors: Zhexin Zhang, Junxiao Yang, Pei Ke, Shiyao Cui, Chujie Zheng, Hongning Wang, Minlie Huang

    Abstract: LLMs are known to be vulnerable to jailbreak attacks, even after safety alignment. An important observation is that, while different types of jailbreak attacks can generate significantly different queries, they mostly result in similar responses that are rooted in the same harmful knowledge (e.g., detailed steps to make a bomb). Therefore, we conjecture that directly unlearning the harmful knowledge…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 15 pages

  3. arXiv:2406.16714  [pdf, other]

    cs.CL cs.AI cs.LG

    AutoDetect: Towards a Unified Framework for Automated Weakness Detection in Large Language Models

    Authors: Jiale Cheng, Yida Lu, Xiaotao Gu, Pei Ke, Xiao Liu, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang

    Abstract: Although Large Language Models (LLMs) are becoming increasingly powerful, they still exhibit significant but subtle weaknesses, such as mistakes in instruction-following or coding tasks. As these unexpected errors could lead to severe consequences in practical deployments, it is crucial to investigate the limitations within LLMs systematically. Traditional benchmarking approaches cannot thoroughly…

    Submitted 24 June, 2024; originally announced June 2024.

  4. arXiv:2406.04604  [pdf, other]

    cs.CL cs.PL

    Learning Task Decomposition to Assist Humans in Competitive Programming

    Authors: Jiaxin Wen, Ruiqi Zhong, Pei Ke, Zhihong Shao, Hongning Wang, Minlie Huang

    Abstract: When using language models (LMs) to solve complex problems, humans might struggle to understand the LM-generated solutions and repair the flawed ones. To assist humans in repairing them, we propose to automatically decompose complex solutions into multiple simpler pieces that correspond to specific subtasks. We introduce a novel objective for learning task decomposition, termed assistive value (As…

    Submitted 17 July, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Main Conference

  5. arXiv:2405.14383  [pdf, other]

    cs.CL cs.AI

    Perception of Knowledge Boundary for Large Language Models through Semi-open-ended Question Answering

    Authors: Zhihua Wen, Zhiliang Tian, Zexin Jian, Zhen Huang, Pei Ke, Yifu Gao, Minlie Huang, Dongsheng Li

    Abstract: Large Language Models (LLMs) are widely used for knowledge-seeking yet suffer from hallucinations. The knowledge boundary (KB) of an LLM limits its factual understanding, beyond which it may begin to hallucinate. Investigating the perception of LLMs' KB is crucial for detecting hallucinations and LLMs' reliable generation. Current studies perceive LLMs' KB on questions with a concrete answer (clos…

    Submitted 23 May, 2024; originally announced May 2024.

  6. arXiv:2402.16444  [pdf, other]

    cs.CL

    ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors

    Authors: Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, Pei Ke, Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Minlie Huang

    Abstract: The safety of Large Language Models (LLMs) has gained increasing attention in recent years, but there is still no comprehensive approach for detecting safety issues within LLMs' responses in an aligned, customizable and explainable manner. In this paper, we propose ShieldLM, an LLM-based safety detector, which aligns with general human safety standards, supports customizable detection rules, and…

    Submitted 26 February, 2024; originally announced February 2024.

    Comments: 17 pages

  7. arXiv:2402.00856  [pdf, other]

    cs.CL

    Towards Efficient Exact Optimization of Language Model Alignment

    Authors: Haozhe Ji, Cheng Lu, Yilin Niu, Pei Ke, Hongning Wang, Jun Zhu, Jie Tang, Minlie Huang

    Abstract: The alignment of language models with human preferences is vital for their application in real-world tasks. The problem is formulated as optimizing the model's policy to maximize the expected reward that reflects human preferences with minimal deviation from the initial policy. While considered as a straightforward solution, reinforcement learning (RL) suffers from high variance in policy updates,…

    Submitted 5 June, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

    Comments: 24 pages, 9 figures

    Journal ref: Forty-first International Conference on Machine Learning (ICML 2024)

  8. arXiv:2401.03204  [pdf, ps, other]

    cs.CR

    The 4-adic complexity of quaternary sequences with low autocorrelation and high linear complexity

    Authors: Feifei Yan, Pinhui Ke, Lingmei Xiao

    Abstract: Recently, Jiang et al. proposed several new classes of quaternary sequences with low autocorrelation and high linear complexity by using the inverse Gray mapping (JAMC, 69 (2023): 689–706). In this paper, we estimate the 4-adic complexity of these quaternary sequences. Our results show that these sequences have large 4-adic complexity to resist the attack of the rational approximation al…

    Submitted 6 January, 2024; originally announced January 2024.

  9. arXiv:2311.18743  [pdf, other]

    cs.CL cs.AI cs.LG

    AlignBench: Benchmarking Chinese Alignment of Large Language Models

    Authors: Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Zhuoer Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, Jie Tang

    Abstract: Alignment has become a critical step for instruction-tuned Large Language Models (LLMs) to become helpful assistants. However, effective evaluation of alignment for emerging Chinese LLMs is still significantly lacking, calling for real-scenario grounded, open-ended, challenging and automatic evaluations tailored for alignment. To fill in this gap, we introduce AlignBench, a comprehensive multi-dim…

    Submitted 5 December, 2023; v1 submitted 30 November, 2023; originally announced November 2023.

  10. arXiv:2311.18702  [pdf, other]

    cs.CL cs.AI

    CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation

    Authors: Pei Ke, Bosi Wen, Zhuoer Feng, Xiao Liu, Xuanyu Lei, Jiale Cheng, Shengyuan Wang, Aohan Zeng, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang

    Abstract: Since the natural language processing (NLP) community started to make large language models (LLMs) act as a critic to evaluate the quality of generated texts, most of the existing works train a critique generation model on the evaluation data labeled by GPT-4's direct prompting. We observe that these models lack the ability to generate informative critiques in both pointwise grading and pairwise c…

    Submitted 26 June, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

    Comments: Accepted by ACL 2024 (Main Conference)

  11. arXiv:2311.17391  [pdf, other]

    cs.CL

    Unveiling the Implicit Toxicity in Large Language Models

    Authors: Jiaxin Wen, Pei Ke, Hao Sun, Zhexin Zhang, Chengfei Li, Jinfeng Bai, Minlie Huang

    Abstract: The open-endedness of large language models (LLMs) combined with their impressive capabilities may lead to new safety issues when being exploited for malicious use. While recent studies primarily focus on probing toxic outputs that can be easily detected with existing toxicity classifiers, we show that LLMs can generate diverse implicit toxic outputs that are exceptionally difficult to detect via…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 Main Conference

  12. arXiv:2311.09096  [pdf, other]

    cs.CL

    Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization

    Authors: Zhexin Zhang, Junxiao Yang, Pei Ke, Fei Mi, Hongning Wang, Minlie Huang

    Abstract: While significant attention has been dedicated to exploiting weaknesses in LLMs through jailbreaking attacks, there remains a paucity of effort in defending against these attacks. We point out a pivotal factor contributing to the success of jailbreaks: the intrinsic conflict between the goals of being helpful and ensuring safety. Accordingly, we propose to integrate goal prioritization at both tra…

    Submitted 12 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: ACL 2024 Main Conference

  13. arXiv:2311.04155  [pdf, other]

    cs.CL

    Black-Box Prompt Optimization: Aligning Large Language Models without Model Training

    Authors: Jiale Cheng, Xiao Liu, Kehan Zheng, Pei Ke, Hongning Wang, Yuxiao Dong, Jie Tang, Minlie Huang

    Abstract: Large language models (LLMs) have shown impressive success in various applications. However, these models are often not well aligned with human intents, which calls for additional treatments on them; that is, the alignment problem. To make LLMs better follow user instructions, existing alignment methods primarily focus on further training them. However, the extra training of LLMs is usually expens…

    Submitted 21 June, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: Accepted to ACL 2024

  14. arXiv:2310.01041  [pdf, other]

    cs.CL

    Language Model Decoding as Direct Metrics Optimization

    Authors: Haozhe Ji, Pei Ke, Hongning Wang, Minlie Huang

    Abstract: Despite the remarkable advances in language modeling, current mainstream decoding methods still struggle to generate texts that align with human texts across different aspects. In particular, sampling-based methods produce less-repetitive texts which are often disjunctive in discourse, while search-based methods maintain topic coherence at the cost of increased repetition. Overall, these methods f…

    Submitted 5 June, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: 33 pages, 3 figures

    Journal ref: The Twelfth International Conference on Learning Representations (ICLR 2024)

  15. arXiv:2307.06869  [pdf, other]

    cs.CL cs.AI

    DecompEval: Evaluating Generated Texts as Unsupervised Decomposed Question Answering

    Authors: Pei Ke, Fei Huang, Fei Mi, Yasheng Wang, Qun Liu, Xiaoyan Zhu, Minlie Huang

    Abstract: Existing evaluation metrics for natural language generation (NLG) tasks face challenges in generalization ability and interpretability. Specifically, most well-performing metrics must be trained on evaluation datasets of specific NLG tasks and evaluation dimensions, which may cause over-fitting to task-specific datasets. Furthermore, existing metrics only provide an evaluation scor…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: Accepted by ACL 2023 (Main Conference)

  16. arXiv:2306.03350  [pdf, other]

    cs.CL

    Click: Controllable Text Generation with Sequence Likelihood Contrastive Learning

    Authors: Chujie Zheng, Pei Ke, Zheng Zhang, Minlie Huang

    Abstract: It has always been an important yet challenging problem to control language models to avoid generating texts with undesirable attributes, such as toxic language and unnatural repetition. We introduce Click for controllable text generation, which needs no modification to the model architecture and facilitates out-of-the-box use of trained models. It employs a contrastive loss on sequence likelihood…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Findings of ACL 2023

  17. arXiv:2304.11791  [pdf, other]

    cs.CL

    Directed Acyclic Transformer Pre-training for High-quality Non-autoregressive Text Generation

    Authors: Fei Huang, Pei Ke, Minlie Huang

    Abstract: Non-AutoRegressive (NAR) text generation models have drawn much attention because of their significantly faster decoding speed and good generation quality in machine translation. However, in a wider range of text generation tasks, existing NAR models lack proper pre-training, making them still far behind the pre-trained autoregressive models. In this paper, we propose Pre-trained Directed Acyclic…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: Accepted at Transactions of the Association for Computational Linguistics

  18. arXiv:2302.13344  [pdf, other]

    cs.CL

    Tailoring Language Generation Models under Total Variation Distance

    Authors: Haozhe Ji, Pei Ke, Zhipeng Hu, Rongsheng Zhang, Minlie Huang

    Abstract: The standard paradigm of neural language generation adopts maximum likelihood estimation (MLE) as the optimizing method. From a distributional view, MLE in fact minimizes the Kullback-Leibler divergence (KLD) between the distribution of the real data and that of the model. However, this approach forces the model to distribute non-zero (sometimes large) probability mass to all training samples rega…

    Submitted 26 February, 2023; originally announced February 2023.

    Comments: Published in ICLR 2023 (notable-top-5%)

    Journal ref: International Conference on Learning Representations (ICLR 2023)

  19. arXiv:2212.12347  [pdf, other]

    cs.LO

    Technical Report: Automating Vehicle SOA Threat Analysis using a Model-Based Methodology

    Authors: Yuri Gil Dantas, Simon Barner, Pei Ke, Vivek Nigam, Ulrich Schoepp

    Abstract: While the adoption of Service-Oriented Architectures (SOA) eases the implementation of features such as autonomous driving and over-the-air updates, it also increases the vehicle's exposure to attacks that may place road-users in harm. To address this problem, standards (ISO 21434/UNECE) expect manufacturers to produce security arguments and evidence by carrying out appropriate threat analysis. As…

    Submitted 23 December, 2022; originally announced December 2022.

  20. arXiv:2210.09175  [pdf, other]

    cs.CL

    Learning Instructions with Unlabeled Data for Zero-Shot Cross-Task Generalization

    Authors: Yuxian Gu, Pei Ke, Xiaoyan Zhu, Minlie Huang

    Abstract: Training language models to learn from human instructions for zero-shot cross-task generalization has attracted much attention in NLP communities. Recently, instruction tuning (IT), which fine-tunes a pre-trained language model on a massive collection of tasks described via human-crafted instructions, has been shown to be effective in instruction learning for unseen tasks. However, IT relies on a large am…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: Accepted by the main conference of EMNLP 2022

  21. arXiv:2206.02712  [pdf, other]

    cs.CL

    Curriculum-Based Self-Training Makes Better Few-Shot Learners for Data-to-Text Generation

    Authors: Pei Ke, Haozhe Ji, Zhenyu Yang, Yi Huang, Junlan Feng, Xiaoyan Zhu, Minlie Huang

    Abstract: Despite the success of text-to-text pre-trained models in various natural language generation (NLG) tasks, the generation performance is largely restricted by the number of labeled data in downstream tasks, particularly in data-to-text generation tasks. Existing works mostly utilize abundant unlabeled structured data to conduct unsupervised pre-training for task adaption, which fail to model the c…

    Submitted 6 June, 2022; originally announced June 2022.

    Comments: Accepted by IJCAI 2022

  22. arXiv:2204.00862  [pdf, other]

    cs.CL cs.AI

    CTRLEval: An Unsupervised Reference-Free Metric for Evaluating Controlled Text Generation

    Authors: Pei Ke, Hao Zhou, Yankai Lin, Peng Li, Jie Zhou, Xiaoyan Zhu, Minlie Huang

    Abstract: Existing reference-free metrics have obvious limitations for evaluating controlled text generation models. Unsupervised metrics can only provide a task-agnostic evaluation result which correlates weakly with human judgments, whereas supervised ones may overfit task-specific data with poor generalization ability to other datasets. In this paper, we propose an unsupervised reference-free metric call…

    Submitted 5 December, 2022; v1 submitted 2 April, 2022; originally announced April 2022.

    Comments: Accepted by ACL 2022 (Main Conference)

  23. EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training

    Authors: Yuxian Gu, Jiaxin Wen, Hao Sun, Yi Song, Pei Ke, Chujie Zheng, Zheng Zhang, Jianzhu Yao, Lei Liu, Xiaoyan Zhu, Minlie Huang

    Abstract: Large-scale pre-training has shown remarkable performance in building open-domain dialogue systems. However, previous works mainly focus on showing and evaluating the conversational performance of the released dialogue model, ignoring the discussion of some key factors towards a powerful human-like chatbot, especially in Chinese scenarios. In this paper, we conduct extensive experiments to investi…

    Submitted 21 October, 2023; v1 submitted 17 March, 2022; originally announced March 2022.

    Comments: Machine Intelligence Research. https://link.springer.com/article/10.1007/s11633-022-1387-3 . 12 pages, 5 figures. The code and pre-trained models are publicly available at https://github.com/thu-coai/EVA

  24. arXiv:2202.13587  [pdf, other]

    cs.CL cs.AI

    Rethinking and Refining the Distinct Metric

    Authors: Siyang Liu, Sahand Sabour, Yinhe Zheng, Pei Ke, Xiaoyan Zhu, Minlie Huang

    Abstract: Distinct-$n$ score (Li et al., 2016) is a widely used automatic metric for evaluating diversity in language generation tasks. However, we observed that the original approach for calculating distinct scores has evident biases that tend to assign higher penalties to longer sequences. We refine the calculation of distinct scores by scaling the number of distinct tokens based on their expectations. We prov…

    Submitted 3 April, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: 4 pages, to be published at ACL2022

    ACM Class: I.2.7

  25. arXiv:2108.01547  [pdf, other]

    cs.CL cs.AI

    EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative Pre-Training

    Authors: Hao Zhou, Pei Ke, Zheng Zhang, Yuxian Gu, Yinhe Zheng, Chujie Zheng, Yida Wang, Chen Henry Wu, Hao Sun, Xiaocong Yang, Bosi Wen, Xiaoyan Zhu, Minlie Huang, Jie Tang

    Abstract: Although pre-trained language models have remarkably enhanced the generation ability of dialogue systems, open-domain Chinese dialogue systems are still limited by the dialogue data and the model size compared with English ones. In this paper, we propose EVA, a Chinese dialogue system that contains the largest Chinese pre-trained dialogue model with 2.8B parameters. To build this model, we collect…

    Submitted 3 August, 2021; originally announced August 2021.

    Comments: 8 pages, 4 figures

  26. arXiv:2106.10715  [pdf, other]

    cs.CL

    CPM-2: Large-scale Cost-effective Pre-trained Language Models

    Authors: Zhengyan Zhang, Yuxian Gu, Xu Han, Shengqi Chen, Chaojun Xiao, Zhenbo Sun, Yuan Yao, Fanchao Qi, Jian Guan, Pei Ke, Yanzheng Cai, Guoyang Zeng, Zhixing Tan, Zhiyuan Liu, Minlie Huang, Wentao Han, Yang Liu, Xiaoyan Zhu, Maosong Sun

    Abstract: In recent years, the size of pre-trained language models (PLMs) has grown by leaps and bounds. However, efficiency issues of these large-scale PLMs limit their utilization in real-world scenarios. We present a suite of cost-effective techniques for the use of PLMs to deal with the efficiency issues of pre-training, fine-tuning, and inference. (1) We introduce knowledge inheritance to accelerate th…

    Submitted 24 June, 2021; v1 submitted 20 June, 2021; originally announced June 2021.

  27. arXiv:2106.10502  [pdf, other]

    cs.CL cs.AI

    JointGT: Graph-Text Joint Representation Learning for Text Generation from Knowledge Graphs

    Authors: Pei Ke, Haozhe Ji, Yu Ran, Xin Cui, Liwei Wang, Linfeng Song, Xiaoyan Zhu, Minlie Huang

    Abstract: Existing pre-trained models for knowledge-graph-to-text (KG-to-text) generation simply fine-tune text-to-text pre-trained models such as BART or T5 on KG-to-text datasets, which largely ignore the graph structure during encoding and lack elaborate pre-training tasks to explicitly model graph-text alignments. To tackle these problems, we propose a graph-text joint representation learning model call…

    Submitted 19 June, 2021; originally announced June 2021.

    Comments: ACL 2021 (Findings)

  28. arXiv:2106.03065  [pdf, other]

    cs.CL

    Semantic-Enhanced Explainable Finetuning for Open-Domain Dialogues

    Authors: Yinhe Zheng, Yida Wang, Pei Ke, Zhenyu Yang, Minlie Huang

    Abstract: This paper proposes to combine pretrained language models with the modular dialogue paradigm for open-domain dialogue modeling. Our method, semantic-enhanced finetuning, instantiates conversation understanding, planning, and response generation as a language model finetuning task. At inference, we disentangle semantic and token variations by specifying sampling methods and constraints for each modu…

    Submitted 23 May, 2022; v1 submitted 6 June, 2021; originally announced June 2021.

    Comments: Under review

  29. arXiv:2012.00413  [pdf, other]

    cs.CL

    CPM: A Large-scale Generative Chinese Pre-trained Language Model

    Authors: Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun

    Abstract: Pre-trained Language Models (PLMs) have proven to be beneficial for various downstream NLP tasks. Recently, GPT-3, with 175 billion parameters and 570GB training data, drew a lot of attention due to the capacity of few-shot (even zero-shot) learning. However, applying GPT-3 to address Chinese NLP tasks is still challenging, as the training corpus of GPT-3 is primarily English, and the parameters a…

    Submitted 1 December, 2020; originally announced December 2020.

  30. arXiv:2011.09013  [pdf]

    physics.ao-ph cs.LG

    Estimates of daily ground-level NO2 concentrations in China based on big data and machine learning approaches

    Authors: Xinyu Dou, Cuijuan Liao, Hengqi Wang, Ying Huang, Ying Tu, Xiaomeng Huang, Yiran Peng, Biqing Zhu, Jianguang Tan, Zhu Deng, Nana Wu, Taochun Sun, Piyu Ke, Zhu Liu

    Abstract: Nitrogen dioxide (NO2) is one of the most important atmospheric pollutants. However, current ground-level NO2 concentration data lack either high-resolution coverage or full nationwide coverage, due to the poor quality of source data and the computing power of the models. To our knowledge, this study is the first to estimate the ground-level NO2 concentration in China with national cover…

    Submitted 17 November, 2020; originally announced November 2020.

  31. arXiv:2009.11753  [pdf, other]

    cs.CL

    Generating Commonsense Explanation by Extracting Bridge Concepts from Reasoning Paths

    Authors: Haozhe Ji, Pei Ke, Shaohan Huang, Furu Wei, Minlie Huang

    Abstract: Commonsense explanation generation aims to empower the machine's sense-making capability by generating plausible explanations to statements against commonsense. While this task is easy for humans, the machine still struggles to generate reasonable and informative explanations. In this work, we propose a method that first extracts the underlying concepts which serve as bridges in the re…

    Submitted 24 September, 2020; originally announced September 2020.

    Comments: Accepted by AACL-IJCNLP 2020

  32. arXiv:2009.11692  [pdf, other]

    cs.CL

    Language Generation with Multi-Hop Reasoning on Commonsense Knowledge Graph

    Authors: Haozhe Ji, Pei Ke, Shaohan Huang, Furu Wei, Xiaoyan Zhu, Minlie Huang

    Abstract: Despite the success of generative pre-trained language models on a series of text generation tasks, they still suffer in cases where reasoning over underlying commonsense knowledge is required during generation. Existing approaches that integrate commonsense knowledge into generative pre-trained language models simply transfer relational knowledge by post-training on individual knowledge triples w…

    Submitted 24 September, 2020; originally announced September 2020.

    Comments: accepted by EMNLP 2020

  33. arXiv:2008.03946  [pdf, ps, other]

    cs.CL

    A Large-Scale Chinese Short-Text Conversation Dataset

    Authors: Yida Wang, Pei Ke, Yinhe Zheng, Kaili Huang, Yong Jiang, Xiaoyan Zhu, Minlie Huang

    Abstract: The advancements of neural dialogue generation models show promising results on modeling short-text conversations. However, training such models usually needs a large-scale high-quality dialogue corpus, which is hard to access. In this paper, we present a large-scale cleaned Chinese conversation dataset, LCCC, which contains a base version (6.8 million dialogues) and a large version (12.0 million d…

    Submitted 26 April, 2022; v1 submitted 10 August, 2020; originally announced August 2020.

    Comments: Accepted by NLPCC 2020 (Best Student Paper)

  34. arXiv:2002.00583  [pdf, other]

    cs.CL cs.LG

    CoTK: An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation

    Authors: Fei Huang, Dazhen Wan, Zhihong Shao, Pei Ke, Jian Guan, Yilin Niu, Xiaoyan Zhu, Minlie Huang

    Abstract: In text generation evaluation, many practical issues, such as inconsistent experimental settings and metric implementations, are often ignored but lead to unfair evaluation and untenable conclusions. We present CoTK, an open-source toolkit aiming to support fast development and fair evaluation of text generation. In model development, CoTK helps handle the cumbersome issues, such as data processin…

    Submitted 3 February, 2020; originally announced February 2020.

    Comments: Submitting to ACL2020 demo

    ACM Class: I.2.7

  35. arXiv:1911.06670  [pdf, ps, other]

    cs.IT math.CO

    New Successor Rules to Efficiently Produce Exponentially Many Binary de Bruijn Sequences

    Authors: Zuling Chang, Martianus Frederic Ezerman, Pinhui Ke, Qiang Wang

    Abstract: We put forward new general criteria to design successor rules that generate binary de Bruijn sequences. Prior fast algorithms based on successor rules in the literature are then shown to be special instances. We implemented the criteria to join the cycles generated by a number of simple feedback shift registers (FSRs) of order $n$. These include the pure cycling register (PCR) and the pure summing…

    Submitted 5 July, 2021; v1 submitted 15 November, 2019; originally announced November 2019.

    Comments: in submission

  36. arXiv:1911.02493  [pdf, other]

    cs.CL

    SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge

    Authors: Pei Ke, Haozhe Ji, Siyang Liu, Xiaoyan Zhu, Minlie Huang

    Abstract: Most of the existing pre-trained language representation models neglect to consider the linguistic knowledge of texts, which can promote language understanding in NLP tasks. To benefit the downstream tasks in sentiment analysis, we propose a novel language representation model called SentiLARE, which introduces word-level linguistic knowledge including part-of-speech tag and sentiment polarity (in…

    Submitted 24 September, 2020; v1 submitted 6 November, 2019; originally announced November 2019.

    Comments: Accepted by EMNLP 2020 (Main Conference)

  37. arXiv:1908.07195  [pdf, other]

    cs.CL cs.LG

    ARAML: A Stable Adversarial Training Framework for Text Generation

    Authors: Pei Ke, Fei Huang, Minlie Huang, Xiaoyan Zhu

    Abstract: Most of the existing generative adversarial networks (GAN) for text generation suffer from the instability of reinforcement learning training algorithms such as policy gradient, leading to unstable performance. To tackle this problem, we propose a novel framework called Adversarial Reward Augmented Maximum Likelihood (ARAML). During adversarial training, the discriminator assigns rewards to sample…

    Submitted 20 August, 2019; originally announced August 2019.

    Comments: Accepted by EMNLP 2019

    MSC Class: 68T50

  38. arXiv:1803.03339  [pdf, ps, other]

    cs.CR math.NT

    On $k$-error linear complexity of pseudorandom binary sequences derived from Euler quotients

    Authors: Zhixiong Chen, Vladimir Edemskiy, Pinhui Ke, Chenhuang Wu

    Abstract: We investigate the $k$-error linear complexity of pseudorandom binary sequences of period $p^{\mathfrak{r}}$ derived from the Euler quotients modulo $p^{\mathfrak{r}-1}$, a power of an odd prime $p$ for $\mathfrak{r}\geq 2$. When $\mathfrak{r}=2$, this is just the case of polynomial quotients (including Fermat quotients) modulo $p$, which has been studied in an earlier work of Chen, Niu and Wu. In…

    Submitted 15 March, 2018; v1 submitted 8 March, 2018; originally announced March 2018.

    MSC Class: 94A55; 94A60; 65C10

    Journal ref: Advances in Mathematics of Communications,2018, 12(4)

  39. arXiv:1712.08886  [pdf, ps, other]

    cs.CR math.NT

    A further study on the linear complexity of new binary cyclotomic sequence of length $p^r$

    Authors: Zhifan Ye, Pinhui Ke, Chenhuang Wu

    Abstract: Recently, a conjecture on the linear complexity of a new class of generalized cyclotomic binary sequences of period $p^r$ was proposed by Z. Xiao et al. (Des. Codes Cryptogr., DOI 10.1007/s10623-017-0408-7). Later, for the case $f$ being the form $2^r$ with $r\ge 1$, Vladimir Edemskiy proved the conjecture (arXiv:1712.03947). In this paper, under the assumption of $2^{p-1} \not\equiv 1 \bmod p^2$…

    Submitted 15 March, 2018; v1 submitted 24 December, 2017; originally announced December 2017.

  40. arXiv:1711.06063  [pdf, ps, other]

    cs.CR math.NT

    On error linear complexity of new generalized cyclotomic binary sequences of period $p^2$

    Authors: Chenhuang Wu, Chunxiang Xu, Zhixiong Chen, Pinhui Ke

    Abstract: We consider the $k$-error linear complexity of a new binary sequence of period $p^2$, proposed in the recent paper "New generalized cyclotomic binary sequences of period $p^2$", by Z. Xiao et al., who calculated the linear complexity of the sequences (Designs, Codes and Cryptography, 2017, https://doi.org/10.1007/s10623-017-0408-7). More exactly, we determine the values of $k$-error linear complex…

    Submitted 22 April, 2018; v1 submitted 16 November, 2017; originally announced November 2017.

    MSC Class: 94A60; 11K45; 11L40