Fine-tuning Language Models with Generative Adversarial Reward Modelling
Authors:
Zhang Ze Yu,
Lau Jia Jaw,
Zhang Hui,
Bryan Kian Hsiang Low
Abstract:
Reinforcement Learning with Human Feedback (RLHF) has been demonstrated to significantly enhance the performance of large language models (LLMs) by aligning their outputs with desired human values through instruction tuning. However, RLHF is constrained by the expertise and productivity limitations of human evaluators. A response to this downside is to fall back to supervised fine-tuning (SFT) wit…
▽ More
Reinforcement Learning with Human Feedback (RLHF) has been demonstrated to significantly enhance the performance of large language models (LLMs) by aligning their outputs with desired human values through instruction tuning. However, RLHF is constrained by the expertise and productivity limitations of human evaluators. A response to this downside is to fall back to supervised fine-tuning (SFT) with additional carefully selected expert demonstrations. However, while this method has been proven to be effective, it invariably also leads to increased human-in-the-loop overhead. In this study, we propose another alternative approach: Reinforcement Learning with Generative Adversarial Feedback (RLGAF) to RLHF and SFT, which uses a generative adversarial training style to enable the LLMs to learn useful human expert demonstrations without being directly exposed to the training examples, thus enabling good generalization capabilities while preserving sample efficiency. Our preliminary findings indicate that RLGAF can help align LLMs outputs with competitive performance against RLHF and SFT, while not suffering from their respective inherent restrictions, suggesting promising avenues for further research on automating AI alignment.
△ Less
Submitted 5 March, 2024; v1 submitted 9 May, 2023;
originally announced May 2023.
Novel Quantum Information Processing Methods and Investigation
Authors:
Zhang Ze Yu
Abstract:
Quantum information processing and its subfield, quantum image processing, are rapidly growing fields as a result of advancements in the practicality of quantum mechanics. In this paper, we propose a quantum algorithm for processing information, such as one-dimensional time series and two-dimensional images, in the frequency domain. The information of interest is encoded into the magnitude of prob…
▽ More
Quantum information processing and its subfield, quantum image processing, are rapidly growing fields as a result of advancements in the practicality of quantum mechanics. In this paper, we propose a quantum algorithm for processing information, such as one-dimensional time series and two-dimensional images, in the frequency domain. The information of interest is encoded into the magnitude of probability amplitude or the coefficient of each basis state. The oracle for filtering operates based on postselection results, and its explicit circuit design is presented. This oracle is versatile enough to perform all basic filtering, including high pass, low pass, band pass, band stop, and many other processing techniques. Finally, we present two novel schemes for transposing matrices in this paper. They use similar encoding rules but with deliberate choices in terms of selecting basis states. These schemes could potentially be useful for other quantum information processing tasks, such as edge detection. The proposed techniques are implemented on the IBM Qiskit quantum simulator. Some results are compared with traditional information processing results to verify their correctness and are presented in this paper.
△ Less
Submitted 10 May, 2023;
originally announced May 2023.