Zum Hauptinhalt springen

Showing 1–3 of 3 results for author: Ferdinan, T

Searching in archive cs. Search in all archives.
.
  1. arXiv:2406.11275  [pdf, other

    cs.CL

    Self-training Large Language Models through Knowledge Detection

    Authors: Wei Jie Yeo, Teddy Ferdinan, Przemyslaw Kazienko, Ranjan Satapathy, Erik Cambria

    Abstract: Large language models (LLMs) often necessitate extensive labeled datasets and training compute to achieve impressive performance across downstream tasks. This paper explores a self-training paradigm, where the LLM autonomously curates its own labels and selectively trains on unknown data samples identified through a reference-free consistency method. Empirical evaluations demonstrate significant i… ▽ More

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Under review

  2. arXiv:2404.05892  [pdf, other

    cs.CL cs.AI

    Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

    Authors: Bo Peng, Daniel Goldstein, Quentin Anthony, Alon Albalak, Eric Alcaide, Stella Biderman, Eugene Cheah, Xingjian Du, Teddy Ferdinan, Haowen Hou, Przemysław Kazienko, Kranthi Kiran GV, Jan Kocoń, Bartłomiej Koptyra, Satyapriya Krishna, Ronald McClelland Jr., Niklas Muennighoff, Fares Obeid, Atsushi Saito, Guangyu Song, Haoqin Tu, Stanisław Woźniak, Ruichong Zhang, Bingchen Zhao, Qihang Zhao , et al. (3 additional authors not shown)

    Abstract: We present Eagle (RWKV-5) and Finch (RWKV-6), sequence models improving upon the RWKV (RWKV-4) architecture. Our architectural design advancements include multi-headed matrix-valued states and a dynamic recurrence mechanism that improve expressivity while maintaining the inference efficiency characteristics of RNNs. We introduce a new multilingual corpus with 1.12 trillion tokens and a fast tokeni… ▽ More

    Submitted 10 April, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

  3. arXiv:2402.09147  [pdf, other

    cs.AI

    Into the Unknown: Self-Learning Large Language Models

    Authors: Teddy Ferdinan, Jan Kocoń, Przemysław Kazienko

    Abstract: We address the main problem of self-learning LLM: the question of what to learn. We propose a self-learning LLM framework that enables an LLM to independently learn previously unknown knowledge through selfassessment of their own hallucinations. Using the hallucination score, we introduce a new concept of Points in the Unknown (PiUs), along with one extrinsic and three intrinsic methods for automa… ▽ More

    Submitted 4 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: 16 pages, 12 figures, 4 tables, submitted to ACL SRW 2024