
Showing 1–10 of 10 results for author: Oseki, Y

  1. arXiv:2407.03963

    cs.CL cs.AI

    LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

    Authors: LLM-jp, Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, et al. (57 additional authors not shown)

    Abstract: This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2402.12691

    cs.CL

    Tree-Planted Transformers: Unidirectional Transformer Language Models with Implicit Syntactic Supervision

    Authors: Ryo Yoshida, Taiga Someya, Yohei Oseki

    Abstract: Syntactic Language Models (SLMs) can be trained efficiently to reach relatively high performance; however, they have trouble with inference efficiency due to the explicit generation of syntactic structures. In this paper, we propose a new method dubbed tree-planting: instead of explicitly generating syntactic structures, we "plant" trees into attention weights of unidirectional Transformer LMs to…

    Submitted 6 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024 (Findings)

  3. arXiv:2402.12363

    cs.CL

    Emergent Word Order Universals from Cognitively-Motivated Language Models

    Authors: Tatsuki Kuribayashi, Ryo Ueda, Ryo Yoshida, Yohei Oseki, Ted Briscoe, Timothy Baldwin

    Abstract: The world's languages exhibit certain so-called typological or implicational universals; for example, Subject-Object-Verb (SOV) languages typically use postpositions. Explaining the source of such biases is a key goal of linguistics. We study word-order universals through a computational simulation with language models (LMs). Our experiments show that typologically-typical word orders tend to have…

    Submitted 7 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

    Comments: Accepted by ACL 2024 main conference, 22 pages

  4. arXiv:2311.07484

    cs.CL cs.AI

    Psychometric Predictive Power of Large Language Models

    Authors: Tatsuki Kuribayashi, Yohei Oseki, Timothy Baldwin

    Abstract: Instruction tuning aligns the response of large language models (LLMs) with human preferences. Despite such efforts in human–LLM alignment, we find that instruction tuning does not always make LLMs human-like from a cognitive modeling perspective. More specifically, next-word probabilities estimated by instruction-tuned LLMs are often worse at simulating human reading behavior than those estimate…

    Submitted 15 April, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: 23 pages; Findings of NAACL 2024

  5. arXiv:2309.12676

    cs.CL

    JCoLA: Japanese Corpus of Linguistic Acceptability

    Authors: Taiga Someya, Yushi Sugimoto, Yohei Oseki

    Abstract: Neural language models have exhibited outstanding performance in a range of downstream tasks. However, there is limited understanding regarding the extent to which these models internalize syntactic knowledge, so that various datasets have recently been constructed to facilitate syntactic evaluation of language models across languages. In this paper, we introduce JCoLA (Japanese Corpus of Linguist…

    Submitted 22 September, 2023; originally announced September 2023.

  6. arXiv:2210.12958

    cs.CL

    Composition, Attention, or Both?

    Authors: Ryo Yoshida, Yohei Oseki

    Abstract: In this paper, we propose a novel architecture called Composition Attention Grammars (CAGs) that recursively compose subtrees into a single vector representation with a composition function, and selectively attend to previous structural information with a self-attention mechanism. We investigate whether these components -- the composition function and the self-attention mechanism -- can both induc…

    Submitted 10 May, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: Accepted by Findings of EMNLP 2022

  7. arXiv:2205.11463

    cs.CL

    Context Limitations Make Neural Language Models More Human-Like

    Authors: Tatsuki Kuribayashi, Yohei Oseki, Ana Brassard, Kentaro Inui

    Abstract: Language models (LMs) have been used in cognitive modeling as well as engineering studies -- they compute information-theoretic complexity metrics that simulate humans' cognitive load during reading. This study highlights a limitation of modern neural LMs as the model of choice for this purpose: there is a discrepancy between their context access capacities and that of humans. Our results showed t…

    Submitted 1 November, 2022; v1 submitted 23 May, 2022; originally announced May 2022.

    Comments: Accepted by EMNLP 2022 (main long)

  8. arXiv:2109.04939

    cs.CL

    Modeling Human Sentence Processing with Left-Corner Recurrent Neural Network Grammars

    Authors: Ryo Yoshida, Hiroshi Noji, Yohei Oseki

    Abstract: In computational linguistics, it has been shown that hierarchical structures make language models (LMs) more human-like. However, the previous literature has been agnostic about a parsing strategy of the hierarchical models. In this paper, we investigated whether hierarchical structures make LMs more human-like, and if so, which parsing strategy is most cognitively plausible. In order to address t…

    Submitted 5 October, 2023; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: Accepted by EMNLP 2021

  9. arXiv:2106.01229

    cs.CL

    Lower Perplexity is Not Always Human-Like

    Authors: Tatsuki Kuribayashi, Yohei Oseki, Takumi Ito, Ryo Yoshida, Masayuki Asahara, Kentaro Inui

    Abstract: In computational psycholinguistics, various language models have been evaluated against human reading behavior (e.g., eye movement) to build human-like computational models. However, most previous efforts have focused almost exclusively on English, despite the recent trend towards linguistic universal within the general community. In order to fill the gap, this paper investigates whether the estab…

    Submitted 1 November, 2022; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: Accepted by ACL 2021

  10. arXiv:2105.14822

    cs.CL

    Effective Batching for Recurrent Neural Network Grammars

    Authors: Hiroshi Noji, Yohei Oseki

    Abstract: As a language model that integrates traditional symbolic operations and flexible neural representations, recurrent neural network grammars (RNNGs) have attracted great attention from both scientific and engineering perspectives. However, RNNGs are known to be harder to scale due to the difficulty of batched training. In this paper, we propose effective batching for RNNGs, where every operation is…

    Submitted 31 May, 2021; originally announced May 2021.

    Comments: Findings of ACL: ACL-IJCNLP 2021