
Showing 1–4 of 4 results for author: Rzepka, R

Searching in archive cs.
  1. arXiv:2407.03963  [pdf, other]

    cs.CL cs.AI

    LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

    Authors: LLM-jp: Akiko Aizawa, Eiji Aramaki, Bowen Chen, Fei Cheng, Hiroyuki Deguchi, Rintaro Enomoto, Kazuki Fujii, Kensuke Fukumoto, Takuya Fukushima, Namgi Han, Yuto Harada, Chikara Hashimoto, Tatsuya Hiraoka, Shohei Hisada, Sosuke Hosokawa, Lu Jie, Keisuke Kamata, Teruhito Kanazawa, Hiroki Kanezashi, Hiroshi Kataoka, Satoru Katsumata, Daisuke Kawahara, Seiya Kawano, et al. (57 additional authors not shown)

    Abstract: This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its…

    Submitted 4 July, 2024; originally announced July 2024.

  2. Speciesist Language and Nonhuman Animal Bias in English Masked Language Models

    Authors: Masashi Takeshita, Rafal Rzepka, Kenji Araki

    Abstract: Various existing studies have analyzed what social biases are inherited by NLP models. These biases may directly or indirectly harm people, therefore previous studies have focused only on human attributes. However, until recently no research on social biases in NLP regarding nonhumans existed. In this paper, we analyze biases to nonhuman animals, i.e. speciesist bias, inherent in English Masked La…

    Submitted 12 August, 2022; v1 submitted 9 March, 2022; originally announced March 2022.

    Comments: This paper is an accepted manuscript for publication in Information Processing & Management

  3. arXiv:2203.02116  [pdf]

    cs.CL cs.AI cs.LG

    In the Service of Online Order: Tackling Cyber-Bullying with Machine Learning and Affect Analysis

    Authors: Michal Ptaszynski, Pawel Dybala, Tatsuaki Matsuba, Fumito Masui, Rafal Rzepka, Kenji Araki, Yoshio Momouchi

    Abstract: One of the burning problems lately in Japan has been cyber-bullying, or slandering and bullying people online. The problem has been especially noticed on unofficial Web sites of Japanese schools. Volunteers consisting of school personnel and PTA (Parent-Teacher Association) members have started Online Patrol to spot malicious contents within Web forums and blogs. In practise, Online Patrol assumes…

    Submitted 3 March, 2022; originally announced March 2022.

    Comments: 12 pages, 11 tables, 6 figures

    Journal ref: International Journal of Computational Linguistics Research, Vol. 1, Issue 3, pp. 135-154, 2010

  4. arXiv:2010.12077  [pdf, other]

    cs.CL

    Summarizing Utterances from Japanese Assembly Minutes using Political Sentence-BERT-based Method for QA Lab-PoliInfo-2 Task of NTCIR-15

    Authors: Daiki Shirafuji, Hiromichi Kameya, Rafal Rzepka, Kenji Araki

    Abstract: There are many discussions held during political meetings, and a large number of utterances for various topics is included in their transcripts. We need to read all of them if we want to follow speakers' intentions or opinions about a given topic. To avoid such a costly and time-consuming process to grasp often longish discussions, NLP researchers work on generating concise summaries of utterance…

    Submitted 22 October, 2020; originally announced October 2020.

    Comments: 8 pages, 1 figure, 8 tables, NTCIR-15 conference