-
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs
Authors:
LLM-jp,
Akiko Aizawa,
Eiji Aramaki,
Bowen Chen,
Fei Cheng,
Hiroyuki Deguchi,
Rintaro Enomoto,
Kazuki Fujii,
Kensuke Fukumoto,
Takuya Fukushima,
Namgi Han,
Yuto Harada,
Chikara Hashimoto,
Tatsuya Hiraoka,
Shohei Hisada,
Sosuke Hosokawa,
Lu Jie,
Keisuke Kamata,
Teruhito Kanazawa,
Hiroki Kanezashi,
Hiroshi Kataoka,
Satoru Katsumata,
Daisuke Kawahara,
Seiya Kawano
, et al. (57 additional authors not shown)
Abstract:
This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its activities, and technical reports on the LLMs developed by LLM-jp. For the latest activities, visit https://llm-jp.nii.ac.jp/en/.
Submitted 4 July, 2024;
originally announced July 2024.
-
Analyzing Social Biases in Japanese Large Language Models
Authors:
Hitomi Yanaka,
Namgi Han,
Ryoma Kumon,
Jie Lu,
Masashi Takeshita,
Ryo Sekizawa,
Taisei Kato,
Hiromi Arai
Abstract:
With the development of Large Language Models (LLMs), social biases in LLMs have become a crucial issue. While various benchmarks for social biases have been provided across languages, the extent to which Japanese LLMs exhibit social biases has not been fully investigated. In this study, we construct the Japanese Bias Benchmark dataset for Question Answering (JBBQ) based on the English bias benchmark BBQ, and analyze social biases in Japanese LLMs. The results show that while instruction-tuning improves the accuracy of current Japanese LLMs on JBBQ, their bias scores become larger. In addition, augmenting their prompts with a warning about social biases reduces the effect of biases in some models.
Submitted 5 June, 2024; v1 submitted 4 June, 2024;
originally announced June 2024.
-
Towards Theory-based Moral AI: Moral AI with Aggregating Models Based on Normative Ethical Theory
Authors:
Masashi Takeshita,
Rzepka Rafal,
Kenji Araki
Abstract:
Moral AI has been studied in the fields of philosophy and artificial intelligence. Although most existing studies are only theoretical, recent developments in AI have made it increasingly necessary to implement AI with morality. On the other hand, humans face moral uncertainty: they do not know what is morally right. In this paper, we implement the Maximizing Expected Choiceworthiness (MEC) algorithm, which aggregates the outputs of models based on three theories of normative ethics to generate the most appropriate output. MEC is a method for making appropriate moral judgments under moral uncertainty. Our experimental results suggest that the output of MEC correlates to some extent with commonsense morality and that MEC can produce output that is as appropriate as, or more appropriate than, that of existing methods.
Submitted 20 June, 2023;
originally announced June 2023.
-
A Transponder Aggregator with Efficient Use of Filtering Function for Transponder Noise Suppression
Authors:
Kenya Suzuki,
Osamu Moriwaki,
Koichi Hadama,
Keita Yamaguchi,
Hiroki Taniguchi,
Yoshiaki Kisaka,
Daisuke Ogawa,
Makoto Takeshita,
Stefano Camatel,
Yiran Ma,
Mitsunori Fukutoku
Abstract:
Colorless, directionless, and contentionless reconfigurable optical add/drop multiplexing (CDC-ROADM) provides highly flexible physical layer network configuration. Such CDC-ROADM must operate in multiple wavelength bands which are being increasingly implemented in optical transmission systems. The operation in C+L bands requires switch devices used in CDC-ROADM to also be capable of multiband operation. Recent studies on wavelength division multiplexing (WDM) systems have pointed out the impact of amplified spontaneous emission (ASE) noise generated by signals of different wavelengths, which causes OSNR degradation. Therefore, it is desirable to filter out the ASE noise from different transponders when multiplexing multiple wavelengths at the transmitter side, especially in a system with non-wavelength selective combiners such as directional couplers and multicast switches. The use of transponder aggregators with filtering functions, such as the M x N wavelength selective switch (WSS), is preferable for this filtering. However, the downside of these devices is that it is difficult to provide economical multiband support. Therefore, we propose an economical transponder aggregator configuration by allowing a certain amount of ASE superposition and reducing the number of filtering functions. In this paper, we fabricated a prototype of the proposed transponder aggregator by combining silica-based planar lightwave circuit technology and C+L band WSS, both commercially available, and verified its feasibility through transmission experiments. The novel transponder aggregator is a practical solution for a multiband CDC-ROADM system with improved OSNR performance.
Submitted 3 October, 2022; v1 submitted 20 September, 2022;
originally announced September 2022.
-
Speciesist Language and Nonhuman Animal Bias in English Masked Language Models
Authors:
Masashi Takeshita,
Rafal Rzepka,
Kenji Araki
Abstract:
Various existing studies have analyzed what social biases are inherited by NLP models. These biases may directly or indirectly harm people; therefore, previous studies have focused only on human attributes. Until recently, however, no research existed on social biases in NLP regarding nonhumans. In this paper, we analyze biases against nonhuman animals, i.e., speciesist bias, inherent in English Masked Language Models such as BERT. We analyzed speciesist bias against 46 animal names using template-based and corpus-extracted sentences containing speciesist (or non-speciesist) language. We found that pre-trained masked language models tend to associate harmful words with nonhuman animals and have a bias toward using speciesist language for some nonhuman animal names. Our code for reproducing the experiments will be made available on GitHub.
Submitted 12 August, 2022; v1 submitted 9 March, 2022;
originally announced March 2022.