-
LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs
Authors:
LLM-jp,
Akiko Aizawa,
Eiji Aramaki,
Bowen Chen,
Fei Cheng,
Hiroyuki Deguchi,
Rintaro Enomoto,
Kazuki Fujii,
Kensuke Fukumoto,
Takuya Fukushima,
Namgi Han,
Yuto Harada,
Chikara Hashimoto,
Tatsuya Hiraoka,
Shohei Hisada,
Sosuke Hosokawa,
Lu Jie,
Keisuke Kamata,
Teruhito Kanazawa,
Hiroki Kanezashi,
Hiroshi Kataoka,
Satoru Katsumata,
Daisuke Kawahara,
Seiya Kawano
, et al. (57 additional authors not shown)
Abstract:
This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its activities, and technical reports on the LLMs developed by LLM-jp. For the latest activities, visit https://llm-jp.nii.ac.jp/en/.
Submitted 4 July, 2024;
originally announced July 2024.
-
Phoneme-aware Encoding for Prefix-tree-based Contextual ASR
Authors:
Hayato Futami,
Emiru Tsunoo,
Yosuke Kashiwagi,
Hiroaki Ogawa,
Siddhant Arora,
Shinji Watanabe
Abstract:
In speech recognition applications, it is important to recognize context-specific rare words, such as proper nouns. Tree-constrained Pointer Generator (TCPGen) has shown promise for this purpose, which efficiently biases such words with a prefix tree. While the original TCPGen relies on grapheme-based encoding, we propose extending it with phoneme-aware encoding to better recognize words of unusual pronunciations. As TCPGen handles biasing words as subword units, we propose obtaining subword-level phoneme-aware encoding by using alignment between phonemes and subwords. Furthermore, we propose injecting phoneme-level predictions from CTC into queries of TCPGen so that the model better interprets the phoneme-aware encodings. We conducted ASR experiments with TCPGen for RNN transducer. We observed that proposed phoneme-aware encoding outperformed ordinary grapheme-based encoding on both the English LibriSpeech and Japanese CSJ datasets, demonstrating the robustness of our approach across linguistically diverse languages.
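The prefix-tree idea mentioned in this abstract can be illustrated with a small sketch. This is not TCPGen itself, which combines the tree with a neural pointer generator; it only shows how a prefix tree over subword sequences restricts which subwords may continue a biasing word. The subword segmentations and the function names `build_prefix_tree` and `valid_next_subwords` are hypothetical and chosen for illustration.

```python
def build_prefix_tree(biasing_words):
    """Build a nested-dict prefix tree from subword sequences."""
    root = {}
    for subwords in biasing_words:
        node = root
        for sw in subwords:
            node = node.setdefault(sw, {})
        node["<end>"] = {}  # marks a complete biasing word
    return root


def valid_next_subwords(tree, prefix):
    """Return the subwords that can follow `prefix` within the tree."""
    node = tree
    for sw in prefix:
        if sw not in node:
            return set()  # prefix left the tree: no biasing candidates
        node = node[sw]
    return set(node) - {"<end>"}


# Hypothetical subword segmentations of three rare words.
biasing_words = [
    ["_Tu", "r", "ner"],
    ["_Tu", "r", "ing"],
    ["_Wat", "anabe"],
]
tree = build_prefix_tree(biasing_words)
candidates = valid_next_subwords(tree, ["_Tu", "r"])
```

Here `candidates` holds the two subwords that complete a biasing word after the shared prefix, which is the set a biasing component would boost at that decoding step.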
Submitted 15 December, 2023;
originally announced December 2023.
-
Performance Portable Back-projection Algorithms on CPUs: Agnostic Data Locality and Vectorization Optimizations
Authors:
Peng Chen,
Mohamed Wahib,
Xiao Wang,
Shinichiro Takizawa,
Takahiro Hirofuchi,
Hirotaka Ogawa,
Satoshi Matsuoka
Abstract:
Computed Tomography (CT) is a key 3D imaging technology that fundamentally relies on the compute-intense back-projection operation to generate 3D volumes. GPUs are typically used for back-projection in production CT devices. However, with the rise of power-constrained micro-CT devices, and also the emergence of CPUs comparable in performance to GPUs, back-projection for CPUs could become favorable. Unlike GPUs, extracting parallelism for back-projection algorithms on CPUs is complex given that parallelism and locality are not explicitly defined and controlled by the programmer, as is the case when using CUDA for instance. We propose a collection of novel back-projection algorithms that reduce the arithmetic computation, robustly enable vectorization, enforce a regular memory access pattern, and maximize the data locality. We also implement the novel algorithms as efficient back-projection kernels that are performance portable over a wide range of CPUs. Performance evaluation using a variety of CPUs from different vendors and generations demonstrates that our back-projection implementation achieves on average 5.2x speedup over the multi-threaded implementation of the most widely used, and optimized, open library. With a state-of-the-art CPU, we reach performance that rivals top-performing GPUs.
Submitted 27 April, 2021;
originally announced April 2021.
-
Unsupposable Test-data Generation for Machine-learned Software
Authors:
Naoto Sato,
Hironobu Kuruma,
Hideto Ogawa
Abstract:
As for software development by machine learning, a trained model is evaluated by using part of an existing dataset as test data. However, if data with characteristics that differ from the existing data is input, the model does not always behave as expected. Accordingly, to confirm the behavior of the model more strictly, it is necessary to create data that differs from the existing data and test the model with that different data. The data to be tested includes not only data that developers can suppose (supposable data) but also data they cannot suppose (unsupposable data). To confirm the behavior of the model strictly, it is important to create as much unsupposable data as possible. In this study, therefore, a method called "unsupposable test-data generation" (UTG)---for giving suggestions for unsupposable data to model developers and testers---is proposed. UTG uses a variational autoencoder (VAE) to generate unsupposable data. The unsupposable data is generated by acquiring latent values with low occurrence probability in the prior distribution of the VAE and inputting the acquired latent values into the decoder. If unsupposable data is included in the data generated by the decoder, the developer can recognize new unsupposable features by referring to the data. On the basis of those unsupposable features, the developer will be able to create other unsupposable data with the same features. The proposed UTG was applied to the MNIST dataset and the House Sales Price dataset. The results demonstrate the feasibility of UTG.
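The latent-sampling step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it assumes a standard normal prior N(0, I), uses simple rejection sampling, and picks an arbitrary density threshold. The decoder is omitted, and the function names are hypothetical.

```python
import math
import random


def log_prior_density(z):
    """Log-density of z under a standard normal prior N(0, I)."""
    d = len(z)
    return -0.5 * sum(x * x for x in z) - 0.5 * d * math.log(2 * math.pi)


def sample_low_probability_latents(dim, count, max_log_density, rng):
    """Rejection-sample latents whose prior log-density is at most the threshold."""
    kept = []
    while len(kept) < count:
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        if log_prior_density(z) <= max_log_density:
            kept.append(z)
    return kept


rng = random.Random(0)
# Assumed threshold: the log-density of a point at squared distance 6
# from the origin in a 2-dimensional latent space.
threshold = log_prior_density([math.sqrt(6.0), 0.0])
latents = sample_low_probability_latents(
    dim=2, count=5, max_log_density=threshold, rng=rng
)
```

Each vector in `latents` lies far from the origin of the prior; in a full pipeline these vectors would be passed to the trained VAE decoder to produce candidate test data.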
Submitted 20 May, 2020;
originally announced May 2020.
-
Formal Verification of Decision-Tree Ensemble Model and Detection of its Violating-input-value Ranges
Authors:
Naoto Sato,
Hironobu Kuruma,
Yuichiroh Nakagawa,
Hideto Ogawa
Abstract:
As one type of machine-learning model, a "decision-tree ensemble model" (DTEM) is represented by a set of decision trees. A DTEM is mainly known to be valid for structured data; however, like other machine-learning models, it is difficult to train so that it returns the correct output value for any input value. Accordingly, when a DTEM is used in regard to a system that requires reliability, it is important to comprehensively detect input values that lead to malfunctions of a system (failures) during development and take appropriate measures. One conceivable solution is to install an input filter that controls the input to the DTEM, and to use separate software to process input values that may lead to failures. To develop the input filter, it is necessary to specify the filtering condition of the input value that leads to the malfunction of the system. Given that necessity, in this paper, we propose a method for formally verifying a DTEM and, according to the result of the verification, if an input value leading to a failure is found, extracting the range in which such an input value exists. The proposed method can comprehensively extract the range in which the input value leading to the failure exists; therefore, by creating an input filter based on that range, it is possible to prevent the failure occurring in the system. In this paper, the algorithm of the proposed method is described, and the results of a case study using a dataset of house prices are presented. On the basis of those results, the feasibility of the proposed method is demonstrated, and its scalability is evaluated.
Submitted 26 April, 2019;
originally announced April 2019.
-
DeepSaucer: Unified Environment for Verifying Deep Neural Networks
Authors:
Naoto Sato,
Hironobu Kuruma,
Masanori Kaneko,
Yuichiroh Nakagawa,
Hideto Ogawa,
Thai Son Hoang,
Michael Butler
Abstract:
In recent years, a number of methods for verifying DNNs have been developed. Because the approaches of the methods differ and have their own limitations, we think that a number of verification methods should be applied to a developed DNN. To apply a number of methods to the DNN, it is necessary to translate either the implementation of the DNN or the verification method so that one runs in the same environment as the other. Since those translations are time-consuming, a utility tool, named DeepSaucer, which helps to retain and reuse implementations of DNNs, verification methods, and their environments, is proposed. In DeepSaucer, code snippets for loading DNNs, running verification methods, and creating their environments are retained and reused as software assets in order to reduce the cost of verifying DNNs. The feasibility of DeepSaucer is confirmed by implementing it on the basis of Anaconda, which provides a virtual environment for loading a DNN and running a verification method. In addition, the effectiveness of DeepSaucer is demonstrated by use-case examples.
Submitted 8 November, 2018;
originally announced November 2018.
-
Where, When, and How mmWave is Used in 5G and Beyond
Authors:
Kei Sakaguchi,
Thomas Haustein,
Sergio Barbarossa,
Emilio Calvanese Strinati,
Antonio Clemente,
Giuseppe Destino,
Aarno Pärssinen,
Ilgyu Kim,
Heesang Chung,
Junhyeong Kim,
Wilhelm Keusgen,
Richard J. Weiler,
Koji Takinami,
Elena Ceci,
Ali Sadri,
Liang Xain,
Alexander Maltsev,
Gia Khanh Tran,
Hiroaki Ogawa,
Kim Mahler,
Robert W. Heath Jr
Abstract:
Wireless engineers and business planners commonly raise the question of where, when, and how millimeter-wave (mmWave) will be used in 5G and beyond. Since the next generation network is not just a new radio access standard, but instead an integration of networks for vertical markets with diverse applications, answers to the question depend on scenarios and use cases to be deployed. This paper gives four 5G mmWave deployment examples and describes in chronological order the scenarios and use cases of their probable deployment, including expected system architectures and hardware prototypes. The paper starts with 28 GHz outdoor backhauling for fixed wireless access and moving hotspots, which will be demonstrated at the PyeongChang winter Olympic games in 2018. The second deployment example is a 60 GHz unlicensed indoor access system at the Tokyo-Narita airport, which is combined with Mobile Edge Computing (MEC) to enable ultra-high speed content download with low latency. The third example is a mmWave mesh network to be used as a micro Radio Access Network (μ-RAN), for cost-effective backhauling of small-cell Base Stations (BSs) in dense urban scenarios. The last example is a mmWave-based Vehicular-to-Vehicular (V2V) and Vehicular-to-Everything (V2X) communications system, which enables automated driving by exchanging High Definition (HD) dynamic map information between cars and Roadside Units (RSUs). For 5G and beyond, mmWave and MEC will play important roles for a diverse set of applications that require both ultra-high data rate and low latency communications.
Submitted 26 April, 2017;
originally announced April 2017.
-
Multi-command Tactile Brain Computer Interface: A Feasibility Study
Authors:
Hiromu Mori,
Yoshihiro Matsumoto,
Victor Kryssanov,
Eric Cooper,
Hitoshi Ogawa,
Shoji Makino,
Zbigniew R. Struzik,
Tomasz M. Rutkowski
Abstract:
The study presented explores the extent to which tactile stimuli delivered to the ten digits of a BCI-naive subject can serve as a platform for a brain computer interface (BCI) that could be used in an interactive application such as robotic vehicle operation. The ten fingertips are used to evoke somatosensory brain responses, thus defining a tactile brain computer interface (tBCI). Experimental results on subjects performing online (real-time) tBCI, using stimuli with a moderately fast inter-stimulus-interval (ISI), provide a validation of the tBCI prototype, while the feasibility of the concept is illuminated through information-transfer rates obtained through the case study.
Submitted 18 May, 2013;
originally announced May 2013.
-
Perceiving the Social: A Multi-Agent System to Support Human Navigation in Foreign Communities
Authors:
Victor V. Kryssanov,
Shizuka Kumokawa,
Igor Goncharenko,
Hitoshi Ogawa
Abstract:
This paper describes a system developed to help people explore local communities by providing navigation services in social spaces created by the community members via communication and knowledge sharing. The proposed system utilizes data of a community's social network to reconstruct the social space, which is otherwise not physically perceptible but imaginary, experiential, yet learnable. The social space is modeled with an agent network, where each agent stands for a member of the community and has knowledge about expertise and personal characteristics of some other members. An agent can gather information, using its social "connections", to find community members most suitable to communicate to in a specific situation defined by the system's user. The system then deploys its multimodal interface, which "maps" the social space onto a representation of the relevant physical space, to locate the potential interlocutors and advise the user on an efficient communication strategy for the given community.
Submitted 18 March, 2010;
originally announced March 2010.
-
We cite as we communicate: A communication model for the citation process
Authors:
Victor V. Kryssanov,
Evgeny L. Kuleshov,
Frank J. Rinaldo,
Hitoshi Ogawa
Abstract:
Building on ideas from linguistics, psychology, and social sciences about the possible mechanisms of human decision-making, we propose a novel theoretical framework for the citation analysis. Given the existing trend to investigate citation statistics in the context of various forms of power and Zipfian laws, we show that the popular models of citation have poor predictive ability and can hardly provide for an adequate explanation of the observed behavior of the empirical data. An alternative model is then derived, using the apparatus of statistical mechanics. The model is applied to approximate the citation frequencies of scientific articles from two large collections, and it demonstrates a predictive potential much superior to the one of any of the citation models known to the authors from the literature. Some analytical properties of the developed model are discussed, and conclusions are drawn. Directions for future work are also given at the paper's end.
Submitted 18 June, 2007; v1 submitted 23 March, 2007;
originally announced March 2007.
-
Citation as a Representation Process
Authors:
V. V. Kryssanov,
F. J. Rinaldo,
H. Ogawa,
E. Kuleshov
Abstract:
The presented work proposes a novel approach to model the citation rate. The paper begins with a brief introduction into informetrics studies and highlights drawbacks of the contemporary approaches to modeling the citation process as a product of social interactions. An alternative modeling framework based on results obtained in cognitive psychology is then introduced and applied in an experiment to investigate properties of the citation process, as they are revealed by a large collection of citation statistics. Major research findings are discussed, and a summary is given.
Submitted 16 July, 2006; v1 submitted 14 July, 2006;
originally announced July 2006.
-
Modeling the Dynamics of Social Networks
Authors:
Victor V. Kryssanov,
Frank J. Rinaldo,
Evgeny L. Kuleshov,
Hitoshi Ogawa
Abstract:
Modeling human dynamics responsible for the formation and evolution of the so-called social networks - structures comprised of individuals or organizations and indicating connectivities existing in a community - is a topic recently attracting a significant research interest. It has been claimed that these dynamics are scale-free in many practically important cases, such as impersonal and personal communication, auctioning in a market, accessing sites on the WWW, etc., and that human response times thus conform to the power law. While a certain amount of progress has recently been achieved in predicting the general response rate of a human population, existing formal theories of human behavior can hardly be found satisfactory to accommodate and comprehensively explain the scaling observed in social networks. In the presented study, a novel system-theoretic modeling approach is proposed and successfully applied to determine important characteristics of a communication network and to analyze consumer behavior on the WWW.
Submitted 23 May, 2006;
originally announced May 2006.
-
Optimal multiple assignments based on integer programming in secret sharing schemes with general access structures
Authors:
Mitsugu Iwamoto,
Hirosuke Yamamoto,
Hirohisa Ogawa
Abstract:
It is known that for any general access structure, a secret sharing scheme (SSS) can be constructed from an (m,m)-threshold scheme by using the so-called cumulative map or from a (t,m)-threshold SSS by a modified cumulative map. However, SSSs constructed in this way are generally not efficient. In this paper, we propose a new method to construct an SSS from a (t,m)-threshold scheme for any given general access structure. In the proposed method, integer programming is used to optimally distribute the shares of the (t,m)-threshold scheme to each participant of the general access structure. Because of this optimality, it can always attain a lower coding rate than the cumulative maps, except in cases where they already give the optimal distribution. The same method is also applied to construct SSSs for incomplete access structures and/or ramp access structures.
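For readers unfamiliar with the (t,m)-threshold building block named in this abstract, the following is a minimal sketch of one well-known instance, Shamir's polynomial scheme. It does not cover the paper's integer-programming assignment of shares to participants. The prime modulus, the function names, and the use of `random.Random` (not cryptographically secure, used here only for reproducibility) are assumptions made for illustration.

```python
import random

PRIME = 2**61 - 1  # a Mersenne prime; the field size is an assumption


def make_shares(secret, t, m, rng):
    """Split `secret` into m shares such that any t of them recover it."""
    # Random polynomial of degree t - 1 with the secret as constant term.
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(t - 1)]
    shares = []
    for x in range(1, m + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at 0 from t or more shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        inv_den = pow(den, PRIME - 2, PRIME)
        secret = (secret + yi * num * inv_den) % PRIME
    return secret


shares = make_shares(123456789, t=3, m=5, rng=random.Random(0))
recovered = recover_secret(shares[:3])
```

Any three of the five shares reconstruct the secret; a general access structure would then be realized by deciding which of these shares each participant receives.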
Submitted 15 June, 2005;
originally announced June 2005.