Showing 1–50 of 64 results for author: Hayashi, K

Searching in archive cs.
  1. arXiv:2407.05656  [pdf, other]

    cs.LG cs.CL

    Multi-label Learning with Random Circular Vectors

    Authors: Ken Nishida, Kojiro Machi, Kazuma Onishi, Katsuhiko Hayashi, Hidetaka Kamigaito

    Abstract: The extreme multi-label classification (XMC) task involves learning a classifier that can predict from a large label set the most relevant subset of labels for a data instance. While deep neural networks (DNNs) have demonstrated remarkable success in XMC problems, the task is still challenging because it must deal with a large number of output labels, which make the DNN training computationally ex…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 11 pages, 6 figures, 3 tables; accepted to workshop RepL4NLP held in conjunction with ACL 2024

  2. arXiv:2407.04251  [pdf, other]

    cs.CL cs.LG

    Unified Interpretation of Smoothing Methods for Negative Sampling Loss Functions in Knowledge Graph Embedding

    Authors: Xincan Feng, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

    Abstract: Knowledge Graphs (KGs) are fundamental resources in knowledge-intensive tasks in NLP. Due to the limitation of manually creating KGs, KG Completion (KGC) has an important role in automatically completing KGs by scoring their links with KG Embedding (KGE). To handle many entities in training, KGE relies on Negative Sampling (NS) loss that can reduce the computational cost by sampling. Since the app…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 9 pages, 4 figures, 2 tables; accepted to workshop RepL4NLP held in conjunction with ACL 2024

  3. arXiv:2403.00068  [pdf, other]

    cs.CV

    Artwork Explanation in Large-scale Vision Language Models

    Authors: Kazuki Hayashi, Yusuke Sakai, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

    Abstract: Large-scale vision-language models (LVLMs) output text from images and instructions, demonstrating advanced capabilities in text generation and comprehension. However, it has not been clarified to what extent LVLMs understand the knowledge necessary for explaining images, the complex relationships between various pieces of knowledge, and how they integrate these understandings into their explanati…

    Submitted 29 February, 2024; originally announced March 2024.

  4. arXiv:2402.18839  [pdf, other]

    cs.LG math.AP math.FA math.OC math.PR

    Extended Flow Matching: a Method of Conditional Generation with Generalized Continuity Equation

    Authors: Noboru Isobe, Masanori Koyama, Jinzhe Zhang, Kohei Hayashi, Kenji Fukumizu

    Abstract: The task of conditional generation is one of the most important applications of generative models, and numerous methods have been developed to date based on the celebrated flow-based models. However, many flow-based models in use today are not built to allow one to introduce an explicit inductive bias to how the conditional distribution to be generated changes with respect to conditions. This can…

    Submitted 5 July, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 27 pages, 10 figures, We have corrected an error in our experiment on COT-FM

    MSC Class: 68T07 (Primary); 49Q22 (Secondary)

  5. arXiv:2402.12121  [pdf, other]

    cs.CL cs.AI cs.CV cs.MM

    Evaluating Image Review Ability of Vision Language Models

    Authors: Shigeki Saito, Kazuki Hayashi, Yusuke Ide, Yusuke Sakai, Kazuma Onishi, Toma Suzuki, Seiji Gobara, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

    Abstract: Large-scale vision language models (LVLMs) are language models that are capable of processing images and text inputs by a single model. This paper explores the use of LVLMs to generate review texts for images. The ability of LVLMs to review images is not fully understood, highlighting the need for a methodical evaluation of their review abilities. Unlike image captions, review texts can be written…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: 9 pages, under review

  6. arXiv:2402.08134  [pdf, other]

    cs.LG math.NA math.OC

    Randomized Algorithms for Symmetric Nonnegative Matrix Factorization

    Authors: Koby Hayashi, Sinan G. Aksoy, Grey Ballard, Haesun Park

    Abstract: Symmetric Nonnegative Matrix Factorization (SymNMF) is a technique in data analysis and machine learning that approximates a symmetric matrix with a product of a nonnegative, low-rank matrix and its transpose. To design faster and more scalable algorithms for SymNMF we develop two randomized algorithms for its computation. The first algorithm uses randomized matrix sketching to compute an initial…

    Submitted 12 February, 2024; originally announced February 2024.

    MSC Class: 65F55; 65F20

  7. arXiv:2402.01734  [pdf, other]

    cs.CL cs.LG q-fin.CP stat.AP

    CFTM: Continuous time fractional topic model

    Authors: Kei Nakagawa, Kohei Hayashi, Yugo Fujimoto

    Abstract: In this paper, we propose the Continuous Time Fractional Topic Model (cFTM), a new method for dynamic topic modeling. This approach incorporates fractional Brownian motion (fBm) to effectively identify positive or negative correlations in topic and word distribution over time, revealing long-term dependency or roughness. Our theoretical analysis shows that the cFTM can capture these long-term depe…

    Submitted 6 February, 2024; v1 submitted 29 January, 2024; originally announced February 2024.

  8. arXiv:2401.00496  [pdf, other]

    cs.CV cs.AI cs.LG

    SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge

    Authors: Dimitrios Psychogyios, Emanuele Colleoni, Beatrice Van Amsterdam, Chih-Yang Li, Shu-Yu Huang, Yuchong Li, Fucang Jia, Baosheng Zou, Guotai Wang, Yang Liu, Maxence Boels, Jiayu Huo, Rachel Sparks, Prokar Dasgupta, Alejandro Granados, Sebastien Ourselin, Mengya Xu, An Wang, Yanan Wu, Long Bai, Hongliang Ren, Atsushi Yamada, Yuriko Harai, Yuto Ishikawa, Kazuyuki Hayashi, et al. (25 additional authors not shown)

    Abstract: Surgical tool segmentation and action recognition are fundamental building blocks in many computer-assisted intervention applications, ranging from surgical skills assessment to decision support systems. Nowadays, learning-based action recognition and segmentation approaches outperform classical methods, relying, however, on large, annotated datasets. Furthermore, action recognition and tool segme…

    Submitted 23 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

  9. arXiv:2311.09109  [pdf, other]

    cs.CL cs.AI cs.LG

    Does Pre-trained Language Model Actually Infer Unseen Links in Knowledge Graph Completion?

    Authors: Yusuke Sakai, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

    Abstract: Knowledge graphs (KGs) consist of links that describe relationships between entities. Due to the difficulty of manually enumerating all relationships between entities, automatically completing them is essential for KGs. Knowledge Graph Completion (KGC) is a task that infers unseen relationships between entities in a KG. Traditional embedding-based KGC methods, such as RESCAL, TransE, DistMult, Com…

    Submitted 6 June, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

    Comments: Accepted at NAACL 2024 main oral, 15 pages, 10 figures

  10. arXiv:2309.09296  [pdf, other]

    cs.CL cs.AI cs.LG

    Model-based Subsampling for Knowledge Graph Completion

    Authors: Xincan Feng, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

    Abstract: Subsampling is effective in Knowledge Graph Embedding (KGE) for reducing overfitting caused by the sparsity in Knowledge Graph (KG) datasets. However, current subsampling approaches consider only frequencies of queries that consist of entities and their relations. Thus, the existing subsampling potentially underestimates the appearance probabilities of infrequent queries even if the frequencies of…

    Submitted 17 September, 2023; originally announced September 2023.

    Comments: Accepted by AACL 2023; 9 pages, 3 figures, 5 tables

  11. arXiv:2308.13536  [pdf, ps, other]

    cs.IR cs.LG

    Implicit ZCA Whitening Effects of Linear Autoencoders for Recommendation

    Authors: Katsuhiko Hayashi, Kazuma Onishi

    Abstract: Recently, in the field of recommendation systems, linear regression (autoencoder) models have been investigated as a way to learn item similarity. In this paper, we show a connection between a linear autoencoder model and ZCA whitening for recommendation data. In particular, we show that the dual form solution of a linear autoencoder model actually has ZCA whitening effects on feature vectors of i…

    Submitted 15 August, 2023; originally announced August 2023.

  12. arXiv:2306.10656  [pdf, other]

    cs.LG cs.AI stat.ML

    Virtual Human Generative Model: Masked Modeling Approach for Learning Human Characteristics

    Authors: Kenta Oono, Nontawat Charoenphakdee, Kotatsu Bito, Zhengyan Gao, Yoshiaki Ota, Shoichiro Yamaguchi, Yohei Sugawara, Shin-ichi Maeda, Kunihiko Miyoshi, Yuki Saito, Koki Tsuda, Hiroshi Maruyama, Kohei Hayashi

    Abstract: Identifying the relationship between healthcare attributes, lifestyles, and personality is vital for understanding and improving physical and mental conditions. Machine learning approaches are promising for modeling their relationships and offering actionable suggestions. In this paper, we propose Virtual Human Generative Model (VHGM), a machine learning model for estimating attributes about healt…

    Submitted 14 August, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

    Comments: 14 pages, 4 figures

  13. arXiv:2306.08636  [pdf, other]

    cs.IR cs.SI

    Using Wikipedia Editor Information to Build High-performance Recommender Systems

    Authors: Katsuhiko Hayashi

    Abstract: Wikipedia has high-quality articles on a variety of topics and has been used in diverse research areas. In this study, a method is presented for using Wikipedia's editor information to build recommender systems in various domains that outperform content-based systems.

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: Accepted at Wiki Workshop 2023 (withdrawn by the author)

  14. arXiv:2306.02115  [pdf, other]

    cs.CL cs.CV cs.LG

    Table and Image Generation for Investigating Knowledge of Entities in Pre-trained Vision and Language Models

    Authors: Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe

    Abstract: In this paper, we propose a table and image generation task to verify how the knowledge about entities acquired from natural language is retained in Vision & Language (V&L) models. This task consists of two parts: the first is to generate a table containing knowledge about an entity and its related image, and the second is to generate an image from an entity with a caption and a table containing r…

    Submitted 25 July, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

    Comments: Accepted at ACL 2023

  15. arXiv:2305.18484  [pdf, other]

    stat.ML cs.LG

    Neural Fourier Transform: A General Approach to Equivariant Representation Learning

    Authors: Masanori Koyama, Kenji Fukumizu, Kohei Hayashi, Takeru Miyato

    Abstract: Symmetry learning has proven to be an effective approach for extracting the hidden structure of data, with the concept of equivariance relation playing the central role. However, most of the current studies are built on architectural theory and corresponding assumptions on the form of data. We propose Neural Fourier Transform (NFT), a general framework of learning the latent linear action of the g…

    Submitted 14 February, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

  16. arXiv:2303.15747  [pdf, other]

    cs.LG cs.AI

    TabRet: Pre-training Transformer-based Tabular Models for Unseen Columns

    Authors: Soma Onishi, Kenta Oono, Kohei Hayashi

    Abstract: We present TabRet, a pre-trainable Transformer-based model for tabular data. TabRet is designed to work on a downstream task that contains columns not seen in pre-training. Unlike other methods, TabRet has an extra learning step before fine-tuning called retokenizing, which calibrates feature embeddings based on the masked autoencoding loss. In experiments, we pre-trained TabRet with…

    Submitted 15 April, 2023; v1 submitted 28 March, 2023; originally announced March 2023.

    Comments: Accepted at the Workshop on Understanding Foundation Models at ICLR 2023

  17. arXiv:2209.12801  [pdf, ps, other]

    cs.AI cs.CL cs.LG

    Subsampling for Knowledge Graph Embedding Explained

    Authors: Hidetaka Kamigaito, Katsuhiko Hayashi

    Abstract: In this article, we explain the recent advance of subsampling methods in knowledge graph embedding (KGE) starting from the original one used in word2vec.

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: Notes for subsampling methods in Knowledge Graph Embedding

  18. arXiv:2206.10140  [pdf, other]

    cs.LG cs.AI cs.CL cs.SI

    Comprehensive Analysis of Negative Sampling in Knowledge Graph Representation Learning

    Authors: Hidetaka Kamigaito, Katsuhiko Hayashi

    Abstract: Negative sampling (NS) loss plays an important role in learning knowledge graph embedding (KGE) to handle a huge number of entities. However, the performance of KGE degrades without hyperparameters such as the margin term and number of negative samples in NS loss being appropriately selected. Currently, empirical hyperparameter tuning addresses this problem at the cost of computational time. To so…

    Submitted 6 July, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: Accepted at ICML 2022

  19. arXiv:2204.07425  [pdf, other]

    cs.DS math.CO math.OC

    Finding Hall blockers by matrix scaling

    Authors: Koyo Hayashi, Hiroshi Hirai, Keiya Sakabe

    Abstract: For a given nonnegative matrix $A=(A_{ij})$, the matrix scaling problem asks whether $A$ can be scaled to a doubly stochastic matrix $D_1AD_2$ for some positive diagonal matrices $D_1,D_2$. The Sinkhorn algorithm is a simple iterative algorithm, which repeats row-normalization $A_{ij} \leftarrow A_{ij}/\sum_{j}A_{ij}$ and column-normalization $A_{ij} \leftarrow A_{ij}/\sum_{i}A_{ij}$ alternately.…

    Submitted 15 June, 2023; v1 submitted 15 April, 2022; originally announced April 2022.

    MSC Class: 05C50

  20. arXiv:2204.03188  [pdf, ps, other]

    math.CO cs.DM

    Two flags in a semimodular lattice generate an antimatroid

    Authors: Koyo Hayashi, Hiroshi Hirai

    Abstract: A basic property in a modular lattice is that any two flags generate a distributive sublattice. It is shown (Abels 1991, Herscovic 1998) that two flags in a semimodular lattice no longer generate such a good sublattice, whereas shortest galleries connecting them form a relatively good join-sublattice. In this note, we sharpen this investigation to establish an analogue of the two-flag generation t…

    Submitted 6 April, 2022; originally announced April 2022.

    MSC Class: 06C10

  21. arXiv:2203.08619  [pdf, other]

    cs.RO

    An Independently Learnable Hierarchical Model for Bilateral Control-Based Imitation Learning Applications

    Authors: Kazuki Hayashi, Sho Sakaino, Toshiaki Tsuji

    Abstract: Recently, motion generation by machine learning has been actively researched to automate various tasks. Imitation learning is one such method that learns motions from data collected in advance. However, executing long-term tasks remains challenging. Therefore, a novel framework for imitation learning is proposed to solve this problem. The proposed framework comprises upper and lower layers, where…

    Submitted 16 March, 2022; originally announced March 2022.

  22. arXiv:2203.01388  [pdf, other]

    cs.LG

    Skew-Symmetric Adjacency Matrices for Clustering Directed Graphs

    Authors: Koby Hayashi, Sinan G. Aksoy, Haesun Park

    Abstract: Cut-based directed graph (digraph) clustering often focuses on finding dense within-cluster or sparse between-cluster connections, similar to cut-based undirected graph clustering methods. In contrast, for flow-based clusterings the edges between clusters tend to be oriented in one direction and have been found in migration data, food webs, and trade data. In this paper we introduce a spectral alg…

    Submitted 2 March, 2022; originally announced March 2022.

    Comments: 21 pages, 7 figures

  23. arXiv:2201.05974  [pdf, other]

    cs.LG q-fin.CP stat.ML

    Fractional SDE-Net: Generation of Time Series Data with Long-term Memory

    Authors: Kohei Hayashi, Kei Nakagawa

    Abstract: In this paper, we focus on the generation of time-series data using neural networks. It is often the case that input time-series data have only one realized (and usually irregularly sampled) path, which makes it difficult to extract time-series characteristics, and its noise structure is more complicated than i.i.d. type. Time series data, especially from hydrology, telecommunications, economics,…

    Submitted 23 August, 2022; v1 submitted 16 January, 2022; originally announced January 2022.

    Comments: IEEE DSAA 2022 Accepted

  24. arXiv:2108.11018  [pdf, other]

    cs.LG cs.CV

    A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective?

    Authors: Hiroaki Mikami, Kenji Fukumizu, Shogo Murai, Shuji Suzuki, Yuta Kikuchi, Taiji Suzuki, Shin-ichi Maeda, Kohei Hayashi

    Abstract: Synthetic-to-real transfer learning is a framework in which a synthetically generated dataset is used to pre-train a model to improve its performance on real vision tasks. The most significant advantage of using synthetic images is that the ground-truth labels are automatically available, enabling unlimited expansion of the data size without human cost. However, synthetic data may have a huge doma…

    Submitted 8 October, 2021; v1 submitted 24 August, 2021; originally announced August 2021.

  25. Unified Interpretation of Softmax Cross-Entropy and Negative Sampling: With Case Study for Knowledge Graph Embedding

    Authors: Hidetaka Kamigaito, Katsuhiko Hayashi

    Abstract: In knowledge graph embedding, the theoretical relationship between the softmax cross-entropy and negative sampling loss functions has not been investigated. This makes it difficult to fairly compare the results of the two different loss functions. We attempted to solve this problem by using the Bregman divergence to provide a unified interpretation of the softmax cross-entropy and negative samplin…

    Submitted 16 March, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: Accepted at ACL-IJCNLP 2021

  26. A New Autoregressive Neural Network Model with Command Compensation for Imitation Learning Based on Bilateral Control

    Authors: Kazuki Hayashi, Ayumu Sasagawa, Sho Sakaino, Toshiaki Tsuji

    Abstract: In the near future, robots are expected to work with humans or operate alone and may replace human workers in various fields such as homes and factories. In a previous study, we proposed bilateral control-based imitation learning that enables robots to utilize force information and operate almost simultaneously with an expert's demonstration. In addition, we recently proposed an autoregressive neu…

    Submitted 16 March, 2021; originally announced March 2021.

    Journal ref: 2021 IEEE International Conference on Mechatronics (ICM), Pages 1-7, 2021

  27. arXiv:2008.03175  [pdf, other]

    stat.ML cond-mat.dis-nn cs.LG

    Reconstructing Sparse Signals via Greedy Monte-Carlo Search

    Authors: Kao Hayashi, Tomoyuki Obuchi, Yoshiyuki Kabashima

    Abstract: We propose a Monte-Carlo-based method for reconstructing sparse signals in the formulation of sparse linear regression in a high-dimensional setting. The basic idea of this algorithm is to explicitly select variables or covariates to represent a given data vector or responses and accept randomly generated updates of that selection if and only if the energy or cost function decreases. This algorith…

    Submitted 29 January, 2021; v1 submitted 7 August, 2020; originally announced August 2020.

    Comments: 15 pages, 4 figures

  28. arXiv:2008.01523  [pdf, other]

    cs.CL

    A System for Worldwide COVID-19 Information Aggregation

    Authors: Akiko Aizawa, Frederic Bergeron, Junjie Chen, Fei Cheng, Katsuhiko Hayashi, Kentaro Inui, Hiroyoshi Ito, Daisuke Kawahara, Masaru Kitsuregawa, Hirokazu Kiyomaru, Masaki Kobayashi, Takashi Kodama, Sadao Kurohashi, Qianying Liu, Masaki Matsubara, Yusuke Miyao, Atsuyuki Morishima, Yugo Murawaki, Kazumasa Omura, Haiyue Song, Eiichiro Sumita, Shinji Suzuki, Ribeka Tanaka, Yu Tanaka, Masashi Toyoda, et al. (4 additional authors not shown)

    Abstract: The global pandemic of COVID-19 has made the public pay close attention to related news, covering various domains, such as sanitation, treatment, and effects on education. Meanwhile, the COVID-19 condition is very different among the countries (e.g., policies and development of the epidemic), and thus citizens would be interested in news in foreign countries. We build a system for worldwide COVID-…

    Submitted 11 October, 2020; v1 submitted 27 July, 2020; originally announced August 2020.

    Comments: Accepted to EMNLP 2020 Workshop NLP-COVID

  29. arXiv:2006.16377  [pdf, other]

    cs.LG stat.ML

    Hypergraph Random Walks, Laplacians, and Clustering

    Authors: Koby Hayashi, Sinan G. Aksoy, Cheong Hee Park, Haesun Park

    Abstract: We propose a flexible framework for clustering hypergraph-structured data based on recently proposed random walks utilizing edge-dependent vertex weights. When incorporating edge-dependent vertex weights (EDVW), a weight is associated with each vertex-hyperedge pair, yielding a weighted incidence matrix of the hypergraph. Such weightings have been utilized in term-document representations of text…

    Submitted 27 October, 2020; v1 submitted 29 June, 2020; originally announced June 2020.

  30. arXiv:2006.06909  [pdf, other]

    cs.LG stat.ML

    Weisfeiler-Lehman Embedding for Molecular Graph Neural Networks

    Authors: Katsuhiko Ishiguro, Kenta Oono, Kohei Hayashi

    Abstract: A graph neural network (GNN) is a good choice for predicting the chemical properties of molecules. Compared with other deep networks, however, the current performance of a GNN is limited owing to the "curse of depth." Inspired by long-established feature engineering in the field of chemistry, we expanded an atom representation using Weisfeiler-Lehman (WL) embedding, which is designed to capture lo…

    Submitted 17 August, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: Reference Updated. An implementation example is included in Chainer Chemistry Ver 0.7.1: see https://github.com/chainer/chainer-chemistry

  31. arXiv:1912.02686  [pdf, ps, other]

    cs.LG stat.ML

    Binarized Canonical Polyadic Decomposition for Knowledge Graph Completion

    Authors: Koki Kishimoto, Katsuhiko Hayashi, Genki Akai, Masashi Shimbo

    Abstract: Methods based on vector embeddings of knowledge graphs have been actively pursued as a promising approach to knowledge graph completion. However, embedding models generate storage-inefficient representations, particularly when the number of entities and relations, and the dimensionality of the real-valued embedding vectors are large. We present a binarized CANDECOMP/PARAFAC (CP) decomposition algori…

    Submitted 4 December, 2019; originally announced December 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1902.02970

  32. Neuro-SERKET: Development of Integrative Cognitive System through the Composition of Deep Probabilistic Generative Models

    Authors: Tadahiro Taniguchi, Tomoaki Nakamura, Masahiro Suzuki, Ryo Kuniyasu, Kaede Hayashi, Akira Taniguchi, Takato Horii, Takayuki Nagai

    Abstract: This paper describes a framework for the development of an integrative cognitive system based on probabilistic generative models (PGMs) called Neuro-SERKET. Neuro-SERKET is an extension of SERKET, which can compose elemental PGMs developed in a distributed manner and provide a scheme that allows the composed PGMs to learn throughout the system in an unsupervised way. In addition to the head-to-tai…

    Submitted 29 January, 2020; v1 submitted 20 October, 2019; originally announced October 2019.

    Comments: New Gener. Comput. (2020)

    Journal ref: New Generation Computing, 2020, volume 38, 23–48

  33. arXiv:1909.01567  [pdf, other]

    cs.AI cs.CL

    A Non-commutative Bilinear Model for Answering Path Queries in Knowledge Graphs

    Authors: Katsuhiko Hayashi, Masashi Shimbo

    Abstract: Bilinear diagonal models for knowledge graph embedding (KGE), such as DistMult and ComplEx, balance expressiveness and computational efficiency by representing relations as diagonal matrices. Although they perform well in predicting atomic relations, composite relations (relation paths) cannot be modeled naturally by the product of relation matrices, as the product of diagonal matrices is commutat…

    Submitted 4 September, 2019; originally announced September 2019.

    Comments: Accepted for EMNLP-IJCNLP 2019

  34. arXiv:1909.01149  [pdf, other]

    math.NA cs.DC cs.MS

    PLANC: Parallel Low Rank Approximation with Non-negativity Constraints

    Authors: Srinivas Eswar, Koby Hayashi, Grey Ballard, Ramakrishnan Kannan, Michael A. Matheson, Haesun Park

    Abstract: We consider the problem of low-rank approximation of massive dense non-negative tensor data, for example to discover latent patterns in video and imaging applications. As the size of data sets grows, single workstations are hitting bottlenecks in both computation time and available memory. We propose a distributed-memory parallel computing solution to handle massive data sets, loading the input da…

    Submitted 30 August, 2019; originally announced September 2019.

    Comments: arXiv admin note: text overlap with arXiv:1806.07985

  35. arXiv:1908.04471  [pdf, other]

    cs.LG stat.ML

    Einconv: Exploring Unexplored Tensor Network Decompositions for Convolutional Neural Networks

    Authors: Kohei Hayashi, Taiki Yamaguchi, Yohei Sugawara, Shin-ichi Maeda

    Abstract: Tensor decomposition methods are widely used for model compression and fast inference in convolutional neural networks (CNNs). Although many decompositions are conceivable, only CP decomposition and a few others have been applied in practice, and no extensive comparisons have been made between available methods. Previous studies have not determined how many decompositions are available, nor which…

    Submitted 27 November, 2019; v1 submitted 12 August, 2019; originally announced August 2019.

    Comments: NeurIPS 2019

  36. arXiv:1906.08412  [pdf, other]

    cs.LG stat.ML

    Data Interpolating Prediction: Alternative Interpretation of Mixup

    Authors: Takuya Shimada, Shoichiro Yamaguchi, Kohei Hayashi, Sosuke Kobayashi

    Abstract: Data augmentation by mixing samples, such as Mixup, has widely been used typically for classification tasks. However, this strategy is not always effective due to the gap between augmented samples for training and original samples for testing. This gap may prevent a classifier from learning the optimal decision boundary and increase the generalization error. To overcome this problem, we propose an…

    Submitted 19 June, 2019; originally announced June 2019.

    Comments: Presented at the 2nd Learning from Limited Labeled Data (LLD) Workshop at ICLR 2019

  37. arXiv:1902.02970  [pdf, ps, other]

    cs.LG cs.IR stat.ML

    Binarized Knowledge Graph Embeddings

    Authors: Koki Kishimoto, Katsuhiko Hayashi, Genki Akai, Masashi Shimbo, Kazunori Komatani

    Abstract: Tensor factorization has become an increasingly popular approach to knowledge graph completion (KGC), which is the task of automatically predicting missing facts in a knowledge graph. However, even with a simple model like CANDECOMP/PARAFAC (CP) tensor decomposition, KGC on existing knowledge graphs is impractical in resource-limited environments, as a large amount of memory is required to store par…

    Submitted 8 February, 2019; originally announced February 2019.

  38. arXiv:1901.09541  [pdf, other]

    stat.ML cs.LG

    On Random Subsampling of Gaussian Process Regression: A Graphon-Based Analysis

    Authors: Kohei Hayashi, Masaaki Imaizumi, Yuichi Yoshida

    Abstract: In this paper, we study random subsampling of Gaussian process regression, one of the simplest approximation baselines, from a theoretical perspective. Although subsampling discards a large part of training data, we show provable guarantees on the accuracy of the predictive mean/variance and its generalization ability. For analysis, we consider embedding kernel matrices into graphons, which encaps…

    Submitted 28 January, 2019; originally announced January 2019.

  39. arXiv:1812.10044  [pdf, other]

    cs.IT cs.LG

    Trainable Projected Gradient Detector for Massive Overloaded MIMO Channels: Data-driven Tuning Approach

    Authors: Satoshi Takabe, Masayuki Imanishi, Tadashi Wadayama, Ryo Hayakawa, Kazunori Hayashi

    Abstract: This paper presents a deep learning-aided iterative detection algorithm for massive overloaded multiple-input multiple-output (MIMO) systems where the number of transmit antennas $n$ is larger than that of receive antennas $m$. Since the proposed algorithm is based on the projected gradient descent method with trainable parameters, it is named the trainable projected gradient-detector (TPG-detecto…

    Submitted 9 July, 2019; v1 submitted 25 December, 2018; originally announced December 2018.

    Comments: 12 pages, 16 figures, accepted to IEEE Access, this is a long version of arXiv:1806.10827

  40. arXiv:1810.08307  [pdf, ps, other]

    cs.CL

    Reduction of Parameter Redundancy in Biaffine Classifiers with Symmetric and Circulant Weight Matrices

    Authors: Tomoki Matsuno, Katsuhiko Hayashi, Takahiro Ishihara, Hitoshi Manabe, Yuji Matsumoto

    Abstract: Currently, the biaffine classifier has been attracting attention as a method to introduce an attention mechanism into the modeling of binary relations. For instance, in the field of dependency parsing, the Deep Biaffine Parser by Dozat and Manning has achieved state-of-the-art performance as a graph-based dependency parser on the English Penn Treebank and CoNLL 2017 shared task. On the other hand,…

    Submitted 18 October, 2018; originally announced October 2018.

    Comments: Accepted to PACLIC 32

  41. arXiv:1808.08361  [pdf, other]

    cs.LG stat.ML

    Data-dependent Learning of Symmetric/Antisymmetric Relations for Knowledge Base Completion

    Authors: Hitoshi Manabe, Katsuhiko Hayashi, Masashi Shimbo

    Abstract: Embedding-based methods for knowledge base completion (KBC) learn representations of entities and relations in a vector space, along with the scoring function to estimate the likelihood of relations between entities. The learnable class of scoring functions is designed to be expressive enough to cover a variety of real-world relations, but this expressiveness comes at the cost of an increased number o…

    Submitted 25 August, 2018; originally announced August 2018.

    Comments: In AAAI 2018

  42. arXiv:1806.10827  [pdf, ps, other]

    cs.IT cs.LG

    Deep Learning-Aided Projected Gradient Detector for Massive Overloaded MIMO Channels

    Authors: Satoshi Takabe, Masayuki Imanishi, Tadashi Wadayama, Kazunori Hayashi

    Abstract: The paper presents a deep learning-aided iterative detection algorithm for massive overloaded MIMO systems. Since the proposed algorithm is based on the projected gradient descent method with trainable parameters, it is named the trainable projected gradient-detector (TPG-detector). The trainable internal parameters can be optimized with standard deep learning techniques such as back propagation and…

    Submitted 25 December, 2018; v1 submitted 28 June, 2018; originally announced June 2018.

    Comments: 6 pages, 8 figures, metadata is updated

  43. arXiv:1806.07985  [pdf, other]

    math.NA cs.DC cs.MS

    Parallel Nonnegative CP Decomposition of Dense Tensors

    Authors: Grey Ballard, Koby Hayashi, Ramakrishnan Kannan

    Abstract: The CP tensor decomposition is a low-rank approximation of a tensor. We present a distributed-memory parallel algorithm and implementation of an alternating optimization method for computing a CP decomposition of dense tensor data that can enforce nonnegativity of the computed low-rank factors. The principal task is to parallelize the matricized-tensor times Khatri-Rao product (MTTKRP) bottleneck…

    Submitted 19 June, 2018; originally announced June 2018.

  44. arXiv:1710.09932  [pdf, other]

    cs.CG math.MG

    A polynomial time algorithm to compute geodesics in CAT(0) cubical complexes

    Authors: Koyo Hayashi

    Abstract: This paper presents the first polynomial time algorithm to compute geodesics in a CAT(0) cubical complex in general dimension. The algorithm is a simple iterative method to update breakpoints of a path joining two points using Miller, Owen and Provan's algorithm (2015) as a subroutine. Our algorithm is applicable to any CAT(0) space in which geodesics between two close points can be computed, not…

    Submitted 29 June, 2018; v1 submitted 26 October, 2017; originally announced October 2017.

    Comments: 16 pages

  45. arXiv:1709.06671  [pdf, other]

    cs.CL cs.LG cs.NE

    Think Globally, Embed Locally --- Locally Linear Meta-embedding of Words

    Authors: Danushka Bollegala, Kohei Hayashi, Ken-ichi Kawarabayashi

    Abstract: Distributed word embeddings have shown superior performances in numerous Natural Language Processing (NLP) tasks. However, their performances vary significantly across different tasks, implying that the word embeddings learnt by those methods capture complementary aspects of lexical semantics. Therefore, we believe that it is important to combine the existing word embeddings to produce more accura…

    Submitted 19 September, 2017; originally announced September 2017.

  46. arXiv:1708.08976  [pdf, other]

    cs.DC

    Shared Memory Parallelization of MTTKRP for Dense Tensors

    Authors: Koby Hayashi, Grey Ballard, Jeffrey Jiang, Michael Tobia

    Abstract: The matricized-tensor times Khatri-Rao product (MTTKRP) is the computational bottleneck for algorithms computing CP decompositions of tensors. In this paper, we develop shared-memory parallel algorithms for MTTKRP involving dense tensors. The algorithms cast nearly all of the computation as matrix operations in order to use optimized BLAS subroutines, and they avoid reordering tensor entries in me…

    Submitted 29 August, 2017; originally announced August 2017.

    Comments: 10 pages, 27 figures

  47. arXiv:1702.05563  [pdf, other]

    cs.LG

    On the Equivalence of Holographic and Complex Embeddings for Link Prediction

    Authors: Katsuhiko Hayashi, Masashi Shimbo

    Abstract: We show the equivalence of two state-of-the-art link prediction/knowledge graph completion methods: Nickel et al.'s holographic embedding and Trouillon et al.'s complex embedding. We first consider a spectral version of the holographic embedding, exploiting the frequency domain in the Fourier transform for efficient computation. The analysis of the resulting method reveals that it can be viewed as…

    Submitted 22 September, 2017; v1 submitted 17 February, 2017; originally announced February 2017.

    Comments: This is a slightly modified version of the paper of the same title that appeared in ACL 2017

  48. arXiv:1608.07179  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Minimizing Quadratic Functions in Constant Time

    Authors: Kohei Hayashi, Yuichi Yoshida

    Abstract: A sampling-based optimization method for quadratic functions is proposed. Our method approximately solves the following $n$-dimensional quadratic minimization problem in constant time, which is independent of $n$: $z^*=\min_{\mathbf{v} \in \mathbb{R}^n}\langle\mathbf{v}, A \mathbf{v}\rangle + n\langle\mathbf{v}, \mathrm{diag}(\mathbf{d})\mathbf{v}\rangle + n\langle\mathbf{b}, \mathbf{v}\rangle$, w…

    Submitted 25 August, 2016; originally announced August 2016.

    Comments: An extended abstract will appear in the proceedings of NIPS'16

  49. arXiv:1602.02256  [pdf, ps, other]

    cs.LG stat.ML

    A Tractable Fully Bayesian Method for the Stochastic Block Model

    Authors: Kohei Hayashi, Takuya Konishi, Tatsuro Kawamoto

    Abstract: The stochastic block model (SBM) is a generative model revealing macroscopic structures in graphs. Bayesian methods are used for (i) cluster assignment inference and (ii) model selection for the number of clusters. In this paper, we study the behavior of Bayesian inference in the SBM in the large sample limit. Combining variational approximation and Laplace's method, a consistent criterion of the…

    Submitted 6 February, 2016; originally announced February 2016.

  50. arXiv:1510.07273  [pdf, ps, other]

    cs.IT

    Multiuser Detection by MAP Estimation with Sum-of-Absolute-Values Relaxation

    Authors: Hampei Sasahara, Kazunori Hayashi, Masaaki Nagahara

    Abstract: In this article, we consider multiuser detection that copes with multiple access interference caused in star-topology machine-to-machine (M2M) communications. We assume that the transmitted signals are discrete-valued (e.g. binary signals taking values of $\pm 1$), which is taken into account as prior information in detection. We formulate the detection problem as the maximum a posteriori (MAP) es…

    Submitted 25 October, 2015; originally announced October 2015.

    Comments: submitted; 6 pages, 7 figures