-
Differentiable Logic Programming for Distant Supervision
Authors:
Akihiro Takemura,
Katsumi Inoue
Abstract:
We introduce a new method for integrating neural networks with logic programming in Neural-Symbolic AI (NeSy), aimed at learning with distant supervision, in which direct labels are unavailable. Unlike prior methods, our approach does not depend on symbolic solvers for reasoning about missing labels. Instead, it evaluates logical implications and constraints in a differentiable manner by embedding…
▽ More
We introduce a new method for integrating neural networks with logic programming in Neural-Symbolic AI (NeSy), aimed at learning with distant supervision, in which direct labels are unavailable. Unlike prior methods, our approach does not depend on symbolic solvers for reasoning about missing labels. Instead, it evaluates logical implications and constraints in a differentiable manner by embedding both neural network outputs and logic programs into matrices. This method facilitates more efficient learning under distant supervision. We evaluated our approach against existing methods while maintaining a constant volume of training data. The findings indicate that our method not only matches or exceeds the accuracy of other methods across various tasks but also speeds up the learning process. These results highlight the potential of our approach to enhance both accuracy and learning efficiency in NeSy applications.
△ Less
Submitted 25 August, 2024; v1 submitted 22 August, 2024;
originally announced August 2024.
-
Variable Assignment Invariant Neural Networks for Learning Logic Programs
Authors:
Yin Jun Phua,
Katsumi Inoue
Abstract:
Learning from interpretation transition (LFIT) is a framework for learning rules from observed state transitions. LFIT has been implemented in purely symbolic algorithms, but they are unable to deal with noise or generalize to unobserved transitions. Rule extraction based neural network methods suffer from overfitting, while more general implementation that categorize rules suffer from combinatori…
▽ More
Learning from interpretation transition (LFIT) is a framework for learning rules from observed state transitions. LFIT has been implemented in purely symbolic algorithms, but they are unable to deal with noise or generalize to unobserved transitions. Rule extraction based neural network methods suffer from overfitting, while more general implementation that categorize rules suffer from combinatorial explosion. In this paper, we introduce a technique to leverage variable permutation invariance inherent in symbolic domains. Our technique ensures that the permutation and the naming of the variables would not affect the results. We demonstrate the effectiveness and the scalability of this method with various experiments. Our code is publicly available at https://github.com/phuayj/delta-lfit-2
△ Less
Submitted 20 August, 2024;
originally announced August 2024.
-
Abductive Reasoning in a Paraconsistent Framework
Authors:
Meghyn Bienvenu,
Katsumi Inoue,
Daniil Kozhemiachenko
Abstract:
We explore the problem of explaining observations starting from a classically inconsistent theory by adopting a paraconsistent framework. We consider two expansions of the well-known Belnap--Dunn paraconsistent four-valued logic $\mathsf{BD}$: $\mathsf{BD}_\circ$ introduces formulas of the form $\circφ$ (the information on $φ$ is reliable), while $\mathsf{BD}_\triangle$ augments the language with…
▽ More
We explore the problem of explaining observations starting from a classically inconsistent theory by adopting a paraconsistent framework. We consider two expansions of the well-known Belnap--Dunn paraconsistent four-valued logic $\mathsf{BD}$: $\mathsf{BD}_\circ$ introduces formulas of the form $\circφ$ (the information on $φ$ is reliable), while $\mathsf{BD}_\triangle$ augments the language with $\triangleφ$'s (there is information that $φ$ is true). We define and motivate the notions of abduction problems and explanations in $\mathsf{BD}_\circ$ and $\mathsf{BD}_\triangle$ and show that they are not reducible to one another. We analyse the complexity of standard abductive reasoning tasks (solution recognition, solution existence, and relevance / necessity of hypotheses) in both logics. Finally, we show how to reduce abduction in $\mathsf{BD}_\circ$ and $\mathsf{BD}_\triangle$ to abduction in classical propositional logic, thereby enabling the reuse of existing abductive reasoning procedures.
△ Less
Submitted 23 August, 2024; v1 submitted 1 August, 2024;
originally announced August 2024.
-
Large Neighborhood Prioritized Search for Combinatorial Optimization with Answer Set Programming
Authors:
Irumi Sugimori,
Katsumi Inoue,
Hidetomo Nabeshima,
Torsten Schaub,
Takehide Soh,
Naoyuki Tamura,
Mutsunori Banbara
Abstract:
We propose Large Neighborhood Prioritized Search (LNPS) for solving combinatorial optimization problems in Answer Set Programming (ASP). LNPS is a metaheuristic that starts with an initial solution and then iteratively tries to find better solutions by alternately destroying and prioritized searching for a current solution. Due to the variability of neighborhoods, LNPS allows for flexible search w…
▽ More
We propose Large Neighborhood Prioritized Search (LNPS) for solving combinatorial optimization problems in Answer Set Programming (ASP). LNPS is a metaheuristic that starts with an initial solution and then iteratively tries to find better solutions by alternately destroying and prioritized searching for a current solution. Due to the variability of neighborhoods, LNPS allows for flexible search without strongly depending on the destroy operators. We present an implementation of LNPS based on ASP. The resulting heulingo solver demonstrates that LNPS can significantly enhance the solving performance of ASP for optimization. Furthermore, we establish the competitiveness of our LNPS approach by empirically contrasting it to (adaptive) large neighborhood search.
△ Less
Submitted 18 May, 2024;
originally announced May 2024.
-
MOD-CL: Multi-label Object Detection with Constrained Loss
Authors:
Sota Moriyama,
Koji Watanabe,
Katsumi Inoue,
Akihiro Takemura
Abstract:
We introduce MOD-CL, a multi-label object detection framework that utilizes constrained loss in the training process to produce outputs that better satisfy the given requirements. In this paper, we use $\mathrm{MOD_{YOLO}}$, a multi-label object detection model built upon the state-of-the-art object detection model YOLOv8, which has been published in recent years. In Task 1, we introduce the Corre…
▽ More
We introduce MOD-CL, a multi-label object detection framework that utilizes constrained loss in the training process to produce outputs that better satisfy the given requirements. In this paper, we use $\mathrm{MOD_{YOLO}}$, a multi-label object detection model built upon the state-of-the-art object detection model YOLOv8, which has been published in recent years. In Task 1, we introduce the Corrector Model and Blender Model, two new models that follow after the object detection process, aiming to generate a more constrained output. For Task 2, constrained losses have been incorporated into the $\mathrm{MOD_{YOLO}}$ architecture using Product T-Norm. The results show that these implementations are instrumental to improving the scores for both Task 1 and Task 2.
△ Less
Submitted 31 January, 2024;
originally announced March 2024.
-
Multilingual Turn-taking Prediction Using Voice Activity Projection
Authors:
Koji Inoue,
Bing'er Jiang,
Erik Ekstedt,
Tatsuya Kawahara,
Gabriel Skantze
Abstract:
This paper investigates the application of voice activity projection (VAP), a predictive turn-taking model for spoken dialogue, on multilingual data, encompassing English, Mandarin, and Japanese. The VAP model continuously predicts the upcoming voice activities of participants in dyadic dialogue, leveraging a cross-attention Transformer to capture the dynamic interplay between participants. The re…
▽ More
This paper investigates the application of voice activity projection (VAP), a predictive turn-taking model for spoken dialogue, on multilingual data, encompassing English, Mandarin, and Japanese. The VAP model continuously predicts the upcoming voice activities of participants in dyadic dialogue, leveraging a cross-attention Transformer to capture the dynamic interplay between participants. The results show that a monolingual VAP model trained on one language does not make good predictions when applied to other languages. However, a multilingual model, trained on all three languages, demonstrates predictive performance on par with monolingual models across all languages. Further analyses show that the multilingual model has learned to discern the language of the input signal. We also analyze the sensitivity to pitch, a prosodic cue that is thought to be important for turn-taking. Finally, we compare two different audio encoders, contrastive predictive coding (CPC) pre-trained on English, with a recent model based on multilingual wav2vec 2.0 (MMS).
△ Less
Submitted 14 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
SFQ counter-based precomputation for large-scale cryogenic VQE machines
Authors:
Yosuke Ueno,
Satoshi Imamura,
Yuna Tomida,
Teruo Tanimoto,
Masamitsu Tanaka,
Yutaka Tabuchi,
Koji Inoue,
Hiroshi Nakamura
Abstract:
The variational quantum eigensolver (VQE) is a promising candidate that brings practical benefits from quantum computing. However, the required bandwidth in/out of a cryostat is a limiting factor to scale cryogenic quantum computers. We propose a tailored counter-based module with single flux quantum circuits in 4-K stage which precomputes a part of VQE calculation and reduces the amount of inter-…
▽ More
The variational quantum eigensolver (VQE) is a promising candidate that brings practical benefits from quantum computing. However, the required bandwidth in/out of a cryostat is a limiting factor to scale cryogenic quantum computers. We propose a tailored counter-based module with single flux quantum circuits in 4-K stage which precomputes a part of VQE calculation and reduces the amount of inter-temperature communication. The evaluation shows that our system reduces the required bandwidth by 97%, and with this drastic reduction, total power consumption is reduced by 93% in the case where 277 VQE programs are executed in parallel on a 10000-qubit machine.
△ Less
Submitted 1 March, 2024;
originally announced March 2024.
-
Evaluation of a semi-autonomous attentive listening system with takeover prompting
Authors:
Haruki Kawai,
Divesh Lala,
Koji Inoue,
Keiko Ochi,
Tatsuya Kawahara
Abstract:
The handling of communication breakdowns and loss of engagement is an important aspect of spoken dialogue systems, particularly for chatting systems such as attentive listening, where the user is mostly speaking. We presume that a human is best equipped to handle this task and rescue the flow of conversation. To this end, we propose a semi-autonomous system, where a remote operator can take contro…
▽ More
The handling of communication breakdowns and loss of engagement is an important aspect of spoken dialogue systems, particularly for chatting systems such as attentive listening, where the user is mostly speaking. We presume that a human is best equipped to handle this task and rescue the flow of conversation. To this end, we propose a semi-autonomous system, where a remote operator can take control of an autonomous attentive listening system in real-time. In order to make human intervention easy and consistent, we introduce automatic detection of low interest and engagement to provide explicit takeover prompts to the remote operator. We implement this semi-autonomous system which detects takeover points for the operator and compare it to fully tele-operated and fully autonomous attentive listening systems. We find that the semi-autonomous system is generally perceived more positively than the autonomous system. The results suggest that identifying points of conversation when the user starts to lose interest may help us improve a fully autonomous dialogue system.
△ Less
Submitted 20 February, 2024;
originally announced February 2024.
-
Acknowledgment of Emotional States: Generating Validating Responses for Empathetic Dialogue
Authors:
Zi Haur Pang,
Yahui Fu,
Divesh Lala,
Keiko Ochi,
Koji Inoue,
Tatsuya Kawahara
Abstract:
In the realm of human-AI dialogue, the facilitation of empathetic responses is important. Validation is one of the key communication techniques in psychology, which entails recognizing, understanding, and acknowledging others' emotional states, thoughts, and actions. This study introduces the first framework designed to engender empathetic dialogue with validating responses. Our approach incorpora…
▽ More
In the realm of human-AI dialogue, the facilitation of empathetic responses is important. Validation is one of the key communication techniques in psychology, which entails recognizing, understanding, and acknowledging others' emotional states, thoughts, and actions. This study introduces the first framework designed to engender empathetic dialogue with validating responses. Our approach incorporates a tripartite module system: 1) validation timing detection, 2) users' emotional state identification, and 3) validating response generation. Utilizing Japanese EmpatheticDialogues dataset - a textual-based dialogue dataset consisting of 8 emotional categories from Plutchik's wheel of emotions - the Task Adaptive Pre-Training (TAPT) BERT-based model outperforms both random baseline and the ChatGPT performance, in term of F1-score, in all modules. Further validation of our model's efficacy is confirmed in its application to the TUT Emotional Storytelling Corpus (TESC), a speech-based dialogue dataset, by surpassing both random baseline and the ChatGPT. This consistent performance across both textual and speech-based dialogues underscores the effectiveness of our framework in fostering empathetic human-AI communication.
△ Less
Submitted 20 February, 2024;
originally announced February 2024.
-
Real-time and Continuous Turn-taking Prediction Using Voice Activity Projection
Authors:
Koji Inoue,
Bing'er Jiang,
Erik Ekstedt,
Tatsuya Kawahara,
Gabriel Skantze
Abstract:
A demonstration of a real-time and continuous turn-taking prediction system is presented. The system is based on a voice activity projection (VAP) model, which directly maps dialogue stereo audio to future voice activities. The VAP model includes contrastive predictive coding (CPC) and self-attention transformers, followed by a cross-attention transformer. We examine the effect of the input contex…
▽ More
A demonstration of a real-time and continuous turn-taking prediction system is presented. The system is based on a voice activity projection (VAP) model, which directly maps dialogue stereo audio to future voice activities. The VAP model includes contrastive predictive coding (CPC) and self-attention transformers, followed by a cross-attention transformer. We examine the effect of the input context audio length and demonstrate that the proposed system can operate in real-time with CPU settings, with minimal performance degradation.
△ Less
Submitted 9 January, 2024;
originally announced January 2024.
-
An Analysis of User Behaviors for Objectively Evaluating Spoken Dialogue Systems
Authors:
Koji Inoue,
Divesh Lala,
Keiko Ochi,
Tatsuya Kawahara,
Gabriel Skantze
Abstract:
Establishing evaluation schemes for spoken dialogue systems is important, but it can also be challenging. While subjective evaluations are commonly used in user experiments, objective evaluations are necessary for research comparison and reproducibility. To address this issue, we propose a framework for indirectly but objectively evaluating systems based on users' behaviors. In this paper, to this…
▽ More
Establishing evaluation schemes for spoken dialogue systems is important, but it can also be challenging. While subjective evaluations are commonly used in user experiments, objective evaluations are necessary for research comparison and reproducibility. To address this issue, we propose a framework for indirectly but objectively evaluating systems based on users' behaviors. In this paper, to this end, we investigate the relationship between user behaviors and subjective evaluation scores in social dialogue tasks: attentive listening, job interview, and first-meeting conversation. The results reveal that in dialogue tasks where user utterances are primary, such as attentive listening and job interview, indicators like the number of utterances and words play a significant role in evaluation. Observing disfluency also can indicate the effectiveness of formal tasks, such as job interview. On the other hand, in dialogue tasks with high interactivity, such as first-meeting conversation, behaviors related to turn-taking, like average switch pause length, become more important. These findings suggest that selecting appropriate user behaviors can provide valuable insights for objective evaluation in each social dialogue task.
△ Less
Submitted 23 January, 2024; v1 submitted 9 January, 2024;
originally announced January 2024.
-
Analysis of Psychographic Indicators via LIWC and Their Correlation with CTR for Instagram Ads
Authors:
Kenjiro Inoue,
Mitsuo Yoshida
Abstract:
The online advertising industry continues to grow and accounts for over 40% of global advertising spending. Online display advertising consists of images and text, and advertisers maximize sales revenue by contacting consumers through advertisements and encouraging them to make purchases. In today's society, where products are becoming more homogenized and needs are diversifying, appealing to cons…
▽ More
The online advertising industry continues to grow and accounts for over 40% of global advertising spending. Online display advertising consists of images and text, and advertisers maximize sales revenue by contacting consumers through advertisements and encouraging them to make purchases. In today's society, where products are becoming more homogenized and needs are diversifying, appealing to consumer psychology through advertisements is becoming increasingly important. However, it is not sufficiently clear what kind of appeal influences consumer psychology. In this study, we quantified the appeal of the text in advertisements for health products and cosmetics, which were actually delivered in Instagram advertisements (one of display advertisements), by applying linguistic inquiry and word count (LIWC). The correlation between click-through rate (CTR) and the text was analyzed. The results showed that negative appeals that arouse consumer anxiety and a sense of crisis were related to CTR.
△ Less
Submitted 13 December, 2023;
originally announced December 2023.
-
Structured World Representations in Maze-Solving Transformers
Authors:
Michael Igorevich Ivanitskiy,
Alex F. Spies,
Tilman Räuker,
Guillaume Corlouer,
Chris Mathwin,
Lucia Quirke,
Can Rager,
Rusheb Shah,
Dan Valentine,
Cecilia Diniz Behn,
Katsumi Inoue,
Samy Wu Fung
Abstract:
Transformer models underpin many recent advances in practical machine learning applications, yet understanding their internal behavior continues to elude researchers. Given the size and complexity of these models, forming a comprehensive picture of their inner workings remains a significant challenge. To this end, we set out to understand small transformer models in a more tractable setting: that…
▽ More
Transformer models underpin many recent advances in practical machine learning applications, yet understanding their internal behavior continues to elude researchers. Given the size and complexity of these models, forming a comprehensive picture of their inner workings remains a significant challenge. To this end, we set out to understand small transformer models in a more tractable setting: that of solving mazes. In this work, we focus on the abstractions formed by these models and find evidence for the consistent emergence of structured internal representations of maze topology and valid paths. We demonstrate this by showing that the residual stream of only a single token can be linearly decoded to faithfully reconstruct the entire maze. We also find that the learned embeddings of individual tokens have spatial structure. Furthermore, we take steps towards deciphering the circuity of path-following by identifying attention heads (dubbed $\textit{adjacency heads}$), which are implicated in finding valid subsequent tokens.
△ Less
Submitted 5 December, 2023;
originally announced December 2023.
-
Inter-temperature Bandwidth Reduction in Cryogenic QAOA Machines
Authors:
Yosuke Ueno,
Yuna Tomida,
Teruo Tanimoto,
Masamitsu Tanaka,
Yutaka Tabuchi,
Koji Inoue,
Hiroshi Nakamura
Abstract:
The bandwidth limit between cryogenic and room-temperature environments is a critical bottleneck in superconducting noisy intermediate-scale quantum computers. This paper presents the first trial of algorithm-aware system-level optimization to solve this issue by targeting the quantum approximate optimization algorithm. Our counter-based cryogenic architecture using single-flux quantum logic shows…
▽ More
The bandwidth limit between cryogenic and room-temperature environments is a critical bottleneck in superconducting noisy intermediate-scale quantum computers. This paper presents the first trial of algorithm-aware system-level optimization to solve this issue by targeting the quantum approximate optimization algorithm. Our counter-based cryogenic architecture using single-flux quantum logic shows exponential bandwidth reduction and decreases heat inflow and peripheral power consumption of inter-temperature cables, which contributes to the scalability of superconducting quantum computers.
△ Less
Submitted 2 October, 2023;
originally announced October 2023.
-
Who Made This Copy? An Empirical Analysis of Code Clone Authorship
Authors:
Reishi Yokomori,
Katsuro Inoue
Abstract:
Code clones are code snippets that are identical or similar to other snippets within the same or different files. They are often created through copy-and-paste practices during development and maintenance activities. Since code clones may require consistent updates and coherent management, they present a challenging issue in software maintenance. Therefore, many studies have been conducted to find…
▽ More
Code clones are code snippets that are identical or similar to other snippets within the same or different files. They are often created through copy-and-paste practices during development and maintenance activities. Since code clones may require consistent updates and coherent management, they present a challenging issue in software maintenance. Therefore, many studies have been conducted to find various types of clones with accuracy, scalability, or performance. However, the exploration of the nature of code clones has been limited. Even the fundamental question of whether code snippets in the same clone set were written by the same author or different authors has not been thoroughly investigated.
In this paper, we investigate the characteristics of code clones with a focus on authorship. We analyzed the authorship of code clones at the line-level granularity for Java files in 153 Apache projects stored on GitHub and addressed three research questions.
Based on these research questions, we found that there are a substantial number of clone lines across all projects (an average of 18.5\% for all projects). Furthermore, authors who contribute to many non-clone lines also contribute to many clone lines. Additionally, we found that one-third of clone sets are primarily contributed to by multiple leading authors.
These results confirm our intuitive understanding of clone characteristics, although no previous publications have provided empirical validation data from multiple projects. As the results could assist in designing better clone management techniques, we will explore the implications of developing an effective clone management tool.
△ Less
Submitted 3 September, 2023;
originally announced September 2023.
-
Towards Objective Evaluation of Socially-Situated Conversational Robots: Assessing Human-Likeness through Multimodal User Behaviors
Authors:
Koji Inoue,
Divesh Lala,
Keiko Ochi,
Tatsuya Kawahara,
Gabriel Skantze
Abstract:
This paper tackles the challenging task of evaluating socially situated conversational robots and presents a novel objective evaluation approach that relies on multimodal user behaviors. In this study, our main focus is on assessing the human-likeness of the robot as the primary evaluation metric. While previous research often relied on subjective evaluations from users, our approach aims to evalu…
▽ More
This paper tackles the challenging task of evaluating socially situated conversational robots and presents a novel objective evaluation approach that relies on multimodal user behaviors. In this study, our main focus is on assessing the human-likeness of the robot as the primary evaluation metric. While previous research often relied on subjective evaluations from users, our approach aims to evaluate the robot's human-likeness based on observable user behaviors indirectly, thus enhancing objectivity and reproducibility. To begin, we created an annotated dataset of human-likeness scores, utilizing user behaviors found in an attentive listening dialogue corpus. We then conducted an analysis to determine the correlation between multimodal user behaviors and human-likeness scores, demonstrating the feasibility of our proposed behavior-based evaluation method.
△ Less
Submitted 25 September, 2023; v1 submitted 21 August, 2023;
originally announced August 2023.
-
Reasoning before Responding: Integrating Commonsense-based Causality Explanation for Empathetic Response Generation
Authors:
Yahui Fu,
Koji Inoue,
Chenhui Chu,
Tatsuya Kawahara
Abstract:
Recent approaches to empathetic response generation try to incorporate commonsense knowledge or reasoning about the causes of emotions to better understand the user's experiences and feelings. However, these approaches mainly focus on understanding the causalities of context from the user's perspective, ignoring the system's perspective. In this paper, we propose a commonsense-based causality expl…
▽ More
Recent approaches to empathetic response generation try to incorporate commonsense knowledge or reasoning about the causes of emotions to better understand the user's experiences and feelings. However, these approaches mainly focus on understanding the causalities of context from the user's perspective, ignoring the system's perspective. In this paper, we propose a commonsense-based causality explanation approach for diverse empathetic response generation that considers both the user's perspective (user's desires and reactions) and the system's perspective (system's intentions and reactions). We enhance ChatGPT's ability to reason for the system's perspective by integrating in-context learning with commonsense knowledge. Then, we integrate the commonsense-based causality explanation with both ChatGPT and a T5-based model. Experimental evaluations demonstrate that our method outperforms other comparable methods on both automatic and human evaluations.
△ Less
Submitted 5 September, 2023; v1 submitted 27 July, 2023;
originally announced August 2023.
-
Bounded Combinatorial Reconfiguration with Answer Set Programming
Authors:
Yuya Yamada,
Mutsunori Banbara,
Katsumi Inoue,
Torsten Schaub
Abstract:
We develop an approach called bounded combinatorial reconfiguration for solving combinatorial reconfiguration problems based on Answer Set Programming (ASP). The general task is to study the solution spaces of source combinatorial problems and to decide whether or not there are sequences of feasible solutions that have special properties. The resulting recongo solver covers all metrics of the solv…
▽ More
We develop an approach called bounded combinatorial reconfiguration for solving combinatorial reconfiguration problems based on Answer Set Programming (ASP). The general task is to study the solution spaces of source combinatorial problems and to decide whether or not there are sequences of feasible solutions that have special properties. The resulting recongo solver covers all metrics of the solver track in the most recent international competition on combinatorial reconfiguration (CoRe Challenge 2022). recongo ranked first in the shortest metric of the single-engine solvers track. In this paper, we present the design and implementation of bounded combinatorial reconfiguration, and present an ASP encoding of the independent set reconfiguration problem that is one of the most studied combinatorial reconfiguration problems. Finally, we present empirical analysis considering all instances of CoRe Challenge 2022.
△ Less
Submitted 20 July, 2023;
originally announced July 2023.
-
An end-to-end framework for gene expression classification by integrating a background knowledge graph: application to cancer prognosis prediction
Authors:
Kazuma Inoue,
Ryosuke Kojima,
Mayumi Kamada,
Yasushi Okuno
Abstract:
Biological data may be separated into primary data, such as gene expression, and secondary data, such as pathways and protein-protein interactions. Methods using secondary data to enhance the analysis of primary data are promising, because secondary data have background information that is not included in primary data. In this study, we proposed an end-to-end framework to integrally handle seconda…
▽ More
Biological data may be separated into primary data, such as gene expression, and secondary data, such as pathways and protein-protein interactions. Methods using secondary data to enhance the analysis of primary data are promising, because secondary data have background information that is not included in primary data. In this study, we proposed an end-to-end framework to integrally handle secondary data to construct a classification model for primary data. We applied this framework to cancer prognosis prediction using gene expression data and a biological network. Cross-validation results indicated that our model achieved higher accuracy compared with a deep neural network model without background biological network information. Experiments conducted in patient groups by cancer type showed improvement in ROC-area under the curve for many groups. Visualizations of high accuracy cancer types identified contributing genes and pathways by enrichment analysis. Known biomarkers and novel biomarker candidates were identified through these experiments.
△ Less
Submitted 29 June, 2023;
originally announced June 2023.
-
Promises and Perils of Mining Software Package Ecosystem Data
Authors:
Raula Gaikovina Kula,
Katsuro Inoue,
Christoph Treude
Abstract:
The use of third-party packages is becoming increasingly popular and has led to the emergence of large software package ecosystems with a maze of inter-dependencies. Since the reliance on these ecosystems enables developers to reduce development effort and increase productivity, it has attracted the interest of researchers: understanding the infrastructure and dynamics of package ecosystems has gi…
▽ More
The use of third-party packages is becoming increasingly popular and has led to the emergence of large software package ecosystems with a maze of inter-dependencies. Since the reliance on these ecosystems enables developers to reduce development effort and increase productivity, it has attracted the interest of researchers: understanding the infrastructure and dynamics of package ecosystems has given rise to approaches for better code reuse, automated updates, and the avoidance of vulnerabilities, to name a few examples. But the reality of these ecosystems also poses challenges to software engineering researchers, such as: How do we obtain the complete network of dependencies along with the corresponding versioning information? What are the boundaries of these package ecosystems? How do we consistently detect dependencies that are declared but not used? How do we consistently identify developers within a package ecosystem? How much of the ecosystem do we need to understand to analyse a single component? How well do our approaches generalise across different programming languages and package ecosystems? In this chapter, we review promises and perils of mining the rich data related to software package ecosystems available to software engineering researchers.
△ Less
Submitted 28 May, 2023;
originally announced June 2023.
-
Motion Capture Dataset for Practical Use of AI-based Motion Editing and Stylization
Authors:
Makito Kobayashi,
Chen-Chieh Liao,
Keito Inoue,
Sentaro Yojima,
Masafumi Takahashi
Abstract:
In this work, we proposed a new style-diverse dataset for the domain of motion style transfer. The motion dataset uses an industrial-standard human bone structure and thus is industry-ready to be plugged into 3D characters for many projects. We claim the challenges in motion style transfer and encourage future work in this domain by releasing the proposed motion dataset both to the public and the…
▽ More
In this work, we proposed a new style-diverse dataset for the domain of motion style transfer. The motion dataset uses an industrial-standard human bone structure and thus is industry-ready to be plugged into 3D characters for many projects. We claim the challenges in motion style transfer and encourage future work in this domain by releasing the proposed motion dataset both to the public and the market. We conduct a comprehensive study on motion style transfer in the experiment using the state-of-the-art method, and the results show the proposed dataset's validity for the motion style transfer task.
△ Less
Submitted 9 July, 2023; v1 submitted 15 June, 2023;
originally announced June 2023.
-
Towards end-to-end ASP computation
Authors:
Taisuke Sato,
Akihiro Takemura,
Katsumi Inoue
Abstract:
We propose an end-to-end approach for answer set programming (ASP) and linear algebraically compute stable models satisfying given constraints. The idea is to implement Lin-Zhao's theorem \cite{Lin04} together with constraints directly in vector spaces as numerical minimization of a cost function constructed from a matricized normal logic program, loop formulas in Lin-Zhao's theorem and constraint…
▽ More
We propose an end-to-end approach for answer set programming (ASP) and linear algebraically compute stable models satisfying given constraints. The idea is to implement Lin-Zhao's theorem \cite{Lin04} together with constraints directly in vector spaces as numerical minimization of a cost function constructed from a matricized normal logic program, loop formulas in Lin-Zhao's theorem and constraints, thereby no use of symbolic ASP or SAT solvers involved in our approach. We also propose precomputation that shrinks the program size and heuristics for loop formulas to reduce computational difficulty. We empirically test our approach with programming examples including the 3-coloring and Hamiltonian cycle problems. As our approach is purely numerical and only contains vector/matrix operations, acceleration by parallel technologies such as many-cores and GPUs is expected.
△ Less
Submitted 13 June, 2023; v1 submitted 11 June, 2023;
originally announced June 2023.
-
I Know Your Feelings Before You Do: Predicting Future Affective Reactions in Human-Computer Dialogue
Authors:
Yuanchao Li,
Koji Inoue,
Leimin Tian,
Changzeng Fu,
Carlos Ishi,
Hiroshi Ishiguro,
Tatsuya Kawahara,
Catherine Lai
Abstract:
Current Spoken Dialogue Systems (SDSs) often serve as passive listeners that respond only after receiving user speech. To achieve human-like dialogue, we propose a novel future prediction architecture that allows an SDS to anticipate future affective reactions based on its current behaviors before the user speaks. In this work, we investigate two scenarios: speech and laughter. In speech, we propo…
▽ More
Current Spoken Dialogue Systems (SDSs) often serve as passive listeners that respond only after receiving user speech. To achieve human-like dialogue, we propose a novel future prediction architecture that allows an SDS to anticipate future affective reactions based on its current behaviors before the user speaks. In this work, we investigate two scenarios: speech and laughter. In speech, we propose to predict the user's future emotion based on its temporal relationship with the system's current emotion and its causal relationship with the system's current Dialogue Act (DA). In laughter, we propose to predict the occurrence and type of the user's laughter using the system's laughter behaviors in the current turn. Preliminary analysis of human-robot dialogue demonstrated synchronicity in the emotions and laughter displayed by the human and robot, as well as DA-emotion causality in their dialogue. This verifies that our architecture can contribute to the development of an anticipatory SDS.
△ Less
Submitted 17 March, 2023; v1 submitted 28 February, 2023;
originally announced March 2023.
-
Learning State Transition Rules from Hidden Layers of Restricted Boltzmann Machines
Authors:
Koji Watanabe,
Katsumi Inoue
Abstract:
Understanding the dynamics of a system is important in many scientific and engineering domains. This problem can be approached by learning state transition rules from observations using machine learning techniques. Such observed time-series data often consist of sequences of many continuous variables with noise and ambiguity, but we often need rules of dynamics that can be modeled with a few essen…
▽ More
Understanding the dynamics of a system is important in many scientific and engineering domains. This problem can be approached by learning state transition rules from observations using machine learning techniques. Such observed time-series data often consist of sequences of many continuous variables with noise and ambiguity, but we often need rules of dynamics that can be modeled with a few essential variables. In this work, we propose a method for extracting a small number of essential hidden variables from high-dimensional time-series data and for learning state transition rules between these hidden variables. The proposed method is based on the Restricted Boltzmann Machine (RBM), which treats observable data in the visible layer and latent features in the hidden layer. However, real-world data, such as video and audio, include both discrete and continuous variables, and these variables have temporal relationships. Therefore, we propose Recurrent Temporal GaussianBernoulli Restricted Boltzmann Machine (RTGB-RBM), which combines Gaussian-Bernoulli Restricted Boltzmann Machine (GB-RBM) to handle continuous visible variables, and Recurrent Temporal Restricted Boltzmann Machine (RT-RBM) to capture time dependence between discrete hidden variables. We also propose a rule-based method that extracts essential information as hidden variables and represents state transition rules in interpretable form. We conduct experiments on Bouncing Ball and Moving MNIST datasets to evaluate our proposed method. Experimental results show that our method can learn the dynamics of those physical systems as state transition rules between hidden variables and can predict unobserved future states from observed state transitions.
△ Less
Submitted 6 December, 2022;
originally announced December 2022.
-
Alzheimer's Dementia Detection through Spontaneous Dialogue with Proactive Robotic Listeners
Authors:
Yuanchao Li,
Catherine Lai,
Divesh Lala,
Koji Inoue,
Tatsuya Kawahara
Abstract:
As the aging of society continues to accelerate, Alzheimer's Disease (AD) has received more and more attention from not only medical but also other fields, such as computer science, over the past decade. Since speech is considered one of the effective ways to diagnose cognitive decline, AD detection from speech has emerged as a hot topic. Nevertheless, such approaches fail to tackle several key is…
▽ More
As the aging of society continues to accelerate, Alzheimer's Disease (AD) has received more and more attention from not only medical but also other fields, such as computer science, over the past decade. Since speech is considered one of the effective ways to diagnose cognitive decline, AD detection from speech has emerged as a hot topic. Nevertheless, such approaches fail to tackle several key issues: 1) AD is a complex neurocognitive disorder which means it is inappropriate to conduct AD detection using utterance information alone while ignoring dialogue information; 2) Utterances of AD patients contain many disfluencies that affect speech recognition yet are helpful to diagnosis; 3) AD patients tend to speak less, causing dialogue breakdown as the disease progresses. This fact leads to a small number of utterances, which may cause detection bias. Therefore, in this paper, we propose a novel AD detection architecture consisting of two major modules: an ensemble AD detector and a proactive listener. This architecture can be embedded in the dialogue system of conversational robots for healthcare.
△ Less
Submitted 15 November, 2022;
originally announced November 2022.
-
Dialogue system with humanoid robot
Authors:
Koki Inoue,
Shuichiro Ogake,
Hayato Kawamura,
Naoki Igo
Abstract:
Today, as seen in smart speakers, spoken dialogue technology is rapidly advancing to enable human-like interaction. However, current dialogue systems cannot pay attention not only to the content of speech, but also to the way of speaking and eye contact and facial expressions, while watching the facial expressions of the person with whom one is speaking. Therefore, this study participated in a Jap…
▽ More
Today, as seen in smart speakers, spoken dialogue technology is rapidly advancing to enable human-like interaction. However, current dialogue systems cannot pay attention not only to the content of speech, but also to the way of speaking and eye contact and facial expressions, while watching the facial expressions of the person with whom one is speaking. Therefore, this study participated in a Japanese competition called the "Dialogue Robot Competition" and attempted to develop a dialogue system that includes control of not only the content of speech but also the robot's facial expressions and gaze in order to realize a humanoid robot that can naturally interact with humans.
△ Less
Submitted 18 October, 2022;
originally announced October 2022.
-
Action Languages Based Actual Causality for Computational Ethics: a Sound and Complete Implementation in ASP
Authors:
Camilo Sarmiento,
Gauvain Bourgne,
Katsumi Inoue,
Daniele Cavalli,
Jean-Gabriel Ganascia
Abstract:
Although moral responsibility is not circumscribed by causality, they are both closely intermixed. Furthermore, rationally understanding the evolution of the physical world is inherently linked with the idea of causality. Thus, the decision-making applications based on automated planning inevitably have to deal with causality, especially if they consider imputability aspects or integrate reference…
▽ More
Although moral responsibility is not circumscribed by causality, they are both closely intermixed. Furthermore, rationally understanding the evolution of the physical world is inherently linked with the idea of causality. Thus, the decision-making applications based on automated planning inevitably have to deal with causality, especially if they consider imputability aspects or integrate references to ethical norms. The many debates around causation in the last decades have shown how complex this notion is and thus, how difficult is its integration with planning. As a result, much of the work in computational ethics relegates causality to the background, despite the considerations stated above. This paper's contribution is to provide a complete and sound translation into logic programming from an actual causation definition suitable for action languages, this definition is a formalisation of Wright's NESS test. The obtained logic program allows to deal with complex causal relations. In addition to enabling agents to reason about causality, this contribution specifically enables the computational ethics domain to handle situations that were previously out of reach. In a context where ethical considerations in decision-making are increasingly important, advances in computational ethics can greatly benefit the entire AI community.
△ Less
Submitted 24 May, 2023; v1 submitted 5 May, 2022;
originally announced May 2022.
-
Physical Deep Learning with Biologically Plausible Training Method
Authors:
Mitsumasa Nakajima,
Katsuma Inoue,
Kenji Tanaka,
Yasuo Kuniyoshi,
Toshikazu Hashimoto,
Kohei Nakajima
Abstract:
The ever-growing demand for further advances in artificial intelligence motivated research on unconventional computation based on analog physical devices. While such computation devices mimic brain-inspired analog information processing, learning procedures still relies on methods optimized for digital processing such as backpropagation. Here, we present physical deep learning by extending a biolo…
▽ More
The ever-growing demand for further advances in artificial intelligence motivated research on unconventional computation based on analog physical devices. While such computation devices mimic brain-inspired analog information processing, learning procedures still relies on methods optimized for digital processing such as backpropagation. Here, we present physical deep learning by extending a biologically plausible training algorithm called direct feedback alignment. As the proposed method is based on random projection with arbitrary nonlinear activation, we can train a physical neural network without knowledge about the physical system. In addition, we can emulate and accelerate the computation for this training on a simple and scalable physical system. We demonstrate the proof-of-concept using a hierarchically connected optoelectronic recurrent neural network called deep reservoir computer. By constructing an FPGA-assisted optoelectronic benchtop, we confirmed the potential for accelerated computation with competitive performance on benchmarks. Our results provide practical solutions for the training and acceleration of neuromorphic computation.
△ Less
Submitted 1 April, 2022;
originally announced April 2022.
-
Learning First-Order Rules with Differentiable Logic Program Semantics
Authors:
Kun Gao,
Katsumi Inoue,
Yongzhi Cao,
Hanpin Wang
Abstract:
Learning first-order logic programs (LPs) from relational facts which yields intuitive insights into the data is a challenging topic in neuro-symbolic research. We introduce a novel differentiable inductive logic programming (ILP) model, called differentiable first-order rule learner (DFOL), which finds the correct LPs from relational facts by searching for the interpretable matrix representations…
▽ More
Learning first-order logic programs (LPs) from relational facts which yields intuitive insights into the data is a challenging topic in neuro-symbolic research. We introduce a novel differentiable inductive logic programming (ILP) model, called differentiable first-order rule learner (DFOL), which finds the correct LPs from relational facts by searching for the interpretable matrix representations of LPs. These interpretable matrices are deemed as trainable tensors in neural networks (NNs). The NNs are devised according to the differentiable semantics of LPs. Specifically, we first adopt a novel propositionalization method that transfers facts to NN-readable vector pairs representing interpretation pairs. We replace the immediate consequence operator with NN constraint functions consisting of algebraic operations and a sigmoid-like activation function. We map the symbolic forward-chained format of LPs into NN constraint functions consisting of operations between subsymbolic vector representations of atoms. By applying gradient descent, the trained well parameters of NNs can be decoded into precise symbolic LPs in forward-chained logic format. We demonstrate that DFOL can perform on several standard ILP datasets, knowledge bases, and probabilistic relation facts and outperform several well-known differentiable ILP models. Experimental results indicate that DFOL is a precise, robust, scalable, and computationally cheap differentiable ILP model.
△ Less
Submitted 28 April, 2022;
originally announced April 2022.
-
Generating Explainable Rule Sets from Tree-Ensemble Learning Methods by Answer Set Programming
Authors:
Akihiro Takemura,
Katsumi Inoue
Abstract:
We propose a method for generating explainable rule sets from tree-ensemble learners using Answer Set Programming (ASP). To this end, we adopt a decompositional approach where the split structures of the base decision trees are exploited in the construction of rules, which in turn are assessed using pattern mining methods encoded in ASP to extract interesting rules. We show how user-defined constr…
▽ More
We propose a method for generating explainable rule sets from tree-ensemble learners using Answer Set Programming (ASP). To this end, we adopt a decompositional approach where the split structures of the base decision trees are exploited in the construction of rules, which in turn are assessed using pattern mining methods encoded in ASP to extract interesting rules. We show how user-defined constraints and preferences can be represented declaratively in ASP to allow for transparent and flexible rule set generation, and how rules can be used as explanations to help the user better understand the models. Experimental evaluation with real-world datasets and popular tree-ensemble algorithms demonstrates that our approach is applicable to a wide range of classification tasks.
△ Less
Submitted 16 September, 2021;
originally announced September 2021.
-
Case-based Similar Image Retrieval for Weakly Annotated Large Histopathological Images of Malignant Lymphoma Using Deep Metric Learning
Authors:
Noriaki Hashimoto,
Yusuke Takagi,
Hiroki Masuda,
Hiroaki Miyoshi,
Kei Kohno,
Miharu Nagaishi,
Kensaku Sato,
Mai Takeuchi,
Takuya Furuta,
Keisuke Kawamoto,
Kyohei Yamada,
Mayuko Moritsubo,
Kanako Inoue,
Yasumasa Shimasaki,
Yusuke Ogura,
Teppei Imamoto,
Tatsuzo Mishina,
Ken Tanaka,
Yoshino Kawaguchi,
Shigeo Nakamura,
Koichi Ohshima,
Hidekata Hontani,
Ichiro Takeuchi
Abstract:
In the present study, we propose a novel case-based similar image retrieval (SIR) method for hematoxylin and eosin (H&E)-stained histopathological images of malignant lymphoma. When a whole slide image (WSI) is used as an input query, it is desirable to be able to retrieve similar cases by focusing on image patches in pathologically important regions such as tumor cells. To address this problem, w…
▽ More
In the present study, we propose a novel case-based similar image retrieval (SIR) method for hematoxylin and eosin (H&E)-stained histopathological images of malignant lymphoma. When a whole slide image (WSI) is used as an input query, it is desirable to be able to retrieve similar cases by focusing on image patches in pathologically important regions such as tumor cells. To address this problem, we employ attention-based multiple instance learning, which enables us to focus on tumor-specific regions when the similarity between cases is computed. Moreover, we employ contrastive distance metric learning to incorporate immunohistochemical (IHC) staining patterns as useful supervised information for defining appropriate similarity between heterogeneous malignant lymphoma cases. In the experiment with 249 malignant lymphoma patients, we confirmed that the proposed method exhibited higher evaluation measures than the baseline case-based SIR methods. Furthermore, the subjective evaluation by pathologists revealed that our similarity measure using IHC staining patterns is appropriate for representing the similarity of H&E-stained tissue images for malignant lymphoma.
△ Less
Submitted 27 January, 2023; v1 submitted 8 July, 2021;
originally announced July 2021.
-
Word-level Sign Language Recognition with Multi-stream Neural Networks Focusing on Local Regions
Authors:
Mizuki Maruyama,
Shuvozit Ghose,
Katsufumi Inoue,
Partha Pratim Roy,
Masakazu Iwamura,
Michifumi Yoshioka
Abstract:
In recent years, Word-level Sign Language Recognition (WSLR) research has gained popularity in the computer vision community, and thus various approaches have been proposed. Among these approaches, the method using I3D network achieves the highest recognition accuracy on large public datasets for WSLR. However, the method with I3D only utilizes appearance information of the upper body of the signe…
▽ More
In recent years, Word-level Sign Language Recognition (WSLR) research has gained popularity in the computer vision community, and thus various approaches have been proposed. Among these approaches, the method using I3D network achieves the highest recognition accuracy on large public datasets for WSLR. However, the method with I3D only utilizes appearance information of the upper body of the signers to recognize sign language words. On the other hand, in WSLR, the information of local regions, such as the hand shape and facial expression, and the positional relationship among the body and both hands are important. Thus in this work, we utilized local region images of both hands and face, along with skeletal information to capture local information and the positions of both hands relative to the body, respectively. In other words, we propose a novel multi-stream WSLR framework, in which a stream with local region images and a stream with skeletal information are introduced by extending I3D network to improve the recognition accuracy of WSLR. From the experimental results on WLASL dataset, it is evident that the proposed method has achieved about 15% improvement in the Top-1 accuracy than the existing conventional methods.
△ Less
Submitted 30 June, 2021;
originally announced June 2021.
-
Influences on Drivers' Understandings of Systems by Presenting Image Recognition Results
Authors:
Bo Yang,
Koichiro Inoue,
Satoshi Kitazaki,
Kimihiko Nakano
Abstract:
It is essential to help drivers have appropriate understandings of level 2 automated driving systems for keeping driving safety. A human machine interface (HMI) was proposed to present real time results of image recognition by the automated driving systems to drivers. It was expected that drivers could better understand the capabilities of the systems by observing the proposed HMI. Driving simulat…
▽ More
It is essential to help drivers have appropriate understandings of level 2 automated driving systems for keeping driving safety. A human machine interface (HMI) was proposed to present real time results of image recognition by the automated driving systems to drivers. It was expected that drivers could better understand the capabilities of the systems by observing the proposed HMI. Driving simulator experiments with 18 participants were preformed to evaluate the effectiveness of the proposed system. Experimental results indicated that the proposed HMI could effectively inform drivers of potential risks continuously and help drivers better understand the level 2 automated driving systems.
△ Less
Submitted 24 June, 2021;
originally announced June 2021.
-
Transient Chaos in BERT
Authors:
Katsuma Inoue,
Soh Ohara,
Yasuo Kuniyoshi,
Kohei Nakajima
Abstract:
Language is an outcome of our complex and dynamic human-interactions and the technique of natural language processing (NLP) is hence built on human linguistic activities. Bidirectional Encoder Representations from Transformers (BERT) has recently gained its popularity by establishing the state-of-the-art scores in several NLP benchmarks. A Lite BERT (ALBERT) is literally characterized as a lightwe…
▽ More
Language is an outcome of our complex and dynamic human-interactions and the technique of natural language processing (NLP) is hence built on human linguistic activities. Bidirectional Encoder Representations from Transformers (BERT) has recently gained its popularity by establishing the state-of-the-art scores in several NLP benchmarks. A Lite BERT (ALBERT) is literally characterized as a lightweight version of BERT, in which the number of BERT parameters is reduced by repeatedly applying the same neural network called Transformer's encoder layer. By pre-training the parameters with a massive amount of natural language data, ALBERT can convert input sentences into versatile high-dimensional vectors potentially capable of solving multiple NLP tasks. In that sense, ALBERT can be regarded as a well-designed high-dimensional dynamical system whose operator is the Transformer's encoder, and essential structures of human language are thus expected to be encapsulated in its dynamics. In this study, we investigated the embedded properties of ALBERT to reveal how NLP tasks are effectively solved by exploiting its dynamics. We thereby aimed to explore the nature of human language from the dynamical expressions of the NLP model. Our short-term analysis clarified that the pre-trained model stably yields trajectories with higher dimensionality, which would enhance the expressive capacity required for NLP tasks. Also, our long-term analysis revealed that ALBERT intrinsically shows transient chaos, a typical nonlinear phenomenon showing chaotic dynamics only in its transient, and the pre-trained ALBERT model tends to produce the chaotic trajectory for a significantly longer time period compared to a randomly-initialized one. Our results imply that local chaoticity would contribute to improving NLP performance, uncovering a novel aspect in the role of chaotic dynamics in human language behaviors.
△ Less
Submitted 8 June, 2021; v1 submitted 6 June, 2021;
originally announced June 2021.
-
Intelligent Conversational Android ERICA Applied to Attentive Listening and Job Interview
Authors:
Tatsuya Kawahara,
Koji Inoue,
Divesh Lala
Abstract:
Following the success of spoken dialogue systems (SDS) in smartphone assistants and smart speakers, a number of communicative robots are developed and commercialized. Compared with the conventional SDSs designed as a human-machine interface, interaction with robots is expected to be in a closer manner to talking to a human because of the anthropomorphism and physical presence. The goal or task of…
▽ More
Following the success of spoken dialogue systems (SDS) in smartphone assistants and smart speakers, a number of communicative robots are developed and commercialized. Compared with the conventional SDSs designed as a human-machine interface, interaction with robots is expected to be in a closer manner to talking to a human because of the anthropomorphism and physical presence. The goal or task of dialogue may not be information retrieval, but the conversation itself. In order to realize human-level "long and deep" conversation, we have developed an intelligent conversational android ERICA. We set up several social interaction tasks for ERICA, including attentive listening, job interview, and speed dating. To allow for spontaneous, incremental multiple utterances, a robust turn-taking model is implemented based on TRP (transition-relevance place) prediction, and a variety of backchannels are generated based on time frame-wise prediction instead of IPU-based prediction. We have realized an open-domain attentive listening system with partial repeats and elaborating questions on focus words as well as assessment responses. It has been evaluated with 40 senior people, engaged in conversation of 5-7 minutes without a conversation breakdown. It was also compared against the WOZ setting. We have also realized a job interview system with a set of base questions followed by dynamic generation of elaborating questions. It has also been evaluated with student subjects, showing promising results.
△ Less
Submitted 2 May, 2021;
originally announced May 2021.
-
Enhancing Linear Algebraic Computation of Logic Programs Using Sparse Representation
Authors:
Tuan Nguyen Quoc,
Katsumi Inoue,
Chiaki Sakama
Abstract:
Algebraic characterization of logic programs has received increasing attention in recent years. Researchers attempt to exploit connections between linear algebraic computation and symbolic computation in order to perform logical inference in large scale knowledge bases. This paper proposes further improvement by using sparse matrices to embed logic programs in vector spaces. We show its great powe…
▽ More
Algebraic characterization of logic programs has received increasing attention in recent years. Researchers attempt to exploit connections between linear algebraic computation and symbolic computation in order to perform logical inference in large scale knowledge bases. This paper proposes further improvement by using sparse matrices to embed logic programs in vector spaces. We show its great power of computation in reaching the fixpoint of the immediate consequence operator from the initial vector. In particular, performance for computing the least models of definite programs is dramatically improved in this way. We also apply the method to the computation of stable models of normal programs, in which the guesses are associated with initial matrices, and verify its effect when there are small numbers of negation. These results show good enhancement in terms of performance for computing consequences of programs and depict the potential power of tensorized logic programs.
△ Less
Submitted 21 September, 2020;
originally announced September 2020.
-
Code Clone Matching: A Practical and Effective Approach to Find Code Snippets
Authors:
Katsuro Inoue,
Yuya Miyamoto,
Daniel M. German,
Takashi Ishio
Abstract:
Finding the same or similar code snippets in source code is one of fundamental activities in software maintenance. Text-based pattern matching tools such as grep is frequently used for such purpose, but making proper queries for the expected result is not easy. Code clone detectors could be used but their features and result are generally excessive. In this paper, we propose Code Clone matching (C…
▽ More
Finding the same or similar code snippets in source code is one of fundamental activities in software maintenance. Text-based pattern matching tools such as grep is frequently used for such purpose, but making proper queries for the expected result is not easy. Code clone detectors could be used but their features and result are generally excessive. In this paper, we propose Code Clone matching (CC matching for short) that employs a combination of token-based clone detection and meta-patterns enhanced with meta-tokens. The user simply gives a query code snippet possibly with a few meta-tokens and then gets the resulting snippets, forming type 1, 2, or 3 code clone pairs between the query and result. By using a code snippet with meta-tokens as the query, the resulting matches are well controlled by the users. CC matching has been implemented as a practical and efficient tool named ccgrep, with grep-like user interface. The evaluation shows that ccgrep~ is a very effective to find various kinds of code snippets.
△ Less
Submitted 12 March, 2020;
originally announced March 2020.
-
Designing spontaneous behavioral switching via chaotic itinerancy
Authors:
Katsuma Inoue,
Kohei Nakajima,
Yasuo Kuniyoshi
Abstract:
Chaotic itinerancy is a frequently observed phenomenon in high-dimensional and nonlinear dynamical systems, and it is characterized by the random transitions among multiple quasi-attractors. Several studies have revealed that chaotic itinerancy has been observed in brain activity, and it is considered to play a critical role in the spontaneous, stable behavior generation of animals. Thus, chaotic…
▽ More
Chaotic itinerancy is a frequently observed phenomenon in high-dimensional and nonlinear dynamical systems, and it is characterized by the random transitions among multiple quasi-attractors. Several studies have revealed that chaotic itinerancy has been observed in brain activity, and it is considered to play a critical role in the spontaneous, stable behavior generation of animals. Thus, chaotic itinerancy is a topic of great interest, particularly for neurorobotics researchers who wish to understand and implement autonomous behavioral controls for agents. However, it is generally difficult to gain control over high-dimensional nonlinear dynamical systems. Hence, the implementation of chaotic itinerancy has mainly been accomplished heuristically. In this study, we propose a novel way of implementing chaotic itinerancy reproducibly and at will in a generic high-dimensional chaotic system. In particular, we demonstrate that our method enables us to easily design both the trajectories of quasi-attractors and the transition rules among them simply by adjusting the limited number of system parameters and by utilizing the intrinsic high-dimensional chaos. Finally, we quantitatively discuss the validity and scope of application through the results of several numerical experiments.
△ Less
Submitted 13 February, 2020;
originally announced February 2020.
-
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit
Authors:
Tomoki Hayashi,
Ryuichi Yamamoto,
Katsuki Inoue,
Takenori Yoshimura,
Shinji Watanabe,
Tomoki Toda,
Kazuya Takeda,
Yu Zhang,
Xu Tan
Abstract:
This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit named ESPnet-TTS, which is an extension of the open-source speech processing toolkit ESPnet. The toolkit supports state-of-the-art E2E-TTS models, including Tacotron~2, Transformer TTS, and FastSpeech, and also provides recipes inspired by the Kaldi automatic speech recognition (ASR) toolkit. The recipes are based on the desig…
▽ More
This paper introduces a new end-to-end text-to-speech (E2E-TTS) toolkit named ESPnet-TTS, which is an extension of the open-source speech processing toolkit ESPnet. The toolkit supports state-of-the-art E2E-TTS models, including Tacotron~2, Transformer TTS, and FastSpeech, and also provides recipes inspired by the Kaldi automatic speech recognition (ASR) toolkit. The recipes are based on the design unified with the ESPnet ASR recipe, providing high reproducibility. The toolkit also provides pre-trained models and samples of all of the recipes so that users can use it as a baseline. Furthermore, the unified design enables the integration of ASR functions with TTS, e.g., ASR-based objective evaluation and semi-supervised learning with both ASR and TTS models. This paper describes the design of the toolkit and experimental evaluation in comparison with other toolkits. The experimental results show that our models can achieve state-of-the-art performance comparable to the other latest toolkits, resulting in a mean opinion score (MOS) of 4.25 on the LJSpeech dataset. The toolkit is publicly available at https://github.com/espnet/espnet.
△ Less
Submitted 16 February, 2020; v1 submitted 24 October, 2019;
originally announced October 2019.
-
Active learning for level set estimation under cost-dependent input uncertainty
Authors:
Yu Inatsu,
Masayuki Karasuyama,
Keiichi Inoue,
Ichiro Takeuchi
Abstract:
As part of a quality control process in manufacturing it is often necessary to test whether all parts of a product satisfy a required property, with as few inspections as possible. When multiple inspection apparatuses with different costs and precision exist, it is desirable that testing can be carried out cost-effectively by properly controlling the trade-off between the costs and the precision.…
▽ More
As part of a quality control process in manufacturing it is often necessary to test whether all parts of a product satisfy a required property, with as few inspections as possible. When multiple inspection apparatuses with different costs and precision exist, it is desirable that testing can be carried out cost-effectively by properly controlling the trade-off between the costs and the precision. In this paper, we formulate this as a level set estimation (LSE) problem under cost-dependent input uncertainty - LSE being a type of active learning for estimating the level set, i.e., the subset of the input space in which an unknown function value is greater or smaller than a pre-determined threshold. Then, we propose a new algorithm for LSE under cost-dependent input uncertainty with theoretical convergence guarantee. We demonstrate the effectiveness of the proposed algorithm by applying it to synthetic and real datasets.
△ Less
Submitted 13 September, 2019;
originally announced September 2019.
-
Partial Evaluation of Logic Programs in Vector Spaces
Authors:
Chiaki Sakama,
Hien D. Nguyen,
Taisuke Sato,
Katsumi Inoue
Abstract:
In this paper, we introduce methods of encoding propositional logic programs in vector spaces. Interpretations are represented by vectors and programs are represented by matrices. The least model of a definite program is computed by multiplying an interpretation vector and a program matrix. To optimize computation in vector spaces, we provide a method of partial evaluation of programs using linear…
▽ More
In this paper, we introduce methods of encoding propositional logic programs in vector spaces. Interpretations are represented by vectors and programs are represented by matrices. The least model of a definite program is computed by multiplying an interpretation vector and a program matrix. To optimize computation in vector spaces, we provide a method of partial evaluation of programs using linear algebra. Partial evaluation is done by unfolding rules in a program, and it is realized in a vector space by multiplying program matrices. We perform experiments using randomly generated programs and show that partial evaluation has potential for realizing efficient computation in huge scale of programs.
△ Less
Submitted 28 November, 2018;
originally announced November 2018.
-
Understanding Fake Faces
Authors:
Ryota Natsume,
Kazuki Inoue,
Yoshihiro Fukuhara,
Shintaro Yamamoto,
Shigeo Morishima,
Hirokatsu Kataoka
Abstract:
Face recognition research is one of the most active topics in computer vision (CV), and deep neural networks (DNN) are now filling the gap between human-level and computer-driven performance levels in face verification algorithms. However, although the performance gap appears to be narrowing in terms of accuracy-based expectations, a curious question has arisen; specifically, "Face understanding o…
▽ More
Face recognition research is one of the most active topics in computer vision (CV), and deep neural networks (DNN) are now filling the gap between human-level and computer-driven performance levels in face verification algorithms. However, although the performance gap appears to be narrowing in terms of accuracy-based expectations, a curious question has arisen; specifically, "Face understanding of AI is really close to that of human?" In the present study, in an effort to confirm the brain-driven concept, we conduct image-based detection, classification, and generation using an in-house created fake face database. This database has two configurations: (i) false positive face detections produced using both the Viola Jones (VJ) method and convolutional neural networks (CNN), and (ii) simulacra that have fundamental characteristics that resemble faces but are completely artificial. The results show a level of suggestive knowledge that indicates the continuing existence of a gap between the capabilities of recent vision-based face recognition algorithms and human-level performance. On a positive note, however, we have obtained knowledge that will advance the progress of face-understanding models.
△ Less
Submitted 22 September, 2018;
originally announced September 2018.
-
A 1.2-V 162.9-pJ/cycle Bitmap Index Creation Core with 0.31-pW/bit Standby Power on 65-nm SOTB
Authors:
Xuan-Thuan Nguyen,
Trong-Thuc Hoang,
Hong-Thu Nguyen,
Katsumi Inoue,
Cong-Kha Pham
Abstract:
The ability to maximize the performance during peak workload hours and minimize the power consumption during off-peak time plays a significant role in the energy-efficient systems. Our previous work has proposed a high-performance multi-core bitmap index creator (BIC) in a field-programmable gate array that could deliver higher indexing throughput than central processing units and graphics process…
▽ More
The ability to maximize the performance during peak workload hours and minimize the power consumption during off-peak time plays a significant role in the energy-efficient systems. Our previous work has proposed a high-performance multi-core bitmap index creator (BIC) in a field-programmable gate array that could deliver higher indexing throughput than central processing units and graphics processing units. This brief extends the previous study by focusing on the application-specific integrated circuit implementation of the proposed BIC in a 65-nm silicon-on-thin-buried-oxide (SOTB) CMOS process. The BIC chip can operate with different supply voltage from 0.4 V to 1.2 V. In the active mode with the supply voltage of 1.2 V, the BIC chip is fully operational at 41 MHz and consumes 162.9 pJ/cycle. In the standby mode with the supply voltage of 0.4 V and clock-gating technique, the power consumption was reduced to 10.6 uW. The standby power is also dramatically reduced to 2.64 nW due to the utilization of reverse back-gate biasing technique. This achievement is considerable importance to the energy-efficient systems.
△ Less
Submitted 18 June, 2018;
originally announced June 2018.
-
Exploiting Answer Set Programming with External Sources for Meta-Interpretive Learning
Authors:
Tobias Kaminski,
Thomas Eiter,
Katsumi Inoue
Abstract:
Meta-Interpretive Learning (MIL) learns logic programs from examples by instantiating meta-rules, which is implemented by the Metagol system based on Prolog. Viewing MIL-problems as combinatorial search problems, they can alternatively be solved by employing Answer Set Programming (ASP), which may result in performance gains as a result of efficient conflict propagation. However, a straightforward…
▽ More
Meta-Interpretive Learning (MIL) learns logic programs from examples by instantiating meta-rules, which is implemented by the Metagol system based on Prolog. Viewing MIL-problems as combinatorial search problems, they can alternatively be solved by employing Answer Set Programming (ASP), which may result in performance gains as a result of efficient conflict propagation. However, a straightforward ASP-encoding of MIL results in a huge search space due to a lack of procedural bias and the need for grounding. To address these challenging issues, we encode MIL in the HEX-formalism, which is an extension of ASP that allows us to outsource the background knowledge, and we restrict the search space to compensate for a procedural bias in ASP. This way, the import of constants from the background knowledge can for a given type of meta-rules be limited to relevant ones. Moreover, by abstracting from term manipulations in the encoding and by exploiting the HEX interface mechanism, the import of such constants can be entirely avoided in order to mitigate the grounding bottleneck. An experimental evaluation shows promising results.
△ Less
Submitted 30 April, 2018;
originally announced May 2018.
-
An Efficient I/O Architecture for RAM-based Content-Addressable Memory on FPGA
Authors:
Xuan-Thuan Nguyen,
Trong-Thuc Hoang,
Hong-Thu Nguyen,
Katsumi Inoue,
Cong-Kha Pham
Abstract:
Despite the impressive search rate of one key per clock cycle, the update stage of a random-access-memory-based content-addressable-memory (RAM-based CAM) always suffers high latency. Two primary causes of such latency include: (1) the compulsory erasing stage along with the writing stage and (2) the major difference in data width between the RAM-based CAM (e.g., 8-bit width) and the modern system…
▽ More
Despite the impressive search rate of one key per clock cycle, the update stage of a random-access-memory-based content-addressable-memory (RAM-based CAM) always suffers high latency. Two primary causes of such latency include: (1) the compulsory erasing stage along with the writing stage and (2) the major difference in data width between the RAM-based CAM (e.g., 8-bit width) and the modern systems (e.g., 256-bit width). This brief, therefore, aims for an efficient input/output (I/O) architecture of RAM-based binary CAM (RCAM) for low-latency update. To achieve this goal, three techniques, namely centralized erase RAM, bit-sliced, and hierarchical-partitioning, are proposed to eliminate the latency of erasing stage, as well as to allow RCAM to exploit the bandwidth of modern systems effectively. Several RCAMs, whose data width ranges from 8 bits to 64 bits, were integrated into a 256-bit system for the evaluation. The experimental results in an Intel Arria V 5ASTFD5 FPGA prove that at 100 MHz, the proposed designs achieve at least 9.6 times higher I/O efficiency as compared to the traditional RCAM.
△ Less
Submitted 26 June, 2018; v1 submitted 27 March, 2018;
originally announced April 2018.
-
An FPGA-Based Hardware Accelerator for Energy-Efficient Bitmap Index Creation
Authors:
Xuan-Thuan Nguyen,
Trong-Thuc Hoang,
Hong-Thu Nguyen,
Katsumi Inoue,
Cong-Kha Pham
Abstract:
Bitmap index is recognized as a promising candidate for online analytics processing systems, because it effectively supports not only parallel processing but also complex and multi-dimensional queries. However, bitmap index creation is a time-consuming task. In this study, by taking full advantage of massive parallel computing of field-programmable gate array (FPGA), two hardware accelerators of b…
▽ More
Bitmap index is recognized as a promising candidate for online analytics processing systems, because it effectively supports not only parallel processing but also complex and multi-dimensional queries. However, bitmap index creation is a time-consuming task. In this study, by taking full advantage of massive parallel computing of field-programmable gate array (FPGA), two hardware accelerators of bitmap index creation, namely BIC64K8 and BIC32K16, are originally proposed. Each of the accelerator contains two primary components, namely an enhanced content-addressable memory and a query logic array module, which allow BIC64K8 and BIC32K16 to index 65,536 8-bit words and 32,768 16-bit words in parallel, at every clock cycle. The experimental results on an Intel Arria V 5ASTFD5 FPGA prove that at 100 MHz, BIC64K8 and BIC32K16 achieve the approximate indexing throughput of 1.43 GB/s and 1.46 GB/s, respectively. The throughputs are also proven to be stable, regardless the size of the data sets. More significantly, BIC32K16 only consumes as low as 6.76% and 3.28% of energy compared to the central-processing-unit- and graphic-processing-unit-based designs, respectively.
△ Less
Submitted 20 April, 2018; v1 submitted 13 March, 2018;
originally announced March 2018.
-
Detection of social signals for recognizing engagement in human-robot interaction
Authors:
Divesh Lala,
Koji Inoue,
Pierrick Milhorat,
Tatsuya Kawahara
Abstract:
Detection of engagement during a conversation is an important function of human-robot interaction. The level of user engagement can influence the dialogue strategy of the robot. Our motivation in this work is to detect several behaviors which will be used as social signal inputs for a real-time engagement recognition model. These behaviors are nodding, laughter, verbal backchannels and eye gaze. W…
▽ More
Detection of engagement during a conversation is an important function of human-robot interaction. The level of user engagement can influence the dialogue strategy of the robot. Our motivation in this work is to detect several behaviors which will be used as social signal inputs for a real-time engagement recognition model. These behaviors are nodding, laughter, verbal backchannels and eye gaze. We describe models of these behaviors which have been learned from a large corpus of human-robot interactions with the android robot ERICA. Input data to the models comes from a Kinect sensor and a microphone array. Using our engagement recognition model, we can achieve reasonable performance using the inputs from automatic social signal detection, compared to using manual annotation as input.
△ Less
Submitted 29 September, 2017;
originally announced September 2017.
-
An Empirical Study on the Impact of Refactoring Activities on Evolving Client-Used APIs
Authors:
Raula Gaikovina Kula,
Ali Ouni,
Daniel M. German,
Katsuro Inoue
Abstract:
Context: Refactoring is recognized as an effective practice to maintain evolving software systems. For software libraries, we study how library developers refactor their Application Programming Interfaces (APIs), especially when it impacts client users by breaking an API of the library. Objective: Our work aims to understand how clients that use a library API are affected by refactoring activities…
▽ More
Context: Refactoring is recognized as an effective practice to maintain evolving software systems. For software libraries, we study how library developers refactor their Application Programming Interfaces (APIs), especially when it impacts client users by breaking an API of the library. Objective: Our work aims to understand how clients that use a library API are affected by refactoring activities. We target popular libraries that potentially impact more library client users. Method: We distinguish between library APIs based on their client-usage (refereed to as client-used APIs) in order to understand the extent to which API breakages relate to refactorings. Our tool-based approach allows for a large-scale study across eight libraries (i.e., totaling 183 consecutive versions) with around 900 clients projects. Results: We find that library maintainers are less likely to break client-used API classes. Quantitatively, we find that refactoring activities break less than 37% of all client-used APIs. In a more qualitative analysis, we show two documented cases of where non-refactoring API breaking changes are motivated other maintenance issues (i.e., bug fix and new features) and involve more complex refactoring operations. Conclusion: Using our automated approach, we find that library developers are less likely to break APIs and tend to break client-used APIs when addressing these maintenance issues.
△ Less
Submitted 28 September, 2017; v1 submitted 27 September, 2017;
originally announced September 2017.
-
On the Impact of Micro-Packages: An Empirical Study of the npm JavaScript Ecosystem
Authors:
Raula Gaikovina Kula,
Ali Ouni,
Daniel M. German,
Katsuro Inoue
Abstract:
The rise of user-contributed Open Source Software (OSS) ecosystems demonstrate their prevalence in the software engineering discipline. Libraries work together by depending on each other across the ecosystem. From these ecosystems emerges a minimized library called a micro-package. Micro- packages become problematic when breaks in a critical ecosystem dependency ripples its effects to unsuspecting…
▽ More
The rise of user-contributed Open Source Software (OSS) ecosystems demonstrate their prevalence in the software engineering discipline. Libraries work together by depending on each other across the ecosystem. From these ecosystems emerges a minimized library called a micro-package. Micro- packages become problematic when breaks in a critical ecosystem dependency ripples its effects to unsuspecting users. In this paper, we investigate the impact of micro-packages in the npm JavaScript ecosystem. Specifically, we conducted an empirical in- vestigation with 169,964 JavaScript npm packages to understand (i) the widespread phenomena of micro-packages, (ii) the size dependencies inherited by a micro-package and (iii) the developer usage cost (ie., fetch, install, load times) of using a micro-package. Results of the study find that micro-packages form a significant portion of the npm ecosystem. Apart from the ease of readability and comprehension, we show that some micro-packages have long dependency chains and incur just as much usage costs as other npm packages. We envision that this work motivates the need for developers to be aware of how sensitive their third-party dependencies are to critical changes in the software ecosystem.
△ Less
Submitted 14 September, 2017;
originally announced September 2017.
-
Modeling Library Dependencies and Updates in Large Software Repository Universes
Authors:
Raula Gaikovina Kula,
Coen De Roover,
Daniel M. German,
Takashi Ishio,
Katsuro Inoue
Abstract:
Popular (re)use of third-party open-source software (OSS) is evidence of the impact of hosting repositories like maven on software development today. Updating libraries is crucial, with recent studies highlighting the associated vulnerabilities with aging OSS libraries. The decision to migrate to a newer library can range from trivial (security threat) to complex (assessment of work required to ac…
▽ More
Popular (re)use of third-party open-source software (OSS) is evidence of the impact of hosting repositories like maven on software development today. Updating libraries is crucial, with recent studies highlighting the associated vulnerabilities with aging OSS libraries. The decision to migrate to a newer library can range from trivial (security threat) to complex (assessment of work required to accommodate the changes). By leveraging the `wisdom of the software repository crowd' we propose a simple and efficient approach to recommending `consented' library updates. Our Software Universe Graph (SUG) models library dependency and update information mined from super repositories to provide different metrics and visualizations that aid in the update decision. To evaluate, we first constructed a SUG from 188,951 nodes of 6,374 maven unique artifacts. Then, we demonstrate how our metrics and visualizations are applied through real-world examples. As an extension, we show how the SUG can compare dependencies between different super repositories. From a sample of 100 GitHub applications, our method found that on average 79% similar overlapping dependencies combinations exist between the maven and github super repository universes.
△ Less
Submitted 14 September, 2017;
originally announced September 2017.