-
Parameterized Quantum Query Algorithms for Graph Problems
Authors:
Tatsuya Terao,
Ryuhei Mori
Abstract:
In this paper, we consider the parameterized quantum query complexity for graph problems. We design parameterized quantum query algorithms for $k$-vertex cover and $k$-matching problems, and present lower bounds on the parameterized quantum query complexity. Then, we show that our quantum query algorithms are optimal up to a constant factor when the parameters are small.
Submitted 7 August, 2024;
originally announced August 2024.
-
Open-Source Conversational AI with SpeechBrain 1.0
Authors:
Mirco Ravanelli,
Titouan Parcollet,
Adel Moumen,
Sylvain de Langen,
Cem Subakan,
Peter Plantinga,
Yingzhi Wang,
Pooneh Mousavi,
Luca Della Libera,
Artem Ploujnikov,
Francesco Paissan,
Davide Borra,
Salah Zaiem,
Zeyu Zhao,
Shucong Zhang,
Georgios Karakasidis,
Sung-Lin Yeh,
Pierre Champion,
Aku Rouhe,
Rudolf Braun,
Florian Mai,
Juan Zuluaga-Gomez,
Seyed Mahed Mousavi,
Andreas Nautsch,
Xuechen Liu,
et al. (7 additional authors not shown)
Abstract:
SpeechBrain is an open-source Conversational AI toolkit based on PyTorch, focused particularly on speech processing tasks such as speech recognition, speech enhancement, speaker recognition, text-to-speech, and much more. It promotes transparency and replicability by releasing both the pre-trained models and the complete "recipes" of code and algorithms required for training them. This paper presents SpeechBrain 1.0, a significant milestone in the evolution of the toolkit, which now has over 200 recipes for speech, audio, and language processing tasks, and more than 100 models available on Hugging Face. SpeechBrain 1.0 introduces new technologies to support diverse learning modalities, Large Language Model (LLM) integration, and advanced decoding strategies, along with novel models, tasks, and modalities. It also includes a new benchmark repository, offering researchers a unified platform for evaluating models across diverse tasks.
Submitted 18 July, 2024; v1 submitted 29 June, 2024;
originally announced July 2024.
-
Time-space dynamics of income segregation: a case study of Milan's neighbourhoods
Authors:
Lavinia Rossi Mori,
Vittorio Loreto,
Riccardo Di Clemente
Abstract:
Traditional approaches to urban income segregation focus on static residential patterns, often failing to capture the dynamic nature of social mixing at the neighborhood level. Leveraging high-resolution location-based data from mobile phones, we capture the interplay of three different income groups (high, medium, low) based on their daily routines. We propose a three-dimensional space to analyze social mixing, which is embedded in the temporal dynamics of urban activities. This framework offers a more detailed perspective on social interactions, closely linked to the geographical features of each neighborhood. While residential areas fail to encourage social mixing in the nighttime, the working hours foster inclusion, with the city center showing a heightened level of interaction. As evening sets in, leisure areas emerge as potential facilitators for social interactions, depending on urban features such as public transport and a variety of Points Of Interest. These characteristics significantly modulate the magnitude and type of social stratification involved in social mixing, also underscoring the significance of urban design in either bridging or widening socio-economic divides.
Submitted 28 February, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
Dual-Matrix Domain-Wall: A Novel Technique for Generating Permutations by QUBO and Ising Models with Quadratic Sizes
Authors:
Koji Nakano,
Shunsuke Tsukiyama,
Yasuaki Ito,
Takashi Yazane,
Junko Yano,
Takumi Kato,
Shiro Ozaki,
Rie Mori,
Ryota Katsuki
Abstract:
The Ising model is defined by an objective function using a quadratic formula of qubit variables. The problem of an Ising model aims to determine the qubit values of the variables that minimize the objective function, and many optimization problems can be reduced to this problem. In this paper, we focus on optimization problems related to permutations, where the goal is to find the optimal permutation out of the $n!$ possible permutations of $n$ elements. To represent these problems as Ising models, a commonly employed approach is to use a kernel that utilizes one-hot encoding to find any one of the $n!$ permutations as the optimal solution. However, this kernel contains a large number of quadratic terms and high absolute coefficient values. The main contribution of this paper is the introduction of a novel permutation encoding technique called dual-matrix domain-wall, which significantly reduces the number of quadratic terms and the maximum absolute coefficient values in the kernel. Surprisingly, our dual-matrix domain-wall encoding reduces the quadratic term count and maximum absolute coefficient values from $n^3-n^2$ and $2n-4$ to $6n^2-12n+4$ and $2$, respectively. We also demonstrate the applicability of our encoding technique to partial permutations and Quadratic Unconstrained Binary Optimization (QUBO) models. Furthermore, we discuss a family of permutation problems that can be efficiently implemented using Ising/QUBO models with our dual-matrix domain-wall encoding.
Submitted 1 November, 2023; v1 submitted 2 August, 2023;
originally announced August 2023.
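As a rough check of the $n^3-n^2$ quadratic-term count quoted in the abstract above, the following sketch builds the usual one-hot permutation penalty (every row and every column of an $n \times n$ binary matrix must sum to 1) and counts the distinct variable pairs it couples. The penalty form and the helper name `one_hot_quadratic_terms` are assumptions made for illustration; the dual-matrix domain-wall encoding itself is not reproduced here, and the last printed column simply evaluates the $6n^2-12n+4$ formula stated in the abstract.

```python
from itertools import combinations

def one_hot_quadratic_terms(n):
    """Return the variable pairs coupled by the (assumed) one-hot penalty
    sum_i (sum_j x[i][j] - 1)^2 + sum_j (sum_i x[i][j] - 1)^2."""
    pairs = set()
    for i in range(n):
        # Row constraint: every pair of cells in row i is coupled.
        for j1, j2 in combinations(range(n), 2):
            pairs.add(((i, j1), (i, j2)))
    for j in range(n):
        # Column constraint: every pair of cells in column j is coupled.
        for i1, i2 in combinations(range(n), 2):
            pairs.add(((i1, j), (i2, j)))
    return pairs

for n in range(2, 7):
    one_hot = len(one_hot_quadratic_terms(n))
    # Columns: n, counted one-hot terms, n^3 - n^2, formula quoted for the new encoding.
    print(n, one_hot, n**3 - n**2, 6 * n**2 - 12 * n + 4)
```

Squared single-variable terms are not counted because they reduce to linear terms for binary variables (and to constants for spin variables).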
-
Supervised Anomaly Detection Method Combining Generative Adversarial Networks and Three-Dimensional Data in Vehicle Inspections
Authors:
Yohei Baba,
Takuro Hoshi,
Ryosuke Mori,
Gaurang Gavai
Abstract:
The external visual inspections of rolling stock's underfloor equipment are currently performed by human inspectors. In this study, we attempt to partly automate visual inspection by investigating anomaly inspection algorithms that use image processing technology. As railroad maintenance studies tend to have little anomaly data, unsupervised learning methods are usually preferred for anomaly detection; however, training cost and accuracy are still challenges. Additionally, a researcher created anomalous images from normal images by adding noise, etc., but the anomaly targeted in this study is the rotation of piping cocks, which is difficult to create using noise. Therefore, in this study, we propose a new method that uses style conversion via generative adversarial networks on three-dimensional computer graphics and imitates anomaly images to apply anomaly detection based on supervised learning. A geometry-consistent style conversion model was used to convert the images, so that their color and texture successfully imitate real images while the anomalous shape is maintained. Using the generated anomaly images as supervised data, the anomaly detection model can be easily trained without complex adjustments and successfully detects anomalies.
Submitted 22 December, 2022;
originally announced December 2022.
-
Diverse Adaptive Bulk Search: a Framework for Solving QUBO Problems on Multiple GPUs
Authors:
Koji Nakano,
Daisuke Takafuji,
Yasuaki Ito,
Takashi Yazane,
Junko Yano,
Shiro Ozaki,
Ryota Katsuki,
Rie Mori
Abstract:
Quadratic Unconstrained Binary Optimization (QUBO) is a combinatorial optimization to find an optimal binary solution vector that minimizes the energy value defined by a quadratic formula of binary variables in the vector. As many NP-hard problems can be reduced to QUBO problems, considerable research has gone into developing QUBO solvers running on various computing platforms such as quantum devices, ASICs, FPGAs, GPUs, and optical fibers. This paper presents a framework called Diverse Adaptive Bulk Search (DABS), which has the potential to find optimal solutions of many types of QUBO problems. Our DABS solver employs a genetic algorithm-based search algorithm featuring three diverse strategies: multiple search algorithms, multiple genetic operations, and multiple solution pools. During the execution of the solver, search algorithms and genetic operations that succeeded in finding good solutions are automatically selected to obtain better solutions. Moreover, search algorithms traverse between different solution pools to find good solutions. We have implemented our DABS solver to run on multiple GPUs. Experimental evaluations using eight NVIDIA A100 GPUs confirm that our DABS solver succeeds in finding optimal or potentially optimal solutions for three types of QUBO problems.
Submitted 17 March, 2023; v1 submitted 6 July, 2022;
originally announced July 2022.
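To make the QUBO objective described in the abstract above concrete, here is a minimal sketch that evaluates the energy $E(x)=\sum_{i\le j} W_{ij} x_i x_j$ and finds a minimizer by exhaustive search. It is only feasible for tiny instances and is not the DABS solver; the upper-triangular matrix `W`, its values, and the function names are illustrative assumptions.

```python
from itertools import product

def qubo_energy(W, x):
    """Energy of binary vector x under an upper-triangular weight matrix W."""
    n = len(x)
    return sum(W[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))

def brute_force_qubo(W):
    """Try all 2^n binary vectors and return (best_energy, best_vector)."""
    n = len(W)
    return min((qubo_energy(W, x), x) for x in product((0, 1), repeat=n))

# Toy instance: diagonal entries reward setting a bit, off-diagonal entries
# penalize setting two neighbouring bits together.
W = [
    [-1, 2, 0],
    [0, -1, 2],
    [0, 0, -1],
]
print(brute_force_qubo(W))
```

Exhaustive search costs $2^n$ energy evaluations, which is why heuristic solvers are used for instances of practical size.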
-
Lower bounds on the error probability of multiple quantum channel discrimination by the Bures angle and the trace distance
Authors:
Ryo Ito,
Ryuhei Mori
Abstract:
Quantum channel discrimination is a fundamental problem in quantum information science. In this study, we consider general quantum channel discrimination problems, and derive the lower bounds of the error probability. Our lower bounds are based on the triangle inequalities of the Bures angle and the trace distance. As a consequence of the lower bound based on the Bures angle, we prove the optimality of Grover's search if the number of marked elements is fixed to some integer $\ell$. This result generalizes Zalka's result for $\ell=1$. We also present several numerical results in which our lower bounds based on the trace distance outperform recently obtained lower bounds.
Submitted 1 August, 2022; v1 submitted 8 July, 2021;
originally announced July 2021.
-
SpeechBrain: A General-Purpose Speech Toolkit
Authors:
Mirco Ravanelli,
Titouan Parcollet,
Peter Plantinga,
Aku Rouhe,
Samuele Cornell,
Loren Lugosch,
Cem Subakan,
Nauman Dawalatabad,
Abdelwahab Heba,
Jianyuan Zhong,
Ju-Chieh Chou,
Sung-Lin Yeh,
Szu-Wei Fu,
Chien-Feng Liao,
Elena Rastorgueva,
François Grondin,
William Aris,
Hwidong Na,
Yan Gao,
Renato De Mori,
Yoshua Bengio
Abstract:
SpeechBrain is an open-source and all-in-one speech toolkit. It is designed to facilitate the research and development of neural speech processing technologies by being simple, flexible, user-friendly, and well-documented. This paper describes the core architecture designed to support several tasks of common interest, allowing users to naturally conceive, compare and share novel speech processing pipelines. SpeechBrain achieves competitive or state-of-the-art performance in a wide range of speech benchmarks. It also provides training recipes, pretrained models, and inference scripts for popular speech datasets, as well as tutorials which allow anyone with basic Python proficiency to familiarize themselves with speech technologies.
Submitted 8 June, 2021;
originally announced June 2021.
-
Quantum speedups for dynamic programming on $n$-dimensional lattice graphs
Authors:
Adam Glos,
Martins Kokainis,
Ryuhei Mori,
Jevgēnijs Vihrovs
Abstract:
Motivated by the quantum speedup for dynamic programming on the Boolean hypercube by Ambainis et al. (2019), we investigate which graphs admit a similar quantum advantage. In this paper, we examine a generalization of the Boolean hypercube graph, the $n$-dimensional lattice graph $Q(D,n)$ with vertices in $\{0,1,\ldots,D\}^n$. We study the complexity of the following problem: given a subgraph $G$ of $Q(D,n)$ via query access to the edges, determine whether there is a path from $0^n$ to $D^n$. While the classical query complexity is $\widetilde{\Theta}((D+1)^n)$, we show a quantum algorithm with complexity $\widetilde O(T_D^n)$, where $T_D < D+1$. The first few values of $T_D$ are $T_1 \approx 1.817$, $T_2 \approx 2.660$, $T_3 \approx 3.529$, $T_4 \approx 4.421$, $T_5 \approx 5.332$. We also prove that $T_D \geq \frac{D+1}{\mathrm e}$, thus for general $D$, this algorithm does not provide, for example, a speedup, polynomial in the size of the lattice.
While the presented quantum algorithm is a natural generalization of the known quantum algorithm for $D=1$ by Ambainis et al., the analysis of complexity is rather complicated. For the precise analysis, we use the saddle-point method, which is a common tool in analytic combinatorics, but has not been widely used in this field.
We then show an implementation of this algorithm with time complexity $\text{poly}(n)^{\log n} T_D^n$, and apply it to the Set Multicover problem. In this problem, $m$ subsets of $[n]$ are given, and the task is to find the smallest number of these subsets that cover each element of $[n]$ at least $D$ times. While the time complexity of the best known classical algorithm is $O(m(D+1)^n)$, the time complexity of our quantum algorithm is $\text{poly}(m,n)^{\log n} T_D^n$.
Submitted 7 May, 2021; v1 submitted 29 April, 2021;
originally announced April 2021.
-
End2End Acoustic to Semantic Transduction
Authors:
Valentin Pelloin,
Nathalie Camelin,
Antoine Laurent,
Renato De Mori,
Antoine Caubrière,
Yannick Estève,
Sylvain Meignier
Abstract:
In this paper, we propose a novel end-to-end sequence-to-sequence spoken language understanding model using an attention mechanism. It reliably selects contextual acoustic features in order to hypothesize semantic contents. An initial architecture capable of extracting all pronounced words and concepts from acoustic spans is designed and tested. With a shallow fusion language model, this system reaches a 13.6 concept error rate (CER) and an 18.5 concept value error rate (CVER) on the French MEDIA corpus, achieving an absolute 2.8 points reduction compared to the state-of-the-art. Then, an original model is proposed for hypothesizing concepts and their values. This transduction reaches a 15.4 CER and a 21.6 CVER without any new type of context.
Submitted 1 February, 2021;
originally announced February 2021.
-
A Simple and Fast Algorithm for Computing the $N$-th Term of a Linearly Recurrent Sequence
Authors:
Alin Bostan,
Ryuhei Mori
Abstract:
We present a simple and fast algorithm for computing the $N$-th term of a given linearly recurrent sequence. Our new algorithm uses $O(\mathsf{M}(d) \log N)$ arithmetic operations, where $d$ is the order of the recurrence, and $\mathsf{M}(d)$ denotes the number of arithmetic operations for computing the product of two polynomials of degree $d$. The state-of-the-art algorithm, due to Charles Fiduccia (1985), has the same arithmetic complexity up to a constant factor. Our algorithm is simpler, faster and obtained by a totally different method. We also discuss several algorithmic applications, notably to polynomial modular exponentiation, powering of matrices and high-order lifting.
Submitted 20 August, 2020;
originally announced August 2020.
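The abstract above does not spell out the method, so the following is only a sketch of one known way to reach $O(\mathsf{M}(d) \log N)$ operations; that it matches the paper's algorithm is an assumption. It writes the sequence as the power-series coefficients of a rational function $P(x)/Q(x)$, multiplies numerator and denominator by $Q(-x)$ so that the denominator becomes a polynomial in $x^2$, keeps only the numerator coefficients whose parity matches $N$, and halves $N$. The function names are hypothetical, and the schoolbook `poly_mul` used here gives $\mathsf{M}(d)=O(d^2)$; a faster multiplication routine would be needed for the stated bound to be interesting.

```python
def poly_mul(a, b):
    """Schoolbook product of two coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def nth_coefficient(P, Q, N):
    """Return the coefficient of x^N in P(x)/Q(x), assuming Q[0] == 1."""
    while N > 0:
        # Q(-x): flip the sign of the odd-degree coefficients.
        Q_neg = [c if i % 2 == 0 else -c for i, c in enumerate(Q)]
        U = poly_mul(P, Q_neg)   # numerator after multiplying by Q(-x)
        V = poly_mul(Q, Q_neg)   # even polynomial, V(x) = W(x^2)
        P = U[N % 2::2]          # keep the parity class that contains x^N
        Q = V[0::2]              # W, the denominator in the variable x^2
        N //= 2
    return P[0] if P else 0

# Fibonacci numbers have generating function x / (1 - x - x^2).
print([nth_coefficient([0, 1], [1, -1, -1], n) for n in range(11)])
```

Each loop iteration performs two polynomial products of size about $d$ and halves $N$, which is where the $\log N$ factor comes from.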
-
Dialogue history integration into end-to-end signal-to-concept spoken language understanding systems
Authors:
Natalia Tomashenko,
Christian Raymond,
Antoine Caubriere,
Renato De Mori,
Yannick Esteve
Abstract:
This work investigates the embeddings for representing dialog history in spoken language understanding (SLU) systems. We focus on the scenario in which the semantic information is extracted directly from the speech signal by means of a single end-to-end neural network model. We propose to integrate dialog history into an end-to-end signal-to-concept SLU system. The dialog history is represented in the form of dialog history embedding vectors (so-called h-vectors) and is provided as additional information to end-to-end SLU models in order to improve the system performance. The following three types of h-vectors are proposed and experimentally evaluated in this paper: (1) supervised-all embeddings predicting the bag-of-concepts expected in the user's answer from the last dialog system response; (2) supervised-freq embeddings focusing on predicting only a selected set of semantic concepts (corresponding to the most frequent errors in our experiments); and (3) unsupervised embeddings. Experiments on the MEDIA corpus for the semantic slot filling task demonstrate that the proposed h-vectors improve the model performance.
Submitted 14 February, 2020;
originally announced February 2020.
-
Exponential-time quantum algorithms for graph coloring problems
Authors:
Kazuya Shimizu,
Ryuhei Mori
Abstract:
The fastest known classical algorithm deciding the $k$-colorability of an $n$-vertex graph requires running time $\Omega(2^n)$ for $k\ge 5$. In this work, we present an exponential-space quantum algorithm computing the chromatic number with running time $O(1.9140^n)$ using quantum random access memory (QRAM). Our approach is based on Ambainis et al.'s quantum dynamic programming with applications of Grover's search to branching algorithms. We also present a polynomial-space quantum algorithm not using QRAM for the graph $20$-coloring problem with running time $O(1.9575^n)$. In the polynomial-space quantum algorithm, we essentially show $(4-\epsilon)^n$-time classical algorithms that can be improved quadratically by Grover's search.
Submitted 30 June, 2019;
originally announced July 2019.
-
Real to H-space Encoder for Speech Recognition
Authors:
Titouan Parcollet,
Mohamed Morchid,
Georges Linarès,
Renato De Mori
Abstract:
Deep neural networks (DNNs) and more precisely recurrent neural networks (RNNs) are at the core of modern automatic speech recognition systems, due to their efficiency in processing input sequences. Recently, it has been shown that different input representations, based on multidimensional algebras, such as complex and quaternion numbers, are able to bring to neural networks a more natural, compressive and powerful representation of the input signal by outperforming common real-valued NNs. Indeed, quaternion-valued neural networks (QNNs) better learn both internal dependencies, such as the relation between the Mel-filter-bank value of a specific time frame and its time derivatives, and global dependencies, describing the relations that exist between time frames. Nonetheless, QNNs are limited to quaternion-valued input signals, and it is difficult to benefit from this powerful representation with real-valued input data. This paper proposes to tackle this weakness by introducing a real-to-quaternion encoder that allows QNNs to process any one-dimensional input features, such as traditional Mel-filter-banks for automatic speech recognition.
Submitted 17 June, 2019;
originally announced June 2019.
-
Multiple topic identification in telephone conversations
Authors:
Xavier Bost,
Marc El Bèze,
Renato De Mori
Abstract:
This paper deals with the automatic analysis of conversations between a customer and an agent in a call centre of a customer care service. The purpose of the analysis is to hypothesize themes about problems and complaints discussed in the conversation. Themes are defined by the application documentation topics. A conversation may contain mentions that are irrelevant for the application purpose and multiple themes whose mentions may be interleaved portions of a conversation that cannot be well defined. Two methods are proposed for multiple theme hypothesization. One of them is based on a cosine similarity measure using a bag of features extracted from the entire conversation. The other method introduces the concept of thematic density distributed around specific word positions in a conversation. In addition to automatically selected words, word bi-grams with possible gaps between successive words are also considered and selected. Experimental results show that the proposed methods outperform support vector machines on the same data. Furthermore, using the theme skeleton of a conversation from which thematic densities are derived, it will be possible to extract components of an automatic conversation report to be used for improving the service performance.
Index Terms: multi-topic audio document classification, human/human conversation analysis, speech analytics, distance bigrams
Submitted 29 December, 2018; v1 submitted 21 December, 2018;
originally announced December 2018.
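The first method in the abstract above scores themes with a cosine similarity over a bag of features taken from the whole conversation. A minimal sketch of that general idea follows; the feature choice (raw word counts), the theme names, and the toy vectors are all invented for illustration and do not come from the paper.

```python
from collections import Counter
from math import sqrt

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (dict-like)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_themes(conversation_tokens, theme_vectors):
    """Score every theme against the whole conversation, best first."""
    conv = Counter(conversation_tokens)
    scores = {name: cosine_similarity(conv, vec) for name, vec in theme_vectors.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical theme profiles and a toy transcript.
themes = {
    "timetable": Counter({"bus": 1, "late": 2, "schedule": 2}),
    "fares": Counter({"ticket": 2, "price": 2, "refund": 1}),
}
tokens = "the bus was late and the schedule was wrong".split()
ranking = rank_themes(tokens, themes)
print(ranking)
```

For multiple-theme hypothesization, one simple option would be to keep every theme whose score exceeds a threshold rather than only the top one; the threshold would have to be tuned on held-out data.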
-
Multiple topic identification in human/human conversations
Authors:
X. Bost,
G. Senay,
M. El-Bèze,
R. De Mori
Abstract:
The paper deals with the automatic analysis of real-life telephone conversations between an agent and a customer of a customer care service (CCS). The application domain is the public transportation system in Paris and the purpose is to collect statistics about customer problems in order to monitor the service and decide priorities on the intervention for improving user satisfaction. Of primary importance for the analysis is the detection of themes that are the object of customer problems. Themes are defined in the application requirements and are part of the application ontology that is implicit in the CCS documentation. Due to the variety of the customer population, the structure of conversations with an agent is unpredictable. A conversation may be about one or more themes. Theme mentions can be interleaved with mentions of facts that are irrelevant for the application purpose. Furthermore, in certain conversations theme mentions are localized in specific conversation segments while in other conversations mentions cannot be localized. As a consequence, approaches to feature extraction with and without mention localization are considered. Application domain relevant themes identified by an automatic procedure are expressed by specific sentences whose words are hypothesized by an automatic speech recognition (ASR) system. The ASR system is error prone. The word error rates can be very high for many reasons. Among them it is worth mentioning unpredictable background noise, speaker accent, and various types of speech disfluencies. As the application task requires the composition of proportions of theme mentions, a sequential decision strategy is introduced in this paper for performing a survey of the large number of conversations made available in a given time period. The strategy has to sample the conversations to form a survey containing enough data analyzed with high accuracy so that proportions can be estimated with sufficient accuracy.
Due to the unpredictable type of theme mentions, it is appropriate to consider methods for theme hypothesization based on global as well as local feature extraction. Two systems based on each type of feature extraction will be considered by the strategy. One of the four methods is novel. It is based on a new definition of density of theme mentions and on the localization of high density zones whose boundaries do not need to be precisely detected. The sequential decision strategy starts by grouping theme hypotheses into sets of different expected accuracy and coverage levels. For those sets for which accuracy can be improved with a consequent increase of coverage, a new system with new features is introduced. Its execution is triggered only when specific preconditions are met on the hypotheses generated by the basic four systems. Experimental results are provided on a corpus collected in the call center of the Paris transportation system known as RATP. The results show that surveys with high accuracy and coverage can be composed with the proposed strategy and systems. This makes it possible to apply a previously published proportion estimation approach that takes into account hypothesization errors.
Submitted 29 December, 2018; v1 submitted 18 December, 2018;
originally announced December 2018.
-
Speech recognition with quaternion neural networks
Authors:
Titouan Parcollet,
Mirco Ravanelli,
Mohamed Morchid,
Georges Linarès,
Renato De Mori
Abstract:
Neural network architectures are at the core of powerful automatic speech recognition (ASR) systems. However, while recent research focuses on novel model architectures, the acoustic input features remain almost unchanged. Traditional ASR systems rely on multidimensional acoustic features such as the Mel filter bank energies along with the first and second order derivatives to characterize the time-frames that compose the signal sequence. Considering that these components describe three different views of the same element, neural networks have to learn both the internal relations that exist within these features, and external or global dependencies that exist between the time-frames. Quaternion-valued neural networks (QNNs) have recently received considerable interest from researchers as a way to process and learn such relations in multidimensional spaces. Indeed, quaternion numbers and QNNs have shown their efficiency in processing multidimensional inputs as entities, in encoding internal dependencies, and in solving many tasks with up to four times fewer learning parameters than real-valued models. We propose to investigate modern quaternion-valued models such as convolutional and recurrent quaternion neural networks in the context of speech recognition with the TIMIT dataset. The experiments show that QNNs always outperform real-valued equivalent models with far fewer free parameters, leading to a more efficient, compact, and expressive representation of the relevant information.
Submitted 21 November, 2018;
originally announced November 2018.
-
Bidirectional Quaternion Long-Short Term Memory Recurrent Neural Networks for Speech Recognition
Authors:
Titouan Parcollet,
Mohamed Morchid,
Georges Linarès,
Renato De Mori
Abstract:
Recurrent neural networks (RNN) are at the core of modern automatic speech recognition (ASR) systems. In particular, long-short term memory (LSTM) recurrent neural networks have achieved state-of-the-art results in many speech recognition tasks, due to their efficient representation of long and short term dependencies in sequences of inter-dependent features. Nonetheless, internal dependencies within the element composing multidimensional features are weakly considered by traditional real-valued representations. We propose a novel quaternion long-short term memory (QLSTM) recurrent neural network that takes into account both the external relations between the features composing a sequence, and these internal latent structural dependencies with the quaternion algebra. QLSTMs are compared to LSTMs during a memory copy-task and a realistic application of speech recognition on the Wall Street Journal (WSJ) dataset. QLSTM reaches better performances during the two experiments with up to $2.8$ times less learning parameters, leading to a more expressive representation of the information.
Submitted 6 November, 2018;
originally announced November 2018.
-
Quaternion Convolutional Neural Networks for End-to-End Automatic Speech Recognition
Authors:
Titouan Parcollet,
Ying Zhang,
Mohamed Morchid,
Chiheb Trabelsi,
Georges Linarès,
Renato De Mori,
Yoshua Bengio
Abstract:
Recently, the connectionist temporal classification (CTC) model coupled with recurrent (RNN) or convolutional neural networks (CNN), made it easier to train speech recognition systems in an end-to-end fashion. However in real-valued models, time frame components such as mel-filter-bank energies and the cepstral coefficients obtained from them, together with their first and second order derivatives, are processed as individual elements, while a natural alternative is to process such components as composed entities. We propose to group such elements in the form of quaternions and to process these quaternions using the established quaternion algebra. Quaternion numbers and quaternion neural networks have shown their efficiency to process multidimensional inputs as entities, to encode internal dependencies, and to solve many tasks with less learning parameters than real-valued models. This paper proposes to integrate multiple feature views in quaternion-valued convolutional neural network (QCNN), to be used for sequence-to-sequence mapping with the CTC model. Promising results are reported using simple QCNNs in phoneme recognition experiments with the TIMIT corpus. More precisely, QCNNs obtain a lower phoneme error rate (PER) with less learning parameters than a competing model based on real-valued CNNs.
Submitted 20 June, 2018;
originally announced June 2018.
-
Quaternion Recurrent Neural Networks
Authors:
Titouan Parcollet,
Mirco Ravanelli,
Mohamed Morchid,
Georges Linarès,
Chiheb Trabelsi,
Renato De Mori,
Yoshua Bengio
Abstract:
Recurrent neural networks (RNNs) are powerful architectures to model sequential data, due to their capability to learn short and long-term dependencies between the basic elements of a sequence. Nonetheless, popular tasks such as speech or images recognition, involve multi-dimensional input features that are characterized by strong internal dependencies between the dimensions of the input vector. We propose a novel quaternion recurrent neural network (QRNN), alongside with a quaternion long-short term memory neural network (QLSTM), that take into account both the external relations and these internal structural dependencies with the quaternion algebra. Similarly to capsules, quaternions allow the QRNN to code internal dependencies by composing and processing multidimensional features as single entities, while the recurrent operation reveals correlations between the elements composing the sequence. We show that both QRNN and QLSTM achieve better performances than RNN and LSTM in a realistic application of automatic speech recognition. Finally, we show that QRNN and QLSTM reduce by a maximum factor of 3.3x the number of free parameters needed, compared to real-valued RNNs and LSTMs to reach better results, leading to a more compact representation of the relevant information.
Submitted 7 January, 2019; v1 submitted 12 June, 2018;
originally announced June 2018.
-
Periodic Fourier representation of Boolean functions
Authors:
Ryuhei Mori
Abstract:
In this work, we consider a new type of Fourier-like representation of Boolean function $f\colon\{+1,-1\}^n\to\{+1,-1\}$ \[ f(x) = \cos\left(π\sum_{S\subseteq[n]}φ_S \prod_{i\in S} x_i\right). \] This representation, which we call the periodic Fourier representation, of Boolean function is closely related to a certain type of multipartite Bell inequalities and non-adaptive measurement-based quantum computation with linear side-processing ($\mathrm{NMQC}_\oplus$). The minimum number of non-zero coefficients in the above representation, which we call the periodic Fourier sparsity, is equal to the required number of qubits for the exact computation of $f$ by $\mathrm{NMQC}_\oplus$. Periodic Fourier representations are not unique, and can be directly obtained both from the Fourier representation and the $\mathbb{F}_2$-polynomial representation. In this work, we first show that Boolean functions related to $\mathbb{Z}/4\mathbb{Z}$-polynomial have small periodic Fourier sparsities. Second, we show that the periodic Fourier sparsity is at least $2^{\mathrm{deg}_{\mathbb{F}_2}(f)}-1$, which means that $\mathrm{NMQC}_\oplus$ efficiently computes a Boolean function $f$ if and only if $\mathbb{F}_2$-degree of $f$ is small. Furthermore, we show that any symmetric Boolean function, e.g., $\mathsf{AND}_n$, $\mathsf{Mod}^3_n$, $\mathsf{Maj}_n$, etc, can be exactly computed by depth-2 $\mathrm{NMQC}_\oplus$ using a polynomial number of qubits, that implies exponential gaps between $\mathrm{NMQC}_\oplus$ and depth-2 $\mathrm{NMQC}_\oplus$.
Submitted 26 March, 2019; v1 submitted 27 March, 2018;
originally announced March 2018.
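As a small worked instance of the periodic Fourier representation defined in the abstract above (the coefficient choices here are illustrative assumptions, not taken from the paper), take $n=1$ and $f(x)=x_1$. Choosing $φ_\emptyset=\frac12$ and $φ_{\{1\}}=-\frac12$ gives \[ \cos\left(π\left(\tfrac12-\tfrac12 x_1\right)\right) = \begin{cases} \cos 0 = +1 & (x_1=+1),\\ \cos π = -1 & (x_1=-1), \end{cases} \] so $f(x)=x_1$ is represented with two non-zero coefficients. The same choice with $x_1$ replaced by $x_1x_2$, i.e. $φ_\emptyset=\frac12$ and $φ_{\{1,2\}}=-\frac12$, represents the parity $x_1x_2$ in the same way.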
-
ASR error management for improving spoken language understanding
Authors:
Edwin Simonnet,
Sahar Ghannay,
Nathalie Camelin,
Yannick Estève,
Renato De Mori
Abstract:
This paper addresses the problem of automatic speech recognition (ASR) error detection and the use of detected errors for improving spoken language understanding (SLU) systems. In this study, the SLU task consists of automatically extracting, from ASR transcriptions, semantic concepts and concept/value pairs in, e.g., a tourist information system. An approach is proposed for enriching the set of semantic labels with error-specific labels and for using a recently proposed neural approach based on word embeddings to compute well-calibrated ASR confidence measures. Experimental results show that it is possible to significantly decrease the Concept/Value Error Rate with a state-of-the-art system, outperforming previously published results on the same experimental data. It is also shown that, by combining an SLU approach based on conditional random fields with a neural encoder/decoder attention-based architecture, it is possible to effectively identify confidence islands and uncertain semantic output segments that are useful for deciding appropriate error-handling actions by the dialogue manager strategy.
Submitted 26 May, 2017;
originally announced May 2017.
-
Parallel Long Short-Term Memory for Multi-stream Classification
Authors:
Mohamed Bouaziz,
Mohamed Morchid,
Richard Dufour,
Georges Linarès,
Renato De Mori
Abstract:
Recently, machine learning methods have provided a broad spectrum of original and efficient algorithms based on Deep Neural Networks (DNN) to automatically predict an outcome with respect to a sequence of inputs. Recurrent hidden cells allow these DNN-based models to manage long-term dependencies such as Recurrent Neural Networks (RNN) and Long Short-Term Memory (LSTM). Nevertheless, these RNNs process a single input stream in one (LSTM) or two (Bidirectional LSTM) directions. But most of the information available nowadays is from multistreams or multimedia documents, and require RNNs to process these information synchronously during the training. This paper presents an original LSTM-based architecture, named Parallel LSTM (PLSTM), that carries out multiple parallel synchronized input sequences in order to predict a common output. The proposed PLSTM method could be used for parallel sequence classification purposes. The PLSTM approach is evaluated on an automatic telecast genre sequences classification task and compared with different state-of-the-art architectures. Results show that the proposed PLSTM method outperforms the baseline n-gram models as well as the state-of-the-art LSTM approach.
Submitted 11 February, 2017;
originally announced February 2017.
-
Sum of squares lower bounds for refuting any CSP
Authors:
Pravesh K. Kothari,
Ryuhei Mori,
Ryan O'Donnell,
David Witmer
Abstract:
Let $P:\{0,1\}^k \to \{0,1\}$ be a nontrivial $k$-ary predicate. Consider a random instance of the constraint satisfaction problem $\mathrm{CSP}(P)$ on $n$ variables with $Δn$ constraints, each being $P$ applied to $k$ randomly chosen literals. Provided the constraint density satisfies $Δ\gg 1$, such an instance is unsatisfiable with high probability. The \emph{refutation} problem is to efficiently find a proof of unsatisfiability.
We show that whenever the predicate $P$ supports a $t$-\emph{wise uniform} probability distribution on its satisfying assignments, the sum of squares (SOS) algorithm of degree $d = Θ(\frac{n}{Δ^{2/(t-1)} \log Δ})$ (which runs in time $n^{O(d)}$) \emph{cannot} refute a random instance of $\mathrm{CSP}(P)$. In particular, the polynomial-time SOS algorithm requires $\widetildeΩ(n^{(t+1)/2})$ constraints to refute random instances of CSP$(P)$ when $P$ supports a $t$-wise uniform distribution on its satisfying assignments. Together with recent work of Lee et al. [LRS15], our result also implies that \emph{any} polynomial-size semidefinite programming relaxation for refutation requires at least $\widetildeΩ(n^{(t+1)/2})$ constraints.
Our results (which also extend with no change to CSPs over larger alphabets) subsume all previously known lower bounds for semialgebraic refutation of random CSPs. For every constraint predicate~$P$, they give a three-way hardness tradeoff between the density of constraints, the SOS degree (hence running time), and the strength of the refutation. By recent algorithmic results of Allen et al. [AOW15] and Raghavendra et al. [RRS16], this full three-way tradeoff is \emph{tight}, up to lower-order factors.
Submitted 16 January, 2017;
originally announced January 2017.
-
Better Protocol for XOR Game using Communication Protocol and Nonlocal Boxes
Authors:
Ryuhei Mori
Abstract:
Buhrman showed that an efficient communication protocol implies a reliable XOR game protocol. This idea rederives Linial and Shraibman's lower bounds of communication complexity, which was derived by using factorization norms, with worse constant factor in much more intuitive way. In this work, we improve and generalize Buhrman's idea, and obtain a class of lower bounds for classical communication complexity including an exact Linial and Shraibman's lower bound as a special case. In the proof, we explicitly construct a protocol for XOR game from a classical communication protocol by using a concept of nonlocal boxes and Pawłowski et al.'s elegant protocol, which was used for showing the violation of information causality in superquantum theories.
Submitted 28 April, 2017; v1 submitted 16 January, 2017;
originally announced January 2017.
-
Lower bounds for CSP refutation by SDP hierarchies
Authors:
Ryuhei Mori,
David Witmer
Abstract:
For a $k$-ary predicate $P$, a random instance of CSP$(P)$ with $n$ variables and $m$ constraints is unsatisfiable with high probability when $m \gg n$. The natural algorithmic task in this regime is \emph{refutation}: finding a proof that a given random instance is unsatisfiable. Recent work of Allen et al. suggests that the difficulty of refuting CSP$(P)$ using an SDP is determined by a parameter $\mathrm{cmplx}(P)$, the smallest $t$ for which there does not exist a $t$-wise uniform distribution over satisfying assignments to $P$. In particular they show that random instances of CSP$(P)$ with $m \gg n^{\mathrm{cmplx(P)}/2}$ can be refuted efficiently using an SDP.
In this work, we give evidence that $n^{\mathrm{cmplx}(P)/2}$ constraints are also \emph{necessary} for refutation using SDPs. Specifically, we show that if $P$ supports a $(t-1)$-wise uniform distribution over satisfying assignments, then the Sherali-Adams$_+$ and Lovász-Schrijver$_+$ SDP hierarchies cannot refute a random instance of CSP$(P)$ in polynomial time for any $m \leq n^{t/2-ε}$.
Submitted 10 October, 2016;
originally announced October 2016.
-
Average Shortest Path Length of Graphs of Diameter 3
Authors:
Nobutaka Shimizu,
Ryuhei Mori
Abstract:
A network topology with low average shortest path length (ASPL) provides efficient data transmission, while the number of nodes and the number of links incident to each node are often limited due to physical constraints. In this paper, we consider the construction of low ASPL graphs under these constraints by using stochastic local search (SLS) algorithms. Since the ASPL cannot be calculated efficiently, it is not suitable as the evaluation function of SLS algorithms. We first derive an equality and bounds for the ASPL of graphs of diameter 3. Then, we propose to use the simplest upper bound, expressed in terms of the number of triangles and squares in the graph, as an evaluation function for graphs of diameter 3. We show that, by using some data tables, the proposed evaluation function can be evaluated in O(1) time as the number of nodes and the maximum degree tend to infinity. Using simulated annealing with the proposed evaluation function, we construct low ASPL regular graphs of diameter 3 with 10 000 nodes.
Submitted 16 June, 2016;
originally announced June 2016.
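For reference, the ASPL discussed in the abstract above can be computed directly by running a breadth-first search from every node. The following is a minimal sketch of that baseline, not the evaluation function proposed in the paper; the function name and the adjacency-list input format are assumptions made for illustration.

```python
from collections import deque


def average_shortest_path_length(adj):
    """Return the ASPL of a connected, unweighted, undirected graph.

    `adj` is a hypothetical adjacency-list input: adj[u] lists the
    neighbours of node u, with nodes numbered 0..n-1.
    """
    n = len(adj)
    if n < 2:
        raise ValueError("need at least two nodes")
    total = 0
    for source in range(n):
        # Breadth-first search gives shortest hop counts from `source`.
        dist = [-1] * n
        dist[source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if min(dist) < 0:
            raise ValueError("graph is disconnected")
        total += sum(dist)
    # Every ordered pair (u, v) with u != v is counted exactly once.
    return total / (n * (n - 1))
```

Each call performs one BFS per node, so re-evaluating it after every local-search move quickly becomes expensive on graphs with thousands of nodes, which is the cost a cheaper evaluation function is meant to avoid.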
-
Three-input Majority Function as the Unique Optimal Function for the Bias Amplification using Nonlocal Boxes
Authors:
Ryuhei Mori
Abstract:
Brassard et al. [Phys. Rev. Lett. 96, 250401 (2006)] showed that shared nonlocal boxes with the CHSH probability greater than $\frac{3+\sqrt{6}}6$ yields trivial communication complexity. There still exists the gap with the maximum CHSH probability $\frac{2+\sqrt{2}}4$ achievable by quantum mechanics. It is an interesting open question to determine the exact threshold for the trivial communication complexity. Brassard et al.'s idea is based on the recursive bias amplification by the 3-input majority function. It was not obvious if other choice of function exhibits stronger bias amplification. We show that the 3-input majority function is the unique optimal, so that one cannot improve the threshold $\frac{3+\sqrt{6}}6$ by Brassard et al.'s bias amplification. In this work, protocols for computing the function used for the bias amplification are restricted to be non-adaptive protocols or particular adaptive protocol inspired by Pawłowski et al.'s protocol for information causality [Nature 461, 1101 (2009)]. We first show a new adaptive protocol inspired by Pawłowski et al.'s protocol, and then show that the new adaptive protocol is better than any non-adaptive protocol. Finally, we show that the 3-input majority function is the unique optimal for the bias amplification if we apply the new adaptive protocol to each step of the bias amplification.
Submitted 29 November, 2016; v1 submitted 18 April, 2016;
originally announced April 2016.
-
Peeling Algorithm on Random Hypergraphs with Superlinear Number of Hyperedges
Authors:
Ryuhei Mori,
Osamu Watanabe
Abstract:
When we try to solve a system of linear equations, we can consider a simple iterative algorithm in which an equation including only one variable is chosen at each step, and the variable is fixed to the value satisfying the equation. The dynamics of this algorithm is captured by the peeling algorithm. Analyses of the peeling algorithm on random hypergraphs are required for many problems, e.g., the decoding threshold of low-density parity check codes, the inverting threshold of Goldreich's pseudorandom generator, the load threshold of cuckoo hashing, etc. In this work, we deal with random hypergraphs including superlinear number of hyperedges, and derive the tight threshold for the succeeding of the peeling algorithm. For the analysis, Wormald's method of differential equations, which is commonly used for analyses of the peeling algorithm on random hypergraph with linear number of hyperedges, cannot be used due to the superlinear number of hyperedges. A new method called the evolution of the moment generating function is proposed in this work.
Submitted 1 June, 2015;
originally announced June 2015.
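The iterative procedure described at the start of the abstract above can be sketched as follows. This is a minimal illustration, assuming equations over GF(2) (each equation states that the XOR of distinct variables equals a bit); the function name and input format are hypothetical, and consistency of the system is not checked.

```python
from collections import deque


def peel(num_vars, equations):
    """Peeling on a GF(2) linear system.

    `equations` is a list of (variables, rhs) pairs meaning
    XOR of the listed distinct variables == rhs.
    Returns a dict of the variables that peeling manages to fix.
    """
    remaining = [set(vs) for vs, _ in equations]  # unresolved variables per equation
    rhs = [b & 1 for _, b in equations]
    occurs = [[] for _ in range(num_vars)]        # equations containing each variable
    for e, vs in enumerate(remaining):
        for v in vs:
            occurs[v].append(e)

    value = {}
    # Equations that currently contain exactly one unresolved variable.
    queue = deque(e for e, vs in enumerate(remaining) if len(vs) == 1)
    while queue:
        e = queue.popleft()
        if len(remaining[e]) != 1:
            continue  # its variable was already fixed through another equation
        (v,) = remaining[e]
        value[v] = rhs[e]
        # Substitute the fixed value into every equation that contains v.
        for f in occurs[v]:
            if v in remaining[f]:
                remaining[f].discard(v)
                rhs[f] ^= value[v]
                if len(remaining[f]) == 1:
                    queue.append(f)
    return value
```

For example, `peel(3, [([0], 1), ([0, 1], 0), ([1, 2], 1)])` fixes all three variables, whereas a system in which every equation has at least two unresolved variables leaves the procedure stuck with an empty result.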
-
Holographic Transformation, Belief Propagation and Loop Calculus for Generalized Probabilistic Theories
Authors:
Ryuhei Mori
Abstract:
The holographic transformation, belief propagation and loop calculus are generalized to problems in generalized probabilistic theories including quantum mechanics. In this work, the partition function of classical factor graph is represented by an inner product of two high-dimensional vectors both of which can be decomposed to tensor products of low-dimensional vectors. On the representation, the holographic transformation is clearly understood by using adjoint linear maps. Furthermore, on the formulation using inner product, the belief propagation is naturally defined from the derivation of the loop calculus formula. As a consequence, the holographic transformation, the belief propagation and the loop calculus are generalized to measurement problems in quantum mechanics and generalized probabilistic theories.
Submitted 24 April, 2015; v1 submitted 17 January, 2015;
originally announced January 2015.
-
Linear Programming Relaxations for Goldreich's Generators over Non-Binary Alphabets
Authors:
Ryuhei Mori,
Takeshi Koshiba,
Osamu Watanabe,
Masaki Yamamoto
Abstract:
Goldreich suggested candidates of one-way functions and pseudorandom generators included in $\mathsf{NC}^0$. It is known that a randomly generated Goldreich's generator using $(r-1)$-wise independent predicates with $n$ input variables and $m=C n^{r/2}$ output variables is, with high probability, not a pseudorandom generator for a sufficiently large constant $C$. Most of the previous works assume that the alphabet is binary and use techniques available only for the binary alphabet. In this paper, we deal with a non-binary generalization of Goldreich's generator and derive the tight threshold for the linear programming relaxation attack using the local marginal polytope for randomly generated Goldreich's generators. We assume that $u(n)\in ω(1)\cap o(n)$ input variables are known. In that case, we show that when $r\ge 3$, there is an exact threshold $μ_\mathrm{c}(k,r):=\binom{k}{r}^{-1}\frac{(r-2)^{r-2}}{r(r-1)^{r-1}}$ such that for $m=μ\frac{n^{r-1}}{u(n)^{r-2}}$, the LP relaxation can determine linearly many input variables of Goldreich's generator if $μ>μ_\mathrm{c}(k,r)$, and that the LP relaxation cannot determine $\frac1{r-2} u(n)$ input variables of Goldreich's generator if $μ<μ_\mathrm{c}(k,r)$. This paper uses a characterization of LP solutions by combinatorial structures called stopping sets on a bipartite graph, which is related to a simple algorithm called the peeling algorithm.
Submitted 2 June, 2014;
originally announced June 2014.
-
Holographic Transformation for Quantum Factor Graphs
Authors:
Ryuhei Mori
Abstract:
Recently, a general tool called a holographic transformation, which transforms an expression of the partition function to another form, has been used for polynomial-time algorithms and for improvement and understanding of the belief propagation. In this work, the holographic transformation is generalized to quantum factor graphs.
Submitted 6 February, 2014; v1 submitted 25 January, 2014;
originally announced January 2014.
-
Loop Calculus for Non-Binary Alphabets using Concepts from Information Geometry
Authors:
Ryuhei Mori
Abstract:
The Bethe approximation is a well-known approximation of the partition function used in statistical physics. Recently, an equality relating the partition function and its Bethe approximation was obtained for graphical models with binary variables by Chertkov and Chernyak. In this equality, the multiplicative error in the Bethe approximation is represented as a weighted sum over all generalized loops in the graphical model. In this paper, the equality is generalized to graphical models with non-binary alphabet using concepts from information geometry.
Submitted 19 December, 2014; v1 submitted 25 September, 2013;
originally announced September 2013.
-
New Understanding of the Bethe Approximation and the Replica Method
Authors:
Ryuhei Mori
Abstract:
In this thesis, new generalizations of the Bethe approximation and new understanding of the replica method are proposed. The Bethe approximation is an efficient approximation for graphical models, which gives an asymptotically accurate estimate of the partition function for many graphical models. The Bethe approximation explains the well-known message passing algorithm, belief propagation, which is exact for tree graphical models. It is also known that the cluster variational method gives the generalized Bethe approximation, called the Kikuchi approximation, yielding the generalized belief propagation. In the thesis, a new series of generalization of the Bethe approximation is proposed, which is named the asymptotic Bethe approximation. The asymptotic Bethe approximation is derived from the characterization of the Bethe free energy using graph covers, which was recently obtained by Vontobel. The asymptotic Bethe approximation can be expressed in terms of the edge zeta function by using Watanabe and Fukumizu's result about the Hessian of the Bethe entropy. The asymptotic Bethe approximation is confirmed to be better than the conventional Bethe approximation on some conditions. For this purpose, Chertkov and Chernyak's loop calculus formula is employed, which shows that the error of the Bethe approximation can be expressed as a sum of weights corresponding to generalized loops, and generalized for non-binary finite alphabets by using concepts of information geometry.
Submitted 9 March, 2013;
originally announced March 2013.
-
Source and Channel Polarization over Finite Fields and Reed-Solomon Matrices
Authors:
Ryuhei Mori,
Toshiyuki Tanaka
Abstract:
Polarization phenomenon over any finite field $\mathbb{F}_{q}$ with size $q$ being a power of a prime is considered. This problem is a generalization of the original proposal of channel polarization by Arikan for the binary field, as well as its extension to a prime field by Sasoglu, Telatar, and Arikan. In this paper, a necessary and sufficient condition of a matrix over a finite field $\mathbb{F}_q$ is shown under which any source and channel are polarized. Furthermore, the result of the speed of polarization for the binary alphabet obtained by Arikan and Telatar is generalized to arbitrary finite field. It is also shown that the asymptotic error probability of polar codes is improved by using the Reed-Solomon matrix, which can be regarded as a natural generalization of the $2\times 2$ binary matrix used in the original proposal by Arikan.
Submitted 20 February, 2014; v1 submitted 22 November, 2012;
originally announced November 2012.
-
New Generalizations of the Bethe Approximation via Asymptotic Expansion
Authors:
Ryuhei Mori,
Toshiyuki Tanaka
Abstract:
The Bethe approximation, discovered in statistical physics, gives an efficient algorithm called belief propagation (BP) for approximating a partition function. BP empirically gives an accurate approximation for many problems, e.g., low-density parity-check codes, compressed sensing, etc. Recently, Vontobel gave a novel characterization of the Bethe approximation using graph covers. In this paper, a new approximation based on the Bethe approximation is proposed. The new approximation is derived from Vontobel's characterization using graph covers, and is expressed by using the edge zeta function, which is related to the Hessian of the Bethe free energy as shown by Watanabe and Fukumizu. Under some conditions, it is proved that the new approximation is asymptotically better than the Bethe approximation.
Submitted 10 October, 2012; v1 submitted 9 October, 2012;
originally announced October 2012.
-
Central Approximation in Statistical Physics and Information Theory
Authors:
Ryuhei Mori,
Toshiyuki Tanaka
Abstract:
In statistical physics and information theory, although the exponent of the partition function is often of our primary interest, there are cases where one needs more detailed information. In this paper, we present a general framework to study more precise asymptotic behaviors of the partition function, using the central approximation in conjunction with the method of types.
Submitted 3 February, 2012;
originally announced February 2012.
-
Statistical Mechanical Analysis of Low-Density Parity-Check Codes on General Markov Channel
Authors:
Ryuhei Mori,
Toshiyuki Tanaka
Abstract:
Low-density parity-check (LDPC) codes on symmetric memoryless channels have been analyzed using statistical physics by several authors. In this paper, statistical mechanical analysis of LDPC codes is performed for asymmetric memoryless channels and general Markov channels. It is shown that the saddle point equations of the replica symmetric solution for a Markov channel are equivalent to the density evolution of belief propagation on the factor graph representing LDPC codes on the Markov channel. The derivation uses the method of types for Markov chains.
Submitted 10 October, 2011;
originally announced October 2011.
-
Rate-Dependent Analysis of the Asymptotic Behavior of Channel Polarization
Authors:
S. Hamed Hassani,
Ryuhei Mori,
Toshiyuki Tanaka,
Rudiger Urbanke
Abstract:
For a binary-input memoryless symmetric channel $W$, we consider the asymptotic behavior of the polarization process in the large block-length regime when transmission takes place over $W$. In particular, we study the asymptotics of the cumulative distribution $\mathbb{P}(Z_n \leq z)$, where $\{Z_n\}$ is the Bhattacharyya process defined from $W$, and its dependence on the rate of transmission. On the basis of this result, we characterize the asymptotic behavior, as well as its dependence on the rate, of the block error probability of polar codes using the successive cancellation decoder. This refines the original bounds by Arıkan and Telatar. Our results apply to general polar codes based on $\ell \times \ell$ kernel matrices.
We also provide lower bounds on the block error probability of polar codes using the MAP decoder. The MAP lower bound and the successive cancellation upper bound coincide when $\ell=2$, but there is a gap for $\ell>2$.
Submitted 4 October, 2011; v1 submitted 2 October, 2011;
originally announced October 2011.
-
Near concavity of the growth rate for coupled LDPC chains
Authors:
S. Hamed Hassani,
Nicolas Macris,
Ryuhei Mori
Abstract:
Convolutional Low-Density-Parity-Check (LDPC) ensembles have excellent performance. Their iterative threshold increases with their average degree, or with the size of the coupling window in randomized constructions. In the latter case, as the window size grows, the Belief Propagation (BP) threshold attains the maximum-a-posteriori (MAP) threshold of the underlying ensemble. In this contribution we show that a similar phenomenon happens for the growth rate of coupled ensembles. Loosely speaking, we observe that as the coupling strength grows, the growth rate of the coupled ensemble comes close to the concave hull of the underlying ensemble's growth rate. For ensembles randomly coupled across a window, the growth rate actually tends to the concave hull of the underlying one as the window size increases. Our observations are supported by the calculations of the combinatorial growth rate, and that of the growth rate derived from the replica method. The observed concavity is a general feature of coupled mean field graphical models and is already present at the level of coupled Curie-Weiss models. There, the canonical free energy of the coupled system tends to the concave hull of the underlying one. As we explain, the behavior of the growth rate of coupled ensembles is exactly analogous.
Submitted 4 April, 2011;
originally announced April 2011.
-
Connection between Annealed Free Energy and Belief Propagation on Random Factor Graph Ensembles
Authors:
Ryuhei Mori
Abstract:
Recently, Vontobel showed the relationship between Bethe free energy and annealed free energy for protograph factor graph ensembles. In this paper, the annealed free energy of any random regular, irregular, or Poisson factor graph ensemble is connected to the Bethe free energy. The annealed free energy is expressed as the solution of a maximization problem whose stationarity conditions coincide with the equations of belief propagation, since the contribution to the partition function of a particular type of variable and factor nodes has a form similar to that of the negative Bethe free energy. This gives a simple derivation of the replica symmetric solution. As a consequence, it is shown that, under the replica symmetric ansatz, the replica symmetric solution and the annealed free energy are equal for regular ensembles.
Submitted 18 February, 2011; v1 submitted 15 February, 2011;
originally announced February 2011.
-
Effects of Single-Cycle Structure on Iterative Decoding for Low-Density Parity-Check Codes
Authors:
Ryuhei Mori,
Toshiyuki Tanaka,
Kenta Kasai,
Kohichi Sakaniwa
Abstract:
We consider communication over the binary erasure channel (BEC) using low-density parity-check (LDPC) codes and belief propagation (BP) decoding. For fixed numbers of BP iterations, the bit error probability approaches a limit as blocklength tends to infinity, and the limit is obtained via density evolution. On the other hand, the difference between the bit error probability of codes with blocklength $n$ and that in the large blocklength limit is asymptotically $α(ε,t)/n + Θ(n^{-2})$ where $α(ε,t)$ denotes a specific constant determined by the code ensemble considered, the number $t$ of iterations, and the erasure probability $ε$ of the BEC. In this paper, we derive a set of recursive formulas which allows evaluation of the constant $α(ε,t)$ for standard irregular ensembles. The dominant difference $α(ε,t)/n$ can be considered as effects of cycle-free and single-cycle structures of local graphs. Furthermore, it is confirmed via numerical simulations that estimation of the bit error probability using $α(ε,t)$ is accurate even for small blocklengths.
Submitted 2 October, 2010;
originally announced October 2010.
-
Non-Binary Polar Codes using Reed-Solomon Codes and Algebraic Geometry Codes
Authors:
Ryuhei Mori,
Toshiyuki Tanaka
Abstract:
Polar codes, introduced by Arikan, achieve the symmetric capacity of any discrete memoryless channel with low encoding and decoding complexity. Recently, non-binary polar codes have been investigated. In this paper, we calculate the error probability of non-binary polar codes constructed on the basis of Reed-Solomon matrices by numerical simulations. It is confirmed that 4-ary polar codes have significantly better performance than binary polar codes on the binary-input AWGN channel. We also discuss an interpretation of polar codes in terms of algebraic geometry codes, and further show that polar codes using Hermitian codes have asymptotically good performance.
Submitted 21 July, 2010;
originally announced July 2010.
-
Properties and Construction of Polar Codes
Authors:
Ryuhei Mori
Abstract:
Recently, Arıkan introduced the method of channel polarization, with which one can construct efficient capacity-achieving codes, called polar codes, for any binary discrete memoryless channel. In this thesis, we show that the decoding algorithm of polar codes, called successive cancellation decoding, can be regarded as belief propagation decoding, which has been used for decoding of low-density parity-check codes, on a tree graph. On the basis of this observation, we show an efficient construction method of polar codes using density evolution, which has been used for evaluation of the error probability of belief propagation decoding on a tree graph. We further show that the channel polarization phenomenon and polar codes can be generalized to non-binary discrete memoryless channels. The asymptotic performance of non-binary polar codes, which use non-binary matrices called the Reed-Solomon matrices, is better than that of the best explicitly known binary polar code. We also find that the Reed-Solomon matrices can be considered a natural generalization of the original binary channel polarization introduced by Arıkan.
Submitted 18 February, 2010;
originally announced February 2010.
-
Channel Polarization on q-ary Discrete Memoryless Channels by Arbitrary Kernels
Authors:
Ryuhei Mori,
Toshiyuki Tanaka
Abstract:
A method of channel polarization, proposed by Arikan, allows us to construct efficient capacity-achieving channel codes. In the original work, binary input discrete memoryless channels are considered. A special case of $q$-ary channel polarization is considered by Sasoglu, Telatar, and Arikan. In this paper, we consider more general channel polarization on $q$-ary channels. We further show explicit constructions using Reed-Solomon codes, on which asymptotically fast channel polarization is induced.
Submitted 21 July, 2010; v1 submitted 15 January, 2010;
originally announced January 2010.
-
Refined rate of channel polarization
Authors:
Toshiyuki Tanaka,
Ryuhei Mori
Abstract:
A rate-dependent upper bound of the best achievable block error probability of polar codes with successive-cancellation decoding is derived.
Submitted 13 January, 2010;
originally announced January 2010.
-
Performance and Construction of Polar Codes on Symmetric Binary-Input Memoryless Channels
Authors:
Ryuhei Mori,
Toshiyuki Tanaka
Abstract:
Channel polarization is a method of constructing capacity-achieving codes for symmetric binary-input discrete memoryless channels (B-DMCs) [1]. In the original paper, the construction complexity is exponential in the blocklength. In this paper, a new construction method for arbitrary symmetric binary memoryless channels (B-MCs) with linear complexity in the blocklength is proposed. Furthermore, new upper and lower bounds of the block error probability of polar codes are derived for the BEC and the arbitrary symmetric B-MC, respectively.
Submitted 23 May, 2009; v1 submitted 15 January, 2009;
originally announced January 2009.
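As a rough illustration of the polarization idea for the BEC only, where the recursion is known in closed form: under Arikan's $2\times 2$ kernel, a BEC with erasure probability $ε$ splits into two BECs with erasure probabilities $2ε-ε^2$ and $ε^2$. The sketch below is not the construction method proposed in the paper; the function names `bec_polarize` and `select_info_set` and the index ordering of the synthesized channels are assumptions made for this example.

```python
def bec_polarize(eps, levels):
    """Erasure probabilities of the 2**levels synthesized channels of a BEC(eps).

    Each step replaces a channel with erasure probability e by a worse one
    (2e - e^2) and a better one (e^2). The output ordering is an assumption
    of this sketch and may differ from the indexing used in a given paper.
    """
    z = [eps]
    for _ in range(levels):
        z = [w for e in z for w in (2 * e - e * e, e * e)]
    return z


def select_info_set(eps, levels, k):
    """Indices of the k synthesized channels with the smallest erasure probability."""
    z = bec_polarize(eps, levels)
    best = sorted(range(len(z)), key=lambda i: z[i])[:k]
    return sorted(best)


if __name__ == "__main__":
    print(bec_polarize(0.5, 2))
    print(select_info_set(0.5, 2, 2))
```

The average of the returned erasure probabilities stays equal to `eps` at every level, which is a quick sanity check on the recursion.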
-
Finite-Length Analysis of Irregular Expurgated LDPC Codes under Finite Number of Iterations
Authors:
Ryuhei Mori,
Toshiyuki Tanaka,
Kenta Kasai,
Kohichi Sakaniwa
Abstract:
Communication over the binary erasure channel (BEC) using low-density parity-check (LDPC) codes and belief propagation (BP) decoding is considered. The average bit error probability of an irregular LDPC code ensemble after a fixed number of iterations converges to a limit, which is calculated via density evolution, as the blocklength $n$ tends to infinity. The difference between the bit error probability with blocklength $n$ and the large-blocklength limit behaves asymptotically like $α/n$, where the coefficient $α$ depends on the ensemble, the number of iterations and the erasure probability of the BEC. In [1], $α$ is calculated for regular ensembles. In this paper, $α$ for irregular expurgated ensembles is derived. It is demonstrated that convergence of numerical estimates of $α$ to the analytic result is significantly fast for irregular unexpurgated ensembles.
Submitted 23 May, 2009; v1 submitted 15 January, 2009;
originally announced January 2009.
-
The Asymptotic Bit Error Probability of LDPC Codes for the Binary Erasure Channel with Finite Iteration Number
Authors:
Ryuhei Mori,
Kenta Kasai,
Tomoharu Shibuya,
Kohichi Sakaniwa
Abstract:
We consider communication over the binary erasure channel (BEC) using low-density parity-check (LDPC) codes and belief propagation (BP) decoding. The bit error probability for infinite block length is known by density evolution, and it is well known that the difference between the bit error probability at a finite iteration number for finite block length $n$ and for infinite block length is asymptotically $α/n$, where $α$ is a specific constant depending on the degree distribution, the iteration number and the erasure probability. Our main result is an efficient algorithm for calculating $α$ for regular ensembles. The approximation using $α$ is accurate for $(2,r)$-regular ensembles even for small block lengths.
Submitted 23 January, 2008; v1 submitted 7 January, 2008;
originally announced January 2008.
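For orientation, the infinite-block-length limit mentioned in the abstract above can be sketched with the standard density evolution recursion for a regular ensemble on the BEC, $x_{t+1} = ε\,(1-(1-x_t)^{d_c-1})^{d_v-1}$ with $x_0 = ε$, where $x_t$ is the erasure probability of a variable-to-check message after $t$ iterations. This is only the well-known limit; it does not compute the finite-length coefficient $α$ derived in the paper. The function name `bec_density_evolution` and the parameter names `dv`, `dc` are assumptions made for this example.

```python
def bec_density_evolution(eps, dv, dc, iterations):
    """Message erasure probability after `iterations` BP iterations.

    Standard density evolution for a (dv, dc)-regular LDPC ensemble on a
    BEC with erasure probability eps, in the infinite-block-length limit.
    """
    x = eps  # before any iteration, messages are erased with probability eps
    for _ in range(iterations):
        # A check-to-variable message is erased unless all dc-1 other inputs
        # are known; a variable-to-check message is erased if the channel
        # erased the bit and all dv-1 other incoming messages are erased.
        x = eps * (1.0 - (1.0 - x) ** (dc - 1)) ** (dv - 1)
    return x


if __name__ == "__main__":
    for eps in (0.40, 0.45):
        print(eps, bec_density_evolution(eps, dv=3, dc=6, iterations=500))
```

For the (3,6)-regular ensemble the recursion tends to zero for `eps = 0.40` and stalls at a positive fixed point for `eps = 0.45`, consistent with a BP threshold between the two values.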