Search | arXiv e-print repository

On Model Selection Consistency of Lasso for High-Dimensional Ising Models

Authors: Xiangming Meng, Tomoyuki Obuchi, Yoshiyuki Kabashima

Abstract: We theoretically analyze the model selection consistency of least absolute shrinkage and selection operator (Lasso), both with and without post-thresholding, for high-dimensional Ising models. For random regular (RR) graphs of size $p$ with regular node degree $d$ and uniform couplings $θ_0$, it is rigorously proved that Lasso \textit{without post-thresholding} is model selection consistent in the… ▽ More We theoretically analyze the model selection consistency of least absolute shrinkage and selection operator (Lasso), both with and without post-thresholding, for high-dimensional Ising models. For random regular (RR) graphs of size $p$ with regular node degree $d$ and uniform couplings $θ_0$, it is rigorously proved that Lasso \textit{without post-thresholding} is model selection consistent in the whole paramagnetic phase with the same order of sample complexity $n=Ω{(d^3\log{p})}$ as that of $\ell_1$-regularized logistic regression ($\ell_1$-LogR). This result is consistent with the conjecture in Meng, Obuchi, and Kabashima 2021 using the non-rigorous replica method from statistical physics and thus complements it with a rigorous proof. For general tree-like graphs, it is demonstrated that the same result as RR graphs can be obtained under mild assumptions of the dependency condition and incoherence condition. Moreover, we provide a rigorous proof of the model selection consistency of Lasso with post-thresholding for general tree-like graphs in the paramagnetic phase without further assumptions on the dependency and incoherence conditions. Experimental results agree well with our theoretical analysis. △ Less

Submitted 17 February, 2023; v1 submitted 16 October, 2021; originally announced October 2021.

Comments: AISTATS2023, camera-ready version

arXiv:2102.03988 [pdf, other]

doi 10.1088/1742-5468/ac9831

Ising Model Selection Using $\ell_{1}$-Regularized Linear Regression: A Statistical Mechanics Analysis

Authors: Xiangming Meng, Tomoyuki Obuchi, Yoshiyuki Kabashima

Abstract: We theoretically analyze the typical learning performance of $\ell_{1}$-regularized linear regression ($\ell_1$-LinR) for Ising model selection using the replica method from statistical mechanics. For typical random regular graphs in the paramagnetic phase, an accurate estimate of the typical sample complexity of $\ell_1$-LinR is obtained. Remarkably, despite the model misspecification, $\ell_1$-L… ▽ More We theoretically analyze the typical learning performance of $\ell_{1}$-regularized linear regression ($\ell_1$-LinR) for Ising model selection using the replica method from statistical mechanics. For typical random regular graphs in the paramagnetic phase, an accurate estimate of the typical sample complexity of $\ell_1$-LinR is obtained. Remarkably, despite the model misspecification, $\ell_1$-LinR is model selection consistent with the same order of sample complexity as $\ell_{1}$-regularized logistic regression ($\ell_1$-LogR), i.e., $M=\mathcal{O}\left(\log N\right)$, where $N$ is the number of variables of the Ising model. Moreover, we provide an efficient method to accurately predict the non-asymptotic behavior of $\ell_1$-LinR for moderate $M, N$, such as precision and recall. Simulations show a fairly good agreement between theoretical predictions and experimental results, even for graphs with many loops, which supports our findings. Although this paper mainly focuses on $\ell_1$-LinR, our method is readily applicable for precisely characterizing the typical learning performances of a wide class of $\ell_{1}$-regularized $M$-estimators including $\ell_1$-LogR and interaction screening. △ Less

Submitted 1 November, 2021; v1 submitted 7 February, 2021; originally announced February 2021.

Comments: Accepted to NeurIPS 2021. Camera-ready version with supplementary materials

arXiv:2008.08342 [pdf, other]

doi 10.1088/1742-5468/abfa10

Structure Learning in Inverse Ising Problems Using $\ell_2$-Regularized Linear Estimator

Authors: Xiangming Meng, Tomoyuki Obuchi, Yoshiyuki Kabashima

Abstract: The inference performance of the pseudolikelihood method is discussed in the framework of the inverse Ising problem when the $\ell_2$-regularized (ridge) linear regression is adopted. This setup is introduced for theoretically investigating the situation where the data generation model is different from the inference one, namely the model mismatch situation. In the teacher-student scenario under t… ▽ More The inference performance of the pseudolikelihood method is discussed in the framework of the inverse Ising problem when the $\ell_2$-regularized (ridge) linear regression is adopted. This setup is introduced for theoretically investigating the situation where the data generation model is different from the inference one, namely the model mismatch situation. In the teacher-student scenario under the assumption that the teacher couplings are sparse, the analysis is conducted using the replica and cavity methods, with a special focus on whether the presence/absence of teacher couplings is correctly inferred or not. The result indicates that despite the model mismatch, one can perfectly identify the network structure using naive linear regression without regularization when the number of spins $N$ is smaller than the dataset size $M$, in the thermodynamic limit $N\to \infty$. Further, to access the underdetermined region $M < N$, we examine the effect of the $\ell_2$ regularization, and find that biases appear in all the coupling estimates, preventing the perfect identification of the network structure. We, however, find that the biases are shown to decay exponentially fast as the distance from the center spin chosen in the pseudolikelihood method grows. Based on this finding, we propose a two-stage estimator: In the first stage, the ridge regression is used and the estimates are pruned by a relatively small threshold; in the second stage the naive linear regression is conducted only on the remaining couplings, and the resultant estimates are again pruned by another relatively large threshold. This estimator with the appropriate regularization coefficient and thresholds is shown to achieve the perfect identification of the network structure even in $0<M/N<1$. Results of extensive numerical experiments support these findings. △ Less

Submitted 23 November, 2020; v1 submitted 19 August, 2020; originally announced August 2020.

Comments: 35 pages, 8 figures

arXiv:2008.03175 [pdf, other]

doi 10.7566/JPSJ.89.124802

Reconstructing Sparse Signals via Greedy Monte-Carlo Search

Authors: Kao Hayashi, Tomoyuki Obuchi, Yoshiyuki Kabashima

Abstract: We propose a Monte-Carlo-based method for reconstructing sparse signals in the formulation of sparse linear regression in a high-dimensional setting. The basic idea of this algorithm is to explicitly select variables or covariates to represent a given data vector or responses and accept randomly generated updates of that selection if and only if the energy or cost function decreases. This algorith… ▽ More We propose a Monte-Carlo-based method for reconstructing sparse signals in the formulation of sparse linear regression in a high-dimensional setting. The basic idea of this algorithm is to explicitly select variables or covariates to represent a given data vector or responses and accept randomly generated updates of that selection if and only if the energy or cost function decreases. This algorithm is called the greedy Monte-Carlo (GMC) search algorithm. Its performance is examined via numerical experiments, which suggests that in the noiseless case, GMC can achieve perfect reconstruction in undersampling situations of a reasonable level: it can outperform the $\ell_1$ relaxation but does not reach the algorithmic limit of MC-based methods theoretically clarified by an earlier analysis. The necessary computational time is also examined and compared with that of an algorithm using simulated annealing. Additionally, experiments on the noisy case are conducted on synthetic datasets and on a real-world dataset, supporting the practicality of GMC. △ Less

Submitted 29 January, 2021; v1 submitted 7 August, 2020; originally announced August 2020.

Comments: 15 pages, 4 figures

arXiv:1912.11591 [pdf, other]

doi 10.1088/1742-5468/ab8c3a

Learning performance in inverse Ising problems with sparse teacher couplings

Authors: Alia Abbara, Yoshiyuki Kabashima, Tomoyuki Obuchi, Yingying Xu

Abstract: We investigate the learning performance of the pseudolikelihood maximization method for inverse Ising problems. In the teacher-student scenario under the assumption that the teacher's couplings are sparse and the student does not know the graphical structure, the learning curve and order parameters are assessed in the typical case using the replica and cavity methods from statistical mechanics. Ou… ▽ More We investigate the learning performance of the pseudolikelihood maximization method for inverse Ising problems. In the teacher-student scenario under the assumption that the teacher's couplings are sparse and the student does not know the graphical structure, the learning curve and order parameters are assessed in the typical case using the replica and cavity methods from statistical mechanics. Our formulation is also applicable to a certain class of cost functions having locality; the standard likelihood does not belong to that class. The derived analytical formulas indicate that the perfect inference of the presence/absence of the teacher's couplings is possible in the thermodynamic limit taking the number of spins $N$ as infinity while keeping the dataset size $M$ proportional to $N$, as long as $α=M/N > 2$. Meanwhile, the formulas also show that the estimated coupling values corresponding to the truly existing ones in the teacher tend to be overestimated in the absolute value, manifesting the presence of estimation bias. These results are considered to be exact in the thermodynamic limit on locally tree-like networks, such as the regular random or Erdős--Rényi graphs. Numerical simulation results fully support the theoretical predictions. Additional biases in the estimators on loopy graphs are also discussed. △ Less

Submitted 1 May, 2020; v1 submitted 24 December, 2019; originally announced December 2019.

Comments: 29 pages, 8 figures

arXiv:1906.06002 [pdf, ps, other]

doi 10.1088/1751-8121/ab57a7

Empirical Bayes Method for Boltzmann Machines

Authors: Muneki Yasuda, Tomoyuki Obuchi

Abstract: In this study, we consider an empirical Bayes method for Boltzmann machines and propose an algorithm for it. The empirical Bayes method allows estimation of the values of the hyperparameters of the Boltzmann machine by maximizing a specific likelihood function referred to as the empirical Bayes likelihood function in this study. However, the maximization is computationally hard because the empiric… ▽ More In this study, we consider an empirical Bayes method for Boltzmann machines and propose an algorithm for it. The empirical Bayes method allows estimation of the values of the hyperparameters of the Boltzmann machine by maximizing a specific likelihood function referred to as the empirical Bayes likelihood function in this study. However, the maximization is computationally hard because the empirical Bayes likelihood function involves intractable integrations of the partition function. The proposed algorithm avoids this computational problem by using the replica method and the Plefka expansion. Our method does not require any iterative procedures and is quite simple and fast, though it introduces a bias to the estimate, which exhibits an unnatural behavior with respect to the size of the dataset. This peculiar behavior is supposed to be due to the approximate treatment by the Plefka expansion. A possible extension to overcome this behavior is also discussed. △ Less

Submitted 7 September, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

Journal ref: Journal of Physics A: Mathematical and Theoretical, vol.53, 014004, 2019

arXiv:1902.10375 [pdf, ps, other]

doi 10.1088/1751-8121/ab3e89

Cross validation in sparse linear regression with piecewise continuous nonconvex penalties and its acceleration

Authors: Tomoyuki Obuchi, Ayaka Sakata

Abstract: We investigate the signal reconstruction performance of sparse linear regression in the presence of noise when piecewise continuous nonconvex penalties are used. Among such penalties, we focus on the SCAD penalty. The contributions of this study are three-fold: We first present a theoretical analysis of a typical reconstruction performance, using the replica method, under the assumption that each… ▽ More We investigate the signal reconstruction performance of sparse linear regression in the presence of noise when piecewise continuous nonconvex penalties are used. Among such penalties, we focus on the SCAD penalty. The contributions of this study are three-fold: We first present a theoretical analysis of a typical reconstruction performance, using the replica method, under the assumption that each component of the design matrix is given as an independent and identically distributed (i.i.d.) Gaussian variable. This clarifies the superiority of the SCAD estimator compared with $\ell_1$ in a wide parameter range, although the nonconvex nature of the penalty tends to lead to solution multiplicity in certain regions. This multiplicity is shown to be connected to replica symmetry breaking in the spin-glass theory. We also show that the global minimum of the mean square error between the estimator and the true signal is located in the replica symmetric phase. Second, we develop an approximate formula efficiently computing the cross-validation error without actually conducting the cross-validation, which is also applicable to the non-i.i.d. design matrices. It is shown that this formula is only applicable to the unique solution region and tends to be unstable in the multiple solution region. We implement instability detection procedures, which allows the approximate formula to stand alone and resultantly enables us to draw phase diagrams for any specific dataset. Third, we propose an annealing procedure, called nonconvexity annealing, to obtain the solution path efficiently. Numerical simulations are conducted on simulated datasets to examine these results to verify the theoretical results consistency and the approximate formula efficiency. Another numerical experiment on a real-world dataset is conducted; its results are consistent with those of earlier studies using the $\ell_0$ formulation. △ Less

Submitted 25 December, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

Comments: 33 pages, 18 figures. MATLAB codes implementing the proposed method are distributed in https://github.com/T-Obuchi/SLRpackage_AcceleratedCV_matlab

arXiv:1902.07436 [pdf, other]

Perfect reconstruction of sparse signals with piecewise continuous nonconvex penalties and nonconvexity control

Authors: Ayaka Sakata, Tomoyuki Obuchi

Abstract: We consider compressed sensing formulated as a minimization problem of nonconvex sparse penalties, Smoothly Clipped Absolute deviation (SCAD) and Minimax Concave Penalty (MCP). The nonconvexity of these penalties is controlled by nonconvexity parameters, and L1 penalty is contained as a limit with respect to these parameters. The analytically derived reconstruction limit overcomes that of L1 and t… ▽ More We consider compressed sensing formulated as a minimization problem of nonconvex sparse penalties, Smoothly Clipped Absolute deviation (SCAD) and Minimax Concave Penalty (MCP). The nonconvexity of these penalties is controlled by nonconvexity parameters, and L1 penalty is contained as a limit with respect to these parameters. The analytically derived reconstruction limit overcomes that of L1 and the algorithmic limit in the Bayes-optimal setting, when the nonconvexity parameters have suitable values. However, for small nonconvexity parameters, where the reconstruction of the relatively dense signals is theoretically guaranteed, the corresponding approximate message passing (AMP) cannot achieve perfect reconstruction. We identify that the shrinks in the basin of attraction to the perfect reconstruction causes the discrepancy between the AMP and corresponding theory using state evolution. A part of the discrepancy is resolved by introducing the control of the nonconvexity parameters to guide the AMP trajectory to the basin of the attraction. △ Less

Submitted 5 June, 2021; v1 submitted 20 February, 2019; originally announced February 2019.

Comments: 25 pages, 17 figures

arXiv:1810.11908 [pdf, other]

doi 10.1088/1742-5468/ab3456

Mean-field theory of graph neural networks in graph partitioning

Authors: Tatsuro Kawamoto, Masashi Tsubaki, Tomoyuki Obuchi

Abstract: A theoretical performance analysis of the graph neural network (GNN) is presented. For classification tasks, the neural network approach has the advantage in terms of flexibility that it can be employed in a data-driven manner, whereas Bayesian inference requires the assumption of a specific model. A fundamental question is then whether GNN has a high accuracy in addition to this flexibility. More… ▽ More A theoretical performance analysis of the graph neural network (GNN) is presented. For classification tasks, the neural network approach has the advantage in terms of flexibility that it can be employed in a data-driven manner, whereas Bayesian inference requires the assumption of a specific model. A fundamental question is then whether GNN has a high accuracy in addition to this flexibility. Moreover, whether the achieved performance is predominately a result of the backpropagation or the architecture itself is a matter of considerable interest. To gain a better insight into these questions, a mean-field theory of a minimal GNN architecture is developed for the graph partitioning problem. This demonstrates a good agreement with numerical experiments. △ Less

Submitted 28 October, 2018; originally announced October 2018.

Comments: 16 pages, 6 figures, Thirty-second Conference on Neural Information Processing Systems (NIPS2018)

arXiv:1805.11259 [pdf, other]

doi 10.1088/1742-5468/aae02c

Statistical mechanical analysis of sparse linear regression as a variable selection problem

Authors: Tomoyuki Obuchi, Yoshinori Nakanishi-Ohno, Masato Okada, Yoshiyuki Kabashima

Abstract: An algorithmic limit of compressed sensing or related variable-selection problems is analytically evaluated when a design matrix is given by an overcomplete random matrix. The replica method from statistical mechanics is employed to derive the result. The analysis is conducted through evaluation of the entropy, an exponential rate of the number of combinations of variables giving a specific value… ▽ More An algorithmic limit of compressed sensing or related variable-selection problems is analytically evaluated when a design matrix is given by an overcomplete random matrix. The replica method from statistical mechanics is employed to derive the result. The analysis is conducted through evaluation of the entropy, an exponential rate of the number of combinations of variables giving a specific value of fit error to given data which is assumed to be generated from a linear process using the design matrix. This yields the typical achievable limit of the fit error when solving a representative $\ell_0$ problem and includes the presence of unfavourable phase transitions preventing local search algorithms from reaching the minimum-error configuration. The associated phase diagrams are presented. A noteworthy outcome of the phase diagrams is that there exists a wide parameter region where any phase transition is absent from the high temperature to the lowest temperature at which the minimum-error configuration or the ground state is reached. This implies that certain local search algorithms can find the ground state with moderate computational costs in that region. Another noteworthy result is the presence of the random first-order transition in the strong noise case. The theoretical evaluation of the entropy is confirmed by extensive numerical methods using the exchange Monte Carlo and the multi-histogram methods. Another numerical test based on a metaheuristic optimisation algorithm called simulated annealing is conducted, which well supports the theoretical predictions on the local search algorithms. In the successful region with no phase transition, the computational cost of the simulated annealing to reach the ground state is estimated as the third order polynomial of the model dimensionality. △ Less

Submitted 10 September, 2018; v1 submitted 29 May, 2018; originally announced May 2018.

Comments: 39 pages, 14 figures

arXiv:1610.07733 [pdf, ps, other]

Approximate cross-validation formula for Bayesian linear regression

Authors: Yoshiyuki Kabashima, Tomoyuki Obuchi, Makoto Uemura

Abstract: Cross-validation (CV) is a technique for evaluating the ability of statistical models/learning systems based on a given data set. Despite its wide applicability, the rather heavy computational cost can prevent its use as the system size grows. To resolve this difficulty in the case of Bayesian linear regression, we develop a formula for evaluating the leave-one-out CV error approximately without a… ▽ More Cross-validation (CV) is a technique for evaluating the ability of statistical models/learning systems based on a given data set. Despite its wide applicability, the rather heavy computational cost can prevent its use as the system size grows. To resolve this difficulty in the case of Bayesian linear regression, we develop a formula for evaluating the leave-one-out CV error approximately without actually performing CV. The usefulness of the developed formula is tested by statistical mechanical analysis for a synthetic model. This is confirmed by application to a real-world supernova data set as well. △ Less

Submitted 25 October, 2016; originally announced October 2016.

Comments: 5 pages, 2 figures, invited paper for Allerton2016 conference

arXiv:1603.01399 [pdf, ps, other]

Sampling approach to sparse approximation problem: determining degrees of freedom by simulated annealing

Authors: Tomoyuki Obuchi, Yoshiyuki Kabashima

Abstract: The approximation of a high-dimensional vector by a small combination of column vectors selected from a fixed matrix has been actively debated in several different disciplines. In this paper, a sampling approach based on the Monte Carlo method is presented as an efficient solver for such problems. Especially, the use of simulated annealing (SA), a metaheuristic optimization algorithm, for determin… ▽ More The approximation of a high-dimensional vector by a small combination of column vectors selected from a fixed matrix has been actively debated in several different disciplines. In this paper, a sampling approach based on the Monte Carlo method is presented as an efficient solver for such problems. Especially, the use of simulated annealing (SA), a metaheuristic optimization algorithm, for determining degrees of freedom (the number of used columns) by cross validation is focused on and tested. Test on a synthetic model indicates that our SA-based approach can find a nearly optimal solution for the approximation problem and, when combined with the CV framework, it can optimize the generalization ability. Its utility is also confirmed by application to a real-world supernova data set. △ Less

Submitted 4 October, 2016; v1 submitted 4 March, 2016; originally announced March 2016.

Comments: 5 pages, 3 figures, Proceedings of Eusipco 2016

arXiv:1601.01074 [pdf, ps, other]

doi 10.1088/1742-6596/699/1/012017

Sparse approximation problem: how rapid simulated annealing succeeds and fails

Authors: Tomoyuki Obuchi, Yoshiyuki Kabashima

Abstract: Information processing techniques based on sparseness have been actively studied in several disciplines. Among them, a mathematical framework to approximately express a given dataset by a combination of a small number of basis vectors of an overcomplete basis is termed the {\em sparse approximation}. In this paper, we apply simulated annealing, a metaheuristic algorithm for general optimization pr… ▽ More Information processing techniques based on sparseness have been actively studied in several disciplines. Among them, a mathematical framework to approximately express a given dataset by a combination of a small number of basis vectors of an overcomplete basis is termed the {\em sparse approximation}. In this paper, we apply simulated annealing, a metaheuristic algorithm for general optimization problems, to sparse approximation in the situation where the given data have a planted sparse representation and noise is present. The result in the noiseless case shows that our simulated annealing works well in a reasonable parameter region: the planted solution is found fairly rapidly. This is true even in the case where a common relaxation of the sparse approximation problem, the $\ell_1$-relaxation, is ineffective. On the other hand, when the dimensionality of the data is close to the number of non-zero components, another metastable state emerges, and our algorithm fails to find the planted solution. This phenomenon is associated with a first-order phase transition. In the case of very strong noise, it is no longer meaningful to search for the planted solution. In this situation, our algorithm determines a solution with close-to-minimum distortion fairly quickly. △ Less

Submitted 4 March, 2016; v1 submitted 5 January, 2016; originally announced January 2016.

Comments: 12 pages, 7 figures, a proceedings of HD^3-2015

arXiv:1601.00881 [pdf, ps, other]

doi 10.1088/1742-5468/2016/05/053304

Cross validation in LASSO and its acceleration

Authors: Tomoyuki Obuchi, Yoshiyuki Kabashima

Abstract: We investigate leave-one-out cross validation (CV) as a determinator of the weight of the penalty term in the least absolute shrinkage and selection operator (LASSO). First, on the basis of the message passing algorithm and a perturbative discussion assuming that the number of observations is sufficiently large, we provide simple formulas for approximately assessing two types of CV errors, which e… ▽ More We investigate leave-one-out cross validation (CV) as a determinator of the weight of the penalty term in the least absolute shrinkage and selection operator (LASSO). First, on the basis of the message passing algorithm and a perturbative discussion assuming that the number of observations is sufficiently large, we provide simple formulas for approximately assessing two types of CV errors, which enable us to significantly reduce the necessary cost of computation. These formulas also provide a simple connection of the CV errors to the residual sums of squares between the reconstructed and the given measurements. Second, on the basis of this finding, we analytically evaluate the CV errors when the design matrix is given as a simple random matrix in the large size limit by using the replica method. Finally, these results are compared with those of numerical simulations on finite-size systems and are confirmed to be correct. We also apply the simple formulas of the first type of CV error to an actual dataset of the supernovae. △ Less

Submitted 4 March, 2016; v1 submitted 28 December, 2015; originally announced January 2016.

Comments: 32 pages, 7 figures

arXiv:1510.02189 [pdf, ps, other]

doi 10.1088/1742-5468/2016/06/063302

Sparse approximation based on a random overcomplete basis

Authors: Yoshinori Nakanishi-Ohno, Tomoyuki Obuchi, Masato Okada, Yoshiyuki Kabashima

Abstract: We discuss a strategy of sparse approximation that is based on the use of an overcomplete basis, and evaluate its performance when a random matrix is used as this basis. A small combination of basis vectors is chosen from a given overcomplete basis, according to a given compression rate, such that they compactly represent the target data with as small a distortion as possible. As a selection metho… ▽ More We discuss a strategy of sparse approximation that is based on the use of an overcomplete basis, and evaluate its performance when a random matrix is used as this basis. A small combination of basis vectors is chosen from a given overcomplete basis, according to a given compression rate, such that they compactly represent the target data with as small a distortion as possible. As a selection method, we study the $\ell_0$- and $\ell_1$-based methods, which employ the exhaustive search and $\ell_1$-norm regularization techniques, respectively. The performance is assessed in terms of the trade-off relation between the representation distortion and the compression rate. First, we evaluate the performance analytically in the case that the methods are carried out ideally, using methods of statistical mechanics. Our result clarifies the fact that the $\ell_0$-based method greatly outperforms the $\ell_1$-based one. Second, we examine the practical performances of two well-known algorithms, orthogonal matching pursuit and approximate message passing, when they are used to execute the $\ell_0$- and $\ell_1$-based methods, respectively. Our examination shows that orthogonal matching pursuit achieves a much better performance than the exact execution of the $\ell_1$-based method, as well as approximate message passing. However, regarding the $\ell_0$-based method, there is still room to design more effective greedy algorithms than orthogonal matching pursuit. Finally, we evaluate the performances of the algorithms when they are applied to image data compression. △ Less

Submitted 2 March, 2016; v1 submitted 7 October, 2015; originally announced October 2015.

Comments: 35 pages, 11 figures

arXiv:1412.7012 [pdf, ps, other]

doi 10.7566/JPSJ.85.114803

Boltzmann-Machine Learning of Prior Distributions of Binarized Natural Images

Authors: Tomoyuki Obuchi, Hirokazu Koma, Muneki Yasuda

Abstract: Prior distributions of binarized natural images are learned by using a Boltzmann machine. According the results of this study, there emerges a structure with two sublattices in the interactions, and the nearest-neighbor and next-nearest-neighbor interactions correspondingly take two discriminative values, which reflects the individual characteristics of the three sets of pictures that we process.… ▽ More Prior distributions of binarized natural images are learned by using a Boltzmann machine. According the results of this study, there emerges a structure with two sublattices in the interactions, and the nearest-neighbor and next-nearest-neighbor interactions correspondingly take two discriminative values, which reflects the individual characteristics of the three sets of pictures that we process. Meanwhile, in a longer spatial scale, a longer-range, although still rapidly decaying, ferromagnetic interaction commonly appears in all cases. The characteristic length scale of the interactions is universally up to approximately four lattice spacings $ξ\approx 4$. These results are derived by using the mean-field method, which effectively reduces the computational time required in a Boltzmann machine. An improved mean-field method called the Bethe approximation also gives the same results, as well as the Monte Carlo method does for small size images. These reinforce the validity of our analysis and findings. Relations to criticality, frustration, and simple-cell receptive fields are also discussed. △ Less

Submitted 23 October, 2016; v1 submitted 15 December, 2014; originally announced December 2014.

Comments: 32 pages, 33 figures

Journal ref: J. Phys. Soc. Jpn. 85 (2016) 114803

arXiv:1011.3722 [pdf, ps, other]

doi 10.1088/1751-8113/44/8/085002

Statistical mechanical analysis of a hierarchical random code ensemble in signal processing

Authors: Tomoyuki Obuchi, Kazutaka Takahashi, Koujin Takeda

Abstract: We study a random code ensemble with a hierarchical structure, which is closely related to the generalized random energy model with discrete energy values. Based on this correspondence, we analyze the hierarchical random code ensemble by using the replica method in two situations: lossy data compression and channel coding. For both the situations, the exponents of large deviation analysis characte… ▽ More We study a random code ensemble with a hierarchical structure, which is closely related to the generalized random energy model with discrete energy values. Based on this correspondence, we analyze the hierarchical random code ensemble by using the replica method in two situations: lossy data compression and channel coding. For both the situations, the exponents of large deviation analysis characterizing the performance of the ensemble, the distortion rate of lossy data compression and the error exponent of channel coding in Gallager's formalism, are accessible by a generating function of the generalized random energy model. We discuss that the transitions of those exponents observed in the preceding work can be interpreted as phase transitions with respect to the replica number. We also show that the replica symmetry breaking plays an essential role in these transitions. △ Less

Submitted 2 February, 2011; v1 submitted 16 November, 2010; originally announced November 2010.

Comments: 24 pages, 4 figures

Journal ref: J. Phys. A: Math. Theor. 44 (2011) 085002

Showing 1–17 of 17 results for author: Obuchi, T