-
Fully Data-driven Normalized and Exponentiated Kernel Density Estimator with Hyvärinen Score
Authors:
Shunsuke Imai,
Takuya Koriyama,
Shouto Yonekura,
Shonosuke Sugasawa,
Yoshihiko Nishiyama
Abstract:
We introduce a new approach to kernel density estimation using an exponentiated form of kernel density estimators. The density estimator has two hyperparameters that flexibly control the smoothness of the resulting density. We tune them in a data-driven manner by minimizing an objective function based on the Hyvärinen score, which avoids an optimization involving the normalizing constant that the exponentiation makes intractable. We show the asymptotic properties of the proposed estimator and emphasize the importance of including the two hyperparameters for flexible density estimation. Our simulation studies and an application to income data show that the proposed density estimator is appealing when the underlying density is multi-modal or the observations contain outliers.
Submitted 13 February, 2024; v1 submitted 2 December, 2022;
originally announced December 2022.
-
On Selection Criteria for the Tuning Parameter in Robust Divergence
Authors:
Shonosuke Sugasawa,
Shouto Yonekura
Abstract:
While robust divergences such as the density power divergence and the γ-divergence are helpful for robust statistical inference in the presence of outliers, the tuning parameter that controls the degree of robustness is chosen by a rule of thumb, which may lead to inefficient inference. We propose a selection criterion based on an asymptotic approximation of the Hyvärinen score applied to an unnormalized model defined by the robust divergence. The proposed selection criterion requires only the first- and second-order partial derivatives of an assumed density function with respect to the observations, which can be easily computed regardless of the number of parameters. We demonstrate the usefulness of the proposed method via numerical studies using normal distributions and regularized linear regression.
Submitted 22 June, 2021;
originally announced June 2021.
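To illustrate why a criterion built on the Hyvärinen score needs only derivatives with respect to the observations, here is a minimal Python sketch. It uses the standard one-dimensional form of the score, 2 (d²/dy²) log p(y) + ((d/dy) log p(y))², averaged over the data, and applies it to an unnormalized Gaussian kernel. The function names, the candidate grid, and the simulated data are assumptions made for this example; this is not the asymptotic criterion proposed in the paper.

```python
import numpy as np

def hyvarinen_score(y, dlogp, d2logp):
    """Average of 2 * (log p)''(y) + ((log p)'(y))^2 over the sample y."""
    y = np.asarray(y, dtype=float)
    return float(np.mean(2.0 * d2logp(y) + dlogp(y) ** 2))

def gaussian_derivs(mu, s2):
    """Derivatives of log p for the unnormalized model exp(-(y - mu)^2 / (2 * s2)).

    The normalizing constant never appears: it vanishes when log p is
    differentiated with respect to y.
    """
    dlogp = lambda y: -(y - mu) / s2
    d2logp = lambda y: np.full_like(y, -1.0 / s2)
    return dlogp, d2logp

# Simulated data (assumption for illustration): true variance is 4.
rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=2.0, size=5000)

# Hypothetical grid of candidate variances; pick the one with the lowest score.
candidates = [1.0, 2.0, 4.0, 8.0]
scores = {s2: hyvarinen_score(y, *gaussian_derivs(y.mean(), s2)) for s2 in candidates}
best_s2 = min(scores, key=scores.get)
print(best_s2)
```

For this Gaussian model the averaged score equals -2/s2 + m2/s2², where m2 is the mean squared deviation from mu, so it is minimized at s2 = m2 without ever evaluating a normalizing constant.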
-
Adaptation of the Tuning Parameter in General Bayesian Inference with Robust Divergence
Authors:
Shouto Yonekura,
Shonosuke Sugasawa
Abstract:
We introduce a methodology for robust Bayesian estimation with robust divergence (e.g., density power divergence or γ-divergence), indexed by a single tuning parameter. It is well known that the posterior density induced by robust divergence gives estimators that are highly robust against outliers if the tuning parameter is chosen appropriately and carefully. In a Bayesian framework, one way to find the optimal tuning parameter would be to use the evidence (marginal likelihood). However, we numerically illustrate that the evidence induced by the density power divergence does not work for selecting the optimal tuning parameter, since robust divergence is not regarded as a statistical model. To overcome this problem, we treat the exponential of robust divergence as an unnormalized statistical model and estimate the tuning parameter by minimizing the Hyvärinen score. We also provide adaptive computational methods based on sequential Monte Carlo (SMC) samplers, which enable us to obtain the optimal tuning parameter and samples from posterior distributions simultaneously. We also demonstrate the empirical performance of the proposed method through simulations and an application to real data.
Submitted 30 June, 2022; v1 submitted 12 June, 2021;
originally announced June 2021.
-
Online Smoothing for Diffusion Processes Observed with Noise
Authors:
Shouto Yonekura,
Alexandros Beskos
Abstract:
We introduce a methodology for online estimation of smoothing expectations for a class of additive functionals, in the context of a rich family of diffusion processes (that may include jumps) observed at discrete time instances. We overcome the unavailability of the transition density of the underlying SDE by working on the augmented pathspace. The new method can be applied, for instance, to carry out online parameter inference for the designated class of models. Algorithms defined on the infinite-dimensional pathspace have been developed in recent years, mainly in the context of MCMC techniques. There, the main benefit is the achievement of mesh-free mixing times for the practical time-discretised algorithm used on a PC. Our own methodology sets up the framework for infinite-dimensional online filtering; an important positive practical consequence is the construction of estimates whose variance does not increase with decreasing mesh-size. Besides regularity conditions, our method is, in principle, applicable under the assumption that the SDE covariance matrix is invertible, which is weak relative to the restrictive conditions often required in the MCMC or filtering literature on methods defined on pathspace.
Submitted 11 August, 2021; v1 submitted 27 March, 2020;
originally announced March 2020.
-
Asymptotic Analysis of Model Selection Criteria for General Hidden Markov Models
Authors:
Shouto Yonekura,
Alexandros Beskos,
Sumeetpal S. Singh
Abstract:
The paper obtains analytical results for the asymptotic properties of Model Selection Criteria -- widely used in practice -- for a general family of hidden Markov models (HMMs), thereby substantially extending the related theory beyond typical i.i.d.-like model structures and filling in an important gap in the relevant literature. In particular, we look at the Bayesian and Akaike Information Criteria (BIC and AIC) and the model evidence. In the setting of nested classes of models, we prove that BIC and the evidence are strongly consistent for HMMs (under regularity conditions), whereas AIC is not weakly consistent. Numerical experiments support our theoretical results.
Submitted 30 March, 2020; v1 submitted 28 November, 2018;
originally announced November 2018.
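For readers unfamiliar with the criteria compared in the last abstract, the following minimal Python sketch gives the textbook definitions of AIC and BIC in terms of a maximized log-likelihood. The function names and the numbers in the usage lines are hypothetical and chosen only for illustration; computing the log-likelihood of an HMM itself is outside the scope of this sketch.

```python
import math

def aic(loglik, k):
    """Akaike Information Criterion: -2 * loglik + 2 * k, with k free parameters."""
    return -2.0 * loglik + 2.0 * k

def bic(loglik, k, n):
    """Bayesian Information Criterion: -2 * loglik + k * log(n), with n observations."""
    return -2.0 * loglik + k * math.log(n)

# Hypothetical nested models: the larger one gains little log-likelihood.
n = 1000
small = {"loglik": -1500.0, "k": 3}
large = {"loglik": -1498.5, "k": 5}

aic_prefers_large = aic(large["loglik"], large["k"]) < aic(small["loglik"], small["k"])
bic_prefers_large = bic(large["loglik"], large["k"], n) < bic(small["loglik"], small["k"], n)
print(aic_prefers_large, bic_prefers_large)
```

The BIC penalty per parameter, log(n), grows with the sample size, whereas the AIC penalty stays fixed at 2; this difference is what underlies the contrasting consistency behaviour described in the abstract.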