Showing 1–50 of 53 results for author: Fukumizu, K

  1. arXiv:2408.04042 [pdf, other]

    cond-mat.mtrl-sci cs.LG

    Scaling Law of Sim2Real Transfer Learning in Expanding Computational Materials Databases for Real-World Predictions

    Authors: Shunya Minami, Yoshihiro Hayashi, Stephen Wu, Kenji Fukumizu, Hiroki Sugisawa, Masashi Ishii, Isao Kuwajima, Kazuya Shiratori, Ryo Yoshida

    Abstract: To address the challenge of limited experimental materials data, extensive physical property databases are being developed based on high-throughput computational experiments, such as molecular dynamics simulations. Previous studies have shown that fine-tuning a predictor pretrained on a computational database to a real system can result in models with outstanding generalization capabilities compar…

    Submitted 7 August, 2024; originally announced August 2024.

    Comments: 22 pages, 6 figures

  2. arXiv:2405.20879 [pdf, other]

    cs.LG

    Flow matching achieves minimax optimal convergence

    Authors: Kenji Fukumizu, Taiji Suzuki, Noboru Isobe, Kazusato Oko, Masanori Koyama

    Abstract: Flow matching (FM) has gained significant attention as a simulation-free generative model. Unlike diffusion models, which are based on stochastic differential equations, FM employs a simpler approach by solving an ordinary differential equation with an initial condition from a normal distribution, thus streamlining the sample generation process. This paper discusses the convergence properties of F…

    Submitted 31 May, 2024; originally announced May 2024.

  3. arXiv:2403.11520 [pdf, other]

    cs.LG stat.ML

    State-Separated SARSA: A Practical Sequential Decision-Making Algorithm with Recovering Rewards

    Authors: Yuto Tanimoto, Kenji Fukumizu

    Abstract: While many multi-armed bandit algorithms assume that rewards for all arms are constant across rounds, this assumption does not hold in many real-world scenarios. This paper considers the setting of recovering bandits (Pike-Burke & Grunewalder, 2019), where the reward depends on the number of rounds elapsed since the last time an arm was pulled. We propose a new reinforcement learning (RL) algorith…

    Submitted 18 March, 2024; originally announced March 2024.

  4. arXiv:2403.10859 [pdf, other]

    stat.ML cs.LG

    Neural-Kernel Conditional Mean Embeddings

    Authors: Eiki Shimizu, Kenji Fukumizu, Dino Sejdinovic

    Abstract: Kernel conditional mean embeddings (CMEs) offer a powerful framework for representing conditional distributions, but they often face scalability and expressiveness challenges. In this work, we propose a new method that effectively combines the strengths of deep learning with CMEs in order to address these challenges. Specifically, our approach leverages the end-to-end neural network (NN) optimizati…

    Submitted 16 March, 2024; originally announced March 2024.

  5. arXiv:2402.18839 [pdf, other]

    cs.LG math.AP math.FA math.OC math.PR

    Extended Flow Matching: a Method of Conditional Generation with Generalized Continuity Equation

    Authors: Noboru Isobe, Masanori Koyama, Jinzhe Zhang, Kohei Hayashi, Kenji Fukumizu

    Abstract: The task of conditional generation is one of the most important applications of generative models, and numerous methods have been developed to date based on the celebrated flow-based models. However, many flow-based models in use today are not built to allow one to introduce an explicit inductive bias to how the conditional distribution to be generated changes with respect to conditions. This can…

    Submitted 5 July, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 27 pages, 10 figures, We have corrected an error in our experiment on COT-FM

    MSC Class: 68T07 (Primary); 49Q22 (Secondary)

  6. arXiv:2402.04516 [pdf, other]

    stat.ML cs.LG

    Generalized Sobolev Transport for Probability Measures on a Graph

    Authors: Tam Le, Truyen Nguyen, Kenji Fukumizu

    Abstract: We study the optimal transport (OT) problem for measures supported on a graph metric space. Recently, Le et al. (2022) leverage the graph structure and propose a variant of OT, namely Sobolev transport (ST), which yields a closed-form expression for a fast computation. However, ST is essentially coupled with the $L^p$ geometric structure within its definition which makes it nontrivial to utilize S…

    Submitted 29 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: To appear at ICML'2024

  7. arXiv:2310.13653 [pdf, other]

    stat.ML cs.LG

    Optimal Transport for Measures with Noisy Tree Metric

    Authors: Tam Le, Truyen Nguyen, Kenji Fukumizu

    Abstract: We study the optimal transport (OT) problem for probability measures supported on a tree metric space. It is known that such an OT problem (i.e., tree-Wasserstein (TW)) admits a closed-form expression, but depends fundamentally on the underlying tree structure over supports of input measures. In practice, however, the given tree structure may be perturbed due to noisy or adversarial measurements. To mit…

    Submitted 29 February, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: To appear in AISTATS 2024

  8. arXiv:2307.11972 [pdf, other]

    stat.ML cs.LG

    Out-of-Distribution Optimality of Invariant Risk Minimization

    Authors: Shoji Toyota, Kenji Fukumizu

    Abstract: Deep Neural Networks often inherit spurious correlations embedded in training data and hence may fail to generalize to unseen domains, whose distributions differ from that of the domain providing the training data. M. Arjovsky et al. (2019) introduced the concept of out-of-distribution (o.o.d.) risk, which is the maximum risk among all domains, and formulated the issue caused by spurious correlations as…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: 23 pages, submitted for a publication

  9. arXiv:2305.18484 [pdf, other]

    stat.ML cs.LG

    Neural Fourier Transform: A General Approach to Equivariant Representation Learning

    Authors: Masanori Koyama, Kenji Fukumizu, Kohei Hayashi, Takeru Miyato

    Abstract: Symmetry learning has proven to be an effective approach for extracting the hidden structure of data, with the concept of equivariance relation playing the central role. However, most of the current studies are built on architectural theory and corresponding assumptions on the form of data. We propose Neural Fourier Transform (NFT), a general framework of learning the latent linear action of the g…

    Submitted 14 February, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

  10. arXiv:2304.12770 [pdf, other]

    cs.LG stat.ML

    Controlling Posterior Collapse by an Inverse Lipschitz Constraint on the Decoder Network

    Authors: Yuri Kinoshita, Kenta Oono, Kenji Fukumizu, Yuichi Yoshida, Shin-ichi Maeda

    Abstract: Variational autoencoders (VAEs) are one of the deep generative models that have experienced enormous success over the past decades. However, in practice, they suffer from a problem called posterior collapse, which occurs when the encoder coincides, or collapses, with the prior taking no information from the latent structure of the input data into consideration. In this work, we introduce an invers…

    Submitted 2 February, 2024; v1 submitted 25 April, 2023; originally announced April 2023.

    Comments: accepted to ICML 2023, some notations adjusted from the submitted version

  11. arXiv:2302.12498 [pdf, other]

    cs.LG stat.ML

    Scalable Unbalanced Sobolev Transport for Measures on a Graph

    Authors: Tam Le, Truyen Nguyen, Kenji Fukumizu

    Abstract: Optimal transport (OT) is a popular and powerful tool for comparing probability measures. However, OT suffers from a few drawbacks: (i) input measures are required to have the same mass, (ii) a high computational complexity, and (iii) indefiniteness which limits its applications to kernel-dependent algorithmic approaches. To tackle issues (ii)--(iii), Le et al. (2022) recently proposed Sobolev transport fo…

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: to appear in AISTATS 2023. arXiv admin note: text overlap with arXiv:2101.09756

  12. arXiv:2210.09745 [pdf, other]

    stat.ML cs.LG

    Transfer learning with affine model transformation

    Authors: Shunya Minami, Kenji Fukumizu, Yoshihiro Hayashi, Ryo Yoshida

    Abstract: Supervised transfer learning has received considerable attention due to its potential to boost the predictive power of machine learning in scenarios where data are scarce. Generally, a given set of source models and a dataset from a target domain are used to adapt the pre-trained models to a target domain by statistically learning domain shift and domain-specific factors. While such procedurally a…

    Submitted 19 January, 2024; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: 34 pages

    Journal ref: NeurIPS 2023

  13. arXiv:2210.07413 [pdf, other]

    stat.ML cs.LG

    Invariance-adapted decomposition and Lasso-type contrastive learning

    Authors: Masanori Koyama, Takeru Miyato, Kenji Fukumizu

    Abstract: Recent years have witnessed the effectiveness of contrastive learning in obtaining the representation of a dataset that is useful in interpretation and downstream tasks. However, the mechanism that describes this effectiveness has not been thoroughly analyzed, and many studies have been conducted to investigate the data structures captured by contrastive learning. In particular, the recent study of…

    Submitted 13 October, 2022; originally announced October 2022.

    Journal ref: 2022 ICML workshop of Topology, Algebra and Geometry in Machine Learning (spotlight)

  14. arXiv:2210.05972 [pdf, other]

    cs.LG stat.ML

    Unsupervised Learning of Equivariant Structure from Sequences

    Authors: Takeru Miyato, Masanori Koyama, Kenji Fukumizu

    Abstract: In this study, we present meta-sequential prediction (MSP), an unsupervised framework to learn the symmetry from the time sequence of length at least three. Our method leverages the stationary property (e.g. constant velocity, constant acceleration) of the time sequence to learn the underlying equivariant structure of the dataset by simply training the encoder-decoder model to be able to predict t…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022

  15. arXiv:2206.01795 [pdf, other]

    math.ST cs.CG cs.LG math.AT stat.ML

    Robust Topological Inference in the Presence of Outliers

    Authors: Siddharth Vishwanath, Bharath K. Sriperumbudur, Kenji Fukumizu, Satoshi Kuriki

    Abstract: The distance function to a compact set plays a crucial role in the paradigm of topological data analysis. In particular, the sublevel sets of the distance function are used in the computation of persistent homology -- a backbone of the topological data analysis pipeline. Despite its stability to perturbations in the Hausdorff distance, persistent homology is highly sensitive to outliers. In this w…

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: 50 pages, 10 figures

    MSC Class: 62R40; 55N31; 68T09

  16. arXiv:2203.15549 [pdf, other]

    stat.ML cs.LG

    Invariance Learning based on Label Hierarchy

    Authors: Shoji Toyota, Kenji Fukumizu

    Abstract: Deep Neural Networks inherit spurious correlations embedded in training data and hence may fail to predict desired labels on unseen domains (or environments), which have different distributions from the domain used in training. Invariance Learning (IL) has been developed recently to overcome this shortcoming; using training data in many domains, IL estimates such a predictor that is invariant to a…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: 30 pages, submitted for a publication

  17. ALGAN: Anomaly Detection by Generating Pseudo Anomalous Data via Latent Variables

    Authors: Hironori Murase, Kenji Fukumizu

    Abstract: In many anomaly detection tasks, where anomalous data rarely appear and are difficult to collect, training using only normal data is important. Although it is possible to manually create anomalous data using prior knowledge, they may be subject to user bias. In this paper, we propose an Anomalous Latent variable Generative Adversarial Network (ALGAN) in which the GAN generator produces pseudo-anom…

    Submitted 9 May, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

    Comments: 13 pages, 8 figures

    Journal ref: IEEE Access, vol. 10, pp. 44259-44270, 2022

  18. arXiv:2110.05225 [pdf, other]

    stat.ML cs.LG econ.EM stat.ME

    $β$-Intact-VAE: Identifying and Estimating Causal Effects under Limited Overlap

    Authors: Pengzhou Wu, Kenji Fukumizu

    Abstract: As an important problem in causal inference, we discuss the identification and estimation of treatment effects (TEs) under limited overlap; that is, when subjects with certain features belong to a single treatment group. We use a latent variable to model a prognostic score which is widely used in biostatistics and sufficient for TEs; i.e., we build a generative prognostic model. We prove that the…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: Updated version of the NeurIPS 2021 submission (https://openreview.net/forum?id=Z3yd722b5X5). Largely improve readability and the presentation of experimental results. arXiv admin note: text overlap with arXiv:2109.15062, arXiv:2101.06662

  19. arXiv:2109.15062 [pdf, other]

    stat.ML cs.LG econ.EM stat.ME

    Towards Principled Causal Effect Estimation by Deep Identifiable Models

    Authors: Pengzhou Wu, Kenji Fukumizu

    Abstract: As an important problem in causal inference, we discuss the estimation of treatment effects (TEs). Representing the confounder as a latent variable, we propose Intact-VAE, a new variant of variational autoencoder (VAE), motivated by the prognostic score that is sufficient for identifying TEs. Our VAE also naturally gives representations balanced for treatment groups, using its prior. Experiments o…

    Submitted 1 November, 2021; v1 submitted 30 September, 2021; originally announced September 2021.

    Comments: Fully updated. Largely improve clarity, add identification under unconfoundedness (Sec. 4.2), and more. arXiv admin note: substantial text overlap with arXiv:2101.06662

  20. arXiv:2108.11018 [pdf, other]

    cs.LG cs.CV

    A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective?

    Authors: Hiroaki Mikami, Kenji Fukumizu, Shogo Murai, Shuji Suzuki, Yuta Kikuchi, Taiji Suzuki, Shin-ichi Maeda, Kohei Hayashi

    Abstract: Synthetic-to-real transfer learning is a framework in which a synthetically generated dataset is used to pre-train a model to improve its performance on real vision tasks. The most significant advantage of using synthetic images is that the ground-truth labels are automatically available, enabling unlimited expansion of the data size without human cost. However, synthetic data may have a huge doma…

    Submitted 8 October, 2021; v1 submitted 24 August, 2021; originally announced August 2021.

  21. arXiv:2101.06662 [pdf, other]

    stat.ML cs.LG stat.ME

    Intact-VAE: Estimating Treatment Effects under Unobserved Confounding

    Authors: Pengzhou Wu, Kenji Fukumizu

    Abstract: NOTE: This preprint has a flawed theoretical formulation. Please avoid it and refer to the ICLR22 publication https://openreview.net/forum?id=q7n2RngwOM. Also, arXiv:2109.15062 contains some new ideas on unobserved confounding. As an important problem of causal inference, we discuss the identification and estimation of treatment effects under unobserved confounding. Representing the confounder a…

    Submitted 20 April, 2022; v1 submitted 17 January, 2021; originally announced January 2021.

    Comments: This preprint has a flawed theoretical formulation. It was intended as a theoretical update of https://openreview.net/forum?id=D3TNqCspFpM

  22. arXiv:2011.02256 [pdf, other]

    stat.ML cs.LG

    Advantage of Deep Neural Networks for Estimating Functions with Singularity on Hypersurfaces

    Authors: Masaaki Imaizumi, Kenji Fukumizu

    Abstract: We develop a minimax rate analysis to describe the reason that deep neural networks (DNNs) perform better than other standard methods. For nonparametric regression problems, it is well known that many standard methods attain the minimax optimal rate of estimation errors for smooth functions, and thus, it is not straightforward to identify the theoretical advantages of DNNs. This study tries to fil…

    Submitted 8 February, 2022; v1 submitted 4 November, 2020; originally announced November 2020.

    Comments: Complete version of arXiv:1802.04474

  23. arXiv:2007.02809 [pdf, other]

    stat.ML cs.LG

    Meta Learning for Causal Direction

    Authors: Jean-Francois Ton, Dino Sejdinovic, Kenji Fukumizu

    Abstract: The inaccessibility of controlled randomized trials due to inherent constraints in many fields of science has been a fundamental issue in causal inference. In this paper, we focus on distinguishing the cause from effect in the bivariate setting under limited observational data. Based on recent developments in meta learning as well as in causal inference, we introduce a novel generative model that…

    Submitted 21 February, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

  24. arXiv:2006.13228 [pdf, other]

    stat.ML cs.LG

    A General Class of Transfer Learning Regression without Implementation Cost

    Authors: Shunya Minami, Song Liu, Stephen Wu, Kenji Fukumizu, Ryo Yoshida

    Abstract: We propose a novel framework that unifies and extends existing methods of transfer learning (TL) for regression. To bridge a pretrained source model to the model on a target task, we introduce a density-ratio reweighting function, which is estimated through the Bayesian framework with a specific prior distribution. By changing two intrinsic hyperparameters and the choice of the density-ratio model…

    Submitted 16 December, 2020; v1 submitted 23 June, 2020; originally announced June 2020.

    Comments: 31 pages, 6 figures

  25. arXiv:2006.10012 [pdf, other]

    math.ST cs.CG cs.LG math.AT stat.ML

    Robust Persistence Diagrams using Reproducing Kernels

    Authors: Siddharth Vishwanath, Kenji Fukumizu, Satoshi Kuriki, Bharath Sriperumbudur

    Abstract: Persistent homology has become an important tool for extracting geometric and topological features from data, whose multi-scale features are summarized in a persistence diagram. From a statistical perspective, however, persistence diagrams are very sensitive to perturbations in the input space. In this work, we develop a framework for constructing robust persistence diagrams from superlevel filtra…

    Submitted 3 June, 2022; v1 submitted 17 June, 2020; originally announced June 2020.

    MSC Class: 55N31; 62R40; 62G07; 46E22

  26. arXiv:2004.01822 [pdf, other]

    cs.LG stat.ML

    The equivalence between Stein variational gradient descent and black-box variational inference

    Authors: Casey Chu, Kentaro Minami, Kenji Fukumizu

    Abstract: We formalize an equivalence between two popular methods for Bayesian inference: Stein variational gradient descent (SVGD) and black-box variational inference (BBVI). In particular, we show that BBVI corresponds precisely to SVGD when the kernel is the neural tangent kernel. Furthermore, we interpret SVGD and BBVI as kernel gradient flows; we do this by leveraging the recent perspective that views…

    Submitted 3 April, 2020; originally announced April 2020.

    Comments: ICLR 2020, Workshop on Integration of Deep Neural Models and Differential Equations

  27. arXiv:2002.04185 [pdf, other]

    cs.LG stat.ML

    Smoothness and Stability in GANs

    Authors: Casey Chu, Kentaro Minami, Kenji Fukumizu

    Abstract: Generative adversarial networks, or GANs, commonly display unstable behavior during training. In this work, we develop a principled theoretical framework for understanding the stability of various types of GANs. In particular, we derive conditions that guarantee eventual stationarity of the generator when it is trained with gradient descent, conditions that must be satisfied by the divergence that…

    Submitted 10 February, 2020; originally announced February 2020.

    Comments: ICLR 2020

  28. arXiv:2001.01894 [pdf]

    stat.ML cs.LG

    Causal Mosaic: Cause-Effect Inference via Nonlinear ICA and Ensemble Method

    Authors: Pengzhou Wu, Kenji Fukumizu

    Abstract: We address the problem of distinguishing cause from effect in the bivariate setting. Based on recent developments in nonlinear independent component analysis (ICA), we train nonparametrically general nonlinear causal models that allow non-additive noise. Further, we build an ensemble framework, namely Causal Mosaic, which models a causal pair by a mixture of nonlinear models. We compare this method wi…

    Submitted 7 January, 2020; originally announced January 2020.

    Comments: Accepted to AISTATS 2020. Camera-ready version in preparation

    Journal ref: An updated version at AISTATS 2020: http://proceedings.mlr.press/v108/wu20b/wu20b.pdf. Main changes: a correction in Theorem 3 and additional explanations in Sec. 4

  29. Exchangeable deep neural networks for set-to-set matching and learning

    Authors: Yuki Saito, Takuma Nakamura, Hirotaka Hachiya, Kenji Fukumizu

    Abstract: Matching two different sets of items, called the heterogeneous set-to-set matching problem, has recently received attention as a promising problem. The difficulties are to extract features to match a correct pair of different sets and also preserve two types of exchangeability required for set-to-set matching: the pair of sets, as well as the items in each set, should be exchangeable. In this study, w…

    Submitted 28 January, 2021; v1 submitted 22 October, 2019; originally announced October 2019.

  30. A Kernel Stein Test for Comparing Latent Variable Models

    Authors: Heishiro Kanagawa, Wittawat Jitkrittum, Lester Mackey, Kenji Fukumizu, Arthur Gretton

    Abstract: We propose a kernel-based nonparametric test of relative goodness of fit, where the goal is to compare two models, both of which may have unobserved latent variables, such that the marginal distribution of the observed variables is intractable. The proposed test generalizes the recently proposed kernel Stein discrepancy (KSD) tests (Liu et al., 2016, Chwialkowski et al., 2016, Yang et al., 2018) t…

    Submitted 9 May, 2023; v1 submitted 1 July, 2019; originally announced July 2019.

    Comments: This is a pre-copyedited, author-produced version of an article accepted for publication in The Journal of the Royal Statistical Society Series: B following peer review

  31. arXiv:1906.04868 [pdf, other]

    cs.LG stat.ML

    Semi-flat minima and saddle points by embedding neural networks to overparameterization

    Authors: Kenji Fukumizu, Shoichiro Yamaguchi, Yoh-ichi Mototake, Mirai Tanaka

    Abstract: We theoretically study the landscape of the training error for neural networks in overparameterized cases. We consider three basic methods for embedding a network into a wider one with more hidden units, and discuss whether a minimum point of the narrower network gives a minimum or saddle point of the wider one. Our results show that the networks with smooth and ReLU activation have different part…

    Submitted 14 June, 2019; v1 submitted 11 June, 2019; originally announced June 2019.

    Comments: 38 pages, 4 figures

  32. arXiv:1902.00342 [pdf, other]

    stat.ML cs.LG

    Tree-Sliced Variants of Wasserstein Distances

    Authors: Tam Le, Makoto Yamada, Kenji Fukumizu, Marco Cuturi

    Abstract: Optimal transport (OT) theory defines a powerful set of tools to compare probability distributions. OT suffers however from a few drawbacks, computational and statistical, which have encouraged the proposal of several regularized variants of OT in the recent literature, one of the most notable being the sliced formulation, which exploits the closed-form formula between univariate distri…

    Submitted 28 October, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: Camera-ready for NeurIPS 2019

  33. Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions

    Authors: Sho Yokoi, Sosuke Kobayashi, Kenji Fukumizu, Jun Suzuki, Kentaro Inui

    Abstract: In this paper, we propose a new kernel-based co-occurrence measure that can be applied to sparse linguistic expressions (e.g., sentences) with a very short learning time, as an alternative to pointwise mutual information (PMI). As well as deriving PMI from mutual information, we derive this new measure from the Hilbert-Schmidt independence criterion (HSIC); thus, we call the new measure the point…

    Submitted 4 September, 2018; originally announced September 2018.

    Comments: Accepted by EMNLP 2018

    Journal ref: EMNLP 2018

  34. arXiv:1805.08463 [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Variational Learning on Aggregate Outputs with Gaussian Processes

    Authors: Ho Chung Leon Law, Dino Sejdinovic, Ewan Cameron, Tim CD Lucas, Seth Flaxman, Katherine Battle, Kenji Fukumizu

    Abstract: While a typical supervised learning framework assumes that the inputs and the outputs are measured at the same levels of granularity, many applications, including global mapping of disease, only have access to outputs at a much coarser level than that of the inputs. Aggregation of outputs makes generalization to new inputs much more difficult. We consider an approach to this problem based on varia…

    Submitted 22 May, 2018; originally announced May 2018.

  35. arXiv:1802.05411 [pdf, ps, other]

    cs.LG stat.ML

    Selecting the Best in GANs Family: a Post Selection Inference Framework

    Authors: Yao-Hung Hubert Tsai, Makoto Yamada, Denny Wu, Ruslan Salakhutdinov, Ichiro Takeuchi, Kenji Fukumizu

    Abstract: "Which Generative Adversarial Networks (GANs) generate the most plausible images?" has been a frequently asked question among researchers. To address this problem, we first propose an incomplete U-statistics estimate of maximum mean discrepancy $\mathrm{MMD}_{inc}$ to measure the distribution discrepancy between generated and real images. $\mathrm{MMD}_{inc}$ enjoys the advantages of asymp…

    Submitted 23 June, 2018; v1 submitted 15 February, 2018; originally announced February 2018.

  36. arXiv:1705.07673 [pdf, other]

    stat.ML cs.LG

    A Linear-Time Kernel Goodness-of-Fit Test

    Authors: Wittawat Jitkrittum, Wenkai Xu, Zoltan Szabo, Kenji Fukumizu, Arthur Gretton

    Abstract: We propose a novel adaptive test of goodness-of-fit, with computational cost linear in the number of samples. We learn the test features that best indicate the differences between observed samples and a reference model, by minimizing the false negative rate. These features are constructed via Stein's method, meaning that it is not necessary to compute the normalising constant of the model. We anal…

    Submitted 24 October, 2017; v1 submitted 22 May, 2017; originally announced May 2017.

    Comments: Accepted to NIPS 2017

    MSC Class: 46E22; 62G10

    ACM Class: G.3; I.2.6

  37. arXiv:1605.09522 [pdf, ps, other]

    stat.ML cs.LG

    Kernel Mean Embedding of Distributions: A Review and Beyond

    Authors: Krikamol Muandet, Kenji Fukumizu, Bharath Sriperumbudur, Bernhard Schölkopf

    Abstract: A Hilbert space embedding of a distribution (in short, a kernel mean embedding) has recently emerged as a powerful tool for machine learning and inference. The basic idea behind this framework is to map distributions into a reproducing kernel Hilbert space (RKHS) in which the whole arsenal of kernel methods can be extended to probability measures. It can be viewed as a generalization of the orig…

    Submitted 13 December, 2020; v1 submitted 31 May, 2016; originally announced May 2016.

    Comments: 147 pages; this is the final version

    Journal ref: Foundations and Trends in Machine Learning: Vol. 10: No. 1-2, pp 1-141 (2017)

  38. arXiv:1510.09155 [pdf, ps, other]

    q-bio.QM cs.DM q-bio.PE

    A characterization of minimum spanning tree-like metric spaces

    Authors: Momoko Hayamizu, Hiroshi Endo, Kenji Fukumizu

    Abstract: Recent years have witnessed a surge of biological interest in the minimum spanning tree (MST) problem for its relevance to automatic model construction using the distances between data points. Despite the increasing use of MST algorithms for this purpose, the goodness-of-fit of an MST to the data is often elusive because no quantitative criteria have been developed to measure it. Motivated by this…

    Submitted 30 October, 2015; originally announced October 2015.

    Comments: 9 pages, 2 figures

    MSC Class: Primary 05C12; Secondary 05C05

  39. arXiv:1506.02784 [pdf, other]

    stat.ML cs.LG

    Estimating Posterior Ratio for Classification: Transfer Learning from Probabilistic Perspective

    Authors: Song Liu, Kenji Fukumizu

    Abstract: Transfer learning assumes classifiers of similar tasks share certain parameter structures. Unfortunately, modern classifiers use sophisticated feature representations with huge parameter spaces which lead to costly transfer. Under the impression that changes from one classifier to another should be "simple", an efficient transfer learning criterion that only learns the "differences" is propose…

    Submitted 19 October, 2015; v1 submitted 9 June, 2015; originally announced June 2015.

    Comments: Revision Comments: The proofs were corrected from a few mistakes. The title and the introduction was changed. We have also re-run a few experiments

  40. arXiv:1501.06794 [pdf, other]

    stat.ML cs.DS cs.LG

    Computing Functions of Random Variables via Reproducing Kernel Hilbert Space Representations

    Authors: Bernhard Schölkopf, Krikamol Muandet, Kenji Fukumizu, Jonas Peters

    Abstract: We describe a method to perform functional operations on probability distributions of random variables. The method uses reproducing kernel Hilbert space representations of probability distributions, and it is applicable to all operations which can be applied to points drawn from the respective distributions. We refer to our approach as kernel probabilistic programming. We illustrate it on sy…

    Submitted 27 January, 2015; originally announced January 2015.

    ACM Class: G.3; I.2.6; D.3.3

    Journal ref: Statistics and Computing 25:755-766 (2015)

  41. arXiv:1405.5505 [pdf, ps, other]

    stat.ML cs.LG

    Kernel Mean Shrinkage Estimators

    Authors: Krikamol Muandet, Bharath Sriperumbudur, Kenji Fukumizu, Arthur Gretton, Bernhard Schölkopf

    Abstract: A mean function in a reproducing kernel Hilbert space (RKHS), or a kernel mean, is central to kernel methods in that it is used by many classical algorithms such as kernel principal component analysis, and it also forms the core inference step of modern kernel methods that rely on embedding probability distributions in RKHSs. Given a finite sample, an empirical average has been used commonly as a…

    Submitted 25 February, 2016; v1 submitted 21 May, 2014; originally announced May 2014.

    Comments: 41 pages

  42. arXiv:1306.0842  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Kernel Mean Estimation and Stein's Effect

    Authors: Krikamol Muandet, Kenji Fukumizu, Bharath Sriperumbudur, Arthur Gretton, Bernhard Schölkopf

    Abstract: A mean function in a reproducing kernel Hilbert space, or a kernel mean, is an important part of many applications ranging from kernel principal component analysis to Hilbert-space embedding of distributions. Given finite samples, an empirical average is the standard estimate for the true kernel mean. We show that this estimator can be improved via a well-known phenomenon in statistics called Stein'…

    Submitted 6 June, 2013; v1 submitted 4 June, 2013; originally announced June 2013.

    Comments: first draft

  43. arXiv:1210.4887  [pdf]

    cs.LG cs.AI stat.ML

    Hilbert Space Embeddings of POMDPs

    Authors: Yu Nishiyama, Abdeslam Boularias, Arthur Gretton, Kenji Fukumizu

    Abstract: A nonparametric approach for policy learning for POMDPs is proposed. The approach represents distributions over the states, observations, and actions as embeddings in feature spaces, which are reproducing kernel Hilbert spaces. Distributions over states given the observations are obtained by applying the kernel Bayes' rule to these distribution embeddings. Policies and value functions are defined…

    Submitted 16 October, 2012; originally announced October 2012.

    Comments: Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

    Report number: UAI-P-2012-PG-644-653

  44. arXiv:1207.6076  [pdf, ps, other]

    stat.ME cs.LG math.ST stat.ML

    Equivalence of distance-based and RKHS-based statistics in hypothesis testing

    Authors: Dino Sejdinovic, Bharath Sriperumbudur, Arthur Gretton, Kenji Fukumizu

    Abstract: We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, maximum mean discrepancies (MMD), that is, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. In the cas…

    Submitted 12 November, 2013; v1 submitted 25 July, 2012; originally announced July 2012.

    Comments: Published at http://dx.doi.org/10.1214/13-AOS1140 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOS-AOS1140

    Journal ref: Annals of Statistics 2013, Vol. 41, No. 5, 2263-2291

  45. arXiv:1205.0411  [pdf, ps, other]

    cs.LG stat.ME stat.ML

    Hypothesis testing using pairwise distances and associated kernels (with Appendix)

    Authors: Dino Sejdinovic, Arthur Gretton, Bharath Sriperumbudur, Kenji Fukumizu

    Abstract: We provide a unifying framework linking two classes of statistics used in two-sample and independence testing: on the one hand, the energy distances and distance covariances from the statistics literature; on the other, distances between embeddings of distributions to reproducing kernel Hilbert spaces (RKHS), as established in machine learning. The equivalence holds when energy distances are compu…

    Submitted 21 May, 2012; v1 submitted 2 May, 2012; originally announced May 2012.

    Comments: Appearing in Proceedings of the 29th International Conference on Machine Learning, Edinburgh, Scotland, UK, 2012

  46. arXiv:1202.6504  [pdf, ps, other]

    stat.ML cs.LG

    Learning from Distributions via Support Measure Machines

    Authors: Krikamol Muandet, Kenji Fukumizu, Francesco Dinuzzo, Bernhard Schölkopf

    Abstract: This paper presents a kernel-based discriminative learning framework on probability measures. Rather than relying on large collections of vectorial training examples, our framework learns using a collection of probability distributions that have been constructed to meaningfully represent training data. By representing these probability distributions as mean embeddings in the reproducing kernel Hil…

    Submitted 12 January, 2013; v1 submitted 29 February, 2012; originally announced February 2012.

    Comments: Advances in Neural Information Processing Systems 25

  47. arXiv:1109.0455  [pdf, ps, other]

    stat.ML cs.LG

    Gradient-based kernel dimension reduction for supervised learning

    Authors: Kenji Fukumizu, Chenlei Leng

    Abstract: This paper proposes a novel kernel approach to linear dimension reduction for supervised learning. The purpose of the dimension reduction is to find directions in the input space that explain the output as effectively as possible. The proposed method uses an estimator for the gradient of the regression function, based on the covariance operators on reproducing kernel Hilbert spaces. In comparison with o…

    Submitted 2 September, 2011; originally announced September 2011.

    Comments: 21 pages

  48. arXiv:1103.0605  [pdf, ps, other]

    cs.AI cs.DM

    Loopy Belief Propagation, Bethe Free Energy and Graph Zeta Function

    Authors: Yusuke Watanabe, Kenji Fukumizu

    Abstract: We propose a new approach to the theoretical analysis of Loopy Belief Propagation (LBP) and the Bethe free energy (BFE) by establishing a formula to connect LBP and BFE with a graph zeta function. The proposed approach is applicable to a wide class of models including multinomial and Gaussian types. The connection yields a number of new theoretical results on LBP and BFE. This paper focuses on two o…

    Submitted 2 March, 2011; originally announced March 2011.

  49. arXiv:1002.3307  [pdf, ps, other]

    cs.AI cs.DM math-ph

    Graph Zeta Function in the Bethe Free Energy and Loopy Belief Propagation

    Authors: Yusuke Watanabe, Kenji Fukumizu

    Abstract: We propose a new approach to the analysis of Loopy Belief Propagation (LBP) by establishing a formula that connects the Hessian of the Bethe free energy with the edge zeta function. The formula has a number of theoretical implications on LBP. It is applied to give a sufficient condition that the Hessian of the Bethe free energy is positive definite, which shows non-convexity for graphs with mult…

    Submitted 17 February, 2010; originally announced February 2010.

    Comments: 19 pages, Annual Conference on Neural Information Processing Systems (NIPS 2009), together with the supplementary material

    Journal ref: Advances in Neural Information Processing Systems 22, pages 2017-2025

  50. arXiv:0908.3850  [pdf, ps, other]

    math.CO cs.DM

    New graph polynomials from the Bethe approximation of the Ising partition function

    Authors: Yusuke Watanabe, Kenji Fukumizu

    Abstract: We introduce two graph polynomials and discuss their properties. One is a polynomial of two variables whose investigation is motivated by the performance analysis of the Bethe approximation of the Ising partition function. The other is a polynomial of one variable that is obtained by the specialization of the first one. It is shown that these polynomials satisfy deletion-contraction relations and…

    Submitted 3 June, 2010; v1 submitted 26 August, 2009; originally announced August 2009.

    Comments: To appear in Combinatorics, Probability & Computing. Revised from the first version, 28 pages