Showing 1–50 of 88 results for author: Fukumizu, K

  1. arXiv:2405.20879  [pdf, other]

    cs.LG

    Flow matching achieves minimax optimal convergence

    Authors: Kenji Fukumizu, Taiji Suzuki, Noboru Isobe, Kazusato Oko, Masanori Koyama

    Abstract: Flow matching (FM) has gained significant attention as a simulation-free generative model. Unlike diffusion models, which are based on stochastic differential equations, FM employs a simpler approach by solving an ordinary differential equation with an initial condition from a normal distribution, thus streamlining the sample generation process. This paper discusses the convergence properties of F…

    Submitted 31 May, 2024; originally announced May 2024.

  2. arXiv:2403.11520  [pdf, other]

    cs.LG stat.ML

    State-Separated SARSA: A Practical Sequential Decision-Making Algorithm with Recovering Rewards

    Authors: Yuto Tanimoto, Kenji Fukumizu

    Abstract: While many multi-armed bandit algorithms assume that rewards for all arms are constant across rounds, this assumption does not hold in many real-world scenarios. This paper considers the setting of recovering bandits (Pike-Burke & Grunewalder, 2019), where the reward depends on the number of rounds elapsed since the last time an arm was pulled. We propose a new reinforcement learning (RL) algorith…

    Submitted 18 March, 2024; originally announced March 2024.

  3. arXiv:2403.10859  [pdf, other]

    stat.ML cs.LG

    Neural-Kernel Conditional Mean Embeddings

    Authors: Eiki Shimizu, Kenji Fukumizu, Dino Sejdinovic

    Abstract: Kernel conditional mean embeddings (CMEs) offer a powerful framework for representing conditional distribution, but they often face scalability and expressiveness challenges. In this work, we propose a new method that effectively combines the strengths of deep learning with CMEs in order to address these challenges. Specifically, our approach leverages the end-to-end neural network (NN) optimizati…

    Submitted 16 March, 2024; originally announced March 2024.

  4. arXiv:2402.18839  [pdf, other]

    cs.LG math.AP math.FA math.OC math.PR

    Extended Flow Matching: a Method of Conditional Generation with Generalized Continuity Equation

    Authors: Noboru Isobe, Masanori Koyama, Jinzhe Zhang, Kohei Hayashi, Kenji Fukumizu

    Abstract: The task of conditional generation is one of the most important applications of generative models, and numerous methods have been developed to date based on the celebrated flow-based models. However, many flow-based models in use today are not built to allow one to introduce an explicit inductive bias to how the conditional distribution to be generated changes with respect to conditions. This can…

    Submitted 5 July, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: 27 pages, 10 figures, We have corrected an error in our experiment on COT-FM

    MSC Class: 68T07 (Primary); 49Q22 (Secondary)

  5. arXiv:2402.04516  [pdf, other]

    stat.ML cs.LG

    Generalized Sobolev Transport for Probability Measures on a Graph

    Authors: Tam Le, Truyen Nguyen, Kenji Fukumizu

    Abstract: We study the optimal transport (OT) problem for measures supported on a graph metric space. Recently, Le et al. (2022) leverage the graph structure and propose a variant of OT, namely Sobolev transport (ST), which yields a closed-form expression for a fast computation. However, ST is essentially coupled with the $L^p$ geometric structure within its definition which makes it nontrivial to utilize S…

    Submitted 29 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: To appear at ICML'2024

  6. arXiv:2310.13653  [pdf, other]

    stat.ML cs.LG

    Optimal Transport for Measures with Noisy Tree Metric

    Authors: Tam Le, Truyen Nguyen, Kenji Fukumizu

    Abstract: We study optimal transport (OT) problem for probability measures supported on a tree metric space. It is known that such OT problem (i.e., tree-Wasserstein (TW)) admits a closed-form expression, but depends fundamentally on the underlying tree structure over supports of input measures. In practice, the given tree structure may be, however, perturbed due to noisy or adversarial measurements. To mit…

    Submitted 29 February, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: To appear in AISTATS 2024

  7. arXiv:2307.11972  [pdf, other]

    stat.ML cs.LG

    Out-of-Distribution Optimality of Invariant Risk Minimization

    Authors: Shoji Toyota, Kenji Fukumizu

    Abstract: Deep Neural Networks often inherit spurious correlations embedded in training data and hence may fail to generalize to unseen domains, which have different distributions from the domain to provide training data. M. Arjovsky et al. (2019) introduced the concept out-of-distribution (o.o.d.) risk, which is the maximum risk among all domains, and formulated the issue caused by spurious correlations as…

    Submitted 21 July, 2023; originally announced July 2023.

    Comments: 23 pages, submitted for a publication

  8. arXiv:2305.18484  [pdf, other]

    stat.ML cs.LG

    Neural Fourier Transform: A General Approach to Equivariant Representation Learning

    Authors: Masanori Koyama, Kenji Fukumizu, Kohei Hayashi, Takeru Miyato

    Abstract: Symmetry learning has proven to be an effective approach for extracting the hidden structure of data, with the concept of equivariance relation playing the central role. However, most of the current studies are built on architectural theory and corresponding assumptions on the form of data. We propose Neural Fourier Transform (NFT), a general framework of learning the latent linear action of the g…

    Submitted 14 February, 2024; v1 submitted 29 May, 2023; originally announced May 2023.

  9. arXiv:2304.12770  [pdf, other]

    cs.LG stat.ML

    Controlling Posterior Collapse by an Inverse Lipschitz Constraint on the Decoder Network

    Authors: Yuri Kinoshita, Kenta Oono, Kenji Fukumizu, Yuichi Yoshida, Shin-ichi Maeda

    Abstract: Variational autoencoders (VAEs) are one of the deep generative models that have experienced enormous success over the past decades. However, in practice, they suffer from a problem called posterior collapse, which occurs when the encoder coincides, or collapses, with the prior taking no information from the latent structure of the input data into consideration. In this work, we introduce an invers…

    Submitted 2 February, 2024; v1 submitted 25 April, 2023; originally announced April 2023.

    Comments: accepted to ICML 2023, some notations adjusted from the submitted version

  10. arXiv:2302.12498  [pdf, other]

    cs.LG stat.ML

    Scalable Unbalanced Sobolev Transport for Measures on a Graph

    Authors: Tam Le, Truyen Nguyen, Kenji Fukumizu

    Abstract: Optimal transport (OT) is a popular and powerful tool for comparing probability measures. However, OT suffers a few drawbacks: (i) input measures required to have the same mass, (ii) a high computational complexity, and (iii) indefiniteness which limits its applications on kernel-dependent algorithmic approaches. To tackle issues (ii)--(iii), Le et al. (2022) recently proposed Sobolev transport fo…

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: to appear in AISTATS 2023. arXiv admin note: text overlap with arXiv:2101.09756

  11. arXiv:2210.09745  [pdf, other]

    stat.ML cs.LG

    Transfer learning with affine model transformation

    Authors: Shunya Minami, Kenji Fukumizu, Yoshihiro Hayashi, Ryo Yoshida

    Abstract: Supervised transfer learning has received considerable attention due to its potential to boost the predictive power of machine learning in scenarios where data are scarce. Generally, a given set of source models and a dataset from a target domain are used to adapt the pre-trained models to a target domain by statistically learning domain shift and domain-specific factors. While such procedurally a…

    Submitted 19 January, 2024; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: 34 pages

    Journal ref: NeurIPS 2023

  12. arXiv:2210.07413  [pdf, other]

    stat.ML cs.LG

    Invariance-adapted decomposition and Lasso-type contrastive learning

    Authors: Masanori Koyama, Takeru Miyato, Kenji Fukumizu

    Abstract: Recent years have witnessed the effectiveness of contrastive learning in obtaining the representation of dataset that is useful in interpretation and downstream tasks. However, the mechanism that describes this effectiveness have not been thoroughly analyzed, and many studies have been conducted to investigate the data structures captured by contrastive learning. In particular, the recent study of…

    Submitted 13 October, 2022; originally announced October 2022.

    Journal ref: 2022 ICML workshop of Topology, Algebra and Geometry in Machine Learning (spotlight)

  13. arXiv:2210.05972  [pdf, other]

    cs.LG stat.ML

    Unsupervised Learning of Equivariant Structure from Sequences

    Authors: Takeru Miyato, Masanori Koyama, Kenji Fukumizu

    Abstract: In this study, we present meta-sequential prediction (MSP), an unsupervised framework to learn the symmetry from the time sequence of length at least three. Our method leverages the stationary property (e.g. constant velocity, constant acceleration) of the time sequence to learn the underlying equivariant structure of the dataset by simply training the encoder-decoder model to be able to predict t…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: Accepted to NeurIPS 2022

  14. arXiv:2206.01795  [pdf, other]

    math.ST cs.CG cs.LG math.AT stat.ML

    Robust Topological Inference in the Presence of Outliers

    Authors: Siddharth Vishwanath, Bharath K. Sriperumbudur, Kenji Fukumizu, Satoshi Kuriki

    Abstract: The distance function to a compact set plays a crucial role in the paradigm of topological data analysis. In particular, the sublevel sets of the distance function are used in the computation of persistent homology -- a backbone of the topological data analysis pipeline. Despite its stability to perturbations in the Hausdorff distance, persistent homology is highly sensitive to outliers. In this w…

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: 50 pages, 10 figures

    MSC Class: 62R40; 55N31; 68T09

  15. arXiv:2204.12194  [pdf, other]

    nlin.PS cond-mat.str-el physics.data-an

    Procedure to Reveal the Mechanism of Pattern Formation Process by Topological Data Analysis

    Authors: Yoh-ichi Mototake, Masaichiro Mizumaki, Kazue Kudo, Kenji Fukumizu

    Abstract: Topological data analysis (TDA) is a versatile tool that can be used to extract scientific knowledge from complex pattern formation processes. However, the physics correspondence between the features obtained from TDA and pattern dynamics does not agree one-to-one, and the physical interpretation of the TDA features needs to be set appropriately according to the phenomenon to be analyzed. In this…

    Submitted 8 July, 2024; v1 submitted 26 April, 2022; originally announced April 2022.

    Comments: 54 pages, 19 figures

  16. arXiv:2203.15549  [pdf, other]

    stat.ML cs.LG

    Invariance Learning based on Label Hierarchy

    Authors: Shoji Toyota, Kenji Fukumizu

    Abstract: Deep Neural Networks inherit spurious correlations embedded in training data and hence may fail to predict desired labels on unseen domains (or environments), which have different distributions from the domain used in training. Invariance Learning (IL) has been developed recently to overcome this shortcoming; using training data in many domains, IL estimates such a predictor that is invariant to a…

    Submitted 29 March, 2022; originally announced March 2022.

    Comments: 30 pages, submitted for a publication

  17. ALGAN: Anomaly Detection by Generating Pseudo Anomalous Data via Latent Variables

    Authors: Hironori Murase, Kenji Fukumizu

    Abstract: In many anomaly detection tasks, where anomalous data rarely appear and are difficult to collect, training using only normal data is important. Although it is possible to manually create anomalous data using prior knowledge, they may be subject to user bias. In this paper, we propose an Anomalous Latent variable Generative Adversarial Network (ALGAN) in which the GAN generator produces pseudo-anom…

    Submitted 9 May, 2022; v1 submitted 21 February, 2022; originally announced February 2022.

    Comments: 13 pages, 8 figures

    Journal ref: IEEE Access, vol. 10, pp. 44259-44270, 2022

  18. arXiv:2110.05225  [pdf, other]

    stat.ML cs.LG econ.EM stat.ME

    $β$-Intact-VAE: Identifying and Estimating Causal Effects under Limited Overlap

    Authors: Pengzhou Wu, Kenji Fukumizu

    Abstract: As an important problem in causal inference, we discuss the identification and estimation of treatment effects (TEs) under limited overlap; that is, when subjects with certain features belong to a single treatment group. We use a latent variable to model a prognostic score which is widely used in biostatistics and sufficient for TEs; i.e., we build a generative prognostic model. We prove that the…

    Submitted 11 October, 2021; originally announced October 2021.

    Comments: Updated version of the NeurIPS 2021 submission (https://openreview.net/forum?id=Z3yd722b5X5). Largely improve readability and the presentation of experimental results. arXiv admin note: text overlap with arXiv:2109.15062, arXiv:2101.06662

  19. arXiv:2109.15062  [pdf, other]

    stat.ML cs.LG econ.EM stat.ME

    Towards Principled Causal Effect Estimation by Deep Identifiable Models

    Authors: Pengzhou Wu, Kenji Fukumizu

    Abstract: As an important problem in causal inference, we discuss the estimation of treatment effects (TEs). Representing the confounder as a latent variable, we propose Intact-VAE, a new variant of variational autoencoder (VAE), motivated by the prognostic score that is sufficient for identifying TEs. Our VAE also naturally gives representations balanced for treatment groups, using its prior. Experiments o…

    Submitted 1 November, 2021; v1 submitted 30 September, 2021; originally announced September 2021.

    Comments: Fully updated. Largely improve clarity, add identification under unconfoundedness (Sec. 4.2), and more. arXiv admin note: substantial text overlap with arXiv:2101.06662

  20. arXiv:2108.11018  [pdf, other]

    cs.LG cs.CV

    A Scaling Law for Synthetic-to-Real Transfer: How Much Is Your Pre-training Effective?

    Authors: Hiroaki Mikami, Kenji Fukumizu, Shogo Murai, Shuji Suzuki, Yuta Kikuchi, Taiji Suzuki, Shin-ichi Maeda, Kohei Hayashi

    Abstract: Synthetic-to-real transfer learning is a framework in which a synthetically generated dataset is used to pre-train a model to improve its performance on real vision tasks. The most significant advantage of using synthetic images is that the ground-truth labels are automatically available, enabling unlimited expansion of the data size without human cost. However, synthetic data may have a huge doma…

    Submitted 8 October, 2021; v1 submitted 24 August, 2021; originally announced August 2021.

  21. arXiv:2101.06662  [pdf, other]

    stat.ML cs.LG stat.ME

    Intact-VAE: Estimating Treatment Effects under Unobserved Confounding

    Authors: Pengzhou Wu, Kenji Fukumizu

    Abstract: NOTE: This preprint has a flawed theoretical formulation. Please avoid it and refer to the ICLR22 publication https://openreview.net/forum?id=q7n2RngwOM. Also, arXiv:2109.15062 contains some new ideas on unobserved Confounding. As an important problem of causal inference, we discuss the identification and estimation of treatment effects under unobserved confounding. Representing the confounder a…

    Submitted 20 April, 2022; v1 submitted 17 January, 2021; originally announced January 2021.

    Comments: This preprint has a flawed theoretical formulation. It was intended as a theoretical update of https://openreview.net/forum?id=D3TNqCspFpM

  22. arXiv:2011.02315  [pdf, other]

    math.ST

    Kernel Mean Embedding of Probability Measures and its Applications to Functional Data Analysis

    Authors: Saeed Hayati, Kenji Fukumizu, Afshin Parvardeh

    Abstract: This study intends to introduce kernel mean embedding of probability measures over infinite-dimensional separable Hilbert spaces induced by functional response statistical models. The embedded function represents the concentration of probability measures in small open neighborhoods, which identifies a pseudo-likelihood and fosters a rich framework for statistical inference. Utilizing Maximum Mean…

    Submitted 4 November, 2020; originally announced November 2020.

    Comments: 37 Pages, 2 figures, Submitted to Electronic Journal of Statistics

    MSC Class: 62R10 (Primary) 46N30 (Secondary)

  23. arXiv:2011.02256  [pdf, other]

    stat.ML cs.LG

    Advantage of Deep Neural Networks for Estimating Functions with Singularity on Hypersurfaces

    Authors: Masaaki Imaizumi, Kenji Fukumizu

    Abstract: We develop a minimax rate analysis to describe the reason that deep neural networks (DNNs) perform better than other standard methods. For nonparametric regression problems, it is well known that many standard methods attain the minimax optimal rate of estimation errors for smooth functions, and thus, it is not straightforward to identify the theoretical advantages of DNNs. This study tries to fil…

    Submitted 8 February, 2022; v1 submitted 4 November, 2020; originally announced November 2020.

    Comments: Complete version of arXiv:1802.04474

  24. arXiv:2007.02809  [pdf, other]

    stat.ML cs.LG

    Meta Learning for Causal Direction

    Authors: Jean-Francois Ton, Dino Sejdinovic, Kenji Fukumizu

    Abstract: The inaccessibility of controlled randomized trials due to inherent constraints in many fields of science has been a fundamental issue in causal inference. In this paper, we focus on distinguishing the cause from effect in the bivariate setting under limited observational data. Based on recent developments in meta learning as well as in causal inference, we introduce a novel generative model that…

    Submitted 21 February, 2021; v1 submitted 6 July, 2020; originally announced July 2020.

  25. arXiv:2006.13228  [pdf, other]

    stat.ML cs.LG

    A General Class of Transfer Learning Regression without Implementation Cost

    Authors: Shunya Minami, Song Liu, Stephen Wu, Kenji Fukumizu, Ryo Yoshida

    Abstract: We propose a novel framework that unifies and extends existing methods of transfer learning (TL) for regression. To bridge a pretrained source model to the model on a target task, we introduce a density-ratio reweighting function, which is estimated through the Bayesian framework with a specific prior distribution. By changing two intrinsic hyperparameters and the choice of the density-ratio model…

    Submitted 16 December, 2020; v1 submitted 23 June, 2020; originally announced June 2020.

    Comments: 31 pages, 6 figures

  26. arXiv:2006.10012  [pdf, other]

    math.ST cs.CG cs.LG math.AT stat.ML

    Robust Persistence Diagrams using Reproducing Kernels

    Authors: Siddharth Vishwanath, Kenji Fukumizu, Satoshi Kuriki, Bharath Sriperumbudur

    Abstract: Persistent homology has become an important tool for extracting geometric and topological features from data, whose multi-scale features are summarized in a persistence diagram. From a statistical perspective, however, persistence diagrams are very sensitive to perturbations in the input space. In this work, we develop a framework for constructing robust persistence diagrams from superlevel filtra…

    Submitted 3 June, 2022; v1 submitted 17 June, 2020; originally announced June 2020.

    MSC Class: 55N31; 62R40; 62G07; 46E22

  27. arXiv:2004.01822  [pdf, other]

    cs.LG stat.ML

    The equivalence between Stein variational gradient descent and black-box variational inference

    Authors: Casey Chu, Kentaro Minami, Kenji Fukumizu

    Abstract: We formalize an equivalence between two popular methods for Bayesian inference: Stein variational gradient descent (SVGD) and black-box variational inference (BBVI). In particular, we show that BBVI corresponds precisely to SVGD when the kernel is the neural tangent kernel. Furthermore, we interpret SVGD and BBVI as kernel gradient flows; we do this by leveraging the recent perspective that views…

    Submitted 3 April, 2020; originally announced April 2020.

    Comments: ICLR 2020, Workshop on Integration of Deep Neural Models and Differential Equations

  28. arXiv:2002.04185  [pdf, other]

    cs.LG stat.ML

    Smoothness and Stability in GANs

    Authors: Casey Chu, Kentaro Minami, Kenji Fukumizu

    Abstract: Generative adversarial networks, or GANs, commonly display unstable behavior during training. In this work, we develop a principled theoretical framework for understanding the stability of various types of GANs. In particular, we derive conditions that guarantee eventual stationarity of the generator when it is trained with gradient descent, conditions that must be satisfied by the divergence that…

    Submitted 10 February, 2020; originally announced February 2020.

    Comments: ICLR 2020

  29. arXiv:2001.01894  [pdf]

    stat.ML cs.LG

    Causal Mosaic: Cause-Effect Inference via Nonlinear ICA and Ensemble Method

    Authors: Pengzhou Wu, Kenji Fukumizu

    Abstract: We address the problem of distinguishing cause from effect in bivariate setting. Based on recent developments in nonlinear independent component analysis (ICA), we train nonparametrically general nonlinear causal models that allow non-additive noise. Further, we build an ensemble framework, namely Causal Mosaic, which models a causal pair by a mixture of nonlinear models. We compare this method wi…

    Submitted 7 January, 2020; originally announced January 2020.

    Comments: Accepted to AISTATS 2020. Camera-ready version in preparation

    Journal ref: An updated version at AISTATS 2020: http://proceedings.mlr.press/v108/wu20b/wu20b.pdf. Main changes: a correction in Theorem 3 and additional explanations in Sec. 4

  30. arXiv:2001.00220  [pdf, other]

    math.PR math.AT math.ST

    On the Limits of Topological Data Analysis for Statistical Inference

    Authors: Siddharth Vishwanath, Kenji Fukumizu, Satoshi Kuriki, Bharath Sriperumbudur

    Abstract: Topological data analysis has emerged as a powerful tool for extracting the metric, geometric and topological features underlying the data as a multi-resolution summary statistic, and has found applications in several areas where data arises from complex sources. In this paper, we examine the use of topological summary statistics through the lens of statistical inference. We investigate necessary…

    Submitted 15 February, 2024; v1 submitted 1 January, 2020; originally announced January 2020.

    Comments: 36 pages, 9 figures

    MSC Class: 62F30; 55N31; 62R40

  31. Exchangeable deep neural networks for set-to-set matching and learning

    Authors: Yuki Saito, Takuma Nakamura, Hirotaka Hachiya, Kenji Fukumizu

    Abstract: Matching two different sets of items, called heterogeneous set-to-set matching problem, has recently received attention as a promising problem. The difficulties are to extract features to match a correct pair of different sets and also preserve two types of exchangeability required for set-to-set matching: the pair of sets, as well as the items in each set, should be exchangeable. In this study, w…

    Submitted 28 January, 2021; v1 submitted 22 October, 2019; originally announced October 2019.

  32. arXiv:1908.09112  [pdf, other]

    stat.ME stat.AP

    Disjunct Support Spike and Slab Priors for Variable Selection in Regression under Quasi-sparseness

    Authors: Daniel Andrade, Kenji Fukumizu

    Abstract: Sparseness of the regression coefficient vector is often a desirable property, since, among other benefits, sparseness improves interpretability. In practice, many true regression coefficients might be negligibly small, but non-zero, which we refer to as quasi-sparseness. Spike-and-slab priors as introduced in (Chipman et al., 2001) can be tuned to ignore very small regression coefficients, and, a…

    Submitted 29 September, 2019; v1 submitted 24 August, 2019; originally announced August 2019.

  33. A Kernel Stein Test for Comparing Latent Variable Models

    Authors: Heishiro Kanagawa, Wittawat Jitkrittum, Lester Mackey, Kenji Fukumizu, Arthur Gretton

    Abstract: We propose a kernel-based nonparametric test of relative goodness of fit, where the goal is to compare two models, both of which may have unobserved latent variables, such that the marginal distribution of the observed variables is intractable. The proposed test generalizes the recently proposed kernel Stein discrepancy (KSD) tests (Liu et al., 2016, Chwialkowski et al., 2016, Yang et al., 2018) t…

    Submitted 9 May, 2023; v1 submitted 1 July, 2019; originally announced July 2019.

    Comments: This is a pre-copyedited, author-produced version of an article accepted for publication in The Journal of the Royal Statistical Society Series: B following peer review

  34. arXiv:1906.04868  [pdf, other]

    cs.LG stat.ML

    Semi-flat minima and saddle points by embedding neural networks to overparameterization

    Authors: Kenji Fukumizu, Shoichiro Yamaguchi, Yoh-ichi Mototake, Mirai Tanaka

    Abstract: We theoretically study the landscape of the training error for neural networks in overparameterized cases. We consider three basic methods for embedding a network into a wider one with more hidden units, and discuss whether a minimum point of the narrower network gives a minimum or saddle point of the wider one. Our results show that the networks with smooth and ReLU activation have different part…

    Submitted 14 June, 2019; v1 submitted 11 June, 2019; originally announced June 2019.

    Comments: 38 pages, 4 figures

  35. arXiv:1903.01680  [pdf, other]

    stat.CO stat.ME stat.ML

    Convex Covariate Clustering for Classification

    Authors: Daniel Andrade, Kenji Fukumizu, Yuzuru Okajima

    Abstract: Clustering, like covariate selection for classification, is an important step to compress and interpret the data. However, clustering of covariates is often performed independently of the classification step, which can lead to undesirable clustering results that harm interpretability and compression rate. Therefore, we propose a method that can cluster covariates while taking into account class la…

    Submitted 6 April, 2020; v1 submitted 5 March, 2019; originally announced March 2019.

    Comments: Under consideration at Pattern Recognition Letters

  36. arXiv:1902.00342  [pdf, other]

    stat.ML cs.LG

    Tree-Sliced Variants of Wasserstein Distances

    Authors: Tam Le, Makoto Yamada, Kenji Fukumizu, Marco Cuturi

    Abstract: Optimal transport (OT) theory defines a powerful set of tools to compare probability distributions. OT suffers however from a few drawbacks, computational and statistical, which have encouraged the proposal of several regularized variants of OT in the recent literature, one of the most notable being the sliced formulation, which exploits the closed-form formula between univariate distri…

    Submitted 28 October, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: Camera-ready for NeurIPS 2019

  37. Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions

    Authors: Sho Yokoi, Sosuke Kobayashi, Kenji Fukumizu, Jun Suzuki, Kentaro Inui

    Abstract: In this paper, we propose a new kernel-based co-occurrence measure that can be applied to sparse linguistic expressions (e.g., sentences) with a very short learning time, as an alternative to pointwise mutual information (PMI). As well as deriving PMI from mutual information, we derive this new measure from the Hilbert--Schmidt independence criterion (HSIC); thus, we call the new measure the point…

    Submitted 4 September, 2018; originally announced September 2018.

    Comments: Accepted by EMNLP 2018

    Journal ref: EMNLP 2018

  38. arXiv:1806.05924  [pdf, other]

    stat.AP stat.CO stat.ML

    Robust Bayesian Model Selection for Variable Clustering with the Gaussian Graphical Model

    Authors: Daniel Andrade, Akiko Takeda, Kenji Fukumizu

    Abstract: Variable clustering is important for explanatory analysis. However, only few dedicated methods for variable clustering with the Gaussian graphical model have been proposed. Even more severe, small insignificant partial correlations due to noise can dramatically change the clustering result when evaluating for example with the Bayesian Information Criteria (BIC). In this work, we try to address thi…

    Submitted 15 June, 2018; originally announced June 2018.

  39. arXiv:1805.08463  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Variational Learning on Aggregate Outputs with Gaussian Processes

    Authors: Ho Chung Leon Law, Dino Sejdinovic, Ewan Cameron, Tim CD Lucas, Seth Flaxman, Katherine Battle, Kenji Fukumizu

    Abstract: While a typical supervised learning framework assumes that the inputs and the outputs are measured at the same levels of granularity, many applications, including global mapping of disease, only have access to outputs at a much coarser level than that of the inputs. Aggregation of outputs makes generalization to new inputs much more difficult. We consider an approach to this problem based on varia…

    Submitted 22 May, 2018; originally announced May 2018.

  40. arXiv:1802.08404  [pdf, other]

    stat.ML

    Kernel Recursive ABC: Point Estimation with Intractable Likelihood

    Authors: Takafumi Kajihara, Motonobu Kanagawa, Keisuke Yamazaki, Kenji Fukumizu

    Abstract: We propose a novel approach to parameter estimation for simulator-based statistical models with intractable likelihood. Our proposed method involves recursive application of kernel ABC and kernel herding to the same observed data. We provide a theoretical explanation regarding why the approach works, showing (for the population setting) that, under a certain assumption, point estimates obtained wi…

    Submitted 12 June, 2018; v1 submitted 23 February, 2018; originally announced February 2018.

    Comments: to appear in ICML 2018. 18 pages

  41. arXiv:1802.06226  [pdf, other]

    stat.ML

    Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator

    Authors: Makoto Yamada, Denny Wu, Yao-Hung Hubert Tsai, Ichiro Takeuchi, Ruslan Salakhutdinov, Kenji Fukumizu

    Abstract: Measuring divergence between two distributions is essential in machine learning and statistics and has various applications including binary classification, change point detection, and two-sample test. Furthermore, in the era of big data, designing divergence measure that is interpretable and can handle high-dimensional and complex data becomes extremely important. In the paper, we propose a post…

    Submitted 17 February, 2018; originally announced February 2018.

  42. arXiv:1802.05411  [pdf, ps, other]

    cs.LG stat.ML

    Selecting the Best in GANs Family: a Post Selection Inference Framework

    Authors: Yao-Hung Hubert Tsai, Makoto Yamada, Denny Wu, Ruslan Salakhutdinov, Ichiro Takeuchi, Kenji Fukumizu

    Abstract: "Which Generative Adversarial Networks (GANs) generates the most plausible images?" has been a frequently asked question among researchers. To address this problem, we first propose an incomplete U-statistics estimate of maximum mean discrepancy $\mathrm{MMD}_{inc}$ to measure the distribution discrepancy between generated and real images. $\mathrm{MMD}_{inc}$ enjoys the advantages of asymp…

    Submitted 23 June, 2018; v1 submitted 15 February, 2018; originally announced February 2018.

  43. arXiv:1802.04474  [pdf, other]

    stat.ML

    Deep Neural Networks Learn Non-Smooth Functions Effectively

    Authors: Masaaki Imaizumi, Kenji Fukumizu

    Abstract: We theoretically discuss why deep neural networks (DNNs) performs better than other models in some cases by investigating statistical properties of DNNs for non-smooth functions. While DNNs have empirically shown higher performance than other standard methods, understanding its mechanism is still a challenging problem. From an aspect of the statistical theory, it is known many standard methods att…

    Submitted 7 July, 2018; v1 submitted 13 February, 2018; originally announced February 2018.

    Comments: 31 pages

  44. arXiv:1709.00147  [pdf, other]

    math.NA stat.ML

    Convergence Analysis of Deterministic Kernel-Based Quadrature Rules in Misspecified Settings

    Authors: Motonobu Kanagawa, Bharath K. Sriperumbudur, Kenji Fukumizu

    Abstract: This paper presents a convergence analysis of kernel-based quadrature rules in misspecified settings, focusing on deterministic quadrature in Sobolev spaces. In particular, we deal with misspecified settings where a test integrand is less smooth than a Sobolev RKHS based on which a quadrature rule is constructed. We provide convergence guarantees based on two different assumptions on a quadrature…

    Submitted 30 October, 2018; v1 submitted 1 September, 2017; originally announced September 2017.

    Comments: 36 pages

    MSC Class: 65D30 (Primary); 65D32; 65D05; 46E35; 46E22 (Secondary)

  45. arXiv:1706.03472  [pdf, other]

    stat.ML math.AT physics.data-an

    Kernel method for persistence diagrams via kernel embedding and weight factor

    Authors: Genki Kusano, Kenji Fukumizu, Yasuaki Hiraoka

    Abstract: Topological data analysis is an emerging mathematical concept for characterizing shapes in multi-scale data. In this field, persistence diagrams are widely used as a descriptor of the input data, and can distinguish robust and noisy topological properties. Nowadays, it is highly desired to develop a statistical framework on persistence diagrams to deal with practical data. This paper proposes a ke…

    Submitted 12 June, 2017; originally announced June 2017.

    Comments: 12 figures, 30 pages

  46. arXiv:1705.07673  [pdf, other]

    stat.ML cs.LG

    A Linear-Time Kernel Goodness-of-Fit Test

    Authors: Wittawat Jitkrittum, Wenkai Xu, Zoltan Szabo, Kenji Fukumizu, Arthur Gretton

    Abstract: We propose a novel adaptive test of goodness-of-fit, with computational cost linear in the number of samples. We learn the test features that best indicate the differences between observed samples and a reference model, by minimizing the false negative rate. These features are constructed via Stein's method, meaning that it is not necessary to compute the normalising constant of the model. We anal…

    Submitted 24 October, 2017; v1 submitted 22 May, 2017; originally announced May 2017.

    Comments: Accepted to NIPS 2017

    MSC Class: 46E22; 62G10 ACM Class: G.3; I.2.6

  47. arXiv:1705.04194  [pdf, ps, other]

    stat.ML

    Influence Function and Robust Variant of Kernel Canonical Correlation Analysis

    Authors: Md. Ashad Alam, Kenji Fukumizu, Yu-Ping Wang

    Abstract: Many unsupervised kernel methods rely on the estimation of the kernel covariance operator (kernel CO) or kernel cross-covariance operator (kernel CCO). Both kernel CO and kernel CCO are sensitive to contaminated data, even when bounded positive definite kernels are used. To the best of our knowledge, there are few well-founded robust kernel methods for statistical unsupervised learning. In additio…

    Submitted 9 May, 2017; originally announced May 2017.

    Comments: arXiv admin note: text overlap with arXiv:1602.05563

  48. arXiv:1703.03216  [pdf, other]

    stat.ML

    Trimmed Density Ratio Estimation

    Authors: Song Liu, Akiko Takeda, Taiji Suzuki, Kenji Fukumizu

    Abstract: Density ratio estimation is a vital tool in both machine learning and statistical community. However, due to the unbounded nature of density ratio, the estimation procedure can be vulnerable to corrupted data points, which often pushes the estimated ratio toward infinity. In this paper, we present a robust estimator which automatically identifies and trims outliers. The proposed estimator has a co…

    Submitted 6 November, 2017; v1 submitted 9 March, 2017; originally announced March 2017.

    Comments: Made minor revisions. Restructured the introductory sections

  49. arXiv:1701.01582  [pdf, other]

    stat.ML

    Learning Sparse Structural Changes in High-dimensional Markov Networks: A Review on Methodologies and Theories

    Authors: Song Liu, Kenji Fukumizu, Taiji Suzuki

    Abstract: Recent years have seen an increasing popularity of learning the sparse \emph{changes} in Markov Networks. Changes in the structure of Markov Networks reflect alternations of interactions between random variables under different regimes and provide insights into the underlying system. While each individual network structure can be complicated and difficult to learn, the overall change from one netw…

    Submitted 9 January, 2017; v1 submitted 6 January, 2017; originally announced January 2017.

    Comments: Fixed a few typos in Section 4.4: θ should be δ

  50. arXiv:1610.03725  [pdf, ps, other]

    stat.ML stat.ME

    Post Selection Inference with Kernels

    Authors: Makoto Yamada, Yuta Umezu, Kenji Fukumizu, Ichiro Takeuchi

    Abstract: We propose a novel kernel based post selection inference (PSI) algorithm, which can not only handle non-linearity in data but also structured output such as multi-dimensional and multi-label outputs. Specifically, we develop a PSI algorithm for independence measures, and propose the Hilbert-Schmidt Independence Criterion (HSIC) based PSI algorithm (hsicInf). The novelty of the proposed algorithm i…

    Submitted 13 October, 2016; v1 submitted 12 October, 2016; originally announced October 2016.