Showing 1–50 of 67 results for author: Yamada, M

Searching in archive stat.

  1. arXiv:2310.10143  [pdf, other]

    stat.ML cs.LG

    An Empirical Study of Self-supervised Learning with Wasserstein Distance

    Authors: Makoto Yamada, Yuki Takezawa, Guillaume Houry, Kira Michaela Dusterwald, Deborah Sulem, Han Zhao, Yao-Hung Hubert Tsai

    Abstract: In this study, we delve into the problem of self-supervised learning (SSL) utilizing the 1-Wasserstein distance on a tree structure (a.k.a., Tree-Wasserstein distance (TWD)), where TWD is defined as the L1 distance between two tree-embedded vectors. In SSL methods, the cosine similarity is often utilized as an objective function; however, it has not been well studied when utilizing the Wasserstein…

    Submitted 5 February, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  2. arXiv:2305.11420  [pdf, other]

    cs.LG cs.DC stat.ML

    Beyond Exponential Graph: Communication-Efficient Topologies for Decentralized Learning via Finite-time Convergence

    Authors: Yuki Takezawa, Ryoma Sato, Han Bao, Kenta Niwa, Makoto Yamada

    Abstract: Decentralized learning has recently been attracting increasing attention for its applications in parallel computation and privacy preservation. Many recent studies stated that the underlying network topology with a faster consensus rate (a.k.a. spectral gap) leads to a better convergence rate and accuracy for decentralized learning. However, a topology with a fast consensus rate, e.g., the exponen…

    Submitted 15 October, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: NeurIPS 2023

  3. arXiv:2302.02139  [pdf, other]

    cs.LG stat.ML

    Structural Explanations for Graph Neural Networks using HSIC

    Authors: Ayato Toyokuni, Makoto Yamada

    Abstract: Graph neural networks (GNNs) are a type of neural model that tackle graphical tasks in an end-to-end manner. Recently, GNNs have been receiving increased attention in machine learning and data mining communities because of the higher performance they achieve in various tasks, including graph classification, link prediction, and recommendation. However, the complicated dynamics of GNNs make it diff…

    Submitted 4 February, 2023; originally announced February 2023.

  4. arXiv:2301.12656  [pdf, other]

    stat.ME math.OC

    NPSA: Nonparametric Simulated Annealing for Global Optimization

    Authors: Rong Chen, Alan Schumitzky, Alona Kryshchenko, Julian D. Otalvaro, Walter M. Yamada, Michael N. Neely

    Abstract: In this paper we describe NPSA, the first parallel nonparametric global maximum likelihood optimization algorithm using simulated annealing (SA). Unlike the nonparametric adaptive grid search method NPAG, which is not guaranteed to find a global optimum solution, and may suffer from the curse of dimensionality, NPSA is a global optimizer and it is free from these grid related issues. We illustrate…

    Submitted 29 January, 2023; originally announced January 2023.

    Comments: 35 pages, 11 figures, 1 table

  5. arXiv:2207.10283  [pdf, other]

    cs.LG cs.AI stat.ML

    One-vs-the-Rest Loss to Focus on Important Samples in Adversarial Training

    Authors: Sekitoshi Kanai, Shin'ya Yamaguchi, Masanori Yamada, Hiroshi Takahashi, Kentaro Ohno, Yasutoshi Ida

    Abstract: This paper proposes a new loss function for adversarial training. Since adversarial training has difficulties, e.g., necessity of high model capacity, focusing on important data points by weighting cross-entropy loss has attracted much attention. However, they are vulnerable to sophisticated attacks, e.g., Auto-Attack. This paper experimentally reveals that the cause of their vulnerability is thei…

    Submitted 26 April, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: ICML2023, 26 pages, 19 figures

  6. arXiv:2206.12116  [pdf, other]

    stat.ML cs.AI cs.LG

    Approximating 1-Wasserstein Distance with Trees

    Authors: Makoto Yamada, Yuki Takezawa, Ryoma Sato, Han Bao, Zornitsa Kozareva, Sujith Ravi

    Abstract: Wasserstein distance, which measures the discrepancy between distributions, shows efficacy in various types of natural language processing (NLP) and computer vision (CV) applications. One of the challenges in estimating Wasserstein distance is that it is computationally expensive and does not scale well for many distribution comparison tasks. In this paper, we aim to approximate the 1-Wasserstein…

    Submitted 24 June, 2022; originally announced June 2022.

  7. arXiv:2206.02077  [pdf, other]

    stat.ME

    RPEM: Randomized Monte Carlo Parametric Expectation Maximization Algorithm

    Authors: Rong Chen, Alan Schumitzky, Alona Kryshchenko, Romain Garreau, Julian D. Otalvaro, Walter M. Yamada, Michael N. Neely

    Abstract: Inspired from quantum Monte Carlo, by using unbiased estimators all the time and sampling discrete and continuous variables at the same time using Metropolis algorithm, we present a novel, fast, and accurate high performance Monte Carlo Parametric Expectation Maximization (MCPEM) algorithm. We named it Randomized Parametric Expectation Maximization (RPEM). In particular, we compared RPEM with Mono…

    Submitted 23 December, 2022; v1 submitted 4 June, 2022; originally announced June 2022.

    Comments: 28 pages, 6 figures, 2 tables

  8. arXiv:2206.00516  [pdf, other]

    cs.LG stat.ML

    Feature Selection for Discovering Distributional Treatment Effect Modifiers

    Authors: Yoichi Chikahara, Makoto Yamada, Hisashi Kashima

    Abstract: Finding the features relevant to the difference in treatment effects is essential to unveil the underlying causal mechanisms. Existing methods seek such features by measuring how greatly the feature attributes affect the degree of the conditional average treatment effect (CATE). However, these methods may overlook important features because CATE, a measure of the average treatment effect, ca…

    Submitted 12 June, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: 18 pages, Accepted to UAI2022

  9. arXiv:2110.08577  [pdf, other]

    math.OC cs.LG stat.ML

    Nys-Newton: Nyström-Approximated Curvature for Stochastic Optimization

    Authors: Dinesh Singh, Hardik Tankaria, Makoto Yamada

    Abstract: Second-order optimization methods are among the most widely used optimization approaches for convex optimization problems, and have recently been used to optimize non-convex optimization problems such as deep learning models. The widely used second-order optimization methods such as quasi-Newton methods generally provide curvature information by approximating the Hessian using the secant equation.…

    Submitted 29 January, 2022; v1 submitted 16 October, 2021; originally announced October 2021.

  10. arXiv:2109.14875  [pdf, other]

    stat.ML cs.LG math.OC

    Adversarial Regression with Doubly Non-negative Weighting Matrices

    Authors: Tam Le, Truyen Nguyen, Makoto Yamada, Jose Blanchet, Viet Anh Nguyen

    Abstract: Many machine learning tasks that involve predicting an output response can be solved by training a weighted regression model. Unfortunately, the predictive power of this type of models may severely deteriorate under low sample sizes or under covariate perturbations. Reweighting the training samples has aroused as an effective mitigation strategy to these problems. In this paper, we propose a novel…

    Submitted 30 September, 2021; originally announced September 2021.

    Comments: Accepted to the Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS2021)

  11. arXiv:2103.01400  [pdf, other]

    cs.LG cs.AI stat.ML

    Smoothness Analysis of Adversarial Training

    Authors: Sekitoshi Kanai, Masanori Yamada, Hiroshi Takahashi, Yuki Yamanaka, Yasutoshi Ida

    Abstract: Deep neural networks are vulnerable to adversarial attacks. Recent studies about adversarial robustness focus on the loss landscape in the parameter space since it is related to optimization and generalization performance. These studies conclude that the difficulty of adversarial training is caused by the non-smoothness of the loss function: i.e., its gradient is not Lipschitz continuous. However,…

    Submitted 15 June, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: 22 pages, 7 figures. In V3, we add the results of EntropySGD for adversarial training

  12. arXiv:2102.04108  [pdf, other]

    stat.ML cs.LG

    Dynamic Sasvi: Strong Safe Screening for Norm-Regularized Least Squares

    Authors: Hiroaki Yamada, Makoto Yamada

    Abstract: A recently introduced technique for a sparse optimization problem called "safe screening" allows us to identify irrelevant variables in the early stage of optimization. In this paper, we first propose a flexible framework for safe screening based on the Fenchel-Rockafellar duality and then derive a strong safe screening rule for norm-regularized least squares by the framework. We call the proposed…

    Submitted 8 February, 2021; originally announced February 2021.

    Journal ref: Advances in Neural Information Processing Systems 34 (2021) 14645–14655

  13. arXiv:2102.02950  [pdf, other]

    stat.ML cs.AI cs.LG

    Adversarial Training Makes Weight Loss Landscape Sharper in Logistic Regression

    Authors: Masanori Yamada, Sekitoshi Kanai, Tomoharu Iwata, Tomokatsu Takahashi, Yuki Yamanaka, Hiroshi Takahashi, Atsutoshi Kumagai

    Abstract: Adversarial training is actively studied for learning robust models against adversarial examples. A recent study finds that adversarially trained models degenerate generalization performance on adversarial examples when their weight loss landscape, which is loss changes with respect to weights, is sharp. Unfortunately, it has been experimentally shown that adversarial training sharpens the weight…

    Submitted 4 February, 2021; originally announced February 2021.

    Comments: 9 pages, 5 figures

  14. arXiv:2101.11520  [pdf, other]

    cs.LG stat.ML

    Supervised Tree-Wasserstein Distance

    Authors: Yuki Takezawa, Ryoma Sato, Makoto Yamada

    Abstract: To measure the similarity of documents, the Wasserstein distance is a powerful tool, but it requires a high computational cost. Recently, for fast computation of the Wasserstein distance, methods for approximating the Wasserstein distance using a tree metric have been proposed. These tree-based methods allow fast comparisons of a large number of documents; however, they are unsupervised and do not…

    Submitted 23 July, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

  15. arXiv:2010.15659  [pdf, other]

    math.ST stat.ML

    Post-selection inference with HSIC-Lasso

    Authors: Tobias Freidling, Benjamin Poignard, Héctor Climente-González, Makoto Yamada

    Abstract: Detecting influential features in non-linear and/or high-dimensional data is a challenging and increasingly important task in machine learning. Variable selection methods have thus been gaining much attention as well as post-selection inference. Indeed, the selected features can be significantly flawed when the selection procedure is not accounted for. We propose a selective inference procedure us…

    Submitted 17 June, 2021; v1 submitted 29 October, 2020; originally announced October 2020.

    Comments: Changes to previous version: * Incorporating comments and remarks from reviewers * Evaluation of power of the proposed method * Summarising behaviour for different hyper-parameters in one paragraph, instead of several figures * Pseudocode of the algorithm * Additional, in-depth experiment on real-world data

  16. arXiv:2010.09157  [pdf, other]

    cs.DL cs.IR cs.LG stat.ML

    Poincare: Recommending Publication Venues via Treatment Effect Estimation

    Authors: Ryoma Sato, Makoto Yamada, Hisashi Kashima

    Abstract: Choosing a publication venue for an academic paper is a crucial step in the research process. However, in many cases, decisions are based solely on the experience of researchers, which often leads to suboptimal results. Although there exist venue recommender systems for academic papers, they recommend venues where the paper is expected to be published. In this study, we aim to recommend publicatio…

    Submitted 2 September, 2022; v1 submitted 18 October, 2020; originally announced October 2020.

    Comments: Journal of Informetrics

  17. arXiv:2010.02558  [pdf, other]

    stat.ML cs.AI cs.LG

    Constraining Logits by Bounded Function for Adversarial Robustness

    Authors: Sekitoshi Kanai, Masanori Yamada, Shin'ya Yamaguchi, Hiroshi Takahashi, Yasutoshi Ida

    Abstract: We propose a method for improving adversarial robustness by addition of a new bounded function just before softmax. Recent studies hypothesize that small logits (inputs of softmax) by logit regularization can improve adversarial robustness of deep learning. Following this hypothesis, we analyze norms of logit vectors at the optimal point under the assumption of universal approximation and explore…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: 19 pages, 16 figures

  18. arXiv:2006.07593  [pdf, other]

    cs.LG cs.NE stat.ML

    Optimal Transport Kernels for Sequential and Parallel Neural Architecture Search

    Authors: Vu Nguyen, Tam Le, Makoto Yamada, Michael A Osborne

    Abstract: Neural architecture search (NAS) automates the design of deep neural networks. One of the main challenges in searching complex and non-continuous architectures is to compare the similarity of networks that the conventional Euclidean metric may fail to capture. Optimal transport (OT) is resilient to such complex structure by considering the minimal cost for transporting a network into another. Howe…

    Submitted 10 June, 2021; v1 submitted 13 June, 2020; originally announced June 2020.

    Comments: 23 pages, camera ready ICML2021

  19. arXiv:2006.05553  [pdf, other]

    cs.LG stat.ME stat.ML

    Neural Methods for Point-wise Dependency Estimation

    Authors: Yao-Hung Hubert Tsai, Han Zhao, Makoto Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov

    Abstract: Since its inception, the neural estimation of mutual information (MI) has demonstrated the empirical success of modeling expected dependency between high-dimensional random variables. However, MI is an aggregate statistic and cannot be used to measure point-wise dependency between different events. In this work, instead of estimating the expected dependency, we focus on estimating point-wise depen…

    Submitted 14 October, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

  20. arXiv:2006.02703  [pdf, ps, other]

    cs.LG stat.ML

    Fast Unbalanced Optimal Transport on a Tree

    Authors: Ryoma Sato, Makoto Yamada, Hisashi Kashima

    Abstract: This study examines the time complexities of the unbalanced optimal transport problems from an algorithmic perspective for the first time. We reveal which problems in unbalanced optimal transport can/cannot be solved efficiently. Specifically, we prove that the Kantorovich Rubinstein distance and optimal partial transport in the Euclidean metric cannot be computed in strongly subquadratic time und…

    Submitted 7 January, 2021; v1 submitted 4 June, 2020; originally announced June 2020.

    Comments: Accepted to NeurIPS 2020

  21. arXiv:2005.12123  [pdf, other]

    stat.ML cs.LG

    Feature Robust Optimal Transport for High-dimensional Data

    Authors: Mathis Petrovich, Chao Liang, Ryoma Sato, Yanbin Liu, Yao-Hung Hubert Tsai, Linchao Zhu, Yi Yang, Ruslan Salakhutdinov, Makoto Yamada

    Abstract: Optimal transport is a machine learning problem with applications including distribution comparison, feature selection, and generative adversarial networks. In this paper, we propose feature-robust optimal transport (FROT) for high-dimensional data, which solves high-dimensional OT problems using feature selection to avoid the curse of dimensionality. Specifically, we find a transport plan with di…

    Submitted 29 September, 2020; v1 submitted 25 May, 2020; originally announced May 2020.

  22. arXiv:2003.11243  [pdf, other]

    cs.LG stat.ML

    Volumization as a Natural Generalization of Weight Decay

    Authors: Liu Ziyin, Zihao Wang, Makoto Yamada, Masahito Ueda

    Abstract: We propose a novel regularization method, called volumization, for neural networks. Inspired by physics, we define a physical volume for the weight parameters in neural networks, and we show that this method is an effective way of regularizing neural networks. Intuitively, this method interpolates between an $L_2$ and $L_\infty$ regularization. Therefore, weight decay and weight clipping…

    Submitted 1 April, 2020; v1 submitted 25 March, 2020; originally announced March 2020.

    Comments: 18 pages, 20 figures

  23. arXiv:2003.05747  [pdf, other]

    cs.LG stat.ML

    Fast local linear regression with anchor regularization

    Authors: Mathis Petrovich, Makoto Yamada

    Abstract: Regression is an important task in machine learning and data mining. It has several applications in various domains, including finance, biomedical, and computer vision. Recently, network Lasso, which estimates local models by making clusters using the network information, was proposed and its superior performance was demonstrated. In this study, we propose a simple yet effective local model traini…

    Submitted 21 February, 2020; originally announced March 2020.

  24. arXiv:2002.03155  [pdf, ps, other]

    cs.LG stat.ML

    Random Features Strengthen Graph Neural Networks

    Authors: Ryoma Sato, Makoto Yamada, Hisashi Kashima

    Abstract: Graph neural networks (GNNs) are powerful machine learning models for various graph learning tasks. Recently, the limitations of the expressive power of various GNN models have been revealed. For example, GNNs cannot distinguish some non-isomorphic graphs and they cannot learn efficient graph algorithms. In this paper, we demonstrate that GNNs become powerful just by adding a random feature to eac…

    Submitted 18 January, 2021; v1 submitted 8 February, 2020; originally announced February 2020.

    Comments: Accepted to SDM 2021

  25. arXiv:2002.01615  [pdf, ps, other]

    stat.ML cs.LG

    Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces

    Authors: Ryoma Sato, Marco Cuturi, Makoto Yamada, Hisashi Kashima

    Abstract: Comparing two probability measures supported on heterogeneous spaces is an increasingly important problem in machine learning. Such problems arise when comparing for instance two populations of biological cells, each described with its own set of features, or when looking at families of word embeddings trained across different corpora/languages. For such settings, the Gromov Wasserstein (GW) dista…

    Submitted 10 February, 2021; v1 submitted 4 February, 2020; originally announced February 2020.

  26. arXiv:2001.08322  [pdf, other]

    cs.LG stat.ML

    FsNet: Feature Selection Network on High-dimensional Biological Data

    Authors: Dinesh Singh, Héctor Climente-González, Mathis Petrovich, Eiryo Kawakami, Makoto Yamada

    Abstract: Biological data including gene expression data are generally high-dimensional and require efficient, generalizable, and scalable machine-learning methods to discover their complex nonlinear patterns. The recent advances in machine learning can be attributed to deep neural networks (DNNs), which excel in various tasks in terms of computer vision and natural language processing. However, standard DN…

    Submitted 17 December, 2020; v1 submitted 22 January, 2020; originally announced January 2020.

  27. arXiv:2001.06216  [pdf, other]

    cs.LG stat.ML

    GraphLIME: Local Interpretable Model Explanations for Graph Neural Networks

    Authors: Qiang Huang, Makoto Yamada, Yuan Tian, Dinesh Singh, Dawei Yin, Yi Chang

    Abstract: Graph structured data has wide applicability in various domains such as physics, chemistry, biology, computer vision, and social networks, to name a few. Recently, graph neural networks (GNN) were shown to be successful in effectively representing graph structured data because of their good performance and generalization ability. GNN is a deep learning based method that learns a node representatio…

    Submitted 27 September, 2020; v1 submitted 17 January, 2020; originally announced January 2020.

  28. arXiv:1910.12252  [pdf, other]

    cs.LG stat.ML

    Kernel Stein Tests for Multiple Model Comparison

    Authors: Jen Ning Lim, Makoto Yamada, Bernhard Schölkopf, Wittawat Jitkrittum

    Abstract: We address the problem of non-parametric multiple model comparison: given $l$ candidate models, decide whether each candidate is as good as the best one(s) or worse than it. We propose two statistical tests, each controlling a different notion of decision errors. The first test, building on the post selection inference framework, provably controls the number of best models that are wrongly declare…

    Submitted 27 October, 2019; originally announced October 2019.

    Comments: Accepted to NeurIPS 2019

  29. arXiv:1910.06134  [pdf, other]

    cs.LG stat.ML

    More Powerful Selective Kernel Tests for Feature Selection

    Authors: Jen Ning Lim, Makoto Yamada, Wittawat Jitkrittum, Yoshikazu Terada, Shigeyuki Matsui, Hidetoshi Shimodaira

    Abstract: Refining one's hypotheses in the light of data is a common scientific practice; however, the dependency on the data introduces selection bias and can lead to specious statistical analysis. An approach for addressing this is via conditioning on the selection procedure to account for how we have used the data to generate our hypotheses, and prevent information to be used again after selection. Many…

    Submitted 29 February, 2020; v1 submitted 14 October, 2019; originally announced October 2019.

    Comments: Accepted to AISTATS 2020

  30. arXiv:1910.04483  [pdf, other]

    stat.ML cs.LG

    Tree-Wasserstein Barycenter for Large-Scale Multilevel Clustering and Scalable Bayes

    Authors: Tam Le, Viet Huynh, Nhat Ho, Dinh Phung, Makoto Yamada

    Abstract: We study in this paper a variant of Wasserstein barycenter problem, which we refer to as tree-Wasserstein barycenter, by leveraging a specific class of ground metrics, namely tree metrics, for Wasserstein distance. Drawing on the tree structure, we propose an efficient algorithmic approach to solve the tree-Wasserstein barycenter and its variants. The proposed approach is not only fast for computa…

    Submitted 26 February, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

  31. arXiv:1910.04462  [pdf, other]

    stat.ML cs.LG

    Flow-based Alignment Approaches for Probability Measures in Different Spaces

    Authors: Tam Le, Nhat Ho, Makoto Yamada

    Abstract: Gromov-Wasserstein (GW) is a powerful tool to compare probability measures whose supports are in different metric spaces. GW suffers however from a computational drawback since it requires to solve a complex non-convex quadratic program. We consider in this work a specific family of cost metrics, namely tree metrics for a space of supports of each probability measure, and aim for developi…

    Submitted 17 June, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

  32. arXiv:1909.08830  [pdf, other]

    stat.ML cs.CV cs.LG

    Absum: Simple Regularization Method for Reducing Structural Sensitivity of Convolutional Neural Networks

    Authors: Sekitoshi Kanai, Yasutoshi Ida, Yasuhiro Fujiwara, Masanori Yamada, Shuichi Adachi

    Abstract: We propose Absum, which is a regularization method for improving adversarial robustness of convolutional neural networks (CNNs). Although CNNs can accurately recognize images, recent studies have shown that the convolution operations in CNNs commonly have structural sensitivity to specific noise composed of Fourier basis functions. By exploiting this sensitivity, they proposed a simple black-box a…

    Submitted 19 September, 2019; originally announced September 2019.

    Comments: 16 pages, 39 figures

  33. arXiv:1909.02373  [pdf, other]

    stat.ML cs.LG

    LSMI-Sinkhorn: Semi-supervised Mutual Information Estimation with Optimal Transport

    Authors: Yanbin Liu, Makoto Yamada, Yao-Hung Hubert Tsai, Tam Le, Ruslan Salakhutdinov, Yi Yang

    Abstract: Estimating mutual information is an important statistics and machine learning problem. To estimate the mutual information from data, a common practice is preparing a set of paired samples $\{(\mathbf{x}_i,\mathbf{y}_i)\}_{i=1}^n \stackrel{\mathrm{i.i.d.}}{\sim} p(\mathbf{x},\mathbf{y})$. However, in many situations, it is difficult to obtain a large number of data pairs. To address this problem, w…

    Submitted 27 June, 2021; v1 submitted 5 September, 2019; originally announced September 2019.

    Comments: ECML/PKDD 2021

  34. arXiv:1908.11775  [pdf, ps, other]

    cs.LG stat.ML

    Transformer Dissection: A Unified Understanding of Transformer's Attention via the Lens of Kernel

    Authors: Yao-Hung Hubert Tsai, Shaojie Bai, Makoto Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov

    Abstract: Transformer is a powerful architecture that achieves superior performance on various sequence learning tasks, including neural machine translation, language understanding, and sequence prediction. At the core of the Transformer is the attention mechanism, which concurrently processes all inputs in the streams. In this paper, we present a new formulation of attention via the lens of the kernel. To…

    Submitted 11 November, 2019; v1 submitted 30 August, 2019; originally announced August 2019.

    Comments: EMNLP 2019

  35. arXiv:1905.10261  [pdf, ps, other]

    cs.LG stat.ML

    Approximation Ratios of Graph Neural Networks for Combinatorial Problems

    Authors: Ryoma Sato, Makoto Yamada, Hisashi Kashima

    Abstract: In this paper, from a theoretical perspective, we study how powerful graph neural networks (GNNs) can be for learning approximation algorithms for combinatorial problems. To this end, we first establish a new class of GNNs that can solve a strictly wider variety of problems than existing GNNs. Then, we bridge the gap between GNN theory and the theory of distributed local algorithms. We theoretical…

    Submitted 8 November, 2019; v1 submitted 24 May, 2019; originally announced May 2019.

    Comments: Accepted to NeurIPS 2019

  36. arXiv:1903.10709  [pdf, other]

    stat.ML cs.LG

    Autoencoding Binary Classifiers for Supervised Anomaly Detection

    Authors: Yuki Yamanaka, Tomoharu Iwata, Hiroshi Takahashi, Masanori Yamada, Sekitoshi Kanai

    Abstract: We propose the Autoencoding Binary Classifiers (ABC), a novel supervised anomaly detector based on the Autoencoder (AE). There are two main approaches in anomaly detection: supervised and unsupervised. The supervised approach accurately detects the known anomalies included in training data, but it cannot detect the unknown anomalies. Meanwhile, the unsupervised approach can detect both known and u…

    Submitted 26 March, 2019; originally announced March 2019.

  37. arXiv:1903.09366  [pdf, other]

    cs.LG cs.AI cs.RO stat.AP stat.ML

    Macro Action Reinforcement Learning with Sequence Disentanglement using Variational Autoencoder

    Authors: Heecheol Kim, Masanori Yamada, Kosuke Miyoshi, Hiroshi Yamakawa

    Abstract: One problem in the application of reinforcement learning to real-world problems is the curse of dimensionality on the action space. Macro actions, a sequence of primitive actions, have been studied to diminish the dimensionality of the action space with regard to the time axis. However, previous studies relied on humans defining macro actions or assumed macro actions as repetitions of the same pri…

    Submitted 3 June, 2019; v1 submitted 22 March, 2019; originally announced March 2019.

    Comments: First and second authors equally contributed to this paper

  38. arXiv:1902.09722  [pdf, other]

    cs.LG stat.ML

    Topological Bayesian Optimization with Persistence Diagrams

    Authors: Tatsuya Shiraishi, Tam Le, Hisashi Kashima, Makoto Yamada

    Abstract: Finding an optimal parameter of a black-box function is important for searching stable material structures and finding optimal neural network structures, and Bayesian optimization algorithms are widely used for the purpose. However, most of existing Bayesian optimization algorithms can only handle vector data and cannot handle complex structured data. In this paper, we propose the topological Baye…

    Submitted 25 February, 2019; originally announced February 2019.

  39. arXiv:1902.09700  [pdf, ps, other]

    cs.LG stat.ML

    Learning to Sample Hard Instances for Graph Algorithms

    Authors: Ryoma Sato, Makoto Yamada, Hisashi Kashima

    Abstract: Hard instances, which require a long time for a specific algorithm to solve, help (1) analyze the algorithm for accelerating it and (2) build a good benchmark for evaluating the performance of algorithms. There exist several efforts for automatic generation of hard instances. For example, evolutionary algorithms have been utilized to generate hard instances. However, they generate only finite numb…

    Submitted 3 October, 2019; v1 submitted 25 February, 2019; originally announced February 2019.

    Comments: 16 pages, 4 figures, accepted by ACML 2019

  40. arXiv:1902.08341  [pdf, other]

    stat.ML cs.LG

    FAVAE: Sequence Disentanglement using Information Bottleneck Principle

    Authors: Masanori Yamada, Heecheol Kim, Kosuke Miyoshi, Hiroshi Yamakawa

    Abstract: We propose the factorized action variational autoencoder (FAVAE), a state-of-the-art generative model for learning disentangled and interpretable representations from sequential data via the information bottleneck without supervision. The purpose of disentangled representation learning is to obtain interpretable and transferable representations from data. We focused on the disentangled representat…

    Submitted 30 May, 2019; v1 submitted 21 February, 2019; originally announced February 2019.

  41. arXiv:1902.00342  [pdf, other]

    stat.ML cs.LG

    Tree-Sliced Variants of Wasserstein Distances

    Authors: Tam Le, Makoto Yamada, Kenji Fukumizu, Marco Cuturi

    Abstract: Optimal transport (OT) theory defines a powerful set of tools to compare probability distributions. OT suffers however from a few drawbacks, computational and statistical, which have encouraged the proposal of several regularized variants of OT in the recent literature, one of the most notable being the sliced formulation, which exploits the closed-form formula between univariate distri…

    Submitted 28 October, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: Camera-ready for NeurIPS 2019

  42. arXiv:1901.07868  [pdf, ps, other]

    cs.LG stat.ML

    Constant Time Graph Neural Networks

    Authors: Ryoma Sato, Makoto Yamada, Hisashi Kashima

    Abstract: The recent advancements in graph neural networks (GNNs) have led to state-of-the-art performances in various applications, including chemo-informatics, question-answering systems, and recommender systems. However, scaling up these methods to huge graphs, such as social networks and Web graphs, remains a challenge. In particular, the existing methods for accelerating GNNs either are not theoretical…

    Submitted 29 March, 2022; v1 submitted 23 January, 2019; originally announced January 2019.

    Comments: TKDD 2022

    Journal ref: ACM Trans. Knowl. Discov. Data. 16, 5, Article 92 (March 2022)

  43. arXiv:1809.05284  [pdf, other]

    stat.ML cs.LG

    Variational Autoencoder with Implicit Optimal Priors

    Authors: Hiroshi Takahashi, Tomoharu Iwata, Yuki Yamanaka, Masanori Yamada, Satoshi Yagi

    Abstract: The variational autoencoder (VAE) is a powerful generative model that can estimate the probability of a data point by using latent variables. In the VAE, the posterior of the latent variable given the data point is regularized by the prior of the latent variable using Kullback Leibler (KL) divergence. Although the standard Gaussian distribution is usually used for the prior, this simple prior incu…

    Submitted 26 December, 2019; v1 submitted 14 September, 2018; originally announced September 2018.

    Comments: 9 pages, 9 figures, accepted at AAAI 2019. Code is available at https://github.com/takahashihiroshi/vae_iop

  44. arXiv:1802.06226  [pdf, other]

    stat.ML

    Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator

    Authors: Makoto Yamada, Denny Wu, Yao-Hung Hubert Tsai, Ichiro Takeuchi, Ruslan Salakhutdinov, Kenji Fukumizu

    Abstract: Measuring divergence between two distributions is essential in machine learning and statistics and has various applications including binary classification, change point detection, and two-sample test. Furthermore, in the era of big data, designing divergence measure that is interpretable and can handle high-dimensional and complex data becomes extremely important. In the paper, we propose a post…

    Submitted 17 February, 2018; originally announced February 2018.

  45. arXiv:1802.05411  [pdf, ps, other]

    cs.LG stat.ML

    Selecting the Best in GANs Family: a Post Selection Inference Framework

    Authors: Yao-Hung Hubert Tsai, Makoto Yamada, Denny Wu, Ruslan Salakhutdinov, Ichiro Takeuchi, Kenji Fukumizu

    Abstract: "Which Generative Adversarial Networks (GANs) generates the most plausible images?" has been a frequently asked question among researchers. To address this problem, we first propose an incomplete U-statistics estimate of maximum mean discrepancy $\mathrm{MMD}_{inc}$ to measure the distribution discrepancy between generated and real images. $\mathrm{MMD}_{inc}$ enjoys the advantages of asymp…

    Submitted 23 June, 2018; v1 submitted 15 February, 2018; originally announced February 2018.

  46. arXiv:1802.05408  [pdf, ps, other]

    cs.IT cs.LG stat.ML

    "Dependency Bottleneck" in Auto-encoding Architectures: an Empirical Study

    Authors: Denny Wu, Yixiu Zhao, Yao-Hung Hubert Tsai, Makoto Yamada, Ruslan Salakhutdinov

    Abstract: Recent works investigated the generalization properties in deep neural networks (DNNs) by studying the Information Bottleneck in DNNs. However, the measurement of the mutual information (MI) is often inaccurate due to the density estimation. To address this issue, we propose to measure the dependency instead of MI between layers in DNNs. Specifically, we propose to use Hilbert-Schmidt Independen…

    Submitted 15 February, 2018; originally announced February 2018.

  47. arXiv:1802.03569  [pdf, other]

    stat.ML cs.LG math.AT

    Persistence Fisher Kernel: A Riemannian Manifold Kernel for Persistence Diagrams

    Authors: Tam Le, Makoto Yamada

    Abstract: Algebraic topology methods have recently played an important role for statistical analysis with complicated geometric structured data such as shapes, linked twist maps, and material data. Among them, persistent homology is a well-known tool to extract robust topological features, and outputs as persistence diagrams (PDs). However, PDs are point multi-sets which can not be used in…

    Submitted 26 October, 2018; v1 submitted 10 February, 2018; originally announced February 2018.

    Comments: to appear at the 32nd Conference on Neural Information Processing Systems (NIPS), Canada, 2018. (Camera-ready version)

  48. arXiv:1711.06047  [pdf, other]

    cs.CV stat.ML

    Deep Matching Autoencoders

    Authors: Tanmoy Mukherjee, Makoto Yamada, Timothy M. Hospedales

    Abstract: Increasingly many real world tasks involve data in multiple modalities or views. This has motivated the development of many effective algorithms for learning a common latent space to relate multiple domains. However, most existing cross-view learning algorithms assume access to paired data for training. Their applicability is thus limited as the paired data assumption is often violated in practice…

    Submitted 16 November, 2017; originally announced November 2017.

    Comments: 10 pages

  49. arXiv:1705.05197  [pdf, ps, other]

    stat.ML

    Convex Coupled Matrix and Tensor Completion

    Authors: Kishan Wimalawarne, Makoto Yamada, Hiroshi Mamitsuka

    Abstract: We propose a set of convex low rank inducing norms for a coupled matrices and tensors (hereafter coupled tensors), which shares information between matrices and tensors through common modes. More specifically, we propose a mixture of the overlapped trace norm and the latent norms with the matrix trace norm, and then, we propose a new completion algorithm based on the proposed norms. A key advantag…

    Submitted 14 June, 2018; v1 submitted 15 May, 2017; originally announced May 2017.

  50. arXiv:1702.06354  [pdf, other]

    stat.ML cs.LG

    Interpreting Outliers: Localized Logistic Regression for Density Ratio Estimation

    Authors: Makoto Yamada, Song Liu, Samuel Kaski

    Abstract: We propose an inlier-based outlier detection method capable of both identifying the outliers and explaining why they are outliers, by identifying the outlier-specific features. Specifically, we employ an inlier-based outlier detection criterion, which uses the ratio of inlier and test probability densities as a measure of plausibility of being an outlier. For estimating the density ratio function,…

    Submitted 21 February, 2017; originally announced February 2017.