
Showing 1–16 of 16 results for author: Tanaka, T

  1. arXiv:2406.05964  [pdf, other]

    stat.ML cs.LG

    Distributionally Robust Safe Sample Screening

    Authors: Hiroyuki Hanada, Aoyama Tatsuya, Akahane Satoshi, Tomonari Tanaka, Yoshito Okura, Yu Inatsu, Noriaki Hashimoto, Shion Takeno, Taro Murayama, Hanju Lee, Shinya Kojima, Ichiro Takeuchi

    Abstract: In this study, we propose a machine learning method called Distributionally Robust Safe Sample Screening (DRSSS). DRSSS aims to identify unnecessary training samples, even when the distribution of the training samples changes in the future. To achieve this, we effectively combine the distributionally robust (DR) paradigm, which aims to enhance model robustness against variations in data distributi…

    Submitted 9 June, 2024; originally announced June 2024.

  2. arXiv:2404.16328  [pdf, other]

    stat.ML cs.LG

    Distributionally Robust Safe Screening

    Authors: Hiroyuki Hanada, Satoshi Akahane, Tatsuya Aoyama, Tomonari Tanaka, Yoshito Okura, Yu Inatsu, Noriaki Hashimoto, Taro Murayama, Lee Hanju, Shinya Kojima, Ichiro Takeuchi

    Abstract: In this study, we propose a method, Distributionally Robust Safe Screening (DRSS), for identifying unnecessary samples and features within a DR covariate shift setting. This method effectively combines DR learning, a paradigm aimed at enhancing model robustness against variations in data distribution, with safe screening (SS), a sparse optimization technique designed to identify irrelevant samples…

    Submitted 25 April, 2024; originally announced April 2024.

  3. arXiv:2401.07711  [pdf, other]

    cs.LG stat.ML

    Efficient Nonparametric Tensor Decomposition for Binary and Count Data

    Authors: Zerui Tao, Toshihisa Tanaka, Qibin Zhao

    Abstract: In numerous applications, binary reactions or event counts are observed and stored within high-order tensors. Tensor decompositions (TDs) serve as a powerful tool to handle such high-dimensional and sparse data. However, many traditional TDs are explicitly or implicitly designed based on the Gaussian distribution, which is unsuitable for discrete data. Moreover, most TDs rely on predefined multi-l…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: AAAI-24

  4. arXiv:2305.08501  [pdf, other]

    stat.ML cs.LG

    Label Smoothing is Robustification against Model Misspecification

    Authors: Ryoya Yamasaki, Toshiyuki Tanaka

    Abstract: Label smoothing (LS) adopts smoothed targets in classification tasks. For example, in binary classification, instead of the one-hot target $(1,0)^\top$ used in conventional logistic regression (LR), LR with LS (LSLR) uses the smoothed target $(1-\frac{\alpha}{2},\frac{\alpha}{2})^\top$ with a smoothing level $\alpha\in(0,1)$, which causes squeezing of values of the logit. Apart from the common regularization-based i…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: 12 pages, 5 figures, preprint version

  5. arXiv:2305.08463  [pdf, other]

    stat.ML cs.LG

    Convergence Analysis of Mean Shift

    Authors: Ryoya Yamasaki, Toshiyuki Tanaka

    Abstract: The mean shift (MS) algorithm seeks a mode of the kernel density estimate (KDE). This study presents a convergence guarantee of the mode estimate sequence generated by the MS algorithm and an evaluation of the convergence rate, under fairly mild conditions, with the help of the argument concerning the Łojasiewicz inequality. Our findings extend existing ones covering analytic kernels and the Epane…

    Submitted 7 November, 2023; v1 submitted 15 May, 2023; originally announced May 2023.

    Comments: 18 pages, 2 figures, preprint version

  6. arXiv:2304.10046  [pdf, other]

    stat.ML cs.LG

    Optimal Kernel for Kernel-Based Modal Statistical Methods

    Authors: Ryoya Yamasaki, Toshiyuki Tanaka

    Abstract: Kernel-based modal statistical methods include mode estimation, regression, and clustering. The estimation accuracy of these methods depends on the kernel used as well as the bandwidth. We study the effect of the choice of kernel function on the estimation accuracy of these methods. In particular, we theoretically show a (multivariate) optimal kernel that minimizes its analytically-obtained asympto…

    Submitted 19 April, 2023; originally announced April 2023.

    Comments: 51 pages, 4 figures

  7. arXiv:2303.08068  [pdf, other]

    cs.CV cs.LG stat.ML

    Style Feature Extraction Using Contrastive Conditioned Variational Autoencoders with Mutual Information Constraints

    Authors: Suguru Yasutomi, Toshihisa Tanaka

    Abstract: Extracting fine-grained features such as styles from unlabeled data is crucial for data analysis. Unsupervised methods such as variational autoencoders (VAEs) can extract styles that are usually mixed with other features. Conditional VAEs (CVAEs) can isolate styles using class labels; however, there are no established methods to extract only styles using unlabeled data. In this paper, we propose a…

    Submitted 16 March, 2023; v1 submitted 3 February, 2023; originally announced March 2023.

  8. arXiv:2206.12141  [pdf, other]

    stat.ML cs.LG

    Aggregated Multi-output Gaussian Processes with Knowledge Transfer Across Domains

    Authors: Yusuke Tanaka, Toshiyuki Tanaka, Tomoharu Iwata, Takeshi Kurashima, Maya Okawa, Yasunori Akagi, Hiroyuki Toda

    Abstract: Aggregate data often appear in various fields such as socio-economics and public security. The aggregate data are associated not with points but with supports (e.g., spatial regions in a city). Since the supports may have various granularities depending on attributes (e.g., poverty rate and crime rate), modeling such data is not straightforward. This article offers a multi-output Gaussian process…

    Submitted 24 June, 2022; originally announced June 2022.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  9. arXiv:2004.12741  [pdf, other]

    stat.AP

    Assessment of design and analysis frameworks for on-farm experimentation through a simulation study of wheat yield in Japan

    Authors: Takashi S. T. Tanaka

    Abstract: On-farm experiments can provide farmers with information on more efficient crop management in their own fields. Developments in precision agricultural technologies, such as yield monitoring and variable-rate application technology, allow farmers to implement on-farm experiments. Research frameworks including the experimental design and the statistical analysis method strongly influence the precis…

    Submitted 15 March, 2021; v1 submitted 27 April, 2020; originally announced April 2020.

    Comments: 12 pages, 4 figures, Accepted by Precision Agriculture

  10. arXiv:2004.08066  [pdf, other]

    cs.CV cs.LG eess.IV stat.ML

    YuruGAN: Yuru-Chara Mascot Generator Using Generative Adversarial Networks With Clustering Small Dataset

    Authors: Yuki Hagiwara, Toshihisa Tanaka

    Abstract: A yuru-chara is a mascot character created by local governments and companies for publicizing information on areas and products. Because creating a yuru-chara incurs various costs, machine learning techniques such as generative adversarial networks (GANs) are expected to be useful. In recent years, it has been reported that the use of class conditions in a dataset for GAN training stab…

    Submitted 17 April, 2020; originally announced April 2020.

    Comments: conference

  11. arXiv:2001.11168  [pdf, other]

    stat.ML cs.LG math.ST stat.CO

    Kernel Selection for Modal Linear Regression: Optimal Kernel and IRLS Algorithm

    Authors: Ryoya Yamasaki, Toshiyuki Tanaka

    Abstract: Modal linear regression (MLR) is a method for obtaining a conditional mode predictor as a linear model. We study kernel selection for MLR from two perspectives: "which kernel achieves smaller error?" and "which kernel is computationally efficient?". First, we show that a Biweight kernel is optimal in the sense of minimizing an asymptotic mean squared error of a resulting MLR parameter. This result…

    Submitted 29 January, 2020; originally announced January 2020.

    Comments: 7 pages, 4 figures, published in the proceedings of the 18th IEEE International Conference on Machine Learning and Applications - ICMLA 2019

    MSC Class: 62G30 ACM Class: F.2.1

  12. arXiv:1907.08350  [pdf, other]

    stat.ML cs.LG

    Spatially Aggregated Gaussian Processes with Multivariate Areal Outputs

    Authors: Yusuke Tanaka, Toshiyuki Tanaka, Tomoharu Iwata, Takeshi Kurashima, Maya Okawa, Yasunori Akagi, Hiroyuki Toda

    Abstract: We propose a probabilistic model for inferring the multivariate function from multiple areal data sets with various granularities. Here, the areal data are observed not at location points but at regions. Existing regression-based models can only utilize the sufficiently fine-grained auxiliary data sets on the same domain (e.g., a city). With the proposed model, the functions for respective areal d…

    Submitted 7 January, 2020; v1 submitted 18 July, 2019; originally announced July 2019.

    Comments: Accepted at NeurIPS 2019

  13. arXiv:1812.08295  [pdf]

    cs.LG stat.ML

    Efficient logic architecture in training gradient boosting decision tree for high-performance and edge computing

    Authors: Takuya Tanaka, Ryosuke Kasahara, Daishiro Kobayashi

    Abstract: This study proposes a logic architecture for high-speed, power-efficient training of a gradient boosting decision tree model for binary classification. We implemented the proposed logic architecture on an FPGA and compared training time and power efficiency with three general GBDT software libraries using CPU and GPU. The training speed of the logic architecture on the FPGA was 26-259 time…

    Submitted 19 December, 2018; originally announced December 2018.

  14. arXiv:1809.07952  [pdf, other]

    stat.ML cs.LG

    Refining Coarse-grained Spatial Data using Auxiliary Spatial Data Sets with Various Granularities

    Authors: Yusuke Tanaka, Tomoharu Iwata, Toshiyuki Tanaka, Takeshi Kurashima, Maya Okawa, Hiroyuki Toda

    Abstract: We propose a probabilistic model for refining coarse-grained spatial data by utilizing auxiliary spatial data sets. Existing methods require that the spatial granularities of the auxiliary data sets are the same as the desired granularity of target data. The proposed model can effectively make use of auxiliary data sets with various granularities by hierarchically incorporating Gaussian processes.…

    Submitted 17 July, 2019; v1 submitted 21 September, 2018; originally announced September 2018.

    Comments: Appears in Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019)

  15. arXiv:1804.09348  [pdf, other]

    cs.LG eess.SP stat.ML

    Generalized Gaussian Kernel Adaptive Filtering

    Authors: Tomoya Wada, Kosuke Fukumori, Toshihisa Tanaka, Simone Fiori

    Abstract: The present paper proposes generalized Gaussian kernel adaptive filtering, where the kernel parameters are adaptive and data-driven. The Gaussian kernel is parametrized by a center vector and a symmetric positive definite (SPD) precision matrix, which is regarded as a generalization of the scalar width parameter. These parameters are adaptively updated on the basis of a proposed least-square-type…

    Submitted 25 April, 2018; originally announced April 2018.

  16. arXiv:1203.3497  [pdf]

    cs.LG stat.ML

    Parametric Return Density Estimation for Reinforcement Learning

    Authors: Tetsuro Morimura, Masashi Sugiyama, Hisashi Kashima, Hirotaka Hachiya, Toshiyuki Tanaka

    Abstract: Most conventional Reinforcement Learning (RL) algorithms aim to optimize decision-making rules in terms of the expected returns. However, especially for risk management purposes, other risk-sensitive criteria such as the value-at-risk or the expected shortfall are sometimes preferred in real applications. Here, we describe a parametric method for estimating the density of the returns, which allows us…

    Submitted 15 March, 2012; originally announced March 2012.

    Comments: Appears in Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence (UAI2010)

    Report number: UAI-P-2010-PG-368-375