Showing 1–12 of 12 results for author: Yoshida, T

  1. arXiv:2306.13561  [pdf, other]

    stat.ML cs.LG

    Efficient Model Selection for Predictive Pattern Mining Model by Safe Pattern Pruning

    Authors: Takumi Yoshida, Hiroyuki Hanada, Kazuya Nakagawa, Kouichi Taji, Koji Tsuda, Ichiro Takeuchi

    Abstract: Predictive pattern mining is an approach used to construct prediction models when the input is represented by structured data, such as sets, graphs, and sequences. The main idea behind predictive pattern mining is to build a prediction model by considering substructures, such as subsets, subgraphs, and subsequences (referred to as patterns), present in the structured data as features of the model.…

    Submitted 23 June, 2023; originally announced June 2023.

  2. arXiv:2305.08443  [pdf]

    stat.ME

    A linearization for stable and fast geographically weighted Poisson regression

    Authors: Daisuke Murakami, Narumasa Tsutsumida, Takahiro Yoshida, Tomoki Nakaya, Binbin Lu, Paul Harris

    Abstract: Although geographically weighted Poisson regression (GWPR) is a popular regression for spatially indexed count data, its development is relatively limited compared to that found for linear geographically weighted regression (GWR), where many extensions (e.g., multiscale GWR, scalable GWR) have been proposed. The weak development of GWPR can be attributed to the computational cost and identificatio…

    Submitted 15 May, 2023; originally announced May 2023.

  3. arXiv:2305.05106  [pdf, ps, other]

    stat.ME

    Unit-level mixed effects models for conditional extremes

    Authors: Koki Momoki, Takuma Yoshida

    Abstract: Extreme value theory (EVT) provides an elegant mathematical tool for statistically analyzing rare events. When data are collected from multiple population subgroups, the scientific interest of researchers would generally be to improve the estimates obtained directly from each subgroup because some subgroups may have less data available for extreme value analysis. To achieve this, we incorporate th…

    Submitted 30 September, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

    Comments: 48 pages

  4. arXiv:2302.08099  [pdf, other]

    stat.AP

    Bayesian Active Questionnaire Design for Cause-of-Death Assignment Using Verbal Autopsies

    Authors: Toshiya Yoshida, Trinity Shuxian Fan, Tyler McCormick, Zhenke Wu, Zehang Richard Li

    Abstract: Only about one-third of the deaths worldwide are assigned a medically-certified cause, and understanding the causes of deaths occurring outside of medical facilities is logistically and financially challenging. Verbal autopsy (VA) is a routinely used tool to collect information on cause of death in such settings. VA is a survey-based method where a structured questionnaire is conducted to family m…

    Submitted 27 April, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: Accepted at CHIL 2023

  5. arXiv:2107.12539  [pdf]

    stat.AP

    Spatial prediction of apartment rent using regression-based and machine learning-based approaches with a large dataset

    Authors: Takahiro Yoshida, Hajime Seya

    Abstract: Employing a large dataset (at most, the order of n = 10^6), this study attempts to enhance the literature on the comparison between regression and machine learning (ML)-based rent price prediction models by adding new empirical evidence and considering the spatial dependence of the observations. The regression-based approach incorporates the nearest neighbor Gaussian processes (NNGP) model, enabling…

    Submitted 26 July, 2021; originally announced July 2021.

  6. arXiv:2101.03491  [pdf]

    stat.AP stat.CO stat.OT

    gwpcorMapper: an interactive mapping tool for exploring geographically weighted correlation and partial correlation in high-dimensional geospatial datasets

    Authors: Joseph Emile Honour Percival, Narumasa Tsutsumida, Daisuke Murakami, Takahiro Yoshida, Tomoki Nakaya

    Abstract: Exploratory spatial data analysis (ESDA) plays a key role in research that includes geographic data. In ESDA, analysts often want to be able to visualize observations and local relationships on a map. However, software dedicated to visualizing local spatial relations between multiple variables in high dimensional datasets remains undeveloped. This paper introduces gwpcorMapper, a newly developed…

    Submitted 8 May, 2022; v1 submitted 10 January, 2021; originally announced January 2021.

    Comments: 18 pages, 8 figures, 2 tables

  7. arXiv:2002.10855  [pdf, other]

    stat.ML cs.LG

    Gaussian Hierarchical Latent Dirichlet Allocation: Bringing Polysemy Back

    Authors: Takahiro Yoshida, Ryohei Hisano, Takaaki Ohnishi

    Abstract: Topic models are widely used to discover the latent representation of a set of documents. The two canonical models are latent Dirichlet allocation, and Gaussian latent Dirichlet allocation, where the former uses multinomial distributions over words, and the latter uses multivariate Gaussian distributions over pre-trained word embedding vectors as the latent topic representations, respectively. Com…

    Submitted 7 June, 2023; v1 submitted 25 February, 2020; originally announced February 2020.

  8. Distance Metric Learning for Graph Structured Data

    Authors: Tomoki Yoshida, Ichiro Takeuchi, Masayuki Karasuyama

    Abstract: Graphs are versatile tools for representing structured data. As a result, a variety of machine learning methods have been studied for graph data analysis. Although many such learning methods depend on the measurement of differences between input graphs, defining an appropriate distance metric for graphs remains a controversial issue. Hence, we propose a supervised distance metric learning method f…

    Submitted 17 June, 2021; v1 submitted 3 February, 2020; originally announced February 2020.

    Comments: 38 pages, 11 figures. This is a pre-print of an article published in Machine Learning Journal. The final authenticated version is available online at: https://doi.org/10.1007/s10994-021-06009-3

  9. arXiv:1905.00266  [pdf]

    stat.ME

    Scalable GWR: A linear-time algorithm for large-scale geographically weighted regression with polynomial kernels

    Authors: Daisuke Murakami, Narumasa Tsutsumida, Takahiro Yoshida, Tomoki Nakaya, Binbin Lu

    Abstract: Although a number of studies have developed fast geographically weighted regression (GWR) algorithms for large samples, none of them has achieved linear-time estimation, which is considered a requisite for big data analysis in machine learning, geostatistics, and related domains. Against this backdrop, this study proposes a scalable GWR (ScaGWR) for large datasets. The key improvement is the calib…

    Submitted 23 April, 2020; v1 submitted 1 May, 2019; originally announced May 2019.

  10. arXiv:1810.00210  [pdf]

    stat.AP

    Which country epitomizes the world? A study from the perspective of demographic composition

    Authors: Takahiro Yoshida, Rim Er-Rbib, Morito Tsutsumi

    Abstract: Demographic indicators are an essential element in considering various problems in the social economy, such as predicting economic fluctuations and establishing policies. The literature widely discusses the growth of the world population or issues pertaining to its aging, but has given little to no attention to population structures and transition patterns. In this article, we take advantage of th…

    Submitted 29 September, 2018; originally announced October 2018.

  11. arXiv:1802.03923  [pdf, other]

    stat.ML

    Safe Triplet Screening for Distance Metric Learning

    Authors: Tomoki Yoshida, Ichiro Takeuchi, Masayuki Karasuyama

    Abstract: We study safe screening for metric learning. Distance metric learning can optimize a metric over a set of triplets, each one of which is defined by a pair of same class instances and an instance in a different class. However, the number of possible triplets is quite huge even for a small dataset. Our safe triplet screening identifies triplets which can be safely removed from the optimization probl…

    Submitted 5 October, 2018; v1 submitted 12 February, 2018; originally announced February 2018.

    Comments: 36 pages, 12 figures

  12. arXiv:1606.06885  [pdf]

    stat.ME

    A Moran coefficient-based mixed effects approach to investigate spatially varying relationships

    Authors: Daisuke Murakami, Takahiro Yoshida, Hajime Seya, Daniel A. Griffith, Yoshiki Yamagata

    Abstract: This study develops a spatially varying coefficient model by extending the random effects eigenvector spatial filtering model. The developed model has the following properties: its coefficients are interpretable in terms of the Moran coefficient; each of its coefficients can have a different degree of spatial smoothness; and it yields a variant of a Bayesian spatially varying coefficient model. Al…

    Submitted 10 August, 2016; v1 submitted 22 June, 2016; originally announced June 2016.