Search | arXiv e-print repository

Breaking the curse of dimensionality with Isolation Kernel

Authors: Kai Ming Ting, Takashi Washio, Ye Zhu, Yang Xu

Abstract: The curse of dimensionality has been studied in different aspects. However, breaking the curse has been elusive. We show for the first time that it is possible to break the curse using the recently introduced Isolation Kernel. We show that only Isolation Kernel performs consistently well in indexed search, spectral & density peaks clustering, SVM classification and t-SNE visualization in both low and high dimensions, compared with distance, Gaussian and linear kernels. This is also supported by our theoretical analyses that Isolation Kernel is the only kernel that has the provable ability to break the curse, compared with existing metric-based Lipschitz continuous kernels.

Submitted 29 September, 2021; originally announced September 2021.

arXiv:2009.12196 [pdf, other]

Isolation Distributional Kernel: A New Tool for Point & Group Anomaly Detection

Authors: Kai Ming Ting, Bi-Cun Xu, Takashi Washio, Zhi-Hua Zhou

Abstract: We introduce Isolation Distributional Kernel as a new way to measure the similarity between two distributions. Existing approaches based on kernel mean embedding, which convert a point kernel to a distributional kernel, have two key issues: the point kernel employed has a feature map with intractable dimensionality; and it is data independent. This paper shows that Isolation Distributional Kernel (IDK), which is based on a data dependent point kernel, addresses both key issues. We demonstrate IDK's efficacy and efficiency as a new tool for kernel based anomaly detection for both point and group anomalies. Without explicit learning, using IDK alone outperforms existing kernel based point anomaly detector OCSVM and other kernel mean embedding methods that rely on Gaussian kernel. For group anomaly detection, we introduce an IDK based detector called IDK$^2$. It reformulates the problem of group anomaly detection in input space into the problem of point anomaly detection in Hilbert space, without the need for learning. IDK$^2$ runs orders of magnitude faster than group anomaly detector OCSMM. We reveal for the first time that an effective kernel based anomaly detector based on kernel mean embedding must employ a characteristic kernel which is data dependent.

Submitted 24 September, 2020; originally announced September 2020.

Comments: 14 pages

arXiv:1907.01104 [pdf, other]

Isolation Kernel: The X Factor in Efficient and Effective Large Scale Online Kernel Learning

Authors: Kai Ming Ting, Jonathan R. Wells, Takashi Washio

Abstract: Large scale online kernel learning aims to build an efficient and scalable kernel-based predictive model incrementally from a sequence of potentially infinite data points. A current key approach focuses on ways to produce an approximate finite-dimensional feature map, assuming that the kernel used has a feature map with intractable dimensionality---an assumption traditionally held in kernel-based methods. While this approach can deal with large scale datasets efficiently, this outcome is achieved by compromising predictive accuracy because of the approximation. We offer an alternative approach which overrides the assumption and puts the kernel used at the heart of the approach. It focuses on creating an exact, sparse and finite-dimensional feature map of a kernel called Isolation Kernel. Using this new approach, to achieve the above aim of large scale online kernel learning becomes extremely simple---simply use Isolation Kernel instead of a kernel having a feature map with intractable dimensionality. We show that, using Isolation Kernel, large scale online kernel learning can be achieved efficiently without sacrificing accuracy.

Submitted 24 September, 2019; v1 submitted 1 July, 2019; originally announced July 2019.

Comments: Textural updates. Restructured section 8.4 including additional experimental results

arXiv:1902.03402 [pdf, ps, other]

A new simple and effective measure for bag-of-word inter-document similarity measurement

Authors: Sunil Aryal, Kai Ming Ting, Takashi Washio, Gholamreza Haffari

Abstract: To measure the similarity of two documents in the bag-of-words (BoW) vector representation, different term weighting schemes are used to improve the performance of cosine similarity---the most widely used inter-document similarity measure in text mining. In this paper, we identify the shortcomings of the underlying assumptions of term weighting in the inter-document similarity measurement task; and provide a more fit-to-the-purpose alternative. Based on this new assumption, we introduce a new simple but effective similarity measure which does not require explicit term weighting. The proposed measure employs a more nuanced probabilistic approach than those used in term weighting to measure the similarity of two documents w.r.t each term occurring in the two documents. Our empirical comparison with the existing similarity measures using different term weighting schemes shows that the new measure produces (i) better results in the binary BoW representation; and (ii) competitive and more consistent results in the term-frequency-based BoW representation.

Submitted 9 February, 2019; originally announced February 2019.

arXiv:1812.05193 [pdf]

doi 10.1038/s41598-019-46164-1

Free-hand gas identification based on transfer function ratios without gas flow control

Authors: Gaku Imamura, Kota Shiba, Genki Yoshikawa, Takashi Washio

Abstract: Gas identification is one of the most important functions of gas sensor systems. To identify gas species from sensing signals, however, gas input patterns (e.g. the gas flow sequence) must be controlled or monitored precisely with additional instruments such as pumps or mass flow controllers; otherwise, effective signal features for analysis are difficult to extract. Toward a compact and easy-to-use gas sensor system that can identify gas species, it is necessary to overcome such restrictions on gas input patterns. Here we develop a novel gas identification protocol that is applicable to arbitrary gas input patterns without controlling or monitoring any gas flow. By combining the protocol with newly developed MEMS-based sensors (i.e. Membrane-type Surface stress Sensors (MSS)), we have realized gas identification with free-hand measurement, in which one can simply hold a small sensor chip near samples. From sensing signals obtained through the free-hand measurement, we have developed machine learning models that can identify not only solvent vapors but also odors of spices and herbs with high accuracies. Since no bulky gas flow control units are required, this protocol will expand the applicability of gas sensors to portable electronics and wearable devices, leading to practical artificial olfaction.

Submitted 12 December, 2018; originally announced December 2018.

Comments: 19 pages, 8 figures, 3 tables

arXiv:1812.03395 [pdf, other]

Learning Graph Representation via Formal Concept Analysis

Authors: Yuka Yoneda, Mahito Sugiyama, Takashi Washio

Abstract: We present a novel method that can learn a graph representation from multivariate data. In our representation, each node represents a cluster of data points and each edge represents the subset-superset relationship between clusters, which can be mutually overlapped. The key to our method is to use formal concept analysis (FCA), which can extract hierarchical relationships between clusters based on the algebraic closedness property. We empirically show that our method can effectively extract hierarchical structures of clusters compared to the baseline method.

Submitted 8 December, 2018; originally announced December 2018.

Comments: 5 pages, 2 figures, Relational Representation Learning Workshop (NeurIPS 2018)

arXiv:1802.06698 [pdf, other]

doi 10.7717/peerj-cs.169

Analysis of cause-effect inference by comparing regression errors

Authors: Patrick Blöbaum, Dominik Janzing, Takashi Washio, Shohei Shimizu, Bernhard Schölkopf

Abstract: We address the problem of inferring the causal direction between two variables by comparing the least-squares errors of the predictions in both possible directions. Under the assumption of an independence between the function relating cause and effect, the conditional noise distribution, and the distribution of the cause, we show that the errors are smaller in causal direction if both variables are equally scaled and the causal relation is close to deterministic. Based on this, we provide an easily applicable algorithm that only requires a regression in both possible causal directions and a comparison of the errors. The performance of the algorithm is compared with various related causal inference methods in different artificial and real-world data sets.

Submitted 24 January, 2019; v1 submitted 19 February, 2018; originally announced February 2018.

Comments: This is an extended version of the AISTATS 2018 paper

Journal ref: PeerJ, 2019
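
The procedure summarized in this abstract (scale both variables equally, regress in each direction, compare the least-squares errors) can be illustrated with a minimal sketch. The choices below are assumptions made for illustration and are not taken from the listing: min-max rescaling as the "equal scaling", a cubic polynomial fitted with `numpy.polyfit` as the regressor, and the hypothetical helper names `rescale`, `regression_mse` and `infer_direction`. The paper's own regression model and scaling may differ.

```python
import numpy as np


def rescale(v):
    """Min-max rescale to [0, 1] so both variables share one scale (assumed choice)."""
    v = np.asarray(v, dtype=float)
    return (v - v.min()) / (v.max() - v.min())


def regression_mse(a, b, degree=3):
    """Least-squares error of predicting b from a with a polynomial fit (assumed regressor)."""
    coeffs = np.polyfit(a, b, degree)
    return float(np.mean((np.polyval(coeffs, a) - b) ** 2))


def infer_direction(x, y, degree=3):
    """Return the direction with the smaller regression error, plus both errors."""
    xs, ys = rescale(x), rescale(y)
    err_xy = regression_mse(xs, ys, degree)  # predict y from x
    err_yx = regression_mse(ys, xs, degree)  # predict x from y
    direction = "x->y" if err_xy < err_yx else "y->x"
    return direction, err_xy, err_yx


# Synthetic, nearly deterministic example in which x causes y.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 500)
y = x ** 3 + 0.01 * rng.normal(size=500)

direction, err_xy, err_yx = infer_direction(x, y)
print(direction, err_xy, err_yx)
```

On this synthetic data the error in the x-to-y direction should be the smaller one, in line with the abstract's claim for equally scaled, close-to-deterministic relations; real data may not satisfy those assumptions.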

arXiv:1610.03263 [pdf, other]

doi 10.1007/s41237-017-0022-z

Error Asymmetry in Causal and Anticausal Regression

Authors: Patrick Blöbaum, Takashi Washio, Shohei Shimizu

Abstract: It is generally difficult to make any statements about the expected prediction error in a univariate setting without further knowledge about how the data were generated. Recent work showed that knowledge about the real underlying causal structure of a data generation process has implications for various machine learning settings. Assuming an additive noise and an independence between data generating mechanism and its input, we draw a novel connection between the intrinsic causal relationship of two variables and the expected prediction error. We formulate the theorem that the expected error of the true data generating function as prediction model is generally smaller when the effect is predicted from its cause and, on the contrary, greater when the cause is predicted from its effect. The theorem implies an asymmetry in the error depending on the prediction direction. This is further corroborated with empirical evaluations in artificial and real-world data sets.

Submitted 17 April, 2017; v1 submitted 11 October, 2016; originally announced October 2016.

Journal ref: Behaviormetrika, 2017, 10.1007/s41237-017-0022-z

arXiv:1408.0337 [pdf, ps, other]

A Bayesian estimation approach to analyze non-Gaussian data-generating processes with latent classes

Authors: Naoki Tanaka, Shohei Shimizu, Takashi Washio

Abstract: A large amount of observational data has been accumulated in various fields in recent times, and there is a growing need to estimate the generating processes of these data. A linear non-Gaussian acyclic model (LiNGAM) based on the non-Gaussianity of external influences has been proposed to estimate the data-generating processes of variables. However, the results of the estimation can be biased if there are latent classes. In this paper, we first review LiNGAM, its extended model, as well as the estimation procedure for LiNGAM in a Bayesian framework. We then propose a new Bayesian estimation procedure that solves the problem.

Submitted 2 August, 2014; originally announced August 2014.

Comments: 10 pages, 1 figure

arXiv:1401.5636 [pdf, ps, other]

Causal Discovery in a Binary Exclusive-or Skew Acyclic Model: BExSAM

Authors: Takanori Inazumi, Takashi Washio, Shohei Shimizu, Joe Suzuki, Akihiro Yamamoto, Yoshinobu Kawahara

Abstract: Discovering causal relations among observed variables in a given data set is a major objective in studies of statistics and artificial intelligence. Recently, some techniques to discover a unique causal model have been explored based on non-Gaussianity of the observed data distribution. However, most of these are limited to continuous data. In this paper, we present a novel causal model for binary data and propose an efficient new approach to deriving the unique causal model governing a given binary data set under skew distributions of external binary noises. Experimental evaluation shows excellent performance for both artificial and real world data sets.

Submitted 22 January, 2014; originally announced January 2014.

Comments: 10 pages. A longer version of our UAI2011 paper (Inazumi et al., 2011). arXiv admin note: text overlap with arXiv:1202.3736

arXiv:1401.5625 [pdf, ps, other]

Identifiability of an Integer Modular Acyclic Additive Noise Model and its Causal Structure Discovery

Authors: Joe Suzuki, Takanori Inazumi, Takashi Washio, Shohei Shimizu

Abstract: The notion of causality is used in many situations dealing with uncertainty. We consider the problem of whether causality can be identified given a data set generated by discrete random variables rather than continuous ones. In particular, for non-binary data, thus far it was only known that causality can be identified except in rare cases. In this paper, we present a necessary and sufficient condition for an integer modular acyclic additive noise (IMAN) of two variables. In addition, we relate bivariate and multivariate causal identifiability in a more explicit manner, and develop a practical algorithm to find the order of variables and their parent sets. We demonstrate its performance in applications to artificial data and real world body motion data with comparisons to conventional methods.

Submitted 22 January, 2014; originally announced January 2014.

Comments: 30 pages, 4 figures

arXiv:1401.4785 [pdf, ps, other]

doi 10.1103/PhysRevA.89.022104

Anomaly detection in reconstructed quantum states using a machine-learning technique

Authors: Satoshi Hara, Takafumi Ono, Ryo Okamoto, Takashi Washio, Shigeki Takeuchi

Abstract: The accurate detection of small deviations in given density matrices is important for quantum information processing. Here we propose a new method based on the concept of data mining. We demonstrate that the proposed method can more accurately detect small erroneous deviations in reconstructed density matrices, which contain intrinsic fluctuations due to the limited number of samples, than a naive method of checking the trace distance from the average of the given density matrices. This method has the potential to be a key tool in broad areas of physics where the detection of small deviations of quantum states reconstructed using a limited number of samples is essential.

Submitted 19 January, 2014; originally announced January 2014.

Comments: Accepted for Physical Review A

MSC Class: 81V80
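
For orientation, the naive baseline named in this abstract (checking each density matrix's trace distance from the average of the given matrices) can be sketched as follows. This illustrates only that baseline, not the method the paper proposes. The trace distance is computed as D(ρ, σ) = ½ Σ|λᵢ|, where λᵢ are the eigenvalues of the Hermitian difference ρ − σ. The helper `qubit_state`, the diagonal single-qubit example and the specific numbers are hypothetical and chosen only to keep the example small.

```python
import numpy as np


def trace_distance(rho, sigma):
    """Trace distance: half the sum of absolute eigenvalues of the Hermitian difference."""
    eigvals = np.linalg.eigvalsh(rho - sigma)
    return 0.5 * float(np.sum(np.abs(eigvals)))


def distances_from_average(states):
    """Trace distance of every density matrix from the element-wise average of all of them."""
    mean_state = np.mean(states, axis=0)
    return np.array([trace_distance(s, mean_state) for s in states])


def qubit_state(p):
    """Hypothetical helper: diagonal single-qubit density matrix with excited-state weight p."""
    return np.array([[1.0 - p, 0.0], [0.0, p]], dtype=complex)


# Four similar states and one that deviates more strongly (illustrative numbers only).
states = np.array([qubit_state(p) for p in (0.10, 0.11, 0.09, 0.10, 0.30)])

scores = distances_from_average(states)
suspect = int(np.argmax(scores))  # index of the state farthest from the average
print(scores, suspect)
```

Because the average itself includes the deviating state, a baseline of this kind can be pulled toward outliers, which is one reason a more careful method may be preferable when sample sizes are limited.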

arXiv:1303.7410 [pdf, ps, other]

ParceLiNGAM: A causal ordering method robust against latent confounders

Authors: Tatsuya Tashiro, Shohei Shimizu, Aapo Hyvarinen, Takashi Washio

Abstract: We consider learning a causal ordering of variables in a linear non-Gaussian acyclic model called LiNGAM. Several existing methods have been shown to consistently estimate a causal ordering assuming that all the model assumptions are correct. But, the estimation results could be distorted if some assumptions actually are violated. In this paper, we propose a new algorithm for learning causal orders that is robust against one typical violation of the model assumptions: latent confounders. The key idea is to detect latent confounders by testing independence between estimated external influences and find subsets (parcels) that include variables that are not affected by latent confounders. We demonstrate the effectiveness of our method using artificial data and simulated brain imaging data.

Submitted 28 July, 2013; v1 submitted 29 March, 2013; originally announced March 2013.

Comments: A revised version of this was accepted in Neural Computation. 18 pages and 5 figures. arXiv admin note: substantial text overlap with arXiv:1204.1795

arXiv:1204.1795 [pdf, ps, other]

Estimation of causal orders in a linear non-Gaussian acyclic model: a method robust against latent confounders

Authors: Tatsuya Tashiro, Shohei Shimizu, Aapo Hyvarinen, Takashi Washio

Abstract: We consider learning a causal ordering of variables in a linear non-Gaussian acyclic model called LiNGAM. Several existing methods have been shown to consistently estimate a causal ordering assuming that all the model assumptions are correct. But, the estimation results could be distorted if some assumptions actually are violated. In this paper, we propose a new algorithm for learning causal orders that is robust against one typical violation of the model assumptions: latent confounders. We demonstrate the effectiveness of our method using artificial data.

Submitted 9 April, 2012; originally announced April 2012.

Comments: 8 pages, 2 figures

arXiv:1203.0117 [pdf, ps, other]

Learning a Common Substructure of Multiple Graphical Gaussian Models

Authors: Satoshi Hara, Takashi Washio

Abstract: Properties of data are frequently seen to vary depending on the sampled situations, which usually change along a time evolution or owing to environmental effects. One way to analyze such data is to find invariances, or representative features kept constant over changes. The aim of this paper is to identify one such feature, namely interactions or dependencies among variables that are common across multiple datasets collected under different conditions. To that end, we propose a common substructure learning (CSSL) framework based on a graphical Gaussian model. We further present a simple learning algorithm based on the Dual Augmented Lagrangian and the Alternating Direction Method of Multipliers. We confirm the performance of CSSL over other existing techniques in finding unchanging dependency structures in multiple datasets through numerical simulations on synthetic data and through a real world application to anomaly detection in automobile sensors.

Submitted 24 September, 2012; v1 submitted 1 March, 2012; originally announced March 2012.

Comments: 47 pages, 6 figures, elsarticle.cls

arXiv:1202.3736 [pdf]

Discovering causal structures in binary exclusive-or skew acyclic models

Authors: Takanori Inazumi, Takashi Washio, Shohei Shimizu, Joe Suzuki, Akihiro Yamamoto, Yoshinobu Kawahara

Abstract: Discovering causal relations among observed variables in a given data set is a main topic in studies of statistics and artificial intelligence. Recently, some techniques to discover an identifiable causal structure have been explored based on non-Gaussianity of the observed data distribution. However, most of these are limited to continuous data. In this paper, we present a novel causal model for binary data and propose a new approach to derive an identifiable causal structure governing the data based on skew Bernoulli distributions of external noise. Experimental evaluation shows excellent performance for both artificial and real world data sets.

Submitted 14 February, 2012; originally announced February 2012.

Report number: UAI-P-2011-PG-373-382

arXiv:1110.3879 [pdf, ps, other]

doi 10.1587/transinf.E95.D.1947

GTRACE-RS: Efficient Graph Sequence Mining using Reverse Search

Authors: Akihiro Inokuchi, Hiroaki Ikuta, Takashi Washio

Abstract: The mining of frequent subgraphs from labeled graph data has been studied extensively. Furthermore, much attention has recently been paid to frequent pattern mining from graph sequences. A method, called GTRACE, has been proposed to mine frequent patterns from graph sequences under the assumption that changes in graphs are gradual. Although GTRACE mines the frequent patterns efficiently, it still needs substantial computation time to mine the patterns from graph sequences containing large graphs and long sequences. In this paper, we propose a new version of GTRACE that enables efficient mining of frequent patterns based on the principle of a reverse search. The underlying concept of the reverse search is a general scheme for designing efficient algorithms for hard enumeration problems. Our performance study shows that the proposed method is efficient and scalable for mining both long and large graph sequence patterns and is several orders of magnitude faster than the original GTRACE.

Submitted 18 October, 2011; originally announced October 2011.

arXiv:1108.4217 [pdf, ps, other]

Prismatic Algorithm for Discrete D.C. Programming Problems

Authors: Yoshinobu Kawahara, Takashi Washio

Abstract: In this paper, we propose the first exact algorithm for minimizing the difference of two submodular functions (D.S.), i.e., the discrete version of the D.C. programming problem. The developed algorithm is a branch-and-bound-based algorithm which responds to the structure of this problem through the relationship between submodularity and convexity. The D.S. programming problem covers a broad range of applications in machine learning because this generalizes the optimization of a wide class of set functions. We empirically investigate the performance of our algorithm, and illustrate the difference between exact and approximate solutions respectively obtained by the proposed and existing algorithms in feature selection and discriminative structure learning.

Submitted 21 August, 2011; originally announced August 2011.

arXiv:1101.2489 [pdf, ps, other]

DirectLiNGAM: A direct method for learning a linear non-Gaussian structural equation model

Authors: Shohei Shimizu, Takanori Inazumi, Yasuhiro Sogawa, Aapo Hyvarinen, Yoshinobu Kawahara, Takashi Washio, Patrik O. Hoyer, Kenneth Bollen

Abstract: Structural equation models and Bayesian networks have been widely used to analyze causal relations between continuous variables. In such frameworks, linear acyclic models are typically used to model the data-generating process of variables. Recently, it was shown that use of non-Gaussianity identifies the full structure of a linear acyclic model, i.e., a causal ordering of variables and their connection strengths, without using any prior knowledge on the network structure, which is not the case with conventional methods. However, existing estimation methods are based on iterative search algorithms and may not converge to a correct solution in a finite number of steps. In this paper, we propose a new direct method to estimate a causal ordering and connection strengths based on non-Gaussianity. In contrast to the previous methods, our algorithm requires no algorithmic parameters and is guaranteed to converge to the right solution within a small fixed number of steps if the data strictly follows the model.

Submitted 7 April, 2011; v1 submitted 12 January, 2011; originally announced January 2011.

Comments: A revised version of this was accepted in Journal of Machine Learning Research

arXiv:1006.5041 [pdf, ps, other]

GroupLiNGAM: Linear non-Gaussian acyclic models for sets of variables

Authors: Yoshinobu Kawahara, Kenneth Bollen, Shohei Shimizu, Takashi Washio

Abstract: Finding the structure of a graphical model has received much attention in many fields. Recently, it was reported that the non-Gaussianity of data enables us to identify the structure of a directed acyclic graph without any prior knowledge on the structure. In this paper, we propose a novel non-Gaussianity based algorithm for a more general type of model: chain graphs. The algorithm finds an ordering of the disjoint subsets of variables by iteratively evaluating the independence between the variable subset and the residuals when the remaining variables are regressed on those. However, its computational cost grows exponentially according to the number of variables. Therefore, we further discuss an efficient approximate approach for applying the algorithm to large sized graphs. We illustrate the algorithm with artificial and real-world datasets.

Submitted 24 June, 2010; originally announced June 2010.

arXiv:0904.0838 [pdf, ps, other]

doi 10.1007/978-3-642-15819-3_10

Finding Exogenous Variables in Data with Many More Variables than Observations

Authors: Shohei Shimizu, Takashi Washio, Aapo Hyvarinen, Seiya Imoto

Abstract: Many statistical methods have been proposed to estimate causal models in classical situations with fewer variables than observations (p<n, p: the number of variables and n: the number of observations). However, modern datasets including gene expression data need high-dimensional causal modeling in challenging situations with orders of magnitude more variables than observations (p>>n). In this paper, we propose a method to find exogenous variables in a linear non-Gaussian causal model, which requires much smaller sample sizes than conventional methods and works even when p>>n. The key idea is to identify which variables are exogenous based on non-Gaussianity instead of estimating the entire structure of the model. Exogenous variables work as triggers that activate a causal chain in the model, and their identification leads to more efficient experimental designs and better understanding of the causal mechanism. We present experiments with artificial data and real-world gene expression data to evaluate the method.

Submitted 7 April, 2011; v1 submitted 5 April, 2009; originally announced April 2009.

Comments: A revised version of this was published in Proc. ICANN2010

Journal ref: ARTIFICIAL NEURAL NETWORKS - ICANN 2010. Lecture Notes in Computer Science, 2010, Volume 6352/2010, 67-76

Showing 1–21 of 21 results for author: Washio, T