Search | arXiv e-print repository

arXiv:2406.19948 [pdf, other]

Kolmogorov-Smirnov GAN

Authors: Maciej Falkiewicz, Naoya Takeishi, Alexandros Kalousis

Abstract: We propose a novel deep generative model, the Kolmogorov-Smirnov Generative Adversarial Network (KSGAN). Unlike existing approaches, KSGAN formulates the learning process as a minimization of the Kolmogorov-Smirnov (KS) distance, generalized to handle multivariate distributions. This distance is calculated using the quantile function, which acts as the critic in the adversarial training process. W… ▽ More We propose a novel deep generative model, the Kolmogorov-Smirnov Generative Adversarial Network (KSGAN). Unlike existing approaches, KSGAN formulates the learning process as a minimization of the Kolmogorov-Smirnov (KS) distance, generalized to handle multivariate distributions. This distance is calculated using the quantile function, which acts as the critic in the adversarial training process. We formally demonstrate that minimizing the KS distance leads to the trained approximate distribution aligning with the target distribution. We propose an efficient implementation and evaluate its effectiveness through experiments. The results show that KSGAN performs on par with existing adversarial methods, exhibiting stability during training, resistance to mode drop** and collapse, and tolerance to variations in hyperparameter settings. Additionally, we review the literature on the Generalized KS test and discuss the connections between KSGAN and existing adversarial generative models. △ Less

Submitted 28 June, 2024; originally announced June 2024.

Comments: Code available at https://github.com/DMML-Geneva/ksgan

arXiv:2310.13402 [pdf, other]

Calibrating Neural Simulation-Based Inference with Differentiable Coverage Probability

Authors: Maciej Falkiewicz, Naoya Takeishi, Imahn Shekhzadeh, Antoine Wehenkel, Arnaud Delaunoy, Gilles Louppe, Alexandros Kalousis

Abstract: Bayesian inference allows expressing the uncertainty of posterior belief under a probabilistic model given prior information and the likelihood of the evidence. Predominantly, the likelihood function is only implicitly established by a simulator posing the need for simulation-based inference (SBI). However, the existing algorithms can yield overconfident posteriors (Hermans *et al.*, 2022) defeati… ▽ More Bayesian inference allows expressing the uncertainty of posterior belief under a probabilistic model given prior information and the likelihood of the evidence. Predominantly, the likelihood function is only implicitly established by a simulator posing the need for simulation-based inference (SBI). However, the existing algorithms can yield overconfident posteriors (Hermans *et al.*, 2022) defeating the whole purpose of credibility if the uncertainty quantification is inaccurate. We propose to include a calibration term directly into the training objective of the neural model in selected amortized SBI techniques. By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation. The proposed method is not tied to any particular neural model and brings moderate computational overhead compared to the profits it introduces. It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference. We empirically show on six benchmark problems that the proposed method achieves competitive or better results in terms of coverage and expected posterior density than the previously existing approaches. △ Less

Submitted 20 October, 2023; originally announced October 2023.

Comments: Code available at https://github.com/DMML-Geneva/calibrated-posterior

arXiv:2210.13103 [pdf, other]

Deep Grey-Box Modeling With Adaptive Data-Driven Models Toward Trustworthy Estimation of Theory-Driven Models

Authors: Naoya Takeishi, Alexandros Kalousis

Abstract: The combination of deep neural nets and theory-driven models, which we call deep grey-box modeling, can be inherently interpretable to some extent thanks to the theory backbone. Deep grey-box models are usually learned with a regularized risk minimization to prevent a theory-driven part from being overwritten and ignored by a deep neural net. However, an estimation of the theory-driven part obtain… ▽ More The combination of deep neural nets and theory-driven models, which we call deep grey-box modeling, can be inherently interpretable to some extent thanks to the theory backbone. Deep grey-box models are usually learned with a regularized risk minimization to prevent a theory-driven part from being overwritten and ignored by a deep neural net. However, an estimation of the theory-driven part obtained by uncritically optimizing a regularizer can hardly be trustworthy when we are not sure what regularizer is suitable for the given data, which may harm the interpretability. Toward a trustworthy estimation of the theory-driven part, we should analyze regularizers' behavior to compare different candidates and to justify a specific choice. In this paper, we present a framework that enables us to analyze a regularizer's behavior empirically with a slight change in the neural net's architecture and the training objective. △ Less

Submitted 24 October, 2022; originally announced October 2022.

Comments: 16 pages, 8 figures

arXiv:2206.01900 [pdf, other]

Estimating counterfactual treatment outcomes over time in complex multiagent scenarios

Authors: Keisuke Fujii, Koh Takeuchi, Atsushi Kuribayashi, Naoya Takeishi, Yoshinobu Kawahara, Kazuya Takeda

Abstract: Evaluation of intervention in a multiagent system, e.g., when humans should intervene in autonomous driving systems and when a player should pass to teammates for a good shot, is challenging in various engineering and scientific fields. Estimating the individual treatment effect (ITE) using counterfactual long-term prediction is practical to evaluate such interventions. However, most of the conven… ▽ More Evaluation of intervention in a multiagent system, e.g., when humans should intervene in autonomous driving systems and when a player should pass to teammates for a good shot, is challenging in various engineering and scientific fields. Estimating the individual treatment effect (ITE) using counterfactual long-term prediction is practical to evaluate such interventions. However, most of the conventional frameworks did not consider the time-varying complex structure of multiagent relationships and covariate counterfactual prediction. This may lead to erroneous assessments of ITE and difficulty in interpretation. Here we propose an interpretable, counterfactual recurrent network in multiagent systems to estimate the effect of the intervention. Our model leverages graph variational recurrent neural networks and theory-based computation with domain knowledge for the ITE estimation framework based on long-term prediction of multiagent covariates and outcomes, which can confirm the circumstances under which the intervention is effective. On simulated models of an automated vehicle and biological agents with time-varying confounders, we show that our methods achieved lower estimation errors in counterfactual covariates and the most effective treatment timing than the baselines. Furthermore, using real basketball data, our methods performed realistic counterfactual predictions and evaluated the counterfactual passes in shot scenarios. △ Less

Submitted 17 February, 2024; v1 submitted 4 June, 2022; originally announced June 2022.

Comments: 16 pages, 11 figures. Accepted in IEEE Transactions on Neural Networks and Learning Systems

arXiv:2107.05326 [pdf, other]

Learning interaction rules from multi-animal trajectories via augmented behavioral models

Authors: Keisuke Fujii, Naoya Takeishi, Kazushi Tsutsui, Emyo Fujioka, Nozomi Nishiumi, Ryoya Tanaka, Mika Fukushiro, Kaoru Ide, Hiroyoshi Kohno, Ken Yoda, Susumu Takahashi, Shizuko Hiryu, Yoshinobu Kawahara

Abstract: Extracting the interaction rules of biological agents from movement sequences pose challenges in various domains. Granger causality is a practical framework for analyzing the interactions from observed time-series data; however, this framework ignores the structures and assumptions of the generative process in animal behaviors, which may lead to interpretational problems and sometimes erroneous as… ▽ More Extracting the interaction rules of biological agents from movement sequences pose challenges in various domains. Granger causality is a practical framework for analyzing the interactions from observed time-series data; however, this framework ignores the structures and assumptions of the generative process in animal behaviors, which may lead to interpretational problems and sometimes erroneous assessments of causality. In this paper, we propose a new framework for learning Granger causality from multi-animal trajectories via augmented theory-based behavioral models with interpretable data-driven models. We adopt an approach for augmenting incomplete multi-agent behavioral models described by time-varying dynamical systems with neural networks. For efficient and interpretable learning, our model leverages theory-based architectures separating navigation and motion processes, and the theory-guided regularization for reliable behavioral modeling. This can provide interpretable signs of Granger-causal effects over time, i.e., when specific others cause the approach or separation. In experiments using synthetic datasets, our method achieved better performance than various baselines. We then analyzed multi-animal datasets of mice, flies, birds, and bats, which verified our method and obtained novel biological insights. △ Less

Submitted 25 October, 2021; v1 submitted 12 July, 2021; originally announced July 2021.

Comments: 24 pages, 5 figures, to appear in NeurIPS 2021

arXiv:2102.13156 [pdf, other]

Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling

Authors: Naoya Takeishi, Alexandros Kalousis

Abstract: Integrating physics models within machine learning models holds considerable promise toward learning robust models with improved interpretability and abilities to extrapolate. In this work, we focus on the integration of incomplete physics models into deep generative models. In particular, we introduce an architecture of variational autoencoders (VAEs) in which a part of the latent space is ground… ▽ More Integrating physics models within machine learning models holds considerable promise toward learning robust models with improved interpretability and abilities to extrapolate. In this work, we focus on the integration of incomplete physics models into deep generative models. In particular, we introduce an architecture of variational autoencoders (VAEs) in which a part of the latent space is grounded by physics. A key technical challenge is to strike a balance between the incomplete physics and trainable components such as neural networks for ensuring that the physics part is used in a meaningful manner. To this end, we propose a regularized learning method that controls the effect of the trainable components and preserves the semantics of the physics-based latent variables as intended. We not only demonstrate generative performance improvements over a set of synthetic and real-world datasets, but we also show that we learn robust models that can consistently extrapolate beyond the training distribution in a meaningful manner. Moreover, we show that we can control the generative process in an interpretable manner. △ Less

Submitted 26 October, 2021; v1 submitted 25 February, 2021; originally announced February 2021.

arXiv:2007.03155 [pdf, other]

doi 10.1016/j.neunet.2023.11.068

Decentralized policy learning with partial observation and mechanical constraints for multiperson modeling

Authors: Keisuke Fujii, Naoya Takeishi, Yoshinobu Kawahara, Kazuya Takeda

Abstract: Extracting the rules of real-world multi-agent behaviors is a current challenge in various scientific and engineering fields. Biological agents independently have limited observation and mechanical constraints; however, most of the conventional data-driven models ignore such assumptions, resulting in lack of biological plausibility and model interpretability for behavioral analyses. Here we propos… ▽ More Extracting the rules of real-world multi-agent behaviors is a current challenge in various scientific and engineering fields. Biological agents independently have limited observation and mechanical constraints; however, most of the conventional data-driven models ignore such assumptions, resulting in lack of biological plausibility and model interpretability for behavioral analyses. Here we propose sequential generative models with partial observation and mechanical constraints in a decentralized manner, which can model agents' cognition and body dynamics, and predict biologically plausible behaviors. We formulate this as a decentralized multi-agent imitation-learning problem, leveraging binary partial observation and decentralized policy models based on hierarchical variational recurrent neural networks with physical and biomechanical penalties. Using real-world basketball and soccer datasets, we show the effectiveness of our method in terms of the constraint violations, long-term trajectory prediction, and partial observation. Our approach can be used as a multi-agent simulator to generate realistic trajectories using real-world data. △ Less

Submitted 1 December, 2023; v1 submitted 6 July, 2020; originally announced July 2020.

Comments: 17 pages with 7 figures and 4 tables, accepted in Neural Networks

arXiv:2006.08935 [pdf, other]

Learning Dynamics Models with Stable Invariant Sets

Authors: Naoya Takeishi, Yoshinobu Kawahara

Abstract: Invariance and stability are essential notions in dynamical systems study, and thus it is of great interest to learn a dynamics model with a stable invariant set. However, existing methods can only handle the stability of an equilibrium. In this paper, we propose a method to ensure that a dynamics model has a stable invariant set of general classes such as limit cycles and line attractors. We star… ▽ More Invariance and stability are essential notions in dynamical systems study, and thus it is of great interest to learn a dynamics model with a stable invariant set. However, existing methods can only handle the stability of an equilibrium. In this paper, we propose a method to ensure that a dynamics model has a stable invariant set of general classes such as limit cycles and line attractors. We start with the approach by Manek and Kolter (2019), where they use a learnable Lyapunov function to make a model stable with regard to an equilibrium. We generalize it for general sets by introducing projection onto them. To resolve the difficulty of specifying a to-be stable invariant set analytically, we propose defining such a set as a primitive shape (e.g., sphere) in a latent space and learning the transformation between the original and latent spaces. It enables us to compute the projection easily, and at the same time, we can maintain the model's flexibility using various invertible neural networks for the transformation. We present experimental results that show the validity of the proposed method and the usefulness for long-term prediction. △ Less

Submitted 29 October, 2020; v1 submitted 16 June, 2020; originally announced June 2020.

Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 35(11), 9782-9790, 2021

arXiv:2004.04464 [pdf, other]

A Characteristic Function for Shapley-Value-Based Attribution of Anomaly Scores

Authors: Naoya Takeishi, Yoshinobu Kawahara

Abstract: In anomaly detection, the degree of irregularity is often summarized as a real-valued anomaly score. We address the problem of attributing such anomaly scores to input features for interpreting the results of anomaly detection. We particularly investigate the use of the Shapley value for attributing anomaly scores of semi-supervised detection methods. We propose a characteristic function specifica… ▽ More In anomaly detection, the degree of irregularity is often summarized as a real-valued anomaly score. We address the problem of attributing such anomaly scores to input features for interpreting the results of anomaly detection. We particularly investigate the use of the Shapley value for attributing anomaly scores of semi-supervised detection methods. We propose a characteristic function specifically designed for attributing anomaly scores. The idea is to approximate the absence of some features by locally minimizing the anomaly score with regard to the to-be-absent features. We examine the applicability of the proposed characteristic function and other general approaches for interpreting anomaly scores on multiple datasets and multiple anomaly detection methods. The results indicate the potential utility of the attribution methods including the proposed one. △ Less

Submitted 16 February, 2023; v1 submitted 9 April, 2020; originally announced April 2020.

Journal ref: Transactions on Machine Learning Research, 2023

arXiv:1909.03495 [pdf, other]

Shapley Values of Reconstruction Errors of PCA for Explaining Anomaly Detection

Authors: Naoya Takeishi

Abstract: We present a method to compute the Shapley values of reconstruction errors of principal component analysis (PCA), which is particularly useful in explaining the results of anomaly detection based on PCA. Because features are usually correlated when PCA-based anomaly detection is applied, care must be taken in computing a value function for the Shapley values. We utilize the probabilistic view of P… ▽ More We present a method to compute the Shapley values of reconstruction errors of principal component analysis (PCA), which is particularly useful in explaining the results of anomaly detection based on PCA. Because features are usually correlated when PCA-based anomaly detection is applied, care must be taken in computing a value function for the Shapley values. We utilize the probabilistic view of PCA, particularly its conditional distribution, to exactly compute a value function for the Shapely values. We also present numerical examples, which imply that the Shapley values are advantageous for explaining detected anomalies than raw reconstruction errors of each feature. △ Less

Submitted 22 February, 2020; v1 submitted 8 September, 2019; originally announced September 2019.

Comments: Workshop on Learning and Mining with Industrial Data (LMID) 2019. typos fixed

arXiv:1905.04859 [pdf, other]

doi 10.1038/s41598-020-58064-w

Physically-interpretable classification of biological network dynamics for complex collective motions

Authors: Keisuke Fujii, Naoya Takeishi, Motokazu Hojo, Yuki Inaba, Yoshinobu Kawahara

Abstract: Understanding biological network dynamics is a fundamental issue in various scientific and engineering fields. Network theory is capable of revealing the relationship between elements and their propagation; however, for complex collective motions, the network properties often transiently and complexly change. A fundamental question addressed here pertains to the classification of collective motion… ▽ More Understanding biological network dynamics is a fundamental issue in various scientific and engineering fields. Network theory is capable of revealing the relationship between elements and their propagation; however, for complex collective motions, the network properties often transiently and complexly change. A fundamental question addressed here pertains to the classification of collective motion network based on physically-interpretable dynamical properties. Here we apply a data-driven spectral analysis called graph dynamic mode decomposition, which obtains the dynamical properties for collective motion classification. Using a ballgame as an example, we classified the strategic collective motions in different global behaviours and discovered that, in addition to the physical properties, the contextual node information was critical for classification. Furthermore, we discovered the label-specific stronger spectra in the relationship among the nearest agents, providing physical and semantic interpretations. Our approach contributes to the understanding of principles of biological complex network dynamics from the perspective of nonlinear dynamical systems. △ Less

Submitted 13 June, 2020; v1 submitted 13 May, 2019; originally announced May 2019.

Comments: 42 pages with 7 figures and 3 tables. The latest version is published in Scientific Reports, 2020

Journal ref: Scientific Reports, 10, 3005, 2020

arXiv:1902.02068 [pdf, other]

doi 10.24963/ijcai.2020/331

Knowledge-Based Regularization in Generative Modeling

Authors: Naoya Takeishi, Yoshinobu Kawahara

Abstract: Prior domain knowledge can greatly help to learn generative models. However, it is often too costly to hard-code prior knowledge as a specific model architecture, so we often have to use general-purpose models. In this paper, we propose a method to incorporate prior knowledge of feature relations into the learning of general-purpose generative models. To this end, we formulate a regularizer that m… ▽ More Prior domain knowledge can greatly help to learn generative models. However, it is often too costly to hard-code prior knowledge as a specific model architecture, so we often have to use general-purpose models. In this paper, we propose a method to incorporate prior knowledge of feature relations into the learning of general-purpose generative models. To this end, we formulate a regularizer that makes the marginals of a generative model to follow prescribed relative dependence of features. It can be incorporated into off-the-shelf learning methods of many generative models, including variational autoencoders and generative adversarial networks, as its gradients can be computed using standard backpropagation techniques. We show the effectiveness of the proposed method with experiments on multiple types of datasets and generative models. △ Less

Submitted 10 December, 2020; v1 submitted 6 February, 2019; originally announced February 2019.

Comments: 12 pages, 7 figures, 9 tables, presented in IJCAI 2020

arXiv:1806.11332 [pdf, other]

Knowledge-Based Distant Regularization in Learning Probabilistic Models

Authors: Naoya Takeishi, Kosuke Akimoto

Abstract: Exploiting the appropriate inductive bias based on the knowledge of data is essential for achieving good performance in statistical machine learning. In practice, however, the domain knowledge of interest often provides information on the relationship of data attributes only distantly, which hinders direct utilization of such domain knowledge in popular regularization methods. In this paper, we pr… ▽ More Exploiting the appropriate inductive bias based on the knowledge of data is essential for achieving good performance in statistical machine learning. In practice, however, the domain knowledge of interest often provides information on the relationship of data attributes only distantly, which hinders direct utilization of such domain knowledge in popular regularization methods. In this paper, we propose the knowledge-based distant regularization framework, in which we utilize the distant information encoded in a knowledge graph for regularization of probabilistic model estimation. In particular, we propose to impose prior distributions on model parameters specified by knowledge graph embeddings. As an instance of the proposed framework, we present the factor analysis model with the knowledge-based distant regularization. We show the results of preliminary experiments on the improvement of the generalization capability of such model. △ Less

Submitted 29 June, 2018; originally announced June 2018.

Comments: 7 pages, 2 figures, presented in the Eighth International Workshop on Statistical Relational AI

arXiv:1710.04340 [pdf, other]

Learning Koopman Invariant Subspaces for Dynamic Mode Decomposition

Authors: Naoya Takeishi, Yoshinobu Kawahara, Takehisa Yairi

Abstract: Spectral decomposition of the Koopman operator is attracting attention as a tool for the analysis of nonlinear dynamical systems. Dynamic mode decomposition is a popular numerical algorithm for Koopman spectral analysis; however, we often need to prepare nonlinear observables manually according to the underlying dynamics, which is not always possible since we may not have any a priori knowledge ab… ▽ More Spectral decomposition of the Koopman operator is attracting attention as a tool for the analysis of nonlinear dynamical systems. Dynamic mode decomposition is a popular numerical algorithm for Koopman spectral analysis; however, we often need to prepare nonlinear observables manually according to the underlying dynamics, which is not always possible since we may not have any a priori knowledge about them. In this paper, we propose a fully data-driven method for Koopman spectral analysis based on the principle of learning Koopman invariant subspaces from observed data. To this end, we propose minimization of the residual sum of squares of linear least-squares regression to estimate a set of functions that transforms data into a form in which the linear regression fits well. We introduce an implementation with neural networks and evaluate performance empirically using nonlinear dynamical systems and applications. △ Less

Submitted 30 January, 2018; v1 submitted 11 October, 2017; originally announced October 2017.

Comments: 18 pages, 7 figures, presented in NIPS 2017

Showing 1–14 of 14 results for author: Takeishi, N