-
Likelihood-free Model Choice for Simulator-based Models with the Jensen--Shannon Divergence
Authors:
Jukka Corander,
Ulpu Remes,
Timo Koski
Abstract:
Choice of appropriate structure and parametric dimension of a model in the light of data has a rich history in statistical research, where the first seminal approaches were developed in the 1970s, such as Akaike's and Schwarz's model scoring criteria that were inspired by information theory and embodied the rationale called Occam's razor. After those pioneering works, model choice was quickly established as its own field of research, gaining considerable attention in both computer science and statistics. However, to date, there have been limited attempts to derive scoring criteria for simulator-based models lacking a likelihood expression. Bayes factors have been considered for such models, but arguments have been put forward both for and against their use and around issues related to their consistency. Here we use the asymptotic properties of the Jensen--Shannon divergence (JSD) to derive a consistent model scoring criterion for the likelihood-free setting called JSD-Razor. Relationships of JSD-Razor with established scoring criteria for the likelihood-based approach are analyzed and we demonstrate the favorable properties of our criterion using both synthetic and real modeling examples.
Submitted 8 June, 2022;
originally announced June 2022.
-
Nonparametric likelihood-free inference with Jensen-Shannon divergence for simulator-based models with categorical output
Authors:
Jukka Corander,
Ulpu Remes,
Ida Holopainen,
Timo Koski
Abstract:
Likelihood-free inference for simulator-based statistical models has recently attracted a surge of interest, both in the machine learning and statistics communities. The primary focus of these research fields has been to approximate the posterior distribution of model parameters, either by various types of Monte Carlo sampling algorithms or deep neural network-based surrogate models. Frequentist inference for simulator-based models has been given much less attention to date, even though it would be particularly amenable to applications with big data where implicit asymptotic approximation of the likelihood is expected to be accurate and can leverage computationally efficient strategies. Here we derive a set of theoretical results to enable estimation, hypothesis testing and construction of confidence intervals for model parameters using asymptotic properties of the Jensen--Shannon divergence. Such asymptotic approximation offers a rapid alternative to more computation-intensive approaches and can be attractive for diverse applications of simulator-based models.
Submitted 26 May, 2022; v1 submitted 22 May, 2022;
originally announced May 2022.
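For orientation, the quantity named in the abstract above can be sketched in a few lines. This is only the textbook definition of the Jensen--Shannon divergence between two categorical distributions, applied to observed and simulated category frequencies; it is not the authors' estimator or test procedure, and the function names are hypothetical.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in nats; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_divergence(p, q):
    """JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m the midpoint mixture."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    # m_i > 0 whenever p_i > 0 or q_i > 0, so the logarithms are always defined.
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def empirical_distribution(samples, categories):
    """Relative frequency of each category in a list of categorical outputs."""
    counts = {c: 0 for c in categories}
    for s in samples:
        counts[s] += 1
    n = len(samples)
    return [counts[c] / n for c in categories]


# Hypothetical observed data and hypothetical simulator output over three categories.
categories = ["a", "b", "c"]
observed = empirical_distribution(["a", "a", "b", "c"], categories)
simulated = empirical_distribution(["a", "b", "b", "c"], categories)
jsd_value = jensen_shannon_divergence(observed, simulated)
```

With natural logarithms the divergence is symmetric, equals zero for identical distributions, and is bounded above by log 2, which it attains when the two distributions have disjoint support.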
-
Exact simulation of coupled Wright-Fisher diffusions
Authors:
Celia García-Pareja,
Henrik Hult,
Timo Koski
Abstract:
In this paper an exact rejection algorithm for simulating paths of the coupled Wright-Fisher diffusion is introduced. The coupled Wright-Fisher diffusion is a family of multidimensional Wright-Fisher diffusions that have drifts depending on each other through a coupling term and that find applications in the study of networks of interacting genes, such as those encountered in studies of antibiotic resistance. Our algorithm uses independent neutral Wright-Fisher diffusions as candidate proposals, which can be sampled exactly by means of existing algorithms and are only needed at a finite number of points. Once a candidate is accepted, the remainder of the path can be recovered by sampling from a neutral multivariate Wright-Fisher bridge, for which we also provide an exact sampling strategy. The technique relies on a modification of the alternating series method and extends existing algorithms that are currently available for the one-dimensional case. Finally, the algorithm's complexity is derived and its performance demonstrated in a simulation study.
Submitted 7 September, 2020; v1 submitted 25 September, 2019;
originally announced September 2019.
-
A dual process for the coupled Wright-Fisher diffusion
Authors:
Martina Favero,
Henrik Hult,
Timo Koski
Abstract:
The coupled Wright-Fisher diffusion is a multi-dimensional Wright-Fisher diffusion for multi-locus and multi-allelic genetic frequencies, expressed as the strong solution to a system of stochastic differential equations that are coupled in the drift, where the pairwise interaction among loci is modelled by an inter-locus selection.
In this paper, a dual process to the coupled Wright-Fisher diffusion is derived, which contains transition rates corresponding to coalescence and mutation as well as single-locus selection and double-locus selection. The coalescence and mutation rates correspond to the typical transition rates of Kingman's coalescent process. The single-locus selection rate not only contains the single-locus selection parameters in a form that generalises the rates for an ancestral selection graph, but it also contains the double-selection parameters to include the effect of the pairwise interaction on the single locus. The double-locus selection rate reflects the particular structure of pairwise interactions of the coupled Wright-Fisher diffusion.
Moreover, in the special case of two loci, two alleles, with selection and parent independent mutation, the stationary density for the coupled Wright-Fisher diffusion and the transition rates of the dual process are obtained in an explicit form.
Submitted 6 June, 2019;
originally announced June 2019.
-
On a Multilocus Wright-Fisher Model with Mutation and a Svirezhev-Shahshahani Gradient-like Selection Dynamics
Authors:
Erik Aurell,
Magnus Ekeberg,
Timo Koski
Abstract:
In this paper we introduce a multilocus diffusion model of a population of $N$ haploid, asexually reproducing individuals. The model includes parent-dependent mutation and interlocus selection, the latter limited to pairwise relationships but among a large number of simultaneous loci. The diffusion is expressed as a system of stochastic differential equations (SDEs) that are coupled in the drift functions through a Shahshahani gradient-like structure for interlocus selection. The system of SDEs is derived from a sequence of Markov chains by weak convergence. We find the explicit stationary (invariant) density by solving the corresponding stationary Fokker-Planck equation under parent-independent mutation, i.e., Kingman's house-of-cards mutation. The density formula enables us to readily construct families of Wright-Fisher models corresponding to networks of loci.
Submitted 13 December, 2019; v1 submitted 3 June, 2019;
originally announced June 2019.
-
Materials Informatics for Dark Matter Detection
Authors:
R. Matthias Geilhufe,
Bart Olsthoorn,
Alfredo Ferella,
Timo Koski,
Felix Kahlhoefer,
Jan Conrad,
Alexander V. Balatsky
Abstract:
Dark Matter particles are commonly assumed to be weakly interacting massive particles (WIMPs) with a mass in the GeV to TeV range. However, recent interest has shifted towards lighter WIMPs, which are more difficult to probe experimentally. A detection of sub-GeV WIMPs would require the use of small gap materials in sensors. Using recent estimates of the WIMP mass, we identify the relevant target space towards small gap materials (100-10 meV). Dirac Materials, a class of small- or zero-gap materials, emerge as natural candidates for sensors for Dark Matter detection. We propose the use of informatics tools to rapidly assay materials band structures to search for small gap semiconductors and semimetals, rather than focusing on a few preselected compounds. As a specific example of the proposed strategy, we use the organic materials database (omdb.diracmaterials.org) to identify organic candidates for sensors: the narrow band gap semiconductors BNQ-TTF and DEBTTT with gaps of 40 and 38 meV, and the Dirac-line semimetal (BEDT-TTF)$\cdot$Br which exhibits a tiny gap of $\approx$ 50 meV when spin-orbit coupling is included. We outline a novel and powerful approach to search for dark matter detection sensor materials by means of a rapid assay of materials using informatics tools.
Submitted 23 August, 2018; v1 submitted 15 June, 2018;
originally announced June 2018.
-
Testing for Causality in Continuous Time Bayesian Network Models of High-Frequency Data
Authors:
Jonas Hallgren,
Timo Koski
Abstract:
Continuous time Bayesian networks are investigated with a special focus on their ability to express causality. A framework is presented for doing inference in these networks. The central contributions are a representation of the intensity matrices for the networks and the introduction of a causality measure. A new model for high-frequency financial data is presented. It is calibrated to market data and, according to the new causality measure, performs better than older models.
Submitted 25 January, 2016;
originally announced January 2016.
-
A Prior Distribution over Directed Acyclic Graphs for Sparse Bayesian Networks
Authors:
Felix L. Rios,
John M. Noble,
Timo J. T. Koski
Abstract:
The main contribution of this article is a new prior distribution over directed acyclic graphs, which gives larger weight to sparse graphs. This distribution is intended for structured Bayesian networks, where the structure is given by an ordered block model. That is, the nodes of the graph are objects which fall into categories (or blocks); the blocks have a natural ordering. The presence of a relationship between two objects is denoted by an arrow, from the object of lower category to the object of higher category. The models considered here were introduced in Kemp et al. (2004) for relational data and extended to multivariate data in Mansinghka et al. (2006). The prior over graph structures presented here has an explicit formula. The number of nodes in each layer of the graph follows a Hoppe-Ewens urn model.
We consider the situation where the nodes of the graph represent random variables, whose joint probability distribution factorises along the DAG. We describe Monte Carlo schemes for finding the optimal a posteriori structure given a data matrix and compare the performance with Mansinghka et al. (2006) and also with the uniform prior.
Submitted 25 April, 2015;
originally announced April 2015.
-
Context-specific independence in graphical log-linear models
Authors:
Henrik Nyman,
Johan Pensar,
Timo Koski,
Jukka Corander
Abstract:
Log-linear models are the popular workhorses for analyzing contingency tables. A log-linear parameterization of an interaction model can be more expressive than a direct parameterization based on probabilities, leading to a powerful way of defining restrictions derived from marginal, conditional and context-specific independence. However, parameter estimation is often simpler under a direct parameterization, provided that the model enjoys certain decomposability properties. Here we introduce a cyclical projection algorithm for obtaining maximum likelihood estimates of log-linear parameters under an arbitrary context-specific graphical log-linear model, which need not satisfy criteria of decomposability. We illustrate that lifting the restriction of decomposability makes the models more expressive, such that additional context-specific independencies embedded in real data can be identified. It is also shown how a context-specific graphical model can correspond to a non-hierarchical log-linear parameterization with a concise interpretation. This observation can pave the way to further development of non-hierarchical log-linear models, which have been largely neglected due to their believed lack of interpretability.
Submitted 9 September, 2014;
originally announced September 2014.
-
Decomposition Sampling applied to Parallelization of Metropolis-Hastings
Authors:
Jonas Hallgren,
Timo Koski
Abstract:
This paper presents an algorithm for sampling random variables that allows separation of the sampling process into subproblems by dividing the sample space into overlapping parts. The subproblems can be solved independently of each other and are thus well suited for parallelization. Furthermore, on each of these subproblems it is possible to use distinct and independent sampling methods. In other words, specific samplers can be designed for specific parts of the sample space. The algorithms are demonstrated on a particle marginal Metropolis-Hastings sampler applied to calibration of a volatility model and two toy examples. Significant speedup and decrease of total variation are observed in experiments.
Submitted 25 January, 2016; v1 submitted 12 February, 2014;
originally announced February 2014.
-
Labeled Directed Acyclic Graphs: a generalization of context-specific independence in directed graphical models
Authors:
Johan Pensar,
Henrik Nyman,
Timo Koski,
Jukka Corander
Abstract:
We introduce a novel class of labeled directed acyclic graph (LDAG) models for finite sets of discrete variables. LDAGs generalize earlier proposals for allowing local structures in the conditional probability distribution of a node, such that unrestricted label sets determine which edges can be deleted from the underlying directed acyclic graph (DAG) for a given context. Several properties of these models are derived, including a generalization of the concept of Markov equivalence classes. Efficient Bayesian learning of LDAGs is enabled by introducing an LDAG-based factorization of the Dirichlet prior for the model parameters, such that the marginal likelihood can be calculated analytically. In addition, we develop a novel prior distribution for the model structures that can appropriately penalize a model for its labeling complexity. A non-reversible Markov chain Monte Carlo algorithm combined with a greedy hill climbing approach is used for illustrating the useful properties of LDAG models for both real and synthetic data sets.
Submitted 4 October, 2013;
originally announced October 2013.
-
Stratified Graphical Models - Context-Specific Independence in Graphical Models
Authors:
Henrik Nyman,
Johan Pensar,
Timo Koski,
Jukka Corander
Abstract:
Theory of graphical models has matured over more than three decades to provide the backbone for several classes of models that are used in a myriad of applications such as genetic mapping of diseases, credit risk evaluation, reliability and computer security, etc. Despite their generic applicability and wide adoption, the constraints imposed by undirected graphical models and Bayesian networks have also been recognized to be unnecessarily stringent under certain circumstances. This observation has led to the proposal of several generalizations that aim at more relaxed constraints by which the models can impose local or context-specific dependence structures. Here we consider an additional class of such models, termed stratified graphical models. We develop a method for Bayesian learning of these models by deriving an analytical expression for the marginal likelihood of data under a specific subclass of decomposable stratified models. A non-reversible Markov chain Monte Carlo approach is further used to identify models that are highly supported by the posterior distribution over the model space. Our method is illustrated and compared with ordinary graphical models through application to several real and synthetic datasets.
Submitted 14 November, 2013; v1 submitted 25 September, 2013;
originally announced September 2013.