-
Efficient estimation of divergence-based sensitivity indices with Gaussian process surrogates
Authors:
A. W. Eggels,
D. T. Crommelin
Abstract:
We consider the estimation of sensitivity indices based on divergence measures such as the Hellinger distance. For sensitivity analysis of complex models, these divergence-based indices can be estimated by Monte Carlo sampling (MCS) in combination with kernel density estimation (KDE). In a direct approach, the complex model must be evaluated at every input point generated by MCS, resulting in samples in the input-output space that can be used for density estimation. However, if the computational cost of the complex model strongly limits the number of model evaluations, this direct method gives large errors. We propose to use Gaussian process (GP) surrogates to increase the number of samples in the combined input-output space. By enlarging this sample set, the KDE becomes more accurate, leading to improved estimates. As a comparison for the GP surrogates, we use a surrogate constructed from samples obtained with stochastic collocation, combined with Lagrange interpolation. Furthermore, we propose a new estimation method for these sensitivity indices based on minimum spanning trees. Finally, we also propose a new type of sensitivity index based on divergence measures, namely direct sensitivity indices. These are useful when the input data is dependent.
Submitted 17 September, 2019; v1 submitted 8 April, 2019;
originally announced April 2019.
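To make the KDE step in the abstract above concrete, here is a minimal sketch that estimates the Hellinger distance between two one-dimensional samples from their kernel density estimates. It is not the paper's sensitivity index or its estimator: the function name `hellinger_kde`, the grid size, and the padding factor are assumptions made for illustration, and the integral is approximated by a simple Riemann sum.

```python
import numpy as np
from scipy.stats import gaussian_kde


def hellinger_kde(x, y, grid_size=512):
    """Hellinger distance between the KDEs of two 1D samples (illustrative)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Common evaluation grid, padded so that the KDE tails are mostly covered.
    lo = min(x.min(), y.min())
    hi = max(x.max(), y.max())
    pad = 0.25 * (hi - lo)
    grid = np.linspace(lo - pad, hi + pad, grid_size)
    dx = grid[1] - grid[0]
    # Gaussian KDEs with SciPy's default (Scott) bandwidth.
    p = gaussian_kde(x)(grid)
    q = gaussian_kde(y)(grid)
    # Bhattacharyya coefficient via a Riemann sum; H^2 = 1 - BC.
    bc = min(float(np.sum(np.sqrt(p * q)) * dx), 1.0)
    return float(np.sqrt(1.0 - bc))
```

With few samples the KDE is noisy and so is the distance, which is the situation the abstract says a surrogate is meant to improve by supplying more input-output samples.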
-
Quantifying dependencies for sensitivity analysis with multivariate input sample data
Authors:
A. W. Eggels,
D. T. Crommelin
Abstract:
We present a novel method for quantifying dependencies in multivariate datasets, based on estimating the Rényi entropy by minimum spanning trees (MSTs). The length of the MSTs can be used to order pairs of variables from strongly to weakly dependent, making it a useful tool for sensitivity analysis with dependent input variables. It is well-suited for cases where the input distribution is unknown and only a sample of the inputs is available. We introduce an estimator to quantify dependency based on the MST length, and investigate its properties with several numerical examples. To reduce the computational cost of constructing the exact MST for large datasets, we explore methods to compute approximations to the exact MST, and find the multilevel approach introduced recently by Zhong et al. (2015) to be the most accurate. We apply our proposed method to an artificial test case based on the Ishigami function, as well as to a real-world test case involving sediment transport in the North Sea. The results are consistent with prior knowledge and heuristic understanding, as well as with variance-based analysis using Sobol indices in the case where these indices can be computed.
Submitted 6 February, 2018;
originally announced February 2018.
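The idea of ranking variable pairs by MST length, described in the abstract above, can be sketched as follows. This computes only the exact MST length of a sample; the paper's Rényi entropy estimator is not reproduced here. The function name `mst_length` and the rank transform to uniform marginals are assumptions for illustration.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform
from scipy.stats import rankdata


def mst_length(sample):
    """Total Euclidean MST length of a rank-transformed sample (illustrative)."""
    pts = np.asarray(sample, dtype=float)
    n = pts.shape[0]
    # Rank-transform each column so that every marginal is roughly uniform on (0, 1];
    # this is an assumption of the sketch, made so that only the dependence
    # structure affects the tree length.
    u = np.column_stack([rankdata(col) / n for col in pts.T])
    # Dense pairwise distance matrix: O(n^2) memory, so small samples only.
    # Note: exact duplicate points give zero distances, which SciPy treats
    # as missing edges.
    dist = squareform(pdist(u))
    return float(minimum_spanning_tree(dist).sum())
```

For a pair of strongly dependent variables the transformed points concentrate near a curve and the tree is short; for independent variables they fill the unit square and the tree is longer. The dense distance matrix is what makes the exact MST expensive for large datasets, which is why the abstract turns to approximate MSTs.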
-
Clustering-based collocation for uncertainty propagation with multivariate dependent inputs
Authors:
A. W. Eggels,
D. T. Crommelin,
J. A. S. Witteveen
Abstract:
In this article, we propose the use of partitioning and clustering methods as an alternative to Gaussian quadrature for stochastic collocation. The key idea is to use cluster centers as the nodes for collocation. In this way, we can extend the use of collocation methods to uncertainty propagation with multivariate, dependent input, in which the output approximation is piecewise constant on the clusters. The approach is particularly useful in situations where the probability distribution of the input is unknown, and only a sample from the input distribution is available. We examine several clustering methods and assess the convergence of collocation based on these methods both theoretically and numerically. We demonstrate good performance of the proposed methods, most notably for the challenging case of nonlinearly dependent inputs in higher dimensions. Numerical tests with input dimension up to 16 are included, using as benchmarks the Genz test functions and a test case from computational fluid dynamics (lid-driven cavity flow).
Submitted 15 April, 2019; v1 submitted 8 March, 2017;
originally announced March 2017.
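The key idea in the abstract above, cluster centers as collocation nodes, can be sketched as follows. Plain k-means (Lloyd's algorithm) stands in for the clustering methods the paper examines, and the fraction of sample points in each cluster is used as the node weight. The names `kmeans_nodes` and `collocation_mean`, the weight choice, and the iteration count are assumptions for illustration.

```python
import numpy as np


def _nearest(x, centers):
    """Index of the nearest center for every row of x."""
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans_nodes(sample, k, iters=50, seed=0):
    """Cluster centers and cluster-fraction weights via Lloyd's algorithm."""
    rng = np.random.default_rng(seed)
    x = np.asarray(sample, dtype=float)
    # Initialize the centers with k distinct sample points.
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        labels = _nearest(x, centers)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    labels = _nearest(x, centers)
    weights = np.bincount(labels, minlength=k) / len(x)
    return centers, weights


def collocation_mean(f, centers, weights):
    """Approximate E[f(X)] with a piecewise-constant output on the clusters."""
    return float(sum(w * f(c) for c, w in zip(centers, weights)))
```

The expensive model `f` is evaluated only at the `k` centers, and nothing in the construction requires the input distribution to be known or the inputs to be independent: only a sample is needed.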
-
Stochastic Climate Theory
Authors:
Georg A. Gottwald,
Daan T. Crommelin,
Christian L. E. Franzke
Abstract:
In this chapter we review stochastic modelling methods in climate science. First, we provide a conceptual framework for stochastic modelling of deterministic dynamical systems based on the Mori-Zwanzig formalism. The Mori-Zwanzig equations contain a Markov term, a memory term and a term suggestive of stochastic noise. Within this framework we express standard model reduction methods such as averaging and homogenization, which eliminate the memory term. We further discuss ways to deal with the memory term and how the type of noise depends on the underlying deterministic chaotic system. Second, we review current approaches in stochastic data-driven models. We discuss how the drift and diffusion coefficients of models in the form of stochastic differential equations can be estimated from observational data. We pay attention to situations where the data stems from multiscale systems, a relevant topic in the context of data from the climate system. Furthermore, we discuss the use of discrete stochastic processes (Markov chains) for, e.g., stochastic subgrid-scale modelling and other topics in climate science.
Submitted 22 December, 2016;
originally announced December 2016.
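As an illustration of estimating drift and diffusion coefficients from data, mentioned in the abstract above, here is a minimal sketch for the simplest case: an Ornstein-Uhlenbeck process dX = -theta X dt + sigma dW with constant coefficients. The model, the Euler-Maruyama simulation, and the function names are assumptions for illustration and are not taken from the chapter. The drift is fitted by least squares on the increments and the diffusion from their quadratic variation.

```python
import numpy as np


def simulate_ou(theta, sigma, dt, n_steps, seed=0):
    """Euler-Maruyama path of dX = -theta * X dt + sigma dW, started at 0."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, np.sqrt(dt), n_steps)  # Brownian increments
    x = np.empty(n_steps + 1)
    x[0] = 0.0
    for i in range(n_steps):
        x[i + 1] = x[i] - theta * x[i] * dt + sigma * noise[i]
    return x


def estimate_linear_drift_diffusion(x, dt):
    """Estimate theta and sigma from a path sampled at time step dt."""
    dx = np.diff(x)
    x0 = x[:-1]
    # Least-squares fit of the increments to the linear drift -theta * x * dt.
    theta_hat = -np.sum(x0 * dx) / (np.sum(x0 ** 2) * dt)
    # Quadratic variation: the mean squared increment is about sigma^2 * dt.
    sigma_hat = np.sqrt(np.sum(dx ** 2) / (len(dx) * dt))
    return float(theta_hat), float(sigma_hat)
```

This naive estimator assumes the data really come from the fitted model at the sampling interval used. For multiscale data, the situation the abstract singles out, estimates computed from increments at a too-small sampling interval can be biased, so the sketch should not be applied there without care.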