Search | arXiv e-print repository

Non-Sequential Ensemble Kalman Filtering using Distributed Arrays

Authors: Cédric Travelletti, Jörg Franke, David Ginsbourger, Stefan Brönnimann

Abstract: This work introduces a new, distributed implementation of the Ensemble Kalman Filter (EnKF) that allows for non-sequential assimilation of large datasets in high-dimensional problems. The traditional EnKF algorithm is computationally intensive and exhibits difficulties in applications requiring interaction with the background covariance matrix, prompting the use of methods like sequential assimilation which can introduce unwanted consequences, such as dependency on observation ordering. Our implementation leverages recent advancements in distributed computing to enable the construction and use of the full model error covariance matrix in distributed memory, allowing for single-batch assimilation of all observations and eliminating order dependencies. Comparative performance assessments, involving both synthetic and real-world paleoclimatic reconstruction applications, indicate that the new, non-sequential implementation outperforms the traditional, sequential one.

Submitted 21 November, 2023; originally announced November 2023.

arXiv:2310.07315 [pdf, ps, other]

Consistency of some sequential experimental design strategies for excursion set estimation based on vector-valued Gaussian processes

Authors: Philip Stange, David Ginsbourger

Abstract: We tackle the extension to the vector-valued case of consistency results for Stepwise Uncertainty Reduction sequential experimental design strategies established in [Bect et al., A supermartingale approach to Gaussian process based sequential design of experiments, Bernoulli 25, 2019]. This led us in the first place to clarify, assuming a compact index set, how the connection between continuous Gaussian processes and Gaussian measures on the Banach space of continuous functions carries over to vector-valued settings. From there, a number of concepts and properties from the aforementioned paper can be readily extended. However, vector-valued settings do complicate things for some results, mainly due to the lack of continuity for the pseudo-inverse mapping that affects the conditional mean and covariance function given finitely many pointwise observations. We apply the obtained results to the Integrated Bernoulli Variance and the Expected Measure Variance uncertainty functionals employed in [Fossum et al., Learning excursion sets of vector-valued Gaussian random fields for autonomous ocean sampling, The Annals of Applied Statistics 15, 2021] for the estimation of excursion sets of vector-valued functions.

Submitted 11 October, 2023; originally announced October 2023.

arXiv:2310.04082 [pdf, other]

An energy-based model approach to rare event probability estimation

Authors: Lea Friedli, David Ginsbourger, Arnaud Doucet, Niklas Linde

Abstract: The estimation of rare event probabilities plays a pivotal role in diverse fields. Our aim is to determine the probability of a hazard or system failure occurring when a quantity of interest exceeds a critical value. In our approach, the distribution of the quantity of interest is represented by an energy density, characterized by a free energy function. To efficiently estimate the free energy, a bias potential is introduced. Using concepts from energy-based models (EBM), this bias potential is optimized such that the corresponding probability density function approximates a pre-defined distribution targeting the failure region of interest. Given the optimal bias potential, the free energy function and the rare event probability of interest can be determined. The approach is applicable not just in traditional rare event settings where the variable upon which the quantity of interest relies has a known distribution, but also in inversion settings where the variable follows a posterior distribution. By combining the EBM approach with a Stein discrepancy-based stopping criterion, we aim for a balanced accuracy-efficiency trade-off. Furthermore, we explore both parametric and non-parametric approaches for the bias potential, with the latter eliminating the need for choosing a particular parameterization, but depending strongly on the accuracy of the kernel density estimate used in the optimization process.
Through three illustrative test cases encompassing both traditional and inversion settings, we show that the proposed EBM approach, when properly configured, (i) allows stable and efficient estimation of rare event probabilities and (ii) compares favorably against subset sampling approaches.

Submitted 6 October, 2023; originally announced October 2023.

arXiv:2307.05846 [pdf, other]

Assessing the calibration of multivariate probabilistic forecasts

Authors: Sam Allen, Johanna Ziegel, David Ginsbourger

Abstract: Rank and PIT histograms are established tools to assess the calibration of probabilistic forecasts. They not only check whether an ensemble forecast is calibrated, but they also reveal what systematic biases (if any) are present in the forecasts. Several extensions of rank histograms have been proposed to evaluate the calibration of probabilistic forecasts for multivariate outcomes. These extensions introduce a so-called pre-rank function that condenses the multivariate forecasts and observations into univariate objects, from which a standard rank histogram can be produced. Existing pre-rank functions typically aim to preserve as much information as possible when condensing the multivariate forecasts and observations into univariate objects. Although this is sensible when conducting statistical tests for multivariate calibration, it can hinder the interpretation of the resulting histograms. In this paper, we demonstrate that there are few restrictions on the choice of pre-rank function, meaning forecasters can choose a pre-rank function depending on what information they want to extract from their forecasts. We introduce the concept of simple pre-rank functions, and provide examples that can be used to assess the location, scale, and dependence structure of multivariate probabilistic forecasts, as well as pre-rank functions tailored to the evaluation of probabilistic spatial field forecasts.
The simple pre-rank functions that we introduce are easy to interpret, easy to implement, and they deliberately provide complementary information, meaning several pre-rank functions can be employed to achieve a more complete understanding of multivariate forecast performance. We then discuss how e-values can be employed to formally test for multivariate calibration over time. This is demonstrated in an application to wind speed forecasting using the EUPPBench post-processing benchmark data set.

Submitted 11 July, 2023; originally announced July 2023.

arXiv:2206.07588 [pdf, ps, other]

Characteristic kernels on Hilbert spaces, Banach spaces, and on sets of measures

Authors: Johanna Ziegel, David Ginsbourger, Lutz Dümbgen

Abstract: We present new classes of positive definite kernels on non-standard spaces that are integrally strictly positive definite or characteristic. In particular, we discuss radial kernels on separable Hilbert spaces, and introduce broad classes of kernels on Banach spaces and on metric spaces of strong negative type. The general results are used to give explicit classes of kernels on separable $L^p$ spaces and on sets of measures.

Submitted 15 June, 2022; originally announced June 2022.

arXiv:2202.12732 [pdf, other]

Evaluating forecasts for high-impact events using transformed kernel scores

Authors: Sam Allen, David Ginsbourger, Johanna Ziegel

Abstract: It is informative to evaluate a forecaster's ability to predict outcomes that have a large impact on the forecast user. Although weighted scoring rules have become a well-established tool to achieve this, such scores have been studied almost exclusively in the univariate case, with interest typically placed on extreme events. However, a large impact may also result from events not considered to be extreme from a statistical perspective: the interaction of several moderate events could also generate a high impact. Compound weather events provide a good example of this. To assess forecasts made for high-impact events, this work extends existing results on weighted scoring rules by introducing weighted multivariate scores. To do so, we utilise kernel scores. We demonstrate that the threshold-weighted continuous ranked probability score (twCRPS), arguably the most well-known weighted scoring rule, is a kernel score. This result leads to a convenient representation of the twCRPS when the forecast is an ensemble, and also permits a generalisation that can be employed with alternative kernels, allowing us to introduce, for example, a threshold-weighted energy score and threshold-weighted variogram score. To illustrate the additional information that these weighted multivariate scoring rules provide, results are presented for a case study in which the weighted scores are used to evaluate daily precipitation accumulation forecasts, with particular interest on events that could lead to flooding.

Submitted 25 February, 2022; originally announced February 2022.

arXiv:2110.05210 [pdf, other]

doi 10.1093/gji/ggab381

Lithological Tomography with the Correlated Pseudo-Marginal Method

Authors: Lea Friedli, Niklas Linde, David Ginsbourger, Arnaud Doucet

Abstract: We consider lithological tomography in which the posterior distribution of (hydro)geological parameters of interest is inferred from geophysical data by treating the intermediate geophysical properties as latent variables. In such a latent variable model, one needs to estimate the intractable likelihood of the (hydro)geological parameters given the geophysical data. The pseudo-marginal method is an adaptation of the Metropolis-Hastings algorithm in which an unbiased approximation of this likelihood is obtained by Monte Carlo averaging over samples from, in this setting, the noisy petrophysical relationship linking (hydro)geological and geophysical properties. To make the method practical in data-rich geophysical settings with low noise levels, we demonstrate that the Monte Carlo sampling must rely on importance sampling distributions that well approximate the posterior distribution of petrophysical scatter around the sampled (hydro)geological parameter field. To achieve a suitable acceptance rate, we rely both on (1) the correlated pseudo-marginal method, which correlates the samples used in the proposed and current states of the Markov chain, and (2) a model proposal scheme that preserves the prior distribution. As a synthetic test example, we infer porosity fields using crosshole ground-penetrating radar (GPR) first-arrival travel times. We use a (50x50)-dimensional pixel-based parameterization of the multi-Gaussian porosity field with known statistical parameters, resulting in a parameter space of high dimension.
We demonstrate that the correlated pseudo-marginal method with our proposed importance sampling and prior-preserving proposal scheme outperforms current state-of-the-art methods in both linear and non-linear settings by greatly enhancing the posterior exploration.

Submitted 11 October, 2021; originally announced October 2021.

Journal ref: Geophysical Journal International, Volume 228, Issue 2, February 2022

arXiv:2109.03457 [pdf, other]

Uncertainty Quantification and Experimental Design for Large-Scale Linear Inverse Problems under Gaussian Process Priors

Authors: Cédric Travelletti, David Ginsbourger, Niklas Linde

Abstract: We consider the use of Gaussian process (GP) priors for solving inverse problems in a Bayesian framework. As is well known, the computational complexity of GPs scales cubically in the number of datapoints. We here show that in the context of inverse problems involving integral operators, one faces additional difficulties that hinder inversion on large grids. Furthermore, in that context, covariance matrices can become too large to be stored. By leveraging results about sequential disintegrations of Gaussian measures, we are able to introduce an implicit representation of posterior covariance matrices that reduces the memory footprint by only storing low rank intermediate matrices, while allowing individual elements to be accessed on-the-fly without needing to build full posterior covariance matrices. Moreover, it allows for fast sequential inclusion of new observations. These features are crucial when considering sequential experimental design tasks. We demonstrate our approach by computing sequential data collection plans for excursion set recovery for a gravimetric inverse problem, where the goal is to provide fine resolution estimates of high density regions inside the Stromboli volcano, Italy. Sequential data collection plans are computed by extending the weighted integrated variance reduction (wIVR) criterion to inverse problems. Our results show that this criterion is able to significantly reduce the uncertainty on the excursion volume, reaching close to minimal levels of residual uncertainty.
Overall, our techniques allow the advantages of probabilistic models to be brought to bear on large-scale inverse problems arising in the natural sciences.

Submitted 31 August, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

MSC Class: 86A22; 60G15; 62F15; 62L05

arXiv:2104.08156 [pdf, other]

Fast ABC with joint generative modelling and subset simulation

Authors: Eliane Maalouf, David Ginsbourger, Niklas Linde

Abstract: We propose a novel approach for solving inverse problems with high-dimensional inputs and an expensive forward mapping. It leverages joint deep generative modelling to transfer the original problem spaces to a lower dimensional latent space. By jointly modelling input and output variables and endowing the latent with a prior distribution, the fitted probabilistic model indirectly gives access to the approximate conditional distributions of interest. Since model error and observational noise with unknown distributions are common in practice, we resort to likelihood-free inference with Approximate Bayesian Computation (ABC). Our method calls on ABC by Subset Simulation to explore the regions of the latent space with dissimilarities between generated and observed outputs below prescribed thresholds. We diagnose the diversity of approximate posterior solutions by monitoring the probability content of these regions as a function of the threshold. We further analyze the curvature of the resulting diagnostic curve to propose an adequate ABC threshold. When applied to a cross-borehole tomography example from geophysics, our approach delivers promising performance without using prior knowledge of the forward nor of the noise distribution.

Submitted 16 April, 2021; originally announced April 2021.

Comments: 13 pages, 6 figures

arXiv:2102.07612 [pdf, other]

doi 10.1051/proc/202171108

Goal-oriented adaptive sampling under random field modelling of response probability distributions

Authors: Athénaïs Gautier, David Ginsbourger, Guillaume Pirot

Abstract: In the study of natural and artificial complex systems, responses that are not completely determined by the considered decision variables are commonly modelled probabilistically, resulting in response distributions varying across decision space. We consider cases where the spatial variation of these response distributions does not only concern their mean and/or variance but also other features including for instance shape or uni-modality versus multi-modality. Our contributions build upon a non-parametric Bayesian approach to modelling the thereby induced fields of probability distributions, and in particular to a spatial extension of the logistic Gaussian model. The considered models deliver probabilistic predictions of response distributions at candidate points, allowing for instance to perform (approximate) posterior simulations of probability density functions, to jointly predict multiple moments and other functionals of target distributions, as well as to quantify the impact of collecting new samples on the state of knowledge of the distribution field of interest. In particular, we introduce adaptive sampling strategies leveraging the potential of the considered random distribution field models to guide system evaluations in a goal-oriented way, with a view towards parsimoniously addressing calibration and related problems from non-linear (stochastic) inversion and global optimisation.

Submitted 17 March, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

arXiv:2101.03108 [pdf, other]

Fast calculation of Gaussian Process multiple-fold cross-validation residuals and their covariances

Authors: David Ginsbourger, Cedric Schärer

Abstract: We generalize fast Gaussian process leave-one-out formulae to multiple-fold cross-validation, highlighting in turn the covariance structure of cross-validation residuals in both Simple and Universal Kriging frameworks. We illustrate how resulting covariances affect model diagnostics. We further establish in the case of noiseless observations that correcting for covariances between residuals in cross-validation-based estimation of the scale parameter leads back to MLE. Also, we highlight in broader settings how differences between pseudo-likelihood and likelihood methods boil down to accounting or not for residual covariances. The proposed fast calculation of cross-validation residuals is implemented and benchmarked against a naive implementation. Numerical experiments highlight the accuracy and substantial speed-ups that our approach enables. However, as supported by a discussion on main drivers of computational costs and by a numerical benchmark, speed-ups steeply decline as the number of folds (say, all sharing the same size) decreases. An application to a contaminant localization test case illustrates that grouping clustered observations in folds may help improve model assessment and parameter fitting compared to Leave-One-Out. Overall, our results enable fast multiple-fold cross-validation, have direct consequences in model diagnostics, and pave the way to future work on hyperparameter fitting and on the promising field of goal-oriented fold design.

Submitted 3 June, 2023; v1 submitted 8 January, 2021; originally announced January 2021.

arXiv:2007.03722 [pdf, other]

Learning excursion sets of vector-valued Gaussian random fields for autonomous ocean sampling

Authors: Trygve Olav Fossum, Cédric Travelletti, Jo Eidsvik, David Ginsbourger, Kanna Rajan

Abstract: Improving and optimizing oceanographic sampling is a crucial task for marine science and maritime resource management. Faced with limited resources in understanding processes in the water-column, the combination of statistics and autonomous systems provides new opportunities for experimental design. In this work we develop efficient spatial sampling methods for characterizing regions defined by simultaneous exceedances above prescribed thresholds of several responses, with an application focus on mapping coastal ocean phenomena based on temperature and salinity measurements. Specifically, we define a design criterion based on uncertainty in the excursions of vector-valued Gaussian random fields, and derive tractable expressions for the expected integrated Bernoulli variance reduction in such a framework. We demonstrate how this criterion can be used to prioritize sampling efforts at locations that are ambiguous, making exploration more effective. We use simulations to study and compare properties of the considered approaches, followed by results from field deployments with an autonomous underwater vehicle as part of a study mapping the boundary of a river plume. The results demonstrate the potential of combining statistical methods and robotic platforms to effectively inform and execute data-driven environmental sampling.

Submitted 18 August, 2020; v1 submitted 7 July, 2020; originally announced July 2020.

arXiv:1912.11827 [pdf, other]

Area-covering postprocessing of ensemble precipitation forecasts using topographical and seasonal conditions

Authors: Lea Friedli, David Ginsbourger, Jonas Bhend

Abstract: Probabilistic weather forecasts from ensemble systems require statistical postprocessing to yield calibrated and sharp predictive distributions. This paper presents an area-covering postprocessing method for ensemble precipitation predictions. We rely on the ensemble model output statistics (EMOS) approach, which generates probabilistic forecasts with a parametric distribution whose parameters depend on (statistics of) the ensemble prediction. A case study with daily precipitation predictions across Switzerland highlights that postprocessing at observation locations indeed improves high-resolution ensemble forecasts, with 4.5% CRPS reduction on average in the case of a lead time of 1 day. Our main aim is to achieve such an improvement without binding the model to stations, by leveraging topographical covariates. Specifically, regression coefficients are estimated by weighting the training data in relation to the topographical similarity between their station of origin and the prediction location. In our case study, this approach is found to reproduce the performance of the local model without using local historical data for calibration. We further identify that one key difficulty is that postprocessing often degrades the performance of the ensemble forecast during summer and early autumn. To mitigate, we additionally estimate on the training set whether postprocessing at a specific location is expected to improve the prediction. If not, the direct model output is used.
This extension reduces the CRPS of the topographical model by up to another 1.7% on average at the price of a slight degradation in calibration. In this case, the highest improvement is achieved for a lead time of 4 days.

Submitted 12 October, 2020; v1 submitted 26 December, 2019; originally announced December 2019.

arXiv:1910.04086 [pdf, other]

Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization

Authors: Poompol Buathong, David Ginsbourger, Tipaluck Krityakierne

Abstract: We focus on kernel methods for set-valued inputs and their application to Bayesian set optimization, notably combinatorial optimization. We investigate two classes of set kernels that both rely on Reproducing Kernel Hilbert Space embeddings, namely the ``Double Sum'' (DS) kernels recently considered in Bayesian set optimization, and a class introduced here called ``Deep Embedding'' (DE) kernels that essentially consists in applying a radial kernel on Hilbert space on top of the canonical distance induced by another kernel such as a DS kernel. We establish in particular that while DS kernels typically suffer from a lack of strict positive definiteness, vast subclasses of DE kernels built upon DS kernels do possess this property, enabling in turn combinatorial optimization without requiring to introduce a jitter parameter. Proofs of theoretical results about considered kernels are complemented by a few practicalities regarding hyperparameter fitting. We furthermore demonstrate the applicability of our approach in prediction and optimization tasks, relying both on toy examples and on two test cases from mechanical engineering and hydrogeology, respectively. Experimental results highlight the applicability and compared merits of the considered approaches while opening new perspectives in prediction and sequential design with set inputs.

Submitted 10 March, 2020; v1 submitted 9 October, 2019; originally announced October 2019.

arXiv:1805.00753 [pdf, other]

Gaussian processes with multidimensional distribution inputs via optimal transport and Hilbertian embedding

Authors: Francois Bachoc, Alexandra Suvorikova, David Ginsbourger, Jean-Michel Loubes, Vladimir Spokoiny

Abstract: In this work, we investigate Gaussian Processes indexed by multidimensional distributions. While directly constructing radial positive definite kernels based on the Wasserstein distance has been proven to be possible in the unidimensional case, such constructions do not extend well to the multidimensional case as we illustrate here. To tackle the problem of defining positive definite kernels between multivariate distributions based on optimal transport, we appeal instead to Hilbert space embeddings relying on optimal transport maps to a reference distribution, that we suggest to take as a Wasserstein barycenter. We characterize in turn radial positive definite kernels on Hilbert spaces, and show that the covariance parameters of virtually all parametric families of covariance functions are microergodic in the case of (infinite-dimensional) Hilbert spaces. We also investigate statistical properties of our suggested positive definite kernels on multidimensional distributions, with a focus on consistency when a population Wasserstein barycenter is replaced by an empirical barycenter and additional explicit results in the special case of Gaussian distributions. Finally, we study the Gaussian process methodology based on our suggested positive definite kernels in regression problems with multidimensional distribution inputs, on simulation data stemming both from synthetic examples and from a mechanical engineering test case.

Submitted 11 April, 2019; v1 submitted 2 May, 2018; originally announced May 2018.

arXiv:1711.01878 [pdf, other]

Modeling non-stationary extreme dependence with stationary max-stable processes and multidimensional scaling

Authors: Clément Chevalier, David Ginsbourger, Olivia Martius

Abstract: Modeling the joint distribution of extreme weather events in multiple locations is a challenging task with important applications. In this study, we use max-stable models to study extreme daily precipitation events in Switzerland. The non-stationarity of the spatial process at hand involves important challenges, which are often dealt with by using a stationary model in a so-called climate space, with well-chosen covariates. Here, we instead chose to warp the weather stations under study in a latent space of higher dimension using multidimensional scaling (MDS). The advantage of this approach is its improved flexibility to reproduce highly non-stationary phenomena, while keeping a tractable stationary spatial model in the latent space. Two model fitting approaches, which both use MDS, are presented and compared to a classical approach that relies on composite likelihood maximization in a climate space. Results suggest that the proposed methods better reproduce the observed extremal coefficients and their complex spatial dependence.

Submitted 28 November, 2018; v1 submitted 6 November, 2017; originally announced November 2017.

arXiv:1710.00688 [pdf, other]

doi 10.1080/00401706.2018.1562987

Profile extrema for visualizing and quantifying uncertainties on excursion regions. Application to coastal flooding

Authors: Dario Azzimonti, David Ginsbourger, Jérémy Rohmer, Déborah Idier

Abstract: We consider the problem of describing excursion sets of a real-valued function $f$, i.e. the set of inputs where $f$ is above a fixed threshold. Such regions are hard to visualize if the input space dimension, $d$, is higher than 2. For a given projection matrix from the input space to a lower dimensional (usually $1,2$) subspace, we introduce profile sup (inf) functions that associate to each point in the projection's image the sup (inf) of the function constrained over the pre-image of this point by the considered projection. Plots of profile extrema functions convey a simple, although intrinsically partial, visualization of the set. We consider expensive to evaluate functions where only a very limited number of evaluations, $n$, is available, e.g. $n<100d$, and we surrogate $f$ with a posterior quantity of a Gaussian process (GP) model. We first compute profile extrema functions for the posterior mean given $n$ evaluations of $f$. We quantify the uncertainty on such estimates by studying the distribution of GP profile extrema with posterior quasi-realizations obtained from an approximating process. We control such approximation with a bound inherited from the Borell-TIS inequality. The technique is applied to analytical functions ($d=2,3$) and to a $5$-dimensional coastal flooding test case for a site located on the Atlantic French coast. Here $f$ is a numerical model returning the area of flooded surface in the coastal region given some offshore conditions. Profile extrema functions allowed us to better understand which offshore conditions impact large flooding events.

Submitted 3 December, 2018; v1 submitted 2 October, 2017; originally announced October 2017.

Journal ref: Technometrics, 2019

arXiv:1704.05318 [pdf, other]

On the choice of the low-dimensional domain for global optimization via random embeddings

Authors: Mickaël Binois, David Ginsbourger, Olivier Roustant

Abstract: The challenge of taking many variables into account in optimization problems may be overcome under the hypothesis of low effective dimensionality. Then, the search of solutions can be reduced to the random embedding of a low dimensional space into the original one, resulting in a more manageable optimization problem. Specifically, in the case of time consuming black-box functions and when the budget of evaluations is severely limited, global optimization with random embeddings appears as a sound alternative to random search. Yet, in the case of box constraints on the native variables, defining suitable bounds on a low dimensional domain appears to be complex. Indeed, a small search domain does not guarantee to find a solution even under restrictive hypotheses about the function, while a larger one may slow down convergence dramatically. Here we tackle the issue of low-dimensional domain selection based on a detailed study of the properties of the random embedding, giving insight on the aforementioned difficulties. In particular, we describe a minimal low-dimensional set in correspondence with the embedded search space. We additionally show that an alternative equivalent embedding procedure yields simultaneously a simpler definition of the low-dimensional minimal set and better properties in practice. Finally, the performance and robustness gains of the proposed enhancements for Bayesian optimization are illustrated on numerical examples.

Submitted 22 October, 2018; v1 submitted 18 April, 2017; originally announced April 2017.

arXiv:1611.07256 [pdf, other]

doi 10.1080/00401706.2019.1693427

Adaptive Design of Experiments for Conservative Estimation of Excursion Sets

Authors: Dario Azzimonti, David Ginsbourger, Clément Chevalier, Julien Bect, Yann Richet

Abstract: We consider the problem of estimating the set of all inputs that leads a system to some particular behavior. The system is modeled by an expensive-to-evaluate function, such as a computer experiment, and we are interested in its excursion set, i.e. the set of points where the function takes values above or below some prescribed threshold. The objective function is emulated with a Gaussian Process (GP) model based on an initial design of experiments enriched with evaluation results at (batch-)sequentially determined input points. The GP model provides conservative estimates for the excursion set, which control false positives while minimizing false negatives. We introduce adaptive strategies that sequentially select new evaluations of the function by reducing the uncertainty on conservative estimates. Following the Stepwise Uncertainty Reduction approach we obtain new evaluations by minimizing adapted criteria. Tractable formulae for the conservative criteria are derived, which allow more convenient optimization. The method is benchmarked on random functions generated under the model assumptions in different scenarios of noise and batch size. We then apply it to a reliability engineering test case. Overall, the proposed strategy of minimizing false negatives in conservative estimation achieves competitive performance both in terms of model-based and model-free indicators.

Submitted 4 February, 2020; v1 submitted 22 November, 2016; originally announced November 2016.

Journal ref: Technometrics, 63(1):13-26, 2021

arXiv:1609.02700 [pdf, ps, other]

Efficient batch-sequential Bayesian optimization with moments of truncated Gaussian vectors

Authors: Sébastien Marmin, Clément Chevalier, David Ginsbourger

Abstract: We deal with the efficient parallelization of Bayesian global optimization algorithms, and more specifically of those based on the expected improvement criterion and its variants. A closed form formula relying on multivariate Gaussian cumulative distribution functions is established for a generalized version of the multipoint expected improvement criterion. In turn, the latter relies on intermediate results that could be of independent interest concerning moments of truncated Gaussian vectors. The obtained expansion of the criterion enables studying its differentiability with respect to point batches and calculating the corresponding gradient in closed form. Furthermore, we derive fast numerical approximations of this gradient and propose efficient batch optimization strategies. Numerical experiments illustrate that the proposed approaches enable computational savings of between one and two orders of magnitude, hence enabling derivative-based batch-sequential acquisition function maximization to become a practically implementable and efficient standard.

Submitted 9 September, 2016; originally announced September 2016.

arXiv:1608.01118 [pdf, ps, other]

A supermartingale approach to Gaussian process based sequential design of experiments

Authors: Julien Bect, François Bachoc, David Ginsbourger

Abstract: Gaussian process (GP) models have become a well-established framework for the adaptive design of costly experiments, and notably of computer experiments. GP-based sequential designs have been found practically efficient for various objectives, such as global optimization (estimating the global maximum or maximizer(s) of a function), reliability analysis (estimating a probability of failure) or the estimation of level sets and excursion sets. In this paper, we study the consistency of an important class of sequential designs, known as stepwise uncertainty reduction (SUR) strategies. Our approach relies on the key observation that the sequence of residual uncertainty measures, in SUR strategies, is generally a supermartingale with respect to the filtration generated by the observations. This observation enables us to establish generic consistency results for a broad class of SUR strategies. The consistency of several popular sequential design strategies is then obtained by means of this general result. Notably, we establish the consistency of two SUR strategies proposed by Bect, Ginsbourger, Li, Picheny and Vazquez (Stat. Comp., 2012); to the best of our knowledge, these are the first proofs of consistency for GP-based sequential design algorithms dedicated to the estimation of excursion sets and their measure. We also establish a new, more general proof of consistency for the expected improvement algorithm for global optimization which, unlike previous results in the literature, applies to any GP with continuous sample paths.

Submitted 30 August, 2018; v1 submitted 3 August, 2016; originally announced August 2016.

arXiv:1603.05031 [pdf, other]

doi 10.1080/10618600.2017.1360781

Estimating orthant probabilities of high dimensional Gaussian vectors with an application to set estimation

Authors: Dario Azzimonti, David Ginsbourger

Abstract: The computation of Gaussian orthant probabilities has been extensively studied for low-dimensional vectors. Here, we focus on the high-dimensional case and we present a two-step procedure relying on both deterministic and stochastic techniques. The proposed estimator relies indeed on splitting the probability into a low-dimensional term and a remainder. While the low-dimensional probability can be estimated by fast and accurate quadrature, the remainder requires Monte Carlo sampling. We further refine the estimation by using a novel asymmetric nested Monte Carlo (anMC) algorithm for the remainder and we highlight cases where this approximation brings substantial efficiency gains. The proposed methods are compared against state-of-the-art techniques in a numerical study, which also calls attention to the advantages and drawbacks of the procedure. Finally, the proposed method is applied to derive conservative estimates of excursion sets of expensive to evaluate deterministic functions under a Gaussian random field prior, without requiring a Markov assumption. Supplementary material for this article is available online.

Submitted 30 November, 2018; v1 submitted 16 March, 2016; originally announced March 2016.

Journal ref: Journal of Computational and Graphical Statistics, Taylor & Francis, 2018, 27 (2), pp.255-267

arXiv:1503.05509 [pdf, ps, other]

Differentiating the multipoint Expected Improvement for optimal batch design

Authors: Sébastien Marmin, Clément Chevalier, David Ginsbourger

Abstract: This work deals with parallel optimization of expensive objective functions which are modeled as sample realizations of Gaussian processes. The study is formalized as a Bayesian optimization problem, or continuous multi-armed bandit problem, where a batch of q > 0 arms is pulled in parallel at each iteration. Several algorithms have been developed for choosing batches by trading off exploitation and exploration. As of today, the maximum Expected Improvement (EI) and Upper Confidence Bound (UCB) selection rules appear as the most prominent approaches for batch selection. Here, we build upon recent work on the multipoint Expected Improvement criterion, for which an analytic expansion relying on Tallis' formula was recently established. The computational burden of this selection rule being still an issue in application, we derive a closed-form expression for the gradient of the multipoint Expected Improvement, which aims at facilitating its maximization using gradient-based ascent algorithms. Substantial computational savings are shown in application. In addition, our algorithms are tested numerically and compared to state-of-the-art UCB-based batch-sequential algorithms. Combining starting designs relying on UCB with gradient-based EI local optimization finally appears as a sound option for batch design in distributed Gaussian Process optimization.

Submitted 2 September, 2019; v1 submitted 18 March, 2015; originally announced March 2015.

arXiv:1501.03659 [pdf, ps, other]

doi 10.1137/141000749

Quantifying uncertainties on excursion sets under a Gaussian random field prior

Authors: Dario Azzimonti, Julien Bect, Clément Chevalier, David Ginsbourger

Abstract: We focus on the problem of estimating and quantifying uncertainties on the excursion set of a function under a limited evaluation budget. We adopt a Bayesian approach where the objective function is assumed to be a realization of a Gaussian random field. In this setting, the posterior distribution on the objective function gives rise to a posterior distribution on excursion sets. Several approaches exist to summarize the distribution of such sets based on random closed set theory. While the recently proposed Vorob'ev approach exploits analytical formulae, further notions of variability require Monte Carlo estimators relying on Gaussian random field conditional simulations. In the present work we propose a method to choose Monte Carlo simulation points and obtain quasi-realizations of the conditional field at fine designs through affine predictors. The points are chosen optimally in the sense that they minimize the posterior expected distance in measure between the excursion set and its reconstruction. The proposed method reduces the computational costs due to Monte Carlo simulations and enables the computation of quasi-realizations on fine designs in large dimensions. We apply this reconstruction approach to obtain realizations of an excursion set on a fine grid which allow us to give a new measure of uncertainty based on the distance transform of the excursion set. Finally we present a safety engineering test case where the simulation method is employed to compute a Monte Carlo estimate of a contour line.

Submitted 13 April, 2016; v1 submitted 15 January, 2015; originally announced January 2015.

Journal ref: SIAM/ASA Journal on Uncertainty Quantification, 4(1):850-874, 2016

arXiv:1411.3685 [pdf, other]

A warped kernel improving robustness in Bayesian optimization via random embeddings

Authors: Mickaël Binois, David Ginsbourger, Olivier Roustant

Abstract: This work extends the Random Embedding Bayesian Optimization approach by integrating a warping of the high dimensional subspace within the covariance kernel. The proposed warping, which relies on elementary geometric considerations, mitigates the drawbacks of the high extrinsic dimensionality while preventing the algorithm from evaluating points that give redundant information. It also alleviates constraints on bound selection for the embedded domain, thus improving the robustness, as illustrated with a test case with 25 variables and intrinsic dimension 6.

Submitted 18 March, 2015; v1 submitted 13 November, 2014; originally announced November 2014.

arXiv:1308.1359 [pdf, other]

Invariances of random fields paths, with applications in Gaussian Process Regression

Authors: David Ginsbourger, Olivier Roustant, Nicolas Durrande

Abstract: We study pathwise invariances of centred random fields that can be controlled through the covariance. A result involving composition operators is obtained in second-order settings, and we show that various path properties including additivity boil down to invariances of the covariance kernel. These results are extended to a broader class of operators in the Gaussian case, via the Loève isometry. Several covariance-driven pathwise invariances are illustrated, including fields with symmetric paths, centred paths, harmonic paths, or sparse paths. The proposed approach delivers a number of promising results and perspectives in Gaussian process regression.

Submitted 6 August, 2013; originally announced August 2013.

arXiv:1203.6452 [pdf, ps, other]

Corrected Kriging update formulae for batch-sequential data assimilation

Authors: Clément Chevalier, David Ginsbourger

Abstract: Recently, much effort has been devoted to the efficient computation of Kriging predictors when observations are assimilated sequentially. In particular, Kriging update formulae enabling significant computational savings were derived in Barnes and Watson (1992), Gao et al. (1996), and Emery (2009). Taking advantage of the previous Kriging mean and variance calculations helps avoid a costly $(n+1) \times (n+1)$ matrix inversion when adding one observation to the $n$ already available ones. In addition to traditional update formulae taking into account a single new observation, Emery (2009) also proposed formulae for the batch-sequential case, i.e. when $r > 1$ new observations are simultaneously assimilated. However, the Kriging variance and covariance formulae given without proof in Emery (2009) for the batch-sequential case are not correct. In this paper we fix this issue and establish corrected expressions for updated Kriging variances and covariances when assimilating several observations in parallel.

Submitted 29 March, 2012; originally announced March 2012.

arXiv:1111.6233 [pdf, ps, other]

Additive Covariance Kernels for High-Dimensional Gaussian Process Modeling

Authors: Nicolas Durrande, David Ginsbourger, Olivier Roustant, Laurent Carraro

Abstract: Gaussian process models (also called Kriging models) are often used as mathematical approximations of expensive experiments. However, with classical covariance kernels, the number of observations required for building an emulator becomes unrealistic as the input dimension increases. In order to get round the curse of dimensionality, a popular approach is to consider simplified models such as additive models. The ambition of the present work is to give an insight into covariance kernels that are well suited for building additive Kriging models and to describe some properties of the resulting models.

Submitted 27 November, 2011; originally announced November 2011.

Comments: arXiv admin note: substantial text overlap with arXiv:1103.4023

Journal ref: Annales de la Faculté de Sciences de Toulouse Tome 21, numéro 3 (2012) p. 481-499

arXiv:1106.3571 [pdf, ps, other]

ANOVA kernels and RKHS of zero mean functions for model-based sensitivity analysis

Authors: Nicolas Durrande, David Ginsbourger, Olivier Roustant, Laurent Carraro

Abstract: Given a reproducing kernel Hilbert space H of real-valued functions and a suitable measure mu over the source space D (subset of R), we decompose H as the sum of a subspace of centered functions for mu and its orthogonal in H. This decomposition leads to a special case of ANOVA kernels, for which the functional ANOVA representation of the best predictor can be elegantly derived, either in an interpolation or regularization framework. The proposed kernels appear to be particularly convenient for analyzing the effect of each (group of) variable(s) and computing sensitivity indices without recursivity.

Submitted 7 December, 2012; v1 submitted 17 June, 2011; originally announced June 2011.

Journal ref: Journal of Multivariate Analysis 115 (2013) 57-67

arXiv:1103.4023 [pdf, ps, other]

Additive Kernels for Gaussian Process Modeling

Authors: Nicolas Durrande, David Ginsbourger, Olivier Roustant

Abstract: Gaussian Process (GP) models are often used as mathematical approximations of computationally expensive experiments. Provided that its kernel is suitably chosen and that enough data is available to obtain a reasonable fit of the simulator, a GP model can beneficially be used for tasks such as prediction, optimization, or Monte-Carlo-based quantification of uncertainty. However, the former conditions become unrealistic when using classical GPs as the dimension of input increases. One popular alternative is then to turn to Generalized Additive Models (GAMs), relying on the assumption that the simulator's response can approximately be decomposed as a sum of univariate functions. If such an approach has been successfully applied in approximation, it is nevertheless not completely compatible with the GP framework and its versatile applications. The ambition of the present work is to give an insight into the use of GPs for additive models by integrating additivity within the kernel, and proposing a parsimonious numerical method for data-driven parameter estimation. The first part of this article deals with the kernels naturally associated to additive processes and the properties of the GP models based on such kernels. The second part is dedicated to a numerical procedure based on relaxation for additive kernel parameter estimation. Finally, the efficiency of the proposed method is illustrated and compared to other approaches on Sobol's g-function.

Submitted 21 March, 2011; originally announced March 2011.

arXiv:1009.5177 [pdf, ps, other]

doi 10.1007/s11222-011-9241-4

Sequential design of computer experiments for the estimation of a probability of failure

Authors: Julien Bect, David Ginsbourger, Ling Li, Victor Picheny, Emmanuel Vazquez

Abstract: This paper deals with the problem of estimating the volume of the excursion set of a function $f:\mathbb{R}^d \to \mathbb{R}$ above a given threshold, under a probability measure on $\mathbb{R}^d$ that is assumed to be known. In the industrial world, this corresponds to the problem of estimating a probability of failure of a system. When only an expensive-to-simulate model of the system is available, the budget for simulations is usually severely limited and therefore classical Monte Carlo methods ought to be avoided. One of the main contributions of this article is to derive SUR (stepwise uncertainty reduction) strategies from a Bayesian-theoretic formulation of the problem of estimating a probability of failure. These sequential strategies use a Gaussian process model of $f$ and aim at performing evaluations of $f$ as efficiently as possible to infer the value of the probability of failure. We compare these strategies to other strategies also based on a Gaussian process model for estimating a probability of failure.

Submitted 24 April, 2012; v1 submitted 27 September, 2010; originally announced September 2010.

Comments: This is an author-generated postprint version. The published version is available at http://www.springerlink.com

MSC Class: 62L05; 62C10; 62P30

Journal ref: Statistics and Computing, 22(3):773-793, 2012

Showing 1–31 of 31 results for author: Ginsbourger, D