Search | arXiv e-print repository

Hoeffding decomposition of black-box models with dependent inputs

Authors: Marouane Il Idrissi, Nicolas Bousquet, Fabrice Gamboa, Bertrand Iooss, Jean-Michel Loubes

Abstract: One of the main challenges for interpreting black-box models is the ability to uniquely decompose square-integrable functions of non-independent random inputs into a sum of functions of every possible subset of variables. However, dealing with dependencies among inputs can be complicated. We propose a novel framework to study this problem, linking three domains of mathematics: probability theory, functional analysis, and combinatorics. We show that, under two reasonable assumptions on the inputs (non-perfect functional dependence and non-degenerate stochastic dependence), it is always possible to decompose such a function uniquely. This generalizes the well-known Hoeffding decomposition. The elements of this decomposition can be expressed using oblique projections and allow for novel interpretability indices for evaluation and variance decomposition purposes. The properties of these novel indices are studied and discussed. This generalization offers a path towards a more precise uncertainty quantification, which can benefit sensitivity analysis and interpretability studies whenever the inputs are dependent. This decomposition is illustrated analytically, and the challenges for adopting these results in practice are discussed.

Submitted 7 March, 2024; v1 submitted 10 October, 2023; originally announced October 2023.

arXiv:2305.07314 [pdf, other]

A comparison between Bayesian and ordinary kriging based on validation criteria: application to radiological characterisation

Authors: Martin Wieskotten, Marielle Crozet, Bertrand Iooss, Céline Lacaux, Amandine Marrel

Abstract: In decommissioning projects of nuclear facilities, the radiological characterisation step aims to estimate the quantity and spatial distribution of different radionuclides. To carry out the estimation, measurements are performed on site to obtain preliminary information. The usual industrial practice consists in applying spatial interpolation tools (such as the ordinary kriging method) to these data to predict the value of interest for the contamination (radionuclide concentration, radioactivity, etc.) at unobserved positions. This paper questions the ordinary kriging tool on the well-known problem of overoptimistic prediction variances, which arise because uncertainties in the estimation of the kriging parameters (variance and range) are not taken into account. To overcome this issue, the practical use of the Bayesian kriging method, where the model parameters are considered as random variables, is examined in depth. The usefulness of Bayesian kriging, whilst comparing its performance to that of ordinary kriging, is demonstrated in the small data context (which is often the case in decommissioning projects). This result is obtained via several numerical tests on different toy models, and using complementary validation criteria: the predictivity coefficient ($Q^2$), the Predictive Variance Adequacy (PVA), the $\alpha$-Confidence Interval plot (and its associated Mean Squared Error alpha (MSEalpha)), and the Predictive Interval Adequacy (PIA). The latter is a new criterion adapted to the Bayesian kriging results.
Finally, the same comparison is performed on a real dataset coming from the decommissioning project of the CEA Marcoule G3 reactor. It illustrates the practical interest of Bayesian kriging in industrial radiological characterisation.

Submitted 12 May, 2023; originally announced May 2023.

arXiv:2301.02539 [pdf, other]

On the coalitional decomposition of parameters of interest

Authors: Marouane Il Idrissi, Nicolas Bousquet, Fabrice Gamboa, Bertrand Iooss, Jean-Michel Loubes

Abstract: Understanding the behavior of a black-box model with probabilistic inputs can be based on the decomposition of a parameter of interest (e.g., its variance) into contributions attributed to each coalition of inputs (i.e., subsets of inputs). In this paper, we produce conditions for obtaining unambiguous and interpretable decompositions of very general parameters of interest. This allows us to recover known decompositions, holding under weaker assumptions than stated in the literature.

Submitted 6 January, 2023; originally announced January 2023.

arXiv:2210.13065 [pdf, other]

Proportional marginal effects for global sensitivity analysis

Authors: Margot Herin, Marouane Il Idrissi, Vincent Chabridon, Bertrand Iooss

Abstract: Performing (variance-based) global sensitivity analysis (GSA) with dependent inputs has recently benefited from cooperative game theory concepts. By using this theory, despite the potential correlation between the inputs, meaningful sensitivity indices can be defined via allocation shares of the model output's variance to each input. The "Shapley effects", i.e., the Shapley values transposed to variance-based GSA problems, allowed for this suitable solution. However, these indices exhibit a particular behavior that can be undesirable: an exogenous input (i.e., which is not explicitly included in the structural equations of the model) can be associated with a strictly positive index when it is correlated to endogenous inputs. In the present work, the use of a different allocation, called the "proportional values", is investigated. A first contribution is to propose an extension of this allocation, suitable for variance-based GSA. Novel GSA indices are then proposed, called the "proportional marginal effects" (PME). The notion of exogeneity is formally defined in the context of variance-based GSA, and it is shown that the PME allow the distinction of exogenous variables, even when they are correlated to endogenous inputs. Moreover, their behavior is compared to the Shapley effects on analytical toy-cases and more realistic use-cases.

Submitted 24 October, 2022; originally announced October 2022.

arXiv:2209.11539 [pdf, other]

Quantile-constrained Wasserstein projections for robust interpretability of numerical and machine learning models

Authors: Marouane Il Idrissi, Nicolas Bousquet, Fabrice Gamboa, Bertrand Iooss, Jean-Michel Loubes

Abstract: Robustness studies of black-box models are recognized as a necessary task for numerical models based on structural equations and predictive models learned from data. These studies must assess the model's robustness to possible misspecification regarding its inputs (e.g., covariate shift). The study of black-box models, through the prism of uncertainty quantification (UQ), is often based on sensitivity analysis involving a probabilistic structure imposed on the inputs, while ML models are solely constructed from observed data. Our work aims at unifying the UQ and ML interpretability approaches, by providing relevant and easy-to-use tools for both paradigms. To provide a generic and understandable framework for robustness studies, we define perturbations of input information relying on quantile constraints and projections with respect to the Wasserstein distance between probability measures, while preserving their dependence structure. We show that this perturbation problem can be analytically solved. Ensuring regularity constraints by means of isotonic polynomial approximations leads to smoother perturbations, which can be more suitable in practice. Numerical experiments on real case studies, from the UQ and ML fields, highlight the computational feasibility of such studies and provide local and global insights on the robustness of black-box models to input perturbations.

Submitted 23 September, 2022; originally announced September 2022.

arXiv:2207.03724 [pdf, other]

Model predictivity assessment: incremental test-set selection and accuracy evaluation

Authors: Elias Fekhari, Bertrand Iooss, Joseph Muré, Luc Pronzato, Maria-João Rendas

Abstract: Unbiased assessment of the predictivity of models learnt by supervised machine-learning methods requires knowledge of the learned function over a reserved test set (not used by the learning algorithm). The quality of the assessment depends, naturally, on the properties of the test set and on the error statistic used to estimate the prediction error. In this work we tackle both issues, proposing a new predictivity criterion that carefully weights the individual observed errors to obtain a global error estimate, and using incremental experimental design methods to "optimally" select the test points on which the criterion is computed. Several incremental constructions are studied, including greedy-packing (coffee-house design), support points and kernel herding techniques. Our results show that the incremental and weighted versions of the latter two, based on Maximum Mean Discrepancy concepts, yield superior performance. An industrial test case provided by the historical French electricity supplier (EDF) illustrates the practical relevance of the methodology, indicating that it is an efficient alternative to expensive cross-validation techniques.

Submitted 8 July, 2022; originally announced July 2022.

Comments: Studies in Theoretical and Applied Statistics, Springer, In press

arXiv:2107.00394 [pdf, other]

Global sensitivity analysis using derivative-based sparse Poincaré chaos expansions

Authors: Nora Lüthen, Olivier Roustant, Fabrice Gamboa, Bertrand Iooss, Stefano Marelli, Bruno Sudret

Abstract: Variance-based global sensitivity analysis, in particular Sobol' analysis, is widely used for determining the importance of input variables to a computational model. Sobol' indices can be computed cheaply based on spectral methods like polynomial chaos expansions (PCE). Another choice is the recently developed Poincaré chaos expansions (PoinCE), whose orthonormal tensor-product basis is generated from the eigenfunctions of one-dimensional Poincaré differential operators. In this paper, we show that the Poincaré basis is the unique orthonormal basis with the property that partial derivatives of the basis again form an orthogonal basis with respect to the same measure as the original basis. This special property makes PoinCE ideally suited for incorporating derivative information into the surrogate modelling process. Assuming that partial derivative evaluations of the computational model are available, we compute spectral expansions in terms of Poincaré basis functions or basis partial derivatives, respectively, by sparse regression. We show on two numerical examples that the derivative-based expansions provide accurate estimates for Sobol' indices, even outperforming PCE in terms of bias and variance. In addition, we derive an analytical expression based on the PoinCE coefficients for a second popular sensitivity index, the derivative-based sensitivity measure (DGSM), and explore its performance as an upper bound to the corresponding total Sobol' indices.

Submitted 9 June, 2023; v1 submitted 1 July, 2021; originally announced July 2021.

Report number: RSUQ-2021-004C

Journal ref: International Journal for Uncertainty Quantification, vol. 13 (6), 57-82, 2023
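
The abstract above notes that Sobol' indices can be read off cheaply from a spectral expansion. As a minimal sketch of that general idea (not the authors' code), assume independent inputs and an orthonormal tensor-product basis indexed by multi-indices, so that the output variance is the sum of squared non-constant coefficients. The function name `sobol_from_expansion` and the dictionary layout are hypothetical choices made for this illustration.

```python
from typing import Dict, List, Tuple

MultiIndex = Tuple[int, ...]


def sobol_from_expansion(
    coeffs: Dict[MultiIndex, float],
) -> Tuple[List[float], List[float]]:
    """First-order and total Sobol' indices from expansion coefficients.

    Assumes (illustration only) independent inputs and an orthonormal
    tensor-product basis: coeffs maps a multi-index alpha to the
    coefficient of the basis function with degree alpha[i] in input i.
    """
    dim = len(next(iter(coeffs)))
    # Output variance: squared coefficients of all non-constant terms.
    variance = sum(c * c for alpha, c in coeffs.items() if any(alpha))
    if variance == 0.0:
        raise ValueError("expansion has no non-constant term")

    first = [0.0] * dim
    total = [0.0] * dim
    for alpha, c in coeffs.items():
        active = [i for i, degree in enumerate(alpha) if degree > 0]
        share = c * c / variance
        for i in active:
            # Any term that involves input i counts towards its total index.
            total[i] += share
        if len(active) == 1:
            # Terms depending on a single input give its first-order index.
            first[active[0]] += share
    return first, total


if __name__ == "__main__":
    toy = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 1.0, (1, 1): 1.0}
    print(sobol_from_expansion(toy))
```

With dependent inputs, or a basis that is not orthonormal, this bookkeeping no longer applies directly.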

arXiv:2101.08083 [pdf, ps, other]

Developments and applications of Shapley effects to reliability-oriented sensitivity analysis with correlated inputs

Authors: Marouane Il Idrissi, Vincent Chabridon, Bertrand Iooss

Abstract: Reliability-oriented sensitivity analysis methods have been developed for understanding the influence of model inputs relative to events which characterize the failure of a system (e.g., a threshold exceedance of the model output). In this field, the target sensitivity analysis focuses primarily on capturing the influence of the inputs on the occurrence of such a critical event. This paper proposes new target sensitivity indices, based on the Shapley values and called "target Shapley effects", allowing for interpretable sensitivity measures under dependent inputs. Two algorithms (one based on Monte Carlo sampling, and a given-data algorithm based on a nearest-neighbors procedure) are proposed for the estimation of these target Shapley effects based on the $\ell^2$ norm. Additionally, the behavior of these target Shapley effects is theoretically and empirically studied through various toy-cases. Finally, the application of these new indices in two real-world use-cases (a river flood model and a COVID-19 epidemiological model) is discussed.

Submitted 19 May, 2021; v1 submitted 20 January, 2021; originally announced January 2021.

arXiv:2008.03060 [pdf, other]

An information geometry approach for robustness analysis in uncertainty quantification of computer codes

Authors: Clement Gauchy, Jerome Stenger, Roman Sueur, Bertrand Iooss

Abstract: Robustness analysis is an emerging field in the domain of uncertainty quantification. It consists of analysing the response of a computer model with uncertain inputs to the perturbation of one or several of its input distributions. Thus, a practical robustness analysis methodology should rely on a coherent definition of a distribution perturbation. This paper addresses this issue by exposing a rigorous way of perturbing densities. The proposed methodology is based on the Fisher distance on manifolds of probability distributions. A numerical method to calculate perturbed densities in practice is presented. This method comes from Lagrangian mechanics and consists of solving a system of ordinary differential equations. This perturbation definition is then used to compute quantile-oriented robustness indices. The resulting Perturbed-Law based sensitivity Indices (PLI) are illustrated on several numerical models. This methodology is also applied to an industrial study (simulation of a loss of coolant accident in a nuclear reactor), where several tens of the model physical parameters are uncertain with limited knowledge concerning their distributions.

Submitted 15 December, 2020; v1 submitted 7 August, 2020; originally announced August 2020.

arXiv:2004.04663 [pdf, other]

The ICSCREAM methodology: Identification of penalizing configurations in computer experiments using screening and metamodel -- Applications in thermal-hydraulics

Authors: A. Marrel, Bertrand Iooss, V Chabridon

Abstract: In the framework of risk assessment in nuclear accident analysis, best-estimate computer codes, associated with a probabilistic modeling of the uncertain input variables, are used to estimate safety margins. A first step in such uncertainty quantification studies is often to identify the critical configurations (or penalizing, in the sense of a prescribed safety margin) of several input parameters (called "scenario inputs"), under the uncertainty on the other input parameters. However, the large CPU-time cost of most of the computer codes used in nuclear engineering, such as the ones related to thermal-hydraulic accident scenario simulations, calls for highly efficient strategies. This work focuses on machine learning algorithms by way of the metamodel-based approach (i.e., a mathematical model which is fitted on a small-size sample of simulations). To achieve it with a very large number of inputs, a specific and original methodology, called ICSCREAM (Identification of penalizing Configurations using SCREening And Metamodel), is proposed. The screening of influential inputs is based on an advanced global sensitivity analysis tool (HSIC importance measures). A Gaussian process metamodel is then sequentially built and used to estimate, within a Bayesian framework, the conditional probabilities of exceeding a high-level threshold, according to the scenario inputs. The efficiency of this methodology is illustrated on two high-dimensional (around a hundred inputs) thermal-hydraulic industrial cases simulating an accident of primary coolant loss in a pressurized water reactor. For both use cases, the study focuses on the peak cladding temperature (PCT) and critical configurations are defined by exceeding the 90%-quantile of PCT. In both cases, the ICSCREAM methodology makes it possible to estimate, by using only around one thousand code simulations, the impact of the scenario inputs and their critical areas of values.

Submitted 27 August, 2021; v1 submitted 8 April, 2020; originally announced April 2020.

arXiv:2002.11475 [pdf]

doi 10.1115/1.4046020

A Visual Sensitivity Analysis for Parameter-Augmented Ensembles of Curves

Authors: Alejandro Ribes, Joachim Pouderoux, Bertrand Iooss

Abstract: Engineers and computational scientists often study the behavior of their simulations by repeated solutions with variations in their parameters, which can be for instance boundary values or initial conditions. Through such simulation ensembles, uncertainty in a solution is studied as a function of the various input parameters. Solutions of numerical simulations are often temporal functions, spatial maps or spatio-temporal outputs. The usual way to deal with such complex outputs is to limit the analysis to several probes in the temporal/spatial domain. This leads to smaller and more tractable ensembles of functional outputs (curves) with their associated input parameters: augmented ensembles of curves. This article describes a system for the interactive exploration and analysis of such augmented ensembles. Descriptive statistics on the functional outputs are performed by Principal Component Analysis projection, kernel density estimation and the computation of High Density Regions. This makes possible the calculation of functional quantiles and outliers. Brushing and linking the elements of the system allows in-depth analysis of the ensemble. The system allows for functional descriptive statistics, cluster detection and finally for the realization of a visual sensitivity analysis via cobweb plots. We present two synthetic examples and then validate our approach in an industrial use-case concerning a marine current study using a hydraulic solver.

Submitted 26 February, 2020; originally announced February 2020.

Journal ref: The Journal of Verification, Validation and Uncertainty Quantification (VVUQ), 2019, 4 (4)

arXiv:2001.11860 [pdf, other]

A graph clustering approach to localization for adaptive covariance tuning in data assimilation based on state-observation mapping

Authors: Sibo Cheng, Jean-Philippe Argaud, Bertrand Iooss, Angélique Ponçot, Didier Lucor

Abstract: An original graph clustering approach to efficient localization of error covariances is proposed within an ensemble-variational data assimilation framework. Here the localization term is very generic and refers to the idea of breaking up a global assimilation into subproblems. This unsupervised localization technique based on a linearized state-observation measure is general and does not rely on any prior information such as relevant spatial scales, empirical cut-off radius or homogeneity assumptions. It automatically segregates the state and observation variables in an optimal number of clusters (otherwise named as subspaces or communities), more amenable to scalable data assimilation. The application of this method does not require underlying block-diagonal structures of prior covariance matrices. In order to deal with inter-cluster connectivity, two alternative data adaptations are proposed. Once the localization is completed, an adaptive covariance diagnosis and tuning is performed within each cluster. Numerical tests show that this approach is less costly and more flexible than a global covariance tuning, and most often results in more accurate background and observations error covariances.

Submitted 31 January, 2020; originally announced January 2020.

arXiv:1910.09408 [pdf, other]

Background Error Covariance Iterative Updating with Invariant Observation Measures for Data Assimilation

Authors: Sibo Cheng, Jean-Philippe Argaud, Bertrand Iooss, Didier Lucor, Angélique Ponçot

Abstract: In order to leverage the information embedded in the background state and observations, covariance matrices modelling is a pivotal point in data assimilation algorithms. These matrices are often estimated from an ensemble of observations or forecast differences. Nevertheless, for many industrial applications the modelling still remains empirical based on some form of expertise and physical constraints enforcement in the absence of historical observations or predictions. We have developed two novel robust adaptive assimilation methods named CUTE (Covariance Updating iTerativE) and PUB (Partially Updating BLUE). These two non-parametric methods are based on different optimization objectives, both capable of sequentially adapting background error covariance matrices in order to improve assimilation results under the assumption of a good knowledge of the observation error covariances. We have compared these two methods with the standard approach using a misspecified background matrix in a shallow water twin experiments framework with a linear observation operator. Numerical experiments have shown that the proposed methods bear a real advantage both in terms of posterior error correlation identification and assimilation accuracy.

Submitted 11 October, 2019; originally announced October 2019.

arXiv:1906.09883 [pdf, other]

Sensitivity Analysis and Generalized Chaos Expansions. Lower Bounds for Sobol indices

Authors: O Roustant, F. Gamboa, B Iooss

Abstract: The so-called polynomial chaos expansion is widely used in computer experiments. For example, it is a powerful tool to estimate Sobol' sensitivity indices. In this paper, we consider generalized chaos expansions built on general tensor Hilbert basis. In this frame, we revisit the computation of the Sobol' indices and give general lower bounds for these indices. The case of the eigenfunctions system associated with a Poincaré differential operator leads to lower bounds involving the derivatives of the analyzed function and provides an efficient tool for variable screening. These lower bounds are put in action both on toy and real life models demonstrating their accuracy.

Submitted 24 June, 2019; originally announced June 2019.

arXiv:1905.04180 [pdf, other]

Large scale in transit computation of quantiles for ensemble runs

Authors: Alejandro Ribes, Théophile Terraz, Bertrand Iooss, Yvan Fournier, Bruno Raffin

Abstract: The classical approach for quantiles computation requires availability of the full sample before ranking it. In uncertainty quantification of numerical simulation models, this approach is not suitable at exascale as large ensembles of simulation runs would need to gather a prohibitively large amount of data. This problem is solved thanks to an on-the-fly and iterative approach based on the Robbins-Monro algorithm. This approach relies on Melissa, a file avoiding, adaptive, fault-tolerant and elastic framework. On a validation case producing 11 TB of data, which consists in 3000 fluid dynamics parallel simulations on a 6M cell mesh, it allows on-line computation of spatio-temporal maps of percentiles.

Submitted 10 May, 2019; originally announced May 2019.
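
The abstract above relies on a Robbins-Monro iteration to estimate quantiles without storing the full sample. The following is a minimal single-quantile sketch of that kind of stochastic-approximation update, not the Melissa implementation. The function name `streaming_quantile`, the zero starting value and the step-size schedule `step_scale / n ** step_exponent` are assumptions made for illustration.

```python
import random
from typing import Iterable


def streaming_quantile(
    samples: Iterable[float],
    alpha: float,
    step_scale: float = 1.0,
    step_exponent: float = 1.0,
) -> float:
    """Estimate the alpha-quantile of a stream with one pass and O(1) memory.

    Robbins-Monro style update: q <- q + step_n * (alpha - 1{x_n <= q}).
    The estimate moves up when too few samples fall below it and down
    when too many do. step_scale and step_exponent are hypothetical
    tuning parameters for this sketch.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    q = 0.0  # arbitrary starting value (assumption)
    for n, x in enumerate(samples, start=1):
        step = step_scale / n ** step_exponent
        indicator = 1.0 if x <= q else 0.0
        q += step * (alpha - indicator)
    return q


if __name__ == "__main__":
    rng = random.Random(0)
    stream = (rng.random() for _ in range(200_000))
    print(streaming_quantile(stream, alpha=0.9))
```

The step scale should be matched to the spread of the data: a value suited to samples in [0, 1], as in the usage lines above, would converge slowly on data spanning several orders of magnitude.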

arXiv:1901.07903 [pdf, other]

Optimal Uncertainty Quantification of a risk measurement from a thermal-hydraulic code using Canonical Moments

Authors: Jerome Stenger, Fabrice Gamboa, Merlin Keller, Bertrand Iooss

Abstract: We study an industrial computer code related to nuclear safety. A major topic of interest is to assess the uncertainties tainting the results of a computer simulation. In this work we gain robustness on the quantification of a risk measurement by accounting for all sources of uncertainties tainting the inputs of a computer code. To that extent, we evaluate the maximum quantile over a class of distributions defined only by constraints on their moments. Two options are available when dealing with such complex optimization problems: one can either optimize under constraints; or preferably, one should reformulate the objective function. We identify a well suited parameterization to compute the optimal quantile based on the theory of canonical moments. It allows an effective, free of constraints, optimization.

Submitted 28 August, 2019; v1 submitted 22 January, 2019; originally announced January 2019.

Comments: arXiv admin note: substantial text overlap with arXiv:1811.12788

arXiv:1811.12788 [pdf, other]

Optimal Uncertainty Quantification on moment class using canonical moments

Authors: Jerome Stenger, Fabrice Gamboa, Merlin Keller, Bertrand Iooss

Abstract: We gain robustness on the quantification of a risk measurement by accounting for all sources of uncertainties tainting the inputs of a computer code. We evaluate the maximum quantile over a class of distributions defined only by constraints on their moments. The methodology is based on the theory of canonical moments that appears to be a well-suited framework for practical optimization.

Submitted 30 November, 2018; originally announced November 2018.

Comments: 21 pages, 9 figures

arXiv:1707.01334 [pdf, other]

Shapley effects for sensitivity analysis with correlated inputs: comparisons with Sobol' indices, numerical estimation and applications

Authors: Bertrand Iooss, Clémentine Prieur

Abstract: The global sensitivity analysis of a numerical model aims to quantify, by means of sensitivity indices estimate, the contributions of each uncertain input variable to the model output uncertainty. The so-called Sobol' indices, which are based on the functional variance analysis, present a difficult interpretation in the presence of statistical dependence between inputs. The Shapley effect was recently introduced to overcome this problem as they allocate the mutual contribution (due to correlation and interaction) of a group of inputs to each individual input within the group. In this paper, using several new analytical results, we study the effects of linear correlation between some Gaussian input variables on Shapley effects, and compare these effects to classical first-order and total Sobol' indices. This illustrates the interest, in terms of sensitivity analysis setting and interpretation, of the Shapley effects in the case of dependent inputs. For the practical issue of computationally demanding computer models, we show that the substitution of the original model by a metamodel (here, kriging) makes it possible to estimate these indices with precision at a reasonable computational cost.

Submitted 25 November, 2019; v1 submitted 5 July, 2017; originally announced July 2017.

arXiv:1707.01296 [pdf]

Sensitivity analysis using perturbed-law based indices for quantiles and application to an industrial case

Authors: Roman Sueur, Bertrand Iooss, Thibault Delage

Abstract: In this paper, we present perturbed law-based sensitivity indices and how to adapt them for quantile-oriented sensitivity analysis. We exhibit a simple way to compute these indices in practice using an importance sampling estimator for quantiles. Some useful asymptotic results about this estimator are also provided. Finally, we apply this method to the study of a numerical model which simulates the behaviour of a component in a hydraulic system in case of severe transient solicitations. The sensitivity analysis is used to assess the impact of epistemic uncertainties about some physical parameters on the output of the model.

Submitted 5 July, 2017; originally announced July 2017.

arXiv:1704.07090 [pdf, ps, other]

An efficient methodology for the analysis and modeling of computer experiments with large number of inputs

Authors: Bertrand Iooss, Amandine Marrel

Abstract: Complex computer codes are often too time expensive to be directly used to perform uncertainty, sensitivity, optimization and robustness analyses. A widely accepted method to circumvent this problem consists in replacing cpu-time expensive computer models by cpu inexpensive mathematical functions, called metamodels. For example, the Gaussian process (Gp) model has shown strong capabilities to solve practical problems, often involving several interlinked issues. However, in case of high dimensional experiments (with typically several tens of inputs), the Gp metamodel building process remains difficult, even unfeasible, and application of variable selection techniques cannot be avoided. In this paper, we present a general methodology allowing to build a Gp metamodel with large number of inputs in a very efficient manner. While our work focused on the Gp metamodel, its principles are fully generic and can be applied to any types of metamodel. The objective is twofold: estimating from a minimal number of computer experiments a highly predictive metamodel. This methodology is successfully applied on an industrial computer code.

Submitted 24 April, 2017; originally announced April 2017.

arXiv:1704.00624 [pdf, other]

Uncertainty and sensitivity analysis of functional risk curves based on Gaussian processes

Authors: Bertrand Iooss, Loïc Le Gratiet

Abstract: A functional risk curve gives the probability of an undesirable event as a function of the value of a critical parameter of a considered physical system. In several applicative situations, this curve is built using phenomenological numerical models which simulate complex physical phenomena. To avoid cpu-time expensive numerical models, we propose to use Gaussian process regression to build functional risk curves. An algorithm is given to provide confidence bounds due to this approximation. Two methods of global sensitivity analysis of the models' random input parameters on the functional risk curve are also studied. In particular, the PLI sensitivity indices allow to understand the effect of misjudgment on the input parameters' probability density functions.

Submitted 25 July, 2017; v1 submitted 3 April, 2017; originally announced April 2017.

arXiv:1701.02373 [pdf, other]

Probabilistic risk bounds for the characterization of radiological contamination

Authors: Géraud Blatman, Thibault Delage, Bertrand Iooss, Nadia Pérot

Abstract: The radiological characterization of contaminated elements (walls, grounds, objects) from nuclear facilities often suffers from a too small number of measurements. In order to determine risk prediction bounds on the level of contamination, some classic statistical methods may then reveal unsuited as they rely upon strong assumptions (e.g. that the underlying distribution is Gaussian) which cannot be checked. Considering that a set of measurements or their average value arise from a Gaussian distribution can sometimes lead to erroneous conclusion, possibly underconservative. This paper presents several alternative statistical approaches which are based on much weaker hypotheses than Gaussianity. They result from general probabilistic inequalities and order-statistics based formula. Given a data sample, these inequalities make it possible to derive prediction intervals for a random variable, which can be directly interpreted as probabilistic risk bounds. For the sake of validation, they are first applied to synthetic data samples generated from several known theoretical distributions. In a second time, the proposed methods are applied to two data sets obtained from real radiological contamination measurements.

Submitted 27 May, 2017; v1 submitted 12 December, 2016; originally announced January 2017.

arXiv:1612.03689 [pdf, other]

Poincaré inequalities on intervals -- application to sensitivity analysis

Authors: Olivier Roustant, Franck Barthe, Bertrand Iooss

Abstract: The development of global sensitivity analysis of numerical model outputs has recently raised new issues on 1-dimensional Poincaré inequalities. Typically two kinds of sensitivity indices are linked by a Poincaré type inequality, which provides upper bounds of the most interpretable index by using the other one, cheaper to compute. This allows performing a low-cost screening of unessential variables. The efficiency of this screening then highly depends on the accuracy of the upper bounds in Poincaré inequalities. The novelty in the questions concerns the wide range of probability distributions involved, which are often truncated on intervals. After providing an overview of the existing knowledge and techniques, we add some theory about Poincaré constants on intervals, with improvements for symmetric intervals. Then we exploit the spectral interpretation for computing the exact value of Poincaré constants of any admissible distribution on a given interval. We give semi-analytical results for some frequent distributions (truncated exponential, triangular, truncated normal), and present a numerical method in the general case. Finally, an application is made to a hydrological problem, showing the benefits of the new results in Poincaré inequalities to sensitivity analysis.

Submitted 12 December, 2016; originally announced December 2016.

arXiv:1512.07060 [pdf, other]

Stochastic simulators based optimization by Gaussian process metamodels -- Application to maintenance investments planning issues

Authors: Thomas Browne, Bertrand Iooss, Loïc Le Gratiet, Jérôme Lonchampt, Emmanuel Remy

Abstract: This paper deals with the optimization of industrial asset management strategies, whose profitability is characterized by the Net Present Value (NPV) indicator which is assessed by a Monte Carlo simulator. The developed method consists in building a metamodel of this stochastic simulator, allowing to get, for a given model input, the NPV probability distribution without running the simulator. The present work is concentrated on the emulation of the quantile function of the stochastic simulator by interpolating well chosen basis functions and metamodeling their coefficients (using the Gaussian process metamodel). This quantile function metamodel is then used to treat a problem of strategy maintenance optimization (four systems installed on different plants), in order to optimize an NPV quantile. Using the Gaussian process framework, an adaptive design method (called QFEI) is defined by extending in our case the well known EGO algorithm. This allows to obtain an "optimal" solution using a small number of simulator runs.

Submitted 3 May, 2016; v1 submitted 22 December, 2015; originally announced December 2015.

arXiv:1509.03880 [pdf, other]

Stochastic simulators based optimization by Gaussian process metamodels - Application to maintenance investments planning issues

Authors: Thomas Browne, Bertrand Iooss, Loïc Le Gratiet, Jérome Lonchampt

Abstract: This paper deals with the construction of a metamodel (i.e. a simplified mathematical model) for a stochastic computer code (also called stochastic numerical model or stochastic simulator), where stochastic means that the code maps the realization of a random variable. The goal is to get, for a given model input, the main information about the output probability distribution by using this metamodel and without running the computer code. In practical applications, such a metamodel enables one to have estimations of every possible random variable properties, such as the expectation, the probability of exceeding a threshold or any quantile. The present work is concentrated on the emulation of the quantile function of the stochastic simulator by interpolating well chosen basis functions and metamodeling their coefficients (using the Gaussian process metamodel). This quantile function metamodel is then used to treat a simple optimization strategy maintenance problem using a stochastic code, in order to optimize the quantile of an economic indicator. Using the Gaussian process framework, an adaptive design method (called QFEI) is defined by extending in our case the well known EGO algorithm. This allows to obtain an "optimal" solution using a small number of simulator runs.

Submitted 13 September, 2015; originally announced September 2015.

Comments: ENBIS 2015, Sep 2015, Prague, Czech Republic. 2015

arXiv:1501.05242 [pdf, other]

doi 10.1007/978-3-319-11259-6_64-1

Open TURNS: An industrial software for uncertainty quantification in simulation

Authors: Michaël Baudin, Anne Dutfoy, Bertrand Iooss, Anne-Laure Popelin

Abstract: The needs to assess robust performances for complex systems and to answer tighter regulatory processes (security, safety, environmental control, and health impacts, etc.) have led to the emergence of a new industrial simulation challenge: to take uncertainties into account when dealing with complex numerical simulation frameworks. Therefore, a generic methodology has emerged from the joint effort of several industrial companies and academic institutions. EDF R&D, Airbus Group and Phimeca Engineering started a collaboration at the beginning of 2005, joined by IMACS in 2014, for the development of an Open Source software platform dedicated to uncertainty propagation by probabilistic methods, named OpenTURNS for Open source Treatment of Uncertainty, Risk 'N Statistics. OpenTURNS addresses the specific industrial challenges attached to uncertainties, which are transparency, genericity, modularity and multi-accessibility. This paper focuses on OpenTURNS and presents its main features: OpenTURNS is an open source software under the LGPL license, that presents itself as a C++ library and a Python TUI, and which works under Linux and Windows environment. All the methodological tools are described in the different sections of this paper: uncertainty quantification, uncertainty propagation, sensitivity analysis and metamodeling. A section also explains the generic wrappers way to link OpenTURNS to any external code. The paper illustrates as much as possible the methodological tools on an educational example that simulates the height of a river and compares it to the height of a dyke that protects industrial facilities. At last, it gives an overview of the main developments planned for the next few years.

Submitted 5 June, 2015; v1 submitted 21 January, 2015; originally announced January 2015.

arXiv:1412.2619 [pdf, other]

doi 10.1007/978-3-319-11259-6_36-1

Derivative based global sensitivity measures

Authors: Serge Kucherenko, Bertrand Iooss

Abstract: The method of derivative based global sensitivity measures (DGSM) has recently become popular among practitioners. It has a strong link with the Morris screening method and Sobol' sensitivity indices and has several advantages over them. DGSM are very easy to implement and evaluate numerically. The computational time required for numerical evaluation of DGSM is generally much lower than that for estimation of Sobol' sensitivity indices. This paper presents a survey of recent advances in DGSM concerning lower and upper bounds on the values of Sobol' total sensitivity indices $S_{i}^{tot}$. Using these bounds it is possible in most cases to get a good practical estimation of the values of $S_{i}^{tot}$. Several examples are used to illustrate an application of DGSM.

Submitted 22 July, 2015; v1 submitted 8 December, 2014; originally announced December 2014.

arXiv:1405.6677 [pdf, other]

Bregman superquantiles. Estimation methods and applications

Authors: Tatiana Labopin-Richard, Fabrice Gamboa, Aurélien Garivier, Bertrand Iooss

Abstract: In this work, we extend some quantities introduced in "Optimization of conditional value-at-risk" of R.T. Rockafellar and S. Uryasev to the case where the proximity between real numbers is measured by using a Bregman divergence. This leads to the definition of the Bregman superquantile. Axioms of a coherent measure of risk discussed in "Coherent approaches to risk in optimization under uncertainty" of R.T. Rockafellar are studied in the case of Bregman superquantile. Furthermore, we deal with asymptotic properties of a Monte Carlo estimator of the Bregman superquantile.

Submitted 6 January, 2016; v1 submitted 26 May, 2014; originally announced May 2014.

arXiv:1404.2405 [pdf, other]

A review on global sensitivity analysis methods

Authors: Bertrand Iooss, Paul Lemaître

Abstract: This chapter makes a review, in a complete methodological framework, of various global sensitivity analysis methods of model output. Numerous statistical and probabilistic tools (regression, smoothing, tests, statistical learning, Monte Carlo, ...) aim at determining the model input variables which mostly contribute to an interest quantity depending on model output. This quantity can be for instance the variance of an output variable. Three kinds of methods are distinguished: the screening (coarse sorting of the most influential inputs among a large number), the measures of importance (quantitative sensitivity indices) and the deep exploration of the model behaviour (measuring the effects of inputs on their all variation range). A progressive application methodology is illustrated on a scholar application. A synthesis is given to place every method according to several axes, mainly the cost in number of model evaluations, the model complexity and the nature of brought information.

Submitted 9 April, 2014; originally announced April 2014.

arXiv:1307.6835 [pdf, other]

doi 10.1057/jos.2013.16

Numerical studies of space filling designs: optimization of Latin Hypercube Samples and subprojection properties

Authors: Guillaume Damblin, Mathieu Couplet, Bertrand Iooss

Abstract: Quantitative assessment of the uncertainties tainting the results of computer simulations is nowadays a major topic of interest in both industrial and scientific communities. One of the key issues in such studies is to get information about the output when the numerical simulations are expensive to run. This paper considers the problem of exploring the whole space of variations of the computer model input variables in the context of a large dimensional exploration space. Various properties of space filling designs are justified: interpoint-distance, discrepancy, minimum spanning tree criteria. A specific class of design, the optimized Latin Hypercube Sample, is considered. Several optimization algorithms, coming from the literature, are studied in terms of convergence speed, robustness to subprojection and space filling properties of the resulting design. Some recommendations for building such designs are given. Finally, another contribution of this paper is the deep analysis of the space filling properties of the design 2D-subprojections.

Submitted 25 July, 2013; originally announced July 2013.

arXiv:1307.2223 [pdf, ps, other]

A Bayesian approach for global sensitivity analysis of (multi-fidelity) computer codes

Authors: Loic Le Gratiet, Claire Cannamela, Bertrand Iooss

Abstract: Complex computer codes are widely used in science and engineering to model physical phenomena. Furthermore, it is common that they have a large number of input parameters. Global sensitivity analysis aims to identify those which have the most important impact on the output. Sobol indices are a popular tool to perform such analysis. However, their estimations require an important number of simulations and often cannot be processed under reasonable time constraint. To handle this problem, a Gaussian process regression model is built to surrogate the computer code and the Sobol indices are estimated through it. The aim of this paper is to provide a methodology to estimate the Sobol indices through a surrogate model taking into account both the estimation errors and the surrogate model errors. In particular, it allows us to derive non-asymptotic confidence intervals for the Sobol index estimations. Furthermore, we extend the suggested strategy to the case of multi-fidelity computer codes which can be run at different levels of accuracy. For such simulators, we use an extension of Gaussian process regression models for multivariate outputs.

Submitted 8 July, 2013; originally announced July 2013.

arXiv:1210.1074 [pdf, ps, other]

Density modification based reliability sensitivity analysis

Authors: Paul Lemaître, Ekatarina Sergienko, Aurélie Arnaud, Nicolas Bousquet, Fabrice Gamboa, Bertrand Iooss

Abstract: Sensitivity analysis of a numerical model, for instance simulating physical phenomena, is useful to quantify the influence of the inputs on the model responses. This paper proposes a new sensitivity index, based upon the modification of the probability density function (pdf) of the random inputs, when the quantity of interest is a failure probability (probability that a model output exceeds a given threshold). An input is considered influential if the input pdf modification leads to a broad change in the failure probability. These sensitivity indices can be computed using the sole set of simulations that has already been used to estimate the failure probability, thus limiting the number of calls to the numerical model. In the case of a Monte Carlo sample, asymptotical properties of the indices are derived. Based on Kullback-Leibler divergence, several types of input perturbations are introduced. The relevance of this new sensitivity analysis method is analysed through three case studies.

Submitted 9 March, 2014; v1 submitted 3 October, 2012; originally announced October 2012.

arXiv:1202.0943 [pdf, ps, other]

Derivative-based global sensitivity measures: general links with Sobol' indices and numerical tests

Authors: Matieyendou Lamboni, Bertrand Iooss, Anne-Laure Popelin, Fabrice Gamboa

Abstract: The estimation of variance-based importance measures (called Sobol' indices) of the input variables of a numerical model can require a large number of model evaluations. It turns out to be unacceptable for high-dimensional models involving a large number of input variables (typically more than ten). Recently, Sobol and Kucherenko have proposed the Derivative-based Global Sensitivity Measures (DGSM), defined as the integral of the squared derivatives of the model output, showing that it can help to solve the problem of dimensionality in some cases. We provide a general inequality link between DGSM and total Sobol' indices for input variables belonging to the class of Boltzmann probability measures, thus extending the previous results of Sobol and Kucherenko for uniform and normal measures. The special case of log-concave measures is also described. This link provides a DGSM-based maximal bound for the total Sobol' indices. Numerical tests show the performance of the bound and its usefulness in practice.

Submitted 2 July, 2012; v1 submitted 5 February, 2012; originally announced February 2012.

Journal ref: Mathematics and Computers in Simulation 87 (2013) 45-54

arXiv:1010.2334 [pdf, ps, other]

Screening and metamodeling of computer experiments with functional outputs. Application to thermal-hydraulic computations

Authors: Benjamin Auder, Agnes De Crecy, Bertrand Iooss, Michel Marques

Abstract: To perform uncertainty, sensitivity or optimization analysis on scalar variables calculated by a cpu time expensive computer code, a widely accepted methodology consists in first identifying the most influential uncertain inputs (by screening techniques), and then in replacing the cpu time expensive model by a cpu inexpensive mathematical function, called a metamodel. This paper extends this methodology to the functional output case, for instance when the model output variables are curves. The screening approach is based on the analysis of variance and principal component analysis of output curves. The functional metamodeling consists in a curve classification step, a dimension reduction step, then a classical metamodeling step. An industrial nuclear reactor application (dealing with uncertainties in the pressurized thermal shock analysis) illustrates all these steps.

Submitted 3 November, 2011; v1 submitted 12 October, 2010; originally announced October 2010.

Journal ref: Reliability Engineering and System Safety 107 (2012) 122-131

arXiv:1001.1049 [pdf, ps, other]

Numerical studies of the metamodel fitting and validation processes

Authors: Bertrand Iooss, Loïc Boussouf, Vincent Feuillard, Amandine Marrel

Abstract: Complex computer codes, for instance simulating physical phenomena, are often too time expensive to be directly used to perform uncertainty, sensitivity, optimization and robustness analyses. A widely accepted method to circumvent this problem consists in replacing cpu time expensive computer models by cpu inexpensive mathematical functions, called metamodels. In this paper, we focus on the Gaussian process metamodel and two essential steps of its definition phase. First, the initial design of the computer code input variables (which allows to fit the metamodel) has to honor adequate space filling properties. We adopt a numerical approach to compare the performance of different types of space filling designs, in the class of the optimal Latin hypercube samples, in terms of the predictivity of the subsequent fitted metamodel. We conclude that such samples with minimal wrap-around discrepancy are particularly well-suited for the Gaussian process metamodel fitting. Second, the metamodel validation process consists in evaluating the metamodel predictivity with respect to the initial computer code. We propose and test an algorithm which optimizes the distance between the validation points and the metamodel learning points in order to estimate the true metamodel predictivity with a minimum number of validation points. Comparisons with classical validation algorithms and application to a nuclear safety computer code show the relevance of this new sequential validation design.

Submitted 23 September, 2010; v1 submitted 7 January, 2010; originally announced January 2010.

Journal ref: International Journal of Advances in Systems and Measurements 3 (2010) 11-21

arXiv:0911.1189 [pdf, ps, other]

Global sensitivity analysis for models with spatially dependent outputs

Authors: Amandine Marrel, Bertrand Iooss, Michel Jullien, Beatrice Laurent, Elena Volkova

Abstract: The global sensitivity analysis of a complex numerical model often calls for the estimation of variance-based importance measures, named Sobol' indices. Metamodel-based techniques have been developed in order to replace the cpu time-expensive computer code with an inexpensive mathematical function, which predicts the computer code output. The common metamodel-based sensitivity analysis methods are well-suited for computer codes with scalar outputs. However, in the environmental domain, as in many areas of application, the numerical model outputs are often spatial maps, which may also vary with time. In this paper, we introduce an innovative method to obtain a spatial map of Sobol' indices with a minimal number of numerical model computations. It is based upon the functional decomposition of the spatial output onto a wavelet basis and the metamodeling of the wavelet coefficients by the Gaussian process. An analytical example is presented to clarify the various steps of our methodology. This technique is then applied to a real hydrogeological case: for each model input variable, a spatial map of Sobol' indices is thus obtained.

Submitted 23 September, 2010; v1 submitted 6 November, 2009; originally announced November 2009.

Journal ref: Environmetrics 22 (2011) 383-397

arXiv:0909.0329 [pdf, ps, other]

Latin hypercube sampling with inequality constraints

Authors: Matthieu Petelet, Bertrand Iooss, Olivier Asserin, Alexandre Loredo

Abstract: In some studies requiring predictive and CPU-time consuming numerical models, the sampling design of the model input variables has to be chosen with caution. For this purpose, Latin hypercube sampling has a long history and has shown its robustness. In this paper we propose and discuss a new algorithm to build a Latin hypercube sample (LHS) taking into account inequality constraints between the sampled variables. This technique, called constrained Latin hypercube sampling (cLHS), consists in doing permutations on an initial LHS to honor the desired monotonic constraints. The relevance of this approach is shown on a real example concerning numerical welding simulation, where the inequality constraints are caused by the physical decrease of some material properties as a function of temperature.

Submitted 23 September, 2010; v1 submitted 2 September, 2009; originally announced September 2009.

Journal ref: AStA Advances in Statistical Analysis 94 (2010) 325-339

arXiv:0802.2426 [pdf, ps, other]

doi 10.1214/08-AOAS186

Controlled stratification for quantile estimation

Authors: Claire Cannamela, Josselin Garnier, Bertrand Iooss

Abstract: In this paper we propose and discuss variance reduction techniques for the estimation of quantiles of the output of a complex model with random input parameters. These techniques are based on the use of a reduced model, such as a metamodel or a response surface. The reduced model can be used as a control variate; or a rejection method can be implemented to sample the realizations of the input parameters in prescribed relevant strata; or the reduced model can be used to determine a good biased distribution of the input parameters for the implementation of an importance sampling strategy. The different strategies are analyzed and the asymptotic variances are computed, which shows the benefit of an adaptive controlled stratification method. This method is finally applied to a real example (computation of the peak cladding temperature during a large-break loss of coolant accident in a nuclear reactor).

Submitted 27 January, 2009; v1 submitted 18 February, 2008; originally announced February 2008.

Comments: Published at http://dx.doi.org/10.1214/08-AOAS186 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

Report number: IMS-AOAS-AOAS186

Journal ref: Annals of Applied Statistics 2008, Vol. 2, No. 4, 1554-1580

arXiv:0802.1009 [pdf, ps, other]

Global sensitivity analysis of computer models with functional inputs

Authors: Bertrand Iooss, Mathieu Ribatet

Abstract: Global sensitivity analysis is used to quantify the influence of uncertain input parameters on the response variability of a numerical model. The common quantitative methods are applicable to computer codes with scalar input variables. This paper aims to illustrate different variance-based sensitivity analysis techniques, based on the so-called Sobol indices, when some input variables are functional, such as stochastic processes or random spatial fields. In this work, we focus on large cpu time computer codes which need a preliminary meta-modeling step before performing the sensitivity analysis. We propose the use of the joint modeling approach, i.e., modeling simultaneously the mean and the dispersion of the code outputs using two interlinked Generalized Linear Models (GLM) or Generalized Additive Models (GAM). The "mean" model makes it possible to estimate the sensitivity indices of each scalar input variable, while the "dispersion" model makes it possible to derive the total sensitivity index of the functional input variables. The proposed approach is compared to some classical SA methodologies on an analytical function. Lastly, the proposed methodology is applied to a concrete industrial computer code that simulates the nuclear fuel irradiation.

Submitted 9 June, 2008; v1 submitted 7 February, 2008; originally announced February 2008.

arXiv:0802.1008 [pdf, ps, other]

Calculations of Sobol indices for the Gaussian process metamodel

Authors: Amandine Marrel, Bertrand Iooss, Beatrice Laurent, Olivier Roustant

Abstract: Global sensitivity analysis of complex numerical models can be performed by calculating variance-based importance measures of the input variables, such as the Sobol indices. However, these techniques, requiring a large number of model evaluations, are often unacceptable for time expensive computer codes. A well known and widely used solution consists in replacing the computer code by a metamodel, predicting the model responses with a negligible computation time and rendering straightforward the estimation of Sobol indices. In this paper, we discuss the Gaussian process model, which gives analytical expressions of Sobol indices. Two approaches are studied to compute the Sobol indices: the first based on the predictor of the Gaussian process model and the second based on the global stochastic process model. Comparisons between the two estimates, made on analytical examples, show the superiority of the second approach in terms of convergence and robustness. Moreover, the second approach makes it possible to integrate the modeling error of the Gaussian process model by directly giving some confidence intervals on the Sobol indices. These techniques are finally applied to a real case of hydrogeological modeling.

Submitted 7 February, 2008; originally announced February 2008.

Showing 1–40 of 40 results for author: Iooss, B