-
Combining additivity and active subspaces for high-dimensional Gaussian process modeling
Authors:
Mickael Binois,
Victor Picheny
Abstract:
Gaussian processes are a widely embraced technique for regression and classification due to their good prediction accuracy, analytical tractability and built-in capabilities for uncertainty quantification. However, they suffer from the curse of dimensionality whenever the number of variables increases. This challenge is generally addressed by assuming additional structure in theproblem, the prefer…
▽ More
Gaussian processes are a widely embraced technique for regression and classification due to their good prediction accuracy, analytical tractability and built-in capabilities for uncertainty quantification. However, they suffer from the curse of dimensionality whenever the number of variables increases. This challenge is generally addressed by assuming additional structure in theproblem, the preferred options being either additivity or low intrinsic dimensionality. Our contribution for high-dimensional Gaussian process modeling is to combine them with a multi-fidelity strategy, showcasing the advantages through experiments on synthetic functions and datasets.
△ Less
Submitted 6 February, 2024;
originally announced February 2024.
-
Shared active subspace for multivariate vector-valued functions
Authors:
Khadija Musayeva,
Mickael Binois
Abstract:
This paper proposes several approaches as baselines to compute a shared active subspace for multivariate vector-valued functions. The goal is to minimize the deviation between the function evaluations on the original space and those on the reconstructed one. This is done either by manipulating the gradients or the symmetric positive (semi-)definite (SPD) matrices computed from the gradients of eac…
▽ More
This paper proposes several approaches as baselines to compute a shared active subspace for multivariate vector-valued functions. The goal is to minimize the deviation between the function evaluations on the original space and those on the reconstructed one. This is done either by manipulating the gradients or the symmetric positive (semi-)definite (SPD) matrices computed from the gradients of each component function so as to get a single structure common to all component functions. These approaches can be applied to any data irrespective of the underlying distribution unlike the existing vector-valued approach that is constrained to a normal distribution. We test the effectiveness of these methods on five optimization problems. The experiments show that, in general, the SPD-level methods are superior to the gradient-level ones, and are close to the vector-valued approach in the case of a normal distribution. Interestingly, in most cases it suffices to take the sum of the SPD matrices to identify the best shared active subspace.
△ Less
Submitted 5 January, 2024;
originally announced January 2024.
-
Trajectory-oriented optimization of stochastic epidemiological models
Authors:
Arindam Fadikar,
Mickael Binois,
Nicholson Collier,
Abby Stevens,
Kok Ben Toh,
Jonathan Ozik
Abstract:
Epidemiological models must be calibrated to ground truth for downstream tasks such as producing forward projections or running what-if scenarios. The meaning of calibration changes in case of a stochastic model since output from such a model is generally described via an ensemble or a distribution. Each member of the ensemble is usually mapped to a random number seed (explicitly or implicitly). W…
▽ More
Epidemiological models must be calibrated to ground truth for downstream tasks such as producing forward projections or running what-if scenarios. The meaning of calibration changes in case of a stochastic model since output from such a model is generally described via an ensemble or a distribution. Each member of the ensemble is usually mapped to a random number seed (explicitly or implicitly). With the goal of finding not only the input parameter settings but also the random seeds that are consistent with the ground truth, we propose a class of Gaussian process (GP) surrogates along with an optimization strategy based on Thompson sampling. This Trajectory Oriented Optimization (TOO) approach produces actual trajectories close to the empirical observations instead of a set of parameter settings where only the mean simulation behavior matches with the ground truth.
△ Less
Submitted 13 September, 2023; v1 submitted 6 May, 2023;
originally announced May 2023.
-
A portfolio approach to massively parallel Bayesian optimization
Authors:
Mickael Binois,
Nicholson Collier,
Jonathan Ozik
Abstract:
One way to reduce the time of conducting optimization studies is to evaluate designs in parallel rather than just one-at-a-time. For expensive-to-evaluate black-boxes, batch versions of Bayesian optimization have been proposed. They work by building a surrogate model of the black-box to simultaneously select multiple designs via an infill criterion. Still, despite the increased availability of com…
▽ More
One way to reduce the time of conducting optimization studies is to evaluate designs in parallel rather than just one-at-a-time. For expensive-to-evaluate black-boxes, batch versions of Bayesian optimization have been proposed. They work by building a surrogate model of the black-box to simultaneously select multiple designs via an infill criterion. Still, despite the increased availability of computing resources that enable large-scale parallelism, the strategies that work for selecting a few tens of parallel designs for evaluations become limiting due to the complexity of selecting more designs. It is even more crucial when the black-box is noisy, necessitating more evaluations as well as repeating experiments. Here we propose a scalable strategy that can keep up with massive batching natively, focused on the exploration/exploitation trade-off and a portfolio allocation. We compare the approach with related methods on noisy functions, for mono and multi-objective optimization tasks. These experiments show orders of magnitude speed improvements over existing methods with similar or better performance.
△ Less
Submitted 3 April, 2023; v1 submitted 18 October, 2021;
originally announced October 2021.
-
Sensitivity Prewar** for Local Surrogate Modeling
Authors:
Nathan Wycoff,
Mickaël Binois,
Robert B. Gramacy
Abstract:
In the continual effort to improve product quality and decrease operations costs, computational modeling is increasingly being deployed to determine feasibility of product designs or configurations. Surrogate modeling of these computer experiments via local models, which induce sparsity by only considering short range interactions, can tackle huge analyses of complicated input-output relationships…
▽ More
In the continual effort to improve product quality and decrease operations costs, computational modeling is increasingly being deployed to determine feasibility of product designs or configurations. Surrogate modeling of these computer experiments via local models, which induce sparsity by only considering short range interactions, can tackle huge analyses of complicated input-output relationships. However, narrowing focus to local scale means that global trends must be re-learned over and over again. In this article, we propose a framework for incorporating information from a global sensitivity analysis into the surrogate model as an input rotation and rescaling preprocessing step. We discuss the relationship between several sensitivity analysis methods based on kernel regression before describing how they give rise to a transformation of the input variables. Specifically, we perform an input war** such that the "warped simulator" is equally sensitive to all input directions, freeing local models to focus on local dynamics. Numerical experiments on observational data and benchmark test functions, including a high-dimensional computer simulator from the automotive industry, provide empirical validation.
△ Less
Submitted 8 November, 2021; v1 submitted 15 January, 2021;
originally announced January 2021.
-
Sequential Learning of Active Subspaces
Authors:
Nathan Wycoff,
Mickael Binois,
Stefan M. Wild
Abstract:
In recent years, active subspace methods (ASMs) have become a popular means of performing subspace sensitivity analysis on black-box functions. Naively applied, however, ASMs require gradient evaluations of the target function. In the event of noisy, expensive, or stochastic simulators, evaluating gradients via finite differencing may be infeasible. In such cases, often a surrogate model is employ…
▽ More
In recent years, active subspace methods (ASMs) have become a popular means of performing subspace sensitivity analysis on black-box functions. Naively applied, however, ASMs require gradient evaluations of the target function. In the event of noisy, expensive, or stochastic simulators, evaluating gradients via finite differencing may be infeasible. In such cases, often a surrogate model is employed, on which finite differencing is performed. When the surrogate model is a Gaussian process, we show that the ASM estimator is available in closed form, rendering the finite-difference approximation unnecessary. We use our closed-form solution to develop acquisition functions focused on sequential learning tailored to sensitivity analysis on top of ASMs. We also show that the traditional ASM estimator may be viewed as a method of moments estimator for a certain class of Gaussian processes. We demonstrate how uncertainty on Gaussian process hyperparameters may be propagated to uncertainty on the sensitivity analysis, allowing model-based confidence intervals on the active subspace. Our methodological developments are illustrated on several examples.
△ Less
Submitted 20 September, 2020; v1 submitted 26 July, 2019;
originally announced July 2019.
-
The Kalai-Smorodinski solution for many-objective Bayesian optimization
Authors:
Mickaël Binois,
Victor Picheny,
Patrick Taillandier,
Abderrahmane Habbal
Abstract:
An ongoing aim of research in multiobjective Bayesian optimization is to extend its applicability to a large number of objectives. While co** with a limited budget of evaluations, recovering the set of optimal compromise solutions generally requires numerous observations and is less interpretable since this set tends to grow larger with the number of objectives. We thus propose to focus on a sp…
▽ More
An ongoing aim of research in multiobjective Bayesian optimization is to extend its applicability to a large number of objectives. While co** with a limited budget of evaluations, recovering the set of optimal compromise solutions generally requires numerous observations and is less interpretable since this set tends to grow larger with the number of objectives. We thus propose to focus on a specific solution originating from game theory, the Kalai-Smorodinsky solution, which possesses attractive properties. In particular, it ensures equal marginal gains over all objectives. We further make it insensitive to a monotonic transformation of the objectives by considering the objectives in the copula space. A novel tailored algorithm is proposed to search for the solution, in the form of a Bayesian optimization algorithm: sequential sampling decisions are made based on acquisition functions that derive from an instrumental Gaussian process prior. Our approach is tested on four problems with respectively four, six, eight, and nine objectives. The method is available in the Rpackage GPGame available on CRAN at https://cran.r-project.org/package=GPGame.
△ Less
Submitted 3 October, 2019; v1 submitted 18 February, 2019;
originally announced February 2019.
-
On-site surrogates for large-scale calibration
Authors:
Jiangeng Huang,
Robert B. Gramacy,
Mickael Binois,
Mirko Libraschi
Abstract:
Motivated by a computer model calibration problem from the oil and gas industry, involving the design of a honeycomb seal, we develop a new Bayesian methodology to cope with limitations in the canonical apparatus stemming from several factors. We propose a new strategy of on-site design and surrogate modeling for a computer simulator acting on a high-dimensional input space that, although relative…
▽ More
Motivated by a computer model calibration problem from the oil and gas industry, involving the design of a honeycomb seal, we develop a new Bayesian methodology to cope with limitations in the canonical apparatus stemming from several factors. We propose a new strategy of on-site design and surrogate modeling for a computer simulator acting on a high-dimensional input space that, although relatively speedy, is prone to numerical instabilities, missing data, and nonstationary dynamics. Our aim is to strike a balance between data-faithful modeling and computational tractability in a calibration framework--tailoring the computer model to a limited field experiment. Situating our on-site surrogates within the canonical calibration apparatus requires updates to that framework. We describe a novel yet intuitive Bayesian setup that carefully decomposes otherwise prohibitively large matrices by exploiting the sparse blockwise structure. Empirical illustrations demonstrate that this approach performs well on toy data and our motivating honeycomb example.
△ Less
Submitted 18 July, 2019; v1 submitted 3 October, 2018;
originally announced October 2018.
-
Evaluating Gaussian Process Metamodels and Sequential Designs for Noisy Level Set Estimation
Authors:
Xiong Lyu,
Mickael Binois,
Michael Ludkovski
Abstract:
We consider the problem of learning the level set for which a noisy black-box function exceeds a given threshold. To efficiently reconstruct the level set, we investigate Gaussian process (GP) metamodels. Our focus is on strongly stochastic samplers, in particular with heavy-tailed simulation noise and low signal-to-noise ratio. To guard against noise misspecification, we assess the performance of…
▽ More
We consider the problem of learning the level set for which a noisy black-box function exceeds a given threshold. To efficiently reconstruct the level set, we investigate Gaussian process (GP) metamodels. Our focus is on strongly stochastic samplers, in particular with heavy-tailed simulation noise and low signal-to-noise ratio. To guard against noise misspecification, we assess the performance of three variants: (i) GPs with Student-$t$ observations; (ii) Student-$t$ processes (TPs); and (iii) classification GPs modeling the sign of the response. In conjunction with these metamodels, we analyze several acquisition functions for guiding the sequential experimental designs, extending existing stepwise uncertainty reduction criteria to the stochastic contour-finding context. This also motivates our development of (approximate) updating formulas to efficiently compute such acquisition functions. Our schemes are benchmarked by using a variety of synthetic experiments in 1--6 dimensions. We also consider an application of level set estimation for determining the optimal exercise policy of Bermudan options in finance.
△ Less
Submitted 1 March, 2020; v1 submitted 17 July, 2018;
originally announced July 2018.
-
Parameter and Uncertainty Estimation for Dynamical Systems Using Surrogate Stochastic Processes
Authors:
M. Chung,
M. Binois,
R. B. Gramacy,
D. J. Moquin,
A. P. Smith,
A. M. Smith
Abstract:
Inference on unknown quantities in dynamical systems via observational data is essential for providing meaningful insight, furnishing accurate predictions, enabling robust control, and establishing appropriate designs for future experiments. Merging mathematical theory with empirical measurements in a statistically coherent way is critical and challenges abound, e.g.,: ill-posedness of the paramet…
▽ More
Inference on unknown quantities in dynamical systems via observational data is essential for providing meaningful insight, furnishing accurate predictions, enabling robust control, and establishing appropriate designs for future experiments. Merging mathematical theory with empirical measurements in a statistically coherent way is critical and challenges abound, e.g.,: ill-posedness of the parameter estimation problem, proper regularization and incorporation of prior knowledge, and computational limitations on full uncertainty qualification. To address these issues, we propose a new method for learning parameterized dynamical systems from data. In many ways, our proposal turns the canonical framework on its head. We first fit a surrogate stochastic process to observational data, enforcing prior knowledge (e.g., smoothness), and co** with challenging data features like heteroskedasticity, heavy tails and censoring. Then, samples of the stochastic process are used as "surrogate data" and point estimates are computed via ordinary point estimation methods in a modular fashion. An attractive feature of this approach is that it is fully Bayesian and simultaneously parallelizable. We demonstrate the advantages of our new approach on a predator prey simulation study and on a real world application involving within-host influenza virus infection data paired with a viral kinetic model.
△ Less
Submitted 2 February, 2018;
originally announced February 2018.
-
Replication or exploration? Sequential design for stochastic simulation experiments
Authors:
Mickael Binois,
Jiangeng Huang,
Robert B Gramacy,
Mike Ludkovski
Abstract:
We investigate the merits of replication, and provide methods for optimal design (including replicates), with the goal of obtaining globally accurate emulation of noisy computer simulation experiments. We first show that replication can be beneficial from both design and computational perspectives, in the context of Gaussian process surrogate modeling. We then develop a lookahead based sequential…
▽ More
We investigate the merits of replication, and provide methods for optimal design (including replicates), with the goal of obtaining globally accurate emulation of noisy computer simulation experiments. We first show that replication can be beneficial from both design and computational perspectives, in the context of Gaussian process surrogate modeling. We then develop a lookahead based sequential design scheme that can determine if a new run should be at an existing input location (i.e., replicate) or at a new one (explore). When paired with a newly developed heteroskedastic Gaussian process model, our dynamic design scheme facilitates learning of signal and noise relationships which can vary throughout the input space. We show that it does so efficiently, on both computational and statistical grounds. In addition to illustrative synthetic examples, we demonstrate performance on two challenging real-data simulation experiments, from inventory management and epidemiology.
△ Less
Submitted 25 January, 2019; v1 submitted 9 October, 2017;
originally announced October 2017.
-
On the choice of the low-dimensional domain for global optimization via random embeddings
Authors:
Mickaël Binois,
David Ginsbourger,
Olivier Roustant
Abstract:
The challenge of taking many variables into account in optimization problems may be overcome under the hypothesis of low effective dimensionality. Then, the search of solutions can be reduced to the random embedding of a low dimensional space into the original one, resulting in a more manageable optimization problem. Specifically, in the case of time consuming black-box functions and when the budg…
▽ More
The challenge of taking many variables into account in optimization problems may be overcome under the hypothesis of low effective dimensionality. Then, the search of solutions can be reduced to the random embedding of a low dimensional space into the original one, resulting in a more manageable optimization problem. Specifically, in the case of time consuming black-box functions and when the budget of evaluations is severely limited, global optimization with random embeddings appears as a sound alternative to random search. Yet, in the case of box constraints on the native variables, defining suitable bounds on a low dimensional domain appears to be complex. Indeed, a small search domain does not guarantee to find a solution even under restrictive hypotheses about the function, while a larger one may slow down convergence dramatically. Here we tackle the issue of low-dimensional domain selection based on a detailed study of the properties of the random embedding, giving insight on the aforementioned difficulties. In particular, we describe a minimal low-dimensional set in correspondence with the embedded search space. We additionally show that an alternative equivalent embedding procedure yields simultaneously a simpler definition of the low-dimensional minimal set and better properties in practice. Finally, the performance and robustness gains of the proposed enhancements for Bayesian optimization are illustrated on numerical examples.
△ Less
Submitted 22 October, 2018; v1 submitted 18 April, 2017;
originally announced April 2017.
-
Practical heteroskedastic Gaussian process modeling for large simulation experiments
Authors:
Mickael Binois,
Robert B. Gramacy,
Michael Ludkovski
Abstract:
We present a unified view of likelihood based Gaussian progress regression for simulation experiments exhibiting input-dependent noise. Replication plays an important role in that context, however previous methods leveraging replicates have either ignored the computational savings that come from such design, or have short-cut full likelihood-based inference to remain tractable. Starting with homos…
▽ More
We present a unified view of likelihood based Gaussian progress regression for simulation experiments exhibiting input-dependent noise. Replication plays an important role in that context, however previous methods leveraging replicates have either ignored the computational savings that come from such design, or have short-cut full likelihood-based inference to remain tractable. Starting with homoskedastic processes, we show how multiple applications of a well-known Woodbury identity facilitate inference for all parameters under the likelihood (without approximation), bypassing the typical full-data sized calculations. We then borrow a latent-variable idea from machine learning to address heteroskedasticity, adapting it to work within the same thrifty inferential framework, thereby simultaneously leveraging the computational and statistical efficiency of designs with replication. The result is an inferential scheme that can be characterized as single objective function, complete with closed form derivatives, for rapid library-based optimization. Illustrations are provided, including real-world simulation experiments from manufacturing and the management of epidemics.
△ Less
Submitted 13 November, 2017; v1 submitted 17 November, 2016;
originally announced November 2016.
-
A Bayesian optimization approach to find Nash equilibria
Authors:
Victor Picheny,
Mickael Binois,
Abderrahmane Habbal
Abstract:
Game theory finds nowadays a broad range of applications in engineering and machine learning. However, in a derivative-free, expensive black-box context, very few algorithmic solutions are available to find game equilibria. Here, we propose a novel Gaussian-process based approach for solving games in this context. We follow a classical Bayesian optimization framework, with sequential sampling deci…
▽ More
Game theory finds nowadays a broad range of applications in engineering and machine learning. However, in a derivative-free, expensive black-box context, very few algorithmic solutions are available to find game equilibria. Here, we propose a novel Gaussian-process based approach for solving games in this context. We follow a classical Bayesian optimization framework, with sequential sampling decisions based on acquisition functions. Two strategies are proposed, based either on the probability of achieving equilibrium or on the Stepwise Uncertainty Reduction paradigm. Practical and numerical aspects are discussed in order to enhance the scalability and reduce computation time. Our approach is evaluated on several synthetic game problems with varying number of players and decision space dimensions. We show that equilibria can be found reliably for a fraction of the cost (in terms of black-box evaluations) compared to classical, derivative-based algorithms. The method is available in the R package GPGame available on CRAN at https://cran.r-project.org/package=GPGame.
△ Less
Submitted 27 February, 2018; v1 submitted 8 November, 2016;
originally announced November 2016.
-
A warped kernel improving robustness in Bayesian optimization via random embeddings
Authors:
Mickaël Binois,
David Ginsbourger,
Olivier Roustant
Abstract:
This works extends the Random Embedding Bayesian Optimization approach by integrating a war** of the high dimensional subspace within the covariance kernel. The proposed war**, that relies on elementary geometric considerations, allows mitigating the drawbacks of the high extrinsic dimensionality while avoiding the algorithm to evaluate points giving redundant information. It also alleviates c…
▽ More
This works extends the Random Embedding Bayesian Optimization approach by integrating a war** of the high dimensional subspace within the covariance kernel. The proposed war**, that relies on elementary geometric considerations, allows mitigating the drawbacks of the high extrinsic dimensionality while avoiding the algorithm to evaluate points giving redundant information. It also alleviates constraints on bound selection for the embedded domain, thus improving the robustness, as illustrated with a test case with 25 variables and intrinsic dimension 6.
△ Less
Submitted 18 March, 2015; v1 submitted 13 November, 2014;
originally announced November 2014.