Search | arXiv e-print repository

System-Aware Neural ODE Processes for Few-Shot Bayesian Optimization

Authors: Jixiang Qing, Becky D Langdon, Robert M Lee, Behrang Shafei, Mark van der Wilk, Calvin Tsay, Ruth Misener

Abstract: We consider the problem of optimizing initial conditions and timing in dynamical systems governed by unknown ordinary differential equations (ODEs), where evaluating different initial conditions is costly and there are constraints on observation times. To identify the optimal conditions within several trials, we introduce a few-shot Bayesian Optimization (BO) framework based on the system's prior… ▽ More We consider the problem of optimizing initial conditions and timing in dynamical systems governed by unknown ordinary differential equations (ODEs), where evaluating different initial conditions is costly and there are constraints on observation times. To identify the optimal conditions within several trials, we introduce a few-shot Bayesian Optimization (BO) framework based on the system's prior information. At the core of our approach is the System-Aware Neural ODE Processes (SANODEP), an extension of Neural ODE Processes (NODEP) designed to meta-learn ODE systems from multiple trajectories using a novel context embedding block. Additionally, we propose a multi-scenario loss function specifically for optimization purposes. Our two-stage BO framework effectively incorporates search space constraints, enabling efficient optimization of both initial conditions and observation timings. We conduct extensive experiments showcasing SANODEP's potential for few-shot BO. We also explore SANODEP's adaptability to varying levels of prior information, highlighting the trade-off between prior flexibility and model fitting accuracy. △ Less

Submitted 4 June, 2024; originally announced June 2024.

arXiv:2405.10221 [pdf, other]

Scalarisation-based risk concepts for robust multi-objective optimisation

Authors: Ben Tu, Nikolas Kantas, Robert M. Lee, Behrang Shafei

Abstract: Robust optimisation is a well-established framework for optimising functions in the presence of uncertainty. The inherent goal of this problem is to identify a collection of inputs whose outputs are both desirable for the decision maker, whilst also being robust to the underlying uncertainties in the problem. In this work, we study the multi-objective extension of this problem from a computational… ▽ More Robust optimisation is a well-established framework for optimising functions in the presence of uncertainty. The inherent goal of this problem is to identify a collection of inputs whose outputs are both desirable for the decision maker, whilst also being robust to the underlying uncertainties in the problem. In this work, we study the multi-objective extension of this problem from a computational standpoint. We identify that the majority of all robust multi-objective algorithms rely on two key operations: robustification and scalarisation. Robustification refers to the strategy that is used to marginalise over the uncertainty in the problem. Whilst scalarisation refers to the procedure that is used to encode the relative importance of each objective. As these operations are not necessarily commutative, the order that they are performed in has an impact on the resulting solutions that are identified and the final decisions that are made. This work aims to give an exposition on the philosophical differences between these two operations and highlight when one should opt for one ordering over the other. As part of our analysis, we showcase how many existing risk concepts can be easily integrated into the specification and solution of a robust multi-objective optimisation problem. Besides this, we also demonstrate how one can principally define the notion of a robust Pareto front and a robust performance metric based on our robustify and scalarise methodology. To illustrate the efficacy of these new ideas, we present two insightful numerical case studies which are based on real-world data sets. △ Less

Submitted 16 May, 2024; originally announced May 2024.

Comments: The code is available at: https://github.com/benmltu/scalarize

arXiv:2405.01404 [pdf, other]

Random Pareto front surfaces

Authors: Ben Tu, Nikolas Kantas, Robert M. Lee, Behrang Shafei

Abstract: The goal of multi-objective optimisation is to identify the Pareto front surface which is the set obtained by connecting the best trade-off points. Typically this surface is computed by evaluating the objectives at different points and then interpolating between the subset of the best evaluated trade-off points. In this work, we propose to parameterise the Pareto front surface using polar coordina… ▽ More The goal of multi-objective optimisation is to identify the Pareto front surface which is the set obtained by connecting the best trade-off points. Typically this surface is computed by evaluating the objectives at different points and then interpolating between the subset of the best evaluated trade-off points. In this work, we propose to parameterise the Pareto front surface using polar coordinates. More precisely, we show that any Pareto front surface can be equivalently represented using a scalar-valued length function which returns the projected length along any positive radial direction. We then use this representation in order to rigorously develop the theory and applications of stochastic Pareto front surfaces. In particular, we derive many Pareto front surface statistics of interest such as the expectation, covariance and quantiles. We then discuss how these can be used in practice within a design of experiments setting, where the goal is to both infer and use the Pareto front surface distribution in order to make effective decisions. Our framework allows for clear uncertainty quantification and we also develop advanced visualisation techniques for this purpose. Finally we discuss the applicability of our ideas within multivariate extreme value theory and illustrate our methodology in a variety of numerical examples, including a case study with a real-world air pollution data set. △ Less

Submitted 21 June, 2024; v1 submitted 2 May, 2024; originally announced May 2024.

Comments: The code is available at: https://github.com/benmltu/scalarize

arXiv:2402.08406 [pdf, other]

Transition Constrained Bayesian Optimization via Markov Decision Processes

Authors: Jose Pablo Folch, Calvin Tsay, Robert M Lee, Behrang Shafei, Weronika Ormaniec, Andreas Krause, Mark van der Wilk, Ruth Misener, Mojmír Mutný

Abstract: Bayesian optimization is a methodology to optimize black-box functions. Traditionally, it focuses on the setting where you can arbitrarily query the search space. However, many real-life problems do not offer this flexibility; in particular, the search space of the next query may depend on previous ones. Example challenges arise in the physical sciences in the form of local movement constraints, r… ▽ More Bayesian optimization is a methodology to optimize black-box functions. Traditionally, it focuses on the setting where you can arbitrarily query the search space. However, many real-life problems do not offer this flexibility; in particular, the search space of the next query may depend on previous ones. Example challenges arise in the physical sciences in the form of local movement constraints, required monotonicity in certain variables, and transitions influencing the accuracy of measurements. Altogether, such transition constraints necessitate a form of planning. This work extends classical Bayesian optimization via the framework of Markov Decision Processes. We iteratively solve a tractable linearization of our utility function using reinforcement learning to obtain a policy that plans ahead for the entire horizon. This is a parallel to the optimization of an acquisition function in policy space. The resulting policy is potentially history-dependent and non-Markovian. We showcase applications in chemical reactor optimization, informative path planning, machine calibration, and other synthetic examples. △ Less

Submitted 29 May, 2024; v1 submitted 13 February, 2024; originally announced February 2024.

Comments: 10 pages main, 32 pages total, 16 figures, 2 tables, preprint

arXiv:2312.00622 [pdf, other]

Practical Path-based Bayesian Optimization

Authors: Jose Pablo Folch, James Odgers, Shiqiang Zhang, Robert M Lee, Behrang Shafei, David Walz, Calvin Tsay, Mark van der Wilk, Ruth Misener

Abstract: There has been a surge in interest in data-driven experimental design with applications to chemical engineering and drug manufacturing. Bayesian optimization (BO) has proven to be adaptable to such cases, since we can model the reactions of interest as expensive black-box functions. Sometimes, the cost of this black-box functions can be separated into two parts: (a) the cost of the experiment itse… ▽ More There has been a surge in interest in data-driven experimental design with applications to chemical engineering and drug manufacturing. Bayesian optimization (BO) has proven to be adaptable to such cases, since we can model the reactions of interest as expensive black-box functions. Sometimes, the cost of this black-box functions can be separated into two parts: (a) the cost of the experiment itself, and (b) the cost of changing the input parameters. In this short paper, we extend the SnAKe algorithm to deal with both types of costs simultaneously. We further propose extensions to the case of a maximum allowable input change, as well as to the multi-objective setting. △ Less

Submitted 1 December, 2023; originally announced December 2023.

Comments: 6 main pages, 12 with references and appendix. 4 figures, 2 tables. To appear in NeurIPS 2023 Workshop on Adaptive Experimental Design and Active Learning in the Real World

Journal ref: NeurIPS 2023 Workshop on Adaptive Experimental Design and Active Learning in the Real World

arXiv:2305.11774 [pdf, other]

Multi-objective optimisation via the R2 utilities

Authors: Ben Tu, Nikolas Kantas, Robert M. Lee, Behrang Shafei

Abstract: The goal of multi-objective optimisation is to identify a collection of points which describe the best possible trade-offs between the multiple objectives. In order to solve this vector-valued optimisation problem, practitioners often appeal to the use of scalarisation functions in order to transform the multi-objective problem into a collection of single-objective problems. This set of scalarised… ▽ More The goal of multi-objective optimisation is to identify a collection of points which describe the best possible trade-offs between the multiple objectives. In order to solve this vector-valued optimisation problem, practitioners often appeal to the use of scalarisation functions in order to transform the multi-objective problem into a collection of single-objective problems. This set of scalarised problems can then be solved using traditional single-objective optimisation techniques. In this work, we formalise this convention into a general mathematical framework. We show how this strategy effectively recasts the original multi-objective optimisation problem into a single-objective optimisation problem defined over sets. An appropriate class of objective functions for this new problem are the R2 utilities, which are utility functions that are defined as a weighted integral over the scalarised optimisation problems. As part of our work, we show that these utilities are monotone and submodular set functions which can be optimised effectively using greedy optimisation algorithms. We then analyse the performance of these greedy algorithms both theoretically and empirically. Our analysis largely focusses on Bayesian optimisation, which is a popular probabilistic framework for black-box optimisation. △ Less

Submitted 1 May, 2024; v1 submitted 19 May, 2023; originally announced May 2023.

Comments: The code is available at: https://github.com/benmltu/scalarize

arXiv:2211.06149 [pdf, other]

Combining Multi-Fidelity Modelling and Asynchronous Batch Bayesian Optimization

Authors: Jose Pablo Folch, Robert M Lee, Behrang Shafei, David Walz, Calvin Tsay, Mark van der Wilk, Ruth Misener

Abstract: Bayesian Optimization is a useful tool for experiment design. Unfortunately, the classical, sequential setting of Bayesian Optimization does not translate well into laboratory experiments, for instance battery design, where measurements may come from different sources and their evaluations may require significant waiting times. Multi-fidelity Bayesian Optimization addresses the setting with measur… ▽ More Bayesian Optimization is a useful tool for experiment design. Unfortunately, the classical, sequential setting of Bayesian Optimization does not translate well into laboratory experiments, for instance battery design, where measurements may come from different sources and their evaluations may require significant waiting times. Multi-fidelity Bayesian Optimization addresses the setting with measurements from different sources. Asynchronous batch Bayesian Optimization provides a framework to select new experiments before the results of the prior experiments are revealed. This paper proposes an algorithm combining multi-fidelity and asynchronous batch methods. We empirically study the algorithm behavior, and show it can outperform single-fidelity batch methods and multi-fidelity sequential methods. As an application, we consider designing electrode materials for optimal performance in pouch cells using experiments with coin cells to approximate battery performance. △ Less

Submitted 23 February, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

Comments: 19 pages in main paper / 28 with references and appendix, 7 figures, 2 tables, accepted into Computers and Chemical Engineering

arXiv:2207.00879 [pdf, other]

Tree ensemble kernels for Bayesian optimization with known constraints over mixed-feature spaces

Authors: Alexander Thebelt, Calvin Tsay, Robert M. Lee, Nathan Sudermann-Merx, David Walz, Behrang Shafei, Ruth Misener

Abstract: Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search, as they achieve good predictive performance with little or no manual tuning, naturally handle discrete feature spaces, and are relatively insensitive to outliers in the training data. Two well-known challenges in using tree ensembles for black-box optimization are (i) effecti… ▽ More Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search, as they achieve good predictive performance with little or no manual tuning, naturally handle discrete feature spaces, and are relatively insensitive to outliers in the training data. Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function. To address both points simultaneously, we propose using the kernel interpretation of tree ensembles as a Gaussian Process prior to obtain model variance estimates, and we develop a compatible optimization formulation for the acquisition function. The latter further allows us to seamlessly integrate known constraints to improve sampling efficiency by considering domain-knowledge in engineering settings and modeling search space symmetries, e.g., hierarchical relationships in neural architecture search. Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints. △ Less

Submitted 30 December, 2022; v1 submitted 2 July, 2022; originally announced July 2022.

Comments: 27 pages, 9 figures, 4 tables

arXiv:2202.00060 [pdf, other]

SnAKe: Bayesian Optimization with Pathwise Exploration

Authors: Jose Pablo Folch, Shiqiang Zhang, Robert M Lee, Behrang Shafei, David Walz, Calvin Tsay, Mark van der Wilk, Ruth Misener

Abstract: Bayesian Optimization is a very effective tool for optimizing expensive black-box functions. Inspired by applications develo** and characterizing reaction chemistry using droplet microfluidic reactors, we consider a novel setting where the expense of evaluating the function can increase significantly when making large input changes between iterations. We further assume we are working asynchronou… ▽ More Bayesian Optimization is a very effective tool for optimizing expensive black-box functions. Inspired by applications develo** and characterizing reaction chemistry using droplet microfluidic reactors, we consider a novel setting where the expense of evaluating the function can increase significantly when making large input changes between iterations. We further assume we are working asynchronously, meaning we have to select new queries before evaluating previous experiments. This paper investigates the problem and introduces 'Sequential Bayesian Optimization via Adaptive Connecting Samples' (SnAKe), which provides a solution by considering large batches of queries and preemptively building optimization paths that minimize input costs. We investigate some convergence properties and empirically show that the algorithm is able to achieve regret similar to classical Bayesian Optimization algorithms in both synchronous and asynchronous settings, while reducing input costs significantly. We show the method is robust to the choice of its single hyper-parameter and provide a parameter-free alternative. △ Less

Submitted 11 January, 2023; v1 submitted 31 January, 2022; originally announced February 2022.

Comments: 10 main pages, 39 with appendix, 30 figures, 10 tables. Final camera-ready version for NeurIPS, with supplementary material included

arXiv:2111.03140 [pdf, other]

doi 10.1016/j.apenergy.2021.118061

Multi-Objective Constrained Optimization for Energy Applications via Tree Ensembles

Authors: Alexander Thebelt, Calvin Tsay, Robert M. Lee, Nathan Sudermann-Merx, David Walz, Tom Tranter, Ruth Misener

Abstract: Energy systems optimization problems are complex due to strongly non-linear system behavior and multiple competing objectives, e.g. economic gain vs. environmental impact. Moreover, a large number of input variables and different variable types, e.g. continuous and categorical, are challenges commonly present in real-world applications. In some cases, proposed optimal solutions need to obey explic… ▽ More Energy systems optimization problems are complex due to strongly non-linear system behavior and multiple competing objectives, e.g. economic gain vs. environmental impact. Moreover, a large number of input variables and different variable types, e.g. continuous and categorical, are challenges commonly present in real-world applications. In some cases, proposed optimal solutions need to obey explicit input constraints related to physical properties or safety-critical operating conditions. This paper proposes a novel data-driven strategy using tree ensembles for constrained multi-objective optimization of black-box problems with heterogeneous variable spaces for which underlying system dynamics are either too complex to model or unknown. In an extensive case study comprised of synthetic benchmarks and relevant energy applications we demonstrate the competitive performance and sampling efficiency of the proposed algorithm compared to other state-of-the-art tools, making it a useful all-in-one solution for real-world applications with limited evaluation budgets. △ Less

Submitted 4 November, 2021; originally announced November 2021.

Comments: 36 pages, 8 figures, 5 tables

arXiv:2003.04774 [pdf, other]

doi 10.1016/j.compchemeng.2021.107343

ENTMOOT: A Framework for Optimization over Ensemble Tree Models

Authors: Alexander Thebelt, Jan Kronqvist, Miten Mistry, Robert M. Lee, Nathan Sudermann-Merx, Ruth Misener

Abstract: Gradient boosted trees and other regression tree models perform well in a wide range of real-world, industrial applications. These tree models (i) offer insight into important prediction features, (ii) effectively manage sparse data, and (iii) have excellent prediction capabilities. Despite their advantages, they are generally unpopular for decision-making tasks and black-box optimization, which i… ▽ More Gradient boosted trees and other regression tree models perform well in a wide range of real-world, industrial applications. These tree models (i) offer insight into important prediction features, (ii) effectively manage sparse data, and (iii) have excellent prediction capabilities. Despite their advantages, they are generally unpopular for decision-making tasks and black-box optimization, which is due to their difficult-to optimize structure and the lack of a reliable uncertainty measure. ENTMOOT is our new framework for integrating (already trained) tree models into larger optimization problems. The contributions of ENTMOOT include: (i) explicitly introducing a reliable uncertainty measure that is compatible with tree models, (ii) solving the larger optimization problems that incorporate these uncertainty aware tree models, (iii) proving that the solutions are globally optimal, i.e. no better solution exists. In particular, we show how the ENTMOOT approach allows a simple integration of tree models into decision-making and black-box optimization, where it proves as a strong competitor to commonly-used frameworks. △ Less

Submitted 18 May, 2021; v1 submitted 10 March, 2020; originally announced March 2020.

Comments: 33 pages, 10 figures, 2 tables

arXiv:1803.00952 [pdf, other]

Mixed-Integer Convex Nonlinear Optimization with Gradient-Boosted Trees Embedded

Authors: Miten Mistry, Dimitrios Letsios, Gerhard Krennrich, Robert M. Lee, Ruth Misener

Abstract: Decision trees usefully represent sparse, high dimensional and noisy data. Having learned a function from this data, we may want to thereafter integrate the function into a larger decision-making problem, e.g., for picking the best chemical process catalyst. We study a large-scale, industrially-relevant mixed-integer nonlinear nonconvex optimization problem involving both gradient-boosted trees an… ▽ More Decision trees usefully represent sparse, high dimensional and noisy data. Having learned a function from this data, we may want to thereafter integrate the function into a larger decision-making problem, e.g., for picking the best chemical process catalyst. We study a large-scale, industrially-relevant mixed-integer nonlinear nonconvex optimization problem involving both gradient-boosted trees and penalty functions mitigating risk. This mixed-integer optimization problem with convex penalty terms broadly applies to optimizing pre-trained regression tree models. Decision makers may wish to optimize discrete models to repurpose legacy predictive models, or they may wish to optimize a discrete model that particularly well-represents a data set. We develop several heuristic methods to find feasible solutions, and an exact, branch-and-bound algorithm leveraging structural properties of the gradient-boosted trees and penalty functions. We computationally test our methods on concrete mixture design instance and a chemical catalysis industrial instance. △ Less

Submitted 25 September, 2019; v1 submitted 2 March, 2018; originally announced March 2018.

Showing 1–12 of 12 results for author: Lee, R M