-
Principled Preferential Bayesian Optimization
Authors:
Wenjie Xu,
Wenbin Wang,
Yuning Jiang,
Bratislav Svetozarevic,
Colin N. Jones
Abstract:
We study the problem of preferential Bayesian optimization (BO), where we aim to optimize a black-box function with only preference feedback over a pair of candidate solutions. Inspired by the likelihood ratio idea, we construct a confidence set of the black-box function using only the preference feedback. An optimistic algorithm with an efficient computational method is then developed to solve the problem, which enjoys an information-theoretic bound on the total cumulative regret, a first-of-its-kind for preferential BO. This bound further allows us to design a scheme to report an estimated best solution, with a guaranteed convergence rate. Experimental results on sampled instances from Gaussian processes, standard test functions, and a thermal comfort optimization problem all show that our method stably achieves better or competitive performance as compared to the existing state-of-the-art heuristics, which, however, do not have theoretical guarantees on regret bounds or convergence.
Submitted 29 May, 2024; v1 submitted 7 February, 2024;
originally announced February 2024.
-
Stable Linear Subspace Identification: A Machine Learning Approach
Authors:
Loris Di Natale,
Muhammad Zakwan,
Bratislav Svetozarevic,
Philipp Heer,
Giancarlo Ferrari-Trecate,
Colin N. Jones
Abstract:
Machine Learning (ML) and linear System Identification (SI) have been historically developed independently. In this paper, we leverage well-established ML tools - especially the automatic differentiation framework - to introduce SIMBa, a family of discrete linear multi-step-ahead state-space SI methods using backpropagation. SIMBa relies on a novel Linear-Matrix-Inequality-based free parametrization of Schur matrices to ensure the stability of the identified model.
We show that SIMBa generally outperforms traditional linear state-space SI methods, sometimes significantly, although at the price of a higher computational burden. This performance gap is particularly remarkable compared to other SI methods with stability guarantees, where the gain is frequently above 25% in our investigations, hinting at SIMBa's ability to simultaneously achieve state-of-the-art fitting performance and enforce stability. Interestingly, these observations hold for a wide variety of input-output systems and on both simulated and real-world data, showcasing the flexibility of the proposed approach. We postulate that this new SI paradigm has great potential for extension to the identification of structured nonlinear models from data, and we hence open-source SIMBa on https://github.com/Cemempamoi/simba.
Submitted 26 March, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
Multi-Agent Bayesian Optimization with Coupled Black-Box and Affine Constraints
Authors:
Wenjie Xu,
Yuning Jiang,
Bratislav Svetozarevic,
Colin N. Jones
Abstract:
This paper studies the problem of distributed multi-agent Bayesian optimization with both coupled black-box constraints and known affine constraints. A primal-dual distributed algorithm is proposed that achieves regret/violation bounds similar to those in the single-agent case for the black-box objective and constraint functions. Additionally, the algorithm guarantees an $\mathcal{O}(N\sqrt{T})$ bound on the cumulative violation for the known affine constraints, where $N$ is the number of agents. Hence, it is ensured that the average of the samples satisfies the affine constraints up to the error $\mathcal{O}({N}/{\sqrt{T}})$. Furthermore, we characterize certain conditions under which our algorithm can bound a stronger metric of cumulative violation and provide best-iterate convergence without affine constraints. The method is then applied to both sampled instances from Gaussian processes and a real-world optimal power allocation problem for wireless communication; the results show that our method simultaneously provides close-to-optimal performance and maintains minor violations on average, corroborating our theoretical analysis.
Submitted 2 October, 2023;
originally announced October 2023.
-
Data-driven adaptive building thermal controller tuning with constraints: A primal-dual contextual Bayesian optimization approach
Authors:
Wenjie Xu,
Bratislav Svetozarevic,
Loris Di Natale,
Philipp Heer,
Colin N. Jones
Abstract:
We study the problem of tuning the parameters of a room temperature controller to minimize its energy consumption, subject to the constraint that the daily cumulative thermal discomfort of the occupants is below a given threshold. We formulate it as an online constrained black-box optimization problem where, on each day, we observe some relevant environmental context and adaptively select the controller parameters. In this paper, we propose to use a data-driven Primal-Dual Contextual Bayesian Optimization (PDCBO) approach to solve this problem. In a simulation case study on a single room, we apply our algorithm to tune the parameters of a Proportional Integral (PI) heating controller and the pre-heating time. Our results show that PDCBO can save up to 4.7% energy consumption compared to other state-of-the-art Bayesian optimization-based methods while keeping the daily thermal discomfort below the given tolerable threshold on average. Additionally, PDCBO can automatically track time-varying tolerable thresholds while existing methods fail to do so. We then study an alternative constrained tuning problem where we aim to minimize the thermal discomfort with a given energy budget. With this formulation, PDCBO reduces the average discomfort by up to 63% compared to state-of-the-art safe optimization methods while keeping the average daily energy consumption below the required threshold.
Submitted 1 October, 2023;
originally announced October 2023.
-
Bayesian Optimization of Expensive Nested Grey-Box Functions
Authors:
Wenjie Xu,
Yuning Jiang,
Bratislav Svetozarevic,
Colin N. Jones
Abstract:
We consider the problem of optimizing a grey-box objective function, i.e., a nested function composed of both black-box and white-box functions. A general formulation for such grey-box problems is given, which covers the existing grey-box optimization formulations as special cases. We then design an optimism-driven algorithm to solve it. Under certain regularity assumptions, our algorithm achieves a regret bound similar to that of the standard black-box Bayesian optimization algorithm, up to a constant multiplicative term depending on the Lipschitz constants of the functions considered. We further extend our method to the constrained case and discuss special cases. For the commonly used kernel functions, the regret bounds allow us to derive a convergence rate to the optimal solution. Experimental results show that our grey-box optimization method finds the global optimal solution significantly faster than the standard black-box optimization algorithm.
Submitted 2 August, 2023; v1 submitted 8 June, 2023;
originally announced June 2023.
-
Primal-Dual Contextual Bayesian Optimization for Control System Online Optimization with Time-Average Constraints
Authors:
Wenjie Xu,
Yuning Jiang,
Bratislav Svetozarevic,
Colin N. Jones
Abstract:
This paper studies the problem of online performance optimization of constrained closed-loop control systems, where both the objective and the constraints are unknown black-box functions affected by exogenous time-varying contextual disturbances. A primal-dual contextual Bayesian optimization algorithm is proposed that achieves sublinear cumulative regret with respect to the dynamic optimal solution under certain regularity conditions. Furthermore, the algorithm achieves zero time-average constraint violation, ensuring that the average value of the constraint function satisfies the desired constraint. The method is applied to both sampled instances from Gaussian processes and a continuous stirred tank reactor parameter tuning problem; simulation results show that the method simultaneously provides close-to-optimal performance and maintains constraint feasibility on average. This contrasts with current state-of-the-art methods, which either suffer from large cumulative regret or severe constraint violations for the case studies presented.
Submitted 20 September, 2023; v1 submitted 12 April, 2023;
originally announced April 2023.
-
Violation-Aware Contextual Bayesian Optimization for Controller Performance Optimization with Unmodeled Constraints
Authors:
Wenjie Xu,
Colin N. Jones,
Bratislav Svetozarevic,
Christopher R. Laughman,
Ankush Chakrabarty
Abstract:
We study the problem of performance optimization of closed-loop control systems with unmodeled dynamics. Bayesian optimization (BO) has been demonstrated to be effective for improving closed-loop performance by automatically tuning controller gains or reference setpoints in a model-free manner. However, BO methods have rarely been tested on dynamical systems with unmodeled constraints and time-varying ambient conditions. In this paper, we propose a violation-aware contextual BO algorithm (VACBO) that optimizes closed-loop performance while simultaneously learning constraint-feasible solutions under time-varying ambient conditions. Unlike classical constrained BO methods which allow unlimited constraint violations, or 'safe' BO algorithms that are conservative and try to operate with near-zero violations, we allow budgeted constraint violations to improve constraint learning and accelerate optimization. We demonstrate the effectiveness of our proposed VACBO method for energy minimization of industrial vapor compression systems under time-varying ambient temperature and humidity.
Submitted 28 January, 2023;
originally announced January 2023.
-
Towards Scalable Physically Consistent Neural Networks: an Application to Data-driven Multi-zone Thermal Building Models
Authors:
Loris Di Natale,
Bratislav Svetozarevic,
Philipp Heer,
Colin Neil Jones
Abstract:
With more and more data being collected, data-driven modeling methods have been gaining in popularity in recent years. While physically sound, classical gray-box models are often cumbersome to identify and scale, and their accuracy might be hindered by their limited expressiveness. On the other hand, classical black-box methods, typically relying on Neural Networks (NNs) nowadays, often achieve impressive performance, even at scale, by deriving statistical patterns from data. However, they remain completely oblivious to the underlying physical laws, which may lead to potentially catastrophic failures if decisions for real-world physical systems are based on them. Physically Consistent Neural Networks (PCNNs) were recently developed to address these aforementioned issues, ensuring physical consistency while still leveraging NNs to attain state-of-the-art accuracy.
In this work, we scale PCNNs to model building temperature dynamics and propose a thorough comparison with classical gray-box and black-box methods. More precisely, we design three distinct PCNN extensions, thereby exemplifying the modularity and flexibility of the architecture, and formally prove their physical consistency. In the presented case study, PCNNs are shown to achieve state-of-the-art accuracy, even outperforming classical NN-based models despite their constrained structure. Our investigations furthermore provide a clear illustration of NNs achieving seemingly good performance while remaining completely physics-agnostic, which can be misleading in practice. While this performance comes at the cost of computational complexity, PCNNs on the other hand show accuracy improvements of 17-35% compared to all other physically consistent methods, paving the way for scalable physically consistent models with state-of-the-art performance.
Submitted 4 April, 2023; v1 submitted 23 December, 2022;
originally announced December 2022.
-
Computationally Efficient Reinforcement Learning: Targeted Exploration leveraging Simple Rules
Authors:
Loris Di Natale,
Bratislav Svetozarevic,
Philipp Heer,
Colin N. Jones
Abstract:
Model-free Reinforcement Learning (RL) generally suffers from poor sample complexity, mostly due to the need to exhaustively explore the state-action space to find well-performing policies. On the other hand, we postulate that expert knowledge of the system often allows us to design simple rules we expect good policies to follow at all times. In this work, we hence propose a simple yet effective modification of continuous actor-critic frameworks to incorporate such rules and avoid regions of the state-action space that are known to be suboptimal, thereby significantly accelerating the convergence of RL agents. Concretely, we saturate the actions chosen by the agent if they do not comply with our intuition and, critically, modify the gradient update step of the policy to ensure the learning process is not affected by the saturation step. On a room temperature control case study, this modification allows agents to converge to well-performing policies up to 6-7x faster than classical agents, without computational overhead and while retaining good final performance.
Submitted 12 September, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
CONFIG: Constrained Efficient Global Optimization for Closed-Loop Control System Optimization with Unmodeled Constraints
Authors:
Wenjie Xu,
Yuning Jiang,
Bratislav Svetozarevic,
Colin N. Jones
Abstract:
In this paper, the CONFIG algorithm, a simple and provably efficient constrained global optimization algorithm, is applied to optimize the closed-loop control performance of an unknown system with unmodeled constraints. Existing Gaussian-process-based closed-loop optimization methods either can only guarantee local convergence (e.g., SafeOPT) or have no known optimality guarantee at all (e.g., constrained expected improvement), whereas the recently introduced CONFIG algorithm has been proven to enjoy a theoretical global optimality guarantee. In this study, we demonstrate the effectiveness of the CONFIG algorithm in applications. The algorithm is first applied to an artificial numerical benchmark problem to corroborate its effectiveness. It is then applied to a classical constrained steady-state optimization problem of a continuous stirred-tank reactor. Simulation results show that our CONFIG algorithm can achieve performance competitive with the popular CEI (Constrained Expected Improvement) algorithm, which has no known optimality guarantee. As such, the CONFIG algorithm offers a new tool, with both a provable global optimality guarantee and competitive empirical performance, to optimize the closed-loop control performance for a system with soft unmodeled constraints. Last, but not least, the open-source code is available as a Python package to facilitate future applications.
Submitted 18 December, 2022; v1 submitted 21 November, 2022;
originally announced November 2022.
-
Physically Consistent Neural ODEs for Learning Multi-Physics Systems
Authors:
Muhammad Zakwan,
Loris Di Natale,
Bratislav Svetozarevic,
Philipp Heer,
Colin N. Jones,
Giancarlo Ferrari Trecate
Abstract:
Despite the immense success of neural networks in modeling system dynamics from data, they often remain physics-agnostic black boxes. In the particular case of physical systems, they might consequently make physically inconsistent predictions, which makes them unreliable in practice. In this paper, we leverage the framework of Irreversible port-Hamiltonian Systems (IPHS), which can describe most multi-physics systems, and rely on Neural Ordinary Differential Equations (NODEs) to learn their parameters from data. Since IPHS models are consistent with the first and second principles of thermodynamics by design, so are the proposed Physically Consistent NODEs (PC-NODEs). Furthermore, the NODE training procedure allows us to seamlessly incorporate prior knowledge of the system properties in the learned dynamics. We demonstrate the effectiveness of the proposed method by learning the thermodynamics of a building from real-world measurements and the dynamics of a simulated gas-piston system. Thanks to the modularity and flexibility of the IPHS framework, PC-NODEs can be extended to learn physically consistent models of multi-physics distributed systems.
Submitted 11 November, 2022;
originally announced November 2022.
-
Near-optimal Deep Reinforcement Learning Policies from Data for Zone Temperature Control
Authors:
Loris Di Natale,
Bratislav Svetozarevic,
Philipp Heer,
Colin N. Jones
Abstract:
Replacing poorly performing existing controllers with smarter solutions will decrease the energy intensity of the building sector. Recently, controllers based on Deep Reinforcement Learning (DRL) have been shown to be more effective than conventional baselines. However, since the optimal solution is usually unknown, it is still unclear if DRL agents are attaining near-optimal performance in general or if there is still a large gap to bridge.
In this paper, we investigate the performance of DRL agents compared to the theoretically optimal solution. To that end, we leverage Physically Consistent Neural Networks (PCNNs) as simulation environments, for which optimal control inputs are easy to compute. Furthermore, PCNNs solely rely on data to be trained, avoiding the difficult physics-based modeling phase, while retaining physical consistency. Our results hint that DRL agents not only clearly outperform conventional rule-based controllers, they furthermore attain near-optimal performance.
Submitted 10 March, 2022;
originally announced March 2022.
-
Physically Consistent Neural Networks for building thermal modeling: theory and analysis
Authors:
Loris Di Natale,
Bratislav Svetozarevic,
Philipp Heer,
Colin N. Jones
Abstract:
Due to their high energy intensity, buildings play a major role in the current worldwide energy transition. Building models are ubiquitous since they are needed at each stage of the life of buildings, i.e. for design, retrofitting, and control operations. Classical white-box models, based on physical equations, are bound to follow the laws of physics but the specific design of their underlying structure might hinder their expressiveness and hence their accuracy. On the other hand, black-box models are better suited to capture nonlinear building dynamics and thus can often achieve better accuracy, but they require a lot of data and might not follow the laws of physics, a problem that is particularly common for neural network (NN) models. To counter this known generalization issue, physics-informed NNs have recently been introduced, where researchers introduce prior knowledge in the structure of NNs to ground them in known underlying physical laws and avoid classical NN generalization issues.
In this work, we present a novel physics-informed NN architecture, dubbed Physically Consistent NN (PCNN), which only requires past operational data and no engineering overhead, and which incorporates prior knowledge in a linear module running in parallel to a classical NN. We formally prove that such networks are physically consistent - by design and even on unseen data - with respect to different control inputs and temperatures outside and in neighboring zones. We demonstrate their performance on a case study, where the PCNN attains an accuracy up to 40% better than a classical physics-based resistance-capacitance model on 3-day-long prediction horizons. Furthermore, despite their constrained structure, PCNNs attain similar performance to classical NNs on the validation data, overfitting the training data less and retaining high expressiveness to tackle the generalization issue.
Submitted 11 July, 2022; v1 submitted 6 December, 2021;
originally announced December 2021.
-
VABO: Violation-Aware Bayesian Optimization for Closed-Loop Control Performance Optimization with Unmodeled Constraints
Authors:
Wenjie Xu,
Colin N. Jones,
Bratislav Svetozarevic,
Christopher R. Laughman,
Ankush Chakrabarty
Abstract:
We study the problem of performance optimization of closed-loop control systems with unmodeled dynamics. Bayesian optimization (BO) has been demonstrated to be effective for improving closed-loop performance by automatically tuning controller gains or reference setpoints in a model-free manner. However, BO methods have rarely been tested on dynamical systems with unmodeled constraints. In this paper, we propose a violation-aware BO algorithm (VABO) that optimizes closed-loop performance while simultaneously learning constraint-feasible solutions. Unlike classical constrained BO methods, which allow unlimited constraint violations, or safe BO algorithms, which are conservative and try to operate with near-zero violations, we allow budgeted constraint violations to improve constraint learning and accelerate optimization. We demonstrate the effectiveness of our proposed VABO method for energy minimization of industrial vapor compression systems.
Submitted 14 October, 2021;
originally announced October 2021.
-
Data-driven control of room temperature and bidirectional EV charging using deep reinforcement learning: simulations and experiments
Authors:
B. Svetozarevic,
C. Baumann,
S. Muntwiler,
L. Di Natale,
M. Zeilinger,
P. Heer
Abstract:
This work presents a fully data-driven, black-box pipeline to obtain an optimal control policy for a multi-loop building control problem based on historical building and weather data, thus without the need for complex physics-based modelling. We demonstrate the method for joint control of room temperature and bidirectional EV charging to maximize the occupant thermal comfort and energy savings while leaving enough energy in the EV battery for the next trip. We modelled the room temperature with a recurrent neural network and EV charging with a piece-wise linear function. Using these models as a simulation environment, we applied a deep reinforcement learning (DRL) algorithm to obtain an optimal control policy. The learnt policy achieves on average 17% energy savings over the heating season and 19% better comfort satisfaction than a standard RB room temperature controller. When a bidirectional EV is additionally connected and a two-tariff electricity pricing is applied, the MIMO DRL policy successfully leverages the battery and decreases the overall cost of electricity compared to two standard RB controllers, one controlling the room temperature and another controlling the bidirectional EV (dis-)charging. Finally, we demonstrate a successful transfer of the learnt DRL policy from simulation onto a real building, the DFAB HOUSE at Empa Duebendorf in Switzerland, achieving up to 30% energy savings while maintaining similar comfort levels compared to a conventional RB room temperature controller over three weeks during the heating season.
Submitted 17 June, 2021; v1 submitted 2 March, 2021;
originally announced March 2021.