
Efficient Data-Driven MPC for Demand Response of Commercial Buildings

Authors: Marie-Christine Paré, Vasken Dermardiros, Antoine Lesage-Landry

Abstract: Model predictive control (MPC) has been shown to significantly improve the energy efficiency of buildings while maintaining thermal comfort. Data-driven approaches based on neural networks have been proposed to facilitate system modelling. However, such approaches are generally nonconvex and result in computationally intractable optimization problems. In this work, we design a readily implementable energy management method for small commercial buildings. We then leverage our approach to formulate a real-time demand bidding strategy. We propose a data-driven and mixed-integer convex MPC which is solved via derivative-free optimization given a limited computational time of 5 minutes to respect operational constraints. We consider rooftop unit heating, ventilation, and air conditioning systems with discrete controls to accurately model the operation of most commercial buildings. Our approach uses an input convex recurrent neural network to model the thermal dynamics. We apply our approach in several demand response (DR) settings, including a demand bidding, a time-of-use, and a critical peak rebate program. Controller performance is evaluated on a state-of-the-art building simulation. The proposed approach improves thermal comfort while reducing energy consumption and cost through DR participation, when compared to other data-driven approaches or a set-point controller.

Submitted 15 May, 2024; v1 submitted 28 January, 2024; originally announced January 2024.
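
The receding-horizon idea behind an MPC with discrete (on/off) controls can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the first-order thermal model in `simulate` stands in for the input convex recurrent neural network, brute-force enumeration of on/off plans stands in for the derivative-free solver, and all names and parameter values (`a`, `cool`, `price`, `comfort_weight`) are hypothetical.

```python
from itertools import product


def simulate(temp, plan, outdoor, a=0.9, cool=1.5):
    """Hypothetical first-order model: T_next = a*T + (1 - a)*outdoor - cool*u."""
    temps = []
    for u in plan:
        temp = a * temp + (1.0 - a) * outdoor - cool * u
        temps.append(temp)
    return temps


def plan_cost(temp, plan, outdoor, setpoint, price, comfort_weight=1.0):
    """Energy cost of the plan plus squared deviation from the set-point."""
    temps = simulate(temp, plan, outdoor)
    energy = sum(price * u for u in plan)
    discomfort = sum((t - setpoint) ** 2 for t in temps)
    return energy + comfort_weight * discomfort


def mpc_step(temp, outdoor, setpoint, price, horizon=4):
    """Enumerate every on/off plan over the horizon and return its first action."""
    best_plan = min(
        product((0, 1), repeat=horizon),
        key=lambda plan: plan_cost(temp, plan, outdoor, setpoint, price),
    )
    return best_plan[0]


def run_closed_loop(temp, steps, outdoor=30.0, setpoint=22.0, price=0.5):
    """Apply only the first action of each plan, then re-plan from the new state."""
    history = []
    for _ in range(steps):
        u = mpc_step(temp, outdoor, setpoint, price)
        temp = simulate(temp, (u,), outdoor)[0]
        history.append((u, temp))
    return history
```

Enumeration only scales to very short horizons (2^horizon plans), which is one reason a time-limited solver such as the one described in the abstract would be needed in practice.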

arXiv:2310.07557 [pdf, other]

Quality of Service-Constrained Online Routing in High Throughput Satellites

Authors: Olivier Bélanger, Olfa Ben Yahia, Stéphane Martel, Antoine Lesage-Landry, Gunes Karabulut Kurt

Abstract: High throughput satellites (HTSs) outpace traditional satellites due to their multi-beam transmission. The rise of low Earth orbit mega constellations amplifies HTS data rate demands to terabits/second with acceptable latency. This surge in data rate necessitates multiple modems, often exceeding single device capabilities. Consequently, satellites employ several processors, forming a complex packet-switch network. This can lead to potential internal congestion and challenges in adhering to strict quality of service (QoS) constraints. While significant research exists on constellation-level routing, a literature gap remains on the internal routing within a single HTS. The intricacy of this internal network architecture presents a significant challenge to achieve high data rates. This paper introduces an online optimal flow allocation and scheduling method for HTSs. The problem is presented as a multi-commodity flow instance with different priority data streams. An initial full time horizon model is proposed as a benchmark. We apply a model predictive control (MPC) approach to enable adaptive routing based on current information and the forecast within the prediction time horizon while allowing for deviation of the latter. Importantly, MPC is inherently suited to handle uncertainty in incoming flows. Our approach minimizes the packet loss by optimally and adaptively managing the priority queue schedulers and flow exchanges between satellite processing modules. Central to our method is a routing model focusing on optimal priority scheduling to enhance data rates and maintain QoS. The model's stages are critically evaluated, and results are compared to traditional methods via numerical simulations. Through simulations, our method demonstrates performance nearly on par with the hindsight optimum, showcasing its efficiency and adaptability in addressing satellite communication challenges.

Submitted 31 May, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

Comments: Added constraints and updated numerical results. Layout improvement

arXiv:2306.10835 [pdf, other]

Online Dynamic Submodular Optimization

Authors: Antoine Lesage-Landry, Julien Pallage

Abstract: We propose new algorithms with provable performance for online binary optimization subject to general constraints and in dynamic settings. We consider the subset of problems in which the objective function is submodular. We propose the online submodular greedy algorithm (OSGA) which solves to optimality an approximation of the previous round loss function to avoid the NP-hardness of the original problem. We extend OSGA to a generic approximation function. We show that OSGA has a dynamic regret bound similar to the tightest bounds in online convex optimization with respect to the time horizon and the cumulative round optimum variation. For instances where no approximation exists or a computationally simpler implementation is desired, we design the online submodular projected gradient descent (OSPGD) by leveraging the Lovász extension. We obtain a regret bound that is akin to the conventional online gradient descent (OGD). Finally, we numerically test our algorithms in two power system applications: fast-timescale demand response and real-time distribution network reconfiguration.

Submitted 2 May, 2024; v1 submitted 19 June, 2023; originally announced June 2023.
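
The abstract above relies on the Lovász extension, which turns a set function into a function of a real vector that gradient methods can handle. The sketch below shows only the standard textbook construction (sort the coordinates in decreasing order and weight each by its marginal gain); it is not the paper's OSPGD algorithm. The names `lovasz_extension`, `cut_value`, and `EDGES` are hypothetical, and the three-node path graph is an assumed toy example.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Assumed toy example: the cut function of the path graph 0 - 1 - 2 (submodular).
EDGES = [(0, 1), (1, 2)]


def cut_value(subset: FrozenSet[int]) -> int:
    """Number of edges with exactly one endpoint in `subset`."""
    return sum(1 for i, j in EDGES if (i in subset) != (j in subset))


def lovasz_extension(
    set_fn: Callable[[FrozenSet[int]], float], x: Sequence[float]
) -> Tuple[float, List[float]]:
    """Return the value and one subgradient of the Lovász extension at x.

    Assumes set_fn(frozenset()) == 0, the usual normalization.
    """
    order = sorted(range(len(x)), key=lambda i: -x[i])  # decreasing coordinates
    value = 0.0
    subgradient = [0.0] * len(x)
    chosen = set()
    previous = set_fn(frozenset())
    for i in order:
        chosen.add(i)
        current = set_fn(frozenset(chosen))
        subgradient[i] = current - previous  # marginal gain of adding element i
        value += x[i] * subgradient[i]
        previous = current
    return value, subgradient
```

On a 0/1 indicator vector the extension reproduces the set function, for example `lovasz_extension(cut_value, [1.0, 1.0, 0.0])[0]` equals `cut_value(frozenset({0, 1}))`. For a submodular set function the extension is convex, which is what makes a projected (sub)gradient step on the relaxed variable meaningful.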

arXiv:2301.02593 [pdf, other]

Multi-Agent Reinforcement Learning for Fast-Timescale Demand Response of Residential Loads

Authors: Vincent Mai, Philippe Maisonneuve, Tianyu Zhang, Hadi Nekoei, Liam Paull, Antoine Lesage-Landry

Abstract: To integrate high amounts of renewable energy resources, electrical power grids must be able to cope with high amplitude, fast timescale variations in power generation. Frequency regulation through demand response has the potential to coordinate temporally flexible loads, such as air conditioners, to counteract these variations. Existing approaches for discrete control with dynamic constraints struggle to provide satisfactory performance for fast timescale action selection with hundreds of agents. We propose a decentralized agent trained with multi-agent proximal policy optimization with localized communication. We explore two communication frameworks: hand-engineered, or learned through targeted multi-agent communication. The resulting policies perform well and robustly for frequency regulation, and scale seamlessly to arbitrary numbers of houses for constant processing times.

Submitted 6 January, 2023; originally announced January 2023.

Comments: Presented as an extended abstract at AAMAS 2023

arXiv:2104.09343 [pdf, other]

Approximated Multi-Agent Fitted Q Iteration

Authors: Antoine Lesage-Landry, Duncan S. Callaway

Abstract: We formulate an efficient approximation for multi-agent batch reinforcement learning, the approximated multi-agent fitted Q iteration (AMAFQI). We present a detailed derivation of our approach. We propose an iterative policy search and show that it yields a greedy policy with respect to multiple approximations of the centralized, learned Q-function. In each iteration and policy evaluation, AMAFQI requires a number of computations that scales linearly with the number of agents whereas the analogous number of computations increases exponentially for the fitted Q iteration (FQI), a commonly used approach in batch reinforcement learning. This property of AMAFQI is fundamental for the design of a tractable multi-agent approach. We evaluate the performance of AMAFQI and compare it to FQI in numerical simulations. The simulations illustrate the significant computation time reduction when using AMAFQI instead of FQI in multi-agent problems and corroborate the similar performance of both approaches.

Submitted 4 April, 2023; v1 submitted 19 April, 2021; originally announced April 2021.

arXiv:2005.02274 [pdf, other]

doi 10.1109/TAC.2021.3061625

Online Convex Optimization with Binary Constraints

Authors: Antoine Lesage-Landry, Joshua A. Taylor, Duncan S. Callaway

Abstract: We consider online optimization with binary decision variables and convex loss functions. We design a new algorithm, binary online gradient descent (bOGD), and bound its expected dynamic regret. We provide a regret bound that holds for any time horizon and a specialized bound for finite time horizons. First, we present the regret as the sum of the relaxed, continuous round optimum tracking error and the rounding error of our update in which the former asymptotically decreases with time under certain conditions. Then, we derive a finite-time bound that is sublinear in time and linear in the cumulative variation of the relaxed, continuous round optima. We apply bOGD to demand response with thermostatically controlled loads, in which binary constraints model discrete on/off settings. We also model uncertainty and varying load availability, which depend on temperature deadbands, lockout of cooling units and manual overrides. We test the performance of bOGD in several simulations based on demand response. The simulations corroborate that the use of randomization in bOGD does not significantly degrade performance while making the problem more tractable.

Submitted 19 February, 2021; v1 submitted 5 May, 2020; originally announced May 2020.

Journal ref: IEEE Transactions on Automatic Control 66 (12): 6164 - 6170. December 2021
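
The split described in the abstract above, between tracking a relaxed continuous optimum and a rounding error, can be pictured with a minimal sketch. This is not the paper's bOGD update: it assumes a projected gradient step on the relaxation to [0, 1]^n followed by independent Bernoulli rounding, and it assumes a hypothetical tracking loss f_t(y) = (sum_i p_i y_i - s_t)^2 in which on/off loads with powers `powers` follow a signal `signals`. All function and parameter names are illustrative.

```python
import random
from typing import List, Sequence, Tuple


def project_unit_box(z: Sequence[float]) -> List[float]:
    """Euclidean projection onto [0, 1]^n (coordinate-wise clipping)."""
    return [min(1.0, max(0.0, zi)) for zi in z]


def randomized_round(y: Sequence[float], rng: random.Random) -> List[int]:
    """Sample each binary decision as 1 with probability y_i (assumed rounding rule)."""
    return [1 if rng.random() < yi else 0 for yi in y]


def run_sketch(
    powers: Sequence[float],
    signals: Sequence[float],
    step: float = 0.05,
    seed: int = 0,
) -> Tuple[List[List[int]], List[List[float]]]:
    """Return the binary decisions and the relaxed iterates for each round."""
    rng = random.Random(seed)
    y = [0.5] * len(powers)
    decisions, relaxed = [], []
    for s in signals:
        relaxed.append(list(y))
        decisions.append(randomized_round(y, rng))  # binary decision played this round
        # Gradient of the hypothetical loss (sum_i p_i*y_i - s)^2 at the relaxed point.
        mismatch = sum(p * yi for p, yi in zip(powers, y)) - s
        grad = [2.0 * mismatch * p for p in powers]
        y = project_unit_box([yi - step * g for yi, g in zip(y, grad)])
    return decisions, relaxed
```

With four unit-power loads and a constant signal of 3, for example `run_sketch([1.0] * 4, [3.0] * 50)`, the relaxed aggregate `sum(relaxed[-1])` approaches 3 while each sampled decision stays in {0, 1}.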

arXiv:2002.00099 [pdf, other]

doi 10.1109/LCSYS.2020.2989110

Dynamic and Distributed Online Convex Optimization for Demand Response of Commercial Buildings

Authors: Antoine Lesage-Landry, Duncan S. Callaway

Abstract: We extend the regret analysis of the online distributed weighted dual averaging (DWDA) algorithm [1] to the dynamic setting and provide the tightest dynamic regret bound known to date with respect to the time horizon for a distributed online convex optimization (OCO) algorithm. Our bound is linear in the cumulative difference between consecutive optima and does not depend explicitly on the time horizon. We use dynamic-online DWDA (D-ODWDA) and formulate a performance-guaranteed distributed online demand response approach for heating, ventilation, and air-conditioning (HVAC) systems of commercial buildings. We show the performance of our approach for fast timescale demand response in numerical simulations and obtain demand response decisions that closely reproduce the centralized optimal ones.

Submitted 17 April, 2020; v1 submitted 31 January, 2020; originally announced February 2020.

Journal ref: IEEE Control Systems Letters, 4 (3): 632-637. July 2020
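
For readers unfamiliar with the terms in the abstract above, the usual single-agent definitions of dynamic regret and of the cumulative difference between consecutive optima are written out below. The symbols $f_t$, $x_t$, $x_t^\ast$, $\mathcal{X}$, and $V_T$ are standard notation assumed here, not taken from the abstract, and the distributed setting of the paper may define regret per agent or over the network.

```latex
% Dynamic regret: the learner's loss compared with the best decision of each round.
\mathrm{Reg}^{\mathrm{d}}_T
  = \sum_{t=1}^{T} f_t(x_t) - \sum_{t=1}^{T} f_t(x_t^\ast),
\qquad
x_t^\ast \in \operatorname*{arg\,min}_{x \in \mathcal{X}} f_t(x).

% Cumulative variation of the round optima.
V_T = \sum_{t=2}^{T} \bigl\lVert x_t^\ast - x_{t-1}^\ast \bigr\rVert .

% A bound of the kind described in the abstract (linear in V_T, with no explicit
% dependence on T) would take the form below; constants are not given in the abstract.
\mathrm{Reg}^{\mathrm{d}}_T \le O\left(1 + V_T\right).
```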

arXiv:1905.06263 [pdf, ps, other]

doi 10.1016/j.automatica.2019.108771

Predictive Online Convex Optimization

Authors: Antoine Lesage-Landry, Iman Shames, Joshua A. Taylor

Abstract: We incorporate future information in the form of the estimated value of future gradients in online convex optimization. This is motivated by demand response in power systems, where forecasts about the current round, e.g., the weather or the loads' behavior, can be used to improve on predictions made with only past observations. Specifically, we introduce an additional predictive step that follows the standard online convex optimization step when certain conditions on the estimated gradient and descent direction are met. We show that under these conditions and without any assumptions on the predictability of the environment, the predictive update strictly improves on the performance of the standard update. We give two types of predictive update for various families of loss functions. We provide a regret bound for each of our predictive online convex optimization algorithms. Finally, we apply our framework to an example based on demand response which demonstrates its superior performance to a standard online convex optimization algorithm.

Submitted 29 November, 2019; v1 submitted 15 May, 2019; originally announced May 2019.

Journal ref: Automatica, 113: 108771, March 2020

Showing 1–8 of 8 results for author: Lesage-Landry, A