
MAHTM: A Multi-Agent Framework for Hierarchical Transactive Microgrids

Authors: Nicolas Cuadrado, Roberto Gutierrez, Yongli Zhu, Martin Takac

Abstract: Integrating variable renewable energy into the grid has posed challenges to system operators in achieving optimal trade-offs among energy availability, cost affordability, and pollution controllability. This paper proposes a multi-agent reinforcement learning framework for managing energy transactions in microgrids. The framework addresses the challenges above: it seeks to optimize the usage of available resources by minimizing the carbon footprint while benefiting all stakeholders. The proposed architecture consists of three layers of agents, each pursuing different objectives. The first layer, comprised of prosumers and consumers, minimizes the total energy cost. The other two layers control the energy price to decrease the carbon impact while balancing the consumption and production of both renewable and conventional energy. This framework also takes into account fluctuations in energy demand and supply.

Submitted 14 September, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

Comments: ICLR 2023 Workshop: Tackling Climate Change with Machine Learning

ACM Class: I.2.8

arXiv:2211.02939 [pdf, other]

Optimal Power Flow Pursuit in the Alternating Current Model

Authors: Jie Liu, Antonio Bellon, Andrea Simonetto, Martin Takac, Jakub Marecek

Abstract: Transmission-constrained problems in power systems can be cast as polynomial optimization problems whose coefficients vary over time. We consider the complications therein and suggest several approaches. On the example of the alternating-current optimal power flows (ACOPFs), we illustrate one of the approaches in detail. For the time-varying ACOPF, we provide an upper bound for the difference between the optimal cost for a relaxation using the most recent data and the current approximate optimal cost generated by our algorithm. This bound is a function of the properties of the instance and the rate of change of the coefficients over time. Moreover, we also bound the number of floating-point operations to perform between two subsequent updates to ensure a bounded error.

Submitted 22 September, 2023; v1 submitted 5 November, 2022; originally announced November 2022.

Comments: A journal version of Liu et al [arXiv:1710.07119, PSCC 2018] taking into account our recent work [arXiv:2104.05445 and arXiv:2210.08387]

arXiv:2206.02507 [pdf, other]

Learning to Control under Time-Varying Environment

Authors: Yuzhen Han, Ruben Solozabal, Jing Dong, Xingyu Zhou, Martin Takac, Bin Gu

Abstract: This paper investigates the problem of regret minimization in linear time-varying (LTV) dynamical systems. Due to the simultaneous presence of uncertainty and non-stationarity, designing online control algorithms for unknown LTV systems remains a challenging task. At a cost of NP-hard offline planning, prior works have introduced online convex optimization algorithms, although they suffer from a nonparametric rate of regret. In this paper, we propose the first computationally tractable online algorithm with regret guarantees that avoids offline planning over the state linear feedback policies. Our algorithm is based on the optimism in the face of uncertainty (OFU) principle, in which we optimistically select the best model in a high confidence region. Our algorithm is then more explorative when compared to previous approaches. To overcome non-stationarity, we propose either a restarting strategy (R-OFU) or a sliding window (SW-OFU) strategy. With proper configuration, our algorithm attains sublinear regret $O(T^{2/3})$. These algorithms utilize data from the current phase for tracking variations on the system dynamics. We corroborate our theoretical findings with numerical experiments, which highlight the effectiveness of our methods. To the best of our knowledge, our study establishes the first model-based online algorithm with regret guarantees under LTV dynamical systems.

Submitted 6 June, 2022; originally announced June 2022.

arXiv:2012.10480 [pdf, other]

Distributed Map Classification using Local Observations

Authors: Guangyi Liu, Arash Amini, Martin Takáč, Héctor Muñoz-Avila, Nader Motee

Abstract: We consider the problem of classifying a map using a team of communicating robots. It is assumed that all robots have localized visual sensing capabilities and can exchange their information with neighboring robots. Using a graph decomposition technique, we propose an offline learning structure that makes every robot capable of communicating with and fusing information from its neighbors to plan its next move towards the most informative parts of the environment for map classification purposes. The main idea is to decompose a given undirected graph into a union of directed star graphs and train robots w.r.t. a bounded number of star graphs. This significantly reduces the computational cost of offline training and makes learning scalable (independent of the number of robots). Our approach is particularly useful for fast map classification in large environments using a large number of communicating robots. We validate the usefulness of our proposed methodology through extensive simulations.

Submitted 10 March, 2021; v1 submitted 18 December, 2020; originally announced December 2020.

arXiv:1909.09705 [pdf, other]

A Layered Architecture for Active Perception: Image Classification using Deep Reinforcement Learning

Authors: Hossein K. Mousavi, Guangyi Liu, Weihang Yuan, Martin Takáč, Héctor Muñoz-Avila, Nader Motee

Abstract: We propose a planning and perception mechanism for a robot (agent), that can only observe the underlying environment partially, in order to solve an image classification problem. A three-layer architecture is suggested that consists of a meta-layer that decides the intermediate goals, an action-layer that selects local actions as the agent navigates towards a goal, and a classification-layer that evaluates the reward and makes a prediction. We design and implement these layers using deep reinforcement learning. A generalized policy gradient algorithm is utilized to learn the parameters of these layers to maximize the expected reward. Our proposed methodology is tested on the MNIST dataset of handwritten digits, which provides us with a level of explainability while interpreting the agent's intermediate goals and course of action.

Submitted 20 September, 2019; originally announced September 2019.

Comments: Submitted to ICRA-2020

arXiv:1905.04835 [pdf, other]

Multi-Agent Image Classification via Reinforcement Learning

Authors: Hossein K. Mousavi, Mohammadreza Nazari, Martin Takáč, Nader Motee

Abstract: We investigate a classification problem using multiple mobile agents capable of collecting (partial) pose-dependent observations of an unknown environment. The objective is to classify an image over a finite time horizon. We propose a network architecture on how agents should form a local belief, take local actions, and extract relevant features from their raw partial observations. Agents are allowed to exchange information with their neighboring agents to update their own beliefs. It is shown how reinforcement learning techniques can be utilized to achieve decentralized implementation of the classification problem by running a decentralized consensus protocol. Our experimental results on the MNIST handwritten digit dataset demonstrate the effectiveness of our proposed framework.

Submitted 6 August, 2019; v1 submitted 12 May, 2019; originally announced May 2019.

Comments: Preprint of the paper to be published in IROS'19 proceedings

Showing 1–6 of 6 results for author: Takac, M