-
UCB-driven Utility Function Search for Multi-objective Reinforcement Learning
Authors:
Yucheng Shi,
Alexandros Agapitos,
David Lynch,
Giorgio Cruciata,
Cengis Hasan,
Hao Wang,
Yayu Yao,
Aleksandar Milenovic
Abstract:
In Multi-objective Reinforcement Learning (MORL), agents are tasked with optimising decision-making behaviours that trade off multiple, possibly conflicting, objectives. MORL based on decomposition is a family of solution methods that employ a number of utility functions to decompose the multi-objective problem into individual single-objective problems, which are solved simultaneously in order to approximate a Pareto front of policies. We focus on the case of linear utility functions parameterised by weight vectors w. We introduce a method based on Upper Confidence Bound to efficiently search for the most promising weight vectors during different stages of the learning process, with the aim of maximising the hypervolume of the resulting Pareto front. The proposed method is shown to outperform various MORL baselines on MuJoCo benchmark problems across different random seeds. The code is online at: https://github.com/SYCAMORE-1/ucb-MOPPO.
Submitted 16 May, 2024; v1 submitted 1 May, 2024;
originally announced May 2024.
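As a rough illustration of the idea in the abstract above, the following minimal sketch applies the textbook UCB1 rule to a fixed set of candidate weight vectors. It is not the authors' algorithm: the names `ucb1_select`, `run_search` and `reward_fn` are hypothetical, and `reward_fn` is only a stand-in for whatever signal (for example, a hypervolume improvement) would be observed after training with a given weight vector.

```python
import math
import random


def ucb1_select(counts, means, t, c=2.0):
    """Return the index of the arm with the highest UCB1 score."""
    # Play every arm once before relying on the confidence bound.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    scores = [m + math.sqrt(c * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def run_search(weights, reward_fn, rounds, seed=0):
    """Repeatedly pick a weight vector with UCB1 and update its mean reward.

    reward_fn(w, rng) is a hypothetical callback returning a scalar reward
    for weight vector w; it is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    counts = [0] * len(weights)
    means = [0.0] * len(weights)
    for t in range(1, rounds + 1):
        i = ucb1_select(counts, means, t)
        r = reward_fn(weights[i], rng)
        counts[i] += 1
        means[i] += (r - means[i]) / counts[i]  # incremental mean update
    return counts, means


if __name__ == "__main__":
    # Three candidate linear-utility weight vectors for two objectives.
    candidates = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
    # Toy reward: noisy, highest on average for the balanced weight vector.
    toy = lambda w, rng: rng.gauss(1.0 - abs(w[0] - w[1]), 0.1)
    print(run_search(candidates, toy, rounds=200))
```

The exploration bonus shrinks as a weight vector is selected more often, so the search concentrates on vectors whose observed reward stays high while still occasionally revisiting the others.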
-
Continual Model-based Reinforcement Learning for Data Efficient Wireless Network Optimisation
Authors:
Cengis Hasan,
Alexandros Agapitos,
David Lynch,
Alberto Castagna,
Giorgio Cruciata,
Hao Wang,
Aleksandar Milenovic
Abstract:
We present a method that addresses the pain point of long lead-time required to deploy cell-level parameter optimisation policies to new wireless network sites. Given a sequence of action spaces represented by overlapping subsets of cell-level configuration parameters provided by domain experts, we formulate throughput optimisation as Continual Reinforcement Learning of control policies. Simulation results suggest that the proposed system is able to shorten the end-to-end deployment lead-time by two-fold compared to a reinitialise-and-retrain baseline without any drop in optimisation gain.
Submitted 30 April, 2024;
originally announced April 2024.
-
Rate-Induced Tipping in Heterogeneous Reaction-Diffusion Systems: An Invariant Manifold Framework and Geographically Shifting Ecosystems
Authors:
Cris R. Hasan,
Ruaidhrí Mac Cárthaigh,
Sebastian Wieczorek
Abstract:
We propose a framework to study tipping points in reaction-diffusion equations (RDEs) in one spatial dimension, where the reaction term decays in space (asymptotically homogeneous) and varies linearly with time (nonautonomous) due to an external input. A compactification of the moving-frame coordinate together with Lin's method to construct heteroclinic orbits along intersections of stable and unstable invariant manifolds allows us to (i) obtain multiple coexisting pulse and front solutions for the RDE by computing heteroclinic orbits connecting equilibria at negative and positive infinity in the compactified moving-frame ordinary differential equation, (ii) detect tipping points as dangerous bifurcations of such heteroclinic orbits, and (iii) obtain tipping diagrams by numerical continuation of such bifurcations. We apply our framework to an illustrative model of a habitat patch that features an Allee effect in population growth and is geographically shrinking or shifting due to human activity or climate change. Thus, we identify two classes of tipping points to extinction: bifurcation-induced tipping (B-tipping) when the shrinking habitat falls below some critical length and rate-induced tipping (R-tipping) when the shifting habitat exceeds some critical speed. We explore two-parameter R-tipping diagrams to understand how the critical speed depends on the size of the habitat patch and the dispersal rate of the population, uncover parameter regions where the shifting population survives, and relate these regions to the invasion speed in an infinite homogeneous habitat. Furthermore, we contrast the tipping instabilities with gradual transitions to extinction found for logistic population growth without the Allee effect.
Submitted 3 December, 2023; v1 submitted 23 November, 2022;
originally announced November 2022.
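The contrast drawn in the abstract above between growth with an Allee effect and logistic growth can be seen in the spatially homogeneous case. The sketch below uses standard textbook forms of the two growth terms and a forward-Euler integrator; it is not the paper's reaction-diffusion model, and the parameter names `r`, `K`, `A` and their values are assumptions chosen for illustration.

```python
def logistic(u, r=1.0, K=1.0):
    """Logistic growth: positive for every density 0 < u < K."""
    return r * u * (1.0 - u / K)


def strong_allee(u, r=1.0, K=1.0, A=0.2):
    """Strong Allee effect: growth is negative below the threshold A."""
    return r * u * (1.0 - u / K) * (u / A - 1.0)


def integrate(f, u0, dt=0.01, steps=5000):
    """Forward-Euler integration of du/dt = f(u) from u(0) = u0."""
    u = u0
    for _ in range(steps):
        u += dt * f(u)
    return u


if __name__ == "__main__":
    for u0 in (0.1, 0.3):
        print(u0, integrate(logistic, u0), integrate(strong_allee, u0))
```

Under these assumptions, a population starting below the threshold `A` decays to zero with the Allee term but recovers to the carrying capacity `K` with the logistic term, which is the abrupt-versus-gradual distinction the abstract refers to.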
-
Incentive Mechanism Design for Federated Learning: Hedonic Game Approach
Authors:
Cengis Hasan
Abstract:
Incentive mechanism design is crucial for enabling federated learning. We deal with the clustering problem of agents contributing to a federated learning setting. Assuming agents behave selfishly, we model their interaction as a stable coalition partition problem using hedonic games where agents and clusters are the players and coalitions, respectively. We address the following question: is there a family of hedonic games ensuring a Nash-stable coalition partition? We propose the Nash-stable set, which determines the family of hedonic games possessing at least one Nash-stable partition, and analyze the conditions of non-emptiness of the Nash-stable set. In addition, we deal with decentralized clustering. We formulate the problem as a non-cooperative game and prove the existence of a potential game.
Submitted 7 May, 2021; v1 submitted 24 January, 2021;
originally announced January 2021.
-
Numerical continuation of spiral waves in heteroclinic networks of cyclic dominance
Authors:
Cris R. Hasan,
Hinke M. Osinga,
Claire M. Postlethwaite,
Alastair M. Rucklidge
Abstract:
Heteroclinic-induced spiral waves may arise in systems of partial differential equations that exhibit robust heteroclinic cycles between spatially uniform equilibria. Robust heteroclinic cycles arise naturally in systems with invariant subspaces and their robustness is considered with respect to perturbations that preserve these invariances. We make use of particular symmetries in the system to formulate a relatively low-dimensional spatial two-point boundary-value problem in Fourier space that can be solved efficiently in conjunction with numerical continuation. The standard numerical set-up is formulated on an annulus with small inner radius, and Neumann boundary conditions are used on both inner and outer radial boundaries. We derive and implement alternative boundary conditions that allow for continuing the inner radius to zero and so compute spiral waves on a full disk. As our primary example, we investigate the formation of heteroclinic-induced spiral waves in a reaction-diffusion model that describes the spatiotemporal evolution of three competing populations in a two-dimensional spatial domain--much like the Rock-Paper-Scissors game. We further illustrate the efficiency of our method with the computation of spiral waves in a larger network of cyclic dominance between five competing species, which describes the so-called Rock-Paper-Scissors-Lizard-Spock game.
Submitted 15 November, 2021; v1 submitted 11 October, 2020;
originally announced October 2020.
-
Spatiotemporal stability of periodic travelling waves in a heteroclinic-cycle model
Authors:
Cris R. Hasan,
Hinke M. Osinga,
Claire M. Postlethwaite,
Alastair M. Rucklidge
Abstract:
We study a Rock-Paper-Scissors model for competing populations that exhibits travelling waves in one spatial dimension and spiral waves in two spatial dimensions. A characteristic feature of the model is the presence of a robust heteroclinic cycle that involves three saddle equilibria. The model also has travelling fronts that are heteroclinic connections between two equilibria in a moving frame of reference, but these fronts are unstable. However, we find that large-wavelength travelling waves can be stable in spite of being made up of three of these unstable travelling fronts. In this paper, we focus on determining the essential spectrum (and hence, stability) of large-wavelength travelling waves in a cyclic competition model with one spatial dimension. We compute the curve of transitions from stability to instability with the continuation scheme developed by Rademacher et al. (2007 Physica D 229 166-83). We build on this scheme and develop a method for computing what we call belts of instability, which are indicators of the growth rate of unstable travelling waves. Our results from the stability analysis are verified by direct simulation for travelling waves as well as associated spiral waves. We also show how the computed growth rates accurately quantify the instabilities of the travelling waves.
Submitted 13 November, 2021; v1 submitted 23 November, 2019;
originally announced November 2019.
-
Deadline-aware Power Management in Data Centers
Authors:
Cengis Hasan,
Zygmunt J. Haas
Abstract:
We study the dynamic power optimization problem in data centers. We formulate and solve the following offline problem: in which slot which server has to be assigned to which job; and in which slot which server has to be switched ON or OFF so that the total power is optimal for some time horizon. We show that the offline problem is a new version of the generalized assignment problem, with new constraints arising from the deadline characteristics of jobs and differences in the activation energy of servers. We propose an online algorithm that solves the problem heuristically and compare it to randomized routing.
Submitted 13 July, 2015;
originally announced July 2015.
-
Green Broadcast Transmission in Cellular Networks: A Game Theoretic Approach
Authors:
Cengis Hasan,
Jean-Marie Gorce,
Eitan Altman
Abstract:
This paper addresses the mobile assignment problem in a multi-cell broadcast transmission seeking minimal total power consumption by considering both transmission and operational powers. Because the large-scale nature of the problem calls for distributed solutions, game theory appears to be a natural tool. We propose a novel distributed algorithm based on group formation games, called the hedonic decision algorithm. This formalism is constructive: a new class of group formation games is introduced where the utility of players within a group is separable and symmetric, being a generalized version of parity-affiliation games. The proposed hedonic decision algorithm is also suitable for any set-covering problem. To evaluate the performance of our algorithm, we propose other approaches to which our algorithm is compared. We first develop a centralized recursive algorithm, called the hold minimum, that is able to find the optimal assignments. However, because of the NP-hard complexity of the mobile assignment problem, we propose a centralized polynomial-time heuristic algorithm, called the column control, that produces near-optimal solutions when the operational power costs of base stations are taken into account. Starting from this efficient centralized approach, a distributed column control algorithm is also proposed and compared to the hedonic decision algorithm. We also implement the nearest base station algorithm, which is very simple and intuitive and efficiently manages fast-moving users served by macro BSs. Extensive simulation results are provided and highlight the relative performance of these algorithms. The simulated scenarios are generated according to Poisson point processes for both mobiles and base stations.
Submitted 3 August, 2014;
originally announced August 2014.
-
On the Nash Stability in the Hedonic Coalition Formation Games
Authors:
Cengis Hasan,
Jean-Marie Gorce,
Eitan Altman
Abstract:
This paper studies Nash stability in hedonic coalition formation games. We address the following issue: for a general problem formulation, is there any utility allocation method ensuring a Nash-stable partition? We propose the definition of the Nash-stable core. We study the conditions for having a non-empty Nash-stable core. More precisely, we show how relaxed efficiency in the utility-sharing method makes it possible to ensure a non-empty Nash-stable core.
Submitted 31 October, 2017; v1 submitted 14 May, 2014;
originally announced May 2014.
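To make the notion of a Nash-stable partition in the abstract above concrete, here is a minimal sketch of a stability check. It assumes additively separable preferences given by a hypothetical matrix `v`, where `v[i][j]` is the value player `i` assigns to sharing a coalition with player `j`; this preference model and the function names are assumptions for illustration, not the paper's general formulation. A partition is treated as Nash-stable when no single player strictly gains by moving alone to another existing coalition or to an empty one.

```python
def utility(i, coalition, v):
    """Additively separable utility of player i inside the given coalition."""
    return sum(v[i][j] for j in coalition if j != i)


def is_nash_stable(partition, v):
    """Check that no player strictly prefers a unilateral move.

    partition is a list of disjoint sets of player indices.
    """
    for S in partition:
        for i in S:
            current = utility(i, S, v)
            # Candidate moves: the empty coalition or any other existing one.
            alternatives = [set()] + [T for T in partition if T is not S]
            for T in alternatives:
                if utility(i, T | {i}, v) > current:
                    return False
    return True


if __name__ == "__main__":
    # Players 0 and 1 like each other; both dislike player 2 (symmetric values).
    v = [[0, 1, -1],
         [1, 0, -1],
         [-1, -1, 0]]
    print(is_nash_stable([{0, 1}, {2}], v))
    print(is_nash_stable([{0, 1, 2}], v))
```

In the toy instance, grouping players 0 and 1 together and leaving player 2 alone gives nobody a profitable deviation, whereas the grand coalition does not survive because player 2 would rather be alone.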
-
The Coalitional Switch off Game of Service Providers
Authors:
Cengis Hasan,
Eitan Altman,
Jean-Marie Gorce,
Dimitrios Tsilimantos,
Manjesh K. Hanawal
Abstract:
This paper studies a significant problem in green networking, namely switching off base stations when service providers cooperate, by means of stochastic geometry and coalitional game tools. The coalitional game considered here is played by service providers who cooperate in switching off base stations. When they cooperate, any mobile is associated with the nearest BS of any service provider. Given a Poisson point process deployment model of nodes over an area and switching off base stations with some probability, it is proved that the distribution of signal to interference plus noise ratio remains unchanged while the transmission power is increased up to preserving the quality of service. The coalitional game is said to be hedonic if the gain of any player depends solely on the members of the coalition to which the player belongs; thus, the coalitions form as a result of the preferences of the players over their possible coalitions' set. We also introduce a novel concept called the Nash-stable core, containing those gain allocation methods that result in Nash-stable partitions. In this way, we always guarantee Nash stability. We study the non-emptiness of the Nash-stable core. Assuming the choice of a coalition is performed by only one player at a time, we prove that the Nash-stable core is non-empty when a player chooses its coalition in its turn and gains zero utility if the chosen coalition has been visited before by that player.
Submitted 30 July, 2013; v1 submitted 24 July, 2012;
originally announced July 2012.
-
On an Efficient Marie Curie Initial Training Network
Authors:
Ali Dinler,
Cengis Hasan,
Kamil Orucoglu,
Robert W. Barber
Abstract:
Collaboration in science is one of the key components of world-class research. The European Commission supports collaboration between institutions and funds young researchers appointed by these partner institutions. In these networks, the mobility of the researchers is enforced in order to enhance the collaboration. In this study, based on a real Marie Curie Initial Training Network, an algorithm to construct a collaboration network is investigated. The algorithm suggests that a strongly efficient expansion leads to a star-like network. The results might help the design of efficient collaboration networks for future Initial Training Network proposals.
Submitted 11 July, 2011;
originally announced July 2011.