-
Scaling Power Management in Cloud Data Centers: A Multi-Level Continuous-Time MDP Approach
Authors:
Behzad Chitsaz,
Ahmad Khonsari,
Masoumeh Moradian,
Aresh Dadlani,
Mohammad Sadegh Talebi
Abstract:
Power management in multi-server data centers, especially at scale, is a vital issue of increasing importance in the cloud computing paradigm. Existing studies mostly consider thresholds on the number of idle servers to switch the servers on or off and suffer from scalability issues. As a natural approach in view of the Markovian assumption, we present a multi-level continuous-time Markov decision process (CTMDP) model based on state aggregation of multi-server data centers with setup times, which overcomes the inherent intractability of traditional MDP approaches caused by their colossal state-action space. While the presented model remains faithful to the Markovian behavior, it approximates the calculation of the transition probabilities in a way that keeps the accuracy of the results at a desirable level. Moreover, near-optimal performance is attained at the expense of increased state-space dimensionality by tuning the number of levels in the multi-level approach. Simulation results confirm that in many scenarios of interest, the proposed approach attains noticeable improvements, namely a near 50% reduction in the size of the CTMDP while yielding better rewards than existing fixed threshold-based policies and aggregation methods.
Submitted 19 July, 2023; v1 submitted 3 August, 2021;
originally announced August 2021.
-
Combinatorial Bandits Revisited
Authors:
Richard Combes,
M. Sadegh Talebi,
Alexandre Proutiere,
Marc Lelarge
Abstract:
This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific regret lower bound, and discuss its scaling with the dimension of the decision space. We propose ESCB, an algorithm that efficiently exploits the structure of the problem, and provide a finite-time analysis of its regret. ESCB has better performance guarantees than existing algorithms, and significantly outperforms these algorithms in practice. In the adversarial setting under bandit feedback, we propose CombEXP, an algorithm with the same regret scaling as state-of-the-art algorithms, but with lower computational complexity for some combinatorial problems.
Submitted 5 November, 2015; v1 submitted 11 February, 2015;
originally announced February 2015.
-
Stochastic Online Shortest Path Routing: The Value of Feedback
Authors:
M. Sadegh Talebi,
Zhenhua Zou,
Richard Combes,
Alexandre Proutiere,
Mikael Johansson
Abstract:
This paper studies online shortest path routing over multi-hop networks. Link costs or delays are time-varying and modeled by independent and identically distributed random processes, whose parameters are initially unknown. The parameters, and hence the optimal path, can only be estimated by routing packets through the network and observing the realized delays. Our aim is to find a routing policy that minimizes the regret (the cumulative difference of expected delay) between the path chosen by the policy and the unknown optimal path. We formulate the problem as a combinatorial bandit optimization problem and consider several scenarios that differ in where routing decisions are made and in the information available when making the decisions. For each scenario, we derive a tight asymptotic lower bound on the regret that has to be satisfied by any online routing policy. These bounds help us to understand the performance improvements we can expect when (i) taking routing decisions at each hop rather than at the source only, and (ii) observing per-link delays rather than end-to-end path delays. In particular, we show that (i) is of no use while (ii) can have a spectacular impact. Three algorithms, with a trade-off between computational complexity and performance, are proposed. The regret upper bounds of these algorithms improve over those of the existing algorithms, and they significantly outperform state-of-the-art algorithms in numerical experiments.
Submitted 18 January, 2017; v1 submitted 27 September, 2013;
originally announced September 2013.
-
Spectrum Bandit Optimization
Authors:
Marc Lelarge,
Alexandre Proutiere,
M. Sadegh Talebi
Abstract:
We consider the problem of allocating radio channels to links in a wireless network. Links interact through interference, modelled as a conflict graph (i.e., two interfering links cannot be simultaneously active on the same channel). We aim at identifying the channel allocation maximizing the total network throughput over a finite time horizon. Should we know the average radio conditions on each channel and on each link, an optimal allocation would be obtained by solving an Integer Linear Program (ILP). When radio conditions are unknown a priori, we look for a sequential channel allocation policy that converges to the optimal allocation while minimizing on the way the throughput loss or regret due to the need for exploring sub-optimal allocations. We formulate this problem as a generic linear bandit problem, and analyze it first in a stochastic setting where radio conditions are driven by a stationary stochastic process, and then in an adversarial setting where radio conditions can evolve arbitrarily. We provide new algorithms in both settings and derive upper bounds on their regrets.
Submitted 17 February, 2015; v1 submitted 27 February, 2013;
originally announced February 2013.
-
NUM-Based Rate Allocation for Streaming Traffic via Sequential Convex Programming
Authors:
Ali Sehati,
Mohammad Sadegh Talebi,
Ahmad Khonsari
Abstract:
In recent years, there has been an increasing demand for ubiquitous streaming-like applications in data networks. In this paper, we concentrate on NUM-based rate allocation for streaming applications with the so-called S-curve utility functions. Due to the non-concavity of such utility functions, the underlying NUM problem is non-convex, and dual methods may therefore be of little use. To tackle the non-convex problem, we use elementary techniques to make the utility of the network concave; however, this results in reverse-convex constraints, which keep the problem non-convex. To deal with such a transformed NUM, we leverage the Sequential Convex Programming (SCP) approach to approximate the non-convex problem by a series of convex ones. Based on this approach, we propose a distributed rate allocation algorithm and demonstrate that, under mild conditions, it converges to a locally optimal solution of the original NUM. Numerical results validate the effectiveness, in terms of tractable convergence, of the proposed rate allocation algorithm.
Submitted 30 September, 2011;
originally announced September 2011.