Reinforcement Learning-based Application Autoscaling in the Cloud: A Survey
Authors:
Yisel Garí,
David A. Monge,
Elina Pacini,
Cristian Mateos,
Carlos García Garino
Abstract:
Reinforcement Learning (RL) has demonstrated great potential for automatically solving decision-making problems in complex, uncertain environments. RL proposes a computational approach that allows learning through interaction in an environment with stochastic behavior, where agents take actions to maximize cumulative short-term and long-term rewards. Some of the most impressive results have been shown in Game Theory, where agents exhibited superhuman performance in games like Go or StarCraft II, which led to its gradual adoption in many other domains, including Cloud Computing. RL therefore appears to be a promising approach for autoscaling in the Cloud, since it makes it possible to learn transparent (with no human intervention), dynamic (no static plans), and adaptable (constantly updated) resource management policies to execute applications. These are three important distinctive aspects to consider in comparison with other widely used autoscaling policies that are defined in an ad hoc way or statically computed, as in solutions based on meta-heuristics. Autoscaling exploits Cloud elasticity to optimize the execution of applications according to given optimization criteria, which requires deciding when and how to scale computational resources up or down, and how to assign them to the upcoming processing workload. Such actions have to be taken considering that the Cloud is a dynamic and uncertain environment. Motivated by this, many works apply RL to the autoscaling problem in the Cloud. In this work, we exhaustively survey those proposals from major venues and uniformly compare them based on a set of proposed taxonomies. We also discuss open problems and prospective research in the area.
Submitted 17 November, 2020; v1 submitted 27 January, 2020;
originally announced January 2020.
CMI: An Online Multi-objective Genetic Autoscaler for Scientific and Engineering Workflows in Cloud Infrastructures with Unreliable Virtual Machines
Authors:
David A. Monge,
Elina Pacini,
Cristian Mateos,
Enrique Alba,
Carlos García Garino
Abstract:
Cloud Computing is becoming the leading paradigm for executing scientific and engineering workflows. The large-scale nature of the experiments they model and their variable workloads make clouds the ideal execution environment due to prompt and elastic access to huge amounts of computing resources. Autoscalers are middleware-level software components that scale the computing platform up and down by acquiring or terminating virtual machines (VMs) while a workflow's tasks are being scheduled. In this work we propose a novel online multi-objective autoscaler for workflows called Cloud Multi-objective Intelligence (CMI), which aims to minimize makespan, monetary cost, and the potential impact of errors derived from unreliable VMs. In addition, this problem is subject to monetary budget constraints. CMI is responsible for periodically solving the autoscaling problems encountered during the execution of a workflow. Simulation experiments on four well-known workflows show that CMI significantly outperforms a state-of-the-art autoscaler of similar characteristics called Spot Instances Aware Autoscaling (SIAA). These results provide a solid basis for further study of other meta-heuristic methods for autoscaling workflow applications using cheap but unreliable infrastructures.
Submitted 2 November, 2018;
originally announced November 2018.