-
SMSE: A Serverless Platform for Multimedia Cloud Systems
Authors:
Chavit Denninnart,
Mohsen Amini Salehi
Abstract:
Along with the rise of domain-specific computing (ASIC hardware) and domain-specific programming languages, we envision that the next step is the emergence of domain-specific cloud platforms. Developing such platforms for popular applications in a serverless manner can not only offer higher efficiency to both users and providers, but also expedite application development cycles and enable users to become solution-oriented and focus on their specific business logic. Considering multimedia streaming as one of the trendiest applications in the IT industry, the goal of this study is to develop SMSE, the first domain-specific serverless platform for multimedia streaming. SMSE democratizes multimedia service development by enabling content providers (or even end-users) to rapidly develop their desired functionalities on their multimedia content. Upon developing SMSE, the next goal of this study is to deal with its efficiency challenges and develop a function container provisioning method that can efficiently utilize cloud resources and improve the users' QoS. In particular, we develop a dynamic method that provisions durable or ephemeral containers depending on the spatiotemporal and data-dependency characteristics of the functions. Evaluating the prototype implementation of SMSE under real-world settings demonstrates its capability to reduce both the containerization overhead and the makespan of serving multimedia processing functions (by up to 30%) compared to the function provisioning methods used in general-purpose serverless cloud systems.
Submitted 29 September, 2023; v1 submitted 6 January, 2022;
originally announced January 2022.
-
Efficiency in the Serverless Cloud Paradigm: A Survey on the Reusing and Approximation Aspects
Authors:
Chavit Denninnart,
Thanawat Chanikaphon,
Mohsen Amini Salehi
Abstract:
Serverless computing, along with Function-as-a-Service (FaaS), is forming a new computing paradigm that is anticipated to found the next generation of cloud systems. The popularity of this paradigm is due to its highly transparent infrastructure, which enables user applications to scale at the granularity of their functions. Since these often small and single-purpose functions are managed on shared computing resources behind the scenes, a great potential for computational reuse and approximate computing emerges that, if unleashed, can remarkably improve the efficiency of serverless cloud systems, both from the user's perspective (QoS) and the system's perspective (energy consumption and incurred cost). Accordingly, the goal of this survey study is to, first, unfold the internal mechanics of serverless computing and, second, explore the scope for efficiency within this paradigm by studying function reuse and approximation approaches and discussing the pros and cons of each one. Next, we outline potential future research directions within this paradigm that can either unlock new use cases or make the paradigm more efficient.
Submitted 25 June, 2023; v1 submitted 13 October, 2021;
originally announced October 2021.
-
Harnessing the Potential of Function-Reuse in Multimedia Cloud Systems
Authors:
Chavit Denninnart,
Mohsen Amini Salehi
Abstract:
Cloud-based computing systems can get oversubscribed due to the budget constraints of their users or limitations in certain resource types. The oversubscription can, in turn, degrade the users' perceived Quality of Service (QoS). The approach we investigate to mitigate both the oversubscription and the incurred cost is based on smart reuse of the computation needed to process the service requests (i.e., tasks). We propose a reusing paradigm for the tasks that are waiting for execution. This paradigm can be particularly impactful in serverless platforms where multiple users can request similar services simultaneously. Our motivation is a multimedia streaming engine that processes the media segments in an on-demand manner. We propose a mechanism to identify various types of "mergeable" tasks and aggregate them to improve the QoS and mitigate the incurred cost. We develop novel approaches to determine when and how to perform task aggregation such that the QoS of other tasks is not affected. Evaluation results show that the proposed mechanism can improve the QoS by significantly reducing the percentage of tasks missing their deadlines. In addition, it can reduce the overall time (and subsequently the incurred cost) of utilizing cloud services by more than 9%.
Submitted 9 April, 2021;
originally announced April 2021.
-
Descriptive and Predictive Analysis of Aggregating Functions in Serverless Clouds: the Case of Video Streaming
Authors:
Shangrui Wu,
Chavit Denninnart,
Xiangbo Li,
Yang Wang,
Mohsen Amini Salehi
Abstract:
Serverless clouds allocate multiple tasks (e.g., micro-services) from multiple users on a shared pool of computing resources. This enables serverless cloud providers to reduce their resource usage by transparently aggregating similar tasks of a certain context (e.g., video processing) that share the whole or part of their computation. To this end, it is crucial to know the amount of time-saving achieved by aggregating the tasks. Lack of such knowledge can lead to uninformed merging and scheduling decisions that, in turn, can cause deadline violations of either the merged tasks or other following tasks. Accordingly, in this paper, we study the problem of estimating the execution-time saving resulting from merging tasks, using video processing as the example context. To learn the execution-time saving in different forms of merging, we first establish a set of benchmarking videos and examine a wide variety of video processing tasks, with and without merging in place. We observed that although merging can save up to 44% of the execution time, the number of possible merging cases is intractable. Hence, in the second part, we leverage the benchmarking results and develop a method based on Gradient Boosting Decision Tree (GBDT) to estimate the time-saving for any given task merging case. Experimental results show that the method can estimate the time-saving with an error of 0.04, measured as Root Mean Square Error (RMSE).
Submitted 10 December, 2020;
originally announced December 2020.
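For reference, the RMSE metric quoted in the abstract above is conventionally defined as shown below. The symbols are notation assumed here for illustration and are not taken from the paper: $n$ is the number of evaluated merging cases, $s_i$ the measured time-saving of case $i$, and $\hat{s}_i$ the saving predicted by the model.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( \hat{s}_i - s_i \right)^2}
```

RMSE is expressed in the same unit as the predicted quantity, so the reported 0.04 should be read in whatever unit the paper uses for time-saving.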
-
Cost- and QoS-Efficient Serverless Cloud Computing
Authors:
Chavit Denninnart
Abstract:
Cloud-based serverless computing systems, either public or privately provisioned, aim to provide the illusion of infinite resources and abstract users from details of the allocation decisions. With the goal of providing a low cost and a high QoS, the serverless computing paradigm offers opportunities that can be harnessed to attain these goals. Specifically, our strategy in this dissertation is to avoid redundant computing, both in cases where independent task requests are similar to each other and for tasks that are pointless to process. We explore two main approaches: (A) reusing part of the computation needed to process the services and (B) proactively pruning tasks with a low chance of success to improve the overall QoS of the system. For the first approach, we propose a mechanism to identify various types of "mergeable" tasks, which can benefit from computational reuse if they are executed together as a group. To evaluate the task merging configurations extensively, we quantify the resource-saving magnitude and then leverage the experimental data to create a resource-saving predictor. We investigate multiple task-merging approaches that suit different workload scenarios to determine when it is appropriate to aggregate tasks and how to allocate them so that the QoS of other tasks is minimally affected. For the second approach, we developed mechanisms to skip tasks whose chance of completing on time is not worth pursuing, by dropping or deferring them. We determined the minimum chance-of-success thresholds that tasks must pass to get scheduled and executed. We dynamically adjust such thresholds based on multiple characteristics of the arriving workload and the system's conditions. We employed approximate computing to reduce the pruning mechanism's computational overhead and ensure that the mechanism can be used practically.
Submitted 23 November, 2020;
originally announced November 2020.
-
The Art of CPU-Pinning: Evaluating and Improving the Performance of Virtualization and Containerization Platforms
Authors:
Davood Ghatreh Samani,
Chavit Denninnart,
Josef Bacik,
Mohsen Amini Salehi
Abstract:
Cloud providers offer a variety of execution platforms in the form of bare-metal, VM, and containers. However, due to the pros and cons of each execution platform, choosing the appropriate platform for a specific cloud-based application has become a challenge for solution architects. The possibility of combining these platforms (e.g., deploying containers within VMs) offers new capacities that make the challenge even more complicated. However, there is little study in the literature on the pros and cons of deploying different application types on various execution platforms. In particular, the evaluation of diverse hardware configurations and different CPU provisioning methods, such as CPU pinning, has not been sufficiently studied in the literature. In this work, the performance overheads of container, VM, and bare-metal execution platforms are measured and analyzed for four categories of real-world applications, namely video processing, parallel processing (MPI), web processing, and NoSQL, respectively representing CPU-intensive, parallel processing, and two IO-intensive processes. Our analyses reveal a set of interesting and sometimes counterintuitive findings that can be used as best practices by solution architects to efficiently deploy cloud-based applications. Here are some notable mentions: (A) Under specific circumstances, containers can impose a higher overhead than VMs; (B) Containers on top of VMs can mitigate the overhead of VMs for certain applications; (C) Containers with a large number of cores impose a lower overhead than those with a few cores.
Submitted 3 June, 2020;
originally announced June 2020.
-
Autonomous Task Dropping Mechanism to Achieve Robustness in Heterogeneous Computing Systems
Authors:
Ali Mokhtari,
Chavit Denninnart,
Mohsen Amini Salehi
Abstract:
Robustness of a distributed computing system is defined as the ability to maintain its performance in the presence of uncertain parameters. Uncertainty is a key problem in heterogeneous (and even homogeneous) distributed computing systems that perturbs system robustness. Notably, the performance of these systems is perturbed by uncertainty in both task execution time and arrival. Accordingly, our goal is to make the system robust against these uncertainties. Considering task execution time as a random variable, we use probabilistic analysis to develop an autonomous proactive task dropping mechanism to attain our robustness goal. Specifically, we provide a mathematical model that identifies the optimality of a task dropping decision, so that the system robustness is maximized. Then, we leverage the mathematical model to develop a task dropping heuristic that achieves the system robustness within a feasible time complexity. Although the proposed model is generic and can be applied to any distributed system, we concentrate on heterogeneous computing (HC) systems that have a higher degree of exposure to uncertainty than homogeneous systems. Experimental results demonstrate that the autonomous proactive dropping mechanism can improve the system robustness by up to 20%.
Submitted 22 May, 2020;
originally announced May 2020.
-
F-FDN: Federation of Fog Computing Systems for Low Latency Video Streaming
Authors:
Vaughan Veillon,
Chavit Denninnart,
Mohsen Amini Salehi
Abstract:
Video streaming is growing in popularity and has become the most bandwidth-consuming Internet service. As such, robust streaming in terms of low latency and uninterrupted streaming experience, particularly for viewers in distant areas, has become a challenge. The common practice to reduce latency is to pre-process multiple versions of each video and use Content Delivery Networks (CDN) to cache videos that are popular in a geographical area. However, with the fast-growing video repository sizes, caching video contents in multiple versions on each CDN is becoming inefficient. Accordingly, in this paper, we propose the architecture for Fog Delivery Networks (FDN) and provide methods to federate them (called F-FDN) to reduce video streaming latency. In addition to caching, FDNs have the ability to process videos in an on-demand manner. F-FDN leverages cached contents on the neighboring FDNs to further reduce latency. In particular, F-FDN is equipped with methods that aim at reducing latency through probabilistically evaluating the cost benefit of fetching video segments either from neighboring FDNs or by processing them. Experimental results against alternative streaming methods show that both on-demand processing and leveraging cached video segments on neighboring FDNs can remarkably reduce streaming latency (on average 52%).
Submitted 11 May, 2019;
originally announced May 2019.
-
Improving Robustness of Heterogeneous Serverless Computing Systems Via Probabilistic Task Pruning
Authors:
Chavit Denninnart,
James Gentry,
Mohsen Amini Salehi
Abstract:
Cloud-based serverless computing is an increasingly popular computing paradigm. In this paradigm, different services have diverse computing requirements that justify deploying an inconsistently Heterogeneous Computing (HC) system to efficiently process them. In an inconsistently HC system, each task needed for a given service potentially exhibits different execution times on each type of machine. An ideal resource allocation system must be aware of such uncertainties in execution times and be robust against them, so that Quality of Service (QoS) requirements of users are met. This research aims to maximize the robustness of an HC system utilized to offer a serverless computing system, particularly when the system is oversubscribed. Our strategy to maximize robustness is to develop a task pruning mechanism that can be added to existing task-mapping heuristics without altering them. Pruning tasks with a low probability of meeting their deadlines improves the likelihood of other tasks meeting their deadlines, thereby increasing system robustness and overall QoS. To evaluate the impact of the pruning mechanism, we examine it on various configurations of heterogeneous and homogeneous computing systems. Evaluation results indicate a considerable improvement (up to 35%) in the system robustness.
Submitted 11 May, 2019;
originally announced May 2019.
-
Robust Dynamic Resource Allocation via Probabilistic Task Pruning in Heterogeneous Computing Systems
Authors:
James Gentry,
Chavit Denninnart,
Mohsen Amini Salehi
Abstract:
In heterogeneous distributed computing (HC) systems, diversity can exist in both computational resources and arriving tasks. In an inconsistently heterogeneous computing system, task types have different execution times on heterogeneous machines. A method is required to map arriving tasks to machines based on machine availability and performance, maximizing the number of tasks meeting deadlines (defined as robustness). For tasks with hard deadlines (e.g., those in live video streaming), tasks that miss their deadlines are dropped. The problem investigated in this research is maximizing the robustness of an oversubscribed HC system. A way to maximize this robustness is to prune (i.e., defer or drop) tasks with low probability of meeting their deadlines to increase the probability of other tasks meeting their deadlines. In this paper, we first provide a mathematical model to estimate a task's probability of meeting its deadline in the presence of task dropping. We then investigate methods for engaging probabilistic dropping and we find thresholds for dropping and deferring. Next, we develop a pruning-aware mapping heuristic and extend it to engender fairness across various task types. We show the cost benefit of using probabilistic pruning in an HC system. Simulation results, harnessing a selection of mapping heuristics, show efficacy of the pruning mechanism in improving robustness (on average by 25%) and cost in an oversubscribed HC system by up to 40%.
Submitted 26 January, 2019;
originally announced January 2019.
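The pruning idea described in the abstract above (estimate a task's probability of meeting its deadline, then drop or defer it when that probability is too low) can be illustrated with a minimal Python sketch. This is not the authors' model: it assumes the execution time is given as a discrete probability mass function and that the task's start time is already known, whereas the paper's model also accounts for dropping of tasks ahead in the queue. The function names and the threshold values 0.25 and 0.5 are hypothetical, chosen only for illustration.

```python
from typing import Dict


def chance_of_success(exec_time_pmf: Dict[float, float],
                      start_time: float,
                      deadline: float) -> float:
    """Return P(start_time + execution_time <= deadline).

    exec_time_pmf maps each possible execution time to its probability
    (an assumed, simplified representation of execution-time uncertainty).
    """
    return sum(prob for exec_time, prob in exec_time_pmf.items()
               if start_time + exec_time <= deadline)


def prune_decision(chance: float,
                   drop_threshold: float = 0.25,   # hypothetical value
                   defer_threshold: float = 0.5    # hypothetical value
                   ) -> str:
    """Classify a task as 'drop', 'defer', or 'map' from its chance of success."""
    if chance < drop_threshold:
        return "drop"
    if chance < defer_threshold:
        return "defer"
    return "map"


# Example: a task that would start at t=1 with three possible execution times.
pmf = {2.0: 0.5, 4.0: 0.3, 8.0: 0.2}
chance = chance_of_success(pmf, start_time=1.0, deadline=5.0)
decision = prune_decision(chance)
```

In this example the task finishes by the deadline when its execution time is 2 or 4, so its chance of success is 0.8 and it is mapped rather than pruned. Tightening the deadline to 2 leaves no feasible outcome, and the same task would be dropped.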
-
Leveraging Computational Reuse for Cost- and QoS-Efficient Task Scheduling in Clouds
Authors:
Chavit Denninnart,
Mohsen Amini Salehi,
Adel Nadjaran Toosi,
Xiangbo Li
Abstract:
Cloud-based computing systems could get oversubscribed due to budget constraints of cloud users, which causes violation of Quality of Experience (QoE) metrics such as tasks' deadlines. We investigate an approach to achieve robustness against uncertain task arrival and oversubscription through smart reuse of computation while similar tasks are waiting for execution. Our motivation in this study is a cloud-based video streaming engine that processes video streaming tasks in an on-demand manner. We propose a mechanism to identify various types of "mergeable" tasks and determine when it is appropriate to aggregate tasks without affecting the QoS of other tasks. Experiments show that our mechanism can improve the robustness of the system and also save more than 14% of the overall time of using cloud services.
Submitted 18 September, 2018;
originally announced September 2018.