Tail Optimality and Performance Analysis of the Nudge-M Scheduling Algorithm

Abstract: Recently it was shown that the response time of First-Come-First-Served (FCFS) scheduling can be stochastically and asymptotically improved upon by the {\it Nudge} scheduling algorithm in case of light-tailed job size distributions. Such improvements are feasible even when the jobs are partitioned into two types and the scheduler only has information about the type of incoming jobs (but not their size). In this paper we introduce Nudge-$M$ scheduling, where basically any incoming type-1 job is allowed to pass any type-2 job that is still waiting in the queue given that it arrived as one of the last $M$ jobs. We prove that Nudge-$M$ has an asymptotically optimal response time within a large family of Nudge scheduling algorithms when job sizes are light-tailed. Simple explicit results for the asymptotic tail improvement ratio (ATIR) of Nudge-$M$ over FCFS are derived as well as explicit results for the optimal parameter $M$. An expression for the ATIR that only depends on the type-1 and type-2 mean job sizes and the fraction of type-1 jobs is presented in the heavy traffic setting. The paper further presents a numerical method to compute the response time distribution and mean response time of Nudge-$M$ scheduling provided that the job size distribution of both job types follows a phase-type distribution (by making use of the framework of Markov modulated fluid queues with jumps).

Submitted 12 April, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

arXiv:2401.07713 [pdf, other]

doi 10.1145/3639040

Approximations to Study the Impact of the Service Discipline in Systems with Redundancy

Authors: Nicolas Gast, Benny van Houdt

Abstract: As job redundancy has been recognized as an effective means to improve performance of large-scale computer systems, queueing systems with redundancy have been studied by various authors. Existing results include methods to compute the queue length distribution and response time but only when the service discipline is First-Come-First-Served (FCFS). For other service disciplines, such as Processor Sharing (PS), or Last-Come-First-Served (LCFS), only the stability conditions are known. In this paper we develop the first methods to approximate the queue length distribution in a queueing system with redundancy under various service disciplines. We focus on a system with exponential job sizes, i.i.d. copies, and a large number of servers. We first derive a mean field approximation that is independent of the scheduling policy. In order to study the impact of service discipline, we then derive refinements of this approximation to specific scheduling policies. In the case of Processor Sharing, we provide a pair and a triplet approximation. The pair approximation can be regarded as a refinement of the classic mean field approximation and takes the service discipline into account, while the triplet approximation further refines the pair approximation. We also develop a pair approximation for three other service disciplines: First-Come-First-Served, Limited Processor Sharing and Last-Come-First-Served.
We present numerical evidence that shows that all the approximations presented in the paper are highly accurate, but that none of them are asymptotically exact (as the number of servers goes to infinity). This makes these approximations suitable to study the impact of the service discipline on the queue length distribution. Our results show that FCFS yields the shortest queue length, and that the differences are more substantial at higher loads.

Submitted 15 January, 2024; originally announced January 2024.

Journal ref: Proceedings of the ACM on Measurement and Analysis of Computing Systems, 2024, 8 (1)

arXiv:2206.10428 [pdf, other]

On the stochastic and asymptotic improvement of First-Come First-Served and Nudge scheduling

Authors: Benny Van Houdt

Abstract: Recently it was shown that, contrary to expectations, the First-Come-First-Served (FCFS) scheduling algorithm can be stochastically improved upon by a scheduling algorithm called {\it Nudge} for light-tailed job size distributions. Nudge partitions jobs into 4 types based on their size, say small, medium, large and huge jobs. Nudge operates identical to FCFS, except that whenever a {\it small} job arrives that finds a {\it large} job waiting at the back of the queue, Nudge swaps the small job with the large one unless the large job was already involved in an earlier swap. In this paper, we show that FCFS can be stochastically improved upon under far weaker conditions. We consider a system with $2$ job types and limited swapping between type-$1$ and type-$2$ jobs, but where a type-$1$ job is not necessarily smaller than a type-$2$ job. More specifically, we introduce and study the Nudge-$K$ scheduling algorithm which allows type-$1$ jobs to be swapped with up to $K$ type-$2$ jobs waiting at the back of the queue, while type-$2$ jobs can be involved in at most one swap. We present an explicit expression for the response time distribution under Nudge-$K$ when both job types follow a phase-type distribution. Regarding the asymptotic tail improvement ratio (ATIR), we derive a simple expression for the ATIR, as well as for the $K$ that maximizes the ATIR. We show that the ATIR is positive and the optimal $K$ tends to infinity in heavy traffic as long as the type-$2$ jobs are on average longer than the type-$1$ jobs.
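The Nudge-$K$ swapping rule in the abstract above can be illustrated with a minimal Python sketch. It is one plausible reading of the abstract only, not the paper's formal definition: it assumes an arriving type-1 job moves ahead of the consecutive type-2 jobs at the back of the waiting queue, stopping after $K$ swaps or at the first job that is type-1 or has already been swapped. The names `Job`, `swapped` and `nudge_k_arrival` are hypothetical, and the job in service is assumed not to be part of `waiting`.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    job_type: int          # 1 or 2, as in the abstract
    swapped: bool = False  # hypothetical flag: type-2 job already passed once


def nudge_k_arrival(waiting, job, k):
    """Place an arriving job in the list of waiting jobs (front at index 0).

    Assumed reading: a type-1 arrival passes up to k not-yet-swapped
    type-2 jobs at the back of the queue; everything else is FCFS.
    """
    if job.job_type == 2:
        waiting.append(job)  # type-2 jobs simply join the back
        return
    pos = len(waiting)
    swaps = 0
    while (pos > 0 and swaps < k
           and waiting[pos - 1].job_type == 2
           and not waiting[pos - 1].swapped):
        waiting[pos - 1].swapped = True  # each type-2 job is passed at most once
        pos -= 1
        swaps += 1
    waiting.insert(pos, job)


# Usage: three type-2 arrivals, then two type-1 arrivals, with K = 2.
waiting = []
for i, t in enumerate([2, 2, 2, 1, 1]):
    nudge_k_arrival(waiting, Job(i, t), k=2)
order = [j.job_id for j in waiting]
```

In this run the first type-1 job (id 3) passes jobs 1 and 2, while the second type-1 job (id 4) finds an already swapped type-2 job at the back and joins the queue in FCFS order.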

Submitted 21 June, 2022; originally announced June 2022.

arXiv:2201.03905 [pdf, ps, other]

Performance of Load Balancers with Bounded Maximum Queue Length in case of Non-Exponential Job Sizes

Authors: Tim Hellemans, Grzegorz Kielanski, Benny Van Houdt

Abstract: In large-scale distributed systems, balancing the load in an efficient way is crucial in order to achieve low latency. Recently, some load balancing policies have been suggested which are able to achieve a bounded maximum queue length in the large-scale limit. However, these policies have thus far only been studied in case of exponential job sizes. As job sizes are more variable in real systems, we investigate how the performance of these policies (and in particular the value of these bounds) is impacted by the job size distribution. We present a unified analysis which can be used to compute the bound on the queue length in case of phase-type distributed job sizes for four load balancing policies. We find that in most cases, the bound on the maximum queue length can be expressed in closed form. In addition, we obtain job size (in)dependent bounds on the expected response time. Our methodology relies on the use of the cavity process. That is, we conjecture that the cavity process captures the behaviour of the real system as the system size grows large. For each policy, we illustrate the accuracy of the cavity process by means of simulation.

Submitted 11 January, 2022; originally announced January 2022.

arXiv:2011.08250 [pdf, other]

Improved Load Balancing in Large Scale Systems using Attained Service Time Reporting

Authors: Tim Hellemans, Benny Van Houdt

Abstract: Our interest lies in load balancing jobs in large scale systems consisting of multiple dispatchers and FCFS servers. In the absence of any information on job sizes, dispatchers typically use queue length information reported by the servers to assign incoming jobs. When job sizes are highly variable, using only queue length information is clearly suboptimal and performance can be improved if some indication can be provided to the dispatcher about the size of an ongoing job. In a FCFS server measuring the attained service time of the ongoing job is easy and servers can therefore report this attained service time together with the queue length when queried by a dispatcher. In this paper we propose and analyse a variety of load balancing policies that exploit both the queue length and attained service time to assign jobs, as well as policies for which only the attained service time of the job in service is used. We present a unified analysis for all these policies in a large scale system under the usual asymptotic independence assumptions. The accuracy of the proposed analysis is illustrated using simulation. We present extensive numerical experiments which clearly indicate that a significant improvement in waiting (and thus also in response) time may be achieved by using the attained service time information on top of the queue length of a server. Moreover, the policies which do not make use of the queue length still provide an improved waiting time for moderately loaded systems.

Submitted 15 April, 2021; v1 submitted 16 November, 2020; originally announced November 2020.

arXiv:2004.00876 [pdf, other]

Mean Waiting Time in Large-Scale and Critically Loaded Power of d Load Balancing Systems

Authors: Tim Hellemans, Benny Van Houdt

Abstract: Mean field models are a popular tool used to analyse load balancing policies. In some cases the waiting time distribution of the mean field limit has an explicit form. In other cases it can be computed as the solution of a set of differential equations. Here we study the limit of the mean waiting time $E[W_λ]$ as the arrival rate $λ$ approaches $1$ for a number of load balancing policies when job sizes are exponential with mean $1$ (i.e. the system gets close to instability). As $E[W_λ]$ diverges to infinity, we scale with $-\log(1-λ)$ and present a method to compute the limit $\lim_{λ\rightarrow 1^-}-E[W_λ]/\log(1-λ)$. This limit has a surprisingly simple form for the load balancing algorithms considered. We present a general result that holds for any policy for which the associated differential equation satisfies a list of assumptions. For the LL(d) policy which assigns an incoming job to a server with the least work left among d randomly selected servers these assumptions are trivially verified. For this policy we prove the limit is given by $\frac{1}{d-1}$. We further show that the LL(d,K) policy, which assigns batches of $K$ jobs to the $K$ least loaded servers among d randomly selected servers, satisfies the assumptions and the limit is equal to $\frac{K}{d-K}$. For a policy which applies LL($d_i$) with probability $p_i$, we show that the limit is given by $\frac{1}{\sum_ip_id_i-1}$. We further indicate that our main result can also be used for load balancers with redundancy or memory.
In addition, we propose an alternate scaling $-\log(p_λ)$ instead of $-\log(1-λ)$, for which the limit $\lim_{λ\rightarrow 0^+}-E[W_λ]/\log(p_λ)$ is well defined and non-zero (contrary to $\lim_{λ\rightarrow 0^+}-E[W_λ]/\log(1-λ)$), while $\lim_{λ\rightarrow 1^-}\log(1-λ) / \log(p_λ)=1$.
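The three closed-form limits quoted in the abstract above are easy to tabulate. The following Python sketch only evaluates the stated expressions $\frac{1}{d-1}$, $\frac{K}{d-K}$ and $\frac{1}{\sum_i p_i d_i - 1}$; the function names and the argument checks are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction


def ll_d_limit(d):
    """Scaled heavy-traffic limit 1/(d-1) stated for LL(d)."""
    if d < 2:
        raise ValueError("assumed here: d >= 2")
    return Fraction(1, d - 1)


def ll_d_k_limit(d, k):
    """Scaled heavy-traffic limit K/(d-K) stated for LL(d,K)."""
    if not 1 <= k < d:
        raise ValueError("assumed here: 1 <= K < d")
    return Fraction(k, d - k)


def mixed_ll_limit(probs, ds):
    """Limit 1/(sum_i p_i d_i - 1) when LL(d_i) is applied with probability p_i."""
    mean_d = sum(Fraction(p) * d for p, d in zip(probs, ds))
    if mean_d <= 1:
        raise ValueError("assumed here: sum_i p_i d_i > 1")
    return 1 / (mean_d - 1)


# Usage: LL(2), LL(4,1), and an equal mix of LL(2) and LL(4).
lim_ll2 = ll_d_limit(2)
lim_ll41 = ll_d_k_limit(4, 1)
lim_mix = mixed_ll_limit([Fraction(1, 2), Fraction(1, 2)], [2, 4])
```

As a consistency check, setting $K = 1$ in $\frac{K}{d-K}$ gives back the LL(d) expression $\frac{1}{d-1}$, and a mix that always uses the same $d$ also reduces to it.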

Submitted 28 January, 2021; v1 submitted 2 April, 2020; originally announced April 2020.

arXiv:2002.06906 [pdf, other]

Performance Analysis of Load Balancing Policies with Memory

Authors: Tim Hellemans, Benny Van Houdt

Abstract: Joining the shortest or least loaded queue among $d$ randomly selected queues are two fundamental load balancing policies. Under both policies the dispatcher does not maintain any information on the queue length or load of the servers. In this paper we analyze the performance of these policies when the dispatcher has some memory available to store the ids of some of the idle servers. We consider methods where the dispatcher discovers idle servers as well as methods where idle servers inform the dispatcher about their state. We focus on large-scale systems and our analysis uses the cavity method. The main insight provided is that the performance measures obtained via the cavity method for a load balancing policy {\it with} memory reduce to the performance measures for the same policy {\it without} memory provided that the arrival rate is properly scaled. Thus, we can study the performance of load balancers with memory in the same manner as load balancers without memory. In particular this entails closed form solutions for joining the shortest or least loaded queue among $d$ randomly selected queues with memory in case of exponential job sizes. Moreover, we obtain a simple closed form expression for the (scaled) expected waiting time as the system tends towards instability. We present simulation results that support our belief that the approximation obtained by the cavity method becomes exact as the number of servers tends to infinity.

Submitted 22 January, 2021; v1 submitted 17 February, 2020; originally announced February 2020.

Comments: 30 pages, 3 figures

arXiv:1811.05239 [pdf, ps, other]

Global attraction of ODE-based mean field models with hyperexponential job sizes

Authors: Benny Van Houdt

Abstract: Mean field modeling is a popular approach to assess the performance of large scale computer systems. The evolution of many mean field models is characterized by a set of ordinary differential equations that have a unique fixed point. In order to prove that this unique fixed point corresponds to the limit of the stationary measures of the finite systems, the unique fixed point must be a global attractor. While global attraction was established for various systems in case of exponential job sizes, it is often unclear whether these proof techniques can be generalized to non-exponential job sizes. In this paper we show how simple monotonicity arguments can be used to prove global attraction for a broad class of ordinary differential equations that capture the evolution of mean field models with hyperexponential job sizes. This class includes both existing as well as previously unstudied load balancing schemes and can be used for systems with either finite or infinite buffers. The main novelty of the approach exists in using a Coxian representation for the hyperexponential job sizes and a partial order that is stronger than the componentwise partial order used in the exponential case.

Submitted 17 April, 2019; v1 submitted 13 November, 2018; originally announced November 2018.

Comments: This paper was accepted at ACM Sigmetrics 2019

arXiv:1810.13186 [pdf, other]

Randomized Work Stealing versus Sharing in Large-scale Systems with Non-exponential Job Sizes

Authors: Benny Van Houdt

Abstract: Work sharing and work stealing are two scheduling paradigms to redistribute work when performing distributed computations. In work sharing, processors attempt to migrate pending jobs to other processors in the hope of reducing response times. In work stealing, on the other hand, underutilized processors attempt to steal jobs from other processors. Both paradigms generate a certain communication overhead and the question addressed in this paper is which of the two reduces the response time the most given that they use the same amount of communication overhead. Prior work presented explicit bounds, for large scale systems, on when randomized work sharing outperforms randomized work stealing in case of Poisson arrivals and exponential job durations and indicated that work sharing is best when the load is below $φ-1 \approx 0.6180$, with $φ$ being the golden ratio. In this paper we revisit this problem and study the impact of the job size distribution using a mean field model. We present an efficient method to determine the boundary between the regions where sharing or stealing is best for a given job size distribution, as well as bounds that apply to any (phase-type) job size distribution. The main insight is that work stealing benefits significantly from having more variable job sizes and work sharing may become inferior to work stealing for loads as small as $1/2 + ε$ for any $ε> 0$.

Submitted 30 August, 2019; v1 submitted 31 October, 2018; originally announced October 2018.

Comments: This paper was accepted in IEEE/ACM Transactions on Networking

arXiv:1802.05420 [pdf, other]

On the Power-of-d-choices with Least Loaded Server Selection

Authors: Tim Hellemans, Benny Van Houdt

Abstract: Motivated by distributed schedulers that combine the power-of-d-choices with late binding and systems that use replication with cancellation-on-start, we study the performance of the LL(d) policy which assigns a job to a server that currently has the least workload among d randomly selected servers in large-scale homogeneous clusters. We consider general service time distributions and propose a partial integro-differential equation to describe the evolution of the system. This equation relies on the earlier proven ansatz for LL(d) which asserts that the workload distribution of any finite set of queues becomes independent of one another as the number of servers tends to infinity. Based on this equation we propose a fixed point iteration for the limiting workload distribution and study its convergence. For exponential job sizes we present a simple closed form expression for the limiting workload distribution that is valid for any work-conserving service discipline as well as for the limiting response time distribution in case of first-come-first-served scheduling. We further show that for phase-type distributed job sizes the limiting workload and response time distribution can be expressed via the unique solution of a simple set of ordinary differential equations. Numerical and analytical results that compare response time of the classic power-of-d-choices algorithm and the LL(d) policy are also presented and the accuracy of the limiting response time distribution for finite systems is illustrated using simulation.

Submitted 15 February, 2018; originally announced February 2018.

arXiv:1703.10500 [pdf, other]

Free Energy Approximations for CSMA networks

Authors: Benny Van Houdt

Abstract: In this paper we study how to estimate the back-off rates in an idealized CSMA network consisting of $n$ links to achieve a given throughput vector using free energy approximations. More specifically, we introduce the class of region-based free energy approximations with clique belief and present a closed form expression for the back-off rates based on the zero gradient points of the free energy approximation (in terms of the conflict graph, target throughput vector and counting numbers). Next we introduce the size $k_{max}$ clique free energy approximation as a special case and derive an explicit expression for the counting numbers, as well as a recursion to compute the back-off rates. We subsequently show that the size $k_{max}$ clique approximation coincides with a Kikuchi free energy approximation and prove that it is exact on chordal conflict graphs when $k_{max} = n$. As a by-product these results provide us with an explicit expression of a fixed point of the inverse generalized belief propagation algorithm for CSMA networks. Using numerical experiments we compare the accuracy of the novel approximation method with existing methods.

Submitted 10 April, 2017; v1 submitted 30 March, 2017; originally announced March 2017.

arXiv:1602.08290 [pdf, other]

doi 10.1109/TNET.2016.2604462

Explicit back-off rates for achieving target throughputs in CSMA/CA networks

Authors: Benny Van Houdt

Abstract: CSMA/CA networks have often been analyzed using a stylized model that is fully characterized by a vector of back-off rates and a conflict graph. Further, for any achievable throughput vector $\vec θ$ the existence of a unique vector $\vec ν(\vec θ)$ of back-off rates that achieves this throughput vector was proven. Although this unique vector can in principle be computed iteratively, the required time complexity grows exponentially in the network size, making this only feasible for small networks. In this paper, we present an explicit formula for the unique vector of back-off rates $\vec ν(\vec θ)$ needed to achieve any achievable throughput vector $\vec θ$ provided that the network has a chordal conflict graph. This class of networks contains a number of special cases of interest such as (inhomogeneous) line networks and networks with an acyclic conflict graph. Moreover, these back-off rates are such that the back-off rate of a node only depends on its own target throughput and the target throughput of its neighbors and can be determined in a distributed manner. We further indicate that back-off rates of this form cannot be obtained in general for networks with non-chordal conflict graphs. For general conflict graphs we nevertheless show how to adapt the back-off rates when a node is added to the network when its interfering nodes form a clique in the conflict graph. Finally, we introduce a distributed chordal approximation algorithm for general conflict graphs which is shown (using numerical examples) to be more accurate than the Bethe approximation.

Submitted 29 August, 2016; v1 submitted 26 February, 2016; originally announced February 2016.

Showing 1–12 of 12 results for author: van Houdt, B