Search | arXiv e-print repository

Server saturation in skewed networks

Authors: Diego Goldsztajn, Sem C. Borst, Johan S. H. van Leeuwaarden

Abstract: We consider a model inspired by compatibility constraints that arise between tasks and servers in data centers, cloud computing systems and content delivery networks. The constraints are represented by a bipartite graph or network that interconnects dispatchers with compatible servers. Each dispatcher receives tasks over time and sends every task to a compatible server with the least number of tasks, or to a server with the least number of tasks among $d$ compatible servers selected uniformly at random. We focus on networks where the neighborhood of at least one server is skewed in a limiting regime. This means that a diverging number of dispatchers are in the neighborhood which are each compatible with a uniformly bounded number of servers; thus, the degree of the central server approaches infinity while the degrees of many neighboring dispatchers remain bounded. We prove that each server with a skewed neighborhood saturates, in the sense that the mean number of tasks queueing in front of it in steady state approaches infinity. Paradoxically, this pathological behavior can even arise in random networks where nearly all the servers have at most one task in the limit.
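The two dispatching rules described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `dispatch`, the dictionary representation of queue lengths, sampling without replacement, and breaking ties by candidate order are all assumptions made for the example.

```python
import random


def dispatch(compatible, queue_len, d=None, rng=random):
    """Choose a server for one incoming task (illustrative sketch).

    compatible: list of server ids compatible with this dispatcher
    queue_len:  dict mapping server id -> current number of tasks
    d:          None -> least-loaded among all compatible servers;
                int  -> least-loaded among d compatible servers sampled
                        uniformly at random (without replacement: assumption)
    """
    if d is None:
        candidates = list(compatible)
    else:
        candidates = rng.sample(list(compatible), min(d, len(compatible)))
    # Ties are broken by candidate order (assumption, not from the abstract).
    return min(candidates, key=lambda server: queue_len[server])


queue_len = {0: 3, 1: 1, 2: 2}
chosen = dispatch([0, 1, 2], queue_len)   # least-loaded compatible server
queue_len[chosen] += 1                    # the task joins that queue
```

With `d=None` the rule inspects every compatible server; with a small `d` it inspects only a random subset, which is the sampled variant mentioned in the abstract.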

Submitted 9 April, 2024; originally announced April 2024.

Comments: 48 pages, 5 figures, accepted at SIGMETRICS 2024

MSC Class: 60K25 (Primary) 68M20; 60J28 (Secondary)

arXiv:2403.13567 [pdf, other]

Certified Constraint Propagation and Dual Proof Analysis in a Numerically Exact MIP Solver

Authors: Sander Borst, Leon Eifler, Ambros Gleixner

Abstract: This paper presents the integration of constraint propagation and dual proof analysis in an exact, roundoff-error-free MIP solver. The authors employ safe rounding methods to ensure that all results remain provably correct, while sacrificing as little computational performance as possible in comparison to a pure floating-point implementation. The study also addresses the adaptation of certification techniques for correctness verification. Computational studies demonstrate the effectiveness of these techniques, showcasing a 23% performance improvement on the MIPLIB 2017 benchmark test set.

Submitted 20 March, 2024; originally announced March 2024.

arXiv:2402.13227 [pdf, other]

Online Matching on $3$-Uniform Hypergraphs

Authors: Sander Borst, Danish Kashaev, Zhuan Khye Koh

Abstract: The online matching problem was introduced by Karp, Vazirani and Vazirani (STOC 1990) on bipartite graphs with vertex arrivals. It is well-known that the optimal competitive ratio is $1-1/e$ for both integral and fractional versions of the problem. Since then, there has been considerable effort to find optimal competitive ratios for other related settings. In this work, we go beyond the graph case and study the online matching problem on $k$-uniform hypergraphs. For $k=3$, we provide an optimal primal-dual fractional algorithm, which achieves a competitive ratio of $(e-1)/(e+1)\approx 0.4621$. As our main technical contribution, we present a carefully constructed adversarial instance, which shows that this ratio is in fact optimal. It combines ideas from known hard instances for bipartite graphs under the edge-arrival and vertex-arrival models. For $k\geq 3$, we give a simple integral algorithm which performs better than greedy when the online nodes have bounded degree. As a corollary, it achieves the optimal competitive ratio of 1/2 on 3-uniform hypergraphs when every online node has degree at most 2. This is because the special case where every online node has degree 1 is equivalent to the edge-arrival model on graphs, for which an upper bound of 1/2 is known.
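As a quick numerical check of the constants quoted in this abstract, the short sketch below evaluates the three competitive ratios it mentions. The variable names are chosen here for illustration only.

```python
import math

# Fractional ratio for 3-uniform hypergraphs quoted in the abstract.
ratio_3_uniform = (math.e - 1) / (math.e + 1)

# Classical bipartite vertex-arrival ratio (Karp, Vazirani and Vazirani).
ratio_bipartite = 1 - 1 / math.e

# Upper bound for the edge-arrival model on graphs, also quoted above.
ratio_edge_arrival = 0.5

print(f"(e-1)/(e+1) = {ratio_3_uniform:.4f}")
print(f"1 - 1/e     = {ratio_bipartite:.4f}")
```

The fractional ratio for $k=3$ falls below both $1/2$ and $1-1/e$, consistent with the ordering implied by the abstract.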

Submitted 20 February, 2024; originally announced February 2024.

arXiv:2402.00696 [pdf, other]

Multi-dimensional state space collapse in non-complete resource pooling scenarios

Authors: Ellen Cardinaels, Sem Borst, Johan S. H. van Leeuwaarden

Abstract: The present paper establishes an explicit multi-dimensional state space collapse (SSC) for parallel-processing systems with arbitrary compatibility constraints between servers and job types. This breaks major new ground beyond the SSC results and queue length asymptotics in the literature which are largely restricted to complete resource pooling (CRP) scenarios where the steady-state queue length vector concentrates around a line in heavy traffic. The multi-dimensional SSC that we establish reveals heavy-traffic behavior which is also far more tractable than the pre-limit queue length distribution, yet exhibits a fundamentally more intricate structure than in the one-dimensional case, providing useful insight into the system dynamics. In particular, we prove that the limiting queue length vector lives in a $K$-dimensional cone of which the set of spanning vectors is random in general, capturing the delicate interplay between the various job types and servers. For a broad class of systems we provide a further simplification which shows that the collection of random cones constitutes a fixed $K$-dimensional cone, resulting in a $K$-dimensional SSC. The dimension $K$ represents the number of critically loaded subsystems, or equivalently, capacity bottlenecks in heavy-traffic, with $K=1$ corresponding to conventional CRP scenarios. Our approach leverages probability generating function (PGF) expressions for Markovian systems operating under redundancy policies.

Submitted 29 April, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

arXiv:2311.07435 [pdf, other]

Trajectories and Platoon-forming Algorithm for Intersections with Heterogeneous Autonomous Traffic

Authors: P. C. Joshi, M. A. A. Boon, S. C. Borst

Abstract: The anticipated launch of fully autonomous vehicles presents an opportunity to develop and implement novel traffic management systems. Intersections are one of the bottlenecks for urban traffic, and thus offer tremendous potential for performance improvements of traffic flow if managed efficiently. Platoon-forming algorithms, in which vehicles are grouped together with short inter-vehicular distances just before arriving at an intersection at high speed, seem particularly promising in this respect. In this work, we present an intersection access control system based on platoon-forming for heterogeneous autonomous traffic. The heterogeneity of traffic arises from vehicles with different acceleration capabilities and safety constraints. We focus on obtaining computationally fast and interpretable closed-form expressions for safe and efficient vehicle trajectories that lead to platoon formation, and show that these trajectories are solutions to certain classes of optimisation problems. Additionally, we conduct a numerical study to obtain approximations for intersection capacity as a result of such platoon formation.

Submitted 24 January, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: 42 pages, 16 figures. 3D Animations included as ancillary files

arXiv:2305.13054 [pdf, other]

Fluid limits for interacting queues in sparse dynamic graphs

Authors: Diego Goldsztajn, Sem C. Borst, Johan S. H. van Leeuwaarden

Abstract: Consider a network of $n$ single-server queues where tasks arrive independently at each of the servers at rate $λ_n$. The servers are interconnected by a graph that is resampled at rate $μ_n$ in a way that is symmetric with respect to the servers, and each task is dispatched to the shortest queue in the graph neighborhood where it appears. The so-called occupancy process describes the empirical distribution of the number of tasks across the servers. This stochastic process evolves on the underlying dynamic graph, and its dynamics depend on the number of tasks at each individual server and the neighborhood structure of the graph. We prove that this dependency disappears in the limit as $n \to \infty$ when $λ_n / n \to λ$ and $μ_n \to \infty$, and establish that the limit of the occupancy process is given by a system of differential equations that depends solely on $λ$ and the limiting degree distribution of the graph. We further show that the stationary distribution of the occupancy process converges to an equilibrium point of the differential equations, and derive properties of this equilibrium that reflect the impact of the degree distribution. Our focus is on truly sparse graphs where the maximum degree is uniformly bounded across $n$, making neighboring servers strongly correlated.

Submitted 23 May, 2024; v1 submitted 22 May, 2023; originally announced May 2023.

Comments: 60 pages, 4 figures

MSC Class: 60F17; 60K25; 60K35 (Primary) 68M20 (Secondary)

arXiv:2304.09279 [pdf, ps, other]

Heavy Loads and Heavy Tails

Authors: Sem Borst

Abstract: The present paper is concerned with the stationary workload of queues with heavy-tailed (regularly varying) characteristics. We adopt a transform perspective to illuminate a close connection between the tail asymptotics and heavy-traffic limit in infinite-variance scenarios. This serves as a tribute to some of the pioneering results of J.W. Cohen in this domain. We specifically demonstrate that reduced-load equivalence properties established for the tail asymptotics of the workload naturally extend to the heavy-traffic limit.

Submitted 18 April, 2023; originally announced April 2023.

MSC Class: 60K25 (primary); 68M20; 90B22 (secondary)

arXiv:2302.03669 [pdf, other]

Deep Reinforcement Learning for Traffic Light Control in Intelligent Transportation Systems

Authors: Xiao-Yang Liu, Ming Zhu, Sem Borst, Anwar Walid

Abstract: Smart traffic lights in intelligent transportation systems (ITSs) are envisioned to greatly increase traffic efficiency and reduce congestion. Deep reinforcement learning (DRL) is a promising approach to adaptively control traffic lights based on the real-time traffic situation in a road network. However, conventional methods may suffer from poor scalability. In this paper, we investigate deep reinforcement learning to control traffic lights, and both theoretical analysis and numerical experiments show that the intelligent behavior "greenwave" (i.e., a vehicle will see a progressive cascade of green lights, and not have to brake at any intersection) emerges naturally in a grid road network, which is proved to be the optimal policy in an avenue with multiple cross streets. As a first step, we use two DRL algorithms for the traffic light control problems in two scenarios. In a single road intersection, we verify that the deep Q-network (DQN) algorithm delivers a thresholding policy; and in a grid road network, we adopt the deep deterministic policy gradient (DDPG) algorithm. Secondly, numerical experiments show that the DQN algorithm delivers the optimal control, and the DDPG algorithm with passive observations has the capability to produce on its own a high-level intelligent behavior in a grid road network, namely, the "greenwave" policy emerges. We also verify the "greenwave" patterns in a $5 \times 10$ grid road network. Thirdly, the "greenwave" patterns demonstrate that DRL algorithms produce favorable solutions since the "greenwave" policy shown in experiment results is proved to be optimal in a specified traffic model (an avenue with multiple cross streets). The delivered policies both in a single road intersection and a grid road network demonstrate the scalability of DRL algorithms.

Submitted 5 March, 2023; v1 submitted 3 February, 2023; originally announced February 2023.

Comments: 17 pages

Journal ref: IEEE Transactions on Network Science and Engineering, 2023

arXiv:2210.05982 [pdf, other]

A nearly optimal randomized algorithm for explorable heap selection

Authors: Sander Borst, Daniel Dadush, Sophie Huiberts, Danish Kashaev

Abstract: Explorable heap selection is the problem of selecting the $n$th smallest value in a binary heap. The key values can only be accessed by traversing through the underlying infinite binary tree, and the complexity of the algorithm is measured by the total distance traveled in the tree (each edge has unit cost). This problem was originally proposed as a model to study search strategies for the branch-and-bound algorithm with storage restrictions by Karp, Saks and Wigderson (FOCS '86), who gave deterministic and randomized $n\cdot \exp(O(\sqrt{\log{n}}))$ time algorithms using $O(\log(n)^{2.5})$ and $O(\sqrt{\log n})$ space respectively. We present a new randomized algorithm with running time $O(n\log(n)^3)$ using $O(\log n)$ space, substantially improving the previous best randomized running time at the expense of slightly increased space usage. We also show an $Ω(\log(n)n/\log(\log(n)))$ lower bound for any algorithm that solves the problem in the same amount of space, indicating that our algorithm is nearly optimal.

Submitted 12 October, 2022; originally announced October 2022.

arXiv:2206.07006 [pdf, other]

doi 10.1016/j.peva.2023.102355

Stability of a Stochastic Ring Network

Authors: Jaap Storm, Wouter Kager, Michel Mandjes, Sem Borst

Abstract: In this paper we establish a necessary and sufficient stability condition for a stochastic ring network. Such networks naturally appear in a variety of applications within communication, computer, and road traffic systems. They typically involve multiple customer types and some form of priority structure to decide which customer receives service. These two system features tend to complicate the issue of identifying a stability condition, but we demonstrate how the ring topology can be leveraged to solve the problem.

Submitted 9 August, 2023; v1 submitted 14 June, 2022; originally announced June 2022.

Comments: 25 pages, 2 figures; v2: revamped Section 3.1, rewritten Section 6.1, expanded Section 7, added two figures and two references, streamlined the proof of Theorem 5.6, additional minor edits throughout

Journal ref: Performance Evaluation 162 (2023) 102355

arXiv:2203.11863 [pdf, other]

doi 10.1137/1.9781611977554

Integrality Gaps for Random Integer Programs via Discrepancy

Authors: Sander Borst, Daniel Dadush, Dan Mikulincer

Abstract: We prove new bounds on the additive gap between the value of a random integer program $\max c^Tx,\ Ax\leq b,\ x\in\{0,1\}^n$ with $m$ constraints and that of its linear programming relaxation for a wide range of distributions on $(A,b,c)$. We are motivated by the work of Dey, Dubey, and Molinaro (SODA '21), who gave a framework for relating the size of Branch-and-Bound (B&B) trees to additive integrality gaps. Dyer and Frieze (MOR '89) and Borst et al. (Mathematical Programming '22), respectively, showed that for certain random packing and Gaussian IPs, where the entries of $A,c$ are independently distributed according to either the uniform distribution on $[0,1]$ or the Gaussian distribution $\mathcal{N}(0,1)$, the integrality gap is bounded by $O_m(\log^2 n / n)$ with probability at least $1-1/n-e^{-Ω_m(1)}$. In this paper, we generalize these results to the case where the entries of $A$ are uniformly distributed on an integer interval (e.g., entries in $\{-1,0,1\}$), and where the columns of $A$ are distributed according to an isotropic logconcave distribution. Second, we substantially improve the success probability to $1-1/poly(n)$, compared to constant probability in prior works (depending on $m$). Leveraging the connection to Branch-and-Bound, our gap results imply that for these IPs B&B trees have size $n^{poly(m)}$ with high probability (i.e., polynomial for fixed $m$), which significantly extends the class of IPs for which B&B is known to be polynomial. Our main technical contribution is a new linear discrepancy theorem for random matrices. Our theorem gives general conditions under which a target vector is equal to or very close to a $\{0,1\}$ combination of the columns of a random matrix $A$. The proof uses a Fourier analytic approach, building on work of Hoberg and Rothvoss (SODA '19) and Franks and Saks (RSA '20).

Submitted 11 April, 2023; v1 submitted 22 March, 2022; originally announced March 2022.

arXiv:2112.08958 [pdf, other]

doi 10.1287/stsy.2022.0103

Utility maximizing load balancing policies

Authors: Diego Goldsztajn, Sem C. Borst, Johan S. H. van Leeuwaarden

Abstract: Consider a service system where incoming tasks are instantaneously dispatched to one out of many heterogeneous server pools. Associated with each server pool is a concave utility function which depends on the class of the server pool and its current occupancy. We derive an upper bound for the mean normalized aggregate utility in stationarity and introduce two load balancing policies that achieve this upper bound in a large-scale regime. Furthermore, the transient and stationary behavior of these asymptotically optimal load balancing policies is characterized on the scale of the number of server pools, in the same large-scale regime.

Submitted 10 February, 2024; v1 submitted 16 December, 2021; originally announced December 2021.

Comments: 73 pages, 6 figures

MSC Class: 60K25 (Primary) 60F15; 60F17 (Secondary) ACM Class: G.3

Journal ref: Stochastic systems, 13(2):211-246, 2023

arXiv:2111.05777 [pdf, other]

Power-of-two sampling in redundancy systems: the impact of assignment constraints

Authors: Ellen Cardinaels, Sem Borst, Johan S. H. van Leeuwaarden

Abstract: A classical sampling strategy for load balancing policies is power-of-two, where any server pair is sampled with equal probability. This does not cover practical settings with assignment constraints which force non-uniform sampling. While intuition suggests that non-uniform sampling adversely impacts performance, this was only supported through simulations, and rigorous statements have remained elusive. Building on product-form distributions for redundancy systems, we prove the stochastic dominance of uniform sampling for a four-server system as well as arbitrary-size systems in light traffic.

Submitted 15 July, 2022; v1 submitted 10 November, 2021; originally announced November 2021.

arXiv:2107.11248 [pdf, ps, other]

A multidimensional solution to additive homological equations

Authors: Aleksei F. Ber, Matthijs J. Borst, Sander J. Borst, Fedor A. Sukochev

Abstract: In this paper we prove that for a finite-dimensional real normed space $V$, every bounded mean zero function $f\in L_\infty([0,1];V)$ can be written in the form $f = g\circ T - g$ for some $g\in L_\infty([0,1];V)$ and some ergodic invertible measure preserving transformation $T$ of $[0,1]$. Our method moreover allows us to choose $g$, for any given $\varepsilon>0$, to be such that $\|g\|_\infty\leq (S_V+\varepsilon)\|f\|_\infty$, where $S_V$ is the Steinitz constant corresponding to $V$.

Submitted 23 July, 2021; originally announced July 2021.

Comments: 51 pages

arXiv:2105.13738 [pdf, ps, other]

Fork-join and redundancy systems with heavy-tailed job sizes

Authors: Youri Raaijmakers, Sem Borst, Onno Boxma

Abstract: We investigate the tail asymptotics of the response time distribution for the cancel-on-start (c.o.s.) and cancel-on-completion (c.o.c.) variants of redundancy-$d$ scheduling and the fork-join model with heavy-tailed job sizes. We present bounds, which only differ in the pre-factor, for the tail probability of the response time in the case of the first-come first-served (FCFS) discipline. For the c.o.s. variant we restrict ourselves to redundancy-$d$ scheduling, which is a special case of the fork-join model. In particular, for regularly varying job sizes with tail index $-ν$ the tail index of the response time for the c.o.s. variant of redundancy-$d$ equals $-\min\{d_{\mathrm{cap}}(ν-1),ν\}$, where $d_{\mathrm{cap}} = \min\{d,N-k\}$, $N$ is the number of servers and $k$ is the integer part of the load. This result indicates that for $d_{\mathrm{cap}} < \frac{ν}{ν-1}$ the waiting time component is dominant, whereas for $d_{\mathrm{cap}} > \frac{ν}{ν-1}$ the job size component is dominant. Thus, having $d = \lceil \min\{\frac{ν}{ν-1},N-k\} \rceil$ replicas is sufficient to achieve the optimal asymptotic tail behavior of the response time. For the c.o.c. variant of the fork-join($n_{\mathrm{F}},n_{\mathrm{J}}$) model the tail index of the response time, under some assumptions on the load, equals $1-ν$ and $1-(n_{\mathrm{F}}+1-n_{\mathrm{J}})ν$, for identical and i.i.d. replicas, respectively; here the waiting time component is always dominant.
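The c.o.s. formulas stated in this abstract are easy to evaluate numerically. The sketch below simply transcribes them; the function names and the example parameters ($ν = 1.5$, $N = 10$, $k = 3$) are assumptions chosen for illustration, and $ν > 1$ is assumed so that $ν/(ν-1)$ is well defined.

```python
import math


def d_cap(d, n_servers, k):
    """d_cap = min{d, N - k}, with k the integer part of the load."""
    return min(d, n_servers - k)


def cos_tail_index(d, n_servers, k, nu):
    """Tail index -min{d_cap (nu - 1), nu} for c.o.s. redundancy-d (nu > 1)."""
    return -min(d_cap(d, n_servers, k) * (nu - 1), nu)


def sufficient_replicas(n_servers, k, nu):
    """d = ceil(min{nu / (nu - 1), N - k}), as stated in the abstract."""
    return math.ceil(min(nu / (nu - 1), n_servers - k))


# Hypothetical parameters: nu = 1.5, N = 10 servers, k = 3.
for d in (1, 2, 3, 4):
    print(d, cos_tail_index(d, 10, 3, 1.5))
print("sufficient d:", sufficient_replicas(10, 3, 1.5))
```

For these parameters the tail index stops improving once $d$ reaches $ν/(ν-1) = 3$, which matches the number of replicas the abstract identifies as sufficient.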

Submitted 28 May, 2021; originally announced May 2021.

arXiv:2012.13306 [pdf, ps, other]

Majorizing Measures for the Optimizer

Authors: Sander Borst, Daniel Dadush, Neil Olver, Makrand Sinha

Abstract: The theory of majorizing measures, extensively developed by Fernique, Talagrand and many others, provides one of the most general frameworks for controlling the behavior of stochastic processes. In particular, it can be applied to derive quantitative bounds on the expected suprema and the degree of continuity of sample paths for many processes. One of the crowning achievements of the theory is Talagrand's tight alternative characterization of the suprema of Gaussian processes in terms of majorizing measures. The proof of this theorem was difficult, and thus considerable effort was put into the task of developing both shorter and easier to understand proofs. A major reason for this difficulty was considered to be the theory of majorizing measures itself, which had the reputation of being opaque and mysterious. As a consequence, most recent treatments of the theory (including by Talagrand himself) have eschewed the use of majorizing measures in favor of a purely combinatorial approach (the generic chaining) where objects based on sequences of partitions provide roughly matching upper and lower bounds on the desired expected supremum. In this paper, we return to majorizing measures as a primary object of study, and give a viewpoint that we think is natural and clarifying from an optimization perspective. As our main contribution, we give an algorithmic proof of the majorizing measures theorem based on two parts: (1) We make the simple (but apparently new) observation that finding the best majorizing measure can be cast as a convex program. This also allows for efficiently computing the measure using off-the-shelf methods from convex optimization. (2) We obtain tree-based upper and lower bound certificates by rounding, in a series of steps, the primal and dual solutions to this convex program. [...]

Submitted 24 December, 2020; originally announced December 2020.

Comments: 37 pages. Extended Abstract to appear in ITCS 2021

MSC Class: 60G15; 68Q87 ACM Class: G.3

arXiv:2012.10142 [pdf, other]

Learning and balancing unknown loads in large-scale systems

Authors: Diego Goldsztajn, Sem C. Borst, Johan S. H. van Leeuwaarden

Abstract: Consider a system of identical server pools where tasks with exponentially distributed service times arrive as a time-inhomogeneous Poisson process. An admission threshold is used in an inner control loop to assign incoming tasks to server pools while, in an outer control loop, a learning scheme adjusts this threshold over time to keep it aligned with the unknown offered load of the system. In a many-server regime, we prove that the learning scheme reaches an equilibrium along intervals of time where the normalized offered load per server pool is suitably bounded, and that this results in a balanced distribution of the load. Furthermore, we establish a similar result when tasks with Coxian distributed service times arrive at a constant rate and the threshold is adjusted using only the total number of tasks in the system. The novel proof technique developed in this paper, which differs from a traditional fluid limit analysis, makes it possible to handle rapid variations of the first learning scheme, triggered by excursions of the occupancy process that have vanishing size. Moreover, our approach makes it possible to characterize the asymptotic behavior of the system with Coxian distributed service times without relying on a fluid limit of a detailed state descriptor.

Submitted 5 April, 2024; v1 submitted 18 December, 2020; originally announced December 2020.

Comments: 56 pages, 3 figures

MSC Class: 60K25 (Primary) 60F15; 60F17 (Secondary) ACM Class: G.3

arXiv:2012.08357 [pdf, other]

Optimal Hyper-Scalable Load Balancing with a Strict Queue Limit

Authors: Mark van der Boor, Sem Borst, Johan van Leeuwaarden

Abstract: Load balancing plays a critical role in efficiently dispatching jobs in parallel-server systems such as cloud networks and data centers. A fundamental challenge in the design of load balancing algorithms is to achieve an optimal trade-off between delay performance and implementation overhead (e.g. communication or memory usage). This trade-off has primarily been studied so far from the angle of the amount of overhead required to achieve asymptotically optimal performance, particularly vanishing delay in large-scale systems. In contrast, in the present paper, we focus on an arbitrarily sparse communication budget, possibly well below the minimum requirement for vanishing delay, referred to as the hyper-scalable operating region. Furthermore, jobs may only be admitted when a specific limit on the queue position of the job can be guaranteed. The centerpiece of our analysis is a universal upper bound for the achievable throughput of any dispatcher-driven algorithm for a given communication budget and queue limit. We also propose a specific hyper-scalable scheme which can operate at any given message rate and enforce any given queue limit, while allowing the server states to be captured via a closed product-form network, in which servers act as customers traversing various nodes. The product-form distribution is leveraged to prove that the bound is tight and that the proposed hyper-scalable scheme is throughput-optimal in a many-server regime given the communication and queue limit constraints. Extensive simulation experiments are conducted to illustrate the results.

Submitted 14 December, 2020; originally announced December 2020.

arXiv:2012.08346 [pdf, ps, other]

On the Integrality Gap of Binary Integer Programs with Gaussian Data

Authors: Sander Borst, Daniel Dadush, Sophie Huiberts, Samarth Tiwari

Abstract: For a binary integer program (IP) ${\rm max} ~ c^\mathsf{T} x, Ax \leq b, x \in \{0,1\}^n$, where $A \in \mathbb{R}^{m \times n}$ and $c \in \mathbb{R}^n$ have independent Gaussian entries and the right-hand side $b \in \mathbb{R}^m$ satisfies that its negative coordinates have $\ell_2$ norm at most $n/10$, we prove that the gap between the value of the linear programming relaxation and the IP is upper bounded by $\operatorname{poly}(m)(\log n)^2 / n$ with probability at least $1-2/n^7-2^{-\operatorname{poly}(m)}$. Our results give a Gaussian analogue of the classical integrality gap result of Dyer and Frieze (Math. of O.R., 1989) in the case of random packing IPs. In contrast to the packing case, our integrality gap depends only polynomially on $m$ instead of exponentially. Building upon recent breakthrough work of Dey, Dubey and Molinaro (SODA, 2021), we show that the integrality gap implies that branch-and-bound requires $n^{\operatorname{poly}(m)}$ time on random Gaussian IPs with good probability, which is polynomial when the number of constraints $m$ is fixed. We derive this result via a novel meta-theorem, which relates the size of branch-and-bound trees and the integrality gap for random logconcave IPs.

Submitted 2 June, 2021; v1 submitted 15 December, 2020; originally announced December 2020.

arXiv:2010.15525 [pdf, other]

doi 10.1287/ijoc.2021.1100

Self-Learning Threshold-Based Load Balancing

Authors: Diego Goldsztajn, Sem C. Borst, Johan S. H. van Leeuwaarden, Debankur Mukherjee, Philip A. Whiting

Abstract: We consider a large-scale service system where incoming tasks have to be instantaneously dispatched to one out of many parallel server pools. The user-perceived performance degrades with the number of concurrent tasks and the dispatcher aims at maximizing the overall quality-of-service by balancing the load through a simple threshold policy. We demonstrate that such a policy is optimal on the fluid and diffusion scales, while only involving a small communication overhead, which is crucial for large-scale deployments. In order to set the threshold optimally, it is important, however, to learn the load of the system, which may be unknown. For that purpose, we design a control rule for tuning the threshold in an online manner. We derive conditions which guarantee that this adaptive threshold settles at the optimal value, along with estimates for the time until this happens. In addition, we provide numerical experiments which support the theoretical results and further indicate that our policy copes effectively with time-varying demand patterns.

Submitted 11 September, 2023; v1 submitted 29 October, 2020; originally announced October 2020.

Comments: 52 pages, 6 figures

MSC Class: 60F17; 60K25 (Primary) 68M20 (Secondary)

ACM Class: C.4; G.3

Journal ref: INFORMS Journal on Computing, 34(1):39-54, 2022

arXiv:2008.03478 [pdf, ps, other]

Achievable Stability in Redundancy Systems

Authors: Youri Raaijmakers, Sem Borst

Abstract: We consider a system with $N$ parallel servers where incoming jobs are immediately replicated to, say, $d$ servers. Each of the $N$ servers has its own queue and follows a FCFS discipline. As soon as the first job replica is completed, the remaining replicas are abandoned. We investigate the achievable stability region for a quite general workload model with different job types and heterogeneous servers, reflecting job-server affinity relations which may arise from data locality issues and soft compatibility constraints. Under the assumption that job types are known beforehand we show for New-Better-than-Used (NBU) distributed speed variations that no replication $(d=1)$ gives a strictly larger stability region than replication $(d>1)$. Strikingly, this does not depend on the underlying distribution of the intrinsic job sizes, but observing the job types is essential for this statement to hold. In case of non-observable job types we show that for New-Worse-than-Used (NWU) distributed speed variations full replication ($d=N$) gives a larger stability region than no replication $(d=1)$.

Submitted 8 August, 2020; originally announced August 2020.

arXiv:2007.13615 [pdf, other]

doi 10.1007/s00453-022-00946-8

New FPT algorithms for finding the temporal hybridization number for sets of phylogenetic trees

Authors: Sander Borst, Leo van Iersel, Mark Jones, Steven Kelk

Abstract: We study the problem of finding a temporal hybridization network for a set of phylogenetic trees that minimizes the number of reticulations. First, we introduce an FPT algorithm for this problem on an arbitrary set of $m$ binary trees with $n$ leaves each with a running time of $O(5^k\cdot n\cdot m)$, where $k$ is the minimum temporal hybridization number. We also present the concept of temporal distance, which is a measure for how close a tree-child network is to being temporal. Then we introduce an algorithm for computing a tree-child network with temporal distance at most $d$ and at most $k$ reticulations in $O((8k)^d5^k\cdot n\cdot m)$ time. Lastly, we introduce an $O(6^kk!\cdot k\cdot n^2)$ time algorithm for computing a minimum temporal hybridization network for a set of two nonbinary trees. We also provide an implementation of all algorithms and an experimental analysis on their performance.

Submitted 27 July, 2020; originally announced July 2020.

arXiv:2005.14566 [pdf, other]

Heavy-Traffic Universality of Redundancy Systems with Assignment Constraints

Authors: Ellen Cardinaels, Sem Borst, Johan S. H. van Leeuwaarden

Abstract: Service systems often face task-server assignment-constraints due to skill-based routing or geographical conditions. Redundancy scheduling responds to this limited flexibility by replicating tasks to specific servers in agreement with these assignment constraints. We gain insight from product-form stationary distributions and weak local stability conditions to establish a state space collapse in heavy traffic. In this limiting regime, the parallel-server system with redundancy scheduling operates as a multi-class single-server system, achieving full resource pooling and exhibiting strong insensitivity to the underlying assignment constraints. In particular, the performance of a fully flexible (unconstrained) system can be matched even with rather strict assignment constraints.

Submitted 16 August, 2022; v1 submitted 29 May, 2020; originally announced May 2020.

Comments: 53 pages, 4 figures

arXiv:2005.13353 [pdf, other]

Threshold-based rerouting and replication for resolving job-server affinity relations

Authors: Youri Raaijmakers, Sem Borst, Onno Boxma

Abstract: We consider a system with several job types and two parallel server pools. Within the pools the servers are homogeneous, but across pools possibly not in the sense that the service speed of a job may depend on its type as well as the server pool. Immediately upon arrival, jobs are assigned to a server pool. This could be based on (partial) knowledge of their type, but such knowledge might not be available. Information about the job type can however be obtained while the job is in service; as the service progresses, the likelihood that the service speed of this job type is low increases, creating an incentive to execute the job on different, possibly faster, server(s). Two policies are considered: reroute the job to the other server pool, or replicate it there. We determine the effective load per server under both the rerouting and replication policy for completely unknown as well as partly known job types. We also examine the impact of these policies on the stability bound, and find that the uncertainty in job types may significantly degrade the performance. For (highly) unbalanced service speeds full replication achieves the largest stability bound while for (nearly) balanced service speeds no replication maximizes the stability bound. Finally, we discuss how the use of threshold-based policies can help improve the expected latency for completely or partly unknown job types.

Submitted 27 May, 2020; originally announced May 2020.

arXiv:2001.02841 [pdf, other]

Wireless random-access networks with bipartite interference graphs

Authors: Sem C. Borst, Frank den Hollander, Francesca R. Nardi, Matteo Sfragara

Abstract: We consider random-access networks where nodes represent servers with a queue and can be either active or inactive. A node deactivates at unit rate, while it activates at a rate that depends on its queue length, provided none of its neighbors is active. We consider arbitrary bipartite graphs in the limit as the initial queue lengths become large and identify the transition time between the two states where one half of the network is active and the other half is inactive. The transition path is decomposed into a succession of transitions on complete bipartite subgraphs. We formulate a randomized greedy algorithm that takes the graph as input and gives as output the set of transition paths the network is most likely to follow. Along each path we determine the mean transition time and its law on the scale of its mean. Depending on the activation rates, we identify three regimes of behavior.

Submitted 12 September, 2023; v1 submitted 9 January, 2020; originally announced January 2020.

Comments: 42 pages

arXiv:1912.13011 [pdf, other]

Crossover times in bipartite networks with activity constraints and time-varying switching rates

Authors: Sem Borst, Frank den Hollander, Francesca Nardi, Siamak Taati

Abstract: In this paper we study the performance of a bipartite network in which customers arrive at the nodes of the network, but not all nodes are able to serve their customers at all times. Each node can be either active or inactive, and two nodes connected by a bond cannot be active simultaneously. This situation arises in wireless random-access networks where, due to destructive interference, stations that are close to each other cannot use the same frequency band. We consider a model where the network is bipartite, the active nodes switch themselves off at rate 1, and the inactive nodes switch themselves on at a rate that depends on time and on which half of the bipartite network they are in. An inactive node cannot become active when one of the nodes it is connected to by a bond is active. The switching protocol allows the nodes to share activity among each other. In the limit as the activation rate becomes large, we compute the crossover time between the two states where one half of the network is active and the other half is inactive. This allows us to assess the overall activity of the network depending on the switching protocol. Our results make use of the metastability analysis for hard-core interacting particle models on finite bipartite graphs derived in an earlier paper. They are valid for a large class of bipartite networks, subject to certain assumptions. Proofs rely on a comparison with switching protocols that are not time-varying, through coupling techniques.

Submitted 12 February, 2022; v1 submitted 30 December, 2019; originally announced December 2019.

Comments: 32 pages, 2 figures

MSC Class: 60K25; 60K30; 60K35; 90B15; 90B18

arXiv:1912.00681 [pdf, ps, other]

Stability of Redundancy Systems with Processor Sharing

Authors: Youri Raaijmakers, Sem Borst, Onno Boxma

Abstract: We investigate the stability condition for redundancy-d systems where each of the servers follows a processor-sharing (PS) discipline. We allow for generally distributed job sizes, with possible dependence among the d replica sizes being governed by an arbitrary joint distribution. We establish that the stability condition is characterized by the expectation of the minimum of d replica sizes being less than the mean interarrival time per server. In the special case of identical replicas, the stability condition is insensitive to the job size distribution given its mean, and the stability condition is inversely proportional to the number of replicas. In the special case of i.i.d. replicas, the stability threshold decreases (increases) in the number of replicas for job size distributions that are NBU (NWU). We also discuss extensions to scenarios with heterogeneous servers.

Submitted 6 March, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

Comments: To appear in proceedings of ValueTools 2020

arXiv:1904.03980 [pdf, ps, other]

doi 10.1214/20-AAP1609

Induced idleness leads to deterministic heavy traffic limits for queue-based random-access algorithms

Authors: Eyal Castiel, Sem Borst, Laurent Miclo, Florian Simatos, Philip Whiting

Abstract: We examine a queue-based random-access algorithm where activation and deactivation rates are adapted as functions of queue lengths. We establish its heavy traffic behavior on a complete interference graph, which turns out to be highly nonstandard in two respects: (1) the scaling depends on some parameter of the algorithm and is not the $N/N^2$ scaling usually found in functional central limit theorems; (2) the heavy traffic limit is deterministic. We discuss how this nonstandard behavior arises from the idleness induced by the distributed nature of the algorithm. In order to prove our main result, we developed a new method for obtaining a fully coupled stochastic averaging principle.

Submitted 16 February, 2020; v1 submitted 8 April, 2019; originally announced April 2019.

Comments: 40 pages

MSC Class: 60K25 (primary); 60K35 (secondary)

arXiv:1903.02337 [pdf, other]

doi 10.1145/3311075

Hyper-Scalable JSQ with Sparse Feedback

Authors: Mark van der Boor, Sem Borst, Johan van Leeuwaarden

Abstract: Load balancing algorithms play a vital role in enhancing performance in data centers and cloud networks. Due to the massive size of these systems, scalability challenges, and especially the communication overhead associated with load balancing mechanisms, have emerged as major concerns. Motivated by these issues, we introduce and analyze a novel class of load balancing schemes where the various servers provide occasional queue updates to guide the load assignment. We show that the proposed schemes strongly outperform JSQ($d$) strategies with comparable communication overhead per job, and can achieve a vanishing waiting time in the many-server limit with just one message per job, just like the popular JIQ scheme. The proposed schemes are particularly geared however towards the sparse feedback regime with less than one message per job, where they outperform corresponding sparsified JIQ versions. We investigate fluid limits for synchronous updates as well as asynchronous exponential update intervals. The fixed point of the fluid limit is identified in the latter case, and used to derive the queue length distribution. We also demonstrate that in the ultra-low feedback regime the mean stationary waiting time tends to a constant in the synchronous case, but grows without bound in the asynchronous case.

Submitted 6 March, 2019; originally announced March 2019.

arXiv:1812.10703 [pdf, other]

Job Allocation in Large-Scale Service Systems with Affinity Relations

Authors: Ellen Cardinaels, Sem C. Borst, Johan S. H. van Leeuwaarden

Abstract: We consider load balancing in service systems with affinity relations between jobs and servers. Specifically, an arriving job can be allocated to a fast, primary server from a particular selection associated with this job or to a secondary server to be processed at a slower rate. Such job-server affinity relations can model network topologies based on geographical proximity, or data locality in cloud scenarios. We introduce load balancing schemes that allocate jobs to primary servers if available, and otherwise to secondary servers. A novel coupling construction is developed to obtain stability conditions and performance bounds using a coupling technique. We also conduct a fluid limit analysis for symmetric model instances, which reveals a delicate interplay between the model parameters and load balancing performance.

Submitted 27 December, 2018; originally announced December 2018.

Comments: 29 pages, 9 figures

MSC Class: 60K25; 68M20; 90B15; 90B22; 90B35

arXiv:1812.00979 [pdf, other]

Deep Reinforcement Learning for Intelligent Transportation Systems

Authors: Xiao-Yang Liu, Zihan Ding, Sem Borst, Anwar Walid

Abstract: Intelligent Transportation Systems (ITSs) are envisioned to play a critical role in improving traffic flow and reducing congestion, which is a pervasive issue impacting urban areas around the globe. Rapidly advancing vehicular communication and edge cloud computation technologies provide key enablers for smart traffic management. However, operating viable real-time actuation mechanisms on a practically relevant scale involves formidable challenges, e.g., policy iteration and conventional Reinforcement Learning (RL) techniques suffer from poor scalability due to state space explosion. Motivated by these issues, we explore the potential for Deep Q-Networks (DQN) to optimize traffic light control policies. As an initial benchmark, we establish that the DQN algorithms yield the "thresholding" policy in a single-intersection. Next, we examine the scalability properties of DQN algorithms and their performance in a linear network topology with several intersections along a main artery. We demonstrate that DQN algorithms produce intelligent behavior, such as the emergence of "greenwave" patterns, reflecting their ability to learn favorable traffic light actuations.

Submitted 3 December, 2018; originally announced December 2018.

arXiv:1811.06309 [pdf, other]

doi 10.1007/s11134-019-09621-2

Redundancy scheduling with scaled Bernoulli service requirements

Authors: Youri Raaijmakers, Sem Borst, Onno Boxma

Abstract: Redundancy scheduling has emerged as a powerful strategy for improving response times in parallel-server systems. The key feature in redundancy scheduling is replication of a job upon arrival by dispatching replicas to different servers. Redundant copies are abandoned as soon as the first of these replicas finishes service. By creating multiple service opportunities, redundancy scheduling increases the chance of a fast response from a server that is quick to provide service, and mitigates the risk of a long delay incurred when a single selected server turns out to be slow. The diversity enabled by redundant requests has been found to strongly improve the response time performance, especially in case of highly variable service requirements. Analytical results for redundancy scheduling are unfortunately scarce however, and even the stability condition has largely remained elusive so far, except for exponentially distributed service requirements. In order to gain further insight in the role of the service requirement distribution, we explore the behavior of redundancy scheduling for scaled Bernoulli service requirements. We establish a sufficient stability condition for generally distributed service requirements and we show that, for scaled Bernoulli service requirements, this condition is also asymptotically nearly necessary. This stability condition differs drastically from the exponential case, indicating that the stability condition depends on the service requirements in a sensitive and intricate manner.

Submitted 15 November, 2018; originally announced November 2018.

Report number: pages 67-82

Journal ref: Queueing Systems Volume 93, Issue 1-2, October 2019

arXiv:1807.05851 [pdf, other]

doi 10.1016/j.spa.2020.08.004

Transition time asymptotics of queue-based activation protocols in random-access networks

Authors: Sem Borst, Frank den Hollander, Francesca R. Nardi, Matteo Sfragara

Abstract: We consider networks where each node represents a server with a queue. An active node deactivates at unit rate. An inactive node activates at a rate that depends on its queue length, provided none of its neighbors is active. For complete bipartite networks, in the limit as the queues become large, we compute the average transition time between the two states where one half of the network is active and the other half is inactive. We show that the law of the transition time divided by its mean exhibits a trichotomy, depending on the activation rate functions.

Submitted 9 February, 2021; v1 submitted 16 July, 2018; originally announced July 2018.

Comments: 32 pages

MSC Class: 60K25; 60K30; 90B15; 90B18

Journal ref: Stochastic Processes and their Applications, Volume 130, Issue 12, December 2020, Pages 7483-7517

arXiv:1806.05444 [pdf, other]

doi 10.1137/20M1323746

Scalable load balancing in networked systems: A survey of recent advances

Authors: Mark van der Boor, Sem C. Borst, Johan S. H. van Leeuwaarden, Debankur Mukherjee

Abstract: The basic load balancing scenario involves a single dispatcher where tasks arrive that must immediately be forwarded to one of $N$ single-server queues. We discuss recent advances on scalable load balancing schemes which provide favorable delay performance when $N$ grows large, and yet only require minimal implementation overhead. Join-the-Shortest-Queue (JSQ) yields vanishing delays as $N$ grows large, as in a centralized queueing arrangement, but involves a prohibitive communication burden. In contrast, power-of-$d$ or JSQ($d$) schemes that assign an incoming task to a server with the shortest queue among $d$ servers selected uniformly at random require little communication, but lead to constant delays. In order to examine this fundamental trade-off between delay performance and implementation overhead, we consider JSQ($d(N)$) schemes where the diversity parameter $d(N)$ depends on $N$ and investigate what growth rate of $d(N)$ is required to asymptotically match the optimal JSQ performance on fluid and diffusion scale. Stochastic coupling techniques and stochastic-process limits play an instrumental role in establishing the asymptotic optimality. We demonstrate how this methodology carries over to infinite-server settings, finite buffers, multiple dispatchers, servers arranged on graph topologies, and token-based load balancing including the popular Join-the-Idle-Queue (JIQ) scheme. In this way we provide a broad overview of the many recent advances in the field. This survey extends the short review presented at ICM 2018 (arXiv:1712.08555).

Submitted 4 November, 2021; v1 submitted 14 June, 2018; originally announced June 2018.

Comments: To appear in SIAM Review. arXiv admin note: substantial text overlap with arXiv:1712.08555

Journal ref: SIAM Rev. 64 3 (2022) 554-622
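
The power-of-$d$ rule described in the abstract above can be illustrated with a minimal sketch. This is not code from the survey: the function name `jsq_d`, the list-of-queue-lengths representation, sampling without replacement, and the tie-breaking rule are all assumptions made for illustration.

```python
import random


def jsq_d(queue_lengths, d, rng=random):
    """Return the index of the server picked by a power-of-d (JSQ(d)) rule.

    Hypothetical helper: queue_lengths[i] is the number of tasks at server i.
    """
    n = len(queue_lengths)
    if not 1 <= d <= n:
        raise ValueError("d must be between 1 and the number of servers")
    # Assumption: the d candidates are distinct (sampled without replacement).
    candidates = rng.sample(range(n), d)
    # Send the task to the candidate with the fewest tasks;
    # ties go to whichever tied candidate was sampled first.
    return min(candidates, key=lambda i: queue_lengths[i])
```

Under these assumptions, `d = len(queue_lengths)` reduces to plain JSQ, so `jsq_d([3, 0, 2, 5], d=4)` returns `1`, while `d = 1` reduces to uniformly random assignment.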

arXiv:1712.08555 [pdf, other]

Scalable Load Balancing in Networked Systems: Universality Properties and Stochastic Coupling Methods

Authors: Mark van der Boor, Sem C. Borst, Johan S. H. van Leeuwaarden, Debankur Mukherjee

Abstract: We present an overview of scalable load balancing algorithms which provide favorable delay performance in large-scale systems, and yet only require minimal implementation overhead. Aimed at a broad audience, the paper starts with an introduction to the basic load balancing scenario, consisting of a single dispatcher where tasks arrive that must immediately be forwarded to one of $N$ single-server queues. A popular class of load balancing algorithms are so-called power-of-$d$ or JSQ($d$) policies, where an incoming task is assigned to a server with the shortest queue among $d$ servers selected uniformly at random. This class includes the Join-the-Shortest-Queue (JSQ) policy as a special case ($d = N$), which has strong stochastic optimality properties and yields a mean waiting time that vanishes as $N$ grows large for any fixed subcritical load. However, a nominal implementation of the JSQ policy involves a prohibitive communication burden in large-scale deployments. In contrast, a random assignment policy ($d = 1$) does not entail any communication overhead, but the mean waiting time remains constant as $N$ grows large for any fixed positive load. In order to examine the fundamental trade-off between performance and implementation overhead, we consider an asymptotic regime where $d(N)$ depends on $N$. We investigate what growth rate of $d(N)$ is required to match the performance of the JSQ policy on fluid and diffusion scale. The results demonstrate that the asymptotics for the JSQ($d(N)$) policy are insensitive to the exact growth rate of $d(N)$, as long as the latter is sufficiently fast, implying that the optimality of the JSQ policy can asymptotically be preserved while dramatically reducing the communication overhead. We additionally show how the communication overhead can be reduced yet further by the so-called Join-the-Idle-Queue scheme, leveraging memory at the dispatcher.

Submitted 22 December, 2017; originally announced December 2017.

Comments: Survey paper. Contribution to the Proceedings of the ICM 2018
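
As an illustration of the JSQ($d$) rule described in the abstract above, here is a minimal Python sketch of a single dispatch decision. The function name `jsq_d`, sampling without replacement, and the tie-breaking order are assumptions made for this example and are not taken from the paper.

```python
import random


def jsq_d(queue_lengths, d, rng=random):
    """Pick a server for one arriving task under a JSQ(d)-style rule.

    Assumption: the d candidates are sampled uniformly without replacement.
    Ties are broken by the (random) order in which candidates were sampled.
    """
    n = len(queue_lengths)
    if not 1 <= d <= n:
        raise ValueError("d must satisfy 1 <= d <= N")
    candidates = rng.sample(range(n), d)
    # Among the sampled servers, return one with the fewest tasks.
    return min(candidates, key=lambda i: queue_lengths[i])
```

With `d = len(queue_lengths)` the rule inspects every server and reduces to JSQ; with `d = 1` it reduces to uniformly random assignment.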

arXiv:1711.04491 [pdf, other]

The impact of a network split on cascading failure processes

Authors: Fiona Sloothaak, Sem C. Borst, Bert Zwart

Abstract: Cascading failure models are typically used to capture the phenomenon where failures possibly trigger further failures in succession, causing knock-on effects. In many networks this ultimately leads to a disintegrated network where the failure propagation continues independently across the various components. In order to gain insight in the impact of network splitting on cascading failure processes, we extend a well-established cascading failure model for which the number of failures obeys a power-law distribution. We assume that a single line failure immediately splits the network in two components, and examine its effect on the power-law exponent. The results provide valuable qualitative insights that are crucial first steps towards understanding more complex network splitting scenarios.

Submitted 13 November, 2017; originally announced November 2017.

arXiv:1707.05866 [pdf, other]

doi 10.1145/3179417

Asymptotically Optimal Load Balancing Topologies

Authors: Debankur Mukherjee, Sem C. Borst, Johan S. H. van Leeuwaarden

Abstract: We consider a system of $N$ servers inter-connected by some underlying graph topology $G_N$. Tasks arrive at the various servers as independent Poisson processes of rate $λ$. Each incoming task is irrevocably assigned to whichever server has the smallest number of tasks among the one where it appears and its neighbors in $G_N$. Tasks have unit-mean exponential service times and leave the system upon service completion. The above model has been extensively investigated in the case $G_N$ is a clique. Since the servers are exchangeable in that case, the queue length process is quite tractable, and it has been proved that for any $λ< 1$, the fraction of servers with two or more tasks vanishes in the limit as $N \to \infty$. For an arbitrary graph $G_N$, the lack of exchangeability severely complicates the analysis, and the queue length process tends to be worse than for a clique. Accordingly, a graph $G_N$ is said to be $N$-optimal or $\sqrt{N}$-optimal when the occupancy process on $G_N$ is equivalent to that on a clique on an $N$-scale or $\sqrt{N}$-scale, respectively. We prove that if $G_N$ is an Erdős-Rényi random graph with average degree $d(N)$, then it is with high probability $N$-optimal and $\sqrt{N}$-optimal if $d(N) \to \infty$ and $d(N) / (\sqrt{N} \log(N)) \to \infty$ as $N \to \infty$, respectively. This demonstrates that optimality can be maintained at $N$-scale and $\sqrt{N}$-scale while reducing the number of connections by nearly a factor $N$ and $\sqrt{N} / \log(N)$ compared to a clique, provided the topology is suitably random. It is further shown that if $G_N$ contains $Θ(N)$ bounded-degree nodes, then it cannot be $N$-optimal. In addition, we establish that an arbitrary graph $G_N$ is $N$-optimal when its minimum degree is $N - o(N)$, and may not be $N$-optimal even when its minimum degree is $c N + o(N)$ for any $0 < c < 1/2$.

Submitted 6 April, 2019; v1 submitted 18 July, 2017; originally announced July 2017.

Comments: A few relevant results from arXiv:1612.00723 are included for convenience

Journal ref: Proc. ACM Meas. Anal. Comput. Syst. 2 1 (2018)
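
The assignment rule in the abstract above (send each task to the least-loaded server among the arrival server and its neighbors in $G_N$) can be sketched in Python as follows. The name `assign_on_graph`, the adjacency-dict representation, and the tie-breaking in favor of the arrival server are assumptions for illustration only; the abstract does not specify how ties are resolved.

```python
def assign_on_graph(arrival_server, queue_lengths, neighbors):
    """Return the server that receives a task arriving at `arrival_server`.

    `neighbors` maps each server to an iterable of its neighbors in the graph.
    Assumption: ties go to the arrival server first, then to neighbors in
    listed order, because min() returns the first minimal element.
    """
    candidates = [arrival_server, *neighbors[arrival_server]]
    return min(candidates, key=lambda s: queue_lengths[s])
```

When the graph is a clique, every server is a candidate for every arrival, so the rule inspects all $N$ queues.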

arXiv:1706.01059 [pdf, other]

Load Balancing in Large-Scale Systems with Multiple Dispatchers

Authors: Mark van der Boor, Sem Borst, Johan van Leeuwaarden

Abstract: Load balancing algorithms play a crucial role in delivering robust application performance in data centers and cloud networks. Recently, strong interest has emerged in Join-the-Idle-Queue (JIQ) algorithms, which rely on tokens issued by idle servers in dispatching tasks and outperform power-of-$d$ policies. Specifically, JIQ strategies involve minimal information exchange, and yet achieve zero blocking and wait in the many-server limit. The latter property prevails in a multiple-dispatcher scenario when the loads are strictly equal among dispatchers. For various reasons it is not uncommon however for skewed load patterns to occur. We leverage product-form representations and fluid limits to establish that the blocking and wait then no longer vanish, even for arbitrarily low overall load. Remarkably, it is the least-loaded dispatcher that throttles tokens and leaves idle servers stranded, thus acting as bottleneck. Motivated by the above issues, we introduce two enhancements of the ordinary JIQ scheme where tokens are either distributed non-uniformly or occasionally exchanged among the various dispatchers. We prove that these extensions can achieve zero blocking and wait in the many-server limit, for any subcritical overall load and arbitrarily skewed load profiles. Extensive simulation experiments demonstrate that the asymptotic results are highly accurate, even for moderately sized systems.

Submitted 4 June, 2017; originally announced June 2017.

arXiv:1703.10575 [pdf, other]

Delay versus Stickiness Violation Trade-offs for Load Balancing in Large-Scale Data Centers

Authors: Qingkai Liang, Sem Borst

Abstract: Most load balancing techniques implemented in current data centers tend to rely on a mapping from packets to server IP addresses through a hash value calculated from the flow five-tuple. The hash calculation allows extremely fast packet forwarding and provides flow `stickiness', meaning that all packets belonging to the same flow get dispatched to the same server. Unfortunately, such static hashing may not yield an optimal degree of load balancing, e.g., due to variations in server processing speeds or traffic patterns. On the other hand, dynamic schemes, such as the Join-the-Shortest-Queue (JSQ) scheme, provide a natural way to mitigate load imbalances, but at the expense of stickiness violation. In the present paper we examine the fundamental trade-off between stickiness violation and packet-level latency performance in large-scale data centers. We establish that stringent flow stickiness carries a significant performance penalty in terms of packet-level delay. Moreover, relaxing the stickiness requirement by a minuscule amount is highly effective in clipping the tail of the latency distribution. We further propose a bin-based load balancing scheme that achieves a good balance among scalability, stickiness violation and packet-level delay performance. Extensive simulation experiments corroborate the analytical results and validate the effectiveness of the bin-based load balancing scheme.

Submitted 8 July, 2017; v1 submitted 30 March, 2017; originally announced March 2017.

arXiv:1703.08373 [pdf, other]

doi 10.1145/3084463

Optimal Service Elasticity in Large-Scale Distributed Systems

Authors: Debankur Mukherjee, Souvik Dhara, Sem Borst, Johan S. H. van Leeuwaarden

Abstract: A fundamental challenge in large-scale cloud networks and data centers is to achieve highly efficient server utilization and limit energy consumption, while providing excellent user-perceived performance in the presence of uncertain and time-varying demand patterns. Auto-scaling provides a popular paradigm for automatically adjusting service capacity in response to demand while meeting performance targets, and queue-driven auto-scaling techniques have been widely investigated in the literature. In typical data center architectures and cloud environments however, no centralized queue is maintained, and load balancing algorithms immediately distribute incoming tasks among parallel queues. In these distributed settings with vast numbers of servers, centralized queue-driven auto-scaling techniques involve a substantial communication overhead and major implementation burden, or may not even be viable at all. Motivated by the above issues, we propose a joint auto-scaling and load balancing scheme which does not require any global queue length information or explicit knowledge of system parameters, and yet provides provably near-optimal service elasticity. We establish the fluid-level dynamics for the proposed scheme in a regime where the total traffic volume and nominal service capacity grow large in proportion. The fluid-limit results show that the proposed scheme achieves asymptotic optimality in terms of user-perceived delay performance as well as energy consumption. Specifically, we prove that both the waiting time of tasks and the relative energy portion consumed by idle servers vanish in the limit. At the same time, the proposed scheme operates in a distributed fashion and involves only constant communication overhead per task, thus ensuring scalability in massive data center operations.

Submitted 24 March, 2017; originally announced March 2017.

Comments: Accepted in ACM SIGMETRICS, Urbana-Champaign, Illinois, USA, 2017

Journal ref: Proc. ACM Meas. Anal. Comput. Syst. 1 1 (2017)

arXiv:1612.00723 [pdf, ps, other]

doi 10.1287/stsy.2018.0016

Universality of Power-of-$d$ Load Balancing in Many-Server Systems

Authors: Debankur Mukherjee, Sem C. Borst, Johan S. H. van Leeuwaarden, Philip A. Whiting

Abstract: We consider a system of $N$ parallel single-server queues with unit exponential service rates and a single dispatcher where tasks arrive as a Poisson process of rate $λ(N)$. When a task arrives, the dispatcher assigns it to a server with the shortest queue among $d(N)$ randomly selected servers ($1 \leq d(N) \leq N$). This load balancing strategy is referred to as a JSQ($d(N)$) scheme, marking that it subsumes the celebrated Join-the-Shortest Queue (JSQ) policy as a crucial special case for $d(N) = N$. We construct a stochastic coupling to bound the difference in the queue length processes between the JSQ policy and a scheme with an arbitrary value of $d(N)$. We use the coupling to derive the fluid limit in the regime where $λ(N) / N \to λ< 1$ as $N \to \infty$ with $d(N) \to\infty$, along with the associated fixed point. The fluid limit turns out not to depend on the exact growth rate of $d(N)$, and in particular coincides with that for the JSQ policy. We further leverage the coupling to establish that the diffusion limit in the critical regime where $(N - λ(N)) / \sqrt{N} \to β> 0$ as $N \to \infty$ with $d(N)/(\sqrt{N} \log (N))\to\infty$ corresponds to that for the JSQ policy. These results indicate that the optimality of the JSQ policy can be preserved at the fluid-level and diffusion-level while reducing the overhead by nearly a factor O($N$) and O($\sqrt{N}/\log(N)$), respectively.

Submitted 16 November, 2018; v1 submitted 2 December, 2016; originally announced December 2016.

Comments: 39 pages, 2 figures, companion paper of arXiv:1612.00722

Journal ref: Stoch.Syst. 8 4 (2018) 265-292

arXiv:1612.00722 [pdf, ps, other]

doi 10.1287/moor.2019.1042

Asymptotic Optimality of Power-of-$d$ Load Balancing in Large-Scale Systems

Authors: Debankur Mukherjee, Sem C. Borst, Johan S. H. van Leeuwaarden, Philip A. Whiting

Abstract: We consider a system of $N$ identical server pools and a single dispatcher where tasks arrive as a Poisson process of rate $λ(N)$. Arriving tasks cannot be queued, and must immediately be assigned to one of the server pools to start execution, or discarded. The execution times are assumed to be exponentially distributed with unit mean, and do not depend on the number of other tasks receiving service. However, the experienced performance (e.g. in terms of received throughput) does degrade with an increasing number of concurrent tasks at the same server pool. The dispatcher therefore aims to evenly distribute the tasks across the various server pools. Specifically, when a task arrives, the dispatcher assigns it to the server pool with the minimum number of tasks among $d(N)$ randomly selected server pools. This assignment strategy is called the JSQ$(d(N))$ scheme, as it resembles the power-of-$d$ version of the Join-the-Shortest-Queue (JSQ) policy, and will also be referred to as such in the special case $d(N) = N$. We construct a stochastic coupling to bound the difference in the system occupancy processes between the JSQ policy and a scheme with an arbitrary value of $d(N)$. We use the coupling to derive the fluid limit in case $d(N) \to \infty$ and $λ(N)/N \to λ$ as $N \to \infty$, along with the associated fixed point. The fluid limit turns out to be insensitive to the exact growth rate of $d(N)$, and coincides with that for the JSQ policy. We further leverage the coupling to establish that the diffusion limit corresponds to that for the JSQ policy as well, as long as $d(N)/(\sqrt{N} \log(N)) \to \infty$, and characterize the common limiting diffusion process. These results indicate that the JSQ optimality can be preserved at the fluid-level and diffusion-level while reducing the overhead by nearly a factor O($N$) and O($\sqrt{N}/\log(N)$), respectively.

Submitted 2 December, 2016; originally announced December 2016.

Comments: 48 pages, 3 figures, companion paper of arXiv:1612.00723

Journal ref: Math. Oper. Res. 45 4 (2020) 1535-1571

arXiv:1611.09723 [pdf, other]

Mean-field limits for large-scale random-access networks

Authors: Fabio Cecchi, Sem C. Borst, Johan S. H. van Leeuwaarden, Philip A. Whiting

Abstract: We establish mean-field limits for large-scale random-access networks with buffer dynamics and arbitrary interference graphs. While saturated-buffer scenarios have been widely investigated and yield useful throughput estimates for persistent sessions, they fail to capture the fluctuations in buffer contents over time, and provide no insight in the delay performance of flows with intermittent packet arrivals. Motivated by that issue, we explore in the present paper random-access networks with buffer dynamics, where flows with empty buffers refrain from competition for the medium. The occurrence of empty buffers thus results in a complex dynamic interaction between activity states and buffer contents, which severely complicates the performance analysis. Hence we focus on a many-sources regime where the total number of nodes grows large, which not only offers mathematical tractability but is also highly relevant with the densification of wireless networks as the Internet of Things emerges. We exploit time scale separation properties to prove that the properly scaled buffer occupancy process converges to the solution of a deterministic initial-value problem, and establish the existence and uniqueness of the associated fixed point. This approach simplifies the performance analysis of networks with huge numbers of nodes to a low-dimensional fixed-point calculation. For the case of a complete interference graph, we demonstrate asymptotic stability, provide a simple closed-form expression for the fixed point, and prove interchange of the mean-field and steady-state limits. This yields asymptotically exact approximations for key performance metrics, in particular the stationary buffer content and packet delay distributions. The methodological framework that we develop easily extends to various model refinements as will be illustrated by several examples.

Submitted 24 April, 2019; v1 submitted 29 November, 2016; originally announced November 2016.

arXiv:1611.05070 [pdf, ps, other]

doi 10.1016/j.dam.2016.10.009

Scaling Laws for Maximum Coloring of Random Geometric Graphs

Authors: Sem Borst, Milan Bradonjić

Abstract: We examine maximum vertex coloring of random geometric graphs, in an arbitrary but fixed dimension, with a constant number of colors. Since this problem is neither scale-invariant nor smooth, the usual methodology to obtain limit laws cannot be applied. We therefore leverage different concepts based on subadditivity to establish convergence laws for the maximum number of vertices that can be colored. For the constants that appear in these results, we provide the exact value in dimension one, and upper and lower bounds in higher dimensions.

Submitted 15 November, 2016; originally announced November 2016.

MSC Class: 60C05; 60D05; 60G55; 05C15; 05C80; 68R05; 68R10

arXiv:1604.03677 [pdf, ps, other]

Robustness of Power-law Behavior in Cascading Failure Models

Authors: F. Sloothaak, S. C. Borst, A. P. Zwart

Abstract: Inspired by reliability issues in electric transmission networks, we use a probabilistic approach to study the occurrence of large failures in a stylized cascading failure model. In this model, lines have random capacities that initially meet the load demands imposed on the network. Every single line failure changes the load distribution in the surviving network, possibly causing further lines to become overloaded and trip as well. An initial single line failure can therefore potentially trigger massive cascading effects, and in this paper we measure the risk of such cascading events by the probability that the number of failed lines exceeds a certain large threshold. Under particular critical conditions, the exceedance probability follows a power-law distribution, implying a significant risk of severe failures. We examine the robustness of the power-law behavior by exploring under which assumptions this behavior prevails.

Submitted 13 April, 2016; originally announced April 2016.

arXiv:1511.01798 [pdf, ps, other]

Optimality gaps in asymptotic dimensioning of many-server systems

Authors: Jaron Sanders, S. C. Borst, A. J. E. M. Janssen, J. S. H. van Leeuwaarden

Abstract: The Quality-and-Efficiency-Driven (QED) regime provides a basis for solving asymptotic dimensioning problems that trade off revenue, costs and service quality. We derive bounds for the optimality gaps that capture the differences between the true optimum and the asymptotic optimum based on the QED approximations. Our bounds generalize earlier results for classical many-server systems. We also apply our bounds to a many-server system with threshold control.

Submitted 5 November, 2015; originally announced November 2015.

Comments: 17 pages, 2 figures, 2 tables

arXiv:1510.02657 [pdf, ps, other]

doi 10.1017/jpr.2016.68

Universality of Load Balancing Schemes on Diffusion Scale

Authors: D. Mukherjee, S. C. Borst, J. S. H. van Leeuwaarden, P. A. Whiting

Abstract: We consider a system of $N$ parallel queues with identical exponential service rates and a single dispatcher where tasks arrive as a Poisson process. When a task arrives, the dispatcher always assigns it to an idle server, if there is any, and to a server with the shortest queue among $d$ randomly selected servers otherwise $(1 \leq d \leq N)$. This load balancing scheme subsumes the so-called Join-the-Idle Queue (JIQ) policy $(d = 1)$ and the celebrated Join-the-Shortest Queue (JSQ) policy $(d = N)$ as two crucial special cases. We develop a stochastic coupling construction to obtain the diffusion limit of the queue process in the Halfin-Whitt heavy-traffic regime, and establish that it does not depend on the value of $d$, implying that assigning tasks to idle servers is sufficient for diffusion level optimality.

Submitted 3 March, 2016; v1 submitted 9 October, 2015; originally announced October 2015.

Journal ref: J.Appl.Probab. 53 4 (2016) 1111-1124
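
The dispatching rule in the abstract above (join an idle server if one exists, otherwise the shortest of $d$ randomly selected queues) can be sketched in Python as below. The function name `idle_first_then_jsq_d`, the uniform choice among idle servers, and sampling without replacement are assumptions for this example; the abstract does not fix these details.

```python
import random


def idle_first_then_jsq_d(queue_lengths, d, rng=random):
    """Choose a server for one arriving task.

    Step 1: if any server is idle (queue length 0), pick one of them
            (assumption: uniformly at random).
    Step 2: otherwise sample d servers without replacement (assumption)
            and pick one with the shortest queue among them.
    """
    n = len(queue_lengths)
    if not 1 <= d <= n:
        raise ValueError("d must satisfy 1 <= d <= N")
    idle = [i for i, q in enumerate(queue_lengths) if q == 0]
    if idle:
        return rng.choice(idle)
    candidates = rng.sample(range(n), d)
    return min(candidates, key=lambda i: queue_lengths[i])
```

Setting `d = 1` gives the JIQ-style behavior (idle server if available, otherwise a random server), and `d = len(queue_lengths)` gives JSQ, matching the two special cases named in the abstract.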

arXiv:1509.08665 [pdf]

doi 10.1007/s11134-015-9438-x

On the Scalability and Message Count of Trickle-based Broadcasting Schemes

Authors: Thomas M. M. Meyfroyt, Sem C. Borst, Onno J. Boxma, Dee Denteneer

Abstract: As the use of wireless sensor networks increases, the need for efficient and reliable broadcasting algorithms grows. Ideally, a broadcasting algorithm should have the ability to quickly disseminate data, while keeping the number of transmissions low. In this paper, we analyze the popular Trickle algorithm, which has been proposed as a suitable communication protocol for code maintenance and propagation in wireless sensor networks. We show that the broadcasting process of a network using Trickle can be modeled by a Markov chain and that this chain falls under a class of Markov chains, closely related to residual lifetime distributions. It is then shown that this class of Markov chains admits a stationary distribution of a special form. These results are used to analyze the Trickle algorithm and its message count. Our results prove conjectures made in the literature concerning the effect of a listen-only period. Besides providing a mathematical analysis of the algorithm, we propose a generalized version of Trickle, with an additional parameter defining the length of a listen-only period.

Submitted 29 September, 2015; originally announced September 2015.

Comments: arXiv admin note: substantial text overlap with arXiv:1407.6034

MSC Class: 60J05 60J20 90B18

Journal ref: Queueing Systems: Volume 81, Issue 2 (2015), Page 203-230

arXiv:1503.06757 [pdf, ps, other]

doi 10.1007/s10955-015-1391-x

Hitting times asymptotics for hard-core interactions on grids

Authors: Francesca R. Nardi, Alessandro Zocca, Sem C. Borst

Abstract: We consider the hard-core model with Metropolis transition probabilities on finite grid graphs and investigate the asymptotic behavior of the first hitting time between its two maximum-occupancy configurations in the low-temperature regime. In particular, we show how the order-of-magnitude of this first hitting time depends on the grid sizes and on the boundary conditions by means of a novel combinatorial method. Our analysis also proves the asymptotic exponentiality of the scaled hitting time and yields the mixing time of the process in the low-temperature limit as side-result. In order to derive these results, we extended the model-independent framework in [27] for first hitting times to allow for a more general initial state and target subset.

Submitted 23 March, 2015; originally announced March 2015.

arXiv:1411.2808 [pdf, ps, other]

Optimal Admission Control for Many-Server Systems with QED-Driven Revenues

Authors: Jaron Sanders, S. C. Borst, A. J. E. M. Janssen, J. S. H. van Leeuwaarden

Abstract: We consider Markovian many-server systems with admission control operating in a QED regime, where the relative utilization approaches unity while the number of servers grows large, providing natural Economies-of-Scale. In order to determine the optimal admission control policy, we adopt a revenue maximization framework, and suppose that the revenue rate attains a maximum when no customers are waiting and no servers are idling. When the revenue function scales properly with the system size, we show that a nondegenerate optimization problem arises in the limit. Detailed analysis demonstrates that the revenue is maximized by nontrivial policies that bar customers from entering when the queue length exceeds a certain threshold of the order of the typical square-root level variation in the system occupancy. We identify a fundamental equation characterizing the optimal threshold, which we extensively leverage to provide broadly applicable upper/lower bounds for the optimal threshold, establish its monotonicity, and examine its asymptotic behavior, all for general revenue structures. For linear and exponential revenue structures, we present explicit expressions for the optimal threshold.

Submitted 11 November, 2014; originally announced November 2014.

Comments: 36 pages, 7 figures
