Search | arXiv e-print repository

Intervention-Assisted Policy Gradient Methods for Online Stochastic Queuing Network Optimization: Technical Report

Authors: Jerrod Wigmore, Brooke Shrader, Eytan Modiano

Abstract: Deep Reinforcement Learning (DRL) offers a powerful approach to training neural network control policies for stochastic queuing networks (SQN). However, traditional DRL methods rely on offline simulations or static datasets, limiting their real-world application in SQN control. This work proposes Online Deep Reinforcement Learning-based Controls (ODRLC) as an alternative, where an intelligent agent interacts directly with a real environment and learns an optimal control policy from these online interactions. SQNs present a challenge for ODRLC due to the unbounded nature of the queues within the network resulting in an unbounded state-space. An unbounded state-space is particularly challenging for neural network policies as neural networks are notoriously poor at extrapolating to unseen states. To address this challenge, we propose an intervention-assisted framework that leverages strategic interventions from known stable policies to ensure the queue sizes remain bounded. This framework combines the learning power of neural networks with the guaranteed stability of classical control policies for SQNs. We introduce a method to design these intervention-assisted policies to ensure strong stability of the network. Furthermore, we extend foundational DRL theorems for intervention-assisted policies and develop two practical algorithms specifically for ODRLC of SQNs. Finally, we demonstrate through experiments that our proposed algorithms outperform both classical control approaches and prior ODRLC algorithms.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 25 pages, 6 Figures

ACM Class: F.2.2; I.2.6

arXiv:2008.01528 [pdf, other]

Throughput Maximization in Uncooperative Spectrum Sharing Networks

Authors: Thomas Stahlbuhk, Brooke Shrader, Eytan Modiano

Abstract: Throughput-optimal transmission scheduling in wireless networks has been a well-considered problem in the literature, and the method for achieving optimality, MaxWeight scheduling, has been known for several decades. This algorithm achieves optimality by adaptively scheduling transmissions relative to each user's stochastic traffic demands. To implement the method, users must report their queue backlogs to the network controller and must rapidly respond to the resulting resource allocations. However, many currently-deployed wireless systems are not able to perform these tasks and instead expect to occupy a fixed assignment of resources. To accommodate these limitations, adaptive scheduling algorithms need to interactively estimate these uncooperative users' queue backlogs and make scheduling decisions to account for their predicted behavior. In this work, we address the problem of scheduling with uncooperative legacy systems by developing algorithms to accomplish these tasks. We begin by formulating the problem of inferring the uncooperative systems' queue backlogs as a partially observable Markov decision process and proceed to show how our resulting learning algorithms can be successfully used in a queue-length-based scheduling policy. Our theoretical analysis characterizes the throughput-stability region of the network and is verified using simulation results.

Submitted 4 August, 2020; originally announced August 2020.

Comments: 15 pages, 7 figures

arXiv:2005.05206 [pdf, other]

Learning Algorithms for Minimizing Queue Length Regret

Authors: Thomas Stahlbuhk, Brooke Shrader, Eytan Modiano

Abstract: We consider a system consisting of a single transmitter/receiver pair and $N$ channels over which they may communicate. Packets randomly arrive to the transmitter's queue and wait to be successfully sent to the receiver. The transmitter may attempt a frame transmission on one channel at a time, where each frame includes a packet if one is in the queue. For each channel, an attempted transmission is successful with an unknown probability. The transmitter's objective is to quickly identify the best channel to minimize the number of packets in the queue over $T$ time slots. To analyze system performance, we introduce queue length regret, which is the expected difference between the total queue length of a learning policy and a controller that knows the rates, a priori. One approach to designing a transmission policy would be to apply algorithms from the literature that solve the closely-related stochastic multi-armed bandit problem. These policies would focus on maximizing the number of successful frame transmissions over time. However, we show that these methods have $\Omega(\log{T})$ queue length regret. On the other hand, we show that there exists a set of queue-length based policies that can obtain order optimal $O(1)$ queue length regret. We use our theoretical analysis to devise heuristic methods that are shown to perform well in simulation.

Submitted 14 May, 2020; v1 submitted 11 May, 2020; originally announced May 2020.

Comments: 28 Pages, 11 figures

arXiv:0711.0705 [pdf, ps, other]

doi 10.1109/TIT.2009.2023727

Feedback Capacity of the Compound Channel

Authors: Brooke Shrader, Haim Permuter

Abstract: In this work we find the capacity of a compound finite-state channel with time-invariant deterministic feedback. The model we consider involves the use of fixed length block codes. Our achievability result includes a proof of the existence of a universal decoder for the family of finite-state channels with feedback. As a consequence of our capacity result, we show that feedback does not increase the capacity of the compound Gilbert-Elliot channel. Additionally, we show that for a stationary and uniformly ergodic Markovian channel, if the compound channel capacity is zero without feedback then it is zero with feedback. Finally, we use our result on the finite-state channel to show that the feedback capacity of the memoryless compound channel is given by $\inf_{\theta} \max_{Q_X} I(X;Y|\theta)$.

Submitted 5 November, 2007; originally announced November 2007.

Comments: 34 pages, 2 figures, submitted to IEEE Transactions on Information Theory

arXiv:0705.3058 [pdf, ps, other]

On the Shannon capacity and queueing stability of random access multicast

Authors: Brooke Shrader, Anthony Ephremides

Abstract: We study and compare the Shannon capacity region and the stable throughput region for a random access system in which source nodes multicast their messages to multiple destination nodes. Under an erasure channel model which accounts for interference and allows for multipacket reception, we first characterize the Shannon capacity region. We then consider a queueing-theoretic formulation and characterize the stable throughput region for two different transmission policies: a retransmission policy and random linear coding. Our results indicate that for large blocklengths, the random linear coding policy provides a higher stable throughput than the retransmission policy. Furthermore, our results provide an example of a transmission policy for which the Shannon capacity region strictly outer bounds the stable throughput region, which contradicts an unproven conjecture that the Shannon capacity and stable throughput coincide for random access systems.

Submitted 19 September, 2007; v1 submitted 21 May, 2007; originally announced May 2007.

Comments: 27 pages, 3 figures. Revisions to sections I, III, VII and App. A, B

arXiv:0704.2778 [pdf, ps, other]

Random Access Broadcast: Stability and Throughput Analysis

Authors: Brooke Shrader, Anthony Ephremides

Abstract: A wireless network in which packets are broadcast to a group of receivers through use of a random access protocol is considered in this work. The relation to previous work on networks of interacting queues is discussed and subsequently, the stability and throughput regions of the system are analyzed and presented. A simple network of two source nodes and two destination nodes is considered first. The broadcast service process is analyzed assuming a channel that allows for packet capture and multipacket reception. In this small network, the stability and throughput regions are observed to coincide. The same problem for a network with N sources and M destinations is considered next. The channel model is simplified in that multipacket reception is no longer permitted. Bounds on the stability region are developed using the concept of stability rank and the throughput region of the system is compared to the bounds. Our results show that as the number of destination nodes increases, the stability and throughput regions diminish. Additionally, a previous conjecture that the stability and throughput regions coincide for a network of arbitrarily many sources is supported for a broadcast scenario by the results presented in this work.

Submitted 20 April, 2007; originally announced April 2007.

Comments: 19 pages, 5 figures. Submitted as correspondence to IEEE Transactions on Information Theory, Sept 2006. Revised April 2007

Journal ref: IEEE Transactions on Information Theory, vol. 53, no. 8, pp. 2915-2921, August 2007.

arXiv:0704.0831 [pdf, ps, other]

On packet lengths and overhead for random linear coding over the erasure channel

Authors: Brooke Shrader, Anthony Ephremides

Abstract: We assess the practicality of random network coding by illuminating the issue of overhead and considering it in conjunction with increasingly long packets sent over the erasure channel. We show that the transmission of increasingly long packets, consisting of either an increasing number of symbols per packet or an increasing symbol alphabet size, results in a data rate approaching zero over the erasure channel. This result is due to an erasure probability that increases with packet length. Numerical results for a particular modulation scheme demonstrate a data rate of approximately zero for a large, but finite-length packet. Our results suggest a reduction in the performance gains offered by random network coding.

Submitted 5 April, 2007; originally announced April 2007.

Comments: 5 pages, 5 figures, submitted to the 2007 International Wireless Communications and Mobile Computing Conference

Showing 1–7 of 7 results for author: Shrader, B