
Learning to Control Unknown Strongly Monotone Games

Authors: Siddharth Chandak, Ilai Bistritz, Nicholas Bambos

Abstract: Consider $N$ players each with a $d$-dimensional action set. Each of the players' utility functions includes their reward function and a linear term for each dimension, with coefficients that are controlled by the manager. We assume that the game is strongly monotone, so if each player runs gradient descent, the dynamics converge to a unique Nash equilibrium (NE). The NE is typically inefficient in terms of global performance. The resulting global performance of the system can be improved by imposing $K$-dimensional linear constraints on the NE. We therefore want the manager to pick the controlled coefficients that impose the desired constraint on the NE. However, this requires knowing the players' reward functions and their action sets. Obtaining this game structure information is infeasible in a large-scale network and violates the users' privacy. To overcome this, we propose a simple algorithm that learns to shift the NE of the game to meet the linear constraints by adjusting the controlled coefficients online. Our algorithm only requires the linear constraints violation as feedback and does not need to know the reward functions or the action sets. We prove that our algorithm, which is based on two time-scale stochastic approximation, guarantees convergence with probability 1 to the set of NE that meet target linear constraints. We then provide a mean square convergence rate of $O(t^{-1/4})$ for our algorithm. This is the first such bound for two time-scale stochastic approximation where the slower time-scale is a fixed point iteration with a non-expansive map. We demonstrate how our scheme can be applied to optimizing a global quadratic cost at NE and load balancing in resource allocation games. We provide simulations of our algorithm for these scenarios.
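
The two time-scale idea in this abstract can be illustrated with a small toy sketch. This is not the authors' algorithm: the quadratic rewards, the single scalar coefficient `alpha`, the step-size exponents, and the noise-free updates are all assumptions chosen for illustration. Each hypothetical player $i$ has utility $-\tfrac{1}{2}a_i x_i^2 + b_i x_i - \alpha x_i$ and takes gradient steps on the fast time-scale, while the manager moves `alpha` on the slow time-scale using only the violation of the constraint $\sum_i x_i = \text{target}$. For the numbers below, the NE that meets the constraint is $x = (2, 1, 0.5)$ with $\alpha = 2$.

```python
import numpy as np

def run_two_timescale(a, b, target, steps=20000):
    """Toy two time-scale loop (illustrative assumptions, not the paper's method).

    Players: gradient steps on u_i = -0.5*a_i*x_i**2 + b_i*x_i - alpha*x_i (fast).
    Manager: adjusts alpha using only the constraint violation sum(x) - target (slow).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(a)  # player actions
    alpha = 0.0           # coefficient controlled by the manager
    for t in range(steps):
        fast = 0.2 / (t + 1) ** 0.6  # player step size (decays slowly)
        slow = 0.5 / (t + 1) ** 0.9  # manager step size (decays faster)
        violation = x.sum() - target     # the only feedback the manager sees
        grad = -a * x + b - alpha        # each player's own utility gradient
        x = x + fast * grad              # fast time-scale: players move toward NE
        alpha = alpha + slow * violation # slow time-scale: manager shifts the NE
    return x, alpha

x, alpha = run_two_timescale(a=[1.0, 2.0, 4.0], b=[4.0, 4.0, 4.0], target=3.5)
print("actions:", x, "alpha:", alpha)
```

The manager never reads `a` or `b`; it only observes the violation, which mirrors the feedback assumption stated in the abstract. The abstract's setting is stochastic, whereas this sketch is deterministic to keep it short.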

Submitted 29 June, 2024; originally announced July 2024.

Comments: Submitted to IEEE Transactions on Automatic Control

arXiv:2406.18000 [pdf, other]

Tiered Service Architecture for Remote Patient Monitoring

Authors: Siddharth Chandak, Isha Thapa, Nicholas Bambos, David Scheinker

Abstract: We develop a remote patient monitoring (RPM) service architecture, which has two tiers of monitoring: ordinary and intensive. The patient's health state improves or worsens in each time period according to certain probabilities, which depend on the monitoring tier. The patient incurs a "loss of quality of life" cost or an "invasiveness" cost, which is higher under intensive monitoring than under ordinary. On the other hand, their health improves faster under intensive monitoring than under ordinary. In each period, the service decides which monitoring tier to use, based on the health of the patient. We investigate the optimal policy for making that choice by formulating the problem using dynamic programming. We first provide analytic conditions for selecting ordinary vs intensive monitoring in the asymptotic regime where the number of health states is large. In the general case, we investigate the optimal policy numerically. We observe a threshold behavior, that is, when the patient's health drops below a certain threshold the service switches them to intensive monitoring, while ordinary monitoring is used during adequately good health states of the patient. The modeling and analysis provides a general framework for managing RPM services for various health conditions with medically/clinically defined system parameters.
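
A minimal sketch of the dynamic programming formulation described in this abstract, under assumptions that are not taken from the paper: a discounted cost, a birth-death health chain in which only the improvement probability depends on the tier, a health cost that is linear in the distance from the best state, and made-up parameter values. The function name `solve_rpm` and all of its arguments are hypothetical.

```python
import numpy as np

def solve_rpm(n_states=10, p_up=(0.3, 0.6), p_down=0.2,
              tier_cost=(0.0, 1.5), gamma=0.95, iters=2000):
    """Value iteration for a toy two-tier monitoring model (assumed parameters).

    States 0..n_states-1, larger means healthier.
    Action 0 = ordinary monitoring, action 1 = intensive monitoring.
    """
    V = np.zeros(n_states)
    Q = np.zeros((n_states, 2))
    for _ in range(iters):
        for s in range(n_states):
            up = min(s + 1, n_states - 1)   # health cannot exceed the best state
            down = max(s - 1, 0)            # or drop below the worst state
            health_cost = float(n_states - 1 - s)  # worse health costs more
            for a in (0, 1):
                stay = 1.0 - p_up[a] - p_down
                expected_next = p_up[a] * V[up] + p_down * V[down] + stay * V[s]
                Q[s, a] = health_cost + tier_cost[a] + gamma * expected_next
        V = Q.min(axis=1)  # Bellman update: keep the cheaper tier in each state
    policy = Q.argmin(axis=1)
    return V, policy

V, policy = solve_rpm()
# A threshold policy uses intensive monitoring (1) only below some health level.
is_threshold = bool(np.all(np.diff(policy) <= 0))
print("policy by health state:", policy, "threshold-shaped:", is_threshold)
```

In this toy model the best health state always selects ordinary monitoring, because health cannot improve further and intensive monitoring only adds cost there. Whether the computed policy has the threshold shape reported in the abstract depends on the assumed parameters, so the sketch prints that check instead of asserting it.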

Submitted 25 June, 2024; originally announced June 2024.

Comments: Submitted to IEEE Healthcom 2024. 7 pages

arXiv:1606.04136 [pdf, other]

Myopic Policies for Non-Preemptive Scheduling of Jobs with Decaying Value

Authors: Neal Master, Carri W. Chan, Nicholas Bambos

Abstract: In many scheduling applications, minimizing delays is of high importance. One adverse effect of such delays is that the reward for completion of a job may decay over time. Indeed in healthcare settings, delays in access to care can result in worse outcomes, such as an increase in mortality risk. Motivated by managing hospital operations in disaster scenarios, as well as other applications in perishable inventory control and information services, we consider non-preemptive scheduling of jobs whose internal value decays over time. Because solving for the optimal scheduling policy is computationally intractable, we focus our attention on the performance of three intuitive heuristics: (1) a policy which maximizes the expected immediate reward, (2) a policy which maximizes the expected immediate reward rate, and (3) a policy which prioritizes jobs with imminent deadlines. We provide performance guarantees for all three policies and show that many of these performance bounds are tight. In addition, we provide numerical experiments and simulations to compare how the policies perform in a variety of scenarios. Our theoretical and numerical results allow us to establish rules-of-thumb for applying these heuristics in a variety of situations, including patient scheduling scenarios.
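
The three heuristics named in this abstract can be sketched on a toy instance. The modeling choices here are assumptions for illustration and may differ from the paper's definitions: job values decay linearly, service times are deterministic, and a job's "deadline" is taken to be the time at which its value reaches zero. The `Job` class and the `pick_*` functions are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Job:
    name: str
    initial_value: float
    decay_per_unit_time: float
    service_time: float  # deterministic here; an assumption for simplicity

    def value_at(self, t: float) -> float:
        # Linearly decaying value, floored at zero (assumed decay model).
        return max(0.0, self.initial_value - self.decay_per_unit_time * t)

    def deadline(self) -> float:
        # Time at which the job's value reaches zero.
        return self.initial_value / self.decay_per_unit_time

def reward_if_started(job: Job, now: float) -> float:
    # Non-preemptive: the reward is collected when service completes.
    return job.value_at(now + job.service_time)

def pick_max_reward(jobs, now):
    # Heuristic (1): maximize the immediate reward.
    return max(jobs, key=lambda j: reward_if_started(j, now))

def pick_max_reward_rate(jobs, now):
    # Heuristic (2): maximize the immediate reward per unit of service time.
    return max(jobs, key=lambda j: reward_if_started(j, now) / j.service_time)

def pick_earliest_deadline(jobs, now):
    # Heuristic (3): prioritize the job whose deadline is most imminent.
    return min(jobs, key=lambda j: j.deadline())

jobs = [
    Job("A", initial_value=10.0, decay_per_unit_time=1.0, service_time=5.0),
    Job("B", initial_value=4.0, decay_per_unit_time=0.5, service_time=1.0),
    Job("C", initial_value=6.0, decay_per_unit_time=2.0, service_time=2.0),
]
for rule in (pick_max_reward, pick_max_reward_rate, pick_earliest_deadline):
    print(rule.__name__, "->", rule(jobs, 0.0).name)
```

On this instance the three rules disagree: job A yields the largest reward on completion (5.0), job B the largest reward rate (3.5 per unit time), and job C the earliest deadline (3.0). Instances like this are why the policies need separate performance guarantees.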

Submitted 21 October, 2016; v1 submitted 13 June, 2016; originally announced June 2016.

Comments: Accepted for publication in Probability in the Engineering and Informational Sciences

arXiv:1105.0417 [pdf, ps, other]

Cone Schedules for Processing Systems in Fluctuating Environments

Authors: Kevin Ross, Nicholas Bambos, George Michailidis

Abstract: We consider a generalized processing system having several queues, where the available service rate combinations are fluctuating over time due to reliability and availability variations. The objective is to allocate the available resources, and corresponding service rates, in response to both workload and service capacity considerations, in order to maintain the long term stability of the system. The service configurations are completely arbitrary, including negative service rates which represent forwarding and service-induced cross traffic. We employ a trace-based trajectory asymptotic technique, which requires minimal assumptions about the arrival dynamics of the system. We prove that cone schedules, which leverage the geometry of the queueing dynamics, maximize the system throughput for a broad class of processing systems, even under adversarial arrival processes. We study the impact of fluctuating service availability, where resources are available only some of the time, and the schedule must dynamically respond to the changing available service rates, establishing both the capacity of such systems and the class of schedules which will stabilize the system at full capacity. The rich geometry of the system dynamics leads to important insights for stability, performance and scalability, and substantially generalizes previous findings. The processing system studied here models a broad variety of computer, communication and service networks, including varying channel conditions and cross-traffic in wireless networking, and call centers with fluctuating capacity. The findings have implications for bandwidth and processor allocation in communication networks and workforce scheduling in congested call centers.

Submitted 2 May, 2011; originally announced May 2011.

Comments: 25 pages, 5 figures

arXiv:1101.0011 [pdf, ps, other]

Packet Scheduling in Switches with Target Outflow Profiles

Authors: Aditya Dua, Nicholas Bambos

Abstract: The problem of packet scheduling for traffic streams with target outflow profiles traversing input queued switches is formulated in this paper. Target outflow profiles specify the desirable inter-departure times of packets leaving the switch from each traffic stream. The goal of the switch scheduler is to dynamically select service configurations of the switch, so that actual outflow streams ("pulled" through the switch) adhere to their desired target profiles as accurately as possible. Dynamic service controls (schedules) are developed to minimize deviation of actual outflow streams from their targets and suppress stream "distortion". Using appropriately selected subsets of service configurations of the switch, efficient schedules are designed, which deliver high performance at relatively low complexity. Some of these schedules are provably shown to achieve 100% pull-throughput. Moreover, simulations demonstrate that for even substantial contention of streams through the switch, due to stringent/intense target outflow profiles, the proposed schedules achieve closely their target profiles and suppress stream distortion. The switch model investigated here deviates from the classical switching paradigm. In the latter, the goal of packet scheduling is primarily to "push" as much traffic load through the switch as possible, while controlling delay to traverse the switch and keeping congestion/backlogs from exploding. In the model presented here, however, the goal of packet scheduling is to "pull" traffic streams through the switch, maintaining desirable (target) outflow profiles.

Submitted 29 December, 2010; originally announced January 2011.

Comments: 33 pages, 10 figures

Showing 1–5 of 5 results for author: Bambos, N