Showing 1–4 of 4 results for author: Prasad, H L

  1. arXiv:1507.07984  [pdf, ps, other]

    cs.LG math.OC

    A constrained optimization perspective on actor critic algorithms and application to network routing

    Authors: Prashanth L. A., H. L. Prasad, Shalabh Bhatnagar, Prakash Chandra

    Abstract: We propose a novel actor-critic algorithm with guaranteed convergence to an optimal policy for a discounted reward Markov decision process. The actor incorporates a descent direction that is motivated by the solution of a certain non-linear optimization problem. We also discuss an extension to incorporate function approximation and demonstrate the practicality of our algorithms on a network routin…

    Submitted 28 July, 2015; originally announced July 2015.

  2. arXiv:1507.00093  [pdf, ps, other]

    cs.LG cs.GT

    A Study of Gradient Descent Schemes for General-Sum Stochastic Games

    Authors: H. L. Prasad, Shalabh Bhatnagar

    Abstract: Zero-sum stochastic games are easy to solve as they can be cast as simple Markov decision processes. This is however not the case with general-sum stochastic games. A fairly general optimization problem formulation is available for general-sum stochastic games by Filar and Vrieze [2004]. However, the optimization problem there has a non-linear objective and non-linear constraints with special stru…

    Submitted 30 June, 2015; originally announced July 2015.

  3. arXiv:1401.2086  [pdf, ps, other]

    cs.GT cs.LG stat.ML

    Actor-Critic Algorithms for Learning Nash Equilibria in N-player General-Sum Games

    Authors: H. L. Prasad, L. A. Prashanth, Shalabh Bhatnagar

    Abstract: We consider the problem of finding stationary Nash equilibria (NE) in a finite discounted general-sum stochastic game. We first generalize a non-linear optimization problem from Filar and Vrieze [2004] to an $N$-player setting and break down this problem into simpler sub-problems that ensure there is no Bellman error for a given state and an agent. We then provide a characterization of solution poi…

    Submitted 2 July, 2015; v1 submitted 8 January, 2014; originally announced January 2014.

  4. arXiv:1312.7430  [pdf, other]

    eess.SY

    Simultaneous Perturbation Methods for Adaptive Labor Staffing in Service Systems

    Authors: L. A. Prashanth, H. L. Prasad, Nirmit Desai, Shalabh Bhatnagar, Gargi Dasgupta

    Abstract: Service systems are labor intensive due to the large variation in the tasks required to address service requests from multiple customers. Aligning the staffing levels to the forecasted workloads adaptively in such systems is nontrivial because of a large number of parameters and operational variations leading to a huge search space. A challenging problem here is to optimize the staffing while main…

    Submitted 28 December, 2013; originally announced December 2013.