Showing 1–1 of 1 results for author: Maniyar, M P

  1. arXiv:2304.10951

    cs.LG math.OC stat.ML

    A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning

    Authors: Mizhaan Prajit Maniyar, Akash Mondal, Prashanth L. A., Shalabh Bhatnagar

    Abstract: We consider the problem of control in the setting of reinforcement learning (RL), where model information is not available. Policy gradient algorithms are a popular solution approach for this problem and are usually shown to converge to a stationary point of the value function. In this paper, we propose two policy Newton algorithms that incorporate cubic regularization. Both algorithms employ the…

    Submitted 21 April, 2023; originally announced April 2023.