Showing 1–5 of 5 results for author: Kostas, J

  1. arXiv:2305.09838  [pdf, other]

    cs.LG cs.AI

    Coagent Networks: Generalized and Scaled

    Authors: James E. Kostas, Scott M. Jordan, Yash Chandak, Georgios Theocharous, Dhawal Gupta, Martha White, Bruno Castro da Silva, Philip S. Thomas

    Abstract: Coagent networks for reinforcement learning (RL) [Thomas and Barto, 2011] provide a powerful and flexible framework for deriving principled learning rules for arbitrary stochastic neural networks. The coagent framework offers an alternative to backpropagation-based deep learning (BDL) that overcomes some of backpropagation's main limitations. For example, coagent networks can compute different par…

    Submitted 16 May, 2023; originally announced May 2023.

  2. arXiv:2112.05812  [pdf, other]

    cs.LG

    Edge-Compatible Reinforcement Learning for Recommendations

    Authors: James E. Kostas, Philip S. Thomas, Georgios Theocharous

    Abstract: Most reinforcement learning (RL) recommendation systems designed for edge computing must either synchronize during recommendation selection or depend on an unprincipled patchwork collection of algorithms. In this work, we build on asynchronous coagent policy gradient algorithms [Kostas et al., 2020] to propose a principled solution to this problem. The class of algorithms that we propose…

    Submitted 10 August, 2022; v1 submitted 10 December, 2021; originally announced December 2021.

  3. arXiv:1906.03063  [pdf, ps, other]

    cs.LG stat.ML

    Classical Policy Gradient: Preserving Bellman's Principle of Optimality

    Authors: Philip S. Thomas, Scott M. Jordan, Yash Chandak, Chris Nota, James Kostas

    Abstract: We propose a new objective function for finite-horizon episodic Markov decision processes that better captures Bellman's principle of optimality, and provide an expression for the gradient of the objective.

    Submitted 6 June, 2019; originally announced June 2019.

    Comments: 1 page, 0 figures

  4. arXiv:1902.05650  [pdf, other]

    cs.LG stat.ML

    Asynchronous Coagent Networks

    Authors: James E. Kostas, Chris Nota, Philip S. Thomas

    Abstract: Coagent policy gradient algorithms (CPGAs) are reinforcement learning algorithms for training a class of stochastic neural networks called coagent networks. In this work, we prove that CPGAs converge to locally optimal policies. Additionally, we extend prior theory to encompass asynchronous and recurrent coagent networks. These extensions facilitate the straightforward design and analysis of hiera…

    Submitted 10 August, 2020; v1 submitted 14 February, 2019; originally announced February 2019.

    Comments: Updated version

  5. arXiv:1902.00183  [pdf, other]

    cs.LG stat.ML

    Learning Action Representations for Reinforcement Learning

    Authors: Yash Chandak, Georgios Theocharous, James Kostas, Scott Jordan, Philip S. Thomas

    Abstract: Most model-free reinforcement learning methods leverage state representations (embeddings) for generalization, but either ignore structure in the space of actions or assume the structure is provided a priori. We show how a policy can be decomposed into a component that acts in a low-dimensional space of action representations and a component that transforms these representations into actual action…

    Submitted 14 May, 2019; v1 submitted 31 January, 2019; originally announced February 2019.

    Comments: In Proceedings of the 36th International Conference on Machine Learning (ICML 2019)