Showing 1–6 of 6 results for author: Chen, R Y

Searching in archive cs.
  1. arXiv:1802.04821  [pdf, other]

    cs.LG cs.AI

    Evolved Policy Gradients

    Authors: Rein Houthooft, Richard Y. Chen, Phillip Isola, Bradly C. Stadie, Filip Wolski, Jonathan Ho, Pieter Abbeel

    Abstract: We propose a metalearning approach for learning gradient-based reinforcement learning (RL) algorithms. The idea is to evolve a differentiable loss function, such that an agent, which optimizes its policy to minimize this loss, will achieve high rewards. The loss is parametrized via temporal convolutions over the agent's experience. Because this loss is highly flexible in its ability to take into a…

    Submitted 29 April, 2018; v1 submitted 13 February, 2018; originally announced February 2018.

  2. arXiv:1709.07223  [pdf, other]

    cs.CV cs.AI cs.LG physics.optics

    Convolutional neural networks that teach microscopes how to image

    Authors: Roarke Horstmeyer, Richard Y. Chen, Barbara Kappes, Benjamin Judkewitz

    Abstract: Deep learning algorithms offer a powerful means to automatically analyze the content of medical images. However, many biological samples of interest are primarily transparent to visible light and contain features that are difficult to resolve with a standard optical microscope. Here, we use a convolutional neural network (CNN) not only to classify images, but also to optimize the physical layout o…

    Submitted 21 September, 2017; originally announced September 2017.

    Comments: 13 pages, 6 figures

  3. arXiv:1709.04326  [pdf, other]

    cs.AI cs.GT

    Learning with Opponent-Learning Awareness

    Authors: Jakob N. Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch

    Abstract: Multi-agent settings are quickly gathering importance in machine learning. This includes a plethora of recent work on deep multi-agent reinforcement learning, but also can be extended to hierarchical RL, generative adversarial networks and decentralised optimisation. In all these settings the presence of multiple learning agents renders the training problem non-stationary and often leads to unstab…

    Submitted 19 September, 2018; v1 submitted 13 September, 2017; originally announced September 2017.

  4. arXiv:1706.01905  [pdf, other]

    cs.LG cs.AI cs.NE cs.RO stat.ML

    Parameter Space Noise for Exploration

    Authors: Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, Marcin Andrychowicz

    Abstract: Deep reinforcement learning (RL) methods generally engage in exploratory behavior through noise injection in the action space. An alternative is to add noise directly to the agent's parameters, which can lead to more consistent exploration and a richer set of behaviors. Methods such as evolutionary strategies use parameter perturbations, but discard all temporal structure in the process and requir…

    Submitted 31 January, 2018; v1 submitted 6 June, 2017; originally announced June 2017.

    Comments: Updated to camera-ready ICLR submission

  5. arXiv:1706.01502  [pdf, ps, other]

    cs.LG stat.ML

    UCB Exploration via Q-Ensembles

    Authors: Richard Y. Chen, Szymon Sidor, Pieter Abbeel, John Schulman

    Abstract: We show how an ensemble of $Q^*$-functions can be leveraged for more effective exploration in deep reinforcement learning. We build on well-established algorithms from the bandit setting, and adapt them to the $Q$-learning setting. We propose an exploration strategy based on upper-confidence bounds (UCB). Our experiments show significant gains on the Atari benchmark.

    Submitted 7 November, 2017; v1 submitted 5 June, 2017; originally announced June 2017.

  6. arXiv:1308.2952  [pdf, ps, other]

    cs.IT math.PR

    Subadditivity of Matrix phi-Entropy and Concentration of Random Matrices

    Authors: Richard Y. Chen, Joel A. Tropp

    Abstract: Matrix concentration inequalities provide a direct way to bound the typical spectral norm of a random matrix. The methods for establishing these results often parallel classical arguments, such as the Laplace transform method. This work develops a matrix extension of the entropy method, and it applies these ideas to obtain some matrix concentration inequalities.

    Submitted 13 August, 2013; originally announced August 2013.

    Comments: 23 pages

    MSC Class: 60B20; 60E15; 60F10

    Journal ref: Electron. J. Probab., Vol. 19, Article 27, pp. 1-30, Mar. 2014