Showing 1–2 of 2 results for author: Lesner, B

  1. arXiv:1304.5610  [pdf, other]

    math.OC cs.AI

    Tight Performance Bounds for Approximate Modified Policy Iteration with Non-Stationary Policies

    Authors: Boris Lesner, Bruno Scherrer

    Abstract: We consider approximate dynamic programming for the infinite-horizon stationary $γ$-discounted optimal control problem formalized by Markov Decision Processes. While in the exact case it is known that there always exists an optimal policy that is stationary, we show that when using value function approximation, looking for a non-stationary policy may lead to a better performance guarantee. We defi…

    Submitted 20 April, 2013; originally announced April 2013.

  2. arXiv:1211.6898  [pdf, ps, other]

    cs.LG cs.AI

    On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes

    Authors: Bruno Scherrer, Boris Lesner

    Abstract: We consider infinite-horizon stationary $γ$-discounted Markov Decision Processes, for which it is known that there exists a stationary optimal policy. Using Value and Policy Iteration with some error $ε$ at each iteration, it is well-known that one can compute stationary policies that are $\frac{2γ}{(1-γ)^2}ε$-optimal. After arguing that this guarantee is tight, we develop variations of Value and…

    Submitted 29 November, 2012; originally announced November 2012.

    Journal ref: NIPS 2012 (2012)