
Showing 1–10 of 10 results for author: Meyn, S

  1. arXiv:2405.17834  [pdf, other]

    math.ST stat.ML

    Revisiting Step-Size Assumptions in Stochastic Approximation

    Authors: Caio Kalil Lauand, Sean Meyn

    Abstract: Many machine learning and optimization algorithms are built upon the framework of stochastic approximation (SA), for which the selection of step-size (or learning rate) is essential for success. For the sake of clarity, this paper focuses on the special case $α_n = α_0 n^{-ρ}$ at iteration $n$, with $ρ\in [0,1]$ and $α_0>0$ design parameters. It is most common in practice to take $ρ=0$ (constant s…

    Submitted 3 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

    Comments: 30 pages, 5 figures

    MSC Class: 62L20; 68T05

  2. arXiv:2309.02944  [pdf, other]

    math.ST stat.ML

    The Curse of Memory in Stochastic Approximation: Extended Version

    Authors: Caio Kalil Lauand, Sean Meyn

    Abstract: Theory and application of stochastic approximation (SA) has grown within the control systems community since the earliest days of adaptive control. This paper takes a new look at the topic, motivated by recent results establishing remarkable performance of SA with (sufficiently small) constant step-size $α>0$. If averaging is implemented to obtain the final parameter estimate, then the estimates a…

    Submitted 17 September, 2023; v1 submitted 6 September, 2023; originally announced September 2023.

    Comments: 21 pages, 4 figures

    MSC Class: 62L20; 68T05

  3. arXiv:2002.10301  [pdf, other]

    cs.LG eess.SY stat.ML

    Q-learning with Uniformly Bounded Variance: Large Discounting is Not a Barrier to Fast Learning

    Authors: Adithya M. Devraj, Sean P. Meyn

    Abstract: Sample complexity bounds are a common performance metric in the Reinforcement Learning literature. In the discounted cost, infinite horizon setting, all of the known bounds have a factor that is a polynomial in $1/(1-γ)$, where $γ< 1$ is the discount factor. For a large discount factor, these bounds seem to imply that a very large number of samples is required to achieve an $\varepsilon$-optimal p…

    Submitted 7 July, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: 33 pages, 4 figures

  4. arXiv:2002.02584  [pdf, other]

    math.PR cs.LG eess.SY math.OC math.ST stat.ML

    Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation

    Authors: Shuhang Chen, Adithya M. Devraj, Ana Bušić, Sean Meyn

    Abstract: This paper concerns error bounds for recursive equations subject to Markovian disturbances. Motivating examples abound within the fields of Markov chain Monte Carlo (MCMC) and Reinforcement Learning (RL), and many of these algorithms can be interpreted as special cases of stochastic approximation (SA). It is argued that it is not possible in general to obtain a Hoeffding bound on the error sequenc…

    Submitted 6 February, 2020; originally announced February 2020.

  5. arXiv:1910.05405  [pdf, other]

    cs.LG eess.SY stat.ML

    Zap Q-Learning With Nonlinear Function Approximation

    Authors: Shuhang Chen, Adithya M. Devraj, Fan Lu, Ana Bušić, Sean P. Meyn

    Abstract: Zap Q-learning is a recent class of reinforcement learning algorithms, motivated primarily as a means to accelerate convergence. Stability theory has been absent outside of two restrictive classes: the tabular setting, and optimal stopping. This paper introduces a new framework for analysis of a more general class of recursive algorithms known as stochastic approximation. Based on this general the…

    Submitted 15 July, 2020; v1 submitted 11 October, 2019; originally announced October 2019.

  6. arXiv:1812.11137  [pdf, other]

    cs.LG eess.SY math.OC stat.ML

    Differential Temporal Difference Learning

    Authors: Adithya M. Devraj, Ioannis Kontoyiannis, Sean P. Meyn

    Abstract: Value functions derived from Markov decision processes arise as a central component of algorithms as well as performance metrics in many statistics and engineering applications of machine learning techniques. Computation of the solution to the associated Bellman equations is challenging in most practical cases of interest. A popular class of approximation techniques, known as Temporal Difference (…

    Submitted 27 February, 2020; v1 submitted 28 December, 2018; originally announced December 2018.

    Comments: Preliminary versions of some of the results in this article were submitted as arXiv:1604.01828

    MSC Class: 93E20; 93E35; 60J20

  7. arXiv:1808.01665  [pdf, other]

    stat.ME

    Diffusion approximations and control variates for MCMC

    Authors: Nicolas Brosse, Alain Durmus, Sean Meyn, Eric Moulines, Anand Radhakrishnan

    Abstract: A new methodology is presented for the construction of control variates to reduce the variance of additive functionals of Markov Chain Monte Carlo (MCMC) samplers. Our control variates are defined through the minimization of the asymptotic variance of the Langevin diffusion over a family of functions, which can be seen as a quadratic risk minimization procedure. The use of these control variates is…

    Submitted 8 July, 2019; v1 submitted 5 August, 2018; originally announced August 2018.

  8. arXiv:1609.00051  [pdf, other]

    eess.SY stat.AP

    Estimation and Control of Quality of Service in Demand Dispatch

    Authors: Yue Chen, Ana Bušić, Sean Meyn

    Abstract: It is now well known that flexibility of energy consumption can be harnessed for the purposes of grid-level ancillary services. In particular, through distributed control of a collection of loads, a balancing authority regulation signal can be tracked accurately, while ensuring that the quality of service (QoS) for each load is acceptable {\it on average}. In this paper it is argued that a histogr…

    Submitted 31 August, 2016; originally announced September 2016.

    Comments: Submitted for publication, August 2016. arXiv admin note: text overlap with arXiv:1409.6941

    MSC Class: 60J20; 68M20

  9. arXiv:1604.04013  [pdf, other]

    cs.PF cs.IT stat.AP

    Ergodic Theory for Controlled Markov Chains with Stationary Inputs

    Authors: Yue Chen, Ana Bušić, Sean Meyn

    Abstract: Consider a stochastic process $\{X(t)\}$ on a finite state space $ {\sf X}=\{1,\dots, d\}$. It is conditionally Markov, given a real-valued `input process' $\{ζ(t)\}$. This is assumed to be small, which is modeled through the scaling, \[ ζ_t = \varepsilon ζ^1_t, \qquad 0\le \varepsilon \le 1\,, \] where $\{ζ^1(t)\}$ is a bounded stationary process. The following conclusions are obtained, subject t…

    Submitted 18 June, 2016; v1 submitted 13 April, 2016; originally announced April 2016.

    MSC Class: 60J20; 60G10; 68M20; 94A15

  10. arXiv:math/0612040  [pdf, ps, other]

    math.PR math.ST stat.CO

    Computable exponential bounds for screened estimation and simulation

    Authors: Ioannis Kontoyiannis, Sean P. Meyn

    Abstract: Suppose the expectation $E(F(X))$ is to be estimated by the empirical averages of the values of $F$ on independent and identically distributed samples $\{X_i\}$. A sampling rule called the "screened" estimator is introduced, and its performance is studied. When the mean $E(U(X))$ of a different function $U$ is known, the estimates are "screened," in that we only consider those which correspond t…

    Submitted 22 August, 2008; v1 submitted 1 December, 2006; originally announced December 2006.

    Comments: Published at http://dx.doi.org/10.1214/00-AAP492 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AAP-AAP492

    MSC Class: 60C05; 60F10 (Primary) 60G05; 60E15 (Secondary)

    Journal ref: Annals of Applied Probability 2008, Vol. 18, No. 4, 1491-1518