Showing 1–3 of 3 results for author: Jedra, Y

  1. arXiv:2208.08480

    cs.LG math.ST stat.ML

    Nearly Optimal Latent State Decoding in Block MDPs

    Authors: Yassir Jedra, Junghyun Lee, Alexandre Proutière, Se-Young Yun

    Abstract: We investigate the problems of model estimation and reward-free learning in episodic Block MDPs. In these MDPs, the decision maker has access to rich observations or contexts generated from a small number of latent states. We are first interested in estimating the latent state decoding function (the mapping from the observations to latent states) based on data generated under a fixed behavior poli…

    Submitted 24 February, 2023; v1 submitted 17 August, 2022; originally announced August 2022.

    Comments: Y. Jedra and J. Lee contributed equally; 100 pages, 3 figures; Accepted to the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023)

  2. arXiv:2109.14429

    cs.LG eess.SY math.OC stat.ML

    Minimal Expected Regret in Linear Quadratic Control

    Authors: Yassir Jedra, Alexandre Proutiere

    Abstract: We consider the problem of online learning in Linear Quadratic Control systems whose state transition and state-action transition matrices $A$ and $B$ may be initially unknown. We devise an online learning algorithm and provide guarantees on its expected regret. This regret at time $T$ is upper bounded (i) by $\widetilde{O}((d_u+d_x)\sqrt{d_xT})$ when $A$ and $B$ are unknown, (ii) by…

    Submitted 29 September, 2021; originally announced September 2021.

  3. arXiv:2003.07937

    math.ST cs.LG eess.SY stat.ML

    Finite-time Identification of Stable Linear Systems: Optimality of the Least-Squares Estimator

    Authors: Yassir Jedra, Alexandre Proutiere

    Abstract: We present a new finite-time analysis of the estimation error of the Ordinary Least Squares (OLS) estimator for stable linear time-invariant systems. We characterize the number of observed samples (the length of the observed trajectory) sufficient for the OLS estimator to be $(\varepsilon,\delta)$-PAC, i.e., to yield an estimation error less than $\varepsilon$ with probability at least $1-\delta$. We show t…

    Submitted 26 March, 2020; v1 submitted 17 March, 2020; originally announced March 2020.