Showing 1–9 of 9 results for author: Mhammedi, Z

  1. arXiv:2405.20540  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Fully Unconstrained Online Learning

    Authors: Ashok Cutkosky, Zakaria Mhammedi

    Abstract: We provide an online learning algorithm that obtains regret $G\|w_\star\|\sqrt{T\log(\|w_\star\|G\sqrt{T})} + \|w_\star\|^2 + G^2$ on $G$-Lipschitz convex losses for any comparison point $w_\star$ without knowing either $G$ or $\|w_\star\|$. Importantly, this matches the optimal bound $G\|w_\star\|\sqrt{T}$ available with such knowledge (up to logarithmic factors), unless either $\|w_\star\|$ or…

    Submitted 30 May, 2024; originally announced May 2024.

  2. arXiv:2404.15417  [pdf, other]

    cs.LG cs.AI stat.ML

    The Power of Resets in Online Reinforcement Learning

    Authors: Zakaria Mhammedi, Dylan J. Foster, Alexander Rakhlin

    Abstract: Simulators are a pervasive tool in reinforcement learning, but most existing algorithms cannot efficiently exploit simulator access -- particularly in high-dimensional domains that require general function approximation. We explore the power of simulators through online reinforcement learning with local simulator access (or, local planning), an RL protocol where the agent is allowed to reset to…

    Submitted 26 April, 2024; v1 submitted 23 April, 2024; originally announced April 2024.

    Comments: Fixed a small typo

  3. arXiv:2011.14126  [pdf, other]

    cs.LG math.ST stat.ML

    Risk-Monotonicity in Statistical Learning

    Authors: Zakaria Mhammedi

    Abstract: Acquisition of data is a difficult task in many applications of machine learning, and it is only natural that one hopes and expects the population risk to decrease (better performance) monotonically with increasing data points. It turns out, somewhat surprisingly, that this is not the case even for the most standard algorithms that minimize the empirical risk. Non-monotonic behavior of the risk an…

    Submitted 15 January, 2022; v1 submitted 28 November, 2020; originally announced November 2020.

    Comments: To appear in NeurIPS 2021 as an oral presentation

  4. arXiv:2010.03799  [pdf, ps, other]

    cs.LG math.OC math.ST stat.ML

    Learning the Linear Quadratic Regulator from Nonlinear Observations

    Authors: Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

    Abstract: We introduce a new problem setting for continuous control called the LQR with Rich Observations, or RichLQR. In our setting, the environment is summarized by a low-dimensional continuous latent state with linear dynamics and quadratic costs, but the agent operates on high-dimensional, nonlinear observations such as images from a camera. To enable sample-efficient learning, we assume that the learn…

    Submitted 8 October, 2020; originally announced October 2020.

    Comments: To appear at NeurIPS 2020

  5. arXiv:2006.14763  [pdf, ps, other]

    cs.LG stat.ML

    PAC-Bayesian Bound for the Conditional Value at Risk

    Authors: Zakaria Mhammedi, Benjamin Guedj, Robert C. Williamson

    Abstract: Conditional Value at Risk (CVaR) is a family of "coherent risk measures" which generalize the traditional mathematical expectation. Widely used in mathematical finance, it is garnering increasing interest in machine learning, e.g., as an alternate approach to regularization, and as a means for ensuring fairness. This paper presents a generalization bound for learning algorithms that minimize the C…

    Submitted 25 June, 2020; originally announced June 2020.

    Journal ref: NeurIPS 2020

  6. arXiv:2002.12242  [pdf, other]

    cs.LG stat.ML

    Lipschitz and Comparator-Norm Adaptivity in Online Learning

    Authors: Zakaria Mhammedi, Wouter M. Koolen

    Abstract: We study Online Convex Optimization in the unbounded setting where neither predictions nor gradients are constrained. The goal is to simultaneously adapt to both the sequence of gradients and the comparator. We first develop parameter-free and scale-free algorithms for a simplified setting with hints. We present two versions: the first adapts to the squared norms of both comparator and gradients se…

    Submitted 8 August, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

    Comments: 30 pages, 1 figure

  7. arXiv:1905.13367  [pdf, ps, other]

    cs.LG stat.ML

    PAC-Bayes Un-Expected Bernstein Inequality

    Authors: Zakaria Mhammedi, Peter D. Grunwald, Benjamin Guedj

    Abstract: We present a new PAC-Bayesian generalization bound. Standard bounds contain a $\sqrt{L_n \cdot \KL/n}$ complexity term which dominates unless $L_n$, the empirical error of the learning algorithm's randomized predictions, vanishes. We manage to replace $L_n$ by a term which vanishes in many more situations, essentially whenever the employed learning algorithm is sufficiently stable on the dataset a…

    Submitted 3 November, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

    Comments: 24 pages, 6 figures. To appear in NeurIPS 2019

    Journal ref: NeurIPS 2019

  8. arXiv:1902.10797  [pdf, ps, other]

    cs.LG stat.ML

    Lipschitz Adaptivity with Multiple Learning Rates in Online Learning

    Authors: Zakaria Mhammedi, Wouter M. Koolen, Tim van Erven

    Abstract: We aim to design adaptive online learning algorithms that take advantage of any special structure that might be present in the learning task at hand, with as little manual tuning by the user as possible. A fundamental obstacle that comes up in the design of such adaptive algorithms is to calibrate a so-called step-size or learning rate hyperparameter depending on variance, gradient norms, etc. A r…

    Submitted 30 May, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: 22 pages. To appear in COLT 2019

  9. arXiv:1703.01460  [pdf, other]

    cs.LG cs.HC stat.ML

    Adversarial Generation of Real-time Feedback with Neural Networks for Simulation-based Training

    Authors: Xingjun Ma, Sudanthi Wijewickrema, Shuo Zhou, Yun Zhou, Zakaria Mhammedi, Stephen O'Leary, James Bailey

    Abstract: Simulation-based training (SBT) is gaining popularity as a low-cost and convenient training technique in a vast range of applications. However, for an SBT platform to be fully utilized as an effective training tool, it is essential that feedback on performance is provided automatically in real-time during training. It is the aim of this paper to develop an efficient and effective feedback generatio…

    Submitted 23 May, 2017; v1 submitted 4 March, 2017; originally announced March 2017.

    Comments: Appeared in the Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), Melbourne, 2017