Showing 1–2 of 2 results for author: Howe, N H R

  1. arXiv:2209.13085  [pdf, other]

    cs.LG stat.ML

    Defining and Characterizing Reward Hacking

    Authors: Joar Skalse, Nikolaus H. R. Howe, Dmitrii Krasheninnikov, David Krueger

    Abstract: We provide the first formal definition of reward hacking, a phenomenon where optimizing an imperfect proxy reward function, $\mathcal{\tilde{R}}$, leads to poor performance according to the true reward function, $\mathcal{R}$. We say that a proxy is unhackable if increasing the expected proxy return can never decrease the expected true return. Intuitively, it might be possible to create an unhacka…

    Submitted 26 September, 2022; originally announced September 2022.

  2. arXiv:2202.10600  [pdf, other]

    cs.LG cs.AI eess.SY stat.ML

    Myriad: a real-world testbed to bridge trajectory optimization and deep learning

    Authors: Nikolaus H. R. Howe, Simon Dufort-Labbé, Nitarshan Rajkumar, Pierre-Luc Bacon

    Abstract: We present Myriad, a testbed written in JAX for learning and planning in real-world continuous environments. The primary contributions of Myriad are threefold. First, Myriad provides machine learning practitioners access to trajectory optimization techniques for application within a typical automatic differentiation workflow. Second, Myriad presents many real-world optimal control problems, rangin…

    Submitted 26 January, 2023; v1 submitted 21 February, 2022; originally announced February 2022.

    Comments: Updated to match version accepted at NeurIPS 2022