Showing 1–1 of 1 results for author: Davar, P

  1. arXiv:2406.15612 [pdf, other]

    cs.LG q-fin.RM

    Catastrophic-risk-aware reinforcement learning with extreme-value-theory-based policy gradients

    Authors: Parisa Davar, Frédéric Godin, Jose Garrido

    Abstract: This paper tackles the problem of mitigating catastrophic risk (risk with very low frequency but very high severity) in the context of a sequential decision-making process. This problem is particularly challenging due to the scarcity of observations in the far tail of the distribution of cumulative costs (negative rewards). A policy gradient algorithm, which we call POTPG, is developed. It…

    Submitted 28 June, 2024; v1 submitted 21 June, 2024; originally announced June 2024.

    Comments: The Python code to replicate the various numerical experiments of this paper is available at https://github.com/parisadavar/EVT-policy-gradient-RL