Showing 1–3 of 3 results for author: Urpí, N A

  1. arXiv:2405.18917  [pdf, other]

    cs.LG cs.AI cs.RO

    Causal Action Influence Aware Counterfactual Data Augmentation

    Authors: Núria Armengol Urpí, Marco Bagatella, Marin Vlastelica, Georg Martius

    Abstract: Offline data are both valuable and practical resources for teaching robots complex behaviors. Ideally, learning agents should not be constrained by the scarcity of available demonstrations, but rather generalize beyond the training distribution. However, the complexity of real-world scenarios typically requires huge amounts of data to prevent neural network policies from picking up on spurious cor…

    Submitted 29 May, 2024; originally announced May 2024.

    Comments: Accepted in 41st International Conference on Machine Learning (ICML 2024)

  2. arXiv:2303.09628  [pdf, other]

    cs.LG cs.RO

    Efficient Learning of High Level Plans from Play

    Authors: Núria Armengol Urpí, Marco Bagatella, Otmar Hilliges, Georg Martius, Stelian Coros

    Abstract: Real-world robotic manipulation tasks remain an elusive challenge, since they involve both fine-grained environment interaction, as well as the ability to plan for long-horizon goals. Although deep reinforcement learning (RL) methods have shown encouraging results when planning end-to-end in high-dimensional environments, they remain fundamentally limited by poor sample efficiency due to inefficie…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: Accepted to the International Conference on Robotics and Automation 2023

  3. arXiv:2102.05371  [pdf, other]

    cs.LG

    Risk-Averse Offline Reinforcement Learning

    Authors: Núria Armengol Urpí, Sebastian Curi, Andreas Krause

    Abstract: Training Reinforcement Learning (RL) agents in high-stakes applications might be too prohibitive due to the risk associated to exploration. Thus, the agent can only use data previously collected by safe policies. While previous work considers optimizing the average performance using offline data, we focus on optimizing a risk-averse criteria, namely the CVaR. In particular, we present the Offline…

    Submitted 10 February, 2021; originally announced February 2021.