Showing 1–13 of 13 results for author: Ramponi, G

  1. arXiv:2406.18450  [pdf, other]

    cs.LG cs.AI

    Preference Elicitation for Offline Reinforcement Learning

    Authors: Alizée Pace, Bernhard Schölkopf, Gunnar Rätsch, Giorgia Ramponi

    Abstract: Applying reinforcement learning (RL) to real-world problems is often made challenging by the inability to interact with the environment and the difficulty of designing reward functions. Offline RL addresses the first challenge by considering access to an offline dataset of environment interactions labeled by the reward function. In contrast, Preference-based RL does not assume access to the reward…

    Submitted 26 June, 2024; originally announced June 2024.

  2. arXiv:2406.01575  [pdf, other]

    math.OC cs.AI cs.LG stat.ML

    Stochastic Bilevel Optimization with Lower-Level Contextual Markov Decision Processes

    Authors: Vinzenz Thoma, Barna Pasztor, Andreas Krause, Giorgia Ramponi, Yifan Hu

    Abstract: In various applications, the optimal policy in a strategic decision-making problem depends both on the environmental configuration and exogenous events. For these settings, we introduce Bilevel Optimization with Contextual Markov Decision Processes (BO-CMDP), a stochastic bilevel decision-making model, where the lower level consists of solving a contextual Markov Decision Process (CMDP). BO-CMDP c…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 54 pages, 18 figures

  3. arXiv:2402.15776  [pdf, other]

    cs.LG stat.ML

    Truly No-Regret Learning in Constrained MDPs

    Authors: Adrian Müller, Pragnya Alatur, Volkan Cevher, Giorgia Ramponi, Niao He

    Abstract: Constrained Markov decision processes (CMDPs) are a common way to model safety constraints in reinforcement learning. State-of-the-art methods for efficiently solving CMDPs are based on primal-dual algorithms. For these algorithms, all currently known regret bounds allow for error cancellations -- one can compensate for a constraint violation in one round with a strict constraint satisfaction in a…

    Submitted 18 March, 2024; v1 submitted 24 February, 2024; originally announced February 2024.

  4. arXiv:2310.07518  [pdf, other]

    cs.LG

    Exploiting Causal Graph Priors with Posterior Sampling for Reinforcement Learning

    Authors: Mirco Mutti, Riccardo De Santi, Marcello Restelli, Alexander Marx, Giorgia Ramponi

    Abstract: Posterior sampling allows exploitation of prior knowledge on the environment's transition dynamics to improve the sample efficiency of reinforcement learning. The prior is typically specified as a class of parametric distributions, the design of which can be cumbersome in practice, often resulting in the choice of uninformative priors. In this work, we propose a novel posterior sampling approach i…

    Submitted 8 April, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

  5. arXiv:2306.14799  [pdf, other]

    cs.LG cs.GT

    On Imitation in Mean-field Games

    Authors: Giorgia Ramponi, Pavel Kolev, Olivier Pietquin, Niao He, Mathieu Laurière, Matthieu Geist

    Abstract: We explore the problem of imitation learning (IL) in the context of mean-field games (MFGs), where the goal is to imitate the behavior of a population of agents following a Nash equilibrium policy according to some unknown payoff function. IL in MFGs presents new challenges compared to single-agent IL, particularly when both the reward function and the transition kernel depend on the population di…

    Submitted 26 June, 2023; originally announced June 2023.

  6. arXiv:2306.07749  [pdf, other]

    cs.LG cs.GT cs.MA

    Provably Learning Nash Policies in Constrained Markov Potential Games

    Authors: Pragnya Alatur, Giorgia Ramponi, Niao He, Andreas Krause

    Abstract: Multi-agent reinforcement learning (MARL) addresses sequential decision-making problems with multiple agents, where each agent optimizes its own objective. In many real-world instances, the agents may not only want to optimize their objectives, but also ensure safe behavior. For example, in traffic routing, each car (agent) aims to reach its destination quickly (objective) while avoiding collision…

    Submitted 13 June, 2023; originally announced June 2023.

    Comments: 30 pages

  7. arXiv:2306.07001  [pdf, ps, other]

    cs.LG stat.ML

    Cancellation-Free Regret Bounds for Lagrangian Approaches in Constrained Markov Decision Processes

    Authors: Adrian Müller, Pragnya Alatur, Giorgia Ramponi, Niao He

    Abstract: Constrained Markov Decision Processes (CMDPs) are one of the common ways to model safe reinforcement learning problems, where constraint functions model the safety objectives. Lagrangian-based dual or primal-dual algorithms provide efficient methods for learning in CMDPs. For these algorithms, the currently known regret bounds in the finite-horizon setting allow for a "cancellation of errors"; one…

    Submitted 30 August, 2023; v1 submitted 12 June, 2023; originally announced June 2023.

  8. arXiv:2210.11137  [pdf, ps, other]

    cs.LG cs.AI eess.SY

    Trust Region Policy Optimization with Optimal Transport Discrepancies: Duality and Algorithm for Continuous Actions

    Authors: Antonio Terpin, Nicolas Lanzetti, Batuhan Yardim, Florian Dörfler, Giorgia Ramponi

    Abstract: Policy Optimization (PO) algorithms have been proven particularly suited to handle the high-dimensionality of real-world continuous control tasks. In this context, Trust Region Policy Optimization methods represent a popular approach to stabilize the policy updates. These usually rely on the Kullback-Leibler (KL) divergence to limit the change in the policy. The Wasserstein distance represents a n…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted for presentation at, and publication in the proceedings of, the 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

  9. arXiv:2210.04817  [pdf, ps, other]

    cs.LG cs.CR

    Do you pay for Privacy in Online learning?

    Authors: Amartya Sanyal, Giorgia Ramponi

    Abstract: Online learning, in the mistake bound model, is one of the most fundamental concepts in learning theory. Differential privacy, instead, is the most widely used statistical concept of privacy in the machine learning community. It is thus clear that defining learning problems that are online differentially privately learnable is of great interest. In this paper, we pose the question on if the two pr…

    Submitted 10 October, 2022; originally announced October 2022.

    Comments: This is an updated version with i) clearer problem statements especially in proposed Theorem 1 and ii) clearer discussion of existing work especially Golowich and Livni (2021). Conference on Learning Theory. PMLR, 2022

  10. arXiv:2207.08645  [pdf, other]

    cs.LG cs.AI stat.ML

    Active Exploration for Inverse Reinforcement Learning

    Authors: David Lindner, Andreas Krause, Giorgia Ramponi

    Abstract: Inverse Reinforcement Learning (IRL) is a powerful paradigm for inferring a reward function from expert demonstrations. Many IRL algorithms require a known transition model and sometimes even a known expert policy, or they at least require access to a generative model. However, these assumptions are too strong for many real-world applications, where the environment can be accessed only through seq…

    Submitted 22 August, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: Presented at Conference on Neural Information Processing Systems (NeurIPS), 2022

  11. arXiv:2007.07812  [pdf, other]

    cs.LG stat.ML

    Inverse Reinforcement Learning from a Gradient-based Learner

    Authors: Giorgia Ramponi, Gianluca Drappo, Marcello Restelli

    Abstract: Inverse Reinforcement Learning addresses the problem of inferring an expert's reward function from demonstrations. However, in many applications, we not only have access to the expert's near-optimal behavior, but we also observe part of her learning process. In this paper, we propose a new algorithm for this setting, in which the goal is to recover the reward function being optimized by an agent,…

    Submitted 15 July, 2020; originally announced July 2020.

    Journal ref: Advances in Neural Information Processing Systems 33 (2020) 2458--2468

  12. arXiv:2007.07804  [pdf, other]

    cs.LG stat.ML

    Newton Optimization on Helmholtz Decomposition for Continuous Games

    Authors: Giorgia Ramponi, Marcello Restelli

    Abstract: Many learning problems involve multiple agents optimizing different interactive functions. In these problems, the standard policy gradient algorithms fail due to the non-stationarity of the setting and the different interests of each agent. In fact, algorithms must take into account the complex dynamics of these systems to guarantee rapid convergence towards a (local) Nash equilibrium. In this pap…

    Submitted 2 September, 2021; v1 submitted 15 July, 2020; originally announced July 2020.

    Comments: In 35th AAAI Conference on Artificial Intelligence (AAAI 2021)

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence 35 (2021) 11325-11333

  13. arXiv:1811.08295  [pdf, other]

    cs.LG stat.ML

    T-CGAN: Conditional Generative Adversarial Network for Data Augmentation in Noisy Time Series with Irregular Sampling

    Authors: Giorgia Ramponi, Pavlos Protopapas, Marco Brambilla, Ryan Janssen

    Abstract: In this paper we propose a data augmentation method for time series with irregular sampling, Time-Conditional Generative Adversarial Network (T-CGAN). Our approach is based on Conditional Generative Adversarial Networks (CGAN), where the generative step is implemented by a deconvolutional NN and the discriminative step by a convolutional NN. Both the generator and the discriminator are conditioned…

    Submitted 1 February, 2019; v1 submitted 20 November, 2018; originally announced November 2018.