Showing 1–16 of 16 results for author: Bazerque, J

  1. arXiv:2406.01782  [pdf, other]

    eess.SY cs.AI cs.LG cs.MA

    Multi-agent assignment via state augmented reinforcement learning

    Authors: Leopoldo Agorio, Sean Van Alen, Miguel Calvo-Fullana, Santiago Paternain, Juan Andres Bazerque

    Abstract: We address the conflicting requirements of a multi-agent assignment problem through constrained reinforcement learning, emphasizing the inadequacy of standard regularization techniques for this purpose. Instead, we recur to a state augmentation approach in which the oscillation of dual variables is exploited by agents to alternate between tasks. In addition, we coordinate the actions of the multip…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 12 pages, 3 figures, 6th Annual Conference on Learning for Dynamics and Control

    MSC Class: 93E35

    Journal ref: Proceedings of Machine Learning Research, vol. 242, pp. 1–12, 2024. 6th Annual Conference on Learning for Dynamics and Control

  2. arXiv:2401.12849  [pdf, other]

    cs.LG eess.SY

    Learning safety critics via a non-contractive binary bellman operator

    Authors: Agustin Castellano, Hancheng Min, Juan Andrés Bazerque, Enrique Mallada

    Abstract: The inability to naturally enforce safety in Reinforcement Learning (RL), with limited failures, is a core challenge impeding its use in real-world applications. One notion of safety of vast practical relevance is the ability to avoid (unsafe) regions of the state space. Though such a safety goal can be captured by an action-value-like function, a.k.a. safety critics, the associated operator lacks…

    Submitted 23 January, 2024; originally announced January 2024.

  3. arXiv:2306.08737  [pdf, other]

    cs.RO cs.IT math.OC

    A Networked Multi-Agent System for Mobile Wireless Infrastructure on Demand

    Authors: Miguel Calvo-Fullana, Mikhail Gerasimenko, Daniel Mox, Leopoldo Agorio, Mariana del Castillo, Vijay Kumar, Alejandro Ribeiro, Juan Andres Bazerque

    Abstract: Despite the prevalence of wireless connectivity in urban areas around the globe, there remain numerous and diverse situations where connectivity is insufficient or unavailable. To address this, we introduce mobile wireless infrastructure on demand, a system of UAVs that can be rapidly deployed to establish an ad-hoc wireless network. This network has the capability of reconfiguring itself dynamica…

    Submitted 14 June, 2023; originally announced June 2023.

  4. arXiv:2210.15573  [pdf, other]

    cs.LG

    Multi-task Bias-Variance Trade-off Through Functional Constraints

    Authors: Juan Cervino, Juan Andres Bazerque, Miguel Calvo-Fullana, Alejandro Ribeiro

    Abstract: Multi-task learning aims to acquire a set of functions, either regressors or classifiers, that perform well for diverse tasks. At its core, the idea behind multi-task learning is to exploit the intrinsic similarity across data sources to aid in the learning process for each individual domain. In this paper we draw intuition from the two extreme learning scenarios -- a single function for all tasks…

    Submitted 27 October, 2022; originally announced October 2022.

  5. arXiv:2112.05198  [pdf, other]

    cs.LG eess.SY

    Reinforcement Learning with Almost Sure Constraints

    Authors: Agustin Castellano, Hancheng Min, Juan Bazerque, Enrique Mallada

    Abstract: In this work we address the problem of finding feasible policies for Constrained Markov Decision Processes under probability one constraints. We argue that stationary policies are not sufficient for solving this problem, and that a rich class of policies can be found by endowing the controller with a scalar quantity, so called budget, that tracks how close the agent is to violating the constraint.…

    Submitted 13 February, 2023; v1 submitted 9 December, 2021; originally announced December 2021.

    Comments: Accepted to L4DC 2022

  6. arXiv:2105.08748  [pdf, other]

    eess.SY cs.LG

    Learning to Act Safely with Limited Exposure and Almost Sure Certainty

    Authors: Agustin Castellano, Hancheng Min, Juan Bazerque, Enrique Mallada

    Abstract: This paper puts forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials. This is indeed possible, provided that one is willing to navigate trade-offs between optimality, level of exposure to unsafe events, and the maximum detection time of unsafe actions. We…

    Submitted 13 February, 2023; v1 submitted 18 May, 2021; originally announced May 2021.

    Comments: 16 pages, 7 figures. Submitted to TAC

  7. arXiv:2012.13036  [pdf, ps, other]

    cs.LG eess.SY

    Assured RL: Reinforcement Learning with Almost Sure Constraints

    Authors: Agustin Castellano, Juan Bazerque, Enrique Mallada

    Abstract: We consider the problem of finding optimal policies for a Markov Decision Process with almost sure constraints on state transitions and action triplets. We define value and action-value functions that satisfy a barrier-based decomposition which allows for the identification of feasible policies independently of the reward process. We prove that, given a policy π, certifying whether certain state-a…

    Submitted 23 December, 2020; originally announced December 2020.

  8. arXiv:2010.12993  [pdf, other]

    cs.LG eess.SP stat.ML

    Multi-task Supervised Learning via Cross-learning

    Authors: Juan Cervino, Juan Andres Bazerque, Miguel Calvo-Fullana, Alejandro Ribeiro

    Abstract: In this paper we consider a problem known as multi-task learning, consisting of fitting a set of classifier or regression functions intended for solving different tasks. In our novel formulation, we couple the parameters of these functions, so that they learn in their task specific domains while staying close to each other. This facilitates cross-fertilization in which data collected across differ…

    Submitted 26 May, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

  9. arXiv:2010.08443  [pdf, other]

    cs.LG eess.SY

    Policy Gradient for Continuing Tasks in Non-stationary Markov Decision Processes

    Authors: Santiago Paternain, Juan Andres Bazerque, Alejandro Ribeiro

    Abstract: Reinforcement learning considers the problem of finding policies that maximize an expected cumulative reward in a Markov decision process with unknown transition probabilities. In this paper we consider the problem of finding optimal policies assuming that they belong to a reproducing kernel Hilbert space (RKHS). To that end we compute unbiased stochastic gradients of the value function which we u…

    Submitted 16 October, 2020; originally announced October 2020.

  10. arXiv:2010.02122  [pdf, other]

    eess.SY math.OC

    Quadratic approximate dynamic programming for scheduling water resources: a case study

    Authors: Agustin Castellano, Camila Martínez, Pablo Monzón, Juan Andrés Bazerque, Andrés Ferragut, Fernando Paganini

    Abstract: We address the problem of scheduling water resources in a power system via approximate dynamic programming. To this goal, we model a finite horizon economic dispatch problem with convex stage cost and affine dynamics, and consider a quadratic approximation of the value functions. Evaluating the achieved policy entails solving a quadratic program at each timestep, while value function fitting can be ca…

    Submitted 5 October, 2020; originally announced October 2020.

  11. arXiv:2010.00417  [pdf, other]

    cs.LG stat.ML

    Learning to be safe, in finite time

    Authors: Agustin Castellano, Juan Bazerque, Enrique Mallada

    Abstract: This paper aims to put forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials, provided that one is willing to relax its optimality requirements mildly. We focus on the canonical multi-armed bandit problem and seek to study the exploration-preservation trade…

    Submitted 31 March, 2021; v1 submitted 1 October, 2020; originally announced October 2020.

  12. Multi-task Reinforcement Learning in Reproducing Kernel Hilbert Spaces via Cross-learning

    Authors: Juan Cervino, Juan Andres Bazerque, Miguel Calvo-Fullana, Alejandro Ribeiro

    Abstract: Reinforcement learning (RL) is a framework to optimize a control policy using rewards that are revealed by the system as a response to a control action. In its standard form, RL involves a single agent that uses its policy to accomplish a specific task. These methods require large amounts of reward samples to achieve good performance, and may not generalize well when the task is modified, even if…

    Submitted 26 August, 2020; originally announced August 2020.

  13. arXiv:1807.11274  [pdf, other]

    eess.SY

    Stochastic Policy Gradient Ascent in Reproducing Kernel Hilbert Spaces

    Authors: Santiago Paternain, Juan Andrés Bazerque, Austin Small, Alejandro Ribeiro

    Abstract: Reinforcement learning consists of finding policies that maximize an expected cumulative long-term reward in a Markov decision process with unknown transition probabilities and instantaneous rewards. In this paper, we consider the problem of finding such optimal policies while assuming they are continuous functions belonging to a reproducing kernel Hilbert space (RKHS). To learn the optimal policy…

    Submitted 30 July, 2018; originally announced July 2018.

  14. arXiv:1302.5449  [pdf, ps, other]

    cs.LG cs.CV cs.IT stat.ML

    Nonparametric Basis Pursuit via Sparse Kernel-based Learning

    Authors: Juan Andres Bazerque, Georgios B. Giannakis

    Abstract: Signal processing tasks as fundamental as sampling, reconstruction, minimum mean-square error interpolation and prediction can be viewed under the prism of reproducing kernel Hilbert spaces. Endowing this vantage point with contemporary advances in sparsity-aware modeling and processing, promotes the nonparametric basis pursuit advocated in this paper as the overarching framework for the confluenc…

    Submitted 21 February, 2013; originally announced February 2013.

    Comments: IEEE Signal Processing Magazine, 2013 (to appear)

  15. arXiv:1301.7619  [pdf, ps, other]

    cs.IT cs.LG stat.ML

    Rank regularization and Bayesian inference for tensor completion and extrapolation

    Authors: Juan Andres Bazerque, Gonzalo Mateos, Georgios B. Giannakis

    Abstract: A novel regularizer of the PARAFAC decomposition factors capturing the tensor's rank is proposed in this paper, as the key enabler for completion of three-way data arrays with missing entries. Set in a Bayesian framework, the tensor completion method incorporates prior information to enhance its smoothing and prediction capabilities. This probabilistic approach can naturally accommodate general mo…

    Submitted 31 January, 2013; originally announced January 2013.

    Comments: 12 pages, submitted to IEEE Transactions on Signal Processing

  16. Group-Lasso on Splines for Spectrum Cartography

    Authors: Juan A. Bazerque, Gonzalo Mateos, Georgios B. Giannakis

    Abstract: The unceasing demand for continuous situational awareness calls for innovative and large-scale signal processing algorithms, complemented by collaborative and adaptive sensing platforms to accomplish the objectives of layered sensing and control. Towards this goal, the present paper develops a spline-based approach to field estimation, which relies on a basis expansion model of the field of intere…

    Submitted 1 October, 2010; originally announced October 2010.

    Comments: Submitted to IEEE Transactions on Signal Processing