Showing 1–7 of 7 results for author: Cella, L

  1. arXiv:2206.03150

    stat.ML cs.LG

    Group Meritocratic Fairness in Linear Contextual Bandits

    Authors: Riccardo Grazzi, Arya Akhavan, John Isak Texas Falk, Leonardo Cella, Massimiliano Pontil

    Abstract: We study the linear contextual bandit problem where an agent has to select one candidate from a pool and each candidate belongs to a sensitive group. In this setting, candidates' rewards may not be directly comparable between groups, for example when the agent is an employer hiring candidates from different ethnic groups and some groups have a lower reward due to discriminatory bias and/or social…

    Submitted 20 December, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022. Code for the experiments at https://github.com/CSML-IIT-UCL/GMFbandits

  2. arXiv:2205.15100

    cs.LG stat.ML

    Meta Representation Learning with Contextual Linear Bandits

    Authors: Leonardo Cella, Karim Lounici, Massimiliano Pontil

    Abstract: Meta-learning seeks to build algorithms that rapidly learn how to solve new learning problems based on previous experience. In this paper we investigate meta-learning in the setting of stochastic linear bandit tasks. We assume that the tasks share a low dimensional representation, which has been partially acquired from previous learning tasks. We aim to leverage this information in order to learn…

    Submitted 30 May, 2022; originally announced May 2022.

  3. arXiv:2202.10066

    stat.ML cs.LG

    Multi-task Representation Learning with Stochastic Linear Bandits

    Authors: Leonardo Cella, Karim Lounici, Grégoire Pacreau, Massimiliano Pontil

    Abstract: We study the problem of transfer-learning in the setting of stochastic linear bandit tasks. We consider that a low dimensional linear representation is shared across the tasks, and study the benefit of learning this representation in the multi-task learning setting. Following recent results to design stochastic bandit policies, we propose an efficient greedy policy based on trace norm regularizati…

    Submitted 15 August, 2023; v1 submitted 21 February, 2022; originally announced February 2022.

  4. arXiv:2012.03522

    stat.ML cs.LG

    Online Model Selection: a Rested Bandit Formulation

    Authors: Leonardo Cella, Claudio Gentile, Massimiliano Pontil

    Abstract: Motivated by a natural problem in online model selection with bandit information, we introduce and analyze a best arm identification problem in the rested bandit setting, wherein arm expected losses decrease with the number of times the arm has been played. The shape of the expected loss functions is similar across arms, and is assumed to be available up to unknown parameters that have to be learn…

    Submitted 7 December, 2020; originally announced December 2020.

  5. arXiv:2005.08531

    stat.ML cs.LG

    Meta-learning with Stochastic Linear Bandits

    Authors: Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil

    Abstract: We investigate meta-learning procedures in the setting of stochastic linear bandits tasks. The goal is to select a learning algorithm which works well on average over a class of bandits tasks, that are sampled from a task-distribution. Inspired by recent work on learning-to-learn linear regression, we consider a class of bandit algorithms that implement a regularized version of the well-known OFUL…

    Submitted 18 May, 2020; originally announced May 2020.

  6. arXiv:1910.02757

    stat.ML cs.LG

    Stochastic Bandits with Delay-Dependent Payoffs

    Authors: Leonardo Cella, Nicolò Cesa-Bianchi

    Abstract: Motivated by recommendation problems in music streaming platforms, we propose a nonstationary stochastic bandit model in which the expected reward of an arm depends on the number of rounds that have passed since the arm was last pulled. After proving that finding an optimal policy is NP-hard even when all model parameters are known, we introduce a class of ranking policies provably approximating,…

    Submitted 19 February, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

  7. arXiv:1809.11033

    cs.LG stat.ML

    Efficient Linear Bandits through Matrix Sketching

    Authors: Ilja Kuzborskij, Leonardo Cella, Nicolò Cesa-Bianchi

    Abstract: We prove that two popular linear contextual bandit algorithms, OFUL and Thompson Sampling, can be made efficient using Frequent Directions, a deterministic online sketching technique. More precisely, we show that a sketch of size $m$ allows a $\mathcal{O}(md)$ update time for both algorithms, as opposed to $\Omega(d^2)$ required by their non-sketched versions in general (where $d$ is the dimension of c…

    Submitted 21 March, 2022; v1 submitted 28 September, 2018; originally announced September 2018.