Search | arXiv e-print repository

arXiv:2404.19224 [pdf, other]

Variational approximations of possibilistic inferential models

Abstract: Inferential models (IMs) offer reliable, data-driven, possibilistic statistical inference. But despite IMs' theoretical/foundational advantages, efficient computation in applications is a major challenge. This paper presents a simple and apparently powerful Monte Carlo-driven strategy for approximating the IM's possibility contour, or at least its $α$-level set for a specified $α$. Our proposal utilizes a parametric family that, in a certain sense, approximately covers the credal set associated with the IM's possibility measure, which is reminiscent of variational approximations now widely used in Bayesian statistics.

Submitted 29 April, 2024; originally announced April 2024.

Comments: Comments welcome at https://researchers.one/articles/24.04.00005

arXiv:2304.05740 [pdf, other]

doi 10.1016/j.ijar.2023.109060

Possibility-theoretic statistical inference offers performance and probativeness assurances

Authors: Leonardo Cella, Ryan Martin

Abstract: Statisticians are largely focused on developing methods that perform well in a frequentist sense -- even the Bayesians. But the widely-publicized replication crisis suggests that these performance guarantees alone are not enough to instill confidence in scientific discoveries. In addition to reliably detecting hypotheses that are (in)compatible with data, investigators require methods that can probe for hypotheses that are actually supported by the data. In this paper, we demonstrate that valid inferential models (IMs) achieve both performance and probativeness properties and we offer a powerful new result that ensures the IM's probing is reliable. We also compare and contrast the IM's dual performance and probativeness abilities with that of Deborah Mayo's severe testing framework.

Submitted 26 July, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

Comments: Comments welcome at https://researchers.one/articles/23.04.00004

Journal ref: International Journal of Approximate Reasoning, volume 163, pages 109060, 2023

arXiv:2206.03150 [pdf, other]

Group Meritocratic Fairness in Linear Contextual Bandits

Authors: Riccardo Grazzi, Arya Akhavan, John Isak Texas Falk, Leonardo Cella, Massimiliano Pontil

Abstract: We study the linear contextual bandit problem where an agent has to select one candidate from a pool and each candidate belongs to a sensitive group. In this setting, candidates' rewards may not be directly comparable between groups, for example when the agent is an employer hiring candidates from different ethnic groups and some groups have a lower reward due to discriminatory bias and/or social injustice. We propose a notion of fairness that states that the agent's policy is fair when it selects a candidate with highest relative rank, which measures how good the reward is when compared to candidates from the same group. This is a very strong notion of fairness, since the relative rank is not directly observed by the agent and depends on the underlying reward model and on the distribution of rewards. Thus we study the problem of learning a policy which approximates a fair policy under the condition that the contexts are independent between groups and the distribution of rewards of each group is absolutely continuous. In particular, we design a greedy policy which at each round constructs a ridge regression estimate from the observed context-reward pairs, and then computes an estimate of the relative rank of each candidate using the empirical cumulative distribution function. We prove that, despite its simplicity and the lack of an initial exploration phase, the greedy policy achieves, up to log factors and with high probability, a fair pseudo-regret of order $\sqrt{dT}$ after $T$ rounds, where $d$ is the dimension of the context vectors. The policy also satisfies demographic parity at each round when averaged over all possible information available before the selection. Finally, we use simulated settings and experiments on the US census data to show that our policy achieves sub-linear fair pseudo-regret also in practice.

Submitted 20 December, 2022; v1 submitted 7 June, 2022; originally announced June 2022.

Comments: NeurIPS 2022. Code for the experiments at https://github.com/CSML-IIT-UCL/GMFbandits
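
The relative-rank selection step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that). It assumes the ridge-regression step has already produced estimated rewards, and the names `empirical_cdf`, `select_by_relative_rank`, `candidates`, and `history` are hypothetical.

```python
from bisect import bisect_right


def empirical_cdf(sample, x):
    """Fraction of values in `sample` that are less than or equal to x."""
    ordered = sorted(sample)
    return bisect_right(ordered, x) / len(ordered)


def select_by_relative_rank(candidates, history):
    """Pick the candidate whose estimated reward ranks highest within its own group.

    candidates: list of (group, estimated_reward) pairs for the current round.
    history:    dict mapping group -> list of past estimated rewards for that group
                (assumed here to come from a ridge-regression estimate).
    Returns the index of the selected candidate and the list of relative ranks.
    """
    ranks = [empirical_cdf(history[group], reward) for group, reward in candidates]
    best = max(range(len(candidates)), key=lambda i: ranks[i])
    return best, ranks


# Hypothetical data: group "b" has systematically larger raw rewards.
history = {"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]}
candidates = [("a", 3.5), ("b", 15)]
best, ranks = select_by_relative_rank(candidates, history)
```

In this toy example the candidate from group "a" is selected even though its raw estimated reward is lower, because it sits higher in its own group's empirical distribution.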

arXiv:2205.15100 [pdf, other]

Meta Representation Learning with Contextual Linear Bandits

Authors: Leonardo Cella, Karim Lounici, Massimiliano Pontil

Abstract: Meta-learning seeks to build algorithms that rapidly learn how to solve new learning problems based on previous experience. In this paper we investigate meta-learning in the setting of stochastic linear bandit tasks. We assume that the tasks share a low dimensional representation, which has been partially acquired from previous learning tasks. We aim to leverage this information in order to learn a new downstream bandit task, which shares the same representation. Our principal contribution is to show that if the learned representation estimates well the unknown one, then the downstream task can be efficiently learned by a greedy policy that we propose in this work. We derive an upper bound on the regret of this policy, which is, up to logarithmic factors, of order $r\sqrt{N}(1\vee \sqrt{d/T})$, where $N$ is the horizon of the downstream task, $T$ is the number of training tasks, $d$ the ambient dimension and $r \ll d$ the dimension of the representation. We highlight that our strategy does not need to know $r$. We note that if $T> d$ our bound achieves the same rate of optimal minimax bandit algorithms using the true underlying representation. Our analysis is inspired and builds in part upon previous work on meta-learning in the i.i.d. full information setting \citep{tripuraneni2021provable,boursier2022trace}. As a separate contribution we show how to relax certain assumptions in those works, thereby improving their representation learning and risk analysis.

Submitted 30 May, 2022; originally announced May 2022.

arXiv:2202.10066 [pdf, other]

Multi-task Representation Learning with Stochastic Linear Bandits

Authors: Leonardo Cella, Karim Lounici, Grégoire Pacreau, Massimiliano Pontil

Abstract: We study the problem of transfer-learning in the setting of stochastic linear bandit tasks. We consider that a low dimensional linear representation is shared across the tasks, and study the benefit of learning this representation in the multi-task learning setting. Following recent results to design stochastic bandit policies, we propose an efficient greedy policy based on trace norm regularization. It implicitly learns a low dimensional representation by encouraging the matrix formed by the task regression vectors to be of low rank. Unlike previous work in the literature, our policy does not need to know the rank of the underlying matrix. We derive an upper bound on the multi-task regret of our policy, which is, up to logarithmic factors, of order $\sqrt{NdT(T+d)r}$, where $T$ is the number of tasks, $r$ the rank, $d$ the number of variables and $N$ the number of rounds per task. We show the benefit of our strategy compared to the baseline $Td\sqrt{N}$ obtained by solving each task independently. We also provide a lower bound to the multi-task regret. Finally, we corroborate our theoretical findings with preliminary experiments on synthetic data.

Submitted 15 August, 2023; v1 submitted 21 February, 2022; originally announced February 2022.
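
To get a feel for the two rates quoted in the abstract above, the short sketch below evaluates $\sqrt{NdT(T+d)r}$ against the independent-task baseline $Td\sqrt{N}$ for one set of values. Constants and logarithmic factors are ignored, and the numbers chosen for `N`, `d`, `T`, and `r` are purely illustrative assumptions, not taken from the paper.

```python
import math


def multitask_rate(N, d, T, r):
    """Order of the multi-task regret bound, ignoring constants and log factors."""
    return math.sqrt(N * d * T * (T + d) * r)


def independent_rate(N, d, T):
    """Order of the baseline obtained by solving each of the T tasks independently."""
    return T * d * math.sqrt(N)


# Illustrative (assumed) values: rank r much smaller than dimension d.
N, d, T, r = 10_000, 100, 100, 5
ratio = multitask_rate(N, d, T, r) / independent_rate(N, d, T)
```

Algebraically the ratio reduces to $\sqrt{(T+d)r/(Td)}$, so it does not depend on $N$ and is small whenever $r$ is small relative to both $T$ and $d$.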

arXiv:2112.10234 [pdf, other]

doi 10.1016/j.ijar.2022.08.001

Valid inferential models for prediction in supervised learning problems

Authors: Leonardo Cella, Ryan Martin

Abstract: Prediction, where observed data is used to quantify uncertainty about a future observation, is a fundamental problem in statistics. Prediction sets with coverage probability guarantees are a common solution, but these do not provide probabilistic uncertainty quantification in the sense of assigning beliefs to relevant assertions about the future observable. Alternatively, we recommend the use of a {\em probabilistic predictor}, a data-dependent (imprecise) probability distribution for the to-be-predicted observation given the observed data. It is essential that the probabilistic predictor be reliable or valid, and here we offer a notion of validity and explore its behavioral and statistical implications. In particular, we show that valid probabilistic predictors must be imprecise, that they avoid sure loss, and that they lead to prediction procedures with desirable frequentist error rate control properties. We provide a general construction of a provably valid probabilistic predictor, which has close connections to the powerful conformal prediction machinery, and we illustrate this construction in regression and classification applications.

Submitted 9 June, 2022; v1 submitted 19 December, 2021; originally announced December 2021.

Comments: 29 pages, 4 figures, 2 tables. Comments welcome at https://researchers.one/articles/21.12.00002

Journal ref: International Journal of Approximate Reasoning, volume 150, pages 1--18, 2022

arXiv:2112.10232 [pdf, other]

doi 10.1016/j.ijar.2022.09.011

Direct and approximately valid probabilistic inference on a class of statistical functionals

Authors: Leonardo Cella, Ryan Martin

Abstract: Existing frameworks for probabilistic inference assume the quantity of interest is the parameter of a posited statistical model. In machine learning applications, however, often there is no statistical model/parameter; the quantity of interest is a statistical functional, a feature of the underlying distribution. Model-based methods can only handle such problems indirectly, via marginalization from a model parameter to the real quantity of interest. Here we develop a generalized inferential model (IM) framework for direct probabilistic uncertainty quantification on the quantity of interest. In particular, we construct a data-dependent, bootstrap-based possibility measure for uncertainty quantification and inference. We then prove that this new approach provides approximately valid inference in the sense that the plausibility values assigned to hypotheses about the unknowns are asymptotically well-calibrated in a frequentist sense. Among other things, this implies that confidence regions for the underlying functional derived from our proposed IM are approximately valid. The method is shown to perform well in key examples, including quantile regression, and in a personalized medicine application.

Submitted 9 June, 2022; v1 submitted 19 December, 2021; originally announced December 2021.

Comments: 32 pages, 5 figures, 1 table. Comments welcome at https://researchers.one/articles/21.12.00004

Journal ref: International Journal of Approximate Reasoning, volume 151, pages 205--224, 2022

arXiv:2012.03522 [pdf, ps, other]

Online Model Selection: a Rested Bandit Formulation

Authors: Leonardo Cella, Claudio Gentile, Massimiliano Pontil

Abstract: Motivated by a natural problem in online model selection with bandit information, we introduce and analyze a best arm identification problem in the rested bandit setting, wherein arm expected losses decrease with the number of times the arm has been played. The shape of the expected loss functions is similar across arms, and is assumed to be available up to unknown parameters that have to be learned on the fly. We define a novel notion of regret for this problem, where we compare to the policy that always plays the arm having the smallest expected loss at the end of the game. We analyze an arm elimination algorithm whose regret vanishes as the time horizon increases. The actual rate of convergence depends in a detailed way on the postulated functional form of the expected losses. Unlike known model selection efforts in the recent bandit literature, our algorithm exploits the specific structure of the problem to learn the unknown parameters of the expected loss function so as to identify the best arm as quickly as possible. We complement our analysis with a lower bound, indicating strengths and limitations of the proposed solution.

Submitted 7 December, 2020; originally announced December 2020.

arXiv:2005.08531 [pdf, other]

Meta-learning with Stochastic Linear Bandits

Authors: Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil

Abstract: We investigate meta-learning procedures in the setting of stochastic linear bandits tasks. The goal is to select a learning algorithm which works well on average over a class of bandits tasks, that are sampled from a task-distribution. Inspired by recent work on learning-to-learn linear regression, we consider a class of bandit algorithms that implement a regularized version of the well-known OFUL algorithm, where the regularization is a square euclidean distance to a bias vector. We first study the benefit of the biased OFUL algorithm in terms of regret minimization. We then propose two strategies to estimate the bias within the learning-to-learn setting. We show both theoretically and experimentally, that when the number of tasks grows and the variance of the task-distribution is small, our strategies have a significant advantage over learning the tasks in isolation.

Submitted 18 May, 2020; originally announced May 2020.

arXiv:2001.09225 [pdf, other]

doi 10.1016/j.ijar.2021.07.013

Validity, consonant plausibility measures, and conformal prediction

Authors: Leonardo Cella, Ryan Martin

Abstract: Prediction of future observations is an important and challenging problem. The two mainstream approaches for quantifying prediction uncertainty use prediction regions and predictive distributions, respectively, with the latter believed to be more informative because it can perform other prediction-related tasks. The standard notion of validity, what we refer to here as Type-1 validity, focuses on coverage probability of prediction regions, while a notion of validity relevant to the other prediction-related tasks performed by predictive distributions is lacking. Here we present a new notion, called Type-2 validity, relevant to these other prediction tasks. We establish connections between Type-2 validity and coherence properties, and show that imprecise probability considerations are required in order to achieve it. We go on to show that both types of prediction validity can be achieved by interpreting the conformal prediction output as the contour function of a consonant plausibility measure. We also offer an alternative characterization of conformal prediction, based on a new nonparametric inferential model construction, wherein the appearance of consonance is natural, and prove its validity.

Submitted 9 June, 2022; v1 submitted 24 January, 2020; originally announced January 2020.

Comments: 34 pages, 2 figures, 2 tables. Comments welcome at https://www.researchers.one/article/2020-01-12

Journal ref: International Journal of Approximate Reasoning, volume 141, pages 110--130, 2022
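
The idea in the abstract above of reading conformal prediction output as a plausibility contour can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's construction: the nonconformity score (absolute distance from the mean of the augmented sample) is one simple choice among many, and the function names `conformal_contour` and `plausibility` are hypothetical.

```python
def conformal_contour(data, candidate):
    """Conformal p-value for `candidate`, read as a plausibility contour value.

    Nonconformity score (an assumed, simple choice): absolute distance from
    the mean of the data augmented with the candidate value.
    """
    augmented = list(data) + [candidate]
    center = sum(augmented) / len(augmented)
    scores = [abs(value - center) for value in augmented]
    # Proportion of scores at least as large as the candidate's own score.
    return sum(score >= scores[-1] for score in scores) / len(augmented)


def plausibility(data, assertion_points):
    """Plausibility of an assertion, approximated as the maximum contour value
    over a finite set of points representing that assertion (consonance)."""
    return max(conformal_contour(data, y) for y in assertion_points)


data = [1.0, 2.0, 3.0, 4.0]
central = conformal_contour(data, 2.5)    # candidate in the middle of the data
extreme = conformal_contour(data, 100.0)  # candidate far from the data
```

Thresholding the contour at a level $α$, that is, keeping the candidates whose contour value exceeds $α$, recovers the usual conformal prediction region, while taking the supremum of the contour over a set gives the plausibility assigned to the assertion that the future observation lies in that set.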

arXiv:1910.02757 [pdf, other]

Stochastic Bandits with Delay-Dependent Payoffs

Authors: Leonardo Cella, Nicolò Cesa-Bianchi

Abstract: Motivated by recommendation problems in music streaming platforms, we propose a nonstationary stochastic bandit model in which the expected reward of an arm depends on the number of rounds that have passed since the arm was last pulled. After proving that finding an optimal policy is NP-hard even when all model parameters are known, we introduce a class of ranking policies provably approximating, to within a constant factor, the expected reward of the optimal policy. We show an algorithm whose regret with respect to the best ranking policy is bounded by $\widetilde{\mathcal{O}}\big(\!\sqrt{kT}\big)$, where $k$ is the number of arms and $T$ is time. Our algorithm uses only $\mathcal{O}\big(k\ln\ln T\big)$ switches, which helps when switching between policies is costly. As constructing the class of learning policies requires ordering the arms according to their expectations, we also bound the number of pulls required to do so. Finally, we run experiments to compare our algorithm against UCB on different problem instances.

Submitted 19 February, 2020; v1 submitted 7 October, 2019; originally announced October 2019.

arXiv:1809.11033 [pdf, other]

Efficient Linear Bandits through Matrix Sketching

Authors: Ilja Kuzborskij, Leonardo Cella, Nicolò Cesa-Bianchi

Abstract: We prove that two popular linear contextual bandit algorithms, OFUL and Thompson Sampling, can be made efficient using Frequent Directions, a deterministic online sketching technique. More precisely, we show that a sketch of size $m$ allows a $\mathcal{O}(md)$ update time for both algorithms, as opposed to $Ω(d^2)$ required by their non-sketched versions in general (where $d$ is the dimension of context vectors). This computational speedup is accompanied by regret bounds of order $(1+\varepsilon_m)^{3/2}d\sqrt{T}$ for OFUL and of order $\big((1+\varepsilon_m)d\big)^{3/2}\sqrt{T}$ for Thompson Sampling, where $\varepsilon_m$ is bounded by the sum of the tail eigenvalues not covered by the sketch. In particular, when the selected contexts span a subspace of dimension at most $m$, our algorithms have a regret bound matching that of their slower, non-sketched counterparts. Experiments on real-world datasets corroborate our theoretical results.

Submitted 21 March, 2022; v1 submitted 28 September, 2018; originally announced September 2018.

Showing 1–12 of 12 results for author: Cella, L