Showing 1–20 of 20 results for author: Degenne, R

  1. arXiv:2406.03033  [pdf, other]

    cs.LG stat.ML

    Optimal Multi-Fidelity Best-Arm Identification

    Authors: Riccardo Poiani, Rémy Degenne, Emilie Kaufmann, Alberto Maria Metelli, Marcello Restelli

    Abstract: In bandit best-arm identification, an algorithm is tasked with finding the arm with highest mean reward with a specified accuracy as fast as possible. We study multi-fidelity best-arm identification, in which the algorithm can choose to sample an arm at a lower fidelity (less accurate mean estimate) for a lower cost. Several methods have been proposed for tackling this problem, but their optimalit…

    Submitted 5 June, 2024; originally announced June 2024.

  2. arXiv:2405.17108  [pdf, ps, other]

    cs.LG

    Finding good policies in average-reward Markov Decision Processes without prior knowledge

    Authors: Adrienne Tuynman, Rémy Degenne, Emilie Kaufmann

    Abstract: We revisit the identification of an $\varepsilon$-optimal policy in average-reward Markov Decision Processes (MDP). In such MDPs, two measures of complexity have appeared in the literature: the diameter, $D$, and the optimal bias span, $H$, which satisfy $H\leq D$. Prior work has studied the complexity of $\varepsilon$-optimal policy identification only when a generative model is available. In th…

    Submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2403.01892  [pdf, other]

    math.ST cs.IT

    Information Lower Bounds for Robust Mean Estimation

    Authors: Rémy Degenne, Timothée Mathieu

    Abstract: We prove lower bounds on the error of any estimator for the mean of a real probability distribution under the knowledge that the distribution belongs to a given set. We apply these lower bounds both to parametric and nonparametric estimation. In the nonparametric case, we apply our results to the question of sub-Gaussian estimation for distributions with finite variance to obtain new lower bounds…

    Submitted 4 March, 2024; originally announced March 2024.

  4. arXiv:2305.16041  [pdf, other]

    stat.ML cs.LG

    An $\varepsilon$-Best-Arm Identification Algorithm for Fixed-Confidence and Beyond

    Authors: Marc Jourdan, Rémy Degenne, Emilie Kaufmann

    Abstract: We propose EB-TC$\varepsilon$, a novel sampling rule for $\varepsilon$-best arm identification in stochastic bandits. It is the first instance of a Top Two algorithm analyzed for approximate best arm identification. EB-TC$\varepsilon$ is an *anytime* sampling rule that can therefore be employed without modification for fixed confidence or fixed budget identification (without prior knowledge of the b…

    Submitted 6 November, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: 68 pages, 14 figures, 4 tables. To be published in the Thirty-seventh Conference on Neural Information Processing Systems

  5. arXiv:2303.09468  [pdf, ps, other]

    stat.ML cs.LG

    On the Existence of a Complexity in Fixed Budget Bandit Identification

    Authors: Rémy Degenne

    Abstract: In fixed budget bandit identification, an algorithm sequentially observes samples from several distributions up to a given final time. It then answers a query about the set of distributions. A good algorithm will have a small probability of error. While that probability decreases exponentially with the final time, the best attainable rate is not known precisely for most identification tasks. We sh…

    Submitted 30 June, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: 24 pages, 36th Annual Conference on Learning Theory, Proceedings of Machine Learning Research vol 195

  6. arXiv:2212.05578  [pdf, other]

    cs.LO math.PR

    A Formalization of Doob's Martingale Convergence Theorems in mathlib

    Authors: Kexing Ying, Rémy Degenne

    Abstract: We present the formalization of Doob's martingale convergence theorems in the mathlib library for the Lean theorem prover. These theorems give conditions under which (sub)martingales converge, almost everywhere or in $L^1$. In order to formalize those results, we build a definition of the conditional expectation in Banach spaces and develop the theory of stochastic processes, stopping times and ma…

    Submitted 11 December, 2022; originally announced December 2022.

  7. arXiv:2210.05431  [pdf, other]

    stat.ML cs.LG

    Non-Asymptotic Analysis of a UCB-based Top Two Algorithm

    Authors: Marc Jourdan, Rémy Degenne

    Abstract: A Top Two sampling rule for bandit identification is a method which selects the next arm to sample from among two candidate arms, a leader and a challenger. Due to their simplicity and good empirical performance, they have received increased attention in recent years. However, for fixed-confidence best arm identification, theoretical guarantees for Top Two methods have only been obtained in the as…

    Submitted 6 November, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Comments: 32 pages, 5 figures, 3 tables. To be published in the Thirty-seventh Conference on Neural Information Processing Systems

  8. arXiv:2210.00974  [pdf, other]

    stat.ML cs.LG

    Dealing with Unknown Variances in Best-Arm Identification

    Authors: Marc Jourdan, Rémy Degenne, Emilie Kaufmann

    Abstract: The problem of identifying the best arm among a collection of items having Gaussian rewards distribution is well understood when the variances are known. Despite its practical relevance for many applications, few works studied it for unknown variances. In this paper we introduce and analyze two approaches to deal with unknown variances, either by plugging in the empirical variance or by adapting t…

    Submitted 23 January, 2023; v1 submitted 3 October, 2022; originally announced October 2022.

    Comments: 73 pages, 5 figures, 3 tables. To be published in the 34th International Conference on Algorithmic Learning Theory, Singapore, 2023

  9. arXiv:2206.05979  [pdf, other]

    stat.ML cs.LG

    Top Two Algorithms Revisited

    Authors: Marc Jourdan, Rémy Degenne, Dorian Baudry, Rianne de Heide, Emilie Kaufmann

    Abstract: Top Two algorithms arose as an adaptation of Thompson sampling to best arm identification in multi-armed bandit models (Russo, 2016), for parametric families of arms. They select the next arm to sample from by randomizing among two candidate arms, a leader and a challenger. Despite their good empirical performance, theoretical guarantees for fixed-confidence best arm identification have only been…

    Submitted 4 October, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: 75 pages, 8 figures, 3 tables

  10. arXiv:2206.04456  [pdf, other]

    stat.ML cs.LG

    Choosing Answers in $\varepsilon$-Best-Answer Identification for Linear Bandits

    Authors: Marc Jourdan, Rémy Degenne

    Abstract: In pure-exploration problems, information is gathered sequentially to answer a question on the stochastic environment. While best-arm identification for linear bandits has been extensively studied in recent years, few works have been dedicated to identifying one arm that is $\varepsilon$-close to the best one (and not exactly the best one). In this problem with several correct answers, an identifi…

    Submitted 9 June, 2022; originally announced June 2022.

    Comments: 47 pages, 10 figures, 8 tables. To be published in the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022

  11. arXiv:2205.10936  [pdf, other]

    cs.LG stat.ML

    On Elimination Strategies for Bandit Fixed-Confidence Identification

    Authors: Andrea Tirinzoni, Rémy Degenne

    Abstract: Elimination algorithms for bandit identification, which prune the plausible correct answers sequentially until only one remains, are computationally convenient since they reduce the problem size over time. However, existing elimination strategies are often not fully adaptive (they update their sampling rule infrequently) and are not easy to extend to combinatorial settings, where the set of answer…

    Submitted 24 October, 2022; v1 submitted 22 May, 2022; originally announced May 2022.

  12. arXiv:2111.01479  [pdf, other]

    cs.AI cs.LG math.ST

    Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification

    Authors: Clémence Réda, Andrea Tirinzoni, Rémy Degenne

    Abstract: We study the problem of the identification of m arms with largest means under a fixed error rate $δ$ (fixed-confidence Top-m identification), for misspecified linear bandit models. This problem is motivated by practical applications, especially in medicine and recommendation systems, where linear models are popular due to their simplicity and the existence of efficient algorithms, but in which dat…

    Submitted 2 November, 2021; originally announced November 2021.

    Comments: Virtual conference

  13. arXiv:2110.09133  [pdf, other]

    cs.LG

    Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

    Authors: Reda Ouhamma, Rémy Degenne, Pierre Gaillard, Vianney Perchet

    Abstract: In the fixed budget thresholding bandit problem, an algorithm sequentially allocates a budgeted number of samples to different distributions. It then predicts whether the mean of each distribution is larger or lower than a given threshold. We introduce a large family of algorithms (containing most existing relevant ones), inspired by the Frank-Wolfe algorithm, and provide a thorough yet generic an…

    Submitted 18 October, 2021; originally announced October 2021.

    Comments: 10+15 pages. To be published in the proceedings of NeurIPS 2021

  14. arXiv:2007.00969  [pdf, other]

    stat.ML cs.LG

    Structure Adaptive Algorithms for Stochastic Bandits

    Authors: Rémy Degenne, Han Shao, Wouter M. Koolen

    Abstract: We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g. linear, unimodal, sparse, etc. Our aim is to develop methods that are flexible (in that they easily adapt to different structures), powerful (in that they perform well empirically and/or provably match instance-dependent l…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: 10+18 pages. To be published in the proceedings of ICML 2020

  15. arXiv:2007.00953  [pdf, other]

    stat.ML cs.LG

    Gamification of Pure Exploration for Linear Bandits

    Authors: Rémy Degenne, Pierre Ménard, Xuedong Shang, Michal Valko

    Abstract: We investigate an active pure-exploration setting, that includes best-arm identification, in the context of linear stochastic bandits. While asymptotically optimal algorithms exist for standard multi-arm bandits, the existence of such algorithms for the best-arm identification in linear bandits has been elusive despite several attempts to address it. First, we provide a thorough comparison and new…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: 11+25 pages. To be published in the proceedings of ICML 2020

  16. arXiv:1906.10431  [pdf, other]

    stat.ML cs.LG

    Non-Asymptotic Pure Exploration by Solving Games

    Authors: Rémy Degenne, Wouter M. Koolen, Pierre Ménard

    Abstract: Pure exploration (aka active testing) is the fundamental task of sequentially gathering information to answer a query about a stochastic environment. Good algorithms make few mistakes and take few samples. Lower bounds (for multi-armed bandit models with arms in an exponential family) reveal that the sample complexity is determined by the solution to an optimisation problem. The existing state of…

    Submitted 25 June, 2019; originally announced June 2019.

  17. arXiv:1902.03475  [pdf, other]

    cs.LG stat.ML

    Pure Exploration with Multiple Correct Answers

    Authors: Rémy Degenne, Wouter M. Koolen

    Abstract: We determine the sample complexity of pure exploration bandit problems with multiple good answers. We derive a lower bound using a new game equilibrium argument. We show how continuity and convexity properties of single-answer problems ensure that the Track-and-Stop algorithm has asymptotically optimal sample complexity. However, that convexity is lost when going to the multiple-answer setting. W…

    Submitted 9 February, 2019; originally announced February 2019.

  18. arXiv:1810.04088  [pdf, other]

    cs.LG stat.ML

    Bridging the gap between regret minimization and best arm identification, with application to A/B tests

    Authors: Rémy Degenne, Thomas Nedelec, Clément Calauzènes, Vianney Perchet

    Abstract: State of the art online learning procedures focus either on selecting the best alternative ("best arm identification") or on minimizing the cost (the "regret"). We merge these two objectives by providing the theoretical analysis of cost minimizing algorithms that are also delta-PAC (with a proven guaranteed bound on the decision time), hence fulfilling at the same time regret minimization and best…

    Submitted 26 February, 2019; v1 submitted 9 October, 2018; originally announced October 2018.

    Journal ref: AISTATS 2019 proceedings

  19. arXiv:1807.03558  [pdf, other]

    cs.LG stat.ML

    Bandits with Side Observations: Bounded vs. Logarithmic Regret

    Authors: Rémy Degenne, Evrard Garcelon, Vianney Perchet

    Abstract: We consider the classical stochastic multi-armed bandit but where, from time to time and roughly with frequency $ε$, an extra observation is gathered by the agent for free. We prove that, no matter how small $ε$ is, the agent can ensure a regret uniformly bounded in time. More precisely, we construct an algorithm with a regret smaller than $\sum_i \frac{\log(1/ε)}{Δ_i}$, up to multiplicative cons…

    Submitted 10 July, 2018; originally announced July 2018.

    Comments: Conference on Uncertainty in Artificial Intelligence (UAI) 2018, 21 pages

  20. arXiv:1612.01859  [pdf, other]

    cs.LG

    Combinatorial semi-bandit with known covariance

    Authors: Rémy Degenne, Vianney Perchet

    Abstract: The combinatorial stochastic semi-bandit problem is an extension of the classical multi-armed bandit problem in which an algorithm pulls more than one arm at each stage and the rewards of all pulled arms are revealed. One difference with the single arm variant is that the dependency structure of the arms is crucial. Previous works on this setting either used a worst-case approach or imposed indepe…

    Submitted 6 December, 2016; originally announced December 2016.

    Comments: in NIPS 2016 (Conference on Neural Information Processing Systems), Dec 2016, Barcelona, Spain