Search | arXiv e-print repository

A simple and improved algorithm for noisy, convex, zeroth-order optimisation

Abstract: In this paper, we study the problem of noisy, convex, zeroth order optimisation of a function $f$ over a bounded convex set $\bar{\mathcal X}\subset \mathbb{R}^d$. Given a budget $n$ of noisy queries to the function $f$ that can be allocated sequentially and adaptively, our aim is to construct an algorithm that returns a point $\hat x\in \bar{\mathcal X}$ such that $f(\hat x)$ is as small as possible. We provide a conceptually simple method inspired by the textbook center of gravity method, but adapted to the noisy and zeroth order setting. We prove that this method is such that $f(\hat x) - \min_{x\in \bar{\mathcal X}} f(x)$ is of smaller order than $d^2/\sqrt{n}$ up to poly-logarithmic terms. We slightly improve upon existing literature, where to the best of our knowledge the best known rate, in [Lattimore, 2024], is of order $d^{2.5}/\sqrt{n}$, albeit for a more challenging problem. Our main contribution is however conceptual, as we believe that our algorithm and its analysis bring novel ideas and are significantly simpler than existing approaches.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2406.11485 [pdf, other]

Active clustering with bandit feedback

Authors: Victor Thuot, Alexandra Carpentier, Christophe Giraud, Nicolas Verzelen

Abstract: We investigate the Active Clustering Problem (ACP). A learner interacts with an $N$-armed stochastic bandit with $d$-dimensional subGaussian feedback. There exists a hidden partition of the arms into $K$ groups, such that arms within the same group share the same mean vector. The learner's task is to uncover this hidden partition with the smallest budget - i.e., the least number of observations - and with a probability of error smaller than a prescribed constant $δ$. In this paper, (i) we derive a non-asymptotic lower bound for the budget, and (ii) we introduce the computationally efficient ACB algorithm, whose budget matches the lower bound in most regimes. We improve on the performance of a uniform sampling strategy. Importantly, contrary to the batch setting, we establish that there is no computation-information gap in the active setting.

Submitted 17 June, 2024; originally announced June 2024.

Comments: 50 pages

arXiv:2406.04003 [pdf, other]

High contrast at short separation with VLTI/GRAVITY: Bringing Gaia companions to light

Authors: N. Pourré, T. O. Winterhalder, J. -B. Le Bouquin, S. Lacour, A. Bidot, M. Nowak, A. -L. Maire, D. Mouillet, C. Babusiaux, J. Woillez, R. Abuter, A. Amorim, R. Asensio-Torres, W. O. Balmer, M. Benisty, J. -P. Berger, H. Beust, S. Blunt, A. Boccaletti, M. Bonnefoy, H. Bonnet, M. S. Bordoni, G. Bourdarot, W. Brandner, F. Cantalloube , et al. (151 additional authors not shown)

Abstract: Since 2019, GRAVITY has provided direct observations of giant planets and brown dwarfs at separations of down to 95 mas from the host star. Some of these observations have provided the first direct confirmation of companions previously detected by indirect techniques (astrometry and radial velocities). We want to improve the observing strategy and data reduction in order to lower the inner working angle of GRAVITY in dual-field on-axis mode. We also want to determine the current limitations of the instrument when observing faint companions with separations in the 30-150 mas range. To improve the inner working angle, we propose a fiber off-pointing strategy during the observations to maximize the ratio of companion-light-to-star-light coupling in the science fiber. We also tested a lower-order model for speckles to decouple the companion light from the star light. We then evaluated the detection limits of GRAVITY using planet injection and retrieval in representative archival data. We compare our results to theoretical expectations. We validate our observing and data-reduction strategy with on-sky observations; first in the context of brown dwarf follow-up on the auxiliary telescopes with HD 984 B, and second with the first confirmation of a substellar candidate around the star Gaia DR3 2728129004119806464.
With synthetic companion injection, we demonstrate that the instrument can detect companions down to a contrast of $8\times 10^{-4}$ ($Δ\mathrm{K}= 7.7$ mag) at a separation of 35 mas, and a contrast of $3\times 10^{-5}$ ($Δ\mathrm{K}= 11$ mag) at 100 mas from a bright primary (K<6.5), for 30 min exposure time. With its inner working angle and astrometric precision, GRAVITY has a unique reach in direct observation parameter space. This study demonstrates the promising synergies between GRAVITY and Gaia for the confirmation and characterization of substellar companions.

Submitted 6 June, 2024; originally announced June 2024.

Comments: 16 pages, 14 figures. Submitted to A&A

arXiv:2310.01133 [pdf, ps, other]

Optimal rates for ranking a permuted isotonic matrix in polynomial time

Authors: Emmanuel Pilliat, Alexandra Carpentier, Nicolas Verzelen

Abstract: We consider a ranking problem where we have noisy observations from a matrix with isotonic columns whose rows have been permuted by some permutation $π^*$. This encompasses many models, including crowd-labeling and ranking in tournaments by pair-wise comparisons. In this work, we provide an optimal and polynomial-time procedure for recovering $π^*$, settling an open problem in [7]. As a byproduct, our procedure is used to improve the state of the art for ranking problems in the stochastically transitive model (SST). Our approach is based on iterative pairwise comparisons by suitable data-driven weighted means of the columns. These weights are built using a combination of spectral methods with new dimension-reduction techniques. In order to deal with the important case of missing data, we establish a new concentration inequality for sparse and centered rectangular Wishart-type matrices.

Submitted 2 October, 2023; originally announced October 2023.

arXiv:2306.16403 [pdf, other]

Moment inequalities for sums of weakly dependent random fields

Authors: Gilles Blanchard, Alexandra Carpentier, Oleksandr Zadorozhnyi

Abstract: We derive both Azuma-Hoeffding and Burkholder-type inequalities for partial sums over a rectangular grid of dimension $d$ of a random field satisfying a weak dependency assumption of projective type: the difference between the expectation of an element of the random field and its conditional expectation given the rest of the field at a distance more than $δ$ is bounded, in $L^p$ distance, by a known decreasing function of $δ$. The analysis is based on the combination of a multi-scale approximation of random sums by martingale difference sequences, and of a careful decomposition of the domain. The obtained results extend previously known bounds under comparable hypotheses, and do not use the assumption of commuting filtrations.

Submitted 28 June, 2023; originally announced June 2023.

Comments: 20 pages, 3 figures

arXiv:2306.02971 [pdf, other]

Online Learning with Feedback Graphs: The True Shape of Regret

Authors: Tomáš Kocák, Alexandra Carpentier

Abstract: Sequential learning with feedback graphs is a natural extension of the multi-armed bandit problem where the problem is equipped with an underlying graph structure that provides additional information - playing an action reveals the losses of all the neighbors of the action. This problem was introduced by Mannor and Shamir (2011) and has received considerable attention in recent years. It is generally stated in the literature that the minimax regret rate for this problem is of order $\sqrt{αT}$, where $α$ is the independence number of the graph, and $T$ is the time horizon. However, this is proven only when the number of rounds $T$ is larger than $α^3$, which poses a significant restriction for the usability of this result in large graphs. In this paper, we define a new quantity $R^*$, called the problem complexity, and prove that the minimax regret is proportional to $R^*$ for any graph and time horizon $T$. Introducing an intricate exploration strategy, we define the \mainAlgorithm algorithm that achieves the minimax optimal regret bound and becomes the first provably optimal algorithm for this setting, even if $T$ is smaller than $α^3$.

Submitted 5 June, 2023; originally announced June 2023.

arXiv:2306.02628 [pdf, other]

Active Ranking of Experts Based on their Performances in Many Tasks

Authors: El Mehdi Saad, Nicolas Verzelen, Alexandra Carpentier

Abstract: We consider the problem of ranking $n$ experts based on their performances on $d$ tasks. We make a monotonicity assumption stating that for each pair of experts, one outperforms the other on all tasks. We consider the sequential setting where in each round, the learner has access to noisy evaluations of an actively chosen expert-task pair, given the information available up to the current round. Given a confidence parameter $δ\in(0,1)$, we provide strategies that recover the correct ranking of experts and develop a bound on the total number of queries made by our algorithm that holds with probability at least $1-δ$. We show that our strategy is adaptive to the complexity of the problem (our bounds are instance dependent), and develop matching lower bounds up to a poly-logarithmic factor. Finally, we adapt our strategy to the relaxed problem of best expert identification and provide numerical simulations consistent with our theoretical results.

Submitted 5 June, 2023; originally announced June 2023.

arXiv:2211.04092 [pdf, other]

Optimal Permutation Estimation in Crowd-Sourcing problems

Authors: Emmanuel Pilliat, Alexandra Carpentier, Nicolas Verzelen

Abstract: Motivated by crowd-sourcing applications, we consider a model where we have partial observations from a bivariate isotonic $n\times d$ matrix with an unknown permutation $π^*$ acting on its rows. Focusing on the twin problems of recovering the permutation $π^*$ and estimating the unknown matrix, we introduce a polynomial-time procedure achieving the minimax risk for these two problems, for all possible values of $n$, $d$, and all possible sampling efforts. Along the way, we establish that, in some regimes, recovering the unknown permutation $π^*$ is considerably simpler than estimating the matrix.

Submitted 30 March, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

arXiv:2203.09784 [pdf, other]

The price of unfairness in linear bandits with biased feedback

Authors: Solenne Gaucher, Alexandra Carpentier, Christophe Giraud

Abstract: In this paper, we study the problem of fair sequential decision making with biased linear bandit feedback. At each round, a player selects an action described by a covariate and by a sensitive attribute. The perceived reward is a linear combination of the covariates of the chosen action, but the player only observes a biased evaluation of this reward, depending on the sensitive attribute. To characterize the difficulty of this problem, we design a phased elimination algorithm that corrects the unfair evaluations, and establish upper bounds on its regret. We show that the worst-case regret is smaller than $\mathcal{O}(κ_*^{1/3}\log(T)^{1/3}T^{2/3})$, where $κ_*$ is an explicit geometrical constant characterizing the difficulty of bias estimation. We prove lower bounds on the worst-case regret for some sets of actions showing that this rate is tight up to a possible sub-logarithmic factor. We also derive gap-dependent upper bounds on the regret, and matching lower bounds for some problem instances. Interestingly, these results reveal a transition between a regime where the problem is as difficult as its unbiased counterpart, and a regime where it can be much harder.

Submitted 3 June, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

arXiv:2109.04346 [pdf, ps, other]

Goodness-of-Fit Testing for Hölder-Continuous Densities: Sharp Local Minimax Rates

Authors: Julien Chhor, Alexandra Carpentier

Abstract: We consider the goodness-of-fit testing problem for Hölder smooth densities over $\mathbb{R}^d$: given $n$ iid observations with unknown density $p$ and given a known density $p_0$, we investigate how large $ρ$ should be to distinguish, with high probability, the case $p=p_0$ from the composite alternative of all Hölder-smooth densities $p$ such that $\|p-p_0\|_t \geq ρ$ where $t \in [1,2]$. The densities are assumed to be defined over $\mathbb{R}^d$ and to have Hölder smoothness parameter $α>0$. In the present work, we solve the case $α\leq 1$ and handle the case $α>1$ using an additional technical restriction on the densities. We identify matching upper and lower bounds on the local minimax rates of testing, given explicitly in terms of $p_0$. We propose novel test statistics which we believe could be of independent interest. We also establish the first definition of an explicit cutoff $u_B$ allowing us to split $\mathbb{R}^d$ into a bulk part (defined as the subset of $\mathbb{R}^d$ where $p_0$ takes only values greater than or equal to $u_B$) and a tail part (defined as the complement of the bulk), each part involving fundamentally different contributions to the local minimax rates of testing.

Submitted 17 March, 2023; v1 submitted 9 September, 2021; originally announced September 2021.

Comments: 79 pages

MSC Class: 62G10 (Primary); 62B10; 62C20 (Secondary)

arXiv:2106.10166 [pdf, other]

Problem Dependent View on Structured Thresholding Bandit Problems

Authors: James Cheshire, Pierre Ménard, Alexandra Carpentier

Abstract: We investigate the problem dependent regime in the stochastic Thresholding Bandit problem (TBP) under several shape constraints. In the TBP, the objective of the learner is to output, at the end of a sequential game, the set of arms whose means are above a given threshold. The vanilla, unstructured, case is already well studied in the literature. Taking $K$ as the number of arms, we consider (i) the case where the sequence of arm means $(μ_k)_{k=1}^K$ is monotonically increasing (MTBP) and (ii) the case where $(μ_k)_{k=1}^K$ is concave (CTBP). We consider both cases in the problem dependent regime and study the probability of error - i.e. the probability of misclassifying at least one arm. In the fixed budget setting, we provide upper and lower bounds for the probability of error in both the concave and monotone settings, as well as associated algorithms. In both settings the bounds match in the problem dependent regime up to universal constants in the exponential.

Submitted 18 June, 2021; originally announced June 2021.

Comments: 25 pages. arXiv admin note: text overlap with arXiv:2006.10006

arXiv:2103.12452 [pdf, other]

Bandits with many optimal arms

Authors: Rianne de Heide, James Cheshire, Pierre Ménard, Alexandra Carpentier

Abstract: We consider a stochastic bandit problem with a possibly infinite number of arms. We write $p^*$ for the proportion of optimal arms and $Δ$ for the minimal mean-gap between optimal and sub-optimal arms. We characterize the optimal learning rates both in the cumulative regret setting, and in the best-arm identification setting in terms of the problem parameters $T$ (the budget), $p^*$ and $Δ$. For the objective of minimizing the cumulative regret, we provide a lower bound of order $Ω(\log(T)/(p^*Δ))$ and a UCB-style algorithm with matching upper bound up to a factor of $\log(1/Δ)$. Our algorithm needs $p^*$ to calibrate its parameters, and we prove that this knowledge is necessary, since adapting to $p^*$ in this setting is impossible. For best-arm identification we also provide a lower bound of order $Ω(\exp(-cTΔ^2 p^*))$ on the probability of outputting a sub-optimal arm where $c>0$ is an absolute constant. We also provide an elimination algorithm with an upper bound matching the lower bound up to a factor of order $\log(T)$ in the exponential, and that does not need $p^*$ or $Δ$ as a parameter. Our results apply directly to the three related problems of competing against the $j$-th best arm, identifying an $ε$-good arm, and finding an arm with mean larger than a quantile of a known order.

Submitted 5 November, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

Comments: Substantial rewrite and added experiments. Accepted for NeurIPS 2021

arXiv:2102.00725 [pdf, ps, other]

Generalized non-stationary bandits

Authors: Anne Gael Manegueu, Alexandra Carpentier, Yi Yu

Abstract: In this paper, we study a non-stationary stochastic bandit problem, which generalizes the switching bandit problem. On top of the switching bandit problem (Case a), we are interested in three concrete examples: (b) the means of the arms are local polynomials, (c) the means of the arms are locally smooth, and (d) the gaps of the arms have a bounded number of inflexion points and where the highest arm mean cannot vary too much in a short range. These three settings are very different, but have in common the following: (i) the number of similarly-sized level sets of the logarithm of the gaps can be controlled, and (ii) the highest mean has a limited number of abrupt changes, and otherwise has limited variations. We propose a single algorithm in this general setting, that in particular solves in an efficient and unified way the four problems (a)-(d) mentioned.

Submitted 2 February, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

arXiv:2101.06935 [pdf]

Spatio-temporal characterization of causal electrophysiological activity stimulated by single pulse focused ultrasound: An ex vivo study on hippocampal brain slices

Authors: Ivan Suarez-Castellanos, Elena Dossi, Jeremy Vion-Bailly, Lea Salette, Jean-Yves Chapelon, Alexandre Carpentier, Gilles Huberfeld, William Apoutou N'D**

Abstract: Objective: The brain operates via generation, transmission and integration of neuronal signals and most neurological disorders are related to perturbation of these processes. Neurostimulation by Focused Ultrasound (FUS) is a promising technology with potential to rival other clinically-used techniques for the investigation of brain function and treatment of numerous neurological diseases. The purpose of this study was to characterize spatial and temporal aspects of causal electrophysiological signals directly stimulated by short, single pulses of focused ultrasound (FUS) on ex vivo mouse hippocampal brain slices. Approach: MicroElectrode Arrays (MEA) are used to study the spatio-temporal dynamics of extracellular neuronal activities both at the single neuron and neural networks scales. Hence, MEAs provide an excellent platform for characterization of electrical activity generated, modulated and transmitted in response to FUS exposure. In this study, a novel mixed FUS/MEA platform was designed for the spatio-temporal description of the causal responses generated by single 1.78 MHz FUS pulses in ex vivo mouse hippocampal brain slices. Main results: Our results show that FUS pulses can generate local field potentials (LFPs), sustained by synchronized neuronal post-synaptic potentials, and reproducing network activities. LFPs induced by FUS stimulation were found to be repeatable to consecutive FUS pulses though exhibiting a wide range of amplitudes (50-600 $μ$V), durations (20-200 ms), and response delays (10-60 ms).
Moreover, LFPs were spread across the hippocampal slice following single FUS pulses thus demonstrating that FUS may be capable of stimulating different neural structures within the hippocampus. Significance: Current knowledge on neurostimulation by ultrasound describes neuronal activity generated by trains of repetitive ultrasound pulses. This novel study details the causal neural responses produced by single-pulse FUS neurostimulation while illustrating the distribution and propagation properties of this neural activity along major neural pathways of the hippocampus.

Submitted 18 January, 2021; originally announced January 2021.

arXiv:2012.13766 [pdf, ps, other]

Sharp Local Minimax Rates for Goodness-of-Fit Testing in multivariate Binomial and Poisson families and in multinomials

Authors: J. Chhor, A. Carpentier

Abstract: We consider the identity testing problem - or goodness-of-fit testing problem - in multivariate binomial families, multivariate Poisson families and multinomial distributions. Given a known distribution $p$ and $n$ iid samples drawn from an unknown distribution $q$, we investigate how large $ρ>0$ should be to distinguish, with high probability, the case $p=q$ from the case $d(p,q) \geq ρ$, where $d$ denotes a specific distance over probability distributions. We answer this question in the case of a family of different distances: $d(p,q) = \|p-q\|_t$ for $t \in [1,2]$ where $\|\cdot\|_t$ is the entrywise $\ell_t$ norm. Besides being locally minimax-optimal - i.e. characterizing the detection threshold in dependence of the known matrix $p$ - our tests have simple expressions and are easily implementable.

Submitted 23 April, 2022; v1 submitted 26 December, 2020; originally announced December 2020.

arXiv:2011.07818 [pdf, other]

Optimal multiple change-point detection for high-dimensional data

Authors: Emmanuel Pilliat, Alexandra Carpentier, Nicolas Verzelen

Abstract: This manuscript makes two contributions to the field of change-point detection. In a general change-point setting, we provide a generic algorithm for aggregating local homogeneity tests into an estimator of change-points in a time series. Interestingly, we establish that the error rates of the collection of tests directly translate into detection properties of the change-point estimator. This generic scheme is then applied to various problems including covariance change-point detection, nonparametric change-point detection and sparse multivariate mean change-point detection. For the latter, we derive minimax optimal rates that are adaptive to the unknown sparsity and to the distance between change-points when the noise is Gaussian. For sub-Gaussian noise, we introduce a variant that is optimal in almost all sparsity regimes.

Submitted 8 December, 2022; v1 submitted 16 November, 2020; originally announced November 2020.

arXiv:2010.13679 [pdf, ps, other]

Estimation of the $l_2$-norm and testing in sparse linear regression with unknown variance

Authors: Alexandra Carpentier, Olivier Collier, Laetitia Comminges, Alexandre B. Tsybakov, Yuhao Wang

Abstract: We consider the related problems of estimating the $l_2$-norm and the squared $l_2$-norm in sparse linear regression with unknown variance, as well as the problem of testing the hypothesis that the regression parameter is null under sparse alternatives with $l_2$ separation. We establish the minimax optimal rates of estimation (respectively, testing) in these three problems.

Submitted 26 October, 2020; originally announced October 2020.

arXiv:2010.10182 [pdf, ps, other]

The Elliptical Potential Lemma Revisited

Authors: Alexandra Carpentier, Claire Vernade, Yasin Abbasi-Yadkori

Abstract: This note proposes a new proof and new perspectives on the so-called Elliptical Potential Lemma. This result is important in online learning, especially for linear stochastic bandits. The original proof of the result, however short and elegant, does not give much flexibility on the type of potentials considered and we believe that this new interpretation can be of interest for future research in this field.

Submitted 20 October, 2020; originally announced October 2020.

Comments: 8 pages

arXiv:2006.10459 [pdf, other]

Stochastic bandits with arm-dependent delays

Authors: Anne Gael Manegueu, Claire Vernade, Alexandra Carpentier, Michal Valko

Abstract: Significant work has been recently dedicated to the stochastic delayed bandit setting because of its relevance in applications. The applicability of existing algorithms is however restricted by the fact that strong assumptions are often made on the delay distributions, such as full observability, restrictive shape constraints, or uniformity over arms. In this work, we weaken them significantly and only assume that there is a bound on the tail of the delay. In particular, we cover the important case where the delay distributions vary across arms, and the case where the delays are heavy-tailed. Addressing these difficulties, we propose a simple but efficient UCB-based algorithm called PatientBandits. We provide both problem-dependent and problem-independent bounds on the regret as well as performance lower bounds.

Submitted 18 June, 2020; originally announced June 2020.

Comments: 19 Pages, 4 figures

MSC Class: 62L10

arXiv:2006.10006 [pdf, ps, other]

The Influence of Shape Constraints on the Thresholding Bandit Problem

Authors: James Cheshire, Pierre Menard, Alexandra Carpentier

Abstract: We investigate the stochastic Thresholding Bandit problem (TBP) under several shape constraints. On top of (i) the vanilla, unstructured TBP, we consider (ii) the case where the sequence of arm means $(μ_k)_k$ is monotonically increasing (MTBP), (iii) the case where $(μ_k)_k$ is unimodal (UTBP) and (iv) the case where $(μ_k)_k$ is concave (CTBP). In the TBP the aim is to output, at the end of the sequential game, the set of arms whose means are above a given threshold. The regret is the highest gap between a misclassified arm and the threshold. In the fixed budget setting, we provide problem-independent minimax rates for the expected regret in all settings, as well as associated algorithms. We prove that the minimax rates for the regret are (i) $\sqrt{\log(K)K/T}$ for TBP, (ii) $\sqrt{\log(K)/T}$ for MTBP, (iii) $\sqrt{K/T}$ for UTBP and (iv) $\sqrt{\log\log K/T}$ for CTBP, where $K$ is the number of arms and $T$ is the budget. These rates demonstrate that the dependence on $K$ of the minimax regret varies significantly depending on the shape constraint. This highlights the fact that the shape constraints fundamentally modify the nature of the TBP.

Submitted 23 February, 2021; v1 submitted 17 June, 2020; originally announced June 2020.

arXiv:1906.10454 [pdf, ps, other]

Restless dependent bandits with fading memory

Authors: Oleksandr Zadorozhnyi, Gilles Blanchard, Alexandra Carpentier

Abstract: We study the stochastic multi-armed bandit problem in the case when the arm samples are dependent over time and generated from so-called weak $\mathcal{C}$-mixing processes. We establish a $\mathcal{C}$-Mix Improved UCB algorithm and provide both problem-dependent and problem-independent regret analysis in two different scenarios. In the first, so-called fast-mixing scenario, we show that the pseudo-regret enjoys the same upper bound (up to a factor) as for i.i.d. observations; whereas in the second, slow-mixing scenario, we discover a surprising effect: the regret upper bound is similar to the independent case, with an incremental additive term which does not depend on the number of arms. The analysis of the slow-mixing scenario is supported by a minimax lower bound, which (up to a $\log(T)$ factor) matches the obtained upper bound.

Submitted 25 June, 2019; originally announced June 2019.

Comments: 30 pages

arXiv:1902.01219 [pdf, ps, other]

Local minimax rates for closeness testing of discrete distributions

Authors: Joseph Lam-Weil, Alexandra Carpentier, Bharath K. Sriperumbudur

Abstract: We consider the closeness testing problem for discrete distributions. The goal is to distinguish whether two samples are drawn from the same unspecified distribution, or whether their respective distributions are separated in $L_1$-norm. In this paper, we focus on adapting the rate to the shape of the underlying distributions, i.e. we consider a local minimax setting. We provide, to the best of our knowledge, the first local minimax rate for the separation distance up to logarithmic factors, together with a test that achieves it. In view of the rate, closeness testing turns out to be substantially harder than the related one-sample testing problem over a wide range of cases.

Submitted 19 January, 2021; v1 submitted 1 February, 2019; originally announced February 2019.

MSC Class: 62F03; 62G10; 62F35

ACM Class: G.3; I.2.6

arXiv:1901.08802 [pdf, ps, other]

Optimal Sparsity Testing in Linear regression Model

Authors: Alexandra Carpentier, Nicolas Verzelen

Abstract: We consider the problem of sparsity testing in the high-dimensional linear regression model. The problem is to test whether the number of non-zero components (aka the sparsity) of the regression parameter $θ^*$ is less than or equal to $k_0$. We pinpoint the minimax separation distances for this problem, which amounts to quantifying how far a $k_1$-sparse vector $θ^*$ has to be from the set of $k_0$-sparse vectors so that a test is able to reject the null hypothesis with high probability. Two scenarios are considered. In the independent scenario, the covariates are i.i.d. normally distributed and the noise level is known. In the general scenario, both the covariance matrix of the covariates and the noise level are unknown. Although the minimax separation distances differ in these two scenarios, both of them actually depend on $k_0$ and $k_1$, illustrating that for this composite-composite testing problem both the size of the null and of the alternative hypotheses play a key role.

Submitted 23 April, 2020; v1 submitted 25 January, 2019; originally announced January 2019.

Comments: 50 pages

arXiv:1811.11043 [pdf, other]

Rotting bandits are not harder than stochastic ones

Authors: Julien Seznec, Andrea Locatelli, Alexandra Carpentier, Alessandro Lazaric, Michal Valko

Abstract: In stochastic multi-armed bandits, the reward distribution of each arm is assumed to be stationary. This assumption is often violated in practice (e.g., in recommendation systems), where the reward of an arm may change whenever it is selected, i.e., the rested bandit setting. In this paper, we consider the non-parametric rotting bandit setting, where rewards can only decrease. We introduce the filtering on expanding window average (FEWA) algorithm that constructs moving averages of increasing windows to identify arms that are more likely to return high rewards when pulled once more. We prove that for an unknown horizon $T$, and without any knowledge of the decreasing behavior of the $K$ arms, FEWA achieves a problem-dependent regret bound of $\widetilde{\mathcal{O}}(\log{(KT)})$, and a problem-independent one of $\widetilde{\mathcal{O}}(\sqrt{KT})$. Our result substantially improves over the algorithm of Levine et al. (2017), which suffers regret $\widetilde{\mathcal{O}}(K^{1/3}T^{2/3})$. FEWA also matches known bounds for the stochastic bandit setting, thus showing that the rotting bandits are not harder. Finally, we report simulations confirming the theoretical improvements of FEWA.

Submitted 9 May, 2020; v1 submitted 27 November, 2018; originally announced November 2018.

Journal ref: International Conference on Artificial Intelligence and Statistics (AISTATS 2019)

arXiv:1810.09390 [pdf, other]

A minimax near-optimal algorithm for adaptive rejection sampling

Authors: Juliette Achdou, Joseph C. Lam, Alexandra Carpentier, Gilles Blanchard

Abstract: Rejection Sampling is a fundamental Monte-Carlo method. It is used to sample from distributions admitting a probability density function which can be evaluated exactly at any given point, albeit at a high computational cost. However, without proper tuning, this technique implies a high rejection rate. Several methods have been explored to cope with this problem, based on the principle of adaptively estimating the density by a simpler function, using the information of the previous samples. Most of them either rely on strong assumptions on the form of the density, or do not offer any theoretical performance guarantee. We give the first theoretical lower bound for the problem of adaptive rejection sampling and introduce a new algorithm which guarantees a near-optimal rejection rate in a minimax sense.

Submitted 22 October, 2018; originally announced October 2018.

Comments: 32 pages, 4 figures. Submitted to ALT 2019

MSC Class: 62D05; 62L12; 62G05 (Primary) 62L05; 62G07 (Secondary)

ACM Class: G.3; I.2.6

arXiv:1810.02998 [pdf, other]

Total variation distance for discretely observed Lévy processes: a Gaussian approximation of the small jumps

Authors: Alexandra Carpentier, Céline Duval, Ester Mariucci

Abstract: It is common practice to treat small jumps of Lévy processes as Wiener noise and thus to approximate their marginals by a Gaussian distribution. However, results that quantify the goodness of this approximation according to a given metric are rare. In this paper, we clarify what happens when the chosen metric is the total variation distance. Such a choice is motivated by its statistical interpretation. If the total variation distance between two statistical models converges to zero, then no tests can be constructed to distinguish the two models, which are therefore equivalent, statistically speaking. We carry out a fine analysis of a Gaussian approximation for the small jumps of Lévy processes with infinite Lévy measure in total variation distance. Non-asymptotic bounds for the total variation distance between $n$ discrete observations of small jumps of a Lévy process and the corresponding Gaussian distribution are presented and extensively discussed. As a byproduct, new upper bounds for the total variation distance between discrete observations of Lévy processes are provided. The theory is illustrated by concrete examples.

Submitted 2 April, 2019; v1 submitted 6 October, 2018; originally announced October 2018.

Comments: Important and necessary changes have been made in this new version, this version supersedes version 1

MSC Class: 60G51; 62M99 (Primary); 60E99 (Secondary)

arXiv:1809.08330 [pdf, other]

Estimating minimum effect with outlier selection

Authors: Alexandra Carpentier, Sylvain Delattre, Etienne Roquain, Nicolas Verzelen

Abstract: We introduce one-sided versions of Huber's contamination model, in which corrupted samples tend to take larger values than uncorrupted ones. Two intertwined problems are addressed: estimation of the mean of uncorrupted samples (minimum effect) and selection of corrupted samples (outliers). Regarding the minimum effect estimation, we derive the minimax risks and introduce estimators that adapt to the unknown number of contaminations. Interestingly, the optimal convergence rate differs greatly from that in the classical Huber's contamination model. Also, our analysis uncovers the effect of particular structural assumptions on the distribution of the contaminated samples. As for the problem of selecting the outliers, we formulate the problem in a multiple testing framework for which the location/scaling of the null hypotheses are unknown. We rigorously prove how estimating the null hypothesis is possible while maintaining a theoretical guarantee on the amount of the falsely selected outliers, both through false discovery rate (FDR) or post hoc bounds. As a by-product, we address a long-standing open issue on FDR control under equi-correlation, which reinforces the interest of removing dependency when performing multiple testing.

Submitted 21 September, 2018; originally announced September 2018.

Comments: 70 pages; 7 figures

arXiv:1807.02089 [pdf, other]

Linear Bandits with Stochastic Delayed Feedback

Authors: Claire Vernade, Alexandra Carpentier, Tor Lattimore, Giovanni Zappella, Beyza Ermis, Michael Brueckner

Abstract: Stochastic linear bandits are a natural and well-studied model for structured exploration/exploitation problems and are widely used in applications such as online marketing and recommendation. One of the main challenges faced by practitioners hoping to apply existing algorithms is that usually the feedback is randomly delayed and delays are only partially observable. For example, while a purchase is usually observable some time after the display, the decision of not buying is never explicitly sent to the system. In other words, the learner only observes delayed positive events. We formalize this problem as a novel stochastic delayed linear bandit and propose ${\tt OTFLinUCB}$ and ${\tt OTFLinTS}$, two computationally efficient algorithms able to integrate new information as it becomes available and to deal with the permanently censored feedback. We prove optimal $\tilde O(\smash{d\sqrt{T}})$ bounds on the regret of the first algorithm and study the dependency on delay-dependent parameters. Our model, assumptions and results are validated by experiments on simulated and real data.

Submitted 2 March, 2020; v1 submitted 5 July, 2018; originally announced July 2018.

arXiv:1804.06494 [pdf, ps, other]

Minimax rate of testing in sparse linear regression

Authors: Alexandra Carpentier, Olivier Collier, Laëtitia Comminges, Alexandre B. Tsybakov, Yuhao Wang

Abstract: We consider the problem of testing the hypothesis that the parameter of a linear regression model is 0 against an $s$-sparse alternative separated from 0 in the $\ell_2$-distance. We show that, in the Gaussian linear regression model with $p < n$, where $p$ is the dimension of the parameter and $n$ is the sample size, the non-asymptotic minimax rate of testing has the form $\sqrt{(s/n)\log(1 + \sqrt{p}/s)}$. We also show that this is the minimax rate of estimation of the $\ell_2$-norm of the regression parameter.

Submitted 9 October, 2018; v1 submitted 17 April, 2018; originally announced April 2018.

arXiv:1711.09294 [pdf, other]

An Adaptive Strategy for Active Learning with Smooth Decision Boundary

Authors: Andrea Locatelli, Alexandra Carpentier, Samory Kpotufe

Abstract: We present the first adaptive strategy for active learning in the setting of classification with smooth decision boundary. The problem of adaptivity (to unknown distributional parameters) has remained open since the seminal work of Castro and Nowak (2007), which first established (active learning) rates for this setting. While some recent advances on this problem establish adaptive rates in the case of univariate data, adaptivity in the more practical setting of multivariate data has so far remained elusive. Combining insights from various recent works, we show that, for the multivariate case, a careful reduction to univariate-adaptive strategies yields near-optimal rates without prior knowledge of distributional parameters.

Submitted 25 November, 2017; originally announced November 2017.

arXiv:1707.00833 [pdf, ps, other]

doi 10.1214/19-AOS1884

Two-sample Hypothesis Testing for Inhomogeneous Random Graphs

Authors: Debarghya Ghoshdastidar, Maurilio Gutzeit, Alexandra Carpentier, Ulrike von Luxburg

Abstract: The study of networks leads to a wide range of high dimensional inference problems. In many practical applications, one needs to draw inference from one or a few large sparse networks. The present paper studies hypothesis testing of graphs in this high-dimensional regime, where the goal is to test between two populations of inhomogeneous random graphs defined on the same set of $n$ vertices. The size of each population $m$ is much smaller than $n$, and can even be a constant as small as 1. The critical question in this context is whether the problem is solvable for small $m$. We answer this question from a minimax testing perspective. Let $P,Q$ be the population adjacencies of two sparse inhomogeneous random graph models, and $d$ be a suitably defined distance function. Given a population of $m$ graphs from each model, we derive minimax separation rates for the problem of testing $P=Q$ against $d(P,Q)>ρ$. We observe that if $m$ is small, then the minimax separation is too large for some popular choices of $d$, including total variation distance between corresponding distributions. This implies that some models that are widely separated in $d$ cannot be distinguished for small $m$, and hence, the testing problem is generally not solvable in these cases. We also show that if $m>1$, then the minimax separation is relatively small if $d$ is the Frobenius norm or operator norm distance between $P$ and $Q$. For $m=1$, only the latter distance provides small minimax separation. Thus, for these distances, the problem is solvable for small $m$. We also present near-optimal two-sample tests in both cases, where tests are adaptive with respect to the sparsity level of the graphs.

Submitted 17 July, 2019; v1 submitted 4 July, 2017; originally announced July 2017.

Comments: To appear in the Annals of Statistics. This 54-page version includes the supplementary material (appendix to the main paper)

MSC Class: 62H15; 62C20; 05C80; 60B20

Journal ref: Ann. Statist. Volume 48, Number 4 (2020), 2208-2229

arXiv:1705.06168 [pdf, ps, other]

Two-Sample Tests for Large Random Graphs Using Network Statistics

Authors: Debarghya Ghoshdastidar, Maurilio Gutzeit, Alexandra Carpentier, Ulrike von Luxburg

Abstract: We consider a two-sample hypothesis testing problem, where the distributions are defined on the space of undirected graphs, and one has access to only one observation from each model. A motivating example for this problem is comparing the friendship networks on Facebook and LinkedIn. The practical approach to such problems is to compare the networks based on certain network statistics. In this paper, we present a general principle for two-sample hypothesis testing in such scenarios without making any assumption about the network generation process. The main contribution of the paper is a general formulation of the problem based on concentration of network statistics, and consequently, a consistent two-sample test that arises as the natural solution for this problem. We also show that the proposed test is minimax optimal for certain network statistics.

Submitted 26 May, 2017; v1 submitted 17 May, 2017; originally announced May 2017.

Comments: To be presented in COLT 2017 (author sequence, funding details and minor typos updated in version 2)

arXiv:1704.02760 [pdf, ps, other]

Constructing confidence sets for the matrix completion problem

Authors: Alexandra Carpentier, Olga Klopp, Matthias Löffler

Abstract: In the present note we consider the problem of constructing honest and adaptive confidence sets for the matrix completion problem. For the Bernoulli model with known variance of the noise we provide a realizable method for constructing confidence sets that adapt to the unknown rank of the true matrix.

Submitted 10 April, 2017; originally announced April 2017.

arXiv:1703.05841 [pdf, other]

Adaptivity to Noise Parameters in Nonparametric Active Learning

Authors: Andrea Locatelli, Alexandra Carpentier, Samory Kpotufe

Abstract: This work addresses various open questions in the theory of active learning for nonparametric classification. Our contributions are both statistical and algorithmic:

- We establish new minimax rates for active learning under common noise conditions. These rates display interesting transitions, due to the interaction between noise smoothness and margin, that are not present in the passive setting. Some such transitions were previously conjectured, but remained unconfirmed.

- We present a generic algorithmic strategy for adaptivity to unknown noise smoothness and margin; our strategy achieves optimal rates in many general situations; furthermore, unlike in previous work, we avoid the need for adaptive confidence sets, resulting in strictly milder distributional requirements.

Submitted 16 March, 2017; originally announced March 2017.

arXiv:1703.00167 [pdf, ps, other]

Adaptive estimation of the sparsity in the Gaussian vector model

Authors: Alexandra Carpentier, Nicolas Verzelen

Abstract: Consider the Gaussian vector model with mean value θ. We study the twin problems of estimating the number |θ|_0 of non-zero components of θ and testing whether |θ|_0 is smaller than some value. For testing, we establish the minimax separation distances for this model and introduce a minimax adaptive test. Extensions to the case of unknown variance are also discussed. Rewriting the estimation of |θ|_0 as a multiple testing problem of all hypotheses {|θ|_0 <= q}, we both derive a new way of assessing the optimality of a sparsity estimator and we exhibit such an optimal procedure. This general approach provides a roadmap for estimating the complexity of the signal in various statistical models.

Submitted 1 March, 2017; originally announced March 2017.

Comments: 76 pages

MSC Class: 62C20; 62G10; 62B10

arXiv:1702.03760 [pdf, other]

Minimax Euclidean Separation Rates for Testing Convex Hypotheses in $\mathbb{R}^d$

Authors: Gilles Blanchard, Alexandra Carpentier, Maurilio Gutzeit

Abstract: We consider composite-composite testing problems for the expectation in the Gaussian sequence model where the null hypothesis corresponds to a convex subset $\mathcal{C}$ of $\mathbb{R}^d$. We adopt a minimax point of view and our primary objective is to describe the smallest Euclidean distance between the null and alternative hypotheses such that there is a test with small total error probability. In particular, we focus on the dependence of this distance on the dimension $d$ and the sample size/variance parameter $n$, giving rise to the minimax separation rate. In this paper we discuss lower and upper bounds on this rate for different smooth and non-smooth choices for $\mathcal{C}$.

Submitted 23 August, 2018; v1 submitted 13 February, 2017; originally announced February 2017.

MSC Class: 62G10

arXiv:1608.04861 [pdf, ps, other]

Adaptive confidence sets for matrix completion

Authors: Alexandra Carpentier, Olga Klopp, Matthias Löffler, Richard Nickl

Abstract: In the present paper we study the problem of existence of honest and adaptive confidence sets for matrix completion. We consider two statistical models: the trace regression model and the Bernoulli model. In the trace regression model, we show that honest confidence sets that adapt to the unknown rank of the matrix exist even when the error variance is unknown. Contrary to this, we prove that in the Bernoulli model, honest and adaptive confidence sets exist only when the error variance is known a priori. In the course of our proofs we obtain bounds for the minimax rates of certain composite hypothesis testing problems arising in low rank inference.

Submitted 6 February, 2017; v1 submitted 17 August, 2016; originally announced August 2016.

arXiv:1605.09004 [pdf, ps, other]

Tight (Lower) Bounds for the Fixed Budget Best Arm Identification Bandit Problem

Authors: Alexandra Carpentier, Andrea Locatelli

Abstract: We consider the problem of best arm identification with a fixed budget $T$, in the $K$-armed stochastic bandit setting, with arm distributions defined on $[0,1]$. We prove that any bandit strategy, for at least one bandit problem characterized by a complexity $H$, will misidentify the best arm with probability lower bounded by $$\exp\Big(-\frac{T}{\log(K)H}\Big),$$ where $H$ is the sum for all sub-optimal arms of the inverse of the squared gaps. Our result formally disproves the general belief, coming from results in the fixed confidence setting, that there must exist an algorithm for this problem whose probability of error is upper bounded by $\exp(-T/H)$. This also proves that some existing strategies based on the Successive Rejection of the arms are optimal, thereby closing the current gap between upper and lower bounds for the fixed budget best arm identification problem.

Submitted 29 May, 2016; originally announced May 2016.

Comments: COLT 2016

arXiv:1605.08671 [pdf, other]

An optimal algorithm for the Thresholding Bandit Problem

Authors: Andrea Locatelli, Maurilio Gutzeit, Alexandra Carpentier

Abstract: We study a specific combinatorial pure exploration stochastic bandit problem where the learner aims at finding the set of arms whose means are above a given threshold, up to a given precision, and for a fixed time horizon. We propose a parameter-free algorithm based on an original heuristic, and prove that it is optimal for this problem by deriving matching upper and lower bounds. To the best of our knowledge, this is the first non-trivial pure exploration setting with fixed budget for which optimal strategies are constructed.

Submitted 27 May, 2016; originally announced May 2016.

Comments: ICML 2016

arXiv:1601.00504 [pdf, other]

Learning relationships between data obtained independently

Authors: Alexandra Carpentier, Teresa Schlueter

Abstract: The aim of this paper is to provide a new method for learning the relationships between data that have been obtained independently. Unlike existing methods like matching, the proposed technique does not require any contextual information, provided that the dependency between the variables of interest is monotone. It can therefore be easily combined with matching in order to exploit the advantages of both methods. This technique can be described as a mix between quantile matching, and deconvolution. We provide for it a theoretical and an empirical validation.

Submitted 4 January, 2016; originally announced January 2016.

arXiv:1512.03206 [pdf, ps, other]

Determination of the species generated in atmospheric-pressure laser-induced plasmas by mass spectrometry techniques

Authors: F. Valle, C. Salgado, J. I. Apiñaniz, A. V. Carpentier, M. Sánchez Albaneda, L. Roso, C. Raposo, C. Padilla, A. Peralta Conde

Abstract: We present temporal information obtained by mass spectrometry techniques about the evolution of plasmas generated by laser filamentation in air. The experimental setup used in this work allowed us to study not only the dynamics of the filament core but also of the energy reservoir that surrounds it. Furthermore, valuable insights about the chemistry of such systems like the photofragmentation and/or formation of molecules were obtained. The interpretation of the experimental results is supported by PIC (particle in cell) simulations.

Submitted 10 December, 2015; originally announced December 2015.

Comments: 11 pages, 5 figures

arXiv:1510.04852 [pdf, other]

A novel technique to achieve atomic macro-coherence as a tool to determine the nature of neutrinos

Authors: R. Boyero, A. V. Carpentier, J. J. Gomez-Cadenas, A. Peralta Conde

Abstract: The photon spectrum in macrocoherent atomic de-excitation via radiative emission of neutrino pairs (RENP) has been proposed as a sensitive probe of the neutrino mass spectrum, capable of competing with conventional neutrino experiments. In this paper we revisit this intriguing possibility, presenting an alternative method for inducing large coherence in a target based on adiabatic techniques. More concretely, we propose the use of a modified version of Coherent Population Return (CPR), namely double CPR, that turns out to be extremely robust with respect to the experimental parameters, and capable of inducing a coherence close to 100% in the target.

Submitted 16 October, 2015; originally announced October 2015.

Comments: 16 pages, 12 figures. arXiv admin note: text overlap with arXiv:1510.00421

arXiv:1507.04523 [pdf, ps, other]

Upper-Confidence-Bound Algorithms for Active Learning in Multi-Armed Bandits

Authors: Alexandra Carpentier, Alessandro Lazaric, Mohammad Ghavamzadeh, Rémi Munos, Peter Auer, András Antos

Abstract: In this paper, we study the problem of estimating uniformly well the mean values of several distributions given a finite budget of samples. If the variance of the distributions were known, one could design an optimal sampling strategy by collecting a number of independent samples per distribution that is proportional to their variance. However, in the more realistic case where the distributions are not known in advance, one needs to design adaptive sampling strategies in order to select which distribution to sample from according to the previously observed samples. We describe two strategies based on pulling the distributions a number of times that is proportional to a high-probability upper-confidence-bound on their variance (built from previous observed samples) and report a finite-sample performance analysis on the excess estimation error compared to the optimal allocation. We show that the performance of these allocation strategies depends not only on the variances but also on the full shape of the distributions.

Submitted 16 July, 2015; originally announced July 2015.

Comments: 30 pages, 2 Postscript figures, uses elsarticle.cls, earlier, shorter version published in Proceedings of the 22nd International Conference, Algorithmic Learning Theory

ACM Class: G.3

arXiv:1507.03829 [pdf, ps, other]

On signal detection and confidence sets for low rank inference problems

Authors: Alexandra Carpentier, Richard Nickl

Abstract: We consider the signal detection problem in the Gaussian design trace regression model with low rank alternative hypotheses. We derive the precise (Ingster-type) detection boundary for the Frobenius and the nuclear norm. We then apply these results to show that honest confidence sets for the unknown matrix parameter that adapt to all low rank sub-models in nuclear norm do not exist. This shows that recently obtained positive results in (Carpentier, Eisert, Gross and Nickl, 2015) for confidence sets in low rank recovery problems are essentially optimal.

Submitted 9 November, 2015; v1 submitted 14 July, 2015; originally announced July 2015.

Comments: This paper will appear in the Electronic Journal of Statistics

arXiv:1505.04627 [pdf, other]

Simple regret for infinitely many armed bandits

Authors: Alexandra Carpentier, Michal Valko

Abstract: We consider a stochastic bandit problem with infinitely many arms. In this setting, the learner has no chance of trying all the arms even once and has to dedicate its limited number of samples only to a certain number of arms. All previous algorithms for this setting were designed for minimizing the cumulative regret of the learner. In this paper, we propose an algorithm aiming at minimizing the simple regret. As in the cumulative regret setting of infinitely many armed bandits, the rate of the simple regret will depend on a parameter $β$ characterizing the distribution of the near-optimal arms. We prove that depending on $β$, our algorithm is minimax optimal either up to a multiplicative constant or up to a $\log(n)$ factor. We also provide extensions to several important cases: when $β$ is unknown, in a natural setting where the near-optimal arms have a small variance, and in the case of unknown time horizon.

Submitted 18 May, 2015; originally announced May 2015.

Comments: in 32nd International Conference on Machine Learning (ICML 2015)

arXiv:1504.05724 [pdf, ps, other]

doi 10.1103/PhysRevA.91.053414

In-trap fluorescence detection of atoms in a microscopic dipole trap

Authors: A. J. Hilliard, Y. H. Fung, P. Sompet, A. V. Carpentier, M. F. Andersen

Abstract: We investigate fluorescence detection using a standing wave of blue-detuned light of one or more atoms held in a deep, microscopic dipole trap. The blue-detuned standing wave realizes a Sisyphus laser cooling mechanism so that an atom can scatter many photons while remaining trapped. When imaging more than one atom, the blue detuning limits loss due to inelastic light-assisted collisions. Using this standing wave probe beam, we demonstrate that we can count from one to the order of 100 atoms in the microtrap with sub-poissonian precision.

Submitted 19 May, 2015; v1 submitted 22 April, 2015; originally announced April 2015.

Comments: 13 pages, 10 figures

Journal ref: Phys. Rev. A, 91, 053414 (2015)

arXiv:1504.03234 [pdf, other]

doi 10.1007/978-3-030-26391-1_18

Uncertainty Quantification for Matrix Compressed Sensing and Quantum Tomography Problems

Authors: Alexandra Carpentier, Jens Eisert, David Gross, Richard Nickl

Abstract: We construct minimax optimal non-asymptotic confidence sets for low rank matrix recovery algorithms such as the Matrix Lasso or Dantzig selector. These are employed to devise adaptive sequential sampling procedures that guarantee recovery of the true matrix in Frobenius norm after a data-driven stopping time $\hat n$ for the number of measurements that have to be taken. With high probability, this stopping time is minimax optimal. We detail applications to quantum tomography problems where measurements arise from Pauli observables. We also give a theoretical construction of a confidence set for the density matrix of a quantum state that has optimal diameter in nuclear norm. The non-asymptotic properties of our confidence sets are further investigated in a simulation study.

Submitted 21 December, 2015; v1 submitted 13 April, 2015; originally announced April 2015.

Journal ref: pp 385-430 (2019) In: Gozlan N., Latała R., Lounici K., Madiman M. (eds) High Dimensional Probability VIII. Progress in Probability, vol 74. Birkh\

arXiv:1502.04654 [pdf, other]

An iterative hard thresholding estimator for low rank matrix recovery with explicit limiting distribution

Authors: Alexandra Carpentier, Arlene K. H. Kim

Abstract: We consider the problem of low rank matrix recovery in a stochastically noisy high dimensional setting. We propose a new estimator for the low rank matrix, based on the iterative hard thresholding method, and that is computationally efficient and simple. We prove that our estimator is efficient both in terms of the Frobenius risk, and in terms of the entry-wise risk uniformly over any change of orthonormal basis. This result allows us, in the case where the design is Gaussian, to provide the limiting distribution of the estimator, which is of great interest for constructing tests and confidence sets for low dimensional subsets of entries of the low rank matrix.

Submitted 1 March, 2016; v1 submitted 16 February, 2015; originally announced February 2015.

arXiv:1501.04467 [pdf, other]

Implementable confidence sets in high dimensional regression

Authors: Alexandra Carpentier

Abstract: We consider the setting of linear regression in high dimension. We focus on the problem of constructing adaptive and honest confidence sets for the sparse parameter θ, i.e. we want to construct a confidence set for θ that contains θ with high probability, and that is as small as possible. The l_2 diameter of such a confidence set should depend on the sparsity S of θ: the larger S, the wider the confidence set. However, in practice, S is unknown. This paper focuses on constructing a confidence set for θ which contains θ with high probability, whose diameter is adaptive to the unknown sparsity S, and which is implementable in practice.

Submitted 19 January, 2015; originally announced January 2015.

arXiv:1312.2968 [pdf, ps, other]

Adaptive confidence intervals for the tail coefficient in a wide second order class of Pareto models

Authors: Alexandra Carpentier, Arlene K. H. Kim

Abstract: We study the problem of constructing honest and adaptive confidence intervals for the tail coefficient in the second order Pareto model, when the second order coefficient is unknown. This problem is translated into a testing problem on the second order parameter. By constructing an appropriate model and an associated test statistic, we provide a uniform and adaptive confidence interval for the first order parameter. We also provide an almost matching lower bound, which proves that the result is minimax optimal up to a logarithmic factor.

Submitted 17 September, 2014; v1 submitted 10 December, 2013; originally announced December 2013.

Showing 1–50 of 67 results for author: Carpentier, A