Showing 1–31 of 31 results for author: Koolen, W

Searching in archive cs.

  1. arXiv:2311.03483  [pdf, ps, other]

    math.ST cs.LG cs.NE

    Hebbian learning inspired estimation of the linear regression parameters from queries

    Authors: Johannes Schmidt-Hieber, Wouter M Koolen

    Abstract: Local learning rules in biological neural networks (BNNs) are commonly referred to as Hebbian learning. [26] links a biologically motivated Hebbian learning rule to a specific zeroth-order optimization method. In this work, we study a variation of this Hebbian learning rule to recover the regression vector in the linear regression model. Zeroth-order optimization methods are known to converge with…

    Submitted 26 September, 2023; originally announced November 2023.

    Comments: 34 pages

    MSC Class: Primary: 62L20; secondary: 62J05

  2. arXiv:2304.12768  [pdf, ps, other]

    cs.GT math.OC stat.ML

    Towards Characterizing the First-order Query Complexity of Learning (Approximate) Nash Equilibria in Zero-sum Matrix Games

    Authors: Hédi Hadiji, Sarah Sachs, Tim van Erven, Wouter M. Koolen

    Abstract: In the first-order query model for zero-sum $K\times K$ matrix games, players observe the expected pay-offs for all their possible actions under the randomized action played by their opponent. This classical model has received renewed interest after the discovery by Rakhlin and Sridharan that $ε$-approximate Nash equilibria can be computed efficiently from $O(\frac{\ln K}{ε})$ instead of…

    Submitted 2 November, 2023; v1 submitted 25 April, 2023; originally announced April 2023.

  3. arXiv:2203.04485  [pdf, ps, other]

    math.PR cs.GT math.ST

    A composite generalization of Ville's martingale theorem

    Authors: Johannes Ruf, Martin Larsson, Wouter M. Koolen, Aaditya Ramdas

    Abstract: We provide a composite version of Ville's theorem that an event has zero measure if and only if there exists a nonnegative martingale which explodes to infinity when that event occurs. This is a classic result connecting measure-theoretic probability to the sequence-by-sequence game-theoretic probability, recently developed by Shafer and Vovk. Our extension of Ville's result involves appropriate c…

    Submitted 3 May, 2023; v1 submitted 8 March, 2022; originally announced March 2022.

    Comments: 21 pages

  4. arXiv:2110.15573  [pdf, other]

    stat.ML cs.LG

    A/B/n Testing with Control in the Presence of Subpopulations

    Authors: Yoan Russac, Christina Katsimerou, Dennis Bohle, Olivier Cappé, Aurélien Garivier, Wouter Koolen

    Abstract: Motivated by A/B/n testing applications, we consider a finite set of distributions (called \emph{arms}), one of which is treated as a \emph{control}. We assume that the population is stratified into homogeneous subpopulations. At every time step, a subpopulation is sampled and an arm is chosen: the resulting observation is an independent draw from the arm conditioned on the subpopulation. The qual…

    Submitted 29 October, 2021; originally announced October 2021.

    Journal ref: NeurIPS 2021, Dec 2021, Virtual, France

  5. arXiv:2107.01881  [pdf, ps, other]

    cs.LG

    Robust Online Convex Optimization in the Presence of Outliers

    Authors: Tim van Erven, Sarah Sachs, Wouter M. Koolen, Wojciech Kotłowski

    Abstract: We consider online convex optimization when a number k of data points are outliers that may be corrupted. We model this by introducing the notion of robust regret, which measures the regret only on rounds that are not outliers. The aim for the learner is to achieve small robust regret, without knowing where the outliers are. If the outliers are chosen adversarially, we show that a simple filtering…

    Submitted 5 July, 2021; originally announced July 2021.

    Journal ref: Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:4174-4194, 2021

  6. arXiv:2102.06622  [pdf, other]

    cs.LG stat.ML

    MetaGrad: Adaptation using Multiple Learning Rates in Online Learning

    Authors: Tim van Erven, Wouter M. Koolen, Dirk van der Hoeven

    Abstract: We provide a new adaptive method for online convex optimization, MetaGrad, that is robust to general convex losses but achieves faster rates for a broad class of special functions, including exp-concave and strongly convex functions, but also various types of stochastic and non-stochastic functions without any curvature. We prove this by drawing a connection to the Bernstein condition, which is kn…

    Submitted 30 August, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

    Journal ref: Journal of Machine Learning Research 22(161):1-61, 2021

  7. arXiv:2102.03734  [pdf, other]

    cs.LG stat.ML

    Regret Minimization in Heavy-Tailed Bandits

    Authors: Shubhada Agrawal, Sandeep Juneja, Wouter M. Koolen

    Abstract: We revisit the classic regret-minimization problem in the stochastic multi-armed bandit setting when the arm-distributions are allowed to be heavy-tailed. Regret minimization has been well studied in simpler settings of either bounded support reward distributions or distributions that belong to a single parameter exponential family. We work under the much weaker assumption that the moments of orde…

    Submitted 7 February, 2021; originally announced February 2021.

    Comments: 35 pages, 2 figures

  8. arXiv:2102.00630  [pdf, other]

    math.ST cs.IT math.PR stat.ME

    Testing exchangeability: fork-convexity, supermartingales, and e-processes

    Authors: Aaditya Ramdas, Johannes Ruf, Martin Larsson, Wouter Koolen

    Abstract: Suppose we observe an infinite series of coin flips $X_1,X_2,\ldots$, and wish to sequentially test the null that these binary random variables are exchangeable. Nonnegative supermartingales (NSMs) are a workhorse of sequential inference, but we prove that they are powerless for this problem. First, utilizing a geometric concept called fork-convexity (a sequential analog of convexity), we show tha…

    Submitted 23 July, 2021; v1 submitted 31 January, 2021; originally announced February 2021.

    Comments: 34 pages, 7 figures, accepted at the International Journal of Approximate Reasoning

  9. arXiv:2008.07606  [pdf, other]

    cs.LG stat.ML

    Optimal Best-Arm Identification Methods for Tail-Risk Measures

    Authors: Shubhada Agrawal, Wouter M. Koolen, Sandeep Juneja

    Abstract: Conditional value-at-risk (CVaR) and value-at-risk (VaR) are popular tail-risk measures in finance and insurance industries as well as in highly reliable, safety-critical uncertain environments where often the underlying probability distributions are heavy-tailed. We use the multi-armed bandit best-arm identification framework and consider the problem of identifying the arm from amongst finitely m…

    Submitted 21 June, 2021; v1 submitted 17 August, 2020; originally announced August 2020.

    Comments: 55 pages, 4 figures

  10. arXiv:2007.00969  [pdf, other]

    stat.ML cs.LG

    Structure Adaptive Algorithms for Stochastic Bandits

    Authors: Rémy Degenne, Han Shao, Wouter M. Koolen

    Abstract: We study reward maximisation in a wide class of structured stochastic multi-armed bandit problems, where the mean rewards of arms satisfy some given structural constraints, e.g. linear, unimodal, sparse, etc. Our aim is to develop methods that are flexible (in that they easily adapt to different structures), powerful (in that they perform well empirically and/or provably match instance-dependent l…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: 10+18 pages. To be published in the proceedings of ICML 2020

  11. arXiv:2002.12242  [pdf, other]

    cs.LG stat.ML

    Lipschitz and Comparator-Norm Adaptivity in Online Learning

    Authors: Zakaria Mhammedi, Wouter M. Koolen

    Abstract: We study Online Convex Optimization in the unbounded setting where neither predictions nor gradient are constrained. The goal is to simultaneously adapt to both the sequence of gradients and the comparator. We first develop parameter-free and scale-free algorithms for a simplified setting with hints. We present two versions: the first adapts to the squared norms of both comparator and gradients se…

    Submitted 8 August, 2020; v1 submitted 27 February, 2020; originally announced February 2020.

    Comments: 30 Pages, 1 Figure

  12. arXiv:1906.10431  [pdf, other]

    stat.ML cs.LG

    Non-Asymptotic Pure Exploration by Solving Games

    Authors: Rémy Degenne, Wouter M. Koolen, Pierre Ménard

    Abstract: Pure exploration (aka active testing) is the fundamental task of sequentially gathering information to answer a query about a stochastic environment. Good algorithms make few mistakes and take few samples. Lower bounds (for multi-armed bandit models with arms in an exponential family) reveal that the sample complexity is determined by the solution to an optimisation problem. The existing state of…

    Submitted 25 June, 2019; originally announced June 2019.

  13. arXiv:1906.07801  [pdf, other]

    math.ST cs.IT cs.LG stat.ME

    Safe Testing

    Authors: Peter Grünwald, Rianne de Heide, Wouter Koolen

    Abstract: We develop the theory of hypothesis testing based on the e-value, a notion of evidence that, unlike the p-value, allows for effortlessly combining results from several studies in the common scenario where the decision to perform a new study may depend on previous outcomes. Tests based on e-values are safe, i.e. they preserve Type-I error guarantees, under such optional continuation. We define grow…

    Submitted 10 March, 2023; v1 submitted 18 June, 2019; originally announced June 2019.

    Comments: Accepted as discussion paper to the Journal of the Royal Statistical Society series B

  14. arXiv:1902.10797  [pdf, ps, other]

    cs.LG stat.ML

    Lipschitz Adaptivity with Multiple Learning Rates in Online Learning

    Authors: Zakaria Mhammedi, Wouter M. Koolen, Tim van Erven

    Abstract: We aim to design adaptive online learning algorithms that take advantage of any special structure that might be present in the learning task at hand, with as little manual tuning by the user as possible. A fundamental obstacle that comes up in the design of such adaptive algorithms is to calibrate a so-called step-size or learning rate hyperparameter depending on variance, gradient norms, etc. A r…

    Submitted 30 May, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

    Comments: 22 pages. To appear in COLT 2019

  15. arXiv:1902.03475  [pdf, other]

    cs.LG stat.ML

    Pure Exploration with Multiple Correct Answers

    Authors: Rémy Degenne, Wouter M. Koolen

    Abstract: We determine the sample complexity of pure exploration bandit problems with multiple good answers. We derive a lower bound using a new game equilibrium argument. We show how continuity and convexity properties of single-answer problems ensure that the Track-and-Stop algorithm has asymptotically optimal sample complexity. However, that convexity is lost when going to the multiple-answer setting. W…

    Submitted 9 February, 2019; originally announced February 2019.

  16. arXiv:1811.11419  [pdf, other]

    stat.ML cs.LG

    Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals

    Authors: Emilie Kaufmann, Wouter Koolen

    Abstract: This paper presents new deviation inequalities that are valid uniformly in time under adaptive sampling in a multi-armed bandit model. The deviations are measured using the Kullback-Leibler divergence in a given one-dimensional exponential family, and may take into account several arms at a time. They are obtained by constructing for each arm a mixture martingale based on a hierarchical prior, and…

    Submitted 8 December, 2021; v1 submitted 28 November, 2018; originally announced November 2018.

    Journal ref: Journal of Machine Learning Research, Microtome Publishing, 2021

  17. arXiv:1806.00973  [pdf, other]

    stat.ML cs.LG

    Sequential Test for the Lowest Mean: From Thompson to Murphy Sampling

    Authors: Emilie Kaufmann, Wouter Koolen, Aurelien Garivier

    Abstract: Learning the minimum/maximum mean among a finite set of distributions is a fundamental sub-task in planning, game tree search and reinforcement learning. We formalize this learning task as the problem of sequentially testing how the minimum mean among a finite set of distributions compares to a given threshold. We develop refined non-asymptotic lower bounds, which show that optimality mandates ver…

    Submitted 4 June, 2018; originally announced June 2018.

  18. arXiv:1706.02986  [pdf, ps, other]

    stat.ML cs.LG

    Monte-Carlo Tree Search by Best Arm Identification

    Authors: Emilie Kaufmann, Wouter Koolen

    Abstract: Recent advances in bandit tools and techniques for sequential learning are steadily enabling new applications and are promising the resolution of a range of challenging related problems. We study the game tree search problem, where the goal is to quickly identify the optimal move in a given game tree by sequentially sampling its stochastic payoffs. We develop new algorithms for trees of arbitrary…

    Submitted 6 November, 2017; v1 submitted 9 June, 2017; originally announced June 2017.

    Comments: Advances in Neural Information Processing Systems (NIPS), Dec 2017, Long Beach, United States

  19. arXiv:1605.06439  [pdf, ps, other]

    cs.LG

    Combining Adversarial Guarantees and Stochastic Fast Rates in Online Learning

    Authors: Wouter M. Koolen, Peter Grünwald, Tim van Erven

    Abstract: We consider online learning algorithms that guarantee worst-case regret rates in adversarial environments (so they can be deployed safely and will perform robustly), yet adapt optimally to favorable stochastic environments (so they will perform well in a variety of settings of practical importance). We quantify the friendliness of stochastic environments by means of the well-known Bernstein (a.k.a…

    Submitted 20 May, 2016; originally announced May 2016.

    Journal ref: Advances in Neural Information Processing Systems 29 (NeurIPS), 4457-4465, 2016

  20. arXiv:1604.08740  [pdf, ps, other]

    cs.LG

    MetaGrad: Multiple Learning Rates in Online Learning

    Authors: Tim van Erven, Wouter M. Koolen

    Abstract: In online convex optimization it is well known that certain subclasses of objective functions are much easier than arbitrary convex functions. We are interested in designing adaptive methods that can automatically get fast rates in as many such subclasses as possible, without any manual tuning. Previous adaptive methods are able to interpolate between strongly convex and general convex functions.…

    Submitted 1 November, 2016; v1 submitted 29 April, 2016; originally announced April 2016.

    Journal ref: Advances in Neural Information Processing Systems 29 (NeurIPS), 3666-3674, 2016

  21. arXiv:1603.04190  [pdf, ps, other]

    cs.LG stat.ML

    Online Isotonic Regression

    Authors: Wojciech Kotłowski, Wouter M. Koolen, Alan Malek

    Abstract: We consider the online version of the isotonic regression problem. Given a set of linearly ordered points (e.g., on the real line), the learner must predict labels sequentially at adversarially chosen positions and is evaluated by her total squared loss compared against the best isotonic (non-decreasing) function in hindsight. We survey several standard online learning algorithms and show that non…

    Submitted 7 October, 2016; v1 submitted 14 March, 2016; originally announced March 2016.

    Comments: 25 pages

  22. arXiv:1602.04676  [pdf, ps, other]

    math.ST cs.GT stat.ML

    Maximin Action Identification: A New Bandit Framework for Games

    Authors: Aurélien Garivier, Emilie Kaufmann, Wouter Koolen

    Abstract: We study an original problem of pure exploration in a strategic bandit model motivated by Monte Carlo Tree Search. It consists in identifying the best action in a game, when the player may sample random outcomes of sequentially chosen pairs of actions. We propose two strategies for the fixed-confidence setting: Maximin-LUCB, based on lower-and upper-confidence bounds; and Maximin-Racing, which ope…

    Submitted 15 February, 2016; originally announced February 2016.

  23. Robust Probability Updating

    Authors: Thijs van Ommen, Wouter M. Koolen, Thijs E. Feenstra, Peter D. Grünwald

    Abstract: This paper discusses an alternative to conditioning that may be used when the probability distribution is not fully specified. It does not require any assumptions (such as CAR: coarsening at random) on the unknown distribution. The well-known Monty Hall problem is the simplest scenario where neither naive conditioning nor the CAR assumption suffice to determine an updated probability distribution.…

    Submitted 2 May, 2016; v1 submitted 10 December, 2015; originally announced December 2015.

    Comments: 47 pages, 4 figures. This second version is the accepted manuscript: it incorporates reviewer comments and has a new title

    Journal ref: International Journal of Approximate Reasoning 74 (2016) 30-57

  24. arXiv:1502.08009  [pdf, ps, other]

    cs.LG stat.ML

    Second-order Quantile Methods for Experts and Combinatorial Games

    Authors: Wouter M. Koolen, Tim van Erven

    Abstract: We aim to design strategies for sequential decision making that adjust to the difficulty of the learning problem. We study this question both in the setting of prediction with expert advice, and for more general combinatorial decision tasks. We are not satisfied with just guaranteeing minimax regret rates, but we want our algorithms to perform significantly better on easy data. Two popular ways to…

    Submitted 27 February, 2015; originally announced February 2015.

  25. Universal Codes from Switching Strategies

    Authors: Wouter M. Koolen, Steven de Rooij

    Abstract: We discuss algorithms for combining sequential prediction strategies, a task which can be viewed as a natural generalisation of the concept of universal coding. We describe a graphical language based on Hidden Markov Models for defining prediction strategies, and we provide both existing and new models as examples. The models include efficient, parameterless models for switching between the input…

    Submitted 25 November, 2013; originally announced November 2013.

    Journal ref: IEEE Transactions on Information Theory, 59(11):7168-7185, November 2013

  26. arXiv:1301.0534  [pdf, ps, other]

    cs.LG stat.ML

    Follow the Leader If You Can, Hedge If You Must

    Authors: Steven de Rooij, Tim van Erven, Peter D. Grünwald, Wouter M. Koolen

    Abstract: Follow-the-Leader (FTL) is an intuitive sequential prediction strategy that guarantees constant regret in the stochastic setting, but has terrible performance for worst-case data. Other hedging strategies have better worst-case guarantees but may perform much worse than FTL if the data are not maximally adversarial. We introduce the FlipFlop algorithm, which is the first method that provably combi…

    Submitted 17 January, 2013; v1 submitted 3 January, 2013; originally announced January 2013.

    Comments: under submission

    Journal ref: Journal of Machine Learning Research 15(37):1281-1316, 2014

  27. arXiv:1008.4654  [pdf, other]

    cs.LG

    Freezing and Sleeping: Tracking Experts that Learn by Evolving Past Posteriors

    Authors: Wouter M. Koolen, Tim van Erven

    Abstract: A problem posed by Freund is how to efficiently track a small pool of experts out of a much larger set. This problem was solved when Bousquet and Warmuth introduced their mixing past posteriors (MPP) algorithm in 2001. In Freund's problem the experts would normally be considered black boxes. However, in this paper we re-examine Freund's problem in case the experts have internal structure that en…

    Submitted 27 August, 2010; originally announced August 2010.

  28. arXiv:1008.4532  [pdf, other]

    cs.LG

    Switching between Hidden Markov Models using Fixed Share

    Authors: Wouter M. Koolen, Tim van Erven

    Abstract: In prediction with expert advice the goal is to design online prediction algorithms that achieve small regret (additional loss on the whole data) compared to a reference scheme. In the simplest such scheme one compares to the loss of the best expert in hindsight. A more ambitious goal is to split the data into segments and compare to the best expert on each segment. This is appropriate if the natu…

    Submitted 26 August, 2010; originally announced August 2010.

  29. arXiv:0809.2965  [pdf, ps, other]

    cs.CC cs.IT

    On Time-Bounded Incompressibility of Compressible Strings and Sequences

    Authors: E. G. Daylight, W. M. Koolen, P. M. B. Vitanyi

    Abstract: For every total recursive time bound $t$, a constant fraction of all compressible (low Kolmogorov complexity) strings is $t$-bounded incompressible (high time-bounded Kolmogorov complexity); there are uncountably many infinite sequences of which every initial segment of length $n$ is compressible to $\log n$ yet $t$-bounded incompressible below ${1/4}n - \log n$; and there are countable infinite…

    Submitted 11 August, 2009; v1 submitted 17 September, 2008; originally announced September 2008.

    Comments: 9 pages, LaTeX, no figures, submitted to Information Processing Letters. Changed and added a Barzdins-like lemma for infinite sequences with different quantification order, a fixed constant, and uncountably many sequences

  30. arXiv:0802.2027  [pdf, other]

    cs.CC cs.SC

    Kolmogorov Complexity Theory over the Reals

    Authors: Martin Ziegler, Wouter M. Koolen

    Abstract: Kolmogorov Complexity constitutes an integral part of computability theory, information theory, and computational complexity theory -- in the discrete setting of bits and Turing machines. Over real numbers, on the other hand, the BSS-machine (aka real-RAM) has been established as a major model of computation. This real realm has turned out to exhibit natural counterparts to many notions and resu…

    Submitted 28 March, 2008; v1 submitted 14 February, 2008; originally announced February 2008.

    ACM Class: F.4.1; F.1.1; E.4; I.1.2; I.1.3

  31. arXiv:0802.2015  [pdf, ps, other]

    cs.LG cs.DS cs.IT

    Combining Expert Advice Efficiently

    Authors: Wouter Koolen, Steven de Rooij

    Abstract: We show how models for prediction with expert advice can be defined concisely and clearly using hidden Markov models (HMMs); standard HMM algorithms can then be used to efficiently calculate, among other things, how the expert predictions should be weighted according to the model. We cast many existing models as HMMs and recover the best known running times in each case. We also describe two new…

    Submitted 15 February, 2008; v1 submitted 14 February, 2008; originally announced February 2008.

    Comments: 50 pages

    ACM Class: G.3