Showing 1–25 of 25 results for author: Grünwald, P D

  1. arXiv:2304.14217  [pdf, other]

    math.ST

    Exponential Stochastic Inequality

    Authors: Peter D. Grünwald, Muriel F. Pérez-Ortiz, Zakaria Mhammedi

    Abstract: We develop the concept of exponential stochastic inequality (ESI), a novel notation that simultaneously captures high-probability and in-expectation statements. It is especially well suited to succinctly state, prove, and reason about excess-risk and generalization bounds in statistical learning, specifically, but not restricted to, the PAC-Bayesian type. We show that the ESI satisfies transitivit…

    Submitted 27 April, 2023; originally announced April 2023.

  2. arXiv:2302.11401  [pdf, other]

    stat.ME

    Safe Sequential Testing and Effect Estimation in Stratified Count Data

    Authors: Rosanne J. Turner, Peter D. Grünwald

    Abstract: Sequential decision making significantly speeds up research and is more cost-effective compared to fixed-n methods. We present a method for sequential decision making for stratified count data that retains Type-I error guarantee or false discovery rate under optional stopping, using e-variables. We invert the method to construct stratified anytime-valid confidence sequences, where cross-talk betwe…

    Submitted 22 February, 2023; originally announced February 2023.

    Comments: Preprint, to be published in the Proceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS) 2023, Valencia, Spain. PMLR: Volume 206

  3. The no-free-lunch theorems of supervised learning

    Authors: Tom F. Sterkenburg, Peter D. Grünwald

    Abstract: The no-free-lunch theorems promote a skeptical conclusion that all possible machine learning algorithms equally lack justification. But how could this leave room for a learning theory, that shows that some algorithms are better than others? Drawing parallels to the philosophy of induction, we point out that the no-free-lunch results presuppose a conception of learning algorithms as purely data-dri…

    Submitted 9 February, 2022; originally announced February 2022.

    Journal ref: Synthese 199:9979-10015 (2021)

  4. Accumulation Bias in Meta-Analysis: The Need to Consider Time in Error Control

    Authors: Judith ter Schure, Peter D. Grünwald

    Abstract: Studies accumulate over time and meta-analyses are mainly retrospective. These two characteristics introduce dependencies between the analysis time, at which a series of studies is up for meta-analysis, and results within the series. Dependencies introduce bias --- Accumulation Bias --- and invalidate the sampling distribution assumed for p-value tests, thus inflating type-I errors. But dependenci…

    Submitted 31 May, 2019; originally announced May 2019.

    Comments: Soon to be published at F1000 Research

  5. arXiv:1905.13367  [pdf, ps, other]

    cs.LG stat.ML

    PAC-Bayes Un-Expected Bernstein Inequality

    Authors: Zakaria Mhammedi, Peter D. Grunwald, Benjamin Guedj

    Abstract: We present a new PAC-Bayesian generalization bound. Standard bounds contain a $\sqrt{L_n \cdot \mathrm{KL}/n}$ complexity term which dominates unless $L_n$, the empirical error of the learning algorithm's randomized predictions, vanishes. We manage to replace $L_n$ by a term which vanishes in many more situations, essentially whenever the employed learning algorithm is sufficiently stable on the dataset a…

    Submitted 3 November, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

    Comments: 24 pages, 6 figures. To Appear in NeurIPS2019

    Journal ref: NeurIPS 2019

  6. arXiv:1710.07732  [pdf, other]

    cs.LG stat.ML

    A Tight Excess Risk Bound via a Unified PAC-Bayesian-Rademacher-Shtarkov-MDL Complexity

    Authors: Peter D. Grünwald, Nishant A. Mehta

    Abstract: We present a novel notion of complexity that interpolates between and generalizes some classic existing complexity notions in learning theory: for estimators like empirical risk minimization (ERM) with arbitrary bounded losses, it is upper bounded in terms of data-independent Rademacher complexity; for generalized Bayesian estimators, it is upper bounded by the data-dependent information complexit…

    Submitted 20 October, 2017; originally announced October 2017.

    Comments: 38 pages

  7. Why optional stopping can be a problem for Bayesians

    Authors: Rianne de Heide, Peter D. Grünwald

    Abstract: Recently, optional stopping has been a subject of debate in the Bayesian psychology community. Rouder (2014) argues that optional stopping is no problem for Bayesians, and even recommends the use of optional stopping in practice, as do Wagenmakers et al. (2012). This article addresses the question whether optional stopping is problematic for Bayesian methods, and specifies under which circumstance…

    Submitted 25 March, 2021; v1 submitted 28 August, 2017; originally announced August 2017.

    Comments: Replacement of Figures 7a-7d in the appendix. There was a mistake in the sampling plan. Thanks to Jorge Tendeiro for pointing this out. Replaced the main text with the final (published) version. Psychonomic Bulletin & Review 2020 Advance Publication

  8. arXiv:1605.00252  [pdf, other]

    cs.LG stat.ML

    Fast Rates for General Unbounded Loss Functions: from ERM to Generalized Bayes

    Authors: Peter D. Grünwald, Nishant A. Mehta

    Abstract: We present new excess risk bounds for general unbounded loss functions including log loss and squared loss, where the distribution of the losses may be heavy-tailed. The bounds hold for general estimators, but they are optimized when applied to $η$-generalized Bayesian, MDL, and empirical risk minimization estimators. In the case of log loss, the bounds imply convergence rates for generalized Baye…

    Submitted 5 November, 2019; v1 submitted 1 May, 2016; originally announced May 2016.

    Comments: accepted to JMLR pending minor final modifications

  9. Robust Probability Updating

    Authors: Thijs van Ommen, Wouter M. Koolen, Thijs E. Feenstra, Peter D. Grünwald

    Abstract: This paper discusses an alternative to conditioning that may be used when the probability distribution is not fully specified. It does not require any assumptions (such as CAR: coarsening at random) on the unknown distribution. The well-known Monty Hall problem is the simplest scenario where neither naive conditioning nor the CAR assumption suffice to determine an updated probability distribution.…

    Submitted 2 May, 2016; v1 submitted 10 December, 2015; originally announced December 2015.

    Comments: 47 pages, 4 figures. This second version is the accepted manuscript: it incorporates reviewer comments and has a new title

    Journal ref: International Journal of Approximate Reasoning 74 (2016) 30-57

  10. arXiv:1507.02592  [pdf, other]

    cs.LG stat.ML

    Fast rates in statistical and online learning

    Authors: Tim van Erven, Peter D. Grünwald, Nishant A. Mehta, Mark D. Reid, Robert C. Williamson

    Abstract: The speed with which a learning algorithm converges as it is presented with more data is a central problem in machine learning --- a fast rate of convergence means less data is needed for the same level of performance. The pursuit of fast rates in online and statistical learning has led to the discovery of many conditions in learning theory under which fast learning is possible. We show that most…

    Submitted 1 September, 2015; v1 submitted 9 July, 2015; originally announced July 2015.

    Comments: 69 pages, 3 figures

    Journal ref: Journal of Machine Learning Research 16(54):1793-1861, 2015

  11. arXiv:1407.7190  [pdf]

    cs.AI

    A Game-Theoretic Analysis of Updating Sets of Probabilities

    Authors: Peter D. Grunwald, Joseph Y. Halpern

    Abstract: We consider how an agent should update her uncertainty when it is represented by a set P of probability distributions and the agent observes that a random variable X takes on value x, given that the agent makes decisions using the minimax criterion, perhaps the best-studied and most commonly-used criterion in the literature. We adopt a game-theoretic framework, where the agent plays against a book…

    Submitted 27 July, 2014; originally announced July 2014.

    Comments: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

    Report number: UAI-P-2008-PG-240-247

  12. arXiv:1407.7188  [pdf]

    cs.AI

    When Ignorance is Bliss

    Authors: Peter D. Grunwald, Joseph Y. Halpern

    Abstract: It is commonly-accepted wisdom that more information is better, and that information should never be ignored. Here we argue, using both a Bayesian and a non-Bayesian analysis, that in some situations you are better off ignoring information if your uncertainty is represented by a set of probability measures. These include situations in which the information is relevant for the prediction task at ha…

    Submitted 27 July, 2014; originally announced July 2014.

    Comments: Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004)

    Report number: UAI-P-2004-PG-226-234

  13. arXiv:1407.7183  [pdf]

    cs.AI

    Updating Probabilities

    Authors: Peter D. Grunwald, Joseph Y. Halpern

    Abstract: As examples such as the Monty Hall puzzle show, applying conditioning to update a probability distribution on a "naive space", which does not take into account the protocol used, can often lead to counterintuitive results. Here we examine why. A criterion known as CAR (coarsening at random) in the statistical literature characterizes when "naive" conditioning in a naive space works. We show…

    Submitted 27 July, 2014; originally announced July 2014.

    Comments: Appears in Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI2002)

    Report number: UAI-P-2002-PG-187-196

  14. arXiv:1401.3906  [pdf]

    cs.AI cs.GT

    Making Decisions Using Sets of Probabilities: Updating, Time Consistency, and Calibration

    Authors: Peter D Grunwald, Joseph Y Halpern

    Abstract: We consider how an agent should update her beliefs when her beliefs are represented by a set P of probability distributions, given that the agent makes decisions using the minimax criterion, perhaps the best-studied and most commonly-used criterion in the literature. We adopt a game-theoretic framework, where the agent plays against a bookie, who chooses some distribution from P. We consider two r…

    Submitted 16 January, 2014; originally announced January 2014.

    Journal ref: Journal Of Artificial Intelligence Research, Volume 42, pages 393-426, 2011

  15. arXiv:1301.7378  [pdf]

    cs.LG stat.ML

    Minimum Encoding Approaches for Predictive Modeling

    Authors: Peter D Grunwald, Petri Kontkanen, Petri Myllymaki, Tomi Silander, Henry Tirri

    Abstract: We analyze differences between two information-theoretically motivated approaches to statistical inference and model selection: the Minimum Description Length (MDL) principle, and the Minimum Message Length (MML) principle. Based on this analysis, we present two revised versions of MML: a pointwise estimator which gives the MML-optimal single parameter model, and a volumewise estimator which give…

    Submitted 30 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence (UAI1998)

    Report number: UAI-P-1998-PG-183-192

  16. arXiv:1301.3860  [pdf]

    cs.AI

    Maximum Entropy and the Glasses You Are Looking Through

    Authors: Peter D. Grunwald

    Abstract: We give an interpretation of the Maximum Entropy (MaxEnt) Principle in game-theoretic terms. Based on this interpretation, we make a formal distinction between different ways of applying Maximum Entropy distributions. MaxEnt has frequently been criticized on the grounds that it leads to highly representation dependent results. Our distinction allows us to avoid this problem in many cases.

    Submitted 16 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI2000)

    Report number: UAI-P-2000-PG-238-246

  17. arXiv:1301.0534  [pdf, ps, other]

    cs.LG stat.ML

    Follow the Leader If You Can, Hedge If You Must

    Authors: Steven de Rooij, Tim van Erven, Peter D. Grünwald, Wouter M. Koolen

    Abstract: Follow-the-Leader (FTL) is an intuitive sequential prediction strategy that guarantees constant regret in the stochastic setting, but has terrible performance for worst-case data. Other hedging strategies have better worst-case guarantees but may perform much worse than FTL if the data are not maximally adversarial. We introduce the FlipFlop algorithm, which is the first method that provably combi…

    Submitted 17 January, 2013; v1 submitted 3 January, 2013; originally announced January 2013.

    Comments: under submission

    Journal ref: Journal of Machine Learning Research 15(37):1281-1316, 2014

  18. arXiv:1107.6004  [pdf]

    cs.IT physics.data-an

    Explicit Bounds for Entropy Concentration under Linear Constraints

    Authors: Kostas N. Oikonomou, Peter D. Grunwald

    Abstract: Consider the set of all sequences of $n$ outcomes, each taking one of $m$ values, that satisfy a number of linear constraints. If $m$ is fixed while $n$ increases, most sequences that satisfy the constraints result in frequency vectors whose entropy approaches that of the maximum entropy vector satisfying the constraints. This well-known "entropy concentration" phenomenon underlies the maximum ent…

    Submitted 30 September, 2015; v1 submitted 29 July, 2011; originally announced July 2011.

    Comments: 1) An error affecting sec. 3 has been corrected: the parameters delta and theta cannot be chosen independently. Sec. 3 has been revised up to Theorem 3.15 in sec. 3.6. 2) Some minor updates in sec. 4. 3) Some proofs used in both sec. 3 and sec. 4 have been unified (This version to appear in IEEE Transactions on Information Theory, December 2015)

  19. An Algorithmic and a geometric characterization of coarsening at random

    Authors: Richard D. Gill, Peter D. Grünwald

    Abstract: We show that the class of conditional distributions satisfying the coarsening at random (CAR) property for discrete data has a simple and robust algorithmic description based on randomized uniform multicovers: combinatorial objects generalizing the notion of partition of a set. However, the complexity of a given CAR mechanism can be large: the maximal "height" of the needed multicovers can be ex…

    Submitted 5 November, 2008; originally announced November 2008.

    Comments: Published at http://dx.doi.org/10.1214/07-AOS532 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: Note accidental duplicate submission arXiv:0811.0683

    MSC Class: 62A01 (Primary); 62N01 (Secondary)

    Journal ref: Annals of Statistics 2008, Vol. 36, No. 5, 2409-2422

  20. arXiv:0809.2754  [pdf, ps, other]

    cs.IT cs.LG math.ST

    Algorithmic information theory

    Authors: Peter D. Grunwald, Paul M. B. Vitanyi

    Abstract: We introduce algorithmic information theory, also known as the theory of Kolmogorov complexity. We explain the main concepts of this quantitative approach to defining `information'. We discuss the extent to which Kolmogorov's and Shannon's information theory have a common purpose, and where they are fundamentally different. We indicate how recent developments within the theory allow one to forma…

    Submitted 17 September, 2008; v1 submitted 16 September, 2008; originally announced September 2008.

    Comments: 37 pages, 2 figures, pdf, in: Philosophy of Information, P. Adriaans and J. van Benthem, Eds., A volume in Handbook of the philosophy of science, D. Gabbay, P. Thagard, and J. Woods, Eds., Elsevier, 2008. In version 1 of September 16 the refs are missing. Corrected in version 2 of September 17

  21. arXiv:0711.3235  [pdf, ps, other]

    cs.AI math.ST

    A Game-Theoretic Analysis of Updating Sets of Probabilities

    Authors: Peter D. Grunwald, Joseph Y. Halpern

    Abstract: We consider how an agent should update her uncertainty when it is represented by a set $P$ of probability distributions and the agent observes that a random variable $X$ takes on value $x$, given that the agent makes decisions using the minimax criterion, perhaps the best-studied and most commonly-used criterion in the literature. We adopt a game-theoretic framework, where the agent plays agains…

    Submitted 20 November, 2007; originally announced November 2007.

    ACM Class: I.2.4

  22. arXiv:math/0510276  [pdf, ps, other]

    math.ST cs.AI stat.ME

    An algorithmic and a geometric characterization of Coarsening At Random

    Authors: Richard D. Gill, Peter D. Grunwald

    Abstract: We show that the class of conditional distributions satisfying the coarsening at random (CAR) property for discrete data has a simple and robust algorithmic description based on randomized uniform multicovers: combinatorial objects generalizing the notion of partition of a set. However, the complexity of a given CAR mechanism can be large: the maximal "height" of the needed multicovers can be ex…

    Submitted 13 September, 2007; v1 submitted 13 October, 2005; originally announced October 2005.

    Comments: 16 pages; accepted in this form for publication by Annals of Statistics

    Report number: See also 0811.0683 (duplicate submission)

    MSC Class: 62A01 (Primary); 62N01; 60A99; 68T37 (Secondary)

    Journal ref: The Annals of Statistics 2008, Vol. 36, No. 5, 2409-2422

  23. arXiv:cs/0510080  [pdf, ps, other]

    cs.AI cs.LG

    When Ignorance is Bliss

    Authors: Peter D. Grunwald, Joseph Y. Halpern

    Abstract: It is commonly-accepted wisdom that more information is better, and that information should never be ignored. Here we argue, using both a Bayesian and a non-Bayesian analysis, that in some situations you are better off ignoring information if your uncertainty is represented by a set of probability measures. These include situations in which the information is relevant for the prediction task at…

    Submitted 25 October, 2005; originally announced October 2005.

    Comments: In Proceedings of the Twentieth Conference on Uncertainty in AI, 2004, pp. 226-234

    ACM Class: I.2.4

  24. Game theory, maximum entropy, minimum discrepancy and robust Bayesian decision theory

    Authors: Peter D. Grunwald, A. Philip Dawid

    Abstract: We describe and develop a close relationship between two problems that have customarily been regarded as distinct: that of maximizing entropy, and that of minimizing worst-case expected loss. Using a formulation grounded in the equilibrium theory of zero-sum games between Decision Maker and Nature, these two problems are shown to be dual to each other, the solution to each providing that to th…

    Submitted 5 October, 2004; originally announced October 2004.

    Comments: Published by the Institute of Mathematical Statistics (http://www.imstat.org) in the Annals of Statistics (http://www.imstat.org/aos/) at http://dx.doi.org/10.1214/009053604000000553

    Report number: IMS-AOS-AOS231

    MSC Class: 62C20 (Primary) 94A17 (Secondary)

    Journal ref: Annals of Statistics 2004, Vol. 32, No. 4, 1367-1433

  25. arXiv:cs/0306124  [pdf, ps, other]

    cs.AI

    Updating Probabilities

    Authors: Peter D. Grunwald, Joseph Y. Halpern

    Abstract: As examples such as the Monty Hall puzzle show, applying conditioning to update a probability distribution on a "naive space", which does not take into account the protocol used, can often lead to counterintuitive results. Here we examine why. A criterion known as CAR ("coarsening at random") in the statistical literature characterizes when "naive" conditioning in a naive space works. We s…

    Submitted 23 June, 2003; originally announced June 2003.

    Comments: This is an expanded version of a paper that appeared in Proceedings of the Eighteenth Conference on Uncertainty in AI, 2002, pp. 187-196. To appear in Journal of AI Research

    ACM Class: I.2.4