
Showing 1–41 of 41 results for author: Williamson, R C

  1. arXiv:2406.02292  [pdf, other]

    cs.LG

    An Axiomatic Approach to Loss Aggregation and an Adapted Aggregating Algorithm

    Authors: Armando J. Cabrera Pacheco, Rabanus Derr, Robert C. Williamson

    Abstract: Supervised learning has gone beyond the expected risk minimization framework. Central to most of these developments is the introduction of more general aggregation functions for losses incurred by the learner. In this paper, we turn towards online learning under expert advice. Via easily justified assumptions we characterize a set of reasonable loss aggregation functions as quasi-sums. Based upon…

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: 31 pages

  2. arXiv:2404.09741  [pdf, ps, other]

    math.ST

    Data Models With Two Manifestations of Imprecision

    Authors: Christian Fröhlich, Robert C. Williamson

    Abstract: Motivated by recently emerging problems in machine learning and statistics, we propose data models which relax the familiar i.i.d. assumption. In essence, we seek to understand what it means for data to come from a set of probability measures. We show that our frequentist data models, parameterized by such sets, manifest two aspects of imprecision. We characterize the intricate interplay of these…

    Submitted 15 April, 2024; originally announced April 2024.

  3. arXiv:2403.01660  [pdf, other]

    cs.LG math.MG

    Geometry and Stability of Supervised Learning Problems

    Authors: Facundo Mémoli, Brantley Vose, Robert C. Williamson

    Abstract: We introduce a notion of distance between supervised learning problems, which we call the Risk distance. This optimal-transport-inspired distance facilitates stability results; one can quantify how seriously issues like sampling bias, noise, limited data, and approximations might change a given problem by bounding how much these modifications can move the problem under the Risk distance. With the…

    Submitted 3 March, 2024; originally announced March 2024.

    Comments: 87 pages

  4. arXiv:2401.14483  [pdf, other]

    cs.LG stat.ML

    Four Facets of Forecast Felicity: Calibration, Predictiveness, Randomness and Regret

    Authors: Rabanus Derr, Robert C. Williamson

    Abstract: Machine learning is about forecasting. Forecasts, however, obtain their usefulness only through their evaluation. Machine learning has traditionally focused on types of losses and their corresponding regret. Recently, the machine learning community has regained interest in calibration. In this work, we show the conceptual equivalence of calibration and regret in evaluating forecasts. We frame the eva…

    Submitted 25 January, 2024; originally announced January 2024.

  5. arXiv:2307.08643  [pdf, other]

    cs.LG stat.ML

    Corruptions of Supervised Learning Problems: Typology and Mitigations

    Authors: Laura Iacovissi, Nan Lu, Robert C. Williamson

    Abstract: Corruption is notoriously widespread in data collection. Despite extensive research, the existing literature on corruption predominantly focuses on specific settings and learning scenarios, lacking a unified view. There is still a limited understanding of how to effectively model and mitigate corruption in machine learning problems. In this work, we develop a general theory of corruption from an i…

    Submitted 2 May, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: 56 pages

  6. arXiv:2306.14624  [pdf, ps, other]

    cs.LG cs.CY

    Insights From Insurance for Fair Machine Learning

    Authors: Christian Fröhlich, Robert C. Williamson

    Abstract: We argue that insurance can act as an analogon for the social situatedness of machine learning systems, hence allowing machine learning scholars to take insights from the rich and interdisciplinary insurance literature. Tracing the interaction of uncertainty, fairness and responsibility in insurance provides a fresh perspective on fairness in machine learning. We link insurance fairness conception…

    Submitted 23 January, 2024; v1 submitted 26 June, 2023; originally announced June 2023.

  7. arXiv:2302.11905  [pdf, other]

    cs.LG

    The Geometry of Mixability

    Authors: Armando J. Cabrera Pacheco, Robert C. Williamson

    Abstract: Mixable loss functions are of fundamental importance in the context of prediction with expert advice in the online setting since they characterize fast learning rates. By re-interpreting properness from the point of view of differential geometry, we provide a simple geometric characterization of mixability for the binary and multi-class cases: a proper loss function $\ell$ is $η$-mixable if and on…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: 53 pages, 6 figures

  8. On the Richness of Calibration

    Authors: Benedikt Höltgen, Robert C Williamson

    Abstract: Probabilistic predictions can be evaluated through comparisons with observed label frequencies, that is, through the lens of calibration. Recent scholarship on algorithmic fairness has started to look at a growing variety of calibration-based objectives under the name of multi-calibration but has still remained fairly restricted. In this paper, we explore and analyse forms of evaluation through ca…

    Submitted 14 May, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

    Journal ref: 2023 ACM Conference on Fairness, Accountability, and Transparency (FAccT '23), June 12-15, 2023, Chicago, IL, USA

  9. arXiv:2302.03522  [pdf, other]

    math.ST

    Systems of Precision: Coherent Probabilities on Pre-Dynkin-Systems and Coherent Previsions on Linear Subspaces

    Authors: Rabanus Derr, Robert C. Williamson

    Abstract: In the literature on imprecise probability, little attention is paid to the fact that imprecise probabilities are precise on a set of events. We call these sets systems of precision. We show that, under mild assumptions, the system of precision of a lower and upper probability forms a so-called (pre-)Dynkin-system. Interestingly, there are several settings, ranging from machine learning on partial data…

    Submitted 5 June, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

  10. arXiv:2302.03520  [pdf, other]

    math.ST

    Strictly Frequentist Imprecise Probability

    Authors: Christian Fröhlich, Rabanus Derr, Robert C. Williamson

    Abstract: Strict frequentism defines probability as the limiting relative frequency in an infinite sequence. What if the limit does not exist? We present a broader theory, which is applicable also to random phenomena that exhibit diverging relative frequencies. In doing so, we develop a close connection with the theory of imprecise probability: the cluster points of relative frequencies yield a coherent upp…

    Submitted 6 June, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

  11. arXiv:2209.00238  [pdf, other]

    cs.LG cs.IT stat.ML

    The Geometry and Calculus of Losses

    Authors: Robert C. Williamson, Zac Cranko

    Abstract: Statistical decision problems lie at the heart of statistical machine learning. The simplest problems are binary and multiclass classification and class probability estimation. Central to their definition is the choice of loss function, which is the means by which the quality of a solution is evaluated. In this paper we systematically develop the theory of loss functions for such problems from a n…

    Submitted 17 August, 2023; v1 submitted 1 September, 2022; originally announced September 2022.

    Comments: 65 pages, 17 figures

  12. arXiv:2208.03066  [pdf, other]

    cs.LG stat.ML

    Tailoring to the Tails: Risk Measures for Fine-Grained Tail Sensitivity

    Authors: Christian Fröhlich, Robert C. Williamson

    Abstract: Expected risk minimization (ERM) is at the core of many machine learning systems. This means that the risk inherent in a loss distribution is summarized using a single number - its average. In this paper, we propose a general approach to construct risk measures which exhibit a desired tail sensitivity and may replace the expectation operator in ERM. Our method relies on the specification of a refe…

    Submitted 23 January, 2023; v1 submitted 5 August, 2022; originally announced August 2022.

    Comments: Made multiple minor edits

  13. arXiv:2207.13596  [pdf, other]

    cs.LG cs.CY

    Fairness and Randomness in Machine Learning: Statistical Independence and Relativization

    Authors: Rabanus Derr, Robert C. Williamson

    Abstract: Fair Machine Learning endeavors to prevent unfairness arising in the context of machine learning applications embedded in society. Despite the variety of definitions of fairness and proposed "fair algorithms", there remain unresolved conceptual problems regarding fairness. In this paper, we dissect the role of statistical independence in fairness and randomness notions regularly used in machine le…

    Submitted 16 November, 2022; v1 submitted 27 July, 2022; originally announced July 2022.

    Comments: This paper has been presented at the Philosophy of Science meets Machine Learning Conference in Tübingen in October 2022

  14. arXiv:2207.11987  [pdf, other]

    cs.LG cs.IT stat.ML

    Information Processing Equalities and the Information-Risk Bridge

    Authors: Robert C. Williamson, Zac Cranko

    Abstract: We introduce two new classes of measures of information for statistical experiments which generalise and subsume $φ$-divergences, integral probability metrics, $\mathfrak{N}$-distances (MMD), and $(f,Γ)$ divergences between two or more distributions. This enables us to derive a simple geometrical relationship between measures of information and the Bayes risk of a statistical decision problem, thu…

    Submitted 8 September, 2023; v1 submitted 25 July, 2022; originally announced July 2022.

    Comments: 48 pages; corrected some typos and added a few additional explanations

    ACM Class: G.3; I.5; E.4

  15. arXiv:2206.03183  [pdf, other]

    cs.LG math.ST

    Risk Measures and Upper Probabilities: Coherence and Stratification

    Authors: Christian Fröhlich, Robert C. Williamson

    Abstract: Machine learning typically presupposes classical probability theory which implies that aggregation is built upon expectation. There are now multiple reasons to motivate looking at richer alternatives to classical probability theory as a mathematical foundation for machine learning. We systematically examine a powerful and rich class of alternative aggregation functionals, known variously as spectr…

    Submitted 29 January, 2024; v1 submitted 7 June, 2022; originally announced June 2022.

  16. arXiv:2205.09628  [pdf, ps, other]

    cs.LG

    What killed the Convex Booster?

    Authors: Yishay Mansour, Richard Nock, Robert C. Williamson

    Abstract: A landmark negative result of Long and Servedio established a worst-case spectacular failure of a supervised learning trio (loss, algorithm, model) otherwise praised for its high precision machinery. Hundreds of papers followed up on the two suspected culprits: the loss (for being convex) and/or the algorithm (for fitting a classical boosting blueprint). Here, we call to the half-century+ founding…

    Submitted 25 May, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

    ACM Class: I.2.6

  17. arXiv:2006.14763  [pdf, ps, other]

    cs.LG stat.ML

    PAC-Bayesian Bound for the Conditional Value at Risk

    Authors: Zakaria Mhammedi, Benjamin Guedj, Robert C. Williamson

    Abstract: Conditional Value at Risk (CVaR) is a family of "coherent risk measures" which generalize the traditional mathematical expectation. Widely used in mathematical finance, it is garnering increasing interest in machine learning, e.g., as an alternate approach to regularization, and as a means for ensuring fairness. This paper presents a generalization bound for learning algorithms that minimize the C…

    Submitted 25 June, 2020; originally announced June 2020.

    Journal ref: NeurIPS 2020

  18. arXiv:1902.06881   

    cs.LG stat.ML

    Proper-Composite Loss Functions in Arbitrary Dimensions

    Authors: Zac Cranko, Robert C. Williamson, Richard Nock

    Abstract: The study of a machine learning problem is in many ways difficult to separate from the study of the loss function being used. One avenue of inquiry has been to look at these loss functions in terms of their properties as scoring rules via the proper-composite representation, in which predictions are mapped to probability distributions which are then scored via a scoring rule. However, recent re…

    Submitted 1 September, 2022; v1 submitted 18 February, 2019; originally announced February 2019.

    Comments: Oh there are just some simple mistakes in this

  19. arXiv:1902.00985  [pdf, ps, other]

    stat.ML cs.LG

    Adversarial Networks and Autoencoders: The Primal-Dual Relationship and Generalization Bounds

    Authors: Hisham Husain, Richard Nock, Robert C. Williamson

    Abstract: Since the introduction of Generative Adversarial Networks (GANs) and Variational Autoencoders (VAE), the literature on generative modelling has witnessed an overwhelming resurgence. The impressive, yet elusive empirical performance of GANs has led to the rise of many GAN-VAE hybrids, with the hopes of GAN-level performance and additional benefits of VAE, such as an encoder for feature reduction,…

    Submitted 26 April, 2019; v1 submitted 3 February, 2019; originally announced February 2019.

  20. arXiv:1901.08665  [pdf, other]

    cs.LG stat.ML

    Fairness risk measures

    Authors: Robert C. Williamson, Aditya Krishna Menon

    Abstract: Ensuring that classifiers are non-discriminatory or fair with respect to a sensitive feature (e.g., race or gender) is a topical problem. Progress in this task requires fixing a definition of fairness, and there have been several proposals in this regard over the past few years. Several of these, however, assume either binary sensitive features (thus precluding categorical or real-valued sensitive…

    Submitted 24 January, 2019; originally announced January 2019.

  21. arXiv:1805.07737  [pdf, other]

    cs.LG stat.ML

    Exp-Concavity of Proper Composite Losses

    Authors: Parameswaran Kamalaruban, Robert C. Williamson, Xinhua Zhang

    Abstract: The goal of online prediction with expert advice is to find a decision strategy which will perform almost as well as the best expert in a given pool of experts, on any sequence of outcomes. This problem has been widely studied and $O(\sqrt{T})$ and $O(\log{T})$ regret bounds can be achieved for convex losses (Zinkevich, 2003) and strictly convex losses with bounded first and second deri…

    Submitted 20 May, 2018; originally announced May 2018.

    Journal ref: PMLR 40:1035-1065, 2015

  22. arXiv:1805.07723  [pdf, other]

    cs.LG stat.ML

    Minimax Lower Bounds for Cost Sensitive Classification

    Authors: Parameswaran Kamalaruban, Robert C. Williamson

    Abstract: The cost-sensitive classification problem plays a crucial role in mission-critical machine learning applications, and differs from traditional classification by taking the misclassification costs into consideration. Although it has been studied extensively in the literature, the fundamental limits of this problem are still not well understood. We investigate the hardness of this problem by extending the…

    Submitted 20 May, 2018; originally announced May 2018.

  23. arXiv:1802.06965  [pdf, other]

    cs.LG

    Constant Regret, Generalized Mixability, and Mirror Descent

    Authors: Zakaria Mhammedi, Robert C. Williamson

    Abstract: We consider the setting of prediction with expert advice; a learner makes predictions by aggregating those of a group of experts. Under this setting, and for the right choice of loss function and "mixing" algorithm, it is possible for the learner to achieve a constant regret regardless of the number of prediction rounds. For example, a constant regret can be achieved for mixable losses usin…

    Submitted 31 October, 2018; v1 submitted 19 February, 2018; originally announced February 2018.

    Comments: 48 pages, accepted to NIPS 2018

  24. arXiv:1711.07050  [pdf, other]

    stat.ML cs.LG

    A Classifying Variational Autoencoder with Application to Polyphonic Music Generation

    Authors: Jay A. Hennig, Akash Umakantha, Ryan C. Williamson

    Abstract: The variational autoencoder (VAE) is a popular probabilistic generative model. However, one shortcoming of VAEs is that the latent variables cannot be discrete, which makes it difficult to generate data from different modes of a distribution. Here, we propose an extension of the VAE framework that incorporates a classifier to infer the discrete class of the modeled data. To model sequential data,…

    Submitted 19 November, 2017; originally announced November 2017.

  25. arXiv:1710.04394  [pdf, other]

    cs.LG

    Provably Fair Representations

    Authors: Daniel McNamara, Cheng Soon Ong, Robert C. Williamson

    Abstract: Machine learning systems are increasingly used to make decisions about people's lives, such as whether to give someone a loan or whether to interview someone for a job. This has led to considerable interest in making such machine learning systems fair. One approach is to transform the input data used by the algorithm. This can be achieved by passing each input data point through a representation f…

    Submitted 12 October, 2017; originally announced October 2017.

  26. arXiv:1707.04385  [pdf, other]

    cs.LG stat.ML

    f-GANs in an Information Geometric Nutshell

    Authors: Richard Nock, Zac Cranko, Aditya Krishna Menon, Lizhen Qu, Robert C. Williamson

    Abstract: Nowozin et al. showed last year how to extend the GAN principle to all $f$-divergences. The approach is elegant but falls short of a full description of the supervised game, and says little about the key player, the generator: for example, what does the generator actually converge to if solving the GAN game means convergence in some space of parameters? How does that provide hints…

    Submitted 14 July, 2017; originally announced July 2017.

    ACM Class: I.2.6; I.5.1

  27. arXiv:1705.09055  [pdf, other]

    cs.LG

    The cost of fairness in classification

    Authors: Aditya Krishna Menon, Robert C. Williamson

    Abstract: We study the problem of learning classifiers with a fairness constraint, with three main contributions towards the goal of quantifying the problem's inherent tradeoffs. First, we relate two existing fairness measures to cost-sensitive risks. Second, we show that for cost-sensitive classification and fairness measures, the optimal classifier is an instance-dependent thresholding of the class-probab…

    Submitted 25 May, 2017; originally announced May 2017.

  28. arXiv:1611.03125  [pdf, other]

    cs.LG

    A Modular Theory of Feature Learning

    Authors: Daniel McNamara, Cheng Soon Ong, Robert C. Williamson

    Abstract: Learning representations of data, and in particular learning features for a subsequent prediction task, has been a fruitful area of research delivering impressive empirical results in recent years. However, relatively little is understood about what makes a representation 'good'. We propose the idea of a risk gap induced by representation learning for a given prediction context, which measures the…

    Submitted 9 November, 2016; originally announced November 2016.

  29. arXiv:1507.02592  [pdf, other]

    cs.LG stat.ML

    Fast rates in statistical and online learning

    Authors: Tim van Erven, Peter D. Grünwald, Nishant A. Mehta, Mark D. Reid, Robert C. Williamson

    Abstract: The speed with which a learning algorithm converges as it is presented with more data is a central problem in machine learning: a fast rate of convergence means less data is needed for the same level of performance. The pursuit of fast rates in online and statistical learning has led to the discovery of many conditions in learning theory under which fast learning is possible. We show that most…

    Submitted 1 September, 2015; v1 submitted 9 July, 2015; originally announced July 2015.

    Comments: 69 pages, 3 figures

    Journal ref: Journal of Machine Learning Research 16(54):1793-1861, 2015

  30. arXiv:1506.01520  [pdf, other]

    stat.ML cs.LG

    An Average Classification Algorithm

    Authors: Brendan van Rooyen, Aditya Krishna Menon, Robert C. Williamson

    Abstract: Many classification algorithms produce a classifier that is a weighted average of kernel evaluations. When working with a high or infinite dimensional kernel, it is imperative for speed of evaluation and storage issues that as few training samples as possible are used in the kernel expansion. Popular existing approaches focus on altering standard learning algorithms, such as the Support Vector Mac…

    Submitted 15 December, 2015; v1 submitted 4 June, 2015; originally announced June 2015.

  31. arXiv:1505.07634  [pdf, other]

    cs.LG

    Learning with Symmetric Label Noise: The Importance of Being Unhinged

    Authors: Brendan van Rooyen, Aditya Krishna Menon, Robert C. Williamson

    Abstract: Convex potential minimisation is the de facto approach to binary classification. However, Long and Servedio [2010] proved that under symmetric label noise (SLN), minimisation of any convex potential over a linear function class can result in classification performance equivalent to random guessing. This ostensibly shows that convex losses are not SLN-robust. In this paper, we propose a convex, cla…

    Submitted 28 May, 2015; originally announced May 2015.

  32. arXiv:1504.00091  [pdf, other]

    stat.ML cs.LG

    Learning in the Presence of Corruption

    Authors: Brendan van Rooyen, Robert C. Williamson

    Abstract: In supervised learning one wishes to identify a pattern present in a joint distribution $P$ of (instance, label) pairs by providing a function $f$ from instances to labels that has low risk $\mathbb{E}_{P}\ell(y,f(x))$. To do so, the learner is given access to $n$ iid samples drawn from $P$. In many real world problems clean samples are not available. Rather, the learner is given access to sample…

    Submitted 4 July, 2015; v1 submitted 31 March, 2015; originally announced April 2015.

  33. arXiv:1504.00083  [pdf, other]

    stat.ML cs.LG

    A Theory of Feature Learning

    Authors: Brendan van Rooyen, Robert C. Williamson

    Abstract: Feature Learning aims to extract relevant information contained in data sets in an automated fashion. It is a driving force behind the current deep learning trend, a set of methods that have had widespread empirical success. What is lacking is a theoretical understanding of different feature learning schemes. This work provides a theoretical framework for feature learning and then characterizes when…

    Submitted 31 March, 2015; originally announced April 2015.

  34. arXiv:1406.6130  [pdf, other]

    cs.LG

    Generalized Mixability via Entropic Duality

    Authors: Mark D. Reid, Rafael M. Frongillo, Robert C. Williamson, Nishant Mehta

    Abstract: Mixability is a property of a loss which characterizes when fast convergence is possible in the game of prediction with expert advice. We show that a key property of mixability generalizes, and the exp and log operations present in the usual theory are not as special as one might have thought. In doing this we introduce a more general notion of $Φ$-mixability where $Φ$ is a general entropy (i.e., a…

    Submitted 23 June, 2014; originally announced June 2014.

    Comments: 20 pages, 1 figure. Supersedes the work in arXiv:1403.2433 [cs.LG]

  35. arXiv:1406.3781  [pdf, other]

    cs.LG stat.ML

    From Stochastic Mixability to Fast Rates

    Authors: Nishant A. Mehta, Robert C. Williamson

    Abstract: Empirical risk minimization (ERM) is a fundamental learning rule for statistical learning problems where the data is generated according to some unknown distribution $\mathsf{P}$ and returns a hypothesis $f$ chosen from a fixed class $\mathcal{F}$ with small loss $\ell$. In the parametric setting, depending upon $(\ell, \mathcal{F},\mathsf{P})$ ERM can have slow $(1/\sqrt{n})$ or fast $(1/n)$ rate…

    Submitted 22 November, 2014; v1 submitted 14 June, 2014; originally announced June 2014.

    Comments: 21 pages, accepted to NIPS 2014

  36. arXiv:1403.2433  [pdf, ps, other]

    cs.LG stat.ML

    Generalised Mixability, Constant Regret, and Bayesian Updating

    Authors: Mark D. Reid, Rafael M. Frongillo, Robert C. Williamson

    Abstract: Mixability of a loss is known to characterise when constant regret bounds are achievable in games of prediction with expert advice through the use of Vovk's aggregating algorithm. We provide a new interpretation of mixability via convex analysis that highlights the role of the Kullback-Leibler divergence in its definition. This naturally generalises to what we call $Φ$-mixability where the Bregman…

    Submitted 10 March, 2014; originally announced March 2014.

    Comments: 12 pages

  37. arXiv:1402.4884  [pdf, other]

    stat.ML

    Le Cam meets LeCun: Deficiency and Generic Feature Learning

    Authors: Brendan van Rooyen, Robert C. Williamson

    Abstract: "Deep Learning" methods attempt to learn generic features in an unsupervised fashion from a large unlabelled data set. These generic features should perform as well as the best hand crafted features for any learning problem that makes use of this data. We provide a definition of generic features, characterize when it is possible to learn them and provide methods closely related to the autoencoder…

    Submitted 21 February, 2014; v1 submitted 19 February, 2014; originally announced February 2014.

    Comments: 25 pages, 2 figures

  38. arXiv:1212.5764  [pdf, other]

    cs.GT cs.MA

    Strategy-Proof Prediction Markets

    Authors: Ayman Ghoneim, Robert C. Williamson

    Abstract: Prediction markets aggregate agents' beliefs regarding a future event, where each agent is paid based on the accuracy of its reported belief when compared to the realized outcome. Agents may strategically manipulate the market (e.g., delay reporting, make false reports) aiming for higher expected payments, and hence the accuracy of the market's aggregated information will be in question. In this s…

    Submitted 23 December, 2012; originally announced December 2012.

    Comments: 9 pages

    ACM Class: J.4; I.2.11

  39. arXiv:0912.3301  [pdf, other]

    stat.ML

    Composite Binary Losses

    Authors: Mark D. Reid, Robert C. Williamson

    Abstract: We study losses for binary classification and class probability estimation and extend the understanding of them from margin losses to general composite losses which are the composition of a proper loss with a link function. We characterise when margin losses can be proper composite losses, explicitly show how to determine a symmetric loss in full from half of one of its partial losses, introduce…

    Submitted 16 December, 2009; originally announced December 2009.

    Comments: 38 pages, 4 figures. Submitted to JMLR

  40. arXiv:0906.1244  [pdf, other]

    cs.IT

    Generalised Pinsker Inequalities

    Authors: Mark D. Reid, Robert C. Williamson

    Abstract: We generalise the classical Pinsker inequality which relates variational divergence to Kullback-Leibler divergence in two ways: we consider arbitrary f-divergences in place of KL divergence, and we assume knowledge of a sequence of values of generalised variational divergences. We then develop a best possible inequality for this doubly generalised situation. Specialising our result to the classi…

    Submitted 5 June, 2009; originally announced June 2009.

    Comments: 21 pages, 3 figures, accepted to COLT 2009

  41. arXiv:0901.0356  [pdf, other]

    stat.ML math.ST

    Information, Divergence and Risk for Binary Experiments

    Authors: Mark D. Reid, Robert C. Williamson

    Abstract: We unify f-divergences, Bregman divergences, surrogate loss bounds (regret bounds), proper scoring rules, matching losses, cost curves, ROC-curves and information. We do this by systematically studying integral and variational representations of these objects and in so doing identify their primitives which all are related to cost-sensitive binary classification. As well as clarifying relationshi…

    Submitted 5 January, 2009; originally announced January 2009.

    Comments: 89 pages, 9 figures