Search | arXiv e-print repository

High-Dimensional Prediction for Sequential Decision Making

Authors: Georgy Noarov, Ramya Ramalingam, Aaron Roth, Stephan Xie

Abstract: We study the problem of making predictions of an adversarially chosen high-dimensional state that are unbiased subject to an arbitrary collection of conditioning events, with the goal of tailoring these events to downstream decision makers. We give efficient algorithms for solving this problem, as well as a number of applications that stem from choosing an appropriate set of conditioning events.… ▽ More We study the problem of making predictions of an adversarially chosen high-dimensional state that are unbiased subject to an arbitrary collection of conditioning events, with the goal of tailoring these events to downstream decision makers. We give efficient algorithms for solving this problem, as well as a number of applications that stem from choosing an appropriate set of conditioning events. For example, we can efficiently make predictions targeted at polynomially many decision makers, giving each of them optimal swap regret if they best-respond to our predictions. We generalize this to online combinatorial optimization, where the decision makers have a very large action space, to give the first algorithms offering polynomially many decision makers no regret on polynomially many subsequences that may depend on their actions and the context. We apply these results to get efficient no-subsequence-regret algorithms in extensive-form games (EFGs), yielding a new family of regret guarantees for EFGs that generalizes some existing EFG regret notions, e.g. regret to informed causal deviations, and is generally incomparable to other known such notions. Next, we develop a novel transparent alternative to conformal prediction for building valid online adversarial multiclass prediction sets. We produce class scores that downstream algorithms can use for producing valid-coverage prediction sets, as if these scores were the true conditional class probabilities. We show this implies strong conditional validity guarantees including set-size-conditional and multigroup-fair coverage for polynomially many downstream prediction sets. Moreover, our class scores can be guaranteed to have improved $L_2$ loss, cross-entropy loss, and generally any Bregman loss, compared to any collection of benchmark models, yielding a high-dimensional real-valued version of omniprediction. △ Less

Submitted 27 October, 2023; v1 submitted 26 October, 2023; originally announced October 2023.

Comments: Added references, Arxiv abstract edited

arXiv:2302.08507 [pdf, ps, other]

The Scope of Multicalibration: Characterizing Multicalibration via Property Elicitation

Authors: Georgy Noarov, Aaron Roth

Abstract: We make a connection between multicalibration and property elicitation and show that (under mild technical conditions) it is possible to produce a multicalibrated predictor for a continuous scalar distributional property $Γ$ if and only if $Γ$ is elicitable. On the negative side, we show that for non-elicitable continuous properties there exist simple data distributions on which even the true di… ▽ More We make a connection between multicalibration and property elicitation and show that (under mild technical conditions) it is possible to produce a multicalibrated predictor for a continuous scalar distributional property $Γ$ if and only if $Γ$ is elicitable. On the negative side, we show that for non-elicitable continuous properties there exist simple data distributions on which even the true distributional predictor is not calibrated. On the positive side, for elicitable $Γ$, we give simple canonical algorithms for the batch and the online adversarial setting, that learn a $Γ$-multicalibrated predictor. This generalizes past work on multicalibrated means and quantiles, and in fact strengthens existing online quantile multicalibration results. To further counter-weigh our negative result, we show that if a property $Γ^1$ is not elicitable by itself, but is elicitable conditionally on another elicitable property $Γ^0$, then there is a canonical algorithm that jointly multicalibrates $Γ^1$ and $Γ^0$; this generalizes past work on mean-moment multicalibration. Finally, as applications of our theory, we provide novel algorithmic and impossibility results for fair (multicalibrated) risk assessment. △ Less

Submitted 16 February, 2023; originally announced February 2023.

arXiv:2209.15145 [pdf, other]

Batch Multivalid Conformal Prediction

Authors: Christopher Jung, Georgy Noarov, Ramya Ramalingam, Aaron Roth

Abstract: We develop fast distribution-free conformal prediction algorithms for obtaining multivalid coverage on exchangeable data in the batch setting. Multivalid coverage guarantees are stronger than marginal coverage guarantees in two ways: (1) They hold even conditional on group membership -- that is, the target coverage level $1-α$ holds conditionally on membership in each of an arbitrary (potentially… ▽ More We develop fast distribution-free conformal prediction algorithms for obtaining multivalid coverage on exchangeable data in the batch setting. Multivalid coverage guarantees are stronger than marginal coverage guarantees in two ways: (1) They hold even conditional on group membership -- that is, the target coverage level $1-α$ holds conditionally on membership in each of an arbitrary (potentially intersecting) group in a finite collection $\mathcal{G}$ of regions in the feature space. (2) They hold even conditional on the value of the threshold used to produce the prediction set on a given example. In fact multivalid coverage guarantees hold even when conditioning on group membership and threshold value simultaneously. We give two algorithms: both take as input an arbitrary non-conformity score and an arbitrary collection of possibly intersecting groups $\mathcal{G}$, and then can equip arbitrary black-box predictors with prediction sets. Our first algorithm (BatchGCP) is a direct extension of quantile regression, needs to solve only a single convex minimization problem, and produces an estimator which has group-conditional guarantees for each group in $\mathcal{G}$. Our second algorithm (BatchMVP) is iterative, and gives the full guarantees of multivalid conformal prediction: prediction sets that are valid conditionally both on group membership and non-conformity threshold. We evaluate the performance of both of our algorithms in an extensive set of experiments. Code to replicate all of our experiments can be found at https://github.com/ProgBelarus/BatchMultivalidConformal △ Less

Submitted 29 September, 2022; originally announced September 2022.

Comments: Code to replicate all of our experiments can be found at https://github.com/ProgBelarus/BatchMultivalidConformal

arXiv:2206.01067 [pdf, other]

Practical Adversarial Multivalid Conformal Prediction

Authors: Osbert Bastani, Varun Gupta, Christopher Jung, Georgy Noarov, Ramya Ramalingam, Aaron Roth

Abstract: We give a simple, generic conformal prediction method for sequential prediction that achieves target empirical coverage guarantees against adversarially chosen data. It is computationally lightweight -- comparable to split conformal prediction -- but does not require having a held-out validation set, and so all data can be used for training models from which to derive a conformal score. It gives s… ▽ More We give a simple, generic conformal prediction method for sequential prediction that achieves target empirical coverage guarantees against adversarially chosen data. It is computationally lightweight -- comparable to split conformal prediction -- but does not require having a held-out validation set, and so all data can be used for training models from which to derive a conformal score. It gives stronger than marginal coverage guarantees in two ways. First, it gives threshold calibrated prediction sets that have correct empirical coverage even conditional on the threshold used to form the prediction set from the conformal score. Second, the user can specify an arbitrary collection of subsets of the feature space -- possibly intersecting -- and the coverage guarantees also hold conditional on membership in each of these subsets. We call our algorithm MVP, short for MultiValid Prediction. We give both theory and an extensive set of empirical evaluations. △ Less

Submitted 2 June, 2022; originally announced June 2022.

Comments: Code for our experiments can be found at: https://github.com/ProgBelarus/MultiValidPrediction

arXiv:2108.03837 [pdf, ps, other]

Online Minimax Multiobjective Optimization: Multicalibeating and Other Applications

Authors: Daniel Lee, Georgy Noarov, Mallesh Pai, Aaron Roth

Abstract: We introduce a simple but general online learning framework in which a learner plays against an adversary in a vector-valued game that changes every round. Even though the learner's objective is not convex-concave (and so the minimax theorem does not apply), we give a simple algorithm that can compete with the setting in which the adversary must announce their action first, with optimally diminish… ▽ More We introduce a simple but general online learning framework in which a learner plays against an adversary in a vector-valued game that changes every round. Even though the learner's objective is not convex-concave (and so the minimax theorem does not apply), we give a simple algorithm that can compete with the setting in which the adversary must announce their action first, with optimally diminishing regret. We demonstrate the power of our framework by using it to (re)derive optimal bounds and efficient algorithms across a variety of domains, ranging from multicalibration to a large set of no regret algorithms, to a variant of Blackwell's approachability theorem for polytopes with fast convergence rates. As a new application, we show how to ``(multi)calibeat'' an arbitrary collection of forecasters -- achieving an exponentially improved dependence on the number of models we are competing against, compared to prior work. △ Less

Submitted 13 October, 2022; v1 submitted 9 August, 2021; originally announced August 2021.

Comments: Appears in NeurIPS 2022

arXiv:2101.01739 [pdf, ps, other]

doi 10.4230/LIPIcs.ITCS.2022.82

Online Multivalid Learning: Means, Moments, and Prediction Intervals

Authors: Varun Gupta, Christopher Jung, Georgy Noarov, Mallesh M. Pai, Aaron Roth

Abstract: We present a general, efficient technique for providing contextual predictions that are "multivalid" in various senses, against an online sequence of adversarially chosen examples $(x,y)$. This means that the resulting estimates correctly predict various statistics of the labels $y$ not just marginally -- as averaged over the sequence of examples -- but also conditionally on $x \in G$ for any $G$… ▽ More We present a general, efficient technique for providing contextual predictions that are "multivalid" in various senses, against an online sequence of adversarially chosen examples $(x,y)$. This means that the resulting estimates correctly predict various statistics of the labels $y$ not just marginally -- as averaged over the sequence of examples -- but also conditionally on $x \in G$ for any $G$ belonging to an arbitrary intersecting collection of groups $\mathcal{G}$. We provide three instantiations of this framework. The first is mean prediction, which corresponds to an online algorithm satisfying the notion of multicalibration from Hebert-Johnson et al. The second is variance and higher moment prediction, which corresponds to an online algorithm satisfying the notion of mean-conditioned moment multicalibration from Jung et al. Finally, we define a new notion of prediction interval multivalidity, and give an algorithm for finding prediction intervals which satisfy it. Because our algorithms handle adversarially chosen examples, they can equally well be used to predict statistics of the residuals of arbitrary point prediction methods, giving rise to very general techniques for quantifying the uncertainty of predictions of black box algorithms, even in an online adversarial setting. When instantiated for prediction intervals, this solves a similar problem as conformal prediction, but in an adversarial environment and with multivalidity guarantees stronger than simple marginal coverage guarantees. △ Less

Submitted 5 January, 2021; originally announced January 2021.

arXiv:2002.10669 [pdf, other]

doi 10.1145/3465456.3467639

Binary Scoring Rules that Incentivize Precision

Authors: Eric Neyman, Georgy Noarov, S. Matthew Weinberg

Abstract: All proper scoring rules incentivize an expert to predict \emph{accurately} (report their true estimate), but not all proper scoring rules equally incentivize \emph{precision}. Rather than treating the expert's belief as exogenously given, we consider a model where a rational expert can endogenously refine their belief by repeatedly paying a fixed cost, and is incentivized to do so by a proper sco… ▽ More All proper scoring rules incentivize an expert to predict \emph{accurately} (report their true estimate), but not all proper scoring rules equally incentivize \emph{precision}. Rather than treating the expert's belief as exogenously given, we consider a model where a rational expert can endogenously refine their belief by repeatedly paying a fixed cost, and is incentivized to do so by a proper scoring rule. Specifically, our expert aims to predict the probability that a biased coin flipped tomorrow will land heads, and can flip the coin any number of times today at a cost of $c$ per flip. Our first main result defines an \emph{incentivization index} for proper scoring rules, and proves that this index measures the expected error of the expert's estimate (where the number of flips today is chosen adaptively to maximize the predictor's expected payoff). Our second main result finds the unique scoring rule which optimizes the incentivization index over all proper scoring rules. We also consider extensions to minimizing the $\ell^{th}$ moment of error, and again provide an incentivization index and optimal proper scoring rule. In some cases, the resulting scoring rule is differentiable, but not infinitely differentiable. In these cases, we further prove that the optimum can be uniformly approximated by polynomial scoring rules. Finally, we compare common scoring rules via our measure, and include simulations confirming the relevance of our measure even in domains outside where it provably applies. △ Less

Submitted 29 May, 2021; v1 submitted 25 February, 2020; originally announced February 2020.

Comments: 42 pages, accepted for publication in EC 2021

arXiv:1810.12806 [pdf, ps, other]

Computing Approximate Equilibria in Weighted Congestion Games via Best-Responses

Authors: Yiannis Giannakopoulos, Georgy Noarov, Andreas S. Schulz

Abstract: We present a deterministic polynomial-time algorithm for computing $d^{d+o(d)}$-approximate (pure) Nash equilibria in (proportional sharing) weighted congestion games with polynomial cost functions of degree at most $d$. This is an exponential improvement of the approximation factor with respect to the previously best deterministic algorithm. An appealing additional feature of the algorithm is tha… ▽ More We present a deterministic polynomial-time algorithm for computing $d^{d+o(d)}$-approximate (pure) Nash equilibria in (proportional sharing) weighted congestion games with polynomial cost functions of degree at most $d$. This is an exponential improvement of the approximation factor with respect to the previously best deterministic algorithm. An appealing additional feature of the algorithm is that it only uses best-improvement steps in the actual game, as opposed to the previously best algorithms, that first had to transform the game itself. Our algorithm is an adaptation of the seminal algorithm by Caragiannis et al. [FOCS'11, TEAC 2015], but we utilize an approximate potential function directly on the original game instead of an exact one on a modified game. A critical component of our analysis, which is of independent interest, is the derivation of a novel bound of $[d/\mathcal{W}(d/ρ)]^{d+1}$ for the Price of Anarchy (PoA) of $ρ$-approximate equilibria in weighted congestion games, where $\mathcal{W}$ is the Lambert-W function. More specifically, we show that this PoA is exactly equal to $Φ_{d,ρ}^{d+1}$, where $Φ_{d,ρ}$ is the unique positive solution of the equation $ρ(x+1)^d=x^{d+1}$. Our upper bound is derived via a smoothness-like argument, and thus holds even for mixed Nash and correlated equilibria, while our lower bound is simple enough to apply even to singleton congestion games. △ Less

Submitted 25 November, 2020; v1 submitted 30 October, 2018; originally announced October 2018.

Comments: To be published in Mathematics of Operations Research

Showing 1–8 of 8 results for author: Noarov, G