Showing 1–22 of 22 results for author: Zrnic, T

  1. arXiv:2405.18379  [pdf, other]

    stat.ML cs.LG stat.ME

    A Note on the Prediction-Powered Bootstrap

    Authors: Tijana Zrnic

    Abstract: We introduce PPBoot: a bootstrap-based method for prediction-powered inference. PPBoot is applicable to arbitrary estimation problems and is very simple to implement, essentially only requiring one application of the bootstrap. Through a series of examples, we demonstrate that PPBoot often performs nearly identically to (and sometimes better than) the earlier PPI(++) method based on asymptotic nor…

    Submitted 7 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  2. arXiv:2403.03208  [pdf, other]

    stat.ML cs.LG stat.ME

    Active Statistical Inference

    Authors: Tijana Zrnic, Emmanuel J. Candès

    Abstract: Inspired by the concept of active learning, we propose active inference: a methodology for statistical inference with machine-learning-assisted data collection. Assuming a budget on the number of labels that can be collected, the methodology uses a machine learning model to identify which data points would be most beneficial to label, thus effectively utilizing the budget. It operat…

    Submitted 29 May, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

  3. arXiv:2311.01453  [pdf, other]

    stat.ML cs.LG stat.ME

    PPI++: Efficient Prediction-Powered Inference

    Authors: Anastasios N. Angelopoulos, John C. Duchi, Tijana Zrnic

    Abstract: We present PPI++: a computationally lightweight methodology for estimation and inference based on a small labeled dataset and a typically much larger dataset of machine-learning predictions. The methods automatically adapt to the quality of available predictions, yielding easy-to-compute confidence sets -- for parameters of any dimensionality -- that always improve on classical intervals using onl…

    Submitted 25 March, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: Code available at https://github.com/aangelopoulos/ppi_py

  4. arXiv:2309.16598  [pdf, other]

    stat.ML cs.LG stat.ME

    Cross-Prediction-Powered Inference

    Authors: Tijana Zrnic, Emmanuel J. Candès

    Abstract: While reliable data-driven decision-making hinges on high-quality labeled data, the acquisition of quality labels often involves laborious human annotations or slow and expensive scientific measurements. Machine learning is becoming an appealing alternative as sophisticated predictive techniques are being used to quickly and cheaply produce large amounts of predicted labels; e.g., predicted protei…

    Submitted 28 February, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

  5. arXiv:2305.18728  [pdf, other]

    cs.LG stat.ML

    Plug-in Performative Optimization

    Authors: Licong Lin, Tijana Zrnic

    Abstract: When predictions are performative, the choice of which predictor to deploy influences the distribution of future observations. The overarching goal in learning under performativity is to find a predictor that has low performative risk, that is, good performance on its induced distribution. One family of solutions for optimizing the performative risk, including bandits and other derivative-f…

    Submitted 28 May, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

  6. arXiv:2302.04262  [pdf, other]

    cs.LG cs.GT stat.ML

    Algorithmic Collective Action in Machine Learning

    Authors: Moritz Hardt, Eric Mazumdar, Celestine Mendler-Dünner, Tijana Zrnic

    Abstract: We initiate a principled study of algorithmic collective action on digital platforms that deploy machine learning algorithms. We propose a simple theoretical model of a collective interacting with a firm's learning algorithm. The collective pools the data of participating individuals and executes an algorithmic strategy by instructing participants how to modify their own data to achieve a collecti…

    Submitted 21 June, 2023; v1 submitted 8 February, 2023; originally announced February 2023.

    Comments: accepted at ICML 2023, camera-ready updates

  7. arXiv:2301.09633  [pdf, other]

    stat.ML cs.AI cs.LG q-bio.QM stat.ME

    Prediction-Powered Inference

    Authors: Anastasios N. Angelopoulos, Stephen Bates, Clara Fannjiang, Michael I. Jordan, Tijana Zrnic

    Abstract: Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system. The framework yields simple algorithms for computing provably valid confidence intervals for quantities such as means, quantiles, and linear and logistic regression coefficients, without making any assumptions on the ma…

    Submitted 9 November, 2023; v1 submitted 23 January, 2023; originally announced January 2023.

    Comments: Code is available at https://github.com/aangelopoulos/ppi_py

  8. arXiv:2212.09009  [pdf, other]

    stat.ME math.ST

    Locally Simultaneous Inference

    Authors: Tijana Zrnic, William Fithian

    Abstract: Selective inference is the problem of giving valid answers to statistical questions chosen in a data-driven manner. A standard solution to selective inference is simultaneous inference, which delivers valid answers to the set of all questions that could possibly have been asked. However, simultaneous inference can be unnecessarily conservative if this set includes many questions that were unlikely…

    Submitted 2 May, 2024; v1 submitted 17 December, 2022; originally announced December 2022.

  9. arXiv:2208.05949  [pdf, other]

    stat.ME cs.LG stat.ML

    Valid Inference after Causal Discovery

    Authors: Paula Gradu, Tijana Zrnic, Yixin Wang, Michael I. Jordan

    Abstract: Causal discovery and causal effect estimation are two fundamental tasks in causal inference. While many methods have been developed for each task individually, statistical challenges arise when applying these methods jointly: estimating causal effects after running causal discovery algorithms on the same data leads to "double dipping," invalidating the coverage guarantees of classical confidence i…

    Submitted 20 March, 2023; v1 submitted 11 August, 2022; originally announced August 2022.

  10. arXiv:2208.01185  [pdf, ps, other]

    cs.LG cs.GT math.OC

    A Note on Zeroth-Order Optimization on the Simplex

    Authors: Tijana Zrnic, Eric Mazumdar

    Abstract: We construct a zeroth-order gradient estimator for a smooth function defined on the probability simplex. The proposed estimator queries the simplex only. We prove that projected gradient descent and the exponential weights algorithm, when run with this estimator instead of exact gradients, converge at a $\mathcal O(T^{-1/4})$ rate.

    Submitted 1 August, 2022; originally announced August 2022.

  11. arXiv:2202.00628  [pdf, other]

    cs.LG cs.GT stat.ML

    Regret Minimization with Performative Feedback

    Authors: Meena Jagadeesan, Tijana Zrnic, Celestine Mendler-Dünner

    Abstract: In performative prediction, the deployment of a predictive model triggers a shift in the data distribution. As these shifts are typically unknown ahead of time, the learner needs to deploy a model to get feedback about the distribution it induces. We study the problem of finding near-optimal models under performativity while maintaining low regret. On the surface, this problem might seem equivalen…

    Submitted 18 July, 2022; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: Appeared at ICML 2022

  12. arXiv:2106.12529  [pdf, other]

    cs.LG cs.GT

    Who Leads and Who Follows in Strategic Classification?

    Authors: Tijana Zrnic, Eric Mazumdar, S. Shankar Sastry, Michael I. Jordan

    Abstract: As predictive models are deployed into the real world, they must increasingly contend with strategic behavior. A growing body of work on strategic classification treats this problem as a Stackelberg game: the decision-maker "leads" in the game by deploying a model, and the strategic agents "follow" by playing their best response to the deployed model. Importantly, in this framing, the burden of le…

    Submitted 29 January, 2022; v1 submitted 23 June, 2021; originally announced June 2021.

  13. arXiv:2102.08570  [pdf, other]

    cs.LG stat.ML

    Outside the Echo Chamber: Optimizing the Performative Risk

    Authors: John Miller, Juan C. Perdomo, Tijana Zrnic

    Abstract: In performative prediction, predictions guide decision-making and hence can influence the distribution of future data. To date, work on performative prediction has focused on finding performatively stable models, which are the fixed points of repeated retraining. However, stable solutions can be far from optimal when evaluated in terms of the performative risk, the loss experienced by the decision…

    Submitted 15 June, 2021; v1 submitted 16 February, 2021; originally announced February 2021.

  14. arXiv:2102.06202  [pdf, other]

    cs.LG cs.AI cs.CR stat.ME stat.ML

    Private Prediction Sets

    Authors: Anastasios N. Angelopoulos, Stephen Bates, Tijana Zrnic, Michael I. Jordan

    Abstract: In real-world settings involving consequential decision-making, the deployment of machine learning systems generally requires both reliable uncertainty quantification and protection of individuals' privacy. We present a framework that treats these two desiderata jointly. Our framework is based on conformal prediction, a methodology that augments predictive models to return prediction sets that pro…

    Submitted 3 March, 2024; v1 submitted 11 February, 2021; originally announced February 2021.

    Comments: Code available at https://github.com/aangelopoulos/private_prediction_sets

    Journal ref: Harvard Data Science Review, 4(2). 2022

  15. arXiv:2011.09462  [pdf, other]

    math.ST stat.ME

    Post-Selection Inference via Algorithmic Stability

    Authors: Tijana Zrnic, Michael I. Jordan

    Abstract: When the target of statistical inference is chosen in a data-driven manner, the guarantees provided by classical theories vanish. We propose a solution to the problem of inference after selection by building on the framework of algorithmic stability, in particular its branch with origins in the field of differential privacy. Stability is achieved via randomization of selection and it serves as a q…

    Submitted 14 March, 2022; v1 submitted 18 November, 2020; originally announced November 2020.

  16. arXiv:2008.11193  [pdf, other]

    cs.CR cs.LG stat.ML

    Individual Privacy Accounting via a Renyi Filter

    Authors: Vitaly Feldman, Tijana Zrnic

    Abstract: We consider a sequential setting in which a single dataset of individuals is used to perform adaptively-chosen analyses, while ensuring that the differential privacy loss of each participant does not exceed a pre-specified privacy budget. The standard approach to this problem relies on bounding a worst-case estimate of the privacy loss over all individuals and all possible values of their data, fo…

    Submitted 8 January, 2022; v1 submitted 25 August, 2020; originally announced August 2020.

  17. arXiv:2006.06887  [pdf, other]

    cs.LG cs.GT stat.ML

    Stochastic Optimization for Performative Prediction

    Authors: Celestine Mendler-Dünner, Juan C. Perdomo, Tijana Zrnic, Moritz Hardt

    Abstract: In performative prediction, the choice of a model influences the distribution of future data, typically through actions taken based on the model's predictions. We initiate the study of stochastic optimization for performative prediction. What sets this setting apart from traditional stochastic optimization is the difference between merely updating model parameters and deploying the new model. Th…

    Submitted 19 February, 2021; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: published at NeurIPS 2020

  18. arXiv:2002.06673  [pdf, other]

    cs.LG cs.GT stat.ML

    Performative Prediction

    Authors: Juan C. Perdomo, Tijana Zrnic, Celestine Mendler-Dünner, Moritz Hardt

    Abstract: When predictions support decisions they may influence the outcome they aim to predict. We call such predictions performative; the prediction influences the target. Performativity is a well-studied phenomenon in policy-making that has so far been neglected in supervised learning. When ignored, performativity surfaces as undesirable distribution shift, routinely addressed with retraining. We devel…

    Submitted 26 February, 2021; v1 submitted 16 February, 2020; originally announced February 2020.

    Comments: published at ICML'20; fixed some typos

  19. arXiv:1910.04968  [pdf, other]

    stat.ME stat.ML

    The Power of Batching in Multiple Hypothesis Testing

    Authors: Tijana Zrnic, Daniel L. Jiang, Aaditya Ramdas, Michael I. Jordan

    Abstract: One important partition of algorithms for controlling the false discovery rate (FDR) in multiple testing is into offline and online algorithms. The first generally achieve significantly higher power of discovery, while the latter allow making decisions sequentially as well as adaptively formulating hypotheses based on past observations. Using existing methodology, it is unclear how one could trade…

    Submitted 3 March, 2021; v1 submitted 11 October, 2019; originally announced October 2019.

    Comments: 29 pages, 12 figures

  20. arXiv:1901.11143  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Natural Analysts in Adaptive Data Analysis

    Authors: Tijana Zrnic, Moritz Hardt

    Abstract: Adaptive data analysis is frequently criticized for its pessimistic generalization guarantees. The source of these pessimistic bounds is a model that permits arbitrary, possibly adversarial analysts that optimally use information to bias results. While being a central issue in the field, still lacking are notions of natural analysts that allow for more optimistic bounds faithful to the reality tha…

    Submitted 11 May, 2019; v1 submitted 30 January, 2019; originally announced January 2019.

    Comments: 22 pages

  21. arXiv:1812.05068  [pdf, other]

    stat.ME cs.LG math.ST stat.ML

    Asynchronous Online Testing of Multiple Hypotheses

    Authors: Tijana Zrnic, Aaditya Ramdas, Michael I. Jordan

    Abstract: We consider the problem of asynchronous online testing, aimed at providing control of the false discovery rate (FDR) during a continual stream of data collection and testing, where each test may be a sequential test that can start and stop at arbitrary times. This setting increasingly characterizes real-world applications in science and industry, where teams of researchers across large organizatio…

    Submitted 21 August, 2020; v1 submitted 12 December, 2018; originally announced December 2018.

    Comments: 36 pages, 16 figures

  22. arXiv:1802.09098  [pdf, other]

    stat.ME cs.LG math.ST

    SAFFRON: an adaptive algorithm for online control of the false discovery rate

    Authors: Aaditya Ramdas, Tijana Zrnic, Martin Wainwright, Michael Jordan

    Abstract: In the online false discovery rate (FDR) problem, one observes a possibly infinite sequence of $p$-values $P_1,P_2,\dots$, each testing a different null hypothesis, and an algorithm must pick a sequence of rejection thresholds $\alpha_1,\alpha_2,\dots$ in an online fashion, effectively rejecting the $k$-th null hypothesis whenever $P_k \leq \alpha_k$. Importantly, $\alpha_k$ must be a function of the past, and cannot…

    Submitted 10 July, 2019; v1 submitted 25 February, 2018; originally announced February 2018.

    Comments: 19 pages, 13 figures
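
The online testing protocol described in the abstract of entry 22 (observe $P_k$, choose $\alpha_k$ from the past only, reject when $P_k \leq \alpha_k$) can be sketched roughly as below. This is not SAFFRON: the threshold rule shown is a plain alpha-spending baseline, $\alpha_k = \alpha \cdot 6 / (\pi^2 k^2)$, picked here purely as an assumption for illustration, and the function names are hypothetical.

```python
import math
from typing import Callable, Iterable, List, Sequence

# A threshold rule sees only the test index k and the past rejection decisions.
ThresholdRule = Callable[[int, Sequence[bool]], float]


def alpha_spending_threshold(
    k: int, past_rejections: Sequence[bool], alpha: float = 0.05
) -> float:
    """Illustrative baseline (not SAFFRON): alpha_k = alpha * 6 / (pi^2 * k^2).

    The thresholds sum to alpha over k = 1, 2, ..., and this rule ignores
    past rejections entirely. Adaptive procedures would use them.
    """
    return alpha * 6.0 / (math.pi ** 2 * k ** 2)


def run_online_testing(
    p_values: Iterable[float], threshold_rule: ThresholdRule
) -> List[bool]:
    """Process p-values one at a time; reject the k-th null when P_k <= alpha_k."""
    rejections: List[bool] = []
    for k, p in enumerate(p_values, start=1):
        # alpha_k is computed before P_k is compared, from past decisions only.
        alpha_k = threshold_rule(k, rejections)
        rejections.append(p <= alpha_k)
    return rejections


if __name__ == "__main__":
    decisions = run_online_testing(
        [0.001, 0.5, 0.0001, 0.04], alpha_spending_threshold
    )
    print(decisions)
```

Passing the rule in as a callable keeps the loop unchanged when a different threshold sequence is substituted; any such rule must still depend only on `k` and the decisions made so far.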