Showing 1–5 of 5 results for author: Błasiok, J

  1. arXiv:2305.18764  [pdf, other]

    cs.LG math.ST stat.ML

    When Does Optimizing a Proper Loss Yield Calibration?

    Authors: Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Preetum Nakkiran

    Abstract: Optimizing proper loss functions is popularly believed to yield predictors with good calibration properties; the intuition being that for such losses, the global optimum is to predict the ground-truth probabilities, which is indeed calibrated. However, typical machine learning models are trained to approximately minimize loss over restricted families of predictors, that are unlikely to contain the…

    Submitted 8 December, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: In NeurIPS 2023. Selected for spotlight presentation

  2. arXiv:2304.09424  [pdf, other]

    cs.LG cs.AI stat.ML

    Loss Minimization Yields Multicalibration for Large Neural Networks

    Authors: Jarosław Błasiok, Parikshit Gopalan, Lunjia Hu, Adam Tauman Kalai, Preetum Nakkiran

    Abstract: Multicalibration is a notion of fairness for predictors that requires them to provide calibrated predictions across a large set of protected groups. Multicalibration is known to be a distinct goal from loss minimization, even for simple predictors such as linear functions. In this work, we consider the setting where the protected groups can be represented by neural networks of size $k$, and the…

    Submitted 7 December, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

    Comments: In ITCS 2024

  3. arXiv:2204.03230  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV stat.ML

    What You See is What You Get: Principled Deep Learning via Distributional Generalization

    Authors: Bogdan Kulynych, Yao-Yuan Yang, Yaodong Yu, Jarosław Błasiok, Preetum Nakkiran

    Abstract: Having similar behavior at training time and test time $-$ what we call a "What You See Is What You Get" (WYSIWYG) property $-$ is desirable in machine learning. Models trained with standard stochastic gradient descent (SGD), however, do not necessarily have this property, as their complex behaviors such as robustness or subgroup performance can differ drastically between training and test time. I…

    Submitted 17 October, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

    Comments: First two authors contributed equally. To appear in NeurIPS 2022

  4. arXiv:1809.05596  [pdf, ps, other]

    stat.ME cs.LG math.ST

    The Generic Holdout: Preventing False-Discoveries in Adaptive Data Science

    Authors: Preetum Nakkiran, Jarosław Błasiok

    Abstract: Adaptive data analysis has posed a challenge to science due to its ability to generate false hypotheses on moderately large data sets. In general, with non-adaptive data analyses (where queries to the data are generated without being influenced by answers to previous queries) a data set containing $n$ samples may support exponentially many queries in $n$. This number reduces to linearly many under…

    Submitted 14 September, 2018; originally announced September 2018.

  5. arXiv:1609.05388  [pdf, other]

    stat.ML cs.LG

    ADAGIO: Fast Data-aware Near-Isometric Linear Embeddings

    Authors: Jarosław Błasiok, Charalampos E. Tsourakakis

    Abstract: Many important applications, including signal reconstruction, parameter estimation, and signal processing in a compressed domain, rely on a low-dimensional representation of the dataset that preserves all pairwise distances between the data points and leverages the inherent geometric structure that is typically present. Recently Hegde, Sankaranarayanan, Yin and Baraniuk prop…

    Submitted 17 September, 2016; originally announced September 2016.

    Comments: ICDM 2016