
Showing 1–11 of 11 results for author: Donhauser, K

  1. arXiv:2404.18905  [pdf, other]

    stat.ME cs.LG stat.ML

    Detecting critical treatment effect bias in small subgroups

    Authors: Piersilvio De Bartolomeis, Javier Abad, Konstantin Donhauser, Fanny Yang

    Abstract: Randomized trials are considered the gold standard for making informed decisions in medicine, yet they often lack generalizability to the patient populations in clinical practice. Observational studies, on the other hand, cover a broader patient population but are prone to various biases. Thus, before using an observational study for decision-making, it is crucial to benchmark its treatment effect…

    Submitted 29 April, 2024; originally announced April 2024.

    Comments: Accepted for presentation at the Conference on Uncertainty in Artificial Intelligence (UAI) 2024

  2. arXiv:2401.17823  [pdf, other]

    cs.LG cs.CR

    Privacy-preserving data release leveraging optimal transport and particle gradient descent

    Authors: Konstantin Donhauser, Javier Abad, Neha Hulkund, Fanny Yang

    Abstract: We present a novel approach for differentially private data synthesis of protected tabular datasets, a relevant task in highly sensitive domains such as healthcare and government. Current state-of-the-art methods predominantly use marginal-based approaches, where a dataset is generated from private estimates of the marginals. In this paper, we introduce PrivPGD, a new generation method for margina…

    Submitted 12 February, 2024; v1 submitted 31 January, 2024; originally announced January 2024.

  3. arXiv:2312.03871  [pdf, other]

    stat.ML cs.LG

    Hidden yet quantifiable: A lower bound for confounding strength using randomized trials

    Authors: Piersilvio De Bartolomeis, Javier Abad, Konstantin Donhauser, Fanny Yang

    Abstract: In the era of fast-paced precision medicine, observational studies play a major role in properly evaluating new treatments in clinical practice. Yet, unobserved confounding can significantly compromise causal conclusions drawn from non-randomized data. We propose a novel strategy that leverages randomized trials to quantify unobserved confounding. First, we design a statistical test to detect unob…

    Submitted 1 May, 2024; v1 submitted 6 December, 2023; originally announced December 2023.

    Comments: Accepted for presentation at the International Conference on Artificial Intelligence and Statistics (AISTATS) 2024

  4. arXiv:2302.09680  [pdf, other]

    cs.CR

    Certified private data release for sparse Lipschitz functions

    Authors: Konstantin Donhauser, Johan Lokna, Amartya Sanyal, March Boedihardjo, Robert Hönig, Fanny Yang

    Abstract: As machine learning has become more relevant for everyday applications, a natural requirement is the protection of the privacy of the training data. When the relevant learning questions are unknown in advance, or hyper-parameter tuning plays a central role, one solution is to release a differentially private synthetic data set that leads to similar conclusions as the original training data. In thi…

    Submitted 28 August, 2023; v1 submitted 19 February, 2023; originally announced February 2023.

    Comments: Revision with major changes

  5. arXiv:2301.07605  [pdf, other]

    stat.ML cs.LG

    Strong inductive biases provably prevent harmless interpolation

    Authors: Michael Aerni, Marco Milanta, Konstantin Donhauser, Fanny Yang

    Abstract: Classical wisdom suggests that estimators should avoid fitting noise to achieve good generalization. In contrast, modern overparameterized models can yield small test error despite interpolating noise -- a phenomenon often called "benign overfitting" or "harmless interpolation". This paper argues that the degree to which interpolation is harmless hinges upon the strength of an estimator's inductiv…

    Submitted 1 March, 2023; v1 submitted 18 January, 2023; originally announced January 2023.

    Comments: Accepted at ICLR 2023

  6. arXiv:2212.03783  [pdf, ps, other]

    stat.ML cs.LG

    Tight bounds for maximum $\ell_1$-margin classifiers

    Authors: Stefan Stojanovic, Konstantin Donhauser, Fanny Yang

    Abstract: Popular iterative algorithms such as boosting methods and coordinate descent on linear models converge to the maximum $\ell_1$-margin classifier, a.k.a. sparse hard-margin SVM, in high dimensional regimes where the data is linearly separable. Previous works consistently show that many estimators relying on the $\ell_1$-norm achieve improved statistical rates for hard sparse ground truths. We show…

    Submitted 20 January, 2023; v1 submitted 7 December, 2022; originally announced December 2022.

  7. arXiv:2203.03597  [pdf, other]

    stat.ML cs.LG

    Fast Rates for Noisy Interpolation Require Rethinking the Effects of Inductive Bias

    Authors: Konstantin Donhauser, Nicolo Ruggeri, Stefan Stojanovic, Fanny Yang

    Abstract: Good generalization performance on high-dimensional data crucially hinges on a simple structure of the ground truth and a corresponding strong inductive bias of the estimator. Even though this intuition is valid for regularized models, in this paper we caution against a strong inductive bias for interpolation in the presence of noise: While a stronger inductive bias encourages a simpler structure…

    Submitted 26 October, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

  8. arXiv:2111.05987  [pdf, other]

    math.ST cs.IT cs.LG stat.ML

    Tight bounds for minimum l1-norm interpolation of noisy data

    Authors: Guillaume Wang, Konstantin Donhauser, Fanny Yang

    Abstract: We provide matching upper and lower bounds of order $\sigma^2/\log(d/n)$ for the prediction error of the minimum $\ell_1$-norm interpolator, a.k.a. basis pursuit. Our result is tight up to negligible terms when $d \gg n$, and is the first to imply asymptotic consistency of noisy minimum-norm interpolation for isotropic features and sparse ground truths. Our work complements the literature on "benign ov…

    Submitted 7 March, 2022; v1 submitted 10 November, 2021; originally announced November 2021.

    Comments: 33 pages, 1 figure; accepted to AISTATS 2022

  9. arXiv:2108.02883  [pdf, other]

    stat.ML cs.LG

    Interpolation can hurt robust generalization even when there is no noise

    Authors: Konstantin Donhauser, Alexandru Ţifrea, Michael Aerni, Reinhard Heckel, Fanny Yang

    Abstract: Numerous recent works show that overparameterization implicitly reduces variance for min-norm interpolators and max-margin classifiers. These findings suggest that ridge regularization has vanishing benefits in high dimensions. We challenge this narrative by showing that, even in the absence of noise, avoiding interpolation through ridge regularization can significantly improve generalization. We…

    Submitted 16 December, 2021; v1 submitted 5 August, 2021; originally announced August 2021.

  10. arXiv:2104.04244  [pdf, other]

    math.ST cs.LG stat.ML

    How rotational invariance of common kernels prevents generalization in high dimensions

    Authors: Konstantin Donhauser, Mingqi Wu, Fanny Yang

    Abstract: Kernel ridge regression is well-known to achieve minimax optimal rates in low-dimensional settings. However, its behavior in high dimensions is much less understood. Recent work establishes consistency for kernel regression under certain assumptions on the ground truth function and the distribution of the input data. In this paper, we show that the rotational invariance property of commonly studie…

    Submitted 9 April, 2021; originally announced April 2021.

  11. arXiv:1903.07992  [pdf, other]

    cs.CV

    Efficient Smoothing of Dilated Convolutions for Image Segmentation

    Authors: Thomas Ziegler, Manuel Fritsche, Lorenz Kuhn, Konstantin Donhauser

    Abstract: Dilated Convolutions have been shown to be highly useful for the task of image segmentation. By introducing gaps into convolutional filters, they enable the use of larger receptive fields without increasing the original kernel size. Even though this allows for the inexpensive capturing of features at different scales, the structure of the dilated convolutional filter leads to a loss of information…

    Submitted 19 March, 2019; originally announced March 2019.