Showing 1–6 of 6 results for author: Mazzetto, A

  1. arXiv:2403.05446  [pdf, ps, other]

    cs.LG stat.ML

    An Improved Algorithm for Learning Drifting Discrete Distributions

    Authors: Alessio Mazzetto

    Abstract: We present a new adaptive algorithm for learning discrete distributions under distribution drift. In this setting, we observe a sequence of independent samples from a discrete distribution that is changing over time, and the goal is to estimate the current distribution. Since we have access to only a single sample for each time step, a good estimation requires a careful choice of the number of pas…

    Submitted 8 March, 2024; originally announced March 2024.

    Comments: To be published in AISTATS 2024

  2. arXiv:2306.01658  [pdf, other]

    cs.LG

    An Adaptive Method for Weak Supervision with Drifting Data

    Authors: Alessio Mazzetto, Reza Esfandiarpoor, Eli Upfal, Stephen H. Bach

    Abstract: We introduce an adaptive method with formal quality guarantees for weak supervision in a non-stationary setting. Our goal is to infer the unknown labels of a sequence of data by using weak supervision sources that provide independent noisy signals of the correct classification for each data point. This setting includes crowdsourcing and programmatic weak supervision. We focus on the non-stationary…

    Submitted 2 June, 2023; originally announced June 2023.

  3. arXiv:2305.02252  [pdf, ps, other]

    cs.LG

    An Adaptive Algorithm for Learning with Unknown Distribution Drift

    Authors: Alessio Mazzetto, Eli Upfal

    Abstract: We develop and analyze a general technique for learning with an unknown distribution drift. Given a sequence of independent observations from the last $T$ steps of a drifting distribution, our algorithm agnostically learns a family of functions with respect to the current distribution at time $T$. Unlike previous work, our technique does not require prior knowledge about the magnitude of the drift…

    Submitted 27 October, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: Updated camera-ready version with minor changes in text for readability, and including a new small section on linear regression

  4. arXiv:2302.02460  [pdf, other]

    cs.LG stat.ML

    Nonparametric Density Estimation under Distribution Drift

    Authors: Alessio Mazzetto, Eli Upfal

    Abstract: We study nonparametric density estimation in non-stationary drift settings. Given a sequence of independent samples taken from a distribution that gradually changes in time, the goal is to compute the best estimate for the current distribution. We prove tight minimax risk bounds for both discrete and continuous smooth densities, where the minimum is over all possible estimates and the maximum is o…

    Submitted 27 October, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: Camera Ready version

  5. arXiv:2205.13068  [pdf, other]

    cs.LG

    Tight Lower Bounds on Worst-Case Guarantees for Zero-Shot Learning with Attributes

    Authors: Alessio Mazzetto, Cristina Menghini, Andrew Yuan, Eli Upfal, Stephen H. Bach

    Abstract: We develop a rigorous mathematical analysis of zero-shot learning with attributes. In this setting, the goal is to label novel classes with no training data, only detectors for attributes and a description of how those attributes are correlated with the target classes, called the class-attribute matrix. We develop the first non-trivial lower bound on the worst-case error of the best map from attri…

    Submitted 28 November, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

  6. arXiv:1904.12728  [pdf, ps, other]

    cs.DC cs.DS

    Accurate MapReduce Algorithms for $k$-median and $k$-means in General Metric Spaces

    Authors: Alessio Mazzetto, Andrea Pietracaprina, Geppino Pucci

    Abstract: Center-based clustering is a fundamental primitive for data analysis and becomes very challenging for large datasets. In this paper, we focus on the popular $k$-median and $k$-means variants which, given a set $P$ of points from a metric space and a parameter $k<|P|$, require to identify a set $S$ of $k$ centers minimizing, respectively, the sum of the distances and of the squared distances of all…

    Submitted 29 September, 2019; v1 submitted 29 April, 2019; originally announced April 2019.