Showing 1–50 of 69 results for author: Kane, D

Searching in archive stat.
  1. arXiv:2406.02628  [pdf, ps, other]

    stat.ML cs.CC cs.DS cs.LG

    Replicability in High Dimensional Statistics

    Authors: Max Hopkins, Russell Impagliazzo, Daniel Kane, Sihan Liu, Christopher Ye

    Abstract: The replicability crisis is a major issue across nearly all areas of empirical science, calling for the formal study of replicability in statistics. Motivated in this context, [Impagliazzo, Lei, Pitassi, and Sorrell STOC 2022] introduced the notion of replicable learning algorithms, and gave basic procedures for $1$-dimensional tasks including statistical queries. In this work, we study the comput…

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 119 pages

    ACM Class: F.2.0

  2. arXiv:2403.10416  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Robust Sparse Estimation for Gaussians with Optimal Error under Huber Contamination

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas

    Abstract: We study Gaussian sparse estimation tasks in Huber's contamination model with a focus on mean estimation, PCA, and linear regression. For each of these tasks, we give the first sample and computationally efficient robust estimators with optimal error guarantees, within constant factors. All prior efficient algorithms for these tasks incur quantitatively suboptimal error. Concretely, for Gaussian r…

    Submitted 15 March, 2024; originally announced March 2024.

  3. arXiv:2403.04744  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    SQ Lower Bounds for Non-Gaussian Component Analysis with Weaker Assumptions

    Authors: Ilias Diakonikolas, Daniel Kane, Lisheng Ren, Yuxin Sun

    Abstract: We study the complexity of Non-Gaussian Component Analysis (NGCA) in the Statistical Query (SQ) model. Prior work developed a general methodology to prove SQ lower bounds for this task that have been applicable to a wide range of contexts. In particular, it was known that for any univariate distribution $A$ satisfying certain conditions, distinguishing between a standard multivariate Gaussian and…

    Submitted 7 March, 2024; originally announced March 2024.

    Comments: Conference version published in NeurIPS 2023

  4. arXiv:2403.02300  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Statistical Query Lower Bounds for Learning Truncated Gaussians

    Authors: Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis

    Abstract: We study the problem of estimating the mean of an identity covariance Gaussian in the truncated setting, in the regime when the truncation set comes from a low-complexity family $\mathcal{C}$ of sets. Specifically, for a fixed but unknown truncation set $S \subseteq \mathbb{R}^d$, we are given access to samples from the distribution $\mathcal{N}(\boldsymbol{μ}, \mathbf{I})$ truncated to the set…

    Submitted 4 March, 2024; originally announced March 2024.

  5. arXiv:2312.16616  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Agnostically Learning Multi-index Models with Queries

    Authors: Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

    Abstract: We study the power of query access for the task of agnostic learning under the Gaussian distribution. In the agnostic model, no assumptions are made on the labels and the goal is to compute a hypothesis that is competitive with the {\em best-fit} function in a known class, i.e., it achieves error $\mathrm{opt}+ε$, where $\mathrm{opt}$ is the error of the best function in the class. We focus on a g…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: abstract shortened due to arxiv requirements

  6. arXiv:2312.11769  [pdf, other]

    cs.LG cs.DS cs.IT math.ST stat.ML

    Clustering Mixtures of Bounded Covariance Distributions Under Optimal Separation

    Authors: Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Thanasis Pittas

    Abstract: We study the clustering problem for mixtures of bounded covariance distributions, under a fine-grained separation assumption. Specifically, given samples from a $k$-component mixture distribution $D = \sum_{i =1}^k w_i P_i$, where each $w_i \ge α$ for some known parameter $α$, and each $P_i$ has unknown covariance $Σ_i \preceq σ^2_i \cdot I_d$ for some unknown $σ_i$, the goal is to cluster the sam…

    Submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2312.01547  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Near-Optimal Algorithms for Gaussians with Huber Contamination: Mean Estimation and Linear Regression

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas

    Abstract: We study the fundamental problems of Gaussian mean estimation and linear regression with Gaussian covariates in the presence of Huber contamination. Our main contribution is the design of the first sample near-optimal and almost linear-time algorithms with optimal error guarantees for both of these problems. Specifically, for Gaussian robust mean estimation on $\mathbb{R}^d$ with contamination par…

    Submitted 3 December, 2023; originally announced December 2023.

    Comments: To appear in NeurIPS 2023

  8. arXiv:2311.13154  [pdf, other]

    cs.DS cs.IT cs.LG math.ST stat.ML

    Testing Closeness of Multivariate Distributions via Ramsey Theory

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sihan Liu

    Abstract: We investigate the statistical task of closeness (or equivalence) testing for multidimensional distributions. Specifically, given sample access to two unknown distributions $\mathbf p, \mathbf q$ on $\mathbb R^d$, we want to distinguish between the case that $\mathbf p=\mathbf q$ versus $\|\mathbf p-\mathbf q\|_{A_k} > ε$, where $\|\mathbf p-\mathbf q\|_{A_k}$ denotes the generalized ${A}_k$ dista…

    Submitted 21 November, 2023; originally announced November 2023.

  9. arXiv:2310.15932  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Online Robust Mean Estimation

    Authors: Daniel M. Kane, Ilias Diakonikolas, Hanshen Xiao, Sihan Liu

    Abstract: We study the problem of high-dimensional robust mean estimation in an online setting. Specifically, we consider a scenario where $n$ sensors are measuring some common, ongoing phenomenon. At each time step $t=1,2,\ldots,T$, the $i^{th}$ sensor reports its readings $x^{(i)}_t$ for that time step. The algorithm must then commit to its estimate $μ_t$ for the true mean value of the process at time…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: To appear in SODA 2024

  10. arXiv:2310.11876  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    SQ Lower Bounds for Learning Mixtures of Linear Classifiers

    Authors: Ilias Diakonikolas, Daniel M. Kane, Yuxin Sun

    Abstract: We study the problem of learning mixtures of linear classifiers under Gaussian covariates. Given sample access to a mixture of $r$ distributions on $\mathbb{R}^n$ of the form $(\mathbf{x},y_{\ell})$, $\ell\in [r]$, where $\mathbf{x}\sim\mathcal{N}(0,\mathbf{I}_n)$ and $y_\ell=\mathrm{sign}(\langle\mathbf{v}_\ell,\mathbf{x}\rangle)$ for an unknown unit vector $\mathbf{v}_\ell$, the goal is to learn…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: To appear in NeurIPS 2023

  11. arXiv:2309.06983  [pdf, ps, other]

    stat.OT

    Creating Community in a Data Science Classroom

    Authors: David Kane

    Abstract: A community is a collection of people who know and care about each other. The vast majority of college courses are not communities. This is especially true of statistics and data science courses, both because our classes are larger and because we are more likely to lecture. However, it is possible to create a community in your classroom. This article offers an idiosyncratic set of practices for cr…

    Submitted 13 September, 2023; originally announced September 2023.

  12. arXiv:2307.12840  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials

    Authors: Ilias Diakonikolas, Daniel M. Kane

    Abstract: We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss. Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/ε)^{O(k)}$, where $ε>0$ is the target accuracy. Prior work had given an algorithm for this problem with complexity…

    Submitted 25 July, 2023; v1 submitted 24 July, 2023; originally announced July 2023.

  13. arXiv:2307.08438  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise

    Authors: Ilias Diakonikolas, Jelena Diakonikolas, Daniel M. Kane, Puqian Wang, Nikos Zarifis

    Abstract: We study the problem of learning general (i.e., not necessarily homogeneous) halfspaces with Random Classification Noise under the Gaussian distribution. We establish nearly-matching algorithmic and Statistical Query (SQ) lower bound results revealing a surprising information-computation gap for this basic problem. Specifically, the sample complexity of this learning problem is $\widetilde{Θ}(d/ε)$,…

    Submitted 13 July, 2023; originally announced July 2023.

  14. arXiv:2306.16352  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Information-Computation Tradeoffs for Learning Margin Halfspaces with Random Classification Noise

    Authors: Ilias Diakonikolas, Jelena Diakonikolas, Daniel M. Kane, Puqian Wang, Nikos Zarifis

    Abstract: We study the problem of PAC learning $γ$-margin halfspaces with Random Classification Noise. We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms. Concretely, the sample complexity of the problem is $\widetilde{Θ}(1/(γ^2 ε))$. We start by giving a simple efficient alg…

    Submitted 28 June, 2023; originally announced June 2023.

  15. arXiv:2306.13057  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    SQ Lower Bounds for Learning Bounded Covariance GMMs

    Authors: Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis

    Abstract: We study the complexity of learning mixtures of separated Gaussians with common unknown bounded covariance matrix. Specifically, we focus on learning Gaussian mixture models (GMMs) on $\mathbb{R}^d$ of the form $P= \sum_{i=1}^k w_i \mathcal{N}(\boldsymbol μ_i,\mathbf Σ_i)$, where $\mathbf Σ_i = \mathbf Σ\preceq \mathbf I$ and $\min_{i \neq j} \| \boldsymbol μ_i - \boldsymbol μ_j\|_2 \geq k^ε$ for…

    Submitted 22 June, 2023; originally announced June 2023.

  16. arXiv:2305.02544  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Nearly-Linear Time and Streaming Algorithms for Outlier-Robust PCA

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas

    Abstract: We study principal component analysis (PCA), where given a dataset in $\mathbb{R}^d$ from a distribution, the task is to find a unit vector $v$ that approximately maximizes the variance of the distribution after being projected along $v$. Despite being a classical task, standard estimators fail drastically if the data contains even a small fraction of outliers, motivating the problem of robust PCA…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: To appear in ICML 2023

  17. arXiv:2305.00966  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    A Spectral Algorithm for List-Decodable Covariance Estimation in Relative Frobenius Norm

    Authors: Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Ankit Pensia, Thanasis Pittas

    Abstract: We study the problem of list-decodable Gaussian covariance estimation. Given a multiset $T$ of $n$ points in $\mathbb R^d$ such that an unknown $α<1/2$ fraction of points in $T$ are i.i.d. samples from an unknown Gaussian $\mathcal{N}(μ, Σ)$, the goal is to output a list of $O(1/α)$ hypotheses at least one of which is close to $Σ$ in relative Frobenius norm. Our main result is a…

    Submitted 1 May, 2023; originally announced May 2023.

  18. arXiv:2303.05485  [pdf, ps, other]

    cs.LG stat.ML

    Efficient Testable Learning of Halfspaces with Adversarial Label Noise

    Authors: Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Sihan Liu, Nikos Zarifis

    Abstract: We give the first polynomial-time algorithm for the testable learning of halfspaces in the presence of adversarial label noise under the Gaussian distribution. In the recently introduced testable learning model, one is required to produce a tester-learner such that if the data passes the tester, then one can trust the output of the robust learner on the data. Our tester-learner runs in time…

    Submitted 9 March, 2023; originally announced March 2023.

  19. arXiv:2302.06285  [pdf, ps, other]

    cs.LG stat.ML

    Do PAC-Learners Learn the Marginal Distribution?

    Authors: Max Hopkins, Daniel M. Kane, Shachar Lovett, Gaurav Mahajan

    Abstract: We study a foundational variant of Valiant and Vapnik and Chervonenkis' Probably Approximately Correct (PAC)-Learning in which the adversary is restricted to a known family of marginal distributions $\mathscr{P}$. In particular, we study how the PAC-learnability of a triple $(\mathscr{P},X,H)$ relates to the learner's ability to infer \emph{distributional} information about the adversary's choice o…

    Submitted 13 February, 2023; originally announced February 2023.

    MSC Class: 68Q32

  20. arXiv:2212.11221  [pdf, ps, other]

    math.PR cs.DS cs.LG math.ST stat.ML

    A Nearly Tight Bound for Fitting an Ellipsoid to Gaussian Random Points

    Authors: Daniel M. Kane, Ilias Diakonikolas

    Abstract: We prove that, for $c>0$ a sufficiently small universal constant, a random set of $c d^2/\log^4(d)$ independent Gaussian random points in $\mathbb{R}^d$ lies on a common ellipsoid with high probability. This nearly establishes a conjecture of~\cite{SaundersonCPW12}, within logarithmic factors. The latter conjecture has attracted significant attention over the past decade, due to its connections…

    Submitted 21 December, 2022; originally announced December 2022.

  21. arXiv:2212.03008  [pdf, ps, other]

    cs.DS cs.CC cs.LG stat.ML

    A Strongly Polynomial Algorithm for Approximate Forster Transforms and its Application to Halfspace Learning

    Authors: Ilias Diakonikolas, Christos Tzamos, Daniel M. Kane

    Abstract: The Forster transform is a method of regularizing a dataset by placing it in {\em radial isotropic position} while maintaining some of its essential properties. Forster transforms have played a key role in a diverse range of settings spanning computer science and functional analysis. Prior work had given {\em weakly} polynomial time algorithms for computing Forster transforms, when they exist. Our…

    Submitted 6 December, 2022; originally announced December 2022.

  22. arXiv:2211.16333  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

    Authors: Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Ankit Pensia

    Abstract: We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean $μ$ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates $μ$ with high probability. Prior work had obtained…

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: To appear in NeurIPS 2022

  23. arXiv:2210.13706  [pdf, ps, other]

    math.ST cs.DS cs.LG stat.ML

    Gaussian Mean Testing Made Simple

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia

    Abstract: We study the following fundamental hypothesis testing problem, which we term Gaussian mean testing. Given i.i.d. samples from a distribution $p$ on $\mathbb{R}^d$, the task is to distinguish, with high probability, between the following cases: (i) $p$ is the standard Gaussian distribution, $\mathcal{N}(0,I_d)$, and (ii) $p$ is a Gaussian $\mathcal{N}(μ,Σ)$ for some unknown covariance $Σ$ and mean…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: To appear in SIAM Symposium on Simplicity in Algorithms (SOSA) 2023

  24. arXiv:2210.09949  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    SQ Lower Bounds for Learning Single Neurons with Massart Noise

    Authors: Ilias Diakonikolas, Daniel M. Kane, Lisheng Ren, Yuxin Sun

    Abstract: We study the problem of PAC learning a single neuron in the presence of Massart noise. Specifically, for a known activation function $f: \mathbb{R} \to \mathbb{R}$, the learner is given access to labeled examples $(\mathbf{x}, y) \in \mathbb{R}^d \times \mathbb{R}$, where the marginal distribution of $\mathbf{x}$ is arbitrary and the corresponding label $y$ is a Massart corruption of…

    Submitted 18 October, 2022; originally announced October 2022.

    Comments: To appear in NeurIPS 2022

  25. arXiv:2206.05245  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    List-Decodable Sparse Mean Estimation via Difference-of-Pairs Filtering

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas

    Abstract: We study the problem of list-decodable sparse mean estimation. Specifically, for a parameter $α\in (0, 1/2)$, we are given $m$ points in $\mathbb{R}^n$, $\lfloor αm \rfloor$ of which are i.i.d. samples from a distribution $D$ with unknown $k$-sparse mean $μ$. No assumptions are made on the remaining points, which form the majority of the dataset. The goal is to return a small list of candidates co…

    Submitted 5 July, 2024; v1 submitted 10 June, 2022; originally announced June 2022.

    Comments: Added fact about taking roots in SoS proofs (Fact 2.9)

  26. arXiv:2206.04589  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Optimal SQ Lower Bounds for Robustly Learning Discrete Product Distributions and Ising Models

    Authors: Ilias Diakonikolas, Daniel M. Kane, Yuxin Sun

    Abstract: We establish optimal Statistical Query (SQ) lower bounds for robustly learning certain families of discrete high-dimensional distributions. In particular, we show that no efficient SQ algorithm with access to an $ε$-corrupted binary product distribution can learn its mean within $\ell_2$-error $o(ε\sqrt{\log(1/ε)})$. Similarly, we show that no efficient SQ algorithm with access to an $ε$-corrupted…

    Submitted 9 June, 2022; originally announced June 2022.

    Comments: To appear in COLT 2022

  27. arXiv:2206.03441  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Robust Sparse Mean Estimation via Sum of Squares

    Authors: Ilias Diakonikolas, Daniel M. Kane, Sushrut Karmalkar, Ankit Pensia, Thanasis Pittas

    Abstract: We study the problem of high-dimensional sparse mean estimation in the presence of an $ε$-fraction of adversarial outliers. Prior work obtained sample and computationally efficient algorithms for this task for identity-covariance subgaussian distributions. In this work, we develop the first efficient algorithms for robust sparse mean estimation without a priori knowledge of the covariance. For dis…

    Submitted 5 July, 2024; v1 submitted 7 June, 2022; originally announced June 2022.

    Comments: Fixed minor oversight in runtime calculation

  28. arXiv:2204.12399  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Streaming Algorithms for High-Dimensional Robust Statistics

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas

    Abstract: We study high-dimensional robust statistics tasks in the streaming model. A recent line of work obtained computationally efficient algorithms for a range of high-dimensional robust estimation tasks. Unfortunately, all previous algorithms require storing the entire dataset, incurring memory at least quadratic in the dimension. In this work, we develop the first efficient streaming algorithms for hi…

    Submitted 3 May, 2023; v1 submitted 26 April, 2022; originally announced April 2022.

  29. arXiv:2202.05444  [pdf, ps, other]

    cs.LG cs.AI cs.CC math.OC stat.ML

    Computational-Statistical Gaps in Reinforcement Learning

    Authors: Daniel Kane, Sihan Liu, Shachar Lovett, Gaurav Mahajan

    Abstract: Reinforcement learning with function approximation has recently achieved tremendous results in applications with large state spaces. This empirical success has motivated a growing body of theoretical work proposing necessary and sufficient conditions under which efficient reinforcement learning is possible. From this line of work, a remarkably simple minimal sufficient condition has emerged for sa…

    Submitted 2 July, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: Updated references. Added discussion on linear Q* and V* only over reachable states

  30. arXiv:2112.09104  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Non-Gaussian Component Analysis via Lattice Basis Reduction

    Authors: Ilias Diakonikolas, Daniel M. Kane

    Abstract: Non-Gaussian Component Analysis (NGCA) is the following distribution learning problem: Given i.i.d. samples from a distribution on $\mathbb{R}^d$ that is non-gaussian in a hidden direction $v$ and an independent standard Gaussian in the orthogonal directions, the goal is to approximate the hidden direction $v$. Prior work \cite{DKS17-sq} provided formal evidence for the existence of an information…

    Submitted 16 December, 2021; originally announced December 2021.

  31. Realizable Learning is All You Need

    Authors: Max Hopkins, Daniel M. Kane, Shachar Lovett, Gaurav Mahajan

    Abstract: The equivalence of realizable and agnostic learnability is a fundamental phenomenon in learning theory. With variants ranging from classical settings like PAC learning and regression to recent trends such as adversarially robust learning, it's surprising that we still lack a unified theory; traditional proofs of the equivalence tend to be disparate, and rely on strong model-specific assumptions li…

    Submitted 2 February, 2024; v1 submitted 8 November, 2021; originally announced November 2021.

    MSC Class: 68Q32

    Journal ref: TheoretiCS (February 6, 2024) theoretics:10093

  32. arXiv:2109.11515  [pdf, other]

    cs.LG cs.DS math.OC math.ST stat.ML

    Outlier-Robust Sparse Estimation via Non-Convex Optimization

    Authors: Yu Cheng, Ilias Diakonikolas, Rong Ge, Shivam Gupta, Daniel M. Kane, Mahdi Soltanolkotabi

    Abstract: We explore the connection between outlier-robust high-dimensional statistics and non-convex optimization in the presence of sparsity constraints, with a focus on the fundamental tasks of robust sparse mean estimation and robust sparse PCA. We develop novel and simple optimization formulations for these problems such that any approximate stationary point of the associated optimization problem yield…

    Submitted 13 November, 2022; v1 submitted 23 September, 2021; originally announced September 2021.

    Comments: Accepted to Conference on Neural Information Processing Systems (NeurIPS) 2022. (Updated to the NeurIPS'22 version in v2.)

  33. arXiv:2108.08767  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Learning General Halfspaces with General Massart Noise under the Gaussian Distribution

    Authors: Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

    Abstract: We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under the Gaussian distribution. In the Massart model, an adversary is allowed to flip the label of each point $\mathbf{x}$ with unknown probability $η(\mathbf{x}) \leq η$, for some parameter $η\in [0,1/2]$. The goal is to find a hypothesis with misclassification error of $\mathrm{OPT} + ε$, where $\mathrm{OPT}$ i…

    Submitted 8 November, 2021; v1 submitted 19 August, 2021; originally announced August 2021.

    Comments: Revised presentation

  34. arXiv:2107.05582  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Forster Decomposition and Learning Halfspaces with Noise

    Authors: Ilias Diakonikolas, Daniel M. Kane, Christos Tzamos

    Abstract: A Forster transform is an operation that turns a distribution into one with good anti-concentration properties. While a Forster transform does not always exist, we show that any distribution can be efficiently decomposed as a disjoint mixture of few distributions for which a Forster transform exists and can be computed efficiently. As the main application of this result, we obtain the first polyno…

    Submitted 12 July, 2021; originally announced July 2021.

  35. arXiv:2106.09689  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Statistical Query Lower Bounds for List-Decodable Linear Regression

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas, Alistair Stewart

    Abstract: We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples. Specifically, we are given a set $T$ of labeled examples $(x, y) \in \mathbb{R}^d \times \mathbb{R}$ and a parameter $0< α<1/2$ such that an $α$-fraction of the points in $T$ are i.i.d. samples from a linear regression model with Gaussian covariates, and the remaining $(1-α)$-fracti…

    Submitted 17 June, 2021; originally announced June 2021.

  36. arXiv:2106.08537  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean Estimation

    Authors: Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian

    Abstract: We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset. Specifically, we are given a set $T$ of $n$ points in $\mathbb{R}^d$ and a parameter $0< α<\frac 1 2$ such that an $α$-fraction of the points in $T$ are i.i.d. samples from a well-behaved distribution $\mathcal{D}$ and the remaining $(1-α)$-fraction are arbitrary. The goal is to output…

    Submitted 12 November, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: 64 pages, 1 figure. v2 improves results on bounded-covariance clustering, polishes exposition

  37. arXiv:2106.07779  [pdf, ps, other]

    cs.LG stat.ML

    Boosting in the Presence of Massart Noise

    Authors: Ilias Diakonikolas, Russell Impagliazzo, Daniel Kane, Rex Lei, Jessica Sorrell, Christos Tzamos

    Abstract: We study the problem of boosting the accuracy of a weak learner in the (distribution-independent) PAC model with Massart noise. In the Massart noise model, the label of each example $x$ is independently misclassified with probability $η(x) \leq η$, where $η<1/2$. The Massart model lies between the random classification noise model and the agnostic model. Our main positive result is the first compu…

    Submitted 14 June, 2021; originally announced June 2021.

  38. arXiv:2102.05629  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    Agnostic Proper Learning of Halfspaces under Gaussian Marginals

    Authors: Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

    Abstract: We study the problem of agnostically learning halfspaces under the Gaussian distribution. Our main result is the {\em first proper} learning algorithm for this problem whose sample complexity and computational complexity qualitatively match those of the best known improper agnostic learner. Building on this result, we also obtain the first proper polynomial-time approximation scheme (PTAS) for agn…

    Submitted 10 February, 2021; originally announced February 2021.

  39. arXiv:2102.05047  [pdf, other]

    cs.LG stat.ML

    Bounded Memory Active Learning through Enriched Queries

    Authors: Max Hopkins, Daniel Kane, Shachar Lovett, Michal Moshkovitz

    Abstract: The explosive growth of easily-accessible unlabeled data has led to growing interest in active learning, a paradigm in which data-hungry learning algorithms adaptively select informative examples in order to lower prohibitively expensive labeling costs. Unfortunately, in standard worst-case models of learning, the active setting often provides no improvement over non-adaptive algorithms. To comba…

    Submitted 9 February, 2021; originally announced February 2021.

    MSC Class: 68Q32

  40. arXiv:2102.04401  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    The Optimality of Polynomial Regression for Agnostic Learning under Gaussian Marginals

    Authors: Ilias Diakonikolas, Daniel M. Kane, Thanasis Pittas, Nikos Zarifis

    Abstract: We study the problem of agnostic learning under the Gaussian distribution. We develop a method for finding hard families of examples for a wide class of problems by using LP duality. For Boolean-valued concept classes, we show that the $L^1$-regression algorithm is essentially best possible, and therefore that the computational difficulty of agnostically learning a concept class is closely related…

    Submitted 8 February, 2021; originally announced February 2021.

  41. arXiv:2102.02171  [pdf, ps, other]

    cs.LG cs.DS math.PR math.ST stat.ML

    Outlier-Robust Learning of Ising Models Under Dobrushin's Condition

    Authors: Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart, Yuxin Sun

    Abstract: We study the problem of learning Ising models satisfying Dobrushin's condition in the outlier-robust setting where a constant fraction of the samples are adversarially corrupted. Our main result is to provide the first computationally efficient robust learning algorithm for this problem with near-optimal error guarantees. Our algorithm can be seen as a special case of an algorithm for robustly lea…

    Submitted 3 February, 2021; originally announced February 2021.

  42. arXiv:2012.15802  [pdf, ps, other]

    cs.LG math.ST stat.ML

    The Sample Complexity of Robust Covariance Testing

    Authors: Ilias Diakonikolas, Daniel M. Kane

    Abstract: We study the problem of testing the covariance matrix of a high-dimensional Gaussian in a robust setting, where the input distribution has been corrupted in Huber's contamination model. Specifically, we are given i.i.d. samples from a distribution of the form $Z = (1-ε) X + εB$, where $X$ is a zero-mean and unknown covariance Gaussian $\mathcal{N}(0, Σ)$, $B$ is a fixed but unknown noise distribut…

    Submitted 31 December, 2020; originally announced December 2020.

  43. arXiv:2012.14390  [pdf, ps, other]

    stat.OT math.HO

    Kill The Math and Let the Introductory Course Be Born

    Authors: David Kane

    Abstract: Our introductory classes in statistics and data science use too much mathematics. The key causal effect which our students want our classes to have is to improve their future performance and opportunities. The more professional their computing skills (in the context of data analysis), the greater their likely success. Introductory courses should feature almost no mathematical/statistical formulas…

    Submitted 8 December, 2020; originally announced December 2020.

    Comments: 11 pages

  44. arXiv:2012.09720  [pdf, ps, other]

    cs.LG cs.CC math.ST stat.ML

    Near-Optimal Statistical Query Hardness of Learning Halfspaces with Massart Noise

    Authors: Ilias Diakonikolas, Daniel M. Kane

    Abstract: We study the problem of PAC learning halfspaces with Massart noise. Given labeled samples $(x, y)$ from a distribution $D$ on $\mathbb{R}^{d} \times \{ \pm 1\}$ such that the marginal $D_x$ on the examples is arbitrary and the label $y$ of example $x$ is generated from the target halfspace corrupted by a Massart adversary with flipping probability $η(x) \leq η\leq 1/2$, the goal is to compute a hy…

    Submitted 8 November, 2021; v1 submitted 17 December, 2020; originally announced December 2020.

    Comments: This version improves on the previous version. It obtains a near-optimal hardness result essentially matching known algorithms

  45. arXiv:2012.02119  [pdf, other]

    cs.DS cs.LG math.ST stat.ML

    Robustly Learning Mixtures of $k$ Arbitrary Gaussians

    Authors: Ainesh Bakshi, Ilias Diakonikolas, He Jia, Daniel M. Kane, Pravesh K. Kothari, Santosh S. Vempala

    Abstract: We give a polynomial-time algorithm for the problem of robustly estimating a mixture of $k$ arbitrary Gaussians in $\mathbb{R}^d$, for any fixed $k$, in the presence of a constant fraction of arbitrary corruptions. This resolves the main open problem in several previous works on algorithmic robust statistics, which addressed the special cases of robustly estimating (a) a single Gaussian, (b) a mix…

    Submitted 7 June, 2021; v1 submitted 3 December, 2020; originally announced December 2020.

    Comments: This version extends the previous one to yield 1) a robust proper learning algorithm with poly(eps) error and 2) an information-theoretic argument proving that the same algorithms in fact also yield parameter recovery guarantees. The updates are included in Sections 7, 8, and 9, and the main result from the previous version (Thm 1.4) is presented and proved in Section 6

  46. arXiv:2011.09973  [pdf, ps, other]

    cs.DS cs.LG math.OC stat.ML

    List-Decodable Mean Estimation in Nearly-PCA Time

    Authors: Ilias Diakonikolas, Daniel M. Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian

    Abstract: Traditionally, robust statistics has focused on designing estimators tolerant to a minority of contaminated data. Robust list-decodable learning focuses on the more challenging regime where only a minority $\frac 1 k$ fraction of the dataset is drawn from the distribution of interest, and no assumptions are made on the remaining data. We study the fundamental task of list-decodable mean estimation…

    Submitted 19 November, 2020; originally announced November 2020.

    Comments: 57 pages

  47. arXiv:2010.01705  [pdf, ps, other]

    cs.LG cs.DS math.ST stat.ML

    A Polynomial Time Algorithm for Learning Halfspaces with Tsybakov Noise

    Authors: Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

    Abstract: We study the problem of PAC learning homogeneous halfspaces in the presence of Tsybakov noise. In the Tsybakov noise model, the label of every sample is independently flipped with an adversarially controlled probability that can be arbitrarily close to $1/2$ for a fraction of the samples. We give the first polynomial-time algorithm for this fundamental learning problem. Our algorithm learns…

    Submitted 4 October, 2020; originally announced October 2020.

  48. arXiv:2009.06540  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Optimal Testing of Discrete Distributions with High Probability

    Authors: Ilias Diakonikolas, Themis Gouleakis, Daniel M. Kane, John Peebles, Eric Price

    Abstract: We study the problem of testing discrete distributions with a focus on the high probability regime. Specifically, given samples from one or more discrete distributions, a property $\mathcal{P}$, and parameters $0< ε, δ<1$, we want to distinguish with probability at least $1-δ$ whether these distributions satisfy $\mathcal{P}$ or are $ε$-far from $\mathcal{P}$ in total variation distance. Mos…

    Submitted 14 September, 2020; originally announced September 2020.

  49. arXiv:2007.15618  [pdf, ps, other]

    math.ST cs.DS cs.LG stat.ML

    Outlier Robust Mean Estimation with Subgaussian Rates via Stability

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia

    Abstract: We study the problem of outlier robust high-dimensional mean estimation under a finite covariance assumption, and more broadly under finite low-degree moment assumptions. We consider a standard stability condition from the recent robust statistics literature and prove that, except with exponentially small failure probability, there exists a large fraction of the inliers satisfying this condition.…

    Submitted 16 March, 2021; v1 submitted 30 July, 2020; originally announced July 2020.

  50. arXiv:2007.15220  [pdf, ps, other]

    cs.LG cs.CC cs.DS stat.ML

    The Complexity of Adversarially Robust Proper Learning of Halfspaces with Agnostic Noise

    Authors: Ilias Diakonikolas, Daniel M. Kane, Pasin Manurangsi

    Abstract: We study the computational complexity of adversarially robust proper learning of halfspaces in the distribution-independent agnostic PAC model, with a focus on $L_p$ perturbations. We give a computationally efficient learning algorithm and a nearly matching computational hardness result for this problem. An interesting implication of our findings is that the $L_{\infty}$ perturbations case is prov…

    Submitted 30 July, 2020; originally announced July 2020.