
Showing 1–16 of 16 results for author: d'Orsi, T

  1. arXiv:2406.04868  [pdf, ps, other]

    cs.LG cs.CR cs.DS

    Perturb-and-Project: Differentially Private Similarities and Marginals

    Authors: Vincent Cohen-Addad, Tommaso d'Orsi, Alessandro Epasto, Vahab Mirrokni, Peilin Zhong

    Abstract: We revisit the input perturbations framework for differential privacy where noise is added to the input $A\in \mathcal{S}$ and the result is then projected back to the space of admissible datasets $\mathcal{S}$. Through this framework, we first design novel efficient algorithms to privately release pair-wise cosine similarities. Second, we derive a novel algorithm to compute $k$-way marginal queri…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 21 pages, ICML 2024

    ACM Class: F.2; G.3

  2. arXiv:2406.04860  [pdf, other]

    cs.LG cs.DS stat.ML

    Multi-View Stochastic Block Models

    Authors: Vincent Cohen-Addad, Tommaso d'Orsi, Silvio Lattanzi, Rajai Nasser

    Abstract: Graph clustering is a central topic in unsupervised learning with a multitude of practical applications. In recent years, multi-view graph clustering has gained a lot of attention for its applicability to real-world instances where one has access to multiple data sources. In this paper we formalize a new family of models, called \textit{multi-view stochastic block models} that captures this settin…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 31 pages, ICML 2024

    ACM Class: F.2; G.3

  3. arXiv:2406.04857  [pdf, ps, other]

    cs.DS cs.LG

    A Near-Linear Time Approximation Algorithm for Beyond-Worst-Case Graph Clustering

    Authors: Vincent Cohen-Addad, Tommaso d'Orsi, Aida Mousavifar

    Abstract: We consider the semi-random graph model of [Makarychev, Makarychev and Vijayaraghavan, STOC'12], where, given a random bipartite graph with $α$ edges and an unknown bipartition $(A, B)$ of the vertex set, an adversary can add arbitrary edges inside each community and remove arbitrary edges from the cut $(A, B)$ (i.e. all adversarial changes are \textit{monotone} with respect to the bipartition). F…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 24 pages, ICML 2024

    ACM Class: F.2; G.3

  4. arXiv:2403.12213  [pdf, ps, other]

    cs.DS cs.CC cs.LG stat.ML

    Private graphon estimation via sum-of-squares

    Authors: Hongjie Chen, Jingqiu Ding, Tommaso d'Orsi, Yiding Hua, Chih-Hung Liu, David Steurer

    Abstract: We develop the first pure node-differentially-private algorithms for learning stochastic block models and for graphon estimation with polynomial running time for any constant number of blocks. The statistical utility guarantees match those of the previous best information-theoretic (exponential-time) node-private mechanisms for these problems. The algorithm is based on an exponential mechanism for…

    Submitted 18 April, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: 71 pages, accepted to STOC 2024

  5. arXiv:2402.18263  [pdf, ps, other]

    cs.DS cs.CC

    Max-Cut with $ε$-Accurate Predictions

    Authors: Vincent Cohen-Addad, Tommaso d'Orsi, Anupam Gupta, Euiwoong Lee, Debmalya Panigrahi

    Abstract: We study the approximability of the MaxCut problem in the presence of predictions. Specifically, we consider two models: in the noisy predictions model, for each vertex we are given its correct label in $\{-1,+1\}$ with some unknown probability $1/2 + ε$, and the other (incorrect) label otherwise. In the more-informative partial predictions model, for each vertex we are given its correct label wit…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 18 pages

    ACM Class: F.0

  6. arXiv:2305.10227  [pdf, ps, other]

    cs.LG cs.SI stat.ML

    Reaching Kesten-Stigum Threshold in the Stochastic Block Model under Node Corruptions

    Authors: Jingqiu Ding, Tommaso d'Orsi, Yiding Hua, David Steurer

    Abstract: We study robust community detection in the context of node-corrupted stochastic block model, where an adversary can arbitrarily modify all the edges incident to a fraction of the $n$ vertices. We present the first polynomial-time algorithm that achieves weak recovery at the Kesten-Stigum threshold even in the presence of a small constant fraction of corrupted nodes. Prior to this work, even state-…

    Submitted 17 May, 2023; originally announced May 2023.

  7. arXiv:2301.04822  [pdf, ps, other]

    cs.DS cs.CR cs.LG stat.ML

    Private estimation algorithms for stochastic block models and mixture models

    Authors: Hongjie Chen, Vincent Cohen-Addad, Tommaso d'Orsi, Alessandro Epasto, Jacob Imola, David Steurer, Stefan Tiegel

    Abstract: We introduce general tools for designing efficient private estimation algorithms, in the high-dimensional settings, whose statistical guarantees almost match those of the best known non-private algorithms. To illustrate our techniques, we consider two problems: recovery of stochastic block models and learning mixtures of spherical Gaussians. For the former, we present the first efficient $(ε, δ)$-…

    Submitted 15 November, 2023; v1 submitted 11 January, 2023; originally announced January 2023.

  8. arXiv:2211.07327  [pdf, ps, other]

    cs.LG stat.ML

    Higher degree sum-of-squares relaxations robust against oblivious outliers

    Authors: Tommaso d'Orsi, Rajai Nasser, Gleb Novikov, David Steurer

    Abstract: We consider estimation models of the form $Y=X^*+N$, where $X^*$ is some $m$-dimensional signal we wish to recover, and $N$ is symmetrically distributed noise that may be unbounded in all but a small $α$ fraction of the entries. We introduce a family of algorithms that under mild assumptions recover the signal $X^*$ in all estimation problems for which there exists a sum-of-squares algorithm that…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: To appear in SODA 2023

  9. arXiv:2206.08092  [pdf, ps, other]

    cs.LG stat.ML

    On the well-spread property and its relation to linear regression

    Authors: Hongjie Chen, Tommaso d'Orsi

    Abstract: We consider the robust linear regression model $\boldsymbol{y} = Xβ^* + \boldsymbol{η}$, where an adversary oblivious to the design $X \in \mathbb{R}^{n \times d}$ may choose $\boldsymbol{η}$ to corrupt all but a (possibly vanishing) fraction of the observations $\boldsymbol{y}$ in an arbitrary way. Recent work [dLN+21, dNS21] has introduced efficient algorithms for consistent recovery of the paramete…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: To appear in COLT 2022

  10. arXiv:2204.10881  [pdf, ps, other]

    cs.CC cs.AI cs.LO stat.ML

    A Ihara-Bass Formula for Non-Boolean Matrices and Strong Refutations of Random CSPs

    Authors: Tommaso d'Orsi, Luca Trevisan

    Abstract: We define a novel notion of ``non-backtracking'' matrix associated to any symmetric matrix, and we prove a ``Ihara-Bass'' type formula for it. We use this theory to prove new results on polynomial-time strong refutations of random constraint satisfaction problems with $k$ variables per constraints (k-CSPs). For a random k-CSP instance constructed out of a constraint that is satisfied by a $p$ fr…

    Submitted 14 May, 2023; v1 submitted 20 April, 2022; originally announced April 2022.

    Comments: To appear in CCC 2023, 63 pages

  11. arXiv:2202.06442  [pdf, ps, other]

    cs.LG cs.DS

    Fast algorithm for overcomplete order-3 tensor decomposition

    Authors: Jingqiu Ding, Tommaso d'Orsi, Chih-Hung Liu, Stefan Tiegel, David Steurer

    Abstract: We develop the first fast spectral algorithm to decompose a random third-order tensor over $\mathbb{R}^d$ of rank up to $O(d^{3/2}/\text{polylog}(d))$. Our algorithm only involves simple linear algebra operations and can recover all components in time $O(d^{6.05})$ under the current matrix multiplication time. Prior to this work, comparable guarantees could only be achieved via sum-of-squares [M…

    Submitted 28 June, 2022; v1 submitted 13 February, 2022; originally announced February 2022.

    Comments: 59 pages, accepted by COLT 2022

  12. arXiv:2111.08568  [pdf, ps, other]

    cs.LG stat.ML

    Robust recovery for stochastic block models

    Authors: Jingqiu Ding, Tommaso d'Orsi, Rajai Nasser, David Steurer

    Abstract: We develop an efficient algorithm for weak recovery in a robust version of the stochastic block model. The algorithm matches the statistical guarantees of the best known algorithms for the vanilla version of the stochastic block model. In this sense, our results show that there is no price of robustness in the stochastic block model. Our work is heavily inspired by recent work of Banks, Mohanty, a…

    Submitted 16 November, 2021; originally announced November 2021.

    Comments: 203 pages, to appear in FOCS 2021

  13. arXiv:2111.02966  [pdf, ps, other]

    cs.LG stat.ML

    Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers

    Authors: Tommaso d'Orsi, Chih-Hung Liu, Rajai Nasser, Gleb Novikov, David Steurer, Stefan Tiegel

    Abstract: We develop machinery to design efficiently computable and consistent estimators, achieving estimation error approaching zero as the number of observations grows, when facing an oblivious adversary that may corrupt responses in all but an $α$ fraction of the samples. As concrete examples, we investigate two problems: sparse regression and principal component analysis (PCA). For sparse regression, w…

    Submitted 4 November, 2021; originally announced November 2021.

    Comments: To appear in NeurIPS 2021

  14. arXiv:2106.06308  [pdf, other]

    cs.LG cs.CC cs.DS math.ST stat.ML

    The Complexity of Sparse Tensor PCA

    Authors: Davin Choo, Tommaso d'Orsi

    Abstract: We study the problem of sparse tensor principal component analysis: given a tensor $\pmb Y = \pmb W + λx^{\otimes p}$ with $\pmb W \in \otimes^p\mathbb{R}^n$ having i.i.d. Gaussian entries, the goal is to recover the $k$-sparse unit vector $x \in \mathbb{R}^n$. The model captures both sparse PCA (in its Wigner form) and tensor PCA. For the highly sparse regime of $k \leq \sqrt{n}$, we present a…

    Submitted 2 November, 2021; v1 submitted 11 June, 2021; originally announced June 2021.

  15. arXiv:2011.06585  [pdf, ps, other]

    cs.LG cs.DS

    Sparse PCA: Algorithms, Adversarial Perturbations and Certificates

    Authors: Tommaso d'Orsi, Pravesh K. Kothari, Gleb Novikov, David Steurer

    Abstract: We study efficient algorithms for Sparse PCA in standard statistical models (spiked covariance in its Wishart form). Our goal is to achieve optimal recovery guarantees while being resilient to small perturbations. Despite a long history of prior works, including explicit studies of perturbation resilience, the best known algorithmic guarantees for Sparse PCA are fragile and break down under small…

    Submitted 12 November, 2020; originally announced November 2020.

  16. arXiv:2009.14774  [pdf, ps, other]

    cs.LG stat.ML

    Consistent regression when oblivious outliers overwhelm

    Authors: Tommaso d'Orsi, Gleb Novikov, David Steurer

    Abstract: We consider a robust linear regression model $y=Xβ^* + η$, where an adversary oblivious to the design $X\in \mathbb{R}^{n\times d}$ may choose $η$ to corrupt all but an $α$ fraction of the observations $y$ in an arbitrary way. Prior to our work, even for Gaussian $X$, no estimator for $β^*$ was known to be consistent in this model except for quadratic sample size $n \gtrsim (d/α)^2$ or for logarit…

    Submitted 25 May, 2021; v1 submitted 30 September, 2020; originally announced September 2020.

    Comments: To appear in ICML 2021