Showing 1–15 of 15 results for author: Lattanzi, S

  1. arXiv:2406.04860  [pdf, other]

    cs.LG cs.DS stat.ML

    Multi-View Stochastic Block Models

    Authors: Vincent Cohen-Addad, Tommaso d'Orsi, Silvio Lattanzi, Rajai Nasser

    Abstract: Graph clustering is a central topic in unsupervised learning with a multitude of practical applications. In recent years, multi-view graph clustering has gained a lot of attention for its applicability to real-world instances where one has access to multiple data sources. In this paper we formalize a new family of models, called multi-view stochastic block models that captures this settin…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: 31 pages, ICML 2024

    ACM Class: F.2; G.3

  2. arXiv:2405.19977  [pdf, other]

    cs.DS cs.LG stat.ML

    Consistent Submodular Maximization

    Authors: Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam

    Abstract: Maximizing monotone submodular functions under cardinality constraints is a classic optimization task with several applications in data mining and machine learning. In this paper we study this problem in a dynamic environment with consistency constraints: elements arrive in a streaming fashion and the goal is maintaining a constant approximation to the optimal solution while having a stable soluti…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: To appear at ICML 24

  3. arXiv:2311.17840  [pdf, other]

    cs.DS cs.LG stat.ML

    A quasi-polynomial time algorithm for Multi-Dimensional Scaling via LP hierarchies

    Authors: Ainesh Bakshi, Vincent Cohen-Addad, Samuel B. Hopkins, Rajesh Jayaram, Silvio Lattanzi

    Abstract: Multi-dimensional Scaling (MDS) is a family of methods for embedding an $n$-point metric into low-dimensional Euclidean space. We study the Kamada-Kawai formulation of MDS: given a set of non-negative dissimilarities $\{d_{i,j}\}_{i , j \in [n]}$ over $n$ points, the goal is to find an embedding $\{x_1,\dots,x_n\} \in \mathbb{R}^k$ that minimizes \[\text{OPT} = \min_{x} \mathbb{E}_{i,j \in [n]} \l…

    Submitted 11 April, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: Extended exposition

  4. arXiv:2305.19918  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Fully Dynamic Submodular Maximization over Matroids

    Authors: Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam

    Abstract: Maximizing monotone submodular functions under a matroid constraint is a classic algorithmic problem with multiple applications in data mining and machine learning. We study this classic problem in the fully dynamic setting, where elements can be both inserted and deleted in real-time. Our main result is a randomized algorithm that maintains an efficient data structure with an $\tilde{O}(k^2)$ amo…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted at ICML 2023

  5. arXiv:2208.07582  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    Deletion Robust Non-Monotone Submodular Maximization over Matroids

    Authors: Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam

    Abstract: Maximizing a submodular function is a fundamental task in machine learning and in this paper we study the deletion robust version of the problem under the classic matroids constraint. Here the goal is to extract a small size summary of the dataset that contains a high value independent set even after an adversary deleted some elements. We present constant-factor approximation algorithms, whose spa…

    Submitted 16 August, 2022; originally announced August 2022.

    Comments: Preliminary versions of this work appeared as arXiv:2201.13128 and in ICML'22. The main difference with respect to these versions consists in extending our results to non-monotone submodular functions

  6. arXiv:2207.03522  [pdf, other]

    cs.LG cs.NE cs.SI physics.soc-ph stat.ML

    TF-GNN: Graph Neural Networks in TensorFlow

    Authors: Oleksandr Ferludin, Arno Eigenwillig, Martin Blais, Dustin Zelle, Jan Pfeifer, Alvaro Sanchez-Gonzalez, Wai Lok Sibon Li, Sami Abu-El-Haija, Peter Battaglia, Neslihan Bulut, Jonathan Halcrow, Filipe Miguel Gonçalves de Almeida, Pedro Gonnet, Liangze Jiang, Parth Kothari, Silvio Lattanzi, André Linhares, Brandon Mayer, Vahab Mirrokni, John Palowitch, Mihir Paradkar, Jennifer She, Anton Tsitsulin, Kevin Villela, Lisa Wang, et al. (2 additional authors not shown)

    Abstract: TensorFlow-GNN (TF-GNN) is a scalable library for Graph Neural Networks in TensorFlow. It is designed from the bottom up to support the kinds of rich heterogeneous graph data that occurs in today's information ecosystems. In addition to enabling machine learning researchers and advanced developers, TF-GNN offers low-code solutions to empower the broader developer community in graph learning. Many…

    Submitted 23 July, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

  7. arXiv:2201.13128  [pdf, other]

    cs.DS cs.LG stat.ML

    Deletion Robust Submodular Maximization over Matroids

    Authors: Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Morteza Zadimoghaddam

    Abstract: Maximizing a monotone submodular function is a fundamental task in machine learning. In this paper, we study the deletion robust version of the problem under the classic matroids constraint. Here the goal is to extract a small size summary of the dataset that contains a high value independent set even after an adversary deleted some elements. We present constant-factor approximation algorithms, wh…

    Submitted 31 January, 2022; originally announced January 2022.

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:5671-5693, 2022

  8. arXiv:2106.04913  [pdf, ps, other]

    cs.LG stat.ML

    On Margin-Based Cluster Recovery with Oracle Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

    Abstract: We study an active cluster recovery problem where, given a set of $n$ points and an oracle answering queries like "are these two points in the same cluster?", the task is to recover exactly all clusters using as few queries as possible. We begin by introducing a simple but general notion of margin between clusters that captures, as special cases, the margins used in previous work, the classic SVM…

    Submitted 9 June, 2021; originally announced June 2021.

  9. arXiv:2102.00504  [pdf, other]

    cs.LG stat.ML

    Exact Recovery of Clusters in Finite Metric Spaces Using Oracle Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

    Abstract: We investigate the problem of exact cluster recovery using oracle queries. Previous results show that clusters in Euclidean spaces that are convex and separated with a margin can be reconstructed exactly using only $O(\log n)$ same-cluster queries, where $n$ is the number of input points. In this work, we study this problem in the more challenging non-convex setting. We introduce a structural char…

    Submitted 13 July, 2021; v1 submitted 31 January, 2021; originally announced February 2021.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2021

  10. arXiv:2010.06992  [pdf, other]

    cs.LG cs.AI cs.SI stat.ML

    InstantEmbedding: Efficient Local Node Representations

    Authors: Ştefan Postăvaru, Anton Tsitsulin, Filipe Miguel Gonçalves de Almeida, Yingtao Tian, Silvio Lattanzi, Bryan Perozzi

    Abstract: In this paper, we introduce InstantEmbedding, an efficient method for generating single-node representations using local PageRank computations. We theoretically prove that our approach produces globally consistent representations in sublinear time. We demonstrate this empirically by conducting extensive experiments on real-world datasets with over a billion edges. Our experiments confirm that Inst…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: 23 pages, 9 figures

  11. arXiv:2006.04675  [pdf, other]

    cs.LG stat.ML

    Exact Recovery of Mangled Clusters with Same-Cluster Queries

    Authors: Marco Bressan, Nicolò Cesa-Bianchi, Silvio Lattanzi, Andrea Paudice

    Abstract: We study the cluster recovery problem in the semi-supervised active clustering framework. Given a finite set of input points, and an oracle revealing whether any two points lie in the same cluster, our goal is to recover all clusters exactly using as few queries as possible. To this end, we relax the spherical $k$-means cluster assumption of Ashtiani et al. to allow for arbitrary ellipsoidal clus…

    Submitted 30 October, 2020; v1 submitted 8 June, 2020; originally announced June 2020.

    Comments: To appear at NeurIPS 2020 (oral)

  12. arXiv:1905.00948  [pdf, other]

    cs.LG cs.DS stat.ML

    Submodular Streaming in All its Glory: Tight Approximation, Minimum Memory and Low Adaptive Complexity

    Authors: Ehsan Kazemi, Marko Mitrovic, Morteza Zadimoghaddam, Silvio Lattanzi, Amin Karbasi

    Abstract: Streaming algorithms are generally judged by the quality of their solution, memory footprint, and computational complexity. In this paper, we study the problem of maximizing a monotone submodular function in the streaming setting with a cardinality constraint $k$. We first propose Sieve-Streaming++, which requires just one pass over the data, keeps only $O(k)$ elements and achieves the tight…

    Submitted 13 May, 2019; v1 submitted 2 May, 2019; originally announced May 2019.

    Comments: Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019

  13. arXiv:1802.05733  [pdf, other]

    cs.LG stat.ML

    Fair Clustering Through Fairlets

    Authors: Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Sergei Vassilvitskii

    Abstract: We study the question of fair clustering under the disparate impact doctrine, where each protected class must have approximately equal representation in every cluster. We formulate the fair clustering problem under both the $k$-center and the $k$-median objectives, and show that even with two protected classes the problem is challenging, as the optimum solution can violate common conventions…

    Submitted 15 February, 2018; originally announced February 2018.

    Journal ref: NIPS 2017: 5036-5044

  14. arXiv:1711.09649  [pdf, other]

    stat.ML cs.LG

    One-Shot Coresets: The Case of k-Clustering

    Authors: Olivier Bachem, Mario Lucic, Silvio Lattanzi

    Abstract: Scaling clustering algorithms to massive data sets is a challenging task. Recently, several successful approaches based on data summarization methods, such as coresets and sketches, were proposed. While these techniques provide provably good and small summaries, they are inherently problem dependent - the practitioner has to commit to a fixed clustering objective before even exploring the data. Ho…

    Submitted 20 February, 2018; v1 submitted 27 November, 2017; originally announced November 2017.

    Comments: To Appear In AISTATS 2018

  15. arXiv:1304.8132  [pdf, other]

    cs.DS cs.LG stat.ML

    Local Graph Clustering Beyond Cheeger's Inequality

    Authors: Zeyuan Allen Zhu, Silvio Lattanzi, Vahab Mirrokni

    Abstract: Motivated by applications of large-scale graph clustering, we study random-walk-based LOCAL algorithms whose running times depend only on the size of the output cluster, rather than the entire graph. All previously known such algorithms guarantee an output conductance of $\tilde{O}(\sqrt{φ(A)})$ when the target set $A$ has conductance $φ(A)\in[0,1]$. In this paper, we improve it to…

    Submitted 7 November, 2013; v1 submitted 30 April, 2013; originally announced April 2013.

    Comments: An extended abstract of this paper has appeared in the proceedings of the 30th International Conference on Machine Learning (ICML 2013)