Showing 1–40 of 40 results for author: Vassilvitskii, S

  1. arXiv:2405.18568  [pdf, other]

    cs.DS cs.LG

    Warm-starting Push-Relabel

    Authors: Sami Davies, Sergei Vassilvitskii, Yuyan Wang

    Abstract: Push-Relabel is one of the most celebrated network flow algorithms. Maintaining a pre-flow that saturates a cut, it enjoys better theoretical and empirical running time than other flow algorithms, such as Ford-Fulkerson. In practice, Push-Relabel is even faster than what theoretical guarantees can promise, in part because of the use of good heuristics for seeding and updating the iterative algorit…

    Submitted 28 May, 2024; originally announced May 2024.

  2. arXiv:2403.04856  [pdf, other]

    cs.GT

    Winner-Pays-Bid Auctions Minimize Variance

    Authors: Preston McAfee, Renato Paes Leme, Balasubramanian Sivan, Sergei Vassilvitskii

    Abstract: Any social choice function (e.g., the efficient allocation) can be implemented using different payment rules: first price, second price, all-pay, etc. All of these payment rules are guaranteed to have the same expected revenue by the revenue equivalence theorem, but have different distributions of revenue, leading to a question of which one is best. We prove that among all possible payment rules, wi…

    Submitted 27 June, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  3. arXiv:2402.04177  [pdf, other]

    cs.CL cs.LG stat.ML

    Scaling Laws for Downstream Task Performance of Large Language Models

    Authors: Berivan Isik, Natalia Ponomareva, Hussein Hazimeh, Dimitris Paparas, Sergei Vassilvitskii, Sanmi Koyejo

    Abstract: Scaling laws provide important insights that can guide the design of large language models (LLMs). Existing work has primarily focused on studying scaling laws for pretraining (upstream) loss. However, in transfer learning settings, in which LLMs are pretrained on an unsupervised dataset and then finetuned on a downstream task, we often also care about the downstream performance. In this work, we…

    Submitted 6 February, 2024; originally announced February 2024.

  4. arXiv:2308.10316  [pdf, ps, other]

    cs.DS

    Almost Tight Bounds for Differentially Private Densest Subgraph

    Authors: Michael Dinitz, Satyen Kale, Silvio Lattanzi, Sergei Vassilvitskii

    Abstract: We study the Densest Subgraph (DSG) problem under the additional constraint of differential privacy. DSG is a fundamental theoretical question which plays a central role in graph analytics, and so privacy is a natural requirement. All known private algorithms for Densest Subgraph lose constant multiplicative factors, despite the existence of non-private exact algorithms. We show that, perhaps surp…

    Submitted 7 April, 2024; v1 submitted 20 August, 2023; originally announced August 2023.

    Comments: Revised presentation, added value bound

  5. arXiv:2308.05067  [pdf, other]

    cs.DS

    Controlling Tail Risk in Online Ski-Rental

    Authors: Michael Dinitz, Sungjin Im, Thomas Lavastida, Benjamin Moseley, Sergei Vassilvitskii

    Abstract: The classical ski-rental problem admits a textbook 2-competitive deterministic algorithm, and a simple randomized algorithm that is $\frac{e}{e-1}$-competitive in expectation. The randomized algorithm, while optimal in expectation, has a large variance in its performance: it has more than a 37% chance of competitive ratio exceeding 2, and a $Θ(1/n)$ chance of the competitive ratio exceeding $n$!…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: 28 pages, 2 figures

  6. arXiv:2304.07210  [pdf, other]

    cs.CR cs.LG

    Measuring Re-identification Risk

    Authors: CJ Carey, Travis Dick, Alessandro Epasto, Adel Javanmard, Josh Karlin, Shankar Kumar, Andres Munoz Medina, Vahab Mirrokni, Gabriel Henrique Nunes, Sergei Vassilvitskii, Peilin Zhong

    Abstract: Compact user representations (such as embeddings) form the backbone of personalization services. In this work, we present a new theoretical framework to measure re-identification risk in such user representations. Our framework, based on hypothesis testing, formally bounds the probability that an attacker may be able to obtain the identity of a user from their representation. As an application, we…

    Submitted 31 July, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

  7. arXiv:2304.06929  [pdf]

    cs.CR

    Advancing Differential Privacy: Where We Are Now and Future Directions for Real-World Deployment

    Authors: Rachel Cummings, Damien Desfontaines, David Evans, Roxana Geambasu, Yangsibo Huang, Matthew Jagielski, Peter Kairouz, Gautam Kamath, Sewoong Oh, Olga Ohrimenko, Nicolas Papernot, Ryan Rogers, Milan Shen, Shuang Song, Weijie Su, Andreas Terzis, Abhradeep Thakurta, Sergei Vassilvitskii, Yu-Xiang Wang, Li Xiong, Sergey Yekhanin, Da Yu, Huanyu Zhang, Wanrong Zhang

    Abstract: In this article, we present a detailed review of current practices and state-of-the-art methodologies in the field of differential privacy (DP), with a focus on advancing DP's deployment in real-world applications. Key points and high-level contents of the article originated from the discussions at "Differential Privacy (DP): Challenges Towards the Next Frontier," a workshop held in July 20…

    Submitted 12 March, 2024; v1 submitted 14 April, 2023; originally announced April 2023.

  8. arXiv:2303.00837  [pdf, other]

    cs.DS

    Predictive Flows for Faster Ford-Fulkerson

    Authors: Sami Davies, Benjamin Moseley, Sergei Vassilvitskii, Yuyan Wang

    Abstract: Recent work has shown that leveraging learned predictions can improve the running time of algorithms for bipartite matching and similar combinatorial problems. In this work, we build on this idea to improve the performance of the widely used Ford-Fulkerson algorithm for computing maximum flows by seeding Ford-Fulkerson with predicted flows. Our proposed method offers strong theoretical performance…

    Submitted 1 March, 2023; originally announced March 2023.

  9. arXiv:2303.00654  [pdf, other]

    cs.LG cs.CR stat.ML

    How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy

    Authors: Natalia Ponomareva, Hussein Hazimeh, Alex Kurakin, Zheng Xu, Carson Denison, H. Brendan McMahan, Sergei Vassilvitskii, Steve Chien, Abhradeep Thakurta

    Abstract: ML models are ubiquitous in real world applications and are a constant focus of research. At the same time, the community has started to realize the importance of protecting the privacy of ML training data. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP t…

    Submitted 31 July, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Journal ref: Journal of Artificial Intelligence Research 77 (2023) 1113-1201

  10. arXiv:2301.05605  [pdf, ps, other]

    cs.DS

    Differentially Private Continual Releases of Streaming Frequency Moment Estimations

    Authors: Alessandro Epasto, Jieming Mao, Andres Munoz Medina, Vahab Mirrokni, Sergei Vassilvitskii, Peilin Zhong

    Abstract: The streaming model of computation is a popular approach for working with large-scale data. In this setting, there is a stream of items and the goal is to compute the desired quantities (usually data statistics) while making a single pass through the stream and using as little space as possible. Motivated by the importance of data privacy, we develop differentially private streaming algorithms u…

    Submitted 13 January, 2023; originally announced January 2023.

  11. arXiv:2210.12438  [pdf, ps, other]

    cs.LG cs.DS

    Algorithms with Prediction Portfolios

    Authors: Michael Dinitz, Sungjin Im, Thomas Lavastida, Benjamin Moseley, Sergei Vassilvitskii

    Abstract: The research area of algorithms with predictions has seen recent success showing how to incorporate machine learning into algorithm design to improve performance when the predictions are correct, while retaining worst-case guarantees when they are not. Most previous work has assumed that the algorithm has access to a single predictor. However, in practice, there are many machine learning methods a…

    Submitted 2 December, 2022; v1 submitted 22 October, 2022; originally announced October 2022.

    Comments: 24 pages. Appears at NeurIPS 2022

  12. arXiv:2210.11222  [pdf, other]

    cs.CR cs.AI cs.DS cs.LG stat.ML

    Learning-Augmented Private Algorithms for Multiple Quantile Release

    Authors: Mikhail Khodak, Kareem Amin, Travis Dick, Sergei Vassilvitskii

    Abstract: When applying differential privacy to sensitive data, we can often improve performance using external information such as other sensitive data, public data, or human priors. We propose to use the learning-augmented algorithms (or algorithms with predictions) framework -- previously applied largely to improve time complexity or competitive ratios -- as a powerful way of designing and analyzing priv…

    Submitted 8 May, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: To appear in ICML 2023

  13. arXiv:2208.07353  [pdf, other]

    cs.LG cs.CR

    Easy Differentially Private Linear Regression

    Authors: Kareem Amin, Matthew Joseph, Mónica Ribero, Sergei Vassilvitskii

    Abstract: Linear regression is a fundamental tool for statistical analysis. This has motivated the development of linear regression methods that also satisfy differential privacy and thus guarantee that the learned model reveals little about any one data point used to construct it. However, existing differentially private solutions assume that the end user can easily specify good data bounds and hyperparame…

    Submitted 16 March, 2023; v1 submitted 15 August, 2022; originally announced August 2022.

    Comments: This version corresponds to the camera-ready at ICLR 2023

  14. arXiv:2206.08646  [pdf, other]

    cs.DS cs.CR cs.LG

    Scalable Differentially Private Clustering via Hierarchically Separated Trees

    Authors: Vincent Cohen-Addad, Alessandro Epasto, Silvio Lattanzi, Vahab Mirrokni, Andres Munoz, David Saulpic, Chris Schwiegelshohn, Sergei Vassilvitskii

    Abstract: We study the private $k$-median and $k$-means clustering problem in $d$-dimensional Euclidean space. By leveraging tree embeddings, we give an efficient and easy-to-implement algorithm that is empirically competitive with state-of-the-art non-private methods. We prove that our method computes a solution with cost at most $O(d^{3/2}\log n)\cdot OPT + O(k d^2 \log^2 n / ε^2)$, where $ε$ is the priv…

    Submitted 17 June, 2022; originally announced June 2022.

    Comments: To appear at KDD'22

  15. arXiv:2202.09312  [pdf, other]

    cs.LG cs.AI cs.DS stat.ML

    Learning Predictions for Algorithms with Predictions

    Authors: Mikhail Khodak, Maria-Florina Balcan, Ameet Talwalkar, Sergei Vassilvitskii

    Abstract: A burgeoning paradigm in algorithm design is the field of algorithms with predictions, in which algorithms can take advantage of a possibly-imperfect prediction of some aspect of the problem. While much work has focused on using predictions to improve competitive ratios, running times, or other performance measures, less effort has been devoted to the question of how to obtain the predictions them…

    Submitted 17 October, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: NeurIPS 2022 camera-ready

  16. arXiv:2201.11603  [pdf, other]

    cs.CR cs.DC

    Plume: Differential Privacy at Scale

    Authors: Kareem Amin, Jennifer Gillenwater, Matthew Joseph, Alex Kulesza, Sergei Vassilvitskii

    Abstract: Differential privacy has become the standard for private data analysis, and an extensive literature now offers differentially private solutions to a wide variety of problems. However, translating these solutions into practical systems often requires confronting details that the literature ignores or abstracts away: users may contribute multiple records, the domain of possible records may be unknow…

    Submitted 27 January, 2022; originally announced January 2022.

  17. arXiv:2110.02159  [pdf, other]

    cs.LG cs.CR cs.DS cs.IT

    Label differential privacy via clustering

    Authors: Hossein Esfandiari, Vahab Mirrokni, Umar Syed, Sergei Vassilvitskii

    Abstract: We present new mechanisms for \emph{label differential privacy}, a relaxation of differentially private machine learning that only protects the privacy of the labels in the training set. Our mechanisms cluster the examples in the training set using their (non-private) feature vectors, randomly re-sample each label from examples in the same cluster, and output a training set with noisy labels as we…

    Submitted 5 October, 2021; originally announced October 2021.

  18. arXiv:2107.09770  [pdf, other]

    cs.LG cs.DS

    Faster Matchings via Learned Duals

    Authors: Michael Dinitz, Sungjin Im, Thomas Lavastida, Benjamin Moseley, Sergei Vassilvitskii

    Abstract: A recent line of research investigates how algorithms can be augmented with machine-learned predictions to overcome worst case lower bounds. This area has revealed interesting algorithmic insights into problems, with particular success in the design of competitive online algorithms. However, the question of improving algorithm running times with predictions has largely been unexplored. We take a…

    Submitted 20 July, 2021; originally announced July 2021.

    Comments: 27 pages, 7 figures

  19. arXiv:2011.06726  [pdf, ps, other]

    cs.DS cs.DM

    Secretaries with Advice

    Authors: Paul Dütting, Silvio Lattanzi, Renato Paes Leme, Sergei Vassilvitskii

    Abstract: The secretary problem is probably the purest model of decision making under uncertainty. In this paper we ask: which advice can we give the algorithm to improve its success probability? We propose a general model that unifies a broad range of problems: from the classic secretary problem with no advice, to the variant where the quality of a secretary is drawn from a known distribution and the algo…

    Submitted 12 November, 2020; originally announced November 2020.

  20. arXiv:2007.01181  [pdf, other]

    cs.LG cs.CR stat.ML

    Private Optimization Without Constraint Violations

    Authors: Andrés Muñoz Medina, Umar Syed, Sergei Vassilvitskii, Ellen Vitercik

    Abstract: We study the problem of differentially private optimization with linear constraints when the right-hand-side of the constraints depends on private data. This type of problem appears in many applications, especially resource allocation. Previous research provided solutions that retained privacy but sometimes violated the constraints. In many settings, however, the constraints cannot be violated und…

    Submitted 3 November, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

  21. arXiv:2006.10221  [pdf, other]

    cs.DS cs.LG stat.ML

    Fair Hierarchical Clustering

    Authors: Sara Ahmadian, Alessandro Epasto, Marina Knittel, Ravi Kumar, Mohammad Mahdian, Benjamin Moseley, Philip Pham, Sergei Vassilvitskii, Yuyan Wang

    Abstract: As machine learning has become more prevalent, researchers have begun to recognize the necessity of ensuring machine learning systems are fair. Recently, there has been an interest in defining a notion of fairness that mitigates over-representation in traditional clustering. In this paper we extend this notion to hierarchical clustering, where the goal is to recursively partition the data to opt…

    Submitted 18 June, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

  22. arXiv:2006.09123  [pdf, other]

    cs.DS

    Algorithms with Predictions

    Authors: Michael Mitzenmacher, Sergei Vassilvitskii

    Abstract: We introduce algorithms that use predictions from machine learning applied to the input to circumvent worst-case analysis. We aim for algorithms that have near optimal performance when these predictions are good, but recover the prediction-less worst case behavior when the predictions have large errors.

    Submitted 16 June, 2020; originally announced June 2020.

    Comments: survey is to appear as a chapter in Beyond the Worst-Case Analysis of Algorithms, a collection edited by Tim Roughgarden. We hope to occasionally update the survey here, with new versions that include discussions of new results and advances in the area of Algorithms with Predictions

  23. arXiv:2006.05850  [pdf, other]

    cs.DS

    Sliding Window Algorithms for k-Clustering Problems

    Authors: Michele Borassi, Alessandro Epasto, Silvio Lattanzi, Sergei Vassilvitskii, Morteza Zadimoghaddam

    Abstract: The sliding window model of computation captures scenarios in which data is arriving continuously, but only the latest $w$ elements should be used for analysis. The goal is to design algorithms that update the solution efficiently with each arrival rather than recomputing it from scratch. In this work, we focus on $k$-clustering problems such as $k$-means and $k$-median. In this setting, we provid…

    Submitted 23 October, 2020; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: 43 pages, 7 figures

    Journal ref: In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS 2020)

  24. arXiv:1802.05733  [pdf, other]

    cs.LG stat.ML

    Fair Clustering Through Fairlets

    Authors: Flavio Chierichetti, Ravi Kumar, Silvio Lattanzi, Sergei Vassilvitskii

    Abstract: We study the question of fair clustering under the {\em disparate impact} doctrine, where each protected class must have approximately equal representation in every cluster. We formulate the fair clustering problem under both the $k$-center and the $k$-median objectives, and show that even with two protected classes the problem is challenging, as the optimum solution can violate common conventions…

    Submitted 15 February, 2018; originally announced February 2018.

    Journal ref: NIPS 2017: 5036-5044

  25. arXiv:1802.05399  [pdf, other]

    cs.DS cs.LG

    Competitive caching with machine learned advice

    Authors: Thodoris Lykouris, Sergei Vassilvitskii

    Abstract: Traditional online algorithms encapsulate decision making under uncertainty, and give ways to hedge against all possible future events, while guaranteeing a nearly optimal solution as compared to an offline optimum. On the other hand, machine learning algorithms are in the business of extrapolating patterns found in the data to predict the future, and usually come with strong guarantees on the exp…

    Submitted 21 August, 2020; v1 submitted 14 February, 2018; originally announced February 2018.

    Comments: Preliminary versions appeared in ICML 18 and SysML 18. The current version improves the presentation of the suggested framework (Section 2.2), provides a more clear discussion on how it can be more broadly applied, and fixes some more minor presentation issues in other sections

  26. arXiv:1802.05315  [pdf, other]

    cs.LG stat.ML

    Online Learning for Non-Stationary A/B Tests

    Authors: Andrés Muñoz Medina, Sergei Vassilvitskii, Dong Yin

    Abstract: The rollout of new versions of a feature in modern applications is a manual multi-stage process, as the feature is released to ever larger groups of users, while its performance is carefully monitored. This kind of A/B testing is ubiquitous, but suboptimal, as the monitoring requires heavy human intervention, is not guaranteed to capture consistent, but short-term fluctuations in performance, and…

    Submitted 27 May, 2018; v1 submitted 14 February, 2018; originally announced February 2018.

  27. arXiv:1706.04732  [pdf, other]

    cs.LG cs.GT

    Revenue Optimization with Approximate Bid Predictions

    Authors: Andrés Muñoz Medina, Sergei Vassilvitskii

    Abstract: In the context of advertising auctions, finding good reserve prices is a notoriously challenging learning problem. This is due to the heterogeneity of ad opportunity types and the non-convexity of the objective function. In this work, we show how to reduce reserve price optimization to the standard setting of prediction under squared loss, a well understood problem in the learning community. We fu…

    Submitted 6 November, 2017; v1 submitted 15 June, 2017; originally announced June 2017.

    Comments: Accepted to NIPS 2017

  28. arXiv:1703.03111  [pdf, other]

    cs.GT cs.LG

    Statistical Cost Sharing

    Authors: Eric Balkanski, Umar Syed, Sergei Vassilvitskii

    Abstract: We study the cost sharing problem for cooperative games in situations where the cost function $C$ is not available via oracle queries, but must instead be derived from data, represented as tuples $(S, C(S))$, for different subsets $S$ of players. We formalize this approach, which we call statistical cost sharing, and consider the computation of the core and the Shapley value, when the tuples are d…

    Submitted 8 March, 2017; originally announced March 2017.

  29. arXiv:1610.09984  [pdf, other]

    cs.DS

    Submodular Optimization over Sliding Windows

    Authors: Alessandro Epasto, Silvio Lattanzi, Sergei Vassilvitskii, Morteza Zadimoghaddam

    Abstract: Maximizing submodular functions under cardinality constraints lies at the core of numerous data mining and machine learning applications, including data diversification, data summarization, and coverage problems. In this work, we study this question in the context of data streams, where elements arrive one at a time, and we want to design low-memory and fast update-time algorithms that maintain a…

    Submitted 31 October, 2016; originally announced October 2016.

    ACM Class: G.1.6; G.2.1; H.2.8

  30. arXiv:1602.07720  [pdf, other]

    cs.GT

    A Field Guide to Personalized Reserve Prices

    Authors: Renato Paes Leme, Martin Pal, Sergei Vassilvitskii

    Abstract: We study the question of setting and testing reserve prices in single item auctions when the bidders are not identical. At a high level, there are two generalizations of the standard second price auction: in the lazy version we first determine the winner, and then apply reserve prices; in the eager version we first discard the bidders not meeting their reserves, and then determine the winner among…

    Submitted 24 February, 2016; originally announced February 2016.

    Comments: Accepted to WWW'16

  31. arXiv:1503.05225  [pdf, ps, other]

    cs.DS cs.CG cs.IT

    Sketching, Embedding, and Dimensionality Reduction for Information Spaces

    Authors: Amirali Abdullah, Ravi Kumar, Andrew McGregor, Sergei Vassilvitskii, Suresh Venkatasubramanian

    Abstract: Information distances like the Hellinger distance and the Jensen-Shannon divergence have deep roots in information theory and machine learning. They are used extensively in data analysis especially when the objects being compared are high dimensional empirical probability distributions built from data. However, we lack common tools needed to actually use information distances in applications effic…

    Submitted 17 March, 2015; originally announced March 2015.

  32. arXiv:1407.3338  [pdf, ps, other]

    cs.GT

    Value of Targeting

    Authors: Kshipra Bhawalkar, Patrick Hummel, Sergei Vassilvitskii

    Abstract: We undertake a formal study of the value of targeting data to an advertiser. As expected, this value is increasing in the utility difference between realizations of the targeting data and the accuracy of the data, and depends on the distribution of competing bids. However, this value may vary non-monotonically with an advertiser's budget. Similarly, modeling the values as either private or correla…

    Submitted 11 July, 2014; originally announced July 2014.

  33. arXiv:1203.6402  [pdf, other]

    cs.DB

    Scalable K-Means++

    Authors: Bahman Bahmani, Benjamin Moseley, Andrea Vattani, Ravi Kumar, Sergei Vassilvitskii

    Abstract: Over half a century old and showing no signs of aging, k-means remains one of the most popular data processing algorithms. As is well-known, a proper initialization of k-means is crucial for obtaining a good final solution. The recently proposed k-means++ initialization algorithm achieves this, obtaining an initial set of centers that is provably close to the optimum solution. A major downside of…

    Submitted 28 March, 2012; originally announced March 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 7, pp. 622-633 (2012)

  34. arXiv:1203.3619  [pdf, other]

    cs.DS

    SHALE: An Efficient Algorithm for Allocation of Guaranteed Display Advertising

    Authors: Vijay Bharadwaj, Peiji Chen, Wenjing Ma, Chandrashekhar Nagarajan, John Tomlin, Sergei Vassilvitskii, Erik Vee, Jian Yang

    Abstract: Motivated by the problem of optimizing allocation in guaranteed display advertising, we develop an efficient, lightweight method of generating a compact {\em allocation plan} that can be used to guide ad server decisions. The plan itself uses just O(1) state per guaranteed contract, is robust to noise, and allows us to serve (provably) nearly optimally. The optimization method we develop is scalab…

    Submitted 16 March, 2012; originally announced March 2012.

  35. arXiv:1203.3593  [pdf, other]

    cs.DS

    Ad Serving Using a Compact Allocation Plan

    Authors: Peiji Chen, Wenjing Ma, Srinath Mandalapu, Chandrashekhar Nagarajan, Jayavel Shanmugasundaram, Sergei Vassilvitskii, Erik Vee, Manfai Yu, Jason Zien

    Abstract: A large fraction of online display advertising is sold via guaranteed contracts: a publisher guarantees to the advertiser a certain number of user visits satisfying the targeting predicates of the contract. The publisher is then tasked with solving the ad serving problem - given a user visit, which of the thousands of matching contracts should be displayed, so that by the expiration time every con…

    Submitted 15 March, 2012; originally announced March 2012.

  36. arXiv:1201.6567  [pdf, other]

    cs.DB

    Densest Subgraph in Streaming and MapReduce

    Authors: Bahman Bahmani, Ravi Kumar, Sergei Vassilvitskii

    Abstract: The problem of finding locally dense components of a graph is an important primitive in data analysis, with wide-ranging applications from community mining to spam detection and the discovery of biological network modules. In this paper we present new algorithms for finding the densest subgraph in the streaming model. For any epsilon>0, our algorithms make O((log n)/log (1+epsilon)) passes over th…

    Submitted 31 January, 2012; originally announced January 2012.

    Comments: VLDB2012

    Journal ref: Proceedings of the VLDB Endowment (PVLDB), Vol. 5, No. 5, pp. 454-465 (2012)

  37. arXiv:1108.1956  [pdf, ps, other]

    cs.IR

    Factorization-based Lossless Compression of Inverted Indices

    Authors: George Beskales, Marcus Fontoura, Maxim Gurevich, Sergei Vassilvitskii, Vanja Josifovski

    Abstract: Many large-scale Web applications that require ranked top-k retrieval such as Web search and online advertising are implemented using inverted indices. An inverted index represents a sparse term-document matrix, where non-zero elements indicate the strength of term-document association. In this work, we present an approach for lossless compression of inverted indices. Our approach maps terms in a…

    Submitted 9 August, 2011; originally announced August 2011.

    Comments: To Appear as a short paper in CIKM'11

    ACM Class: H.3.1

  38. arXiv:1008.3551  [pdf, other]

    cs.CE

    Inventory Allocation for Online Graphical Display Advertising

    Authors: Jian Yang, Erik Vee, Sergei Vassilvitskii, John Tomlin, Jayavel Shanmugasundaram, Tasos Anastasakos, Oliver Kennedy

    Abstract: We discuss a multi-objective/goal programming model for the allocation of inventory of graphical advertisements. The model considers two types of campaigns: guaranteed delivery (GD), which are sold months in advance, and non-guaranteed delivery (NGD), which are sold using real-time auctions. We investigate various advertiser and publisher objectives such as (a) revenue from the sale of impressions…

    Submitted 20 August, 2010; originally announced August 2010.

    Report number: YL-2010-004

  39. arXiv:0910.0916  [pdf, ps, other]

    cs.GT

    Social Networks and Stable Matchings in the Job Market

    Authors: Esteban Arcaute, Sergei Vassilvitskii

    Abstract: For most people, social contacts play an integral part in finding a new job. As observed by Granovetter's seminal study, the proportion of jobs obtained through social contacts is usually large compared to those obtained through postings or agencies. At the same time, job markets are a natural example of two-sided matching markets. An important solution concept in such markets is that of stable…

    Submitted 5 October, 2009; originally announced October 2009.

    Comments: 19 pages. A preliminary version will appear at the 5th International Workshop on Internet and Network Economics, WINE 2009

  40. arXiv:0910.0880  [pdf, other]

    cs.MA cs.GT

    Bidding for Representative Allocations for Display Advertising

    Authors: Arpita Ghosh, Preston McAfee, Kishore Papineni, Sergei Vassilvitskii

    Abstract: Display advertising has traditionally been sold via guaranteed contracts -- a guaranteed contract is a deal between a publisher and an advertiser to allocate a certain number of impressions over a certain period, for a pre-specified price per impression. However, as spot markets for display ads, such as the RightMedia Exchange, have grown in prominence, the selection of advertisements to show on…

    Submitted 5 October, 2009; originally announced October 2009.