-
Gradient Descent Converges Linearly for Logistic Regression on Separable Data
Authors:
Kyriakos Axiotis,
Maxim Sviridenko
Abstract:
We show that running gradient descent with variable learning rate guarantees loss $f(x) \leq 1.1 \cdot f(x^*) + ε$ for the logistic regression objective, where the error $ε$ decays exponentially with the number of iterations and polynomially with the magnitude of the entries of an arbitrary fixed solution $x^*$. This is in contrast to the common intuition that the absence of strong convexity precludes linear convergence of first-order methods, and highlights the importance of variable learning rates for gradient descent. We also apply our ideas to sparse logistic regression, where they lead to an exponential improvement of the sparsity-error tradeoff.
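As a rough illustration of gradient descent with a variable learning rate on separable data, the sketch below runs plain gradient descent on the mean logistic loss. The abstract does not state the paper's step-size schedule, so the rule used here (step size equal to the inverse of the current loss) is an assumption, as are the function names and the convention that each data row already has its label folded in.

```python
import math

def logistic_loss(x, rows):
    """Mean logistic loss; each row is y_i * z_i (label folded into the features)."""
    total = 0.0
    for a in rows:
        margin = sum(ai * xi for ai, xi in zip(a, x))
        # Numerically stable evaluation of log(1 + exp(-margin)).
        if margin >= 0:
            total += math.log1p(math.exp(-margin))
        else:
            total += -margin + math.log1p(math.exp(margin))
    return total / len(rows)

def logistic_grad(x, rows):
    """Gradient of the mean logistic loss."""
    grad = [0.0] * len(x)
    for a in rows:
        margin = sum(ai * xi for ai, xi in zip(a, x))
        # weight = sigmoid(-margin), computed without overflow.
        if margin >= 0:
            e = math.exp(-margin)
            weight = e / (1.0 + e)
        else:
            weight = 1.0 / (1.0 + math.exp(margin))
        for j, aj in enumerate(a):
            grad[j] -= weight * aj / len(rows)
    return grad

def gd_variable_step(rows, dim, iters):
    """Gradient descent with the ASSUMED schedule eta_t = 1 / f(x_t)."""
    x = [0.0] * dim
    for _ in range(iters):
        loss = logistic_loss(x, rows)
        if loss == 0.0:  # loss underflowed; nothing left to do
            break
        eta = 1.0 / loss  # hypothetical schedule, not taken from the paper
        g = logistic_grad(x, rows)
        x = [xj - eta * gj for xj, gj in zip(x, g)]
    return x
```

On a tiny separable instance such as `rows = [[1.0], [2.0]]`, the step size grows as the loss shrinks, which is the qualitative behaviour a fixed learning rate cannot reproduce.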
Submitted 25 June, 2023;
originally announced June 2023.
-
Iterative Hard Thresholding with Adaptive Regularization: Sparser Solutions Without Sacrificing Runtime
Authors:
Kyriakos Axiotis,
Maxim Sviridenko
Abstract:
We propose a simple modification to the iterative hard thresholding (IHT) algorithm, which recovers asymptotically sparser solutions as a function of the condition number. When aiming to minimize a convex function $f(x)$ with condition number $κ$ subject to $x$ being an $s$-sparse vector, the standard IHT guarantee is a solution with relaxed sparsity $O(sκ^2)$, while our proposed algorithm, regularized IHT, returns a solution with sparsity $O(sκ)$. Our algorithm significantly improves over ARHT which also finds a solution of sparsity $O(sκ)$, as it does not require re-optimization in each iteration (and so is much faster), is deterministic, and does not require knowledge of the optimal solution value $f(x^*)$ or the optimal sparsity level $s$. Our main technical tool is an adaptive regularization framework, in which the algorithm progressively learns the weights of an $\ell_2$ regularization term that will allow convergence to sparser solutions. We also apply this framework to low rank optimization, where we achieve a similar improvement of the best known condition number dependence from $κ^2$ to $κ$.
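For context, the baseline that the abstract modifies can be sketched as follows. This is standard IHT (a gradient step followed by keeping the $s$ largest-magnitude entries), shown for a least-squares objective chosen purely for illustration. It is not the regularized variant, whose adaptive weight updates are not specified in the abstract; the function names and the fixed `step` parameter are assumptions.

```python
def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:s])
    return [x[i] if i in keep else 0.0 for i in range(len(x))]

def iht_least_squares(A, b, s, step, iters):
    """Plain IHT for f(x) = 0.5 * ||A x - b||^2 with a fixed step size."""
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # residual = A x - b
        residual = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        # gradient = A^T residual
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        # Gradient step, then projection onto s-sparse vectors.
        x = hard_threshold([x[j] - step * grad[j] for j in range(n)], s)
    return x
```

With `A` the 3x3 identity, `b = [3.0, -0.5, 2.0]`, `s = 2` and `step = 1.0`, the iterate settles on the two largest entries of `b`.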
Submitted 11 April, 2022;
originally announced April 2022.
-
TSI: an Ad Text Strength Indicator using Text-to-CTR and Semantic-Ad-Similarity
Authors:
Shaunak Mishra,
Changwei Hu,
Manisha Verma,
Kevin Yen,
Yifan Hu,
Maxim Sviridenko
Abstract:
Coming up with effective ad text is a time-consuming process, and particularly challenging for small businesses with limited advertising experience. When an inexperienced advertiser onboards with a poorly written ad text, the ad platform has the opportunity to detect low performing ad text, and provide improvement suggestions. To realize this opportunity, we propose an ad text strength indicator (TSI) which: (i) predicts the click-through-rate (CTR) for an input ad text, (ii) fetches similar existing ads to create a neighborhood around the input ad, and (iii) compares the predicted CTRs in the neighborhood to declare whether the input ad is strong or weak. In addition, as suggestions for ad text improvement, TSI shows anonymized versions of superior ads (higher predicted CTR) in the neighborhood. For (i), we propose a BERT based text-to-CTR model trained on impressions and clicks associated with an ad text. For (ii), we propose a sentence-BERT based semantic-ad-similarity model trained using weak labels from ad campaign setup data. Offline experiments demonstrate that our BERT based text-to-CTR model achieves a significant lift in CTR prediction AUC for cold start (new) advertisers compared to bag-of-words based baselines. In addition, our semantic-textual-similarity model for similar ads retrieval achieves a precision@1 of 0.93 (for retrieving ads from the same product category); this is significantly higher compared to unsupervised TF-IDF, word2vec, and sentence-BERT baselines. Finally, we share promising online results from advertisers in the Yahoo (Verizon Media) ad platform where a variant of TSI was implemented with sub-second end-to-end latency.
Submitted 18 August, 2021;
originally announced August 2021.
-
VisualTextRank: Unsupervised Graph-based Content Extraction for Automating Ad Text to Image Search
Authors:
Shaunak Mishra,
Mikhail Kuznetsov,
Gaurav Srivastava,
Maxim Sviridenko
Abstract:
Numerous online stock image libraries offer high quality yet copyright free images for use in marketing campaigns. To assist advertisers in navigating such third party libraries, we study the problem of automatically fetching relevant ad images given the ad text (via a short textual query for images). Motivated by our observations in logged data on ad image search queries (given ad text), we formulate a keyword extraction problem, where a keyword extracted from the ad text (or its augmented version) serves as the ad image query. In this context, we propose VisualTextRank: an unsupervised method to (i) augment input ad text using semantically similar ads, and (ii) extract the image query from the augmented ad text. VisualTextRank builds on prior work on graph based context extraction (biased TextRank in particular) by leveraging both the text and image of similar ads for better keyword extraction, and using advertiser category specific biasing with sentence-BERT embeddings. Using data collected from the Verizon Media Native (Yahoo Gemini) ad platform's stock image search feature for onboarding advertisers, we demonstrate the superiority of VisualTextRank compared to competitive keyword extraction baselines (including an $11\%$ accuracy lift over biased TextRank). For the case when the stock image library is restricted to English queries, we show the effectiveness of VisualTextRank on multilingual ads (translated to English) while leveraging semantically similar English ads. Online tests with a simplified version of VisualTextRank led to a 28.7% increase in the usage of stock image search, and a 41.6% increase in the advertiser onboarding rate in the Verizon Media Native ad platform.
Submitted 5 August, 2021;
originally announced August 2021.
-
Local Search Algorithms for Rank-Constrained Convex Optimization
Authors:
Kyriakos Axiotis,
Maxim Sviridenko
Abstract:
We propose greedy and local search algorithms for rank-constrained convex optimization, namely solving $\underset{\mathrm{rank}(A)\leq r^*}{\min}\, R(A)$ given a convex function $R:\mathbb{R}^{m\times n}\rightarrow \mathbb{R}$ and a parameter $r^*$. These algorithms consist of repeating two steps: (a) adding a new rank-1 matrix to $A$ and (b) enforcing the rank constraint on $A$. We refine and improve the theoretical analysis of Shalev-Shwartz et al. (2011), and show that if the rank-restricted condition number of $R$ is $κ$, a solution $A$ with rank $O(r^*\cdot \min\{κ\log \frac{R(\mathbf{0})-R(A^*)}ε, κ^2\})$ and $R(A) \leq R(A^*) + ε$ can be recovered, where $A^*$ is the optimal solution. This significantly generalizes associated results on sparse convex optimization, as well as rank-constrained convex optimization for smooth functions. We then introduce new practical variants of these algorithms that have superior runtime and recover better solutions in practice. We demonstrate the versatility of these methods on a wide range of applications involving matrix completion and robust principal component analysis.
Submitted 15 January, 2021;
originally announced January 2021.
-
Sparse Convex Optimization via Adaptively Regularized Hard Thresholding
Authors:
Kyriakos Axiotis,
Maxim Sviridenko
Abstract:
The goal of Sparse Convex Optimization is to optimize a convex function $f$ under a sparsity constraint $s\leq s^*γ$, where $s^*$ is the target number of non-zero entries in a feasible solution (sparsity) and $γ\geq 1$ is an approximation factor. There has been a lot of work to analyze the sparsity guarantees of various algorithms (LASSO, Orthogonal Matching Pursuit (OMP), Iterative Hard Thresholding (IHT)) in terms of the Restricted Condition Number $κ$. The best known algorithms guarantee to find an approximate solution of value $f(x^*)+ε$ with the sparsity bound of $γ= O\left(κ\min\left\{\log \frac{f(x^0)-f(x^*)}ε, κ\right\}\right)$, where $x^*$ is the target solution. We present a new Adaptively Regularized Hard Thresholding (ARHT) algorithm that makes significant progress on this problem by bringing the bound down to $γ=O(κ)$, which has been shown to be tight for a general class of algorithms including LASSO, OMP, and IHT. This is achieved without significant sacrifice in the runtime efficiency compared to the fastest known algorithms. We also provide a new analysis of OMP with Replacement (OMPR) for general $f$, under the condition $s > s^* \frac{κ^2}{4}$, which yields Compressed Sensing bounds under the Restricted Isometry Property (RIP). When compared to other Compressed Sensing approaches, it has the advantage of providing a strong tradeoff between the RIP condition and the solution sparsity, while working for any general function $f$ that meets the RIP condition.
Submitted 25 June, 2020;
originally announced June 2020.
-
On the computational complexity of the probabilistic label tree algorithms
Authors:
Robert Busa-Fekete,
Krzysztof Dembczynski,
Alexander Golovnev,
Kalina Jasinska,
Mikhail Kuznetsov,
Maxim Sviridenko,
Chao Xu
Abstract:
Label tree-based algorithms are widely used to tackle multi-class and multi-label problems with a large number of labels. We focus on a particular subclass of these algorithms that use probabilistic classifiers in the tree nodes. Examples of such algorithms are hierarchical softmax (HSM), designed for multi-class classification, and probabilistic label trees (PLTs) that generalize HSM to multi-label problems. If the tree structure is given, learning of PLT can be solved with provable regret guarantees [Wydmuch et al. 2018]. However, finding a tree structure that results in a PLT with low training and prediction computational costs as well as low statistical error seems to be a very challenging problem that is not yet well understood.
In this paper, we address the problem of finding a tree structure that has low computational cost. First, we show that finding a tree with optimal training cost is NP-complete; nevertheless, there are some tractable special cases with either perfect approximation or exact solution that can be obtained in linear time in terms of the number of labels $m$. For the general case, we obtain an $O(\log m)$ approximation, also in linear time. Moreover, we prove an upper bound on the expected prediction cost expressed in terms of the expected training cost. We also show that under additional assumptions the prediction cost of a PLT is $O(\log m)$.
Submitted 1 June, 2019;
originally announced June 2019.
-
Sponsored Search Auctions with Rich Ads
Authors:
Ruggiero Cavallo,
Prabhakar Krishnamurthy,
Maxim Sviridenko,
Christopher A. Wilkens
Abstract:
The generalized second price (GSP) auction has served as the core selling mechanism for sponsored search ads for over a decade. However, recent trends expanding the set of allowed ad formats---to include a variety of sizes, decorations, and other distinguishing features---have raised critical problems for GSP-based platforms. Alternatives such as the Vickrey-Clarke-Groves (VCG) auction raise different complications because they fundamentally change the way prices are computed. In this paper we report on our efforts to redesign a search ad selling system from the ground up in this new context, proposing a mechanism that optimizes an entire slate of ads globally and computes prices that achieve properties analogous to those held by GSP in the original, simpler setting of uniform ads. A careful algorithmic coupling of allocation-optimization and pricing-computation allows our auction to operate within the strict timing constraints inherent in real-time ad auctions. We report performance results of the auction in Yahoo's Gemini Search platform.
Submitted 20 January, 2017;
originally announced January 2017.
-
Determining Tournament Payout Structures for Daily Fantasy Sports
Authors:
Christopher Musco,
Maxim Sviridenko,
Justin Thaler
Abstract:
With an exploding global market and the recent introduction of online cash prize tournaments, fantasy sports contests are quickly becoming a central part of the social gaming and sports industries. For sports fans and online media companies, fantasy sports contests are an opportunity for large financial gains. However, they present a host of technical challenges that arise from the complexities involved in running a web-scale, prize driven fantasy sports platform.
We initiate the study of these challenges by examining one concrete problem in particular: how to algorithmically generate contest payout structures that are 1) economically motivating and appealing to contestants and 2) reasonably structured and succinctly representable. We formalize this problem and present a general two-staged approach for producing satisfying payout structures given constraints on contest size, entry fee, prize bucketing, etc.
We then propose and evaluate several potential algorithms for solving the payout problem efficiently, including methods based on dynamic programming, integer programming, and heuristic techniques. Experimental results show that a carefully designed heuristic scales very well, even to contests with over 100,000 prize winners.
Our approach extends beyond fantasy sports -- it is suitable for generating engaging payout structures for any contest with a large number of entrants and a large number of prize winners, including other massive online games, poker tournaments, and real-life sports tournaments.
Submitted 4 November, 2016; v1 submitted 16 January, 2016;
originally announced January 2016.
-
A bi-criteria approximation algorithm for $k$ Means
Authors:
Konstantin Makarychev,
Yury Makarychev,
Maxim Sviridenko,
Justin Ward
Abstract:
We consider the classical $k$-means clustering problem in the setting of bi-criteria approximation, in which an algorithm is allowed to output $βk > k$ clusters, and must produce a clustering with cost at most $α$ times the cost of the optimal set of $k$ clusters. We argue that this approach is natural in many settings, for which the exact number of clusters is a priori unknown, or unimportant up to a constant factor. We give new bi-criteria approximation algorithms, based on linear programming and local search, respectively, which attain a guarantee $α(β)$ depending on the number $βk$ of clusters that may be opened. Our guarantee $α(β)$ is always at most $9 + ε$ and improves rapidly with $β$ (for example: $α(2)<2.59$, and $α(3) < 1.4$). Moreover, our algorithms have only polynomial dependence on the dimension of the input data, and so are applicable in high-dimensional settings.
Submitted 3 August, 2015; v1 submitted 15 July, 2015;
originally announced July 2015.
-
Greedy Minimization of Weakly Supermodular Set Functions
Authors:
Christos Boutsidis,
Edo Liberty,
Maxim Sviridenko
Abstract:
This paper defines weak-$α$-supermodularity for set functions. Many optimization objectives in machine learning and data mining seek to minimize such functions under cardinality constraints. We prove that such problems benefit from a greedy extension phase. Explicitly, let $S^*$ be the optimal set of cardinality $k$ that minimizes $f$ and let $S_0$ be an initial solution such that $f(S_0)/f(S^*) \le ρ$. Then, a greedy extension $S \supset S_0$ of size $|S| \le |S_0| + \lceil αk \ln(ρ/\varepsilon) \rceil$ yields $f(S)/f(S^*) \le 1+\varepsilon$. As example usages of this framework we give new bicriteria results for $k$-means, sparse regression, and column subset selection.
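The greedy extension phase can be sketched generically: starting from $S_0$, repeatedly add the element that lowers $f$ the most, for at most $\lceil αk \ln(ρ/\varepsilon) \rceil$ rounds. The code below is a minimal sketch under assumptions: the function names are hypothetical, $f$ is supplied by the caller as a function on Python sets, and the toy clustering cost in the usage note is only an illustration, not an instance analysed in the paper.

```python
import math

def extension_budget(alpha, k, rho, eps):
    """Number of extra elements allowed: ceil(alpha * k * ln(rho / eps))."""
    return math.ceil(alpha * k * math.log(rho / eps))

def greedy_extension(f, ground_set, initial, extra):
    """Add up to `extra` elements to `initial`, each time choosing the
    element whose addition gives the smallest value of f."""
    current = set(initial)
    for _ in range(extra):
        candidates = [e for e in ground_set if e not in current]
        if not candidates:
            break
        best = min(candidates, key=lambda e: f(current | {e}))
        current.add(best)
    return current

def clustering_cost(points):
    """Toy objective: squared distance of each point to its nearest chosen center."""
    def f(centers):
        return sum(min((p - c) ** 2 for c in centers) for p in points)
    return f
```

For the points `[0.0, 1.0, 10.0, 11.0]` and the initial solution `{0.0}`, one greedy round adds `10.0` and lowers the toy cost from 222 to 2.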
Submitted 23 February, 2015;
originally announced February 2015.
-
An Algorithm for Online K-Means Clustering
Authors:
Edo Liberty,
Ram Sriharsha,
Maxim Sviridenko
Abstract:
This paper shows that one can be competitive with the k-means objective while operating online. In this model, the algorithm receives vectors v_1,...,v_n one by one in an arbitrary order. For each vector the algorithm outputs a cluster identifier before receiving the next one. Our online algorithm generates ~O(k) clusters whose k-means cost is ~O(W*). Here, W* is the optimal k-means cost using k clusters and ~O suppresses poly-logarithmic factors. We also show that, experimentally, it is not much worse than k-means++ while operating in a strictly more constrained computational model.
Submitted 23 February, 2015; v1 submitted 18 December, 2014;
originally announced December 2014.
-
Polynomial-Time Approximation Schemes for Circle and Other Packing Problems
Authors:
Flávio K. Miyazawa,
Lehilton L. C. Pedrosa,
Rafael C. S. Schouery,
Maxim Sviridenko,
Yoshiko Wakabayashi
Abstract:
We give an asymptotic approximation scheme (APTAS) for the problem of packing a set of circles into a minimum number of unit square bins. To obtain rational solutions, we use augmented bins of height $1+γ$, for some arbitrarily small number $γ> 0$. Our algorithm is polynomial in $\log 1/γ$, and thus $γ$ is part of the problem input. For the special case that $γ$ is constant, we give a (one dimensional) resource augmentation scheme, that is, we obtain a packing into bins of unit width and height $1+γ$ using no more than the number of bins in an optimal packing. Additionally, we obtain an APTAS for the circle strip packing problem, whose goal is to pack a set of circles into a strip of unit width and minimum height. These are the first approximation and resource augmentation schemes for these problems.
Our algorithm is based on novel ideas of iteratively separating small and large items, and may be extended to a wide range of packing problems that satisfy certain conditions. These extensions comprise problems with different kinds of items, such as regular polygons, or with bins of different shapes, such as circles and spheres. As an example, we obtain APTAS's for the problems of packing d-dimensional spheres into hypercubes under the $L_p$-norm.
Submitted 15 December, 2014;
originally announced December 2014.
-
Optimization Problems with Diseconomies of Scale via Decoupling
Authors:
Konstantin Makarychev,
Maxim Sviridenko
Abstract:
We present a new framework for solving optimization problems with a diseconomy of scale. In such problems, our goal is to minimize the cost of resources used to perform a certain task. The cost of resources grows superlinearly, as $x^q$, $q\ge 1$, with the amount $x$ of resources used. We define a novel linear programming relaxation for such problems, and then show that the integrality gap of the relaxation is $A_q$, where $A_q$ is the $q$-th moment of the Poisson random variable with parameter 1. Using our framework, we obtain approximation algorithms for the Minimum Energy Efficient Routing, Minimum Degree Balanced Spanning Tree, Load Balancing on Unrelated Parallel Machines, and Unrelated Parallel Machine Scheduling with Nonlinear Functions of Completion Times problems.
Our analysis relies on the decoupling inequality for nonnegative random variables. The inequality states that $$\big \|\sum_{i=1}^n X_i\big\|_{q} \leq C_q \,\big \|\sum_{i=1}^n Y_i\big\|_{q},$$ where $X_i$ are independent nonnegative random variables, $Y_i$ are possibly dependent nonnegative random variables, and each $Y_i$ has the same distribution as $X_i$. The inequality was proved by de la Peña in 1990. De la Peña, Ibragimov, and Sharakhmetov (2003) showed that $C_q\leq 2$ for $q\in (1,2)$ and $C_q\leq A_q^{1/q}$ for $q\geq 2$. We show that the optimal constant is $C_q=A_q^{1/q}$ for any $q\geq 1$. We then prove a more general inequality: For every convex function $\varphi$, $$\mathbb{E}[\varphi\Big(\sum_{i=1}^n X_i\Big)]\leq \mathbb{E}[\varphi\Big(P\sum_{i=1}^n Y_i\Big)],$$ and, for every concave function $ψ$, $$\mathbb{E}[ψ\Big(\sum_{i=1}^n X_i\Big)] \geq \mathbb{E}[ψ\Big(P\sum_{i=1}^n Y_i\Big)],$$ where $P$ is a Poisson random variable with parameter 1 independent of the random variables $Y_i$.
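To get a feel for the constants, $A_q$ can be evaluated numerically from the definition of the Poisson moment, $A_q = e^{-1}\sum_{k \ge 0} k^q / k!$. The sketch below truncates the series after a fixed number of terms; the function names and the truncation length are assumptions made for illustration.

```python
import math

def poisson_moment(q, terms=100):
    """q-th moment of a Poisson(1) variable, for q >= 1:
    A_q = e^{-1} * sum_{k>=1} k^q / k!  (the k = 0 term is zero)."""
    return sum(k ** q / math.factorial(k) for k in range(1, terms)) / math.e

def decoupling_constant(q):
    """The constant C_q = A_q^(1/q) quoted in the abstract."""
    return poisson_moment(q) ** (1.0 / q)
```

For integer $q$ the values are the Bell numbers, so `poisson_moment(2)` is approximately 2 and `poisson_moment(3)` is approximately 5, giving $C_2 = \sqrt{2}$.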
Submitted 21 January, 2015; v1 submitted 11 April, 2014;
originally announced April 2014.
-
Maximum Quadratic Assignment Problem: Reduction from Maximum Label Cover and LP-based Approximation Algorithm
Authors:
Konstantin Makarychev,
Rajsekar Manokaran,
Maxim Sviridenko
Abstract:
We show that for every positive $ε> 0$, unless NP $\subset$ BPQP, it is impossible to approximate the maximum quadratic assignment problem within a factor better than $2^{\log^{1-ε} n}$ by a reduction from the maximum label cover problem. Our result also implies that Approximate Graph Isomorphism is not robust and is in fact, $1 - ε$ vs $ε$ hard assuming the Unique Games Conjecture.
Then, we present an $O(\sqrt{n})$-approximation algorithm for the problem based on rounding of the linear programming relaxation often used in the state of the art exact algorithms.
Submitted 30 March, 2014;
originally announced March 2014.
-
Energy Efficient Scheduling and Routing via Randomized Rounding
Authors:
Evripidis Bampis,
Alexander Kononov,
Dimitrios Letsios,
Giorgio Lucarelli,
Maxim Sviridenko
Abstract:
We propose a unifying framework based on configuration linear programs and randomized rounding, for different energy optimization problems in the dynamic speed-scaling setting. We apply our framework to various scheduling and routing problems in heterogeneous computing and networking environments. We first consider the energy minimization problem of scheduling a set of jobs on a set of parallel speed scalable processors in a fully heterogeneous setting. For both the preemptive-non-migratory and the preemptive-migratory variants, our approach allows us to obtain solutions of almost the same quality as for the homogeneous environment. By exploiting the result for the preemptive-non-migratory variant, we are able to improve the best known approximation ratio for the single processor non-preemptive problem. Furthermore, we show that our approach allows us to obtain a constant-factor approximation algorithm for the power-aware preemptive job shop scheduling problem. Finally, we consider the min-power routing problem where we are given a network modeled by an undirected graph and a set of uniform demands that have to be routed on integral routes from their sources to their destinations so that the energy consumption is minimized. We improve the best known approximation ratio for this problem.
Submitted 19 March, 2014;
originally announced March 2014.
-
Optimal approximation for submodular and supermodular optimization with bounded curvature
Authors:
Maxim Sviridenko,
Jan Vondrák,
Justin Ward
Abstract:
We design new approximation algorithms for the problems of optimizing submodular and supermodular functions subject to a single matroid constraint. Specifically, we consider the case in which we wish to maximize a nondecreasing submodular function or minimize a nonincreasing supermodular function in the setting of bounded total curvature $c$. In the case of submodular maximization with curvature $c$, we obtain a $(1-c/e)$-approximation --- the first improvement over the greedy $(1-e^{-c})/c$-approximation of Conforti and Cornuejols from 1984, which holds for a cardinality constraint, as well as recent approaches that hold for an arbitrary matroid constraint.
Our approach is based on modifications of the continuous greedy algorithm and non-oblivious local search, and allows us to approximately maximize the sum of a nonnegative, nondecreasing submodular function and a (possibly negative) linear function. We show how to reduce both submodular maximization and supermodular minimization to this general problem when the objective function has bounded total curvature. We prove that the approximation results we obtain are the best possible in the value oracle model, even in the case of a cardinality constraint.
We define an extension of the notion of curvature to general monotone set functions and show $(1-c)$-approximation for maximization and $1/(1-c)$-approximation for minimization cases. Finally, we give two concrete applications of our results in the settings of maximum entropy sampling, and the column-subset selection problem.
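The two guarantees for submodular maximization quoted above, $(1-c/e)$ and the greedy $(1-e^{-c})/c$, can be compared numerically for curvature $c \in (0, 1]$. The snippet below is only a worked evaluation of those two formulas; the function names are hypothetical.

```python
import math

def greedy_ratio(c):
    """Greedy guarantee (1 - e^{-c}) / c for curvature c in (0, 1]."""
    return (1.0 - math.exp(-c)) / c

def improved_ratio(c):
    """Guarantee 1 - c/e stated in the abstract."""
    return 1.0 - c / math.e

for c in (0.25, 0.5, 0.75, 1.0):
    print(f"c={c:.2f}  greedy={greedy_ratio(c):.4f}  improved={improved_ratio(c):.4f}")
```

The two expressions coincide at $c = 1$, where both equal $1 - 1/e$, and the improved bound is larger for every $c$ strictly between 0 and 1.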
Submitted 12 December, 2014; v1 submitted 19 November, 2013;
originally announced November 2013.
-
Submodular Stochastic Probing on Matroids
Authors:
Marek Adamczyk,
Maxim Sviridenko,
Justin Ward
Abstract:
In a stochastic probing problem we are given a universe $E$, where each element $e \in E$ is active independently with probability $p_e$, and only a probe of e can tell us whether it is active or not. On this universe we execute a process that one by one probes elements --- if a probed element is active, then we have to include it in the solution, which we gradually construct. Throughout the process we need to obey inner constraints on the set of elements taken into the solution, and outer constraints on the set of all probed elements. This abstract model was presented by Gupta and Nagarajan (IPCO '13), and provides a unified view of a number of problems. Thus far, all the results falling under this general framework pertain mainly to the case in which we are maximizing a linear objective function of the successfully probed elements. In this paper we generalize the stochastic probing problem by considering a monotone submodular objective function. We give a $(1 - 1/e)/(k_{in} + k_{out}+1)$-approximation algorithm for the case in which we are given $k_{in}$ matroids as inner constraints and $k_{out}$ matroids as outer constraints. Additionally, we obtain an improved $1/(k_{in} + k_{out})$-approximation algorithm for linear objective functions.
Submitted 18 February, 2014; v1 submitted 16 October, 2013;
originally announced October 2013.
-
Optimizing Maximum Flow Time and Maximum Throughput in Broadcast Scheduling
Authors:
Sungjin Im,
Maxim Sviridenko
Abstract:
We consider the pull-based broadcast scheduling model. In this model, there are $n$ unit-sized pages of information available at the server. Requests arrive over time at the server, each asking for a specific page. When the server transmits a page, all outstanding requests for that page are simultaneously satisfied; this is what distinguishes broadcast scheduling from the standard scheduling setting, where each job must be processed separately by the server. Broadcast scheduling has received considerable attention due to the algorithmic challenges it poses, in addition to its applications in multicast systems and in wireless and LAN networks. In this paper, we give the following new approximation results for two popular objectives:
- For the objective of minimizing the maximum flow time, we give the first PTAS. Previously, it was known that the algorithm First-In-First-Out (FIFO) is a 2-approximation, and it is tight. It has been suggested as an open problem to obtain a better approximation.
- For the objective of maximizing the throughput, we give a 0.7759-approximation which improves upon the previous best known 0.75-approximation.
Our improved results are enabled by our novel rounding schemes and linear programming, which can effectively reduce congestion in the schedule; congestion is often the main bottleneck in designing scheduling algorithms based on linear programming. We believe that our algorithmic ideas and techniques could be of use for other scheduling problems.
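As a rough illustration of the model and of the FIFO baseline mentioned above (not of the PTAS or of the throughput algorithm from the paper), the following minimal sketch simulates FIFO broadcasting and reports the maximum flow time. The function name, the integer time slots, and the convention that a broadcast started at time t finishes at t + 1 and serves requests that arrived at or before t are all assumptions made for the example.

```python
def fifo_broadcast_max_flow_time(requests):
    """Simulate FIFO broadcast scheduling (illustrative sketch only).

    requests: list of (arrival_time, page) pairs with integer arrival times.
    Assumed timing model: one unit-sized page is broadcast per time slot;
    a broadcast started at time t finishes at t + 1 and satisfies every
    outstanding request for that page that arrived at or before t.
    Returns the maximum flow time (finish time minus arrival time).
    """
    pending = sorted(requests)   # requests not yet released, by arrival time
    outstanding = []             # released but unsatisfied, in arrival order
    t = 0
    i = 0
    max_flow = 0
    while i < len(pending) or outstanding:
        # Release every request that has arrived by time t.
        while i < len(pending) and pending[i][0] <= t:
            outstanding.append(pending[i])
            i += 1
        if not outstanding:
            # Server is idle: jump to the next arrival.
            t = pending[i][0]
            continue
        # FIFO rule: broadcast the page of the earliest outstanding request.
        page = outstanding[0][1]
        finish = t + 1
        served = [r for r in outstanding if r[1] == page]
        max_flow = max(max_flow, max(finish - arrival for arrival, _ in served))
        # One broadcast satisfies all outstanding requests for that page.
        outstanding = [r for r in outstanding if r[1] != page]
        t = finish
    return max_flow
```

For example, with requests `[(0, "A"), (0, "B"), (1, "A")]` the sketch broadcasts A, then B, then A again, because the second request for A arrives only after the first broadcast of A has started.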
Submitted 14 September, 2013;
originally announced September 2013.
-
Large Neighborhood Local Search for the Maximum Set Packing Problem
Authors:
Maxim Sviridenko,
Justin Ward
Abstract:
In this paper we consider the classical maximum set packing problem where set cardinality is upper bounded by $k$. We show how to design a variant of a polynomial-time local search algorithm with performance guarantee $(k+2)/3$. This local search algorithm is a special case of a more general procedure that allows swapping up to $Θ(\log n)$ elements per iteration. We also design problem instances with locality gap $k/3$ even for a wide class of exponential-time local search procedures, which can swap up to $cn$ elements for a constant $c$. This shows that our analysis of this class of algorithms is almost tight.
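To make the idea of swap-based local search for set packing concrete, here is a minimal sketch of the generic technique. It is not the algorithm analyzed in the paper: it only tries swaps that bring in at most `max_swap` new sets (a small constant rather than $Θ(\log n)$), and the function name and parameters are hypothetical.

```python
from itertools import combinations


def local_search_set_packing(sets, max_swap=2):
    """Generic swap-based local search for set packing (illustrative sketch).

    sets: iterable of collections of hashable elements.
    max_swap: assumed bound on how many new sets one improving move may add.
    A move adds up to max_swap pairwise disjoint sets and removes every
    chosen set that intersects them; it is accepted only if the packing grows.
    Returns the chosen sets as a list of Python sets.
    """
    sets = [frozenset(s) for s in sets]
    chosen = set()  # indices of the sets currently in the packing

    def pairwise_disjoint(indices):
        seen = set()
        for idx in indices:
            if seen & sets[idx]:
                return False
            seen |= sets[idx]
        return True

    improved = True
    while improved:
        improved = False
        outside = [i for i in range(len(sets)) if i not in chosen]
        for size in range(1, max_swap + 1):
            for add in combinations(outside, size):
                if not pairwise_disjoint(add):
                    continue
                added_elems = set().union(*(sets[i] for i in add))
                # Chosen sets that must leave to keep the packing disjoint.
                conflict = {j for j in chosen if sets[j] & added_elems}
                if len(conflict) < size:
                    chosen = (chosen - conflict) | set(add)
                    improved = True
                    break
            if improved:
                break
    return [set(sets[i]) for i in sorted(chosen)]
```

Each accepted move strictly increases the number of chosen sets, so the loop terminates. On the input `[{2, 3}, {1, 2}, {3, 4}]` a single-set search can get stuck with `{2, 3}` alone, whereas `max_swap=2` exchanges it for the two disjoint sets.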
Submitted 18 February, 2013;
originally announced February 2013.
-
No-Wait Flowshop Scheduling is as Hard as Asymmetric Traveling Salesman Problem
Authors:
Marcin Mucha,
Maxim Sviridenko
Abstract:
In this paper we study the classical no-wait flowshop scheduling problem with makespan objective ($F|\text{no-wait}|C_{\max}$ in the standard three-field notation). This problem is well known to be a special case of the asymmetric traveling salesman problem (ATSP) and as such has an approximation algorithm with logarithmic performance guarantee. In this work we show a reverse connection: any polynomial-time α-approximation algorithm for the no-wait flowshop scheduling problem with makespan objective implies the existence of a polynomial-time α(1+ε)-approximation algorithm for the ATSP, for any ε>0. This in turn implies that all non-approximability results for the ATSP (current or future) carry over to its special case. In particular, it follows that the no-wait flowshop problem is APX-hard, which is the first non-approximability result for this problem.
Submitted 25 April, 2013; v1 submitted 11 February, 2013;
originally announced February 2013.
-
Approximation Algorithms for the Joint Replenishment Problem with Deadlines
Authors:
Marcin Bienkowski,
Jaroslaw Byrka,
Marek Chrobak,
Neil Dobbs,
Tomasz Nowicki,
Maxim Sviridenko,
Grzegorz Swirszcz,
Neal E. Young
Abstract:
The Joint Replenishment Problem (JRP) is a fundamental optimization problem in supply-chain management, concerned with optimizing the flow of goods from a supplier to retailers. Over time, in response to demands at the retailers, the supplier ships orders, via a warehouse, to the retailers. The objective is to schedule these orders to minimize the sum of ordering costs and retailers' waiting costs.
We study the approximability of JRP-D, the version of JRP with deadlines, where instead of waiting costs the retailers impose strict deadlines. We study the integrality gap of the standard linear-program (LP) relaxation, giving a lower bound of 1.207, a stronger, computer-assisted lower bound of 1.245, as well as an upper bound and approximation ratio of 1.574. The best previous upper bound and approximation ratio was 1.667; no lower bound was previously published. For the special case when all demand periods are of equal length we give an upper bound of 1.5, a lower bound of 1.2, and show APX-hardness.
Submitted 2 December, 2015; v1 submitted 13 December, 2012;
originally announced December 2012.
-
Concentration and Moment Inequalities for Polynomials of Independent Random Variables
Authors:
Warren Schudy,
Maxim Sviridenko
Abstract:
In this work we design a general method for proving moment inequalities for polynomials of independent random variables. Our method works for a wide range of random variables including Gaussian, Boolean, exponential, Poisson and many others. We apply our method to derive general concentration inequalities for polynomials of independent random variables. We show that our method implies concentration inequalities for some previously open problems, e.g. the permanent of a random symmetric matrix. We show that our concentration inequality is stronger than the well-known concentration inequality due to Kim and Vu. The main advantages of our method in comparison with the existing ones are the wide range of random variables we can handle and the bounds for previously intractable regimes of high-degree polynomials and small expectations. On the negative side, we show that even for Boolean random variables each term in our concentration inequality is tight.
Submitted 8 June, 2012; v1 submitted 26 April, 2011;
originally announced April 2011.
-
Non-monotone submodular maximization under matroid and knapsack constraints
Authors:
Jon Lee,
Vahab Mirrokni,
Viswanath Nagarajan,
Maxim Sviridenko
Abstract:
Submodular function maximization is a central problem in combinatorial optimization, generalizing many important problems including Max Cut in directed/undirected graphs and in hypergraphs, certain constraint satisfaction problems, maximum entropy sampling, and maximum facility location problems. Unlike submodular minimization, submodular maximization is NP-hard. For the problem of maximizing a non-monotone submodular function, Feige, Mirrokni, and Vondrák recently developed a $\frac{2}{5}$-approximation algorithm [FMV07]; however, their algorithms do not handle side constraints. In this paper, we give the first constant-factor approximation algorithm for maximizing any non-negative submodular function subject to multiple matroid or knapsack constraints. We emphasize that our results are for non-monotone submodular functions. In particular, for any constant $k$, we present a $\frac{1}{k+2+\frac{1}{k}+ε}$-approximation for the submodular maximization problem under $k$ matroid constraints, and a $(\frac{1}{5}-ε)$-approximation algorithm for this problem subject to $k$ knapsack constraints ($ε>0$ is any constant). We improve the approximation guarantee of our algorithm to $\frac{1}{k+1+\frac{1}{k-1}+ε}$ for $k\ge 2$ partition matroid constraints. This idea also gives a $\frac{1}{k+ε}$-approximation for maximizing a monotone submodular function subject to $k\ge 2$ partition matroids, which improves over the previously best known guarantee of $\frac{1}{k+1}$.
Submitted 2 February, 2009;
originally announced February 2009.