Search | arXiv e-print repository

doi 10.1007/978-3-319-18263-6_18

Generalized Hypergraph Matching via Iterated Packing and Local Ratio

Abstract: In $k$-hypergraph matching, we are given a collection of sets of size at most $k$, each with an associated weight, and we seek a maximum-weight subcollection whose sets are pairwise disjoint. More generally, in $k$-hypergraph $b$-matching, instead of disjointness we require that every element appears in at most $b$ sets of the subcollection. Our main result is a linear-programming based $(k-1+\tfrac{1}{k})$-approximation algorithm for $k$-hypergraph $b$-matching. This settles the integrality gap when $k$ is one more than a prime power, since it matches a previously-known lower bound. When the hypergraph is bipartite, we are able to improve the approximation ratio to $k-1$, which is also best possible relative to the natural LP. These results are obtained using a more careful application of the \emph{iterated packing} method. Using the bipartite algorithmic integrality gap upper bound, we show that for the family of combinatorial auctions in which anyone can win at most $t$ items, there is a truthful-in-expectation polynomial-time auction that $t$-approximately maximizes social welfare. We also show that our results directly imply new approximations for a generalization of the recently introduced bounded-color matching problem. We also consider the generalization of $b$-matching to \emph{demand matching}, where edges have nonuniform demand values. The best known approximation algorithm for this problem has ratio $2k$ on $k$-hypergraphs. We give a new algorithm, based on local ratio, that obtains the same approximation ratio in a much simpler way.

Submitted 1 April, 2016; originally announced April 2016.

Comments: 12 pages. Appeared in the 12th Workshop on Approximation and Online Algorithms (WAOA 2014), available at Springer via http://dx.doi.org/10.1007/978-3-319-18263-6_18
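
For orientation, the "natural LP" mentioned in the abstract above is presumably the standard relaxation sketched below. The notation is an assumption made for illustration ($x_S$ is a fractional indicator for choosing set $S$, $w_S$ is its weight, $v$ ranges over elements); the paper's own formulation may differ in details such as per-element capacities.

```latex
\begin{align*}
\text{maximize}   \quad & \sum_{S} w_S \, x_S \\
\text{subject to} \quad & \sum_{S \ni v} x_S \le b && \text{for every element } v, \\
                        & 0 \le x_S \le 1          && \text{for every set } S.
\end{align*}
```

Setting $b = 1$ recovers the relaxation of ordinary $k$-hypergraph matching, where the chosen sets must be pairwise disjoint.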

arXiv:1509.07238 [pdf, other]

Frequency Distribution of Error Messages

Authors: David Pritchard

Abstract: Which programming error messages are the most common? We investigate this question, motivated by writing error explanations for novices. We consider large data sets in Python and Java that include both syntax and run-time errors. In both data sets, after grouping essentially identical messages, the error message frequencies empirically resemble Zipf-Mandelbrot distributions. We use a maximum-likelihood approach to fit the distribution parameters. This gives one possible way to contrast languages or compilers quantitatively.

Submitted 24 September, 2015; originally announced September 2015.

Comments: To appear at PLATEAU 2015
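
As a rough illustration of the maximum-likelihood fit mentioned in the abstract above, the sketch below fits a Zipf-Mandelbrot distribution, with p(r) proportional to (r + q)^(-s) over ranks r = 1..N, to a list of per-rank counts. The function names and the plain grid search are assumptions chosen for brevity; the paper does not say which optimizer it uses.

```python
import math

def zm_log_likelihood(counts, s, q):
    """Log-likelihood of rank counts under p(r) proportional to (r + q) ** (-s).

    counts[r - 1] is the number of observations of the rank-r message.
    """
    ranks = range(1, len(counts) + 1)
    # Log of the normalizing constant over the finite rank range 1..N.
    log_z = math.log(sum((r + q) ** (-s) for r in ranks))
    return sum(c * (-s * math.log(r + q) - log_z) for r, c in zip(ranks, counts))

def fit_zipf_mandelbrot(counts, s_grid, q_grid):
    """Return the (s, q) pair from the grid with the highest likelihood.

    Grid search is a stand-in for a real optimizer (an assumption of this sketch).
    """
    candidates = ((s, q) for s in s_grid for q in q_grid)
    return max(candidates, key=lambda sq: zm_log_likelihood(counts, sq[0], sq[1]))

# Synthetic counts that follow s = 1.5, q = 2.0 exactly (not real data).
example_counts = [1000.0 * (r + 2.0) ** -1.5 for r in range(1, 51)]
best_s, best_q = fit_zipf_mandelbrot(example_counts, [1.0, 1.5, 2.0], [0.0, 2.0, 4.0])
```

Comparing the fitted (s, q) pairs across data sets is one way to read the abstract's remark about contrasting languages or compilers quantitatively.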

arXiv:1209.2166 [pdf, other]

CS Circles: An In-Browser Python Course for Beginners

Authors: David Pritchard, Troy Vasiga

Abstract: Computer Science Circles is a free programming website for beginners that is designed to be fun, easy to use, and accessible to the broadest possible audience. We teach Python since it is simple yet powerful, and the course content is well-structured but written in plain language. The website has over one hundred exercises in thirty lesson pages, plus special features to help teachers support their students. It is available in both English and French. We discuss the philosophy behind the course and its design, we describe how it was implemented, and we give statistics on its use.

Submitted 10 December, 2012; v1 submitted 10 September, 2012; originally announced September 2012.

Comments: To appear in SIGCSE 2013

ACM Class: K.3.1; K.3.2

arXiv:1202.2820 [pdf, other]

On Approximating String Selection Problems with Outliers

Authors: Christina Boucher, Gad M. Landau, Avivit Levy, David Pritchard, Oren Weimann

Abstract: Many problems in bioinformatics are about finding strings that approximately represent a collection of given strings. We look at more general problems where some input strings can be classified as outliers. The Close to Most Strings problem is, given a set S of same-length strings, and a parameter d, find a string x that maximizes the number of "non-outliers" within Hamming distance d of x. We prove this problem has no PTAS unless ZPP=NP, correcting a decade-old mistake. The Most Strings with Few Bad Columns problem is to find a maximum-size subset of input strings so that the number of non-identical positions is at most k; we show it has no PTAS unless P=NP. We also observe Closest to k Strings has no EPTAS unless W[1]=FPT. In sum, outliers help model problems associated with using biological data, but we show the problem of finding an approximate solution is computationally difficult.

Submitted 13 February, 2012; originally announced February 2012.

arXiv:1103.0412 [pdf, other]

Counting large distances in convex polygons

Authors: Filip Morić, David Pritchard

Abstract: In a convex n-gon, let d[1] > d[2] > ... denote the set of all distances between pairs of vertices, and let m[i] be the number of pairs of vertices at distance d[i] from one another. Erdos, Lovasz, and Vesztergombi conjectured that m[1] + ... + m[k] <= k*n. Using a new computational approach, we prove their conjecture when k <= 4 and n is large; we also make some progress for arbitrary k by proving that m[1] + ... + m[k] <= (2k-1)n. Our main approach revolves around a few known facts about distances, together with a computer program that searches all distance configurations of two disjoint convex hull intervals up to some finite size. We thereby obtain other new bounds such as m[3] <= 3n/2 for large n.

Submitted 29 July, 2011; v1 submitted 2 March, 2011; originally announced March 2011.

Comments: Shorter version presented at EuroComb 2011
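
To make the quantities d[i] and m[i] from the abstract above concrete, here is a small sketch that computes the multiplicities m[1], m[2], ... for a given vertex list and checks the conjectured inequality m[1] + ... + m[k] <= k*n for every k. It is only a brute-force checker for one polygon, not the paper's search program. The function names are hypothetical, the vertices are assumed to have integer coordinates so that squared distances compare exactly, and convexity is assumed rather than verified.

```python
from collections import Counter
from itertools import combinations

def distance_multiplicities(points):
    """Return [m_1, m_2, ...]: pair counts per distinct distance, largest distance first."""
    counts = Counter(
        (ax - bx) ** 2 + (ay - by) ** 2
        for (ax, ay), (bx, by) in combinations(points, 2)
    )
    # Sorting squared distances in decreasing order matches d[1] > d[2] > ...
    return [counts[d2] for d2 in sorted(counts, reverse=True)]

def satisfies_conjectured_bound(points):
    """Check m_1 + ... + m_k <= k * n for every k (convexity is assumed, not checked)."""
    n = len(points)
    prefix = 0
    for k, m_k in enumerate(distance_multiplicities(points), start=1):
        prefix += m_k
        if prefix > k * n:
            return False
    return True

unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
square_m = distance_multiplicities(unit_square)  # two diagonals, then four sides
```

For the unit square, the two diagonals give m[1] = 2 and the four sides give m[2] = 4, so the prefix sums 2 and 6 stay below 4 and 8.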

arXiv:1009.6144 [pdf, ps, other]

Cover-Decomposition and Polychromatic Numbers

Authors: Béla Bollobás, David Pritchard, Thomas Rothvoß, Alex Scott

Abstract: A colouring of a hypergraph's vertices is polychromatic if every hyperedge contains at least one vertex of each colour; the polychromatic number is the maximum number of colours in such a colouring. Its dual, the cover-decomposition number, is the maximum number of disjoint hyperedge-covers. In geometric hypergraphs, there is extensive work on lower-bounding these numbers in terms of their trivial upper bounds (minimum hyperedge size and degree); our goal here is to broaden the study beyond geometric settings. We obtain algorithms yielding near-tight bounds for three families of hypergraphs: bounded hyperedge size, paths in trees, and bounded VC-dimension. This reveals that discrepancy theory and iterated linear program relaxation are useful for cover-decomposition. Finally, we discuss the generalization of cover-decomposition to sensor cover.

Submitted 29 May, 2012; v1 submitted 30 September, 2010; originally announced September 2010.

Comments: Supersedes arXiv:1009.5893

arXiv:1006.2249 [pdf, ps, other]

Integrality Gap of the Hypergraphic Relaxation of Steiner Trees: a short proof of a 1.55 upper bound

Authors: Deeparnab Chakrabarty, Jochen Koenemann, David Pritchard

Abstract: Recently Byrka, Grandoni, Rothvoss and Sanita (at STOC 2010) gave a 1.39-approximation for the Steiner tree problem, using a hypergraph-based linear programming relaxation. They also upper-bounded its integrality gap by 1.55. We describe a shorter proof of the same integrality gap bound, by applying some of their techniques to a randomized loss-contracting algorithm.

Submitted 11 June, 2010; originally announced June 2010.

arXiv:1005.3324 [pdf, ps, other]

An LP with Integrality Gap 1+epsilon for Multidimensional Knapsack

Authors: David Pritchard

Abstract: In this note we study packing or covering integer programs with at most k constraints, which are also known as k-dimensional knapsack problems. For any integer k > 0 and real epsilon > 0, we observe there is a polynomial-sized LP for the k-dimensional knapsack problem with integrality gap at most 1+epsilon. The variables may be unbounded or have arbitrary upper bounds. In the packing case, we can also remove the dependence of the LP on the cost-function, yielding a polyhedral approximation of the integer hull. This generalizes a recent result of Bienstock on the classical knapsack problem.

Submitted 2 February, 2011; v1 submitted 18 May, 2010; originally announced May 2010.

arXiv:1004.1917 [pdf, ps, other]

k-Edge-Connectivity: Approximation and LP Relaxation

Authors: David Pritchard

Abstract: In the k-edge-connected spanning subgraph problem we are given a graph (V, E) and costs for each edge, and want to find a minimum-cost subset F of E such that (V, F) is k-edge-connected. We show there is a constant eps > 0 so that for all k > 1, finding a (1 + eps)-approximation for k-ECSS is NP-hard, establishing a gap between the unit-cost and general-cost versions. Next, we consider the multi-subgraph cousin of k-ECSS, in which we purchase a multi-subset F of E, with unlimited parallel copies available at the same cost as the original edge. We conjecture that a (1 + Theta(1/k))-approximation algorithm exists, and we describe an approach based on graph decompositions applied to its natural linear programming (LP) relaxation. The LP is essentially equivalent to the Held-Karp LP for TSP and the undirected LP for Steiner tree. We give a family of extreme points for the LP which are more complex than those previously known.

Submitted 4 October, 2010; v1 submitted 12 April, 2010; originally announced April 2010.

Comments: Appeared at WAOA 2010

arXiv:1002.3864 [pdf, other]

Limits of Approximation Algorithms: PCPs and Unique Games (DIMACS Tutorial Lecture Notes)

Authors: Prahladh Harsha, Moses Charikar, Matthew Andrews, Sanjeev Arora, Subhash Khot, Dana Moshkovitz, Lisa Zhang, Ashkan Aazami, Dev Desai, Igor Gorodezky, Geetha Jagannathan, Alexander S. Kulikov, Darakhshan J. Mir, Alantha Newman, Aleksandar Nikolov, David Pritchard, Gwen Spencer

Abstract: These are the lecture notes for the DIMACS Tutorial "Limits of Approximation Algorithms: PCPs and Unique Games" held at the DIMACS Center, CoRE Building, Rutgers University on 20-21 July, 2009. This tutorial was jointly sponsored by the DIMACS Special Focus on Hardness of Approximation, the DIMACS Special Focus on Algorithmic Foundations of the Internet, and the Center for Computational Intractability with support from the National Security Agency and the National Science Foundation. The speakers at the tutorial were Matthew Andrews, Sanjeev Arora, Moses Charikar, Prahladh Harsha, Subhash Khot, Dana Moshkovitz and Lisa Zhang. The scribes were Ashkan Aazami, Dev Desai, Igor Gorodezky, Geetha Jagannathan, Alexander S. Kulikov, Darakhshan J. Mir, Alantha Newman, Aleksandar Nikolov, David Pritchard and Gwen Spencer.

Submitted 20 February, 2010; originally announced February 2010.

Comments: 74 pages, lecture notes

Report number: DIMACS Technical Report 2010-02

arXiv:0910.0281 [pdf, ps, other]

doi 10.1007/978-3-642-13036-6_29

Hypergraphic LP Relaxations for Steiner Trees

Authors: Deeparnab Chakrabarty, Jochen Koenemann, David Pritchard

Abstract: We investigate hypergraphic LP relaxations for the Steiner tree problem, primarily the partition LP relaxation introduced by Koenemann et al. [Math. Programming, 2009]. Specifically, we are interested in proving upper bounds on the integrality gap of this LP, and studying its relation to other linear relaxations. Our results are the following. Structural results: We extend the technique of uncrossing, usually applied to families of sets, to families of partitions. As a consequence we show that any basic feasible solution to the partition LP formulation has sparse support. Although the number of variables could be exponential, the number of positive variables is at most the number of terminals. Relations with other relaxations: We show the equivalence of the partition LP relaxation with other known hypergraphic relaxations. We also show that these hypergraphic relaxations are equivalent to the well studied bidirected cut relaxation, if the instance is quasibipartite. Integrality gap upper bounds: We show an upper bound of sqrt(3) ~ 1.732 on the integrality gap of these hypergraph relaxations in general graphs. In the special case of uniformly quasibipartite instances, we show an improved upper bound of 73/60 ~ 1.216. By our equivalence theorem, the latter result implies an improved upper bound for the bidirected cut relaxation as well.

Submitted 6 March, 2010; v1 submitted 1 October, 2009; originally announced October 2009.

Comments: Revised full version; a shorter version will appear at IPCO 2010.

arXiv:0904.0859 [pdf, ps, other]

Approximability of Sparse Integer Programs

Authors: David Pritchard, Deeparnab Chakrabarty

Abstract: The main focus of this paper is a pair of new approximation algorithms for certain integer programs. First, for covering integer programs {min cx: Ax >= b, 0 <= x <= d} where A has at most k nonzeroes per row, we give a k-approximation algorithm. (We assume A, b, c, d are nonnegative.) For any k >= 2 and eps>0, if P != NP this ratio cannot be improved to k-1-eps, and under the unique games conjecture this ratio cannot be improved to k-eps. One key idea is to replace individual constraints by others that have better rounding properties but the same nonnegative integral solutions; another critical ingredient is knapsack-cover inequalities. Second, for packing integer programs {max cx: Ax <= b, 0 <= x <= d} where A has at most k nonzeroes per column, we give a (2k^2+2)-approximation algorithm. Our approach builds on the iterated LP relaxation framework. In addition, we obtain improved approximations for the second problem when k=2, and for both problems when every A_{ij} is small compared to b_i. Finally, we demonstrate a 17/16-inapproximability for covering integer programs with at most two nonzeroes per column.

Submitted 9 February, 2010; v1 submitted 6 April, 2009; originally announced April 2009.

Comments: Version submitted to Algorithmica special issue on ESA 2009. Previous conference version: http://dx.doi.org/10.1007/978-3-642-04128-0_8

arXiv:0712.3568 [pdf, ps, other]

A Partition-Based Relaxation For Steiner Trees

Authors: Jochen Konemann, David Pritchard, Kunlun Tan

Abstract: The Steiner tree problem is a classical NP-hard optimization problem with a wide range of practical applications. In an instance of this problem, we are given an undirected graph G=(V,E), a set of terminals R, and non-negative costs c_e for all edges e in E. Any tree that contains all terminals is called a Steiner tree; the goal is to find a minimum-cost Steiner tree. The nodes V \ R are called Steiner nodes. The best approximation algorithm known for the Steiner tree problem is due to Robins and Zelikovsky (SIAM J. Discrete Math, 2005); their greedy algorithm achieves a performance guarantee of 1+(ln 3)/2 ~ 1.55. The best known linear programming (LP)-based algorithm, on the other hand, is due to Goemans and Bertsimas (Math. Programming, 1993) and achieves an approximation ratio of 2-2/|R|. In this paper we establish a link between greedy and LP-based approaches by showing that Robins and Zelikovsky's algorithm has a natural primal-dual interpretation with respect to a novel partition-based linear programming relaxation. We also exhibit surprising connections between the new formulation and existing LPs and we show that the new LP is stronger than the bidirected cut formulation. An instance is b-quasi-bipartite if each connected component of G \ R has at most b vertices. We show that Robins and Zelikovsky's algorithm has an approximation ratio better than 1+(ln 3)/2 for such instances, and we prove that the integrality gap of our LP is between 8/7 and (2b+1)/(b+1).

Submitted 20 December, 2007; originally announced December 2007.

Comments: Submitted to Math. Prog

arXiv:0708.0580 [pdf, ps, other]

Efficient Divide-and-Conquer Implementations Of Symmetric FSAs

Authors: David Pritchard

Abstract: A deterministic finite-state automaton (FSA) is an abstract sequential machine that reads the symbols comprising an input word one at a time. An FSA is symmetric if its output is independent of the order in which the input symbols are read, i.e., if the output is invariant under permutations of the input. We show how to convert a symmetric FSA A into an automaton-like divide-and-conquer process whose intermediate results are no larger than the size of A's memory. In comparison, a similar result for general FSAs has been long known via functional composition, but entails an exponential increase in memory size. The new result has applications to parallel processing and symmetric FSA networks.

Submitted 5 August, 2010; v1 submitted 3 August, 2007; originally announced August 2007.

Journal ref: Journal of Cellular Automata 5(6) (special issue for Automata 2007, H. Fuks & A. T. Lawniczak, eds), pages 481-490, 2010

arXiv:cs/0702114 [pdf, ps, other]

Nearest Neighbor Network Traversal

Authors: David Pritchard

Abstract: A mobile agent in a network wants to visit every node of an n-node network, using a small number of steps. We investigate the performance of the following "nearest neighbor" heuristic: always go to the nearest unvisited node. If the network graph never changes, then from (Rosenkrantz, Stearns and Lewis, 1977) and (Hurkens and Woeginger, 2004) it follows that Theta(n log n) steps are necessary and sufficient in the worst case. We give a simpler proof of the upper bound and an example that improves the best known lower bound. We investigate how the performance of this heuristic changes when it is distributively implemented in a network. Even if network edges are allowed to fail over time, we show that the nearest neighbor strategy never runs for more than O(n^2) iterations. We also show that any strategy can be forced to take at least n(n-1)/2 steps before all nodes are visited, if the edges of the network are deleted in an adversarial way.

Submitted 19 February, 2007; originally announced February 2007.
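
The "nearest neighbor" heuristic from the abstract above can be sketched for the static case (a graph that never changes) as follows. This is a minimal centralized illustration under stated assumptions: the graph is connected, undirected and unweighted, it is given as an adjacency dictionary, and the function name is hypothetical. It does not model the distributed implementation or failing edges that the paper studies.

```python
from collections import deque

def nearest_neighbor_traversal(adj, start):
    """Visit every node, always walking to the nearest unvisited node.

    adj maps each node to a list of its neighbours.
    Returns (visit_order, total_steps); one step is one edge traversal.
    """
    visited = {start}
    order = [start]
    steps = 0
    current = start
    while len(visited) < len(adj):
        # BFS discovers nodes in nondecreasing distance order, so the first
        # unvisited node discovered is a nearest unvisited node.
        dist = {current: 0}
        queue = deque([current])
        target = None
        while queue and target is None:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    if v not in visited:
                        target = v
                        break
                    queue.append(v)
        if target is None:
            raise ValueError("graph is not connected")
        steps += dist[target]
        visited.add(target)
        order.append(target)
        current = target
    return order, steps

# A path 0 - 1 - 2 - 3 - 4, starting in the middle.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
path_order, path_steps = nearest_neighbor_traversal(path_graph, 2)
```

On this path the agent first walks left to the end and then has to backtrack, which uses 6 steps where 5 would suffice; the Theta(n log n) worst case quoted in the abstract comes from instances where such backtracking compounds.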

arXiv:cs/0702113 [pdf, ps, other]

Fast Computation of Small Cuts via Cycle Space Sampling

Authors: David Pritchard, Ramakrishna Thurimella

Abstract: We describe a new sampling-based method to determine cuts in an undirected graph. For a graph (V, E), its cycle space is the family of all subsets of E that have even degree at each vertex. We prove that with high probability, sampling the cycle space identifies the cuts of a graph. This leads to simple new linear-time sequential algorithms for finding all cut edges and cut pairs (a set of 2 edges that form a cut) of a graph. In the model of distributed computing in a graph G=(V, E) with O(log V)-bit messages, our approach yields faster algorithms for several problems. The diameter of G is denoted by Diam, and the maximum degree by Delta. We obtain simple O(Diam)-time distributed algorithms to find all cut edges, 2-edge-connected components, and cut pairs, matching or improving upon previous time bounds. Under natural conditions these new algorithms are universally optimal, i.e., an Omega(Diam)-time lower bound holds on every graph. We obtain an O(Diam+Delta/log V)-time distributed algorithm for finding cut vertices; this is faster than the best previous algorithm when Delta, Diam = O(sqrt(V)). A simple extension of our work yields the first distributed algorithm with sub-linear time for 3-edge-connected components. The basic distributed algorithms are Monte Carlo, but they can be made Las Vegas without increasing the asymptotic complexity. In the model of parallel computing on the EREW PRAM our approach yields a simple algorithm with optimal time complexity O(log V) for finding cut pairs and 3-edge-connected components.

Submitted 21 July, 2010; v1 submitted 19 February, 2007; originally announced February 2007.

Comments: Previous version appeared in Proc. 35th ICALP, pages 145--160, 2008

ACM Class: F.1.2; F.2.2; G.2.2; G.3
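
One way to picture the sampling idea in the abstract above is the sequential sketch below for cut edges. It XORs together random members of the cycle space by giving every non-tree edge of a spanning tree a random bit string and propagating those labels up the tree. A cut edge lies on no cycle, so its label is always 0, while any other edge gets label 0 only with probability 2^-bits. The function name, the 64-bit default, and the choice of spanning tree are assumptions made for illustration; this is a simplified reading of the approach, not the paper's exact algorithm, and it assumes a connected graph on vertices 0..n-1.

```python
import random

def probable_cut_edges(n, edges, bits=64, seed=None):
    """Monte Carlo sketch: return indices of edges whose random cycle-space label is 0."""
    rng = random.Random(seed)
    adj = [[] for _ in range(n)]
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))

    # Build any spanning tree with a stack-based search from vertex 0.
    parent = [None] * n
    parent_edge = [None] * n
    seen = [False] * n
    seen[0] = True
    order = []
    stack = [0]
    while stack:
        u = stack.pop()
        order.append(u)
        for v, i in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                parent_edge[v] = i
                stack.append(v)

    tree = {i for i in parent_edge if i is not None}
    label = [0] * len(edges)
    acc = [0] * n  # XOR of non-tree labels at each vertex, later over its subtree
    for i, (u, v) in enumerate(edges):
        if i not in tree:
            label[i] = rng.getrandbits(bits)
            acc[u] ^= label[i]
            acc[v] ^= label[i]

    # Children appear after their parents in `order`, so go through it backwards.
    for v in reversed(order):
        if parent_edge[v] is not None:
            # Non-tree edges with both ends inside v's subtree cancel out, leaving
            # exactly those whose fundamental cycle uses the edge from v to its parent.
            label[parent_edge[v]] = acc[v]
            acc[parent[v]] ^= acc[v]

    return [i for i, x in enumerate(label) if x == 0]

# Two triangles joined by the single edge (2, 3), which is the only cut edge.
two_triangles = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]
bridges = probable_cut_edges(6, two_triangles, seed=1)
```

In the same spirit, two edges that form a cut pair receive equal labels, which is why a single round of labels can be reused to look for cut pairs as well.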

arXiv:cs/0602013 [pdf, ps, other]

An Optimal Distributed Edge-Biconnectivity Algorithm

Authors: David Pritchard

Abstract: We describe a synchronous distributed algorithm which identifies the edge-biconnected components of a connected network. It requires a leader, and uses messages of size O(log |V|). The main idea is to preorder a BFS spanning tree, and then to efficiently compute least common ancestors so as to mark cycle edges. This algorithm takes O(Diam) time and uses O(|E|) messages. Furthermore, we show that no correct singly-initiated edge-biconnectivity algorithm can beat either bound on any graph by more than a constant factor. We also describe a near-optimal local algorithm for edge-biconnectivity.

Submitted 5 February, 2006; originally announced February 2006.

Comments: Submitted to PODC 2006. Contains a pstricks figure

Showing 1–17 of 17 results for author: Pritchard, D