Search | arXiv e-print repository

Finding Near-Optimal Weight Independent Sets at Scale

Authors: Ernestine Großmann, Sebastian Lamm, Christian Schulz, Darren Strash

Abstract: Computing maximum weight independent sets in graphs is an important NP-hard optimization problem. The problem is particularly difficult to solve in large graphs for which data reduction techniques do not work well. To be more precise, state-of-the-art branch-and-reduce algorithms can solve many large-scale graphs if reductions are applicable. Otherwise, their performance quickly degrades due to br… ▽ More Computing maximum weight independent sets in graphs is an important NP-hard optimization problem. The problem is particularly difficult to solve in large graphs for which data reduction techniques do not work well. To be more precise, state-of-the-art branch-and-reduce algorithms can solve many large-scale graphs if reductions are applicable. Otherwise, their performance quickly degrades due to branching requiring exponential time. In this paper, we develop an advanced memetic algorithm to tackle the problem, which incorporates recent data reduction techniques to compute near-optimal weighted independent sets in huge sparse networks. More precisely, we use a memetic approach to recursively choose vertices that are likely to be in a large-weight independent set. We include these vertices into the solution, and further reduce the graph. We show that identifying and removing vertices likely to be in large-weight independent sets opens up the reduction space and speeds up the computation of large-weight independent sets remarkably. Our experimental evaluation indicates that we are able to outperform state-of-the-art algorithms. For example, our two algorithm configurations compute the best results among all competing algorithms for 205 out of 207 instances. Thus can be seen as a useful tool when large-weight independent sets need to be computed in~practice. △ Less

Submitted 21 April, 2023; v1 submitted 29 August, 2022; originally announced August 2022.

arXiv:2102.01540 [pdf, other]

Targeted Branching for the Maximum Independent Set Problem

Authors: Demian Hespe, Sebastian Lamm, Christian Schorr

Abstract: Finding a maximum independent set is a fundamental NP-hard problem that is used in many real-world applications. Given an unweighted graph, this problem asks for a maximum cardinality set of pairwise non-adjacent vertices. Some of the most successful algorithms for this problem are based on the branch-and-bound or branch-and-reduce paradigms. In particular, branch-and-reduce algorithms, which comb… ▽ More Finding a maximum independent set is a fundamental NP-hard problem that is used in many real-world applications. Given an unweighted graph, this problem asks for a maximum cardinality set of pairwise non-adjacent vertices. Some of the most successful algorithms for this problem are based on the branch-and-bound or branch-and-reduce paradigms. In particular, branch-and-reduce algorithms, which combine branch-and-bound with reduction rules, achieved substantial results, solving many previously infeasible instances. These results were to a large part achieved by develo** new, more practical reduction rules. However, other components that have been shown to have an impact on the performance of these algorithms have not received as much attention. One of these is the branching strategy, which determines what vertex is included or excluded in a potential solution. The most commonly used strategy selects vertices based on their degree and does not take into account other factors that contribute to the performance. In this work, we develop and evaluate several novel branching strategies for both branch-and-bound and branch-and-reduce algorithms. Our strategies are based on one of two approaches. They either (1) aim to decompose the graph into two or more connected components which can then be solved independently, or (2) try to remove vertices that hinder the application of a reduction rule. Our experiments on a large set of real-world instances indicate that our strategies are able to improve the performance of the state-of-the-art branch-and-reduce algorithms. To be more specific, our reduction-based packing branching rule is able to outperform the default branching strategy of selecting a vertex of highest degree on 65% of all instances tested. Furthermore, our decomposition-based strategy based on edge cuts is able to achieve a speedup of 2.29 on sparse networks (1.22 on all instances). △ Less

Submitted 29 March, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

arXiv:2012.12594 [pdf, ps, other]

Recent Advances in Practical Data Reduction

Authors: Faisal Abu-Khzam, Sebastian Lamm, Matthias Mnich, Alexander Noe, Christian Schulz, Darren Strash

Abstract: Over the last two decades, significant advances have been made in the design and analysis of fixed-parameter algorithms for a wide variety of graph-theoretic problems. This has resulted in an algorithmic toolbox that is by now well-established. However, these theoretical algorithmic ideas have received very little attention from the practical perspective. We survey recent trends in data reduction… ▽ More Over the last two decades, significant advances have been made in the design and analysis of fixed-parameter algorithms for a wide variety of graph-theoretic problems. This has resulted in an algorithmic toolbox that is by now well-established. However, these theoretical algorithmic ideas have received very little attention from the practical perspective. We survey recent trends in data reduction engineering results for selected problems. Moreover, we describe concrete techniques that may be useful for future implementations in the area and give open problems and research questions. △ Less

Submitted 31 December, 2020; v1 submitted 23 December, 2020; originally announced December 2020.

arXiv:2008.05180 [pdf, other]

Boosting Data Reduction for the Maximum Weight Independent Set Problem Using Increasing Transformations

Authors: Alexander Gellner, Sebastian Lamm, Christian Schulz, Darren Strash, Bogdán Zaválnij

Abstract: Given a vertex-weighted graph, the maximum weight independent set problem asks for a pair-wise non-adjacent set of vertices such that the sum of their weights is maximum. The branch-and-reduce paradigm is the de facto standard approach to solve the problem to optimality in practice. In this paradigm, data reduction rules are applied to decrease the problem size. These data reduction rules ensure t… ▽ More Given a vertex-weighted graph, the maximum weight independent set problem asks for a pair-wise non-adjacent set of vertices such that the sum of their weights is maximum. The branch-and-reduce paradigm is the de facto standard approach to solve the problem to optimality in practice. In this paradigm, data reduction rules are applied to decrease the problem size. These data reduction rules ensure that given an optimum solution on the new (smaller) input, one can quickly construct an optimum solution on the original input. We introduce new generalized data reduction and transformation rules for the problem. A key feature of our work is that some transformation rules can increase the size of the input. Surprisingly, these so-called increasing transformations can simplify the problem and also open up the reduction space to yield even smaller irreducible graphs later throughout the algorithm. In experiments, our algorithm computes significantly smaller irreducible graphs on all except one instance, solves more instances to optimality than previously possible, is up to two orders of magnitude faster than the best state-of-the-art solver, and finds higher-quality solutions than heuristic solvers DynWVC and HILS on many instances. While the increasing transformations are only efficient enough for preprocessing at this time, we see this as a critical initial step towards a new branch-and-transform paradigm. △ Less

Submitted 13 August, 2020; v1 submitted 12 August, 2020; originally announced August 2020.

arXiv:2003.00736 [pdf, other]

Recent Advances in Scalable Network Generation

Authors: Manuel Penschuck, Ulrik Brandes, Michael Hamann, Sebastian Lamm, Ulrich Meyer, Ilya Safro, Peter Sanders, Christian Schulz

Abstract: Random graph models are frequently used as a controllable and versatile data source for experimental campaigns in various research fields. Generating such data-sets at scale is a non-trivial task as it requires design decisions typically spanning multiple areas of expertise. Challenges begin with the identification of relevant domain-specific network features, continue with the question of how to… ▽ More Random graph models are frequently used as a controllable and versatile data source for experimental campaigns in various research fields. Generating such data-sets at scale is a non-trivial task as it requires design decisions typically spanning multiple areas of expertise. Challenges begin with the identification of relevant domain-specific network features, continue with the question of how to compile such features into a tractable model, and culminate in algorithmic details arising while implementing the pertaining model. In the present survey, we explore crucial aspects of random graph models with known scalable generators. We begin by briefly introducing network features considered by such models, and then discuss random graphs alongside with generation algorithms. Our focus lies on modelling techniques and algorithmic primitives that have proven successful in obtaining massive graphs. We consider concepts and graph models for various domains (such as social network, infrastructure, ecology, and numerical simulations), and discuss generators for different models of computation (including shared-memory parallelism, massive-parallel GPUs, and distributed systems). △ Less

Submitted 2 March, 2020; originally announced March 2020.

arXiv:1908.06795 [pdf, other]

WeGotYouCovered: The Winning Solver from the PACE 2019 Implementation Challenge, Vertex Cover Track

Authors: Demian Hespe, Sebastian Lamm, Christian Schulz, Darren Strash

Abstract: We present the winning solver of the PACE 2019 Implementation Challenge, Vertex Cover Track. The minimum vertex cover problem is one of a handful of problems for which kernelization---the repeated reducing of the input size via data reduction rules---is known to be highly effective in practice. Our algorithm uses a portfolio of techniques, including an aggressive kernelization strategy, local sear… ▽ More We present the winning solver of the PACE 2019 Implementation Challenge, Vertex Cover Track. The minimum vertex cover problem is one of a handful of problems for which kernelization---the repeated reducing of the input size via data reduction rules---is known to be highly effective in practice. Our algorithm uses a portfolio of techniques, including an aggressive kernelization strategy, local search, branch-and-reduce, and a state-of-the-art branch-and-bound solver. Of particular interest is that several of our techniques were not from the literature on the vertex over problem: they were originally published to solve the (complementary) maximum independent set and maximum clique problems. Aside from illustrating our solver's performance in the PACE 2019 Implementation Challenge, our experiments provide several key insights not yet seen before in the literature. First, kernelization can boost the performance of branch-and-bound clique solvers enough to outperform branch-and-reduce solvers. Second, local search can significantly boost the performance of branch-and-reduce solvers. And finally, somewhat surprisingly, kernelization can sometimes make branch-and-bound algorithms perform worse than running branch-and-bound alone. △ Less

Submitted 20 August, 2019; v1 submitted 19 August, 2019; originally announced August 2019.

arXiv:1905.10902 [pdf, other]

Engineering Kernelization for Maximum Cut

Authors: Damir Ferizovic, Demian Hespe, Sebastian Lamm, Matthias Mnich, Christian Schulz, Darren Strash

Abstract: Kernelization is a general theoretical framework for preprocessing instances of NP-hard problems into (generally smaller) instances with bounded size, via the repeated application of data reduction rules. For the fundamental Max Cut problem, kernelization algorithms are theoretically highly efficient for various parameterizations. However, the efficacy of these reduction rules in practice---to aid… ▽ More Kernelization is a general theoretical framework for preprocessing instances of NP-hard problems into (generally smaller) instances with bounded size, via the repeated application of data reduction rules. For the fundamental Max Cut problem, kernelization algorithms are theoretically highly efficient for various parameterizations. However, the efficacy of these reduction rules in practice---to aid solving highly challenging benchmark instances to optimality---remains entirely unexplored. We engineer a new suite of efficient data reduction rules that subsume most of the previously published rules, and demonstrate their significant impact on benchmark data sets, including synthetic instances, and data sets from the VLSI and image segmentation application domains. Our experiments reveal that current state-of-the-art solvers can be sped up by up to multiple orders of magnitude when combined with our data reduction rules. On social and biological networks in particular, kernelization enables us to solve four instances that were previously unsolved in a ten-hour time limit with state-of-the-art solvers; three of these instances are now solved in less than two seconds. △ Less

Submitted 26 May, 2019; originally announced May 2019.

Comments: 16 pages, 4 tables, 2 figures

ACM Class: F.2.2; G.2.2

arXiv:1810.10834 [pdf, other]

Exactly Solving the Maximum Weight Independent Set Problem on Large Real-World Graphs

Authors: Sebastian Lamm, Christian Schulz, Darren Strash, Robert Williger, Huashuo Zhang

Abstract: One powerful technique to solve NP-hard optimization problems in practice is branch-and-reduce search---which is branch-and-bound that intermixes branching with reductions to decrease the input size. While this technique is known to be very effective in practice for unweighted problems, very little is known for weighted problems, in part due to a lack of known effective reductions. In this work, w… ▽ More One powerful technique to solve NP-hard optimization problems in practice is branch-and-reduce search---which is branch-and-bound that intermixes branching with reductions to decrease the input size. While this technique is known to be very effective in practice for unweighted problems, very little is known for weighted problems, in part due to a lack of known effective reductions. In this work, we develop a full suite of new reductions for the maximum weight independent set problem and provide extensive experiments to show their effectiveness in practice on real-world graphs of up to millions of vertices and edges. Our experiments indicate that our approach is able to outperform existing state-of-the-art algorithms, solving many instances that were previously infeasible. In particular, we show that branch-and-reduce is able to solve a large number of instances up to two orders of magnitude faster than existing (inexact) local search algorithms---and is able to solve the majority of instances within 15 minutes. For those instances remaining infeasible, we show that combining kernelization with local search produces higher-quality solutions than local search alone. △ Less

Submitted 25 October, 2018; originally announced October 2018.

arXiv:1710.07565 [pdf, other]

Communication-free Massively Distributed Graph Generation

Authors: Daniel Funke, Sebastian Lamm, Ulrich Meyer, Peter Sanders, Manuel Penschuck, Christian Schulz, Darren Strash, Moritz von Looz

Abstract: Analyzing massive complex networks yields promising insights about our everyday lives. Building scalable algorithms to do so is a challenging task that requires a careful analysis and an extensive evaluation. However, engineering such algorithms is often hindered by the scarcity of publicly~available~datasets. Network generators serve as a tool to alleviate this problem by providing synthetic in… ▽ More Analyzing massive complex networks yields promising insights about our everyday lives. Building scalable algorithms to do so is a challenging task that requires a careful analysis and an extensive evaluation. However, engineering such algorithms is often hindered by the scarcity of publicly~available~datasets. Network generators serve as a tool to alleviate this problem by providing synthetic instances with controllable parameters. However, many network generators fail to provide instances on a massive scale due to their sequential nature or resource constraints. Additionally, truly scalable network generators are few and often limited in their realism. In this work, we present novel generators for a variety of network models that are frequently used as benchmarks. By making use of pseudorandomization and divide-and-conquer schemes, our generators follow a communication-free paradigm. The resulting generators are thus embarrassingly parallel and have a near optimal scaling behavior. This allows us to generate instances of up to $2^{43}$ vertices and $2^{47}$ edges in less than 22 minutes on 32768 cores. Therefore, our generators allow new graph families to be used on an unprecedented scale. △ Less

Submitted 18 March, 2019; v1 submitted 20 October, 2017; originally announced October 2017.

arXiv:1610.05141 [pdf, other]

doi 10.1145/3157734

Efficient Random Sampling -- Parallel, Vectorized, Cache-Efficient, and Online

Authors: Peter Sanders, Sebastian Lamm, Lorenz Hübschle-Schneider, Emanuel Schrade, Carsten Dachsbacher

Abstract: We consider the problem of sampling $n$ numbers from the range $\{1,\ldots,N\}$ without replacement on modern architectures. The main result is a simple divide-and-conquer scheme that makes sequential algorithms more cache efficient and leads to a parallel algorithm running in expected time $\mathcal{O}(n/p+\log p)$ on $p$ processors, i.e., scales to massively parallel machines even for moderate v… ▽ More We consider the problem of sampling $n$ numbers from the range $\{1,\ldots,N\}$ without replacement on modern architectures. The main result is a simple divide-and-conquer scheme that makes sequential algorithms more cache efficient and leads to a parallel algorithm running in expected time $\mathcal{O}(n/p+\log p)$ on $p$ processors, i.e., scales to massively parallel machines even for moderate values of $n$. The amount of communication between the processors is very small (at most $\mathcal{O}(\log p)$) and independent of the sample size. We also discuss modifications needed for load balancing, online sampling, sampling with replacement, Bernoulli sampling, and vectorization on SIMD units or GPUs. △ Less

Submitted 15 November, 2019; v1 submitted 17 October, 2016; originally announced October 2016.

ACM Class: G.4; G.3; G.2

Journal ref: ACM Transactions on Mathematical Software (TOMS), Volume 44, Issue 3 (April 2018), pages 29:1-29:14

arXiv:1608.05634 [pdf, other]

Thrill: High-Performance Algorithmic Distributed Batch Data Processing with C++

Authors: Timo Bingmann, Michael Axtmann, Emanuel Jöbstl, Sebastian Lamm, Huyen Chau Nguyen, Alexander Noe, Sebastian Schlag, Matthias Stumpp, Tobias Sturm, Peter Sanders

Abstract: We present the design and a first performance evaluation of Thrill -- a prototype of a general purpose big data processing framework with a convenient data-flow style programming interface. Thrill is somewhat similar to Apache Spark and Apache Flink with at least two main differences. First, Thrill is based on C++ which enables performance advantages due to direct native code compilation, a more c… ▽ More We present the design and a first performance evaluation of Thrill -- a prototype of a general purpose big data processing framework with a convenient data-flow style programming interface. Thrill is somewhat similar to Apache Spark and Apache Flink with at least two main differences. First, Thrill is based on C++ which enables performance advantages due to direct native code compilation, a more cache-friendly memory layout, and explicit memory management. In particular, Thrill uses template meta-programming to compile chains of subsequent local operations into a single binary routine without intermediate buffering and with minimal indirections. Second, Thrill uses arrays rather than multisets as its primary data structure which enables additional operations like sorting, prefix sums, window scans, or combining corresponding fields of several arrays (zip**). We compare Thrill with Apache Spark and Apache Flink using five kernels from the HiBench suite. Thrill is consistently faster and often several times faster than the other frameworks. At the same time, the source codes have a similar level of simplicity and abstraction △ Less

Submitted 19 August, 2016; originally announced August 2016.

ACM Class: D.1.3; E.1; H.2.4

arXiv:1602.01659 [pdf, other]

Accelerating Local Search for the Maximum Independent Set Problem

Authors: Jakob Dahlum, Sebastian Lamm, Peter Sanders, Christian Schulz, Darren Strash, Renato F. Werneck

Abstract: Computing high-quality independent sets quickly is an important problem in combinatorial optimization. Several recent algorithms have shown that kernelization techniques can be used to find exact maximum independent sets in medium-sized sparse graphs, as well as high-quality independent sets in huge sparse graphs that are intractable for exact (exponential-time) algorithms. However, a major drawba… ▽ More Computing high-quality independent sets quickly is an important problem in combinatorial optimization. Several recent algorithms have shown that kernelization techniques can be used to find exact maximum independent sets in medium-sized sparse graphs, as well as high-quality independent sets in huge sparse graphs that are intractable for exact (exponential-time) algorithms. However, a major drawback of these algorithms is that they require significant preprocessing overhead, and therefore cannot be used to find a high-quality independent set quickly. In this paper, we show that performing simple kernelization techniques in an online fashion significantly boosts the performance of local search, and is much faster than pre-computing a kernel using advanced techniques. In addition, we show that cutting high-degree vertices can boost local search performance even further, especially on huge (sparse) complex networks. Our experiments show that we can drastically speed up the computation of large independent sets compared to other state-of-the-art algorithms, while also producing results that are very close to the best known solutions. △ Less

Submitted 4 February, 2016; originally announced February 2016.

Comments: 17 pages, 2 figures, 3 tables

ACM Class: F.2.2; G.2.2

arXiv:1509.00764 [pdf, other]

Finding Near-Optimal Independent Sets at Scale

Authors: Sebastian Lamm, Peter Sanders, Christian Schulz, Darren Strash, Renato F. Werneck

Abstract: The independent set problem is NP-hard and particularly difficult to solve in large sparse graphs. In this work, we develop an advanced evolutionary algorithm, which incorporates kernelization techniques to compute large independent sets in huge sparse networks. A recent exact algorithm has shown that large networks can be solved exactly by employing a branch-and-reduce technique that recursively… ▽ More The independent set problem is NP-hard and particularly difficult to solve in large sparse graphs. In this work, we develop an advanced evolutionary algorithm, which incorporates kernelization techniques to compute large independent sets in huge sparse networks. A recent exact algorithm has shown that large networks can be solved exactly by employing a branch-and-reduce technique that recursively kernelizes the graph and performs branching. However, one major drawback of their algorithm is that, for huge graphs, branching still can take exponential time. To avoid this problem, we recursively choose vertices that are likely to be in a large independent set (using an evolutionary approach), then further kernelize the graph. We show that identifying and removing vertices likely to be in large independent sets opens up the reduction space---which not only speeds up the computation of large independent sets drastically, but also enables us to compute high-quality independent sets on much larger instances than previously reported in the literature. △ Less

Submitted 2 September, 2015; originally announced September 2015.

Comments: 17 pages, 1 figure, 8 tables. arXiv admin note: text overlap with arXiv:1502.01687

ACM Class: F.2.2; G.2.2

arXiv:1502.01687 [pdf, other]

Graph Partitioning for Independent Sets

Authors: Sebastian Lamm, Peter Sanders, Christian Schulz

Abstract: Computing maximum independent sets in graphs is an important problem in computer science. In this paper, we develop an evolutionary algorithm to tackle the problem. The core innovations of the algorithm are very natural combine operations based on graph partitioning and local search algorithms. More precisely, we employ a state-of-the-art graph partitioner to derive operations that enable us to qu… ▽ More Computing maximum independent sets in graphs is an important problem in computer science. In this paper, we develop an evolutionary algorithm to tackle the problem. The core innovations of the algorithm are very natural combine operations based on graph partitioning and local search algorithms. More precisely, we employ a state-of-the-art graph partitioner to derive operations that enable us to quickly exchange whole blocks of given independent sets. To enhance newly computed offsprings we combine our operators with a local search algorithm. Our experimental evaluation indicates that we are able to outperform state-of-the-art algorithms on a variety of instances. △ Less

Submitted 5 February, 2015; originally announced February 2015.

Showing 1–14 of 14 results for author: Lamm, S