-
Stale Profile Matching
Authors:
Amir Ayupov,
Maksim Panchenko,
Sergey Pupyrev
Abstract:
Profile-guided optimizations rely on profile data for directing compilers to generate optimized code. To achieve the maximum performance boost, profile data needs to be collected on the same version of the binary that is being optimized. In practice however, there is typically a gap between the profile collection and the release, which makes a portion of the profile invalid for optimizations. This phenomenon is known as profile staleness, and it is a serious practical problem for data-center workloads both for compilers and binary optimizers.
In this paper we thoroughly study the staleness problem and propose the first practical solution for utilizing profiles collected on binaries built from several revisions behind the release. Our algorithm is developed and implemented in a mainstream open-source post-link optimizer, BOLT. An extensive evaluation on a variety of standalone benchmarks and production services indicates that the new method recovers up to $0.8$ of the maximum BOLT benefit, even when most of the input profile data is stale and would have been discarded by the optimizer otherwise.
Submitted 30 January, 2024;
originally announced January 2024.
-
Linear Layouts of Bipartite Planar Graphs
Authors:
Henry Förster,
Michael Kaufmann,
Laura Merker,
Sergey Pupyrev,
Chrysanthi Raftopoulou
Abstract:
A linear layout of a graph $ G $ consists of a linear order $\prec$ of the vertices and a partition of the edges. A part is called a queue (stack) if no two edges nest (cross), that is, two edges $ (v,w) $ and $ (x,y) $ with $ v \prec x \prec y \prec w $ ($ v \prec x \prec w \prec y $) may not be in the same queue (stack). The best known lower and upper bounds for the number of queues needed for planar graphs are 4 [Alam et al., Algorithmica 2020] and 42 [Bekos et al., Algorithmica 2022], respectively. While queue layouts of special classes of planar graphs have received increased attention following the breakthrough result of [Dujmović et al., J. ACM 2020], the meaningful class of bipartite planar graphs has remained elusive so far, explicitly asked for by Bekos et al. In this paper we investigate bipartite planar graphs and give an improved upper bound of 28 by refining existing techniques. In contrast, we show that two queues or one queue together with one stack do not suffice; the latter answers an open question by Pupyrev [GD 2018]. We further investigate subclasses of bipartite planar graphs and give improved upper bounds; in particular we construct 5-queue layouts for 2-degenerate quadrangulations.
Submitted 25 May, 2023;
originally announced May 2023.
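The queue and stack conditions defined in the abstract above can be checked mechanically. The following is a minimal sketch, not code from the paper; the function names (`is_queue`, `is_stack`) and the representation of a layout as a Python list plus edge tuples are assumptions made for illustration.

```python
from itertools import combinations

def _span(edge, pos):
    """Return the (left, right) positions of an edge's endpoints in the order."""
    a, b = pos[edge[0]], pos[edge[1]]
    return (a, b) if a < b else (b, a)

def _nest(e, f, pos):
    # Two edges nest if one lies strictly inside the other: v < x < y < w.
    (a, b), (c, d) = sorted([_span(e, pos), _span(f, pos)])
    return a < c and d < b

def _cross(e, f, pos):
    # Two edges cross if their endpoints alternate: v < x < w < y.
    (a, b), (c, d) = sorted([_span(e, pos), _span(f, pos)])
    return a < c < b < d

def is_queue(edges, order):
    """True if no two edges of the part nest under the given vertex order."""
    pos = {v: i for i, v in enumerate(order)}
    return not any(_nest(e, f, pos) for e, f in combinations(edges, 2))

def is_stack(edges, order):
    """True if no two edges of the part cross under the given vertex order."""
    pos = {v: i for i, v in enumerate(order)}
    return not any(_cross(e, f, pos) for e, f in combinations(edges, 2))

order = [1, 2, 3, 4]
print(is_queue([(1, 4), (2, 3)], order))  # False: (2,3) is nested inside (1,4)
print(is_stack([(1, 4), (2, 3)], order))  # True: the two edges do not cross
```

The pairwise check takes quadratic time in the number of edges of a part, which is enough for verifying small examples.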
-
Optimizing Function Layout for Mobile Applications
Authors:
Ellis Hoag,
Kyungwoo Lee,
Julián Mestre,
Sergey Pupyrev
Abstract:
Function layout, also referred to as function reordering or function placement, is one of the most effective profile-guided compiler optimizations. By reordering functions in a binary, compilers are able to greatly improve the performance of large-scale applications or reduce the compressed size of mobile applications. Although the technique has been studied in the context of large-scale binaries, no recent study has investigated the impact of function layout on mobile applications.
In this paper we develop the first principled solution for optimizing function layouts in the mobile space. To this end, we identify two important optimization goals, the compressed code size and the cold start-up time of a mobile application. Then we propose a formal model for the layout problem, whose objective closely matches the goals. Our novel algorithm to optimize the layout is inspired by the classic balanced graph partitioning problem. We carefully engineer and implement the algorithm in an open source compiler, LLVM. An extensive evaluation of the new method on large commercial mobile applications indicates up to 2% compressed size reduction and up to 3% start-up time improvement on top of the state-of-the-art approach.
Submitted 16 November, 2022;
originally announced November 2022.
-
Minimum Coverage Instrumentation
Authors:
Li Chen,
Ellis Hoag,
Kyungwoo Lee,
Julian Mestre,
Sergey Pupyrev
Abstract:
Modern compilers leverage block coverage profile data to carry out downstream profile-guided optimizations to improve the runtime performance and the size of a binary. Given a control-flow graph $G=(V, E)$ of a function in the binary, where nodes in $V$ correspond to basic blocks (sequences of instructions that are always executed sequentially) and edges in $E$ represent jumps in the control flow, the goal is to know for each block $u \in V$ whether $u$ was executed during a session. To this end, extra instrumentation code that records when a block is executed needs to be added to the binary. This extra code creates a time and space overhead, which one would like to minimize as much as possible. Motivated by this application, we study the Minimum Coverage Instrumentation problem, where the goal is to find a minimum size subset of blocks to instrument such that the coverage of the remaining blocks in the graph can be inferred from the coverage status of the instrumented subset. Our main result is an algorithm to find an optimal instrumentation strategy and to carry out the inference in $O(|E|)$ time. We also study variants of this basic problem in which we are interested in learning the coverage of edges instead of the nodes, or when we are only allowed to instrument edges instead of the nodes.
Submitted 29 August, 2022;
originally announced August 2022.
-
Queue Layouts of Two-Dimensional Posets
Authors:
Sergey Pupyrev
Abstract:
The queue number of a poset is the queue number of its cover graph when the vertex order is a linear extension of the poset. Heath and Pemmaraju conjectured that every poset of width $w$ has queue number at most $w$. The conjecture has been confirmed for posets of width $w=2$ and for planar posets with $0$ and $1$. In contrast, the conjecture has been refuted by a family of general (non-planar) posets of width $w>2$.
In this paper, we study queue layouts of two-dimensional posets. First, we construct a two-dimensional poset of width $w > 2$ with queue number $2(w - 1)$, thereby disproving the conjecture for two-dimensional posets. Second, we show an upper bound of $w(w+1)/2$ on the queue number of such posets, thus improving the previously best-known bound of $(w-1)^2+1$ for every $w > 3$.
Submitted 26 August, 2022;
originally announced August 2022.
-
Robust and fair work allocation
Authors:
Amine Allouah,
Christian Kroer,
Xuan Zhang,
Vashist Avadhanula,
Anil Dania,
Caner Gocmen,
Sergey Pupyrev,
Parikshit Shah,
Nicolas Stier
Abstract:
In today's digital world, interaction with online platforms is ubiquitous, and thus content moderation is important for protecting users from content that does not comply with pre-established community guidelines. Having a robust content moderation system throughout every stage of planning is particularly important. We study the short-term planning problem of allocating human content reviewers to different harmful content categories. We use tools from fair division and study the application of competitive equilibrium and leximin allocation rules. Furthermore, we incorporate into the traditional Fisher market setup novel aspects that are of practical importance. The first aspect is the forecasted workload of different content categories. We show how a formulation that is inspired by the celebrated Eisenberg-Gale program allows us to find an allocation that not only satisfies the forecasted workload, but also fairly allocates the remaining reviewing hours among all content categories. The resulting allocation is also robust, as the additional allocation provides a guardrail in cases where the actual workload deviates from the predicted workload. The second practical consideration is time-dependent allocation, which is motivated by the fact that partners need scheduling guidance for the reviewers across days to achieve efficiency.
To address the time component, we introduce new extensions of the various fair allocation approaches for the single-time period setting, and we show that many properties extend in essence, albeit with some modifications. Related to the time component, we additionally investigate how to satisfy markets' desire for smooth allocation (e.g., partners for content reviewers prefer an allocation that does not vary much from time to time, to minimize staffing switch). We demonstrate the performance of our proposed approaches through real-world data obtained from Meta.
Submitted 14 February, 2022; v1 submitted 10 February, 2022;
originally announced February 2022.
-
Matching Algorithms for Blood Donation
Authors:
Duncan C McElfresh,
Christian Kroer,
Sergey Pupyrev,
Eric Sodomka,
Karthik Sankararaman,
Zack Chauvin,
Neil Dexter,
John P Dickerson
Abstract:
Global demand for donated blood far exceeds supply, and unmet need is greatest in low- and middle-income countries; experts suggest that large-scale coordination is necessary to alleviate demand. Using the Facebook Blood Donation tool, we conduct the first large-scale algorithmic matching of blood donors with donation opportunities. While measuring actual donation rates remains a challenge, we measure donor action (e.g., making a donation appointment) as a proxy for actual donation. We develop automated policies for matching donors with donation opportunities, based on an online matching model. We provide theoretical guarantees for these policies, both regarding the number of expected donations and the equitable treatment of blood recipients. In simulations, a simple matching strategy increases the number of donations by 5-10%; a pilot experiment with real donors shows a 5% relative increase in donor action rate (from 3.7% to 3.9%). When scaled to the global Blood Donation tool user base, this corresponds to an increase of around one hundred thousand users taking action toward donation. Further, observing donor action on a social network can shed light onto donor behavior and response to incentives. Our initial findings align with several observations made in the medical and social science literature regarding donor behavior.
Submitted 13 August, 2021; v1 submitted 10 August, 2021;
originally announced August 2021.
-
On Families of Planar DAGs with Constant Stack Number
Authors:
Martin Nöllenburg,
Sergey Pupyrev
Abstract:
A $k$-stack layout (or $k$-page book embedding) of a graph consists of a total order of the vertices, and a partition of the edges into $k$ sets of non-crossing edges with respect to the vertex order. The stack number of a graph is the minimum $k$ such that it admits a $k$-stack layout. In this paper we study a long-standing problem regarding the stack number of planar directed acyclic graphs (DAGs), for which the vertex order has to respect the orientation of the edges. We investigate upper and lower bounds on the stack number of several families of planar graphs: We improve the constant upper bounds on the stack number of single-source and monotone outerplanar DAGs and of outerpath DAGs, and improve the constant upper bound for upward planar 3-trees. Further, we provide computer-aided lower bounds for upward (outer-) planar DAGs.
Submitted 5 September, 2023; v1 submitted 28 July, 2021;
originally announced July 2021.
-
On the Extended TSP Problem
Authors:
Julián Mestre,
Sergey Pupyrev,
Seeun William Umboh
Abstract:
We initiate the theoretical study of Ext-TSP, a problem that originates in the area of profile-guided binary optimization. The input is a graph $G=(V, E)$ with positive edge weights $w: E \rightarrow R^+$ and a non-increasing discount function $f(\cdot)$ such that $f(1) = 1$ and $f(i) = 0$ for $i > k$, for some parameter $k$ that is part of the problem definition. The problem is to sequence the vertices $V$ so as to maximize $\sum_{(u, v) \in E} f(|d_u - d_v|)\cdot w(u,v)$, where $d_v \in \{1, \ldots, |V| \}$ is the position of vertex $v$ in the sequence.
We show that Ext-TSP is APX-hard to approximate in general and we give a $(k+1)$-approximation algorithm for general graphs and a PTAS for some sparse graph classes such as planar or treewidth-bounded graphs.
Interestingly, the problem remains challenging even on very simple graph classes; indeed, there is no exact $n^{o(k)}$ time algorithm for trees unless the ETH fails. We complement this negative result with an exact $n^{O(k)}$ time algorithm for trees.
Submitted 16 July, 2021;
originally announced July 2021.
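The Ext-TSP objective stated in the abstract above, $\sum_{(u, v) \in E} f(|d_u - d_v|)\cdot w(u,v)$, can be evaluated directly for a given vertex sequence. The sketch below is illustrative only and is not taken from the paper; the function names and the linear discount in `make_linear_discount` are assumptions, chosen as one of many functions that satisfy $f(1) = 1$, monotonicity, and $f(i) = 0$ for $i > k$.

```python
def make_linear_discount(k):
    """Hypothetical discount: f(1) = 1, decreasing linearly, f(i) = 0 for i > k."""
    def f(i):
        return max(0.0, (k + 1 - i) / k)
    return f

def ext_tsp_objective(order, weights, f):
    """Sum of f(|d_u - d_v|) * w(u, v) over all edges.

    order   -- list of vertices; position d_v is the 1-based index of v
    weights -- dict mapping an edge (u, v) to its positive weight
    f       -- non-increasing discount function
    """
    d = {v: i + 1 for i, v in enumerate(order)}
    return sum(f(abs(d[u] - d[v])) * w for (u, v), w in weights.items())

weights = {("a", "b"): 2, ("b", "c"): 3, ("a", "c"): 4}
f = make_linear_discount(k=2)  # f(1) = 1.0, f(2) = 0.5, f(3) = 0.0

print(ext_tsp_objective(["a", "b", "c"], weights, f))  # 7.0
print(ext_tsp_objective(["a", "c", "b"], weights, f))  # 8.0
```

In this toy instance the second sequence scores higher because it places the heaviest edge, (a, c), at distance one.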
-
The Mixed Page Number of Graphs
Authors:
Jawaherul Md. Alam,
Michael A. Bekos,
Martin Gronemann,
Michael Kaufmann,
Sergey Pupyrev
Abstract:
A linear layout of a graph typically consists of a total vertex order, and a partition of the edges into sets of either non-crossing edges, called stacks, or non-nested edges, called queues. The stack (queue) number of a graph is the minimum number of required stacks (queues) in a linear layout. Mixed linear layouts combine these layouts by allowing each set of edges to form either a stack or a queue. In this work we initiate the study of the mixed page number of a graph which corresponds to the minimum number of such sets.
First, we study the edge density of graphs with bounded mixed page number. Then, we focus on complete and complete bipartite graphs, for which we derive lower and upper bounds on their mixed page number. Our findings indicate that combining stacks and queues is more powerful in various ways compared to the two traditional layouts.
Submitted 11 July, 2021;
originally announced July 2021.
-
Lazy Queue Layouts of Posets
Authors:
Jawaherul Md. Alam,
Michael A. Bekos,
Martin Gronemann,
Michael Kaufmann,
Sergey Pupyrev
Abstract:
We investigate the queue number of posets in terms of their width, that is, the maximum number of pairwise incomparable elements. A long-standing conjecture of Heath and Pemmaraju asserts that every poset of width w has queue number at most w. The conjecture has been confirmed for posets of width w=2 via so-called lazy linear extension.
We extend and thoroughly analyze lazy linear extensions for posets of width w > 2. Our analysis implies an upper bound of $(w-1)^2 +1$ on the queue number of width-w posets, which is tight for the strategy and yields an improvement over the previously best-known bound. Further, we provide an example of a poset that requires at least w+1 queues in every linear extension, thereby disproving the conjecture for posets of width w > 2.
△ Less
Submitted 25 August, 2020; v1 submitted 24 August, 2020;
originally announced August 2020.
-
The Turing Test for Graph Drawing Algorithms
Authors:
Helen C. Purchase,
Daniel Archambault,
Stephen Kobourov,
Martin Nöllenburg,
Sergey Pupyrev,
Hsiang-Yun Wu
Abstract:
Do algorithms for drawing graphs pass the Turing Test? That is, are their outputs indistinguishable from graphs drawn by humans? We address this question through a human-centred experiment, focusing on 'small' graphs, of a size for which it would be reasonable for someone to choose to draw the graph manually. Overall, we find that hand-drawn layouts can be distinguished from those generated by graph drawing algorithms, although this is not always the case for graphs drawn by force-directed or multi-dimensional scaling algorithms, making these good candidates for Turing Test success. We show that, in general, hand-drawn graphs are judged to be of higher quality than automatically generated ones, although this result varies with graph size and algorithm.
△ Less
Submitted 19 August, 2020; v1 submitted 11 August, 2020;
originally announced August 2020.
-
Book Embeddings of Graph Products
Authors:
Sergey Pupyrev
Abstract:
A $k$-stack layout (also called a $k$-page book embedding) of a graph consists of a total order of the vertices, and a partition of the edges into $k$ sets of non-crossing edges with respect to the vertex order. The stack number (book thickness, page number) of a graph is the minimum $k$ such that it admits a $k$-stack layout. A $k$-queue layout is defined similarly, except that no two edges in a single set may be nested.
It was recently proved that graphs of various non-minor-closed classes are subgraphs of the strong product of a path and a graph with bounded treewidth. Motivated by this decomposition result, we explore stack layouts of graph products. We show that the stack number is bounded for the strong product of a path and (i) a graph of bounded pathwidth or (ii) a bipartite graph of bounded treewidth and bounded degree. The results are obtained via a novel concept of simultaneous stack-queue layouts, which may be of independent interest.
Submitted 29 July, 2020;
originally announced July 2020.
-
Four Pages Are Indeed Necessary for Planar Graphs
Authors:
Michael A. Bekos,
Michael Kaufmann,
Fabian Klute,
Sergey Pupyrev,
Chrysanthi Raftopoulou,
Torsten Ueckerdt
Abstract:
An embedding of a graph in a book consists of a linear order of its vertices along the spine of the book and of an assignment of its edges to the pages of the book, so that no two edges on the same page cross. The book thickness of a graph is the minimum number of pages over all its book embeddings. Accordingly, the book thickness of a class of graphs is the maximum book thickness over all its members. In this paper, we address a long-standing open problem regarding the exact book thickness of the class of planar graphs, which previously was known to be either three or four. We settle this problem by demonstrating planar graphs that require four pages in any of their book embeddings, thus establishing that the book thickness of the class of planar graphs is four.
Submitted 16 April, 2020;
originally announced April 2020.
-
Improved Bounds for Track Numbers of Planar Graphs
Authors:
Sergey Pupyrev
Abstract:
A track layout of a graph consists of a vertex coloring and a total order of each color class, such that no two edges cross between any two color classes. The track number of a graph is the minimum number of colors required by a track layout of the graph.
This paper improves lower and upper bounds on the track number of several families of planar graphs. We prove that every planar graph has track number at most $225$ and every planar $3$-tree has track number at most $25$. Then we show that there exist outerplanar graphs whose track number is $5$, which leads to the best known lower bound of $8$ for planar graphs. Finally, we investigate leveled planar graphs and tighten bounds on the track number of weakly leveled graphs, Halin graphs, and X-trees.
Submitted 22 July, 2020; v1 submitted 30 October, 2019;
originally announced October 2019.
-
Multi-Dimensional Balanced Graph Partitioning via Projected Gradient Descent
Authors:
Dmitrii Avdiukhin,
Sergey Pupyrev,
Grigory Yaroslavtsev
Abstract:
Motivated by performance optimization of large-scale graph processing systems that distribute the graph across multiple machines, we consider the balanced graph partitioning problem. Compared to the previous work, we study the multi-dimensional variant when balance according to multiple weight functions is required. As we demonstrate by experimental evaluation, such multi-dimensional balance is important for achieving performance improvements for typical distributed graph processing workloads. We propose a new scalable technique for the multidimensional balanced graph partitioning problem. The method is based on applying randomized projected gradient descent to a non-convex continuous relaxation of the objective. We show how to implement the new algorithm efficiently in both theory and practice utilizing various approaches for projection. Experiments with large-scale social networks containing up to hundreds of billions of edges indicate that our algorithm has superior performance compared with the state-of-the-art approaches.
Submitted 15 February, 2019; v1 submitted 9 February, 2019;
originally announced February 2019.
-
Bandana: Using Non-volatile Memory for Storing Deep Learning Models
Authors:
Assaf Eisenman,
Maxim Naumov,
Darryl Gardner,
Misha Smelyanskiy,
Sergey Pupyrev,
Kim Hazelwood,
Asaf Cidon,
Sachin Katti
Abstract:
Typical large-scale recommender systems use deep learning models that are stored on a large amount of DRAM. These models often rely on embeddings, which consume most of the required memory. We present Bandana, a storage system that reduces the DRAM footprint of embeddings, by using Non-volatile Memory (NVM) as the primary storage medium, with a small amount of DRAM as cache. The main challenge in storing embeddings on NVM is its limited read bandwidth compared to DRAM. Bandana uses two primary techniques to address this limitation: first, it stores embedding vectors that are likely to be read together in the same physical location, using hypergraph partitioning, and second, it decides the number of embedding vectors to cache in DRAM by simulating dozens of small caches. These techniques allow Bandana to increase the effective read bandwidth of NVM by 2-3x and thereby significantly reduce the total cost of ownership.
Submitted 14 November, 2018; v1 submitted 14 November, 2018;
originally announced November 2018.
-
Improved Basic Block Reordering
Authors:
Andy Newell,
Sergey Pupyrev
Abstract:
Basic block reordering is an important step for profile-guided binary optimization. The state-of-the-art goal for basic block reordering is to maximize the number of fall-through branches. However, we demonstrate that such orderings may impose suboptimal performance on instruction and I-TLB caches. We propose a new algorithm that relies on a model combining the effects of fall-through and caching behavior. As the details of modern processor caching are quite complex and often unknown, we show how to use machine learning in selecting parameters that best trade off different caching effects to maximize binary performance.
An extensive evaluation on a variety of applications, including Facebook production workloads, the open-source compilers Clang and GCC, and SPEC CPU benchmarks, indicates that the new method outperforms existing block reordering techniques, improving the resulting performance of applications with large code size. We have open sourced the code of the new algorithm as a part of a post-link binary optimization tool, BOLT.
Submitted 11 April, 2020; v1 submitted 12 September, 2018;
originally announced September 2018.
-
Queue Layouts of Planar 3-Trees
Authors:
Jawaherul Md. Alam,
Michael A. Bekos,
Martin Gronemann,
Michael Kaufmann,
Sergey Pupyrev
Abstract:
A queue layout of a graph G consists of a linear order of the vertices of G and a partition of the edges of G into queues, so that no two independent edges of the same queue are nested. The queue number of G is the minimum number of queues required by any queue layout of G.
In this paper, we continue the study of the queue number of planar 3-trees. As opposed to general planar graphs, whose queue number is not known to be bounded by a constant, the queue number of planar 3-trees has been shown to be at most seven. In this work, we improve the upper bound to five. We also show that there exist planar 3-trees whose queue number is at least four; this is the first example of a planar graph with queue number greater than three.
Submitted 6 September, 2018; v1 submitted 31 August, 2018;
originally announced August 2018.
-
On Dispersable Book Embeddings
Authors:
Jawaherul Md. Alam,
Michael A. Bekos,
Martin Gronemann,
Michael Kaufmann,
Sergey Pupyrev
Abstract:
In a dispersable book embedding, the vertices of a given graph $G$ must be ordered along a line $l$, called the spine, and the edges of $G$ must be drawn at different half-planes bounded by $l$, called pages of the book, such that: (i) no two edges of the same page cross, and (ii) the graphs induced by the edges of each page are 1-regular. The minimum number of pages needed by any dispersable book embedding of $G$ is referred to as the dispersable book thickness $dbt(G)$ of $G$. Graph $G$ is called dispersable if $dbt(G) = \Delta(G)$ holds (note that $\Delta(G) \leq dbt(G)$ always holds).
Back in 1979, Bernhart and Kainen conjectured that any $k$-regular bipartite graph $G$ is dispersable, i.e., $dbt(G)=k$. In this paper, we disprove this conjecture for the cases $k=3$ (with a computer-aided proof), and $k=4$ (with a purely combinatorial proof). In particular, we show that the Gray graph, which is 3-regular and bipartite, has dispersable book thickness four, while the Folkman graph, which is 4-regular and bipartite, has dispersable book thickness five. On the positive side, we prove that 3-connected 3-regular bipartite planar graphs are dispersable, and conjecture that this property holds, even if 3-connectivity is relaxed.
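Conditions (i) and (ii) of the definition can be verified for a concrete spine order and page assignment. The code below is an illustrative sketch under stated assumptions, not the computer-aided proof from the paper; the name `is_dispersable_embedding` and the input format are hypothetical, and the graph is assumed to be simple (no loops).

```python
from itertools import combinations

def is_dispersable_embedding(order, pages):
    """Check a candidate dispersable book embedding.

    order -- list of vertices along the spine (hypothetical input format)
    pages -- list of edge lists, one list per page
    """
    pos = {v: i for i, v in enumerate(order)}
    for page in pages:
        # Condition (ii): the edges of a page must form a matching.
        seen = set()
        for u, v in page:
            if u in seen or v in seen:
                return False  # a vertex has two edges on this page
            seen.update((u, v))
        # Condition (i): no two edges of the page may cross.
        spans = [tuple(sorted((pos[u], pos[v]))) for u, v in page]
        for (a, b), (c, d) in combinations(spans, 2):
            if a < c < b < d or c < a < d < b:
                return False  # endpoints alternate along the spine
    return True
```

For example, the 4-cycle on vertices 1, 2, 3, 4 with spine order 1, 2, 3, 4 passes the check with two pages, matching its maximum degree of two.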
Submitted 27 March, 2018;
originally announced March 2018.
-
Mixed Linear Layouts of Planar Graphs
Authors:
Sergey Pupyrev
Abstract:
A $k$-stack (respectively, $k$-queue) layout of a graph consists of a total order of the vertices, and a partition of the edges into $k$ sets of non-crossing (non-nested) edges with respect to the vertex ordering. In 1992, Heath and Rosenberg conjectured that every planar graph admits a mixed $1$-stack $1$-queue layout in which every edge is assigned to a stack or to a queue that use a common vertex ordering.
We disprove this conjecture by providing a planar graph that does not have such a mixed layout. In addition, we study mixed layouts of graph subdivisions, and show that every planar graph has a mixed subdivision with one division vertex per edge.
Submitted 16 January, 2018; v1 submitted 1 September, 2017;
originally announced September 2017.
-
Social Hash Partitioner: A Scalable Distributed Hypergraph Partitioner
Authors:
Igor Kabiljo,
Brian Karrer,
Mayank Pundir,
Sergey Pupyrev,
Alon Shalita,
Alessandro Presta,
Yaroslav Akhremtsev
Abstract:
We design and implement a distributed algorithm for balanced $k$-way hypergraph partitioning that minimizes fanout, a fundamental hypergraph quantity also known as the communication volume and ($k-1$)-cut metric, by optimizing a novel objective called probabilistic fanout. This choice allows a simple local search heuristic to achieve comparable solution quality to the best existing hypergraph partitioners.
Our algorithm is arbitrarily scalable due to a careful design that controls computational complexity, space complexity, and communication. In practice, we commonly process hypergraphs with billions of vertices and hyperedges in a few hours. We explain how the algorithm's scalability, both in terms of hypergraph size and bucket count, is limited only by the number of machines available. We perform an extensive comparison to existing distributed hypergraph partitioners and find that our approach is able to optimize hypergraphs roughly $100$ times bigger on the same set of machines.
We call the resulting tool Social Hash Partitioner (SHP), and accompanying this paper, we open-source the most scalable version based on recursive bisection.
Submitted 20 July, 2017;
originally announced July 2017.
-
The Bundled Crossing Number
Authors:
Md. Jawaherul Alam,
Martin Fink,
Sergey Pupyrev
Abstract:
We study the algorithmic aspect of edge bundling. A bundled crossing in a drawing of a graph is a group of crossings between two sets of parallel edges. The bundled crossing number is the minimum number of bundled crossings that group all crossings in a drawing of the graph.
We show that the bundled crossing number is closely related to the orientable genus of the graph. If multiple crossings and self-intersections of edges are allowed, the two values are identical; otherwise, the bundled crossing number can be higher than the genus.
We then investigate the problem of minimizing the number of bundled crossings. For circular graph layouts with a fixed order of vertices, we present a constant-factor approximation algorithm. When the circular order is not prescribed, we get a $\frac{6c}{c-2}$ approximation for a graph with $n$ vertices having at least $cn$ edges for $c>2$. For general graph layouts, we develop an algorithm with an approximation factor of $\frac{6c}{c-3}$ for graphs with at least $cn$ edges for $c > 3$.
Submitted 1 September, 2016; v1 submitted 29 August, 2016;
originally announced August 2016.
-
Compressing Graphs and Indexes with Recursive Graph Bisection
Authors:
Laxman Dhulipala,
Igor Kabiljo,
Brian Karrer,
Giuseppe Ottaviano,
Sergey Pupyrev,
Alon Shalita
Abstract:
Graph reordering is a powerful technique to increase the locality of the representations of graphs, which can be helpful in several applications. We study how the technique can be used to improve compression of graphs and inverted indexes.
We extend the recent theoretical model of Chierichetti et al. (KDD 2009) for graph compression, and show how it can be employed for compression-friendly reordering of social networks and web graphs and for assigning document identifiers in inverted indexes. We design and implement a novel theoretically sound reordering algorithm that is based on recursive graph bisection.
Our experiments show a significant improvement of the compression rate of graph and indexes over existing heuristics. The new method is relatively simple and allows efficient parallel and distributed implementations, which is demonstrated on graphs with billions of vertices and hundreds of billions of edges.
Submitted 28 February, 2016;
originally announced February 2016.
-
Colored Non-Crossing Euclidean Steiner Forest
Authors:
Sergey Bereg,
Krzysztof Fleszar,
Philipp Kindermann,
Sergey Pupyrev,
Joachim Spoerhase,
Alexander Wolff
Abstract:
Given a set of $k$-colored points in the plane, we consider the problem of finding $k$ trees such that each tree connects all points of one color class, no two trees cross, and the total edge length of the trees is minimized. For $k=1$, this is the well-known Euclidean Steiner tree problem. For general $k$, a $kρ$-approximation algorithm is known, where $ρ\le 1.21$ is the Steiner ratio.
We present a PTAS for $k=2$, a $(5/3+\varepsilon)$-approximation algorithm for $k=3$, and two approximation algorithms for general $k$, with ratios $O(\sqrt n \log k)$ and $k+\varepsilon$.
Submitted 4 November, 2016; v1 submitted 18 September, 2015;
originally announced September 2015.
-
On Embeddability of Buses in Point Sets
Authors:
Till Bruckdorfer,
Michael Kaufmann,
Stephen Kobourov,
Sergey Pupyrev
Abstract:
Set membership of points in the plane can be visualized by connecting corresponding points via graphical features, like paths, trees, polygons, ellipses. In this paper we study the \emph{bus embeddability problem} (BEP): given a set of colored points we ask whether there exists a planar realization with one horizontal straight-line segment per color, called bus, such that all points with the same color are connected with vertical line segments to their bus. We present an ILP and an FPT algorithm for the general problem. For restricted versions of this problem, such as when the relative order of buses is predefined, or when a bus must be placed above all its points, we provide efficient algorithms. We show that another restricted version of the problem can be solved using 2-stack pushall sorting. On the negative side we prove the NP-completeness of a special case of BEP.
Submitted 27 August, 2015;
originally announced August 2015.
-
Contact Representations of Sparse Planar Graphs
Authors:
Md. Jawaherul Alam,
David Eppstein,
Michael Kaufmann,
Stephen G. Kobourov,
Sergey Pupyrev,
Andre Schulz,
Torsten Ueckerdt
Abstract:
We study representations of graphs by contacts of circular arcs, CCA-representations for short, where the vertices are interior-disjoint circular arcs in the plane and each edge is realized by an endpoint of one arc touching the interior of another. A graph is (2,k)-sparse if every s-vertex subgraph has at most 2s - k edges, and (2, k)-tight if in addition it has exactly 2n - k edges, where n is the number of vertices. Every graph with a CCA-representation is planar and (2, 0)-sparse, and it follows from known results on contacts of line segments that for k >= 3 every (2, k)-sparse graph has a CCA-representation. Hence the question of CCA-representability is open for (2, k)-sparse graphs with 0 <= k <= 2. We partially answer this question by computing CCA-representations for several subclasses of planar (2,0)-sparse graphs. In particular, we show that every plane (2, 2)-sparse graph has a CCA-representation, and that any plane (2, 1)-tight graph or (2, 0)-tight graph dual to a (2, 3)-tight graph or (2, 4)-tight graph has a CCA-representation. Next, we study CCA-representations in which each arc has an empty convex hull. We characterize the plane graphs that have such a representation, based on the existence of a special orientation of the graph edges. Using this characterization, we show that every plane graph of maximum degree 4 has such a representation, but that finding such a representation for a plane (2, 0)-tight graph with maximum degree 5 is an NP-complete problem. Finally, we describe a simple algorithm for representing plane (2, 0)-sparse graphs with wedges, where each vertex is represented with a sequence of two circular arcs (straight-line segments).
Submitted 1 January, 2015;
originally announced January 2015.
-
Contact Representations of Graphs in 3D
Authors:
Md. Jawaherul Alam,
William Evans,
Stephen G. Kobourov,
Sergey Pupyrev,
Jackson Toeniskoetter,
Torsten Ueckerdt
Abstract:
We study contact representations of graphs in which vertices are represented by axis-aligned polyhedra in 3D and edges are realized by non-zero area common boundaries between corresponding polyhedra. We show that for every 3-connected planar graph, there exists a simultaneous representation of the graph and its dual with 3D boxes. We give a linear-time algorithm for constructing such a representation. This result extends the existing primal-dual contact representations of planar graphs in 2D using circles and triangles. While contact graphs in 2D directly correspond to planar graphs, we next study representations of non-planar graphs in 3D. In particular we consider representations of optimal 1-planar graphs. A graph is 1-planar if there exists a drawing in the plane where each edge is crossed at most once, and an optimal n-vertex 1-planar graph has the maximum (4n - 8) number of edges. We describe a linear-time algorithm for representing optimal 1-planar graphs without separating 4-cycles with 3D boxes. However, not every optimal 1-planar graph admits a representation with boxes. Hence, we consider contact representations with the next simplest axis-aligned 3D object, L-shaped polyhedra. We provide a quadratic-time algorithm for representing optimal 1-planar graphs with L-shaped polyhedra.
Submitted 3 May, 2015; v1 submitted 1 January, 2015;
originally announced January 2015.
-
Weak Unit Disk and Interval Representation of Planar Graphs
Authors:
Md. Jawaherul Alam,
Stephen G. Kobourov,
Sergey Pupyrev,
Jackson Toeniskoetter
Abstract:
We study a variant of intersection representations with unit balls, that is, unit disks in the plane and unit intervals on the line. Given a planar graph and a bipartition of the edges of the graph into near and far sets, the goal is to represent the vertices of the graph by unit balls so that the balls representing two adjacent vertices intersect if and only if the corresponding edge is near. We consider the problem in the plane and prove that it is NP-hard to decide whether such a representation exists for a given edge-partition. On the other hand, every series-parallel graph admits such a representation with unit disks for any near/far labeling of the edges. We also show that the representation problem on the line is equivalent to a variant of a graph coloring. We give examples of girth-4 planar and girth-3 outerplanar graphs that have no such representation with unit intervals. On the other hand, all triangle-free outerplanar graphs and all graphs with maximum average degree less than 26/11 can always be represented. In particular, this gives a simple proof of representability of all planar graphs with large girth.
Submitted 29 August, 2014;
originally announced August 2014.
-
Balanced Circle Packings for Planar Graphs
Authors:
Md. Jawaherul Alam,
David Eppstein,
Michael T. Goodrich,
Stephen G. Kobourov,
Sergey Pupyrev
Abstract:
We study balanced circle packings and circle-contact representations for planar graphs, where the ratio of the largest circle's diameter to the smallest circle's diameter is polynomial in the number of circles. We provide a number of positive and negative results for the existence of such balanced configurations.
Submitted 21 August, 2014;
originally announced August 2014.
-
Improved Approximation Algorithms for Box Contact Representations
Authors:
Michael A. Bekos,
Thomas C. van Dijk,
Martin Fink,
Philipp Kindermann,
Stephen Kobourov,
Sergey Pupyrev,
Joachim Spoerhase,
Alexander Wolff
Abstract:
We study the following geometric representation problem: Given a graph whose vertices correspond to axis-aligned rectangles with fixed dimensions, arrange the rectangles without overlaps in the plane such that two rectangles touch if the graph contains an edge between them. This problem is called \textsc{Contact Representation of Word Networks} (\textsc{Crown}) since it formalizes the geometric problem behind drawing word clouds in which semantically related words are close to each other. \textsc{Crown} is known to be NP-hard, and there are approximation algorithms for certain graph classes for the optimization version, \textsc{Max-Crown}, in which realizing each desired adjacency yields a certain profit. We present the first $O(1)$-approximation algorithm for the general case, when the input is a complete weighted graph, and for the bipartite case. Since the subgraph of realized adjacencies is necessarily planar, we also consider several planar graph classes (namely stars, trees, outerplanar, and planar graphs), improving upon the known results. For some graph classes, we also describe improvements in the unweighted case, where each adjacency yields the same profit. Finally, we show that the problem is APX-hard on bipartite graphs of bounded maximum degree.
Submitted 19 November, 2014; v1 submitted 19 March, 2014;
originally announced March 2014.
-
Semantic Word Cloud Representations: Hardness and Approximation Algorithms
Authors:
Lukas Barth,
Sara Irina Fabrikant,
Stephen Kobourov,
Anna Lubiw,
Martin Nöllenburg,
Yoshio Okamoto,
Sergey Pupyrev,
Claudio Squarcella,
Torsten Ueckerdt,
Alexander Wolff
Abstract:
We study a geometric representation problem, where we are given a set $\cal R$ of axis-aligned rectangles with fixed dimensions and a graph with vertex set $\cal R$. The task is to place the rectangles without overlap such that two rectangles touch if and only if the graph contains an edge between them. We call this problem Contact Representation of Word Networks (CROWN). It formalizes the geometric problem behind drawing word clouds in which semantically related words are close to each other. Here, we represent words by rectangles and semantic relationships by edges. We show that CROWN is strongly NP-hard even when restricted to trees and weakly NP-hard when restricted to stars. We consider the optimization problem Max-CROWN where each adjacency induces a certain profit and the task is to maximize the sum of the profits. For this problem, we present constant-factor approximations for several graph classes, namely stars, trees, planar graphs, and graphs of bounded degree. Finally, we evaluate the algorithms experimentally and show that our best method improves upon the best existing heuristic by 45%.
Submitted 19 November, 2013;
originally announced November 2013.
-
Drawing Permutations with Few Corners
Authors:
Sergey Bereg,
Alexander E. Holroyd,
Lev Nachmanson,
Sergey Pupyrev
Abstract:
A permutation may be represented by a collection of paths in the plane. We consider a natural class of such representations, which we call tangles, in which the paths consist of straight segments at 45 degree angles, and the permutation is decomposed into nearest-neighbour transpositions. We address the problem of minimizing the number of crossings together with the number of corners of the paths, focusing on classes of permutations in which both can be minimized simultaneously. We give algorithms for computing such tangles for several classes of permutations.
Submitted 17 June, 2013;
originally announced June 2013.
-
Metro-Line Crossing Minimization: Hardness, Approximations, and Tractable Cases
Authors:
Martin Fink,
Sergey Pupyrev
Abstract:
Crossing minimization is one of the central problems in graph drawing. Recently, there has been an increased interest in the problem of minimizing crossings between paths in drawings of graphs. This is the metro-line crossing minimization problem (MLCM): Given an embedded graph and a set L of simple paths, called lines, order the lines on each edge so that the total number of crossings is minimized. So far, the complexity of MLCM has been an open problem. In contrast, the problem variant in which line ends must be placed in outermost position on their edges (MLCM-P) is known to be NP-hard. Our main results answer two open questions: (i) We show that MLCM is NP-hard. (ii) We give an $O(\sqrt{\log |L|})$-approximation algorithm for MLCM-P.
Submitted 18 June, 2013; v1 submitted 9 June, 2013;
originally announced June 2013.
-
Happy Edges: Threshold-Coloring of Regular Lattices
Authors:
Md. Jawaherul Alam,
Stephen G. Kobourov,
Sergey Pupyrev,
Jackson Toeniskoetter
Abstract:
We study a graph coloring problem motivated by a fun Sudoku-style puzzle. Given a bipartition of the edges of a graph into {\em near} and {\em far} sets and an integer threshold $t$, a {\em threshold-coloring} of the graph is an assignment of integers to the vertices so that endpoints of near edges differ by $t$ or less, while endpoints of far edges differ by more than $t$. We study threshold-coloring of tilings of the plane by regular polygons, known as Archimedean lattices, and their duals, the Laves lattices. We prove that some are threshold-colorable with constant number of colors for any edge labeling, some require an unbounded number of colors for specific labelings, and some are not threshold-colorable.
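The definition of a threshold-coloring translates directly into a validity check. The snippet below is a small sketch for illustration only; the function name `is_threshold_coloring` and the dictionary-based input are assumptions, not part of the paper.

```python
def is_threshold_coloring(colors, near, far, t):
    """Check a candidate threshold-coloring.

    colors -- dict mapping each vertex to an integer (hypothetical input format)
    near   -- list of near edges (u, v)
    far    -- list of far edges (u, v)
    t      -- integer threshold
    """
    # Endpoints of near edges must differ by t or less.
    near_ok = all(abs(colors[u] - colors[v]) <= t for u, v in near)
    # Endpoints of far edges must differ by more than t.
    far_ok = all(abs(colors[u] - colors[v]) > t for u, v in far)
    return near_ok and far_ok
```

For a path a, b, c with near edge (a, b), far edge (b, c), and threshold 1, the assignment a=0, b=1, c=3 is accepted, while a=0, b=1, c=2 is rejected because the far edge differs by only 1.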
△ Less
Submitted 5 March, 2014; v1 submitted 9 June, 2013;
originally announced June 2013.
-
Ordering Metro Lines by Block Crossings
Authors:
Martin Fink,
Sergey Pupyrev
Abstract:
A problem that arises in drawings of transportation networks is to minimize the number of crossings between different transportation lines. While this can be done efficiently under specific constraints, not all solutions are visually equivalent. We suggest merging crossings into block crossings, that is, crossings of two neighboring groups of consecutive lines. Unfortunately, minimizing the total number of block crossings is NP-hard even for very simple graphs. We give approximation algorithms for special classes of graphs and an asymptotically worst-case optimal algorithm for block crossings on general graphs. That is, we bound the number of block crossings that our algorithm needs and construct worst-case instances on which the number of block crossings that is necessary in any solution is asymptotically the same as our bound.
Submitted 22 June, 2013; v1 submitted 30 April, 2013;
originally announced May 2013.
-
On Semantic Word Cloud Representation
Authors:
Lukas Barth,
Stephen Kobourov,
Sergey Pupyrev,
Torsten Ueckerdt
Abstract:
We study the problem of computing semantic-preserving word clouds in which semantically related words are close to each other. While several heuristic approaches have been described in the literature, we formalize the underlying geometric algorithm problem: Word Rectangle Adjacency Contact (WRAC). In this model each word is associated with a rectangle with fixed dimensions, and the goal is to represent semantically related words by ensuring that the two corresponding rectangles touch. We design and analyze efficient polynomial-time algorithms for some variants of the WRAC problem, show that several general variants are NP-hard, and describe a number of approximation algorithms. Finally, we experimentally demonstrate that our theoretically-sound algorithms outperform the early heuristics.
Submitted 23 April, 2013;
originally announced April 2013.
-
Threshold-Coloring and Unit-Cube Contact Representation of Graphs
Authors:
Md. Jawaherul Alam,
Steven Chaplick,
Gašper Fijavž,
Michael Kaufmann,
Stephen G. Kobourov,
Sergey Pupyrev
Abstract:
In this paper we study threshold coloring of graphs, where the vertex colors represented by integers are used to describe any spanning subgraph of the given graph as follows. Pairs of vertices with near colors imply the edge between them is present and pairs of vertices with far colors imply the edge is absent. Not all planar graphs are threshold-colorable, but several subclasses, such as trees, some planar grids, and planar graphs without short cycles can always be threshold-colored. Using these results we obtain unit-cube contact representation of several subclasses of planar graphs. Variants of the threshold coloring problem are related to well-known graph coloring and other graph-theoretic problems. Using these relations we show the NP-completeness for two of these variants, and describe a polynomial-time algorithm for another.
Submitted 16 May, 2013; v1 submitted 25 February, 2013;
originally announced February 2013.
-
Computing Consensus Curves
Authors:
Livio De La Cruz,
Stephen Kobourov,
Sergey Pupyrev,
Paul Shen,
Sankar Veeramoni
Abstract:
We consider the problem of extracting accurate average ant trajectories from many (possibly inaccurate) input trajectories contributed by citizen scientists. Although there are many generic software tools for motion tracking and specific ones for insect tracking, even untrained humans are much better at this task, provided there is a robust method for computing the average trajectories. We implemented and tested several local (one ant at a time) and global (all ants together) methods. Our best performing algorithm uses a novel global method, based on finding edge-disjoint paths in an ant-interaction graph constructed from the input trajectories. The underlying optimization problem is a new and interesting variant of network flow. Even though the problem is NP-hard, we implemented two heuristics, which work very well in practice, outperforming all other approaches, including the best automated system.
Submitted 14 May, 2014; v1 submitted 5 December, 2012;
originally announced December 2012.
-
Edge Routing with Ordered Bundles
Authors:
Sergey Bereg,
Alexander E. Holroyd,
Lev Nachmanson,
Sergey Pupyrev
Abstract:
Edge bundling reduces the visual clutter in a drawing of a graph by uniting the edges into bundles. We propose a method of edge bundling that draws each edge of a bundle separately, as in metro maps, and call our method ordered bundles. To produce aesthetically pleasing edge routes, it minimizes a cost function on the edges. The cost function depends on the ink required to draw the edges, the edge lengths, widths and separations. The cost also penalizes too many edges passing through narrow channels by using the constrained Delaunay triangulation. The method avoids unnecessary edge-node and edge-edge crossings. To draw edges with the minimal number of crossings and separately within the same bundle we develop an efficient algorithm solving a variant of the metro-line crossing minimization problem. In general, the method creates clear and smooth edge routes giving an overview of the global graph structure, while still drawing each edge separately and thus enabling local analysis.
Submitted 19 September, 2012;
originally announced September 2012.