-
Enumerating Graphlets with Amortized Time Complexity Independent of Graph Size
Authors:
Alessio Conte,
Roberto Grossi,
Yasuaki Kobayashi,
Kazuhiro Kurita,
Davide Rucci,
Takeaki Uno,
Kunihiro Wasa
Abstract:
Graphlets of order $k$ in a graph $G$ are connected subgraphs induced by $k$ nodes (called $k$-graphlets) or by $k$ edges (called edge $k$-graphlets). They are among the interesting subgraphs in network analysis to gain insight into both the local and global structure of a network. While several algorithms exist for discovering and enumerating graphlets, the cost per solution of such algorithms typically depends on the size of the graph $G$, or its maximum degree. In real networks, even the latter can be in the order of millions, whereas $k$ is typically required to be a small value. In this paper we provide the first algorithm to list all graphlets of order $k$ in a graph $G=(V,E)$ with an amortized cost per solution depending \emph{solely} on the order $k$, in contrast to previous approaches where the cost depends \emph{also} on the size of $G$ or its maximum degree. Specifically, we show that it is possible to list $k$-graphlets in $O(k^2)$ time per solution, and to list edge $k$-graphlets in $O(k)$ time per solution. Furthermore we show that, if the input graph has bounded degree, then the cost per solution for listing $k$-graphlets is reduced to $O(k)$. Whenever $k = O(1)$, as is often the case in practical settings, these algorithms are the first to achieve constant time per solution.
Submitted 22 May, 2024;
originally announced May 2024.
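To make the definition of a $k$-graphlet in the abstract above concrete, the following is a minimal brute-force sketch. It is not the algorithm of the paper: it simply tests every $k$-subset of nodes for connectivity, so its cost per solution depends heavily on the size of the graph. The function names (`is_connected`, `k_graphlets`) and the adjacency-dictionary input format are assumptions made for illustration, and $k \ge 1$ is assumed.

```python
from itertools import combinations


def is_connected(nodes, adj):
    """Return True if the subgraph induced by `nodes` is connected (nodes must be non-empty)."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            # Only follow edges whose endpoints both lie inside the candidate node set.
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def k_graphlets(adj, k):
    """Naively list all k-node sets that induce a connected subgraph (assumes k >= 1)."""
    return [frozenset(c) for c in combinations(sorted(adj), k) if is_connected(c, adj)]


# Hypothetical example: the path 0 - 1 - 2 - 3.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
graphlets = k_graphlets(path, 3)
```

On the path above, only the node sets {0, 1, 2} and {1, 2, 3} induce connected subgraphs, so the sketch reports two 3-graphlets.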
-
Algorithms for Optimally Shifting Intervals under Intersection Graph Models
Authors:
Nicolás Honorato Droguett,
Kazuhiro Kurita,
Tesshu Hanaka,
Hirotaka Ono
Abstract:
We propose a new model for graph editing problems on intersection graphs. In well-studied graph editing problems, adding and deleting vertices and edges are used as graph editing operations. As a graph editing operation on intersection graphs, we propose moving objects corresponding to vertices. In this paper, we focus on interval graphs as an intersection graph. We give a linear-time algorithm to find the total moving distance for transforming an interval graph into a complete graph. The concept of this algorithm can be applied for (i) transforming a unit square graph into a complete graph over $L_1$ distance and (ii) attaining the existence of a $k$-clique on unit interval graphs. In addition, we provide LP-formulations to achieve several properties in the associated graph of unit intervals.
Submitted 9 January, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
-
Dichotomies for Tree Minor Containment with Structural Parameters
Authors:
Tatsuya Gima,
Soh Kumabe,
Kazuhiro Kurita,
Yuto Okada,
Yota Otachi
Abstract:
The problem of determining whether a graph $G$ contains another graph $H$ as a minor, referred to as the minor containment problem, is a fundamental problem in the field of graph algorithms. While it is NP-complete when $G$ and $H$ are general graphs, it is sometimes tractable on more restricted graph classes. This study focuses on the case where both $G$ and $H$ are trees, known as the tree minor containment problem. Even in this case, the problem is known to be NP-complete. In contrast, polynomial-time algorithms are known for the case when both trees are caterpillars or when the maximum degree of $H$ is a constant. Our research aims to clarify the boundary of tractability and intractability for the tree minor containment problem. Specifically, we provide dichotomies for the computational complexities of the problem based on three structural parameters: the diameter, pathwidth, and path eccentricity.
Submitted 6 November, 2023;
originally announced November 2023.
-
Enumerating minimal vertex covers and dominating sets with capacity and/or connectivity constraints
Authors:
Yasuaki Kobayashi,
Kazuhiro Kurita,
Yasuko Matsui,
Hirotaka Ono
Abstract:
In this paper, we consider the problems of enumerating minimal vertex covers and minimal dominating sets with capacity and/or connectivity constraints. We develop polynomial-delay enumeration algorithms for these problems on bounded-degree graphs. For the case of minimal connected vertex cover, our algorithm runs in polynomial delay even on the class of $d$-claw free graphs, which extends the result on bounded-degree graphs. To complement these algorithmic results, we show that the problems of enumerating minimal connected vertex covers and minimal capacitated vertex covers in bipartite graphs are at least as hard as enumerating minimal transversals in hypergraphs.
Submitted 30 August, 2023;
originally announced August 2023.
-
On the hardness of inclusion-wise minimal separators enumeration
Authors:
Caroline Brosse,
Oscar Defrain,
Kazuhiro Kurita,
Vincent Limouzy,
Takeaki Uno,
Kunihiro Wasa
Abstract:
Enumeration problems are often encountered as key subroutines in the exact computation of graph parameters such as chromatic number, treewidth, or treedepth. In the case of treedepth computation, the enumeration of inclusion-wise minimal separators plays a crucial role. However and quite surprisingly, the complexity status of this problem has not been settled since it has been posed as an open direction by Kloks and Kratsch in 1998. Recently at the PACE 2020 competition dedicated to treedepth computation, solvers have been circumventing that by listing all minimal $a$-$b$ separators and filtering out those that are not inclusion-wise minimal, at the cost of efficiency. Naturally, having an efficient algorithm for listing inclusion-wise minimal separators would drastically improve such practical algorithms. In this note, however, we show that no efficient algorithm is to be expected from an output-sensitive perspective, namely, we prove that there is no output-polynomial time algorithm for inclusion-wise minimal separators enumeration unless P = NP.
Submitted 13 December, 2023; v1 submitted 29 August, 2023;
originally announced August 2023.
-
Polynomial-Delay Enumeration of Large Maximal Common Independent Sets in Two Matroids and Beyond
Authors:
Yasuaki Kobayashi,
Kazuhiro Kurita,
Kunihiro Wasa
Abstract:
Finding a maximum cardinality common independent set in two matroids (also known as \textsc{Matroid Intersection}) is a classical combinatorial optimization problem, which generalizes several well-known problems, such as finding a maximum bipartite matching, a maximum colorful forest, and an arborescence in directed graphs. Enumerating all maximal common independent sets in two (or more) matroids is a classical enumeration problem. In this paper, we address an ``intersection'' of these problems: Given two matroids and a threshold $τ$, the goal is to enumerate all maximal common independent sets in the matroids with cardinality at least $τ$. We show that this problem can be solved in polynomial delay and polynomial space. Moreover, our technique can be extended to a more general problem, which is relevant to Matroid Matching. We give a polynomial-delay and polynomial-space algorithm for enumerating all maximal ``matchings'' with cardinality at least $τ$, assuming that the optimization counterpart is ``tractable'' in a certain sense. This extension allows us to enumerate small minimal connected vertex covers in subcubic graphs. We also discuss a framework to convert enumeration with cardinality constraints into ranked enumeration.
Submitted 8 February, 2024; v1 submitted 17 July, 2023;
originally announced July 2023.
-
Optimal LZ-End Parsing is Hard
Authors:
Hideo Bannai,
Mitsuru Funakoshi,
Kazuhiro Kurita,
Yuto Nakashima,
Kazuhisa Seto,
Takeaki Uno
Abstract:
LZ-End is a variant of the well-known Lempel-Ziv parsing family such that each phrase of the parsing has a previous occurrence, with the additional constraint that the previous occurrence must end at the end of a previous phrase. LZ-End was initially proposed as a greedy parsing, where each phrase is determined greedily from left to right, as the longest factor that satisfies the above constraint~[Kreft & Navarro, 2010]. In this work, we consider an optimal LZ-End parsing that has the minimum number of phrases in such parsings. We show that a decision version of computing the optimal LZ-End parsing is NP-complete by showing a reduction from the vertex cover problem. Moreover, we give a MAX-SAT formulation for the optimal LZ-End parsing adapting an approach for computing various NP-hard repetitiveness measures recently presented by [Bannai et al., 2022]. We also consider the approximation ratio of the size of greedy LZ-End parsing to the size of the optimal LZ-End parsing, and give a lower bound of the ratio which asymptotically approaches $2$.
Submitted 6 February, 2023;
originally announced February 2023.
-
A Framework to Design Approximation Algorithms for Finding Diverse Solutions in Combinatorial Problems
Authors:
Tesshu Hanaka,
Masashi Kiyomi,
Yasuaki Kobayashi,
Yusuke Kobayashi,
Kazuhiro Kurita,
Yota Otachi
Abstract:
Finding a \emph{single} best solution is the most common objective in combinatorial optimization problems. However, such a single solution may not be applicable to real-world problems as objective functions and constraints are only "approximately" formulated for original real-world problems. To solve this issue, finding \emph{multiple} solutions is a natural direction, and diversity of solutions is an important concept in this context. Unfortunately, finding diverse solutions is much harder than finding a single solution. To cope with this difficulty, we investigate the approximability of finding diverse solutions. As a main result, we propose a framework to design approximation algorithms for finding diverse solutions, which yields several outcomes including constant-factor approximation algorithms for finding diverse matchings in graphs and diverse common bases in two matroids and PTASes for finding diverse minimum cuts and interval schedulings.
Submitted 21 January, 2022;
originally announced January 2022.
-
An Approximation Algorithm for $K$-best Enumeration of Minimal Connected Edge Dominating Sets with Cardinality Constraints
Authors:
Kazuhiro Kurita,
Kunihiro Wasa
Abstract:
\emph{$K$-best enumeration}, which asks to output $k$-best solutions without duplication, is a helpful tool in data analysis for many fields. In such fields, graphs typically represent data. Thus, subgraph enumeration has received much attention in such fields. However, $k$-best enumeration tends to be intractable since, in many cases, finding one optimum solution is NP-hard. To overcome this difficulty, we combine $k$-best enumeration with a concept of enumeration algorithms called \emph{approximation enumeration algorithms}. As a main result, we propose a $4$-approximation algorithm for minimal connected edge dominating sets which outputs $k$ minimal solutions with cardinality at most $4\cdot\mathrm{OPT}$, where $\mathrm{OPT}$ is the cardinality of a minimum solution which is \emph{not} output by the algorithm. Our proposed algorithm runs in $O(nm^2Δ)$ delay, where $n$, $m$, $Δ$ are the number of vertices, the number of edges, and the maximum degree of an input graph, respectively.
Submitted 12 May, 2024; v1 submitted 21 January, 2022;
originally announced January 2022.
-
Computing Diverse Shortest Paths Efficiently: A Theoretical and Experimental Study
Authors:
Tesshu Hanaka,
Yasuaki Kobayashi,
Kazuhiro Kurita,
See Woo Lee,
Yota Otachi
Abstract:
Finding diverse solutions in combinatorial problems has recently received considerable attention (Baste et al. 2020; Fomin et al. 2020; Hanaka et al. 2021). In this paper we study the following type of problems: given an integer $k$, the problem asks for $k$ solutions such that the sum of pairwise (weighted) Hamming distances between these solutions is maximized. Such solutions are called diverse solutions. We present a polynomial-time algorithm for finding diverse shortest $st$-paths in weighted directed graphs. Moreover, we study the diverse version of other classical combinatorial problems such as diverse weighted matroid bases, diverse weighted arborescences, and diverse bipartite matchings. We show that these problems can be solved in polynomial time as well. To evaluate the practical performance of our algorithm for finding diverse shortest $st$-paths, we conduct a computational experiment with synthetic and real-world instances. The experiment shows that our algorithm successfully computes diverse solutions within reasonable computational time.
Submitted 15 December, 2021; v1 submitted 10 December, 2021;
originally announced December 2021.
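The diversity objective named in the abstract above, the sum of pairwise Hamming distances between solutions, can be sketched as follows. This is only an illustration of the measure, not the paper's algorithm. It assumes each solution is represented as a set of elements (for instance, the edges of a path), that the Hamming distance between two solutions is the size of their symmetric difference, and that all weights are 1; the names `hamming` and `diversity` are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Unweighted Hamming distance between two solutions viewed as sets."""
    return len(set(a) ^ set(b))


def diversity(solutions):
    """Sum of pairwise Hamming distances over all unordered pairs of solutions."""
    return sum(hamming(a, b) for a, b in combinations(solutions, 2))


# Hypothetical example: three s-t paths given as edge sets; p1 and p3 coincide.
p1 = {("s", "a"), ("a", "t")}
p2 = {("s", "b"), ("b", "t")}
p3 = {("s", "a"), ("a", "t")}
total = diversity([p1, p2, p3])
```

Here the pairs (p1, p2) and (p2, p3) each contribute 4 and the identical pair (p1, p3) contributes 0, giving a total of 8. Choosing a third path that differs from both p1 and p2 would increase this value, which is the sense in which the objective rewards diverse solutions.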
-
Polynomial-Delay Enumeration of Large Maximal Matchings
Authors:
Yasuaki Kobayashi,
Kazuhiro Kurita,
Kunihiro Wasa
Abstract:
Enumerating matchings is a classical problem in the field of enumeration algorithms. There are polynomial-delay enumeration algorithms for several settings, such as enumerating perfect matchings, maximal matchings, and (weighted) matchings in specific orders. In this paper, we present polynomial-delay enumeration algorithms for maximal matchings with cardinality at least a given threshold $t$. Our algorithm enumerates all such matchings in $O(nm)$ delay with exponential space, where $n$ and $m$ are the number of vertices and edges of an input graph, respectively. We also present a polynomial-delay and polynomial-space enumeration algorithm for this problem. As a variant of this algorithm, we give an algorithm that enumerates all maximal matchings in non-decreasing order of cardinality and runs in $O(nm)$ delay.
Submitted 5 September, 2022; v1 submitted 10 May, 2021;
originally announced May 2021.
-
Constant Amortized Time Enumeration of Eulerian trails
Authors:
Kazuhiro Kurita,
Kunihiro Wasa
Abstract:
In this paper, we consider enumeration problems for edge-distinct and vertex-distinct Eulerian trails. Here, two Eulerian trails are \emph{edge-distinct} if the edge sequences are not identical, and they are \emph{vertex-distinct} if the vertex sequences are not identical. As the main result, we propose optimal enumeration algorithms for both problems, that is, these algorithms run in $\mathcal{O}(N)$ total time, where $N$ is the number of solutions. Our algorithms are based on the reverse search technique introduced by [Avis and Fukuda, DAM 1996], and the push out amortization technique introduced by [Uno, WADS 2015].
Submitted 25 January, 2021;
originally announced January 2021.
-
An Improved Deterministic Parameterized Algorithm for Cactus Vertex Deletion
Authors:
Yuuki Aoike,
Tatsuya Gima,
Tesshu Hanaka,
Masashi Kiyomi,
Yasuaki Kobayashi,
Yusuke Kobayashi,
Kazuhiro Kurita,
Yota Otachi
Abstract:
A cactus is a connected graph that does not contain $K_4 - e$ as a minor. Given a graph $G = (V, E)$ and integer $k \ge 0$, Cactus Vertex Deletion (also known as Diamond Hitting Set) is the problem of deciding whether $G$ has a vertex set of size at most $k$ whose removal leaves a forest of cacti. The current best deterministic parameterized algorithm for this problem was due to Bonnet et al. [WG 2016], which runs in time $26^kn^{O(1)}$, where $n$ is the number of vertices of $G$. In this paper, we design a deterministic algorithm for Cactus Vertex Deletion, which runs in time $17.64^kn^{O(1)}$. As a straightforward application of our algorithm, we give a $17.64^kn^{O(1)}$-time algorithm for Even Cycle Transversal. The idea behind this improvement is to apply the measure and conquer analysis with a slightly elaborate measure of instances.
Submitted 26 March, 2021; v1 submitted 9 December, 2020;
originally announced December 2020.
-
Linear-Delay Enumeration for Minimal Steiner Problems
Authors:
Yasuaki Kobayashi,
Kazuhiro Kurita,
Kunihiro Wasa
Abstract:
Kimelfeld and Sagiv [Kimelfeld and Sagiv, PODS 2006], [Kimelfeld and Sagiv, Inf. Syst. 2008] pointed out that the problem of enumerating $K$-fragments is of great importance in keyword search on data graphs. In graph-theoretic terms, the problem corresponds to enumerating minimal Steiner trees in (directed) graphs. In this paper, we propose a linear-delay and polynomial-space algorithm for enumerating all minimal Steiner trees, improving on a previous result in [Kimelfeld and Sagiv, Inf. Syst. 2008]. Our enumeration algorithm can be extended to other Steiner problems, such as minimal Steiner forests, minimal terminal Steiner trees, and minimal directed Steiner trees. As another variant of the minimal Steiner tree enumeration problem, we study the problem of enumerating minimal induced Steiner subgraphs. We propose a polynomial-delay and exponential-space enumeration algorithm of minimal induced Steiner subgraphs on claw-free graphs. Contrary to these tractable results, we show that the problem of enumerating minimal group Steiner trees is at least as hard as the minimal transversal enumeration problem on hypergraphs.
Submitted 12 May, 2022; v1 submitted 22 October, 2020;
originally announced October 2020.
-
Efficient Constant-Factor Approximate Enumeration of Minimal Subsets for Monotone Properties with Weight Constraints
Authors:
Yasuaki Kobayashi,
Kazuhiro Kurita,
Kunihiro Wasa
Abstract:
A property $Π$ on a finite set $U$ is \emph{monotone} if for every $X \subseteq U$ satisfying $Π$, every superset $Y \subseteq U$ of $X$ also satisfies $Π$. Many combinatorial properties can be seen as monotone properties. The problem of finding a minimum subset of $U$ satisfying $Π$ is a central problem in combinatorial optimization. Although many approximate/exact algorithms have been developed to solve this kind of problem on numerous properties, a solution obtained by these algorithms is often unsuitable for real-world applications due to the difficulty of building accurate mathematical models on real-world problems. A promising approach to overcome this difficulty is to \emph{enumerate} multiple small solutions rather than to \emph{find} a single small solution. To this end, given a weight function $w: U \to \mathbb N$ and an integer $k$, we devise algorithms that \emph{approximately} enumerate all minimal subsets of $U$ with weight at most $k$ satisfying $Π$ for various monotone properties $Π$, where "approximate enumeration" means that algorithms output all minimal subsets satisfying $Π$ whose weight is at most $k$ and may output some minimal subsets satisfying $Π$ whose weight exceeds $k$ but is at most $ck$ for some constant $c \ge 1$. These algorithms allow us to efficiently enumerate minimal vertex covers, minimal dominating sets in bounded degree graphs, minimal feedback vertex sets, minimal hitting sets in bounded rank hypergraphs, etc., of weight at most $k$ with constant approximation factors.
Submitted 20 February, 2021; v1 submitted 18 September, 2020;
originally announced September 2020.
-
Finding Diverse Trees, Paths, and More
Authors:
Tesshu Hanaka,
Yasuaki Kobayashi,
Kazuhiro Kurita,
Yota Otachi
Abstract:
Mathematical modeling is a standard approach to solve many real-world problems and {\em diversity} of solutions is an important issue, emerging in applying solutions obtained from mathematical models to real-world problems. Many studies have been devoted to finding diverse solutions. Baste et al. (Algorithms 2019, IJCAI 2020) recently initiated the study of computing diverse solutions of combinatorial problems from the perspective of fixed-parameter tractability. They considered problems of finding $r$ solutions that maximize some diversity measures (the minimum or sum of the pairwise Hamming distances among them) and gave some fixed-parameter tractable algorithms for the diverse version of several well-known problems, such as {\sc Vertex Cover}, {\sc Feedback Vertex Set}, {\sc $d$-Hitting Set}, and problems on bounded-treewidth graphs. In this work, we investigate the (fixed-parameter) tractability of problems of finding diverse spanning trees, paths, and several subgraphs. In particular, we show that, given a graph $G$ and an integer $r$, the problem of computing $r$ spanning trees of $G$ maximizing the sum of the pairwise Hamming distances among them can be solved in polynomial time. To the best of the authors' knowledge, this is the first polynomial-time solvable case for finding diverse solutions of unbounded size.
Submitted 13 December, 2020; v1 submitted 8 September, 2020;
originally announced September 2020.
-
Efficient Enumerations for Minimal Multicuts and Multiway Cuts
Authors:
Kazuhiro Kurita,
Yasuaki Kobayashi
Abstract:
Let $G = (V, E)$ be an undirected graph and let $B \subseteq V \times V$ be a set of terminal pairs. A node/edge multicut is a subset of vertices/edges of $G$ whose removal destroys all the paths between every terminal pair in $B$. The problem of computing a {\em minimum} node/edge multicut is NP-hard and extensively studied from several viewpoints. In this paper, we study the problem of enumerating all {\em minimal} node multicuts. We give an incremental polynomial delay enumeration algorithm for minimal node multicuts, which extends an enumeration algorithm due to Khachiyan et al. (Algorithmica, 2008) for minimal edge multicuts. Important special cases of node/edge multicuts are node/edge {\em multiway cuts}, where the set of terminal pairs contains every pair of vertices in some subset $T \subseteq V$, that is, $B = T \times T$. We improve the running time bound for this special case: We devise a polynomial delay and exponential space enumeration algorithm for minimal node multiway cuts and a polynomial delay and space enumeration algorithm for minimal edge multiway cuts.
Submitted 29 June, 2020;
originally announced June 2020.
-
Weight Poisoning Attacks on Pre-trained Models
Authors:
Keita Kurita,
Paul Michel,
Graham Neubig
Abstract:
Recently, NLP has seen a surge in the usage of large pre-trained models. Users download weights of models pre-trained on large datasets, then fine-tune the weights on a task of their choice. This raises the question of whether downloading untrusted pre-trained weights can pose a security threat. In this paper, we show that it is possible to construct ``weight poisoning'' attacks where pre-trained weights are injected with vulnerabilities that expose ``backdoors'' after fine-tuning, enabling the attacker to manipulate the model prediction simply by injecting an arbitrary keyword. We show that by applying a regularization method, which we call RIPPLe, and an initialization procedure, which we call Embedding Surgery, such attacks are possible even with limited knowledge of the dataset and fine-tuning procedure. Our experiments on sentiment classification, toxicity detection, and spam detection show that this attack is widely applicable and poses a serious threat. Finally, we outline practical defenses against such attacks. Code to reproduce our experiments is available at https://github.com/neulab/RIPPLe.
Submitted 14 April, 2020;
originally announced April 2020.
-
Towards Robust Toxic Content Classification
Authors:
Keita Kurita,
Anna Belova,
Antonios Anastasopoulos
Abstract:
Toxic content detection aims to identify content that can offend or harm its recipients. Automated classifiers of toxic content need to be robust against adversaries who deliberately try to bypass filters. We propose a method of generating realistic model-agnostic attacks using a lexicon of toxic tokens, which attempts to mislead toxicity classifiers by diluting the toxicity signal either by obfuscating toxic tokens through character-level perturbations, or by injecting non-toxic distractor tokens. We show that these realistic attacks reduce the detection recall of state-of-the-art neural toxicity detectors, including those using ELMo and BERT, by more than 50% in some cases. We explore two approaches for defending against such attacks. First, we examine the effect of training on synthetically noised data. Second, we propose the Contextual Denoising Autoencoder (CDAE): a method for learning robust representations that uses character-level and contextual information to denoise perturbed tokens. We show that the two approaches are complementary, improving robustness to both character-level perturbations and distractors, recovering a considerable portion of the lost accuracy. Finally, we analyze the robustness characteristics of the most competitive methods and outline practical considerations for improving toxicity detectors.
Submitted 14 December, 2019;
originally announced December 2019.
-
Constant Amortized Time Enumeration of Independent Sets for Graphs with Bounded Clique Number
Authors:
Kazuhiro Kurita,
Kunihiro Wasa,
Hiroki Arimura,
Takeaki Uno
Abstract:
In this study, we address the independent set enumeration problem. Although several efficient enumeration algorithms and careful analyses have been proposed for maximal independent sets, no fine-grained analysis has been given for the non-maximal variant. As our main result, we propose an algorithm $\texttt{EIS}$ for the non-maximal variant that runs in $O(q)$ amortized time and linear space, where $q$ is the clique number, i.e., the maximum size of a clique in the input graph. Note that $\texttt{EIS}$ works correctly even if the exact value of $q$ is unknown. Despite its simplicity, $\texttt{EIS}$ is optimal for graphs with a bounded clique number, such as triangle-free graphs, planar graphs, bounded degenerate graphs, locally bounded expansion graphs, and $F$-free graphs for any fixed graph $F$, where an $F$-free graph is a graph that has no copy of $F$ as a subgraph.
Submitted 9 July, 2019; v1 submitted 23 June, 2019;
originally announced June 2019.
-
Measuring Bias in Contextualized Word Representations
Authors:
Keita Kurita,
Nidhi Vyas,
Ayush Pareek,
Alan W Black,
Yulia Tsvetkov
Abstract:
Contextual word embeddings such as BERT have achieved state-of-the-art performance in numerous NLP tasks. Since they are optimized to capture the statistical properties of training data, they tend to pick up on and amplify social stereotypes present in the data as well. In this study, we (1) propose a template-based method to quantify bias in BERT; (2) show that this method obtains more consistent results in capturing social biases than the traditional cosine-based method; and (3) conduct a case study, evaluating gender bias in a downstream task of Gender Pronoun Resolution. Although our case study focuses on gender bias, the proposed technique is generalizable to unveiling other biases, including in multiclass settings, such as racial and religious biases.
Submitted 17 June, 2019;
originally announced June 2019.
-
An Efficient Algorithm for Enumerating Chordal Bipartite Induced Subgraphs in Sparse Graphs
Authors:
Kazuhiro Kurita,
Kunihiro Wasa,
Hiroki Arimura,
Takeaki Uno
Abstract:
In this paper, we propose a characterization of chordal bipartite graphs and an efficient enumeration algorithm for chordal bipartite induced subgraphs. A chordal bipartite graph is a bipartite graph without induced cycles of length six or more. It is known that the incidence graph of a hypergraph is a chordal bipartite graph if and only if the hypergraph is $β$-acyclic. As the main result of our paper, we show that a graph $G$ is chordal bipartite if and only if there is a special vertex elimination ordering for $G$, called CBEO. Moreover, we propose an algorithm ECB which enumerates all chordal bipartite induced subgraphs in $O(ktΔ^2)$ time per solution on average, where $k$ is the degeneracy, $t$ is the maximum size of $K_{t,t}$ as an induced subgraph, and $Δ$ is the maximum degree. ECB achieves constant amortized time enumeration for bounded degree graphs.
Submitted 5 March, 2019;
originally announced March 2019.
-
Efficient Enumeration of Subgraphs and Induced Subgraphs with Bounded Girth
Authors:
Kazuhiro Kurita,
Kunihiro Wasa,
Alessio Conte,
Hiroki Arimura,
Takeaki Uno
Abstract:
The girth of a graph is the length of its shortest cycle. Due to its relevance in graph theory, network analysis and practical fields such as distributed computing, girth-related problems have been the object of attention in both past and recent literature. In this paper, we consider the problem of listing connected subgraphs with bounded girth. As a large girth is an indicator of sparsity, this allows us to extract sparse structures from the input graph. We propose two algorithms, for enumerating respectively vertex-induced subgraphs and edge-induced subgraphs with bounded girth, both running in $O(n)$ amortized time per solution and using $O(n^3)$ space. Furthermore, the algorithms can be easily adapted to relax the connectivity requirement and to deal with weighted graphs. As a byproduct, the second algorithm can be used to answer the well-known question of finding the densest $n$-vertex graph(s) of girth $k$.
Submitted 11 June, 2018;
originally announced June 2018.
-
Efficient Enumeration of Dominating Sets for Sparse Graphs
Authors:
Kazuhiro Kurita,
Kunihiro Wasa,
Hiroki Arimura,
Takeaki Uno
Abstract:
A dominating set $D$ of a graph $G$ is a set of vertices such that every vertex of $G$ is in $D$ or has a neighbor in $D$. Enumeration of minimal dominating sets in a graph is one of the central problems in enumeration study, since enumeration of minimal dominating sets corresponds to enumeration of minimal hypergraph transversals. However, enumeration of dominating sets including non-minimal ones has not received much attention. In this paper, we address enumeration problems for dominating sets in sparse graphs, namely degenerate graphs and graphs with large girth, and we propose two algorithms for solving the problems. The first algorithm enumerates all the dominating sets of a $k$-degenerate graph in $O(k)$ time per solution using $O(n + m)$ space, where $n$ and $m$ are respectively the number of vertices and edges in the input graph. That is, the algorithm is optimal for graphs with constant degeneracy such as trees, planar graphs, and $H$-minor-free graphs for some fixed $H$. The second algorithm enumerates all the dominating sets in constant time per solution for input graphs with girth at least nine.
Submitted 28 September, 2018; v1 submitted 21 February, 2018;
originally announced February 2018.
-
Efficient Enumeration of Induced Matchings in a Graph without Cycles with Length Four
Authors:
Kazuhiro Kurita,
Kunihiro Wasa,
Takeaki Uno,
Hiroki Arimura
Abstract:
We address the induced matching enumeration problem. An edge set $M$ is an induced matching of a graph $G =(V,E)$ if $M$ is a matching and no two edges of $M$ are joined by an edge of $G$. The enumeration of matchings is widely studied in the literature, but induced matchings have not received much attention. A straightforward algorithm takes $O(|V|)$ time for each solution, which comes from the time needed to generate a subproblem. We investigate local structures that enable us to generate subproblems in a short time, and prove that the time complexity is $O(1)$ if the input graph is $C_4$-free. A $C_4$-free graph is a graph none of whose subgraphs is a cycle of length four. Finally, we show the fixed-parameter tractability of counting induced matchings for graphs with bounded tree-width and planar graphs.
Submitted 10 July, 2017;
originally announced July 2017.