-
A lower bound on the saturation number, and graphs for which it is sharp
Authors:
Alex Cameron,
Gregory J. Puleo
Abstract:
Let $H$ be a fixed graph. We say that a graph $G$ is $H$-saturated if it has no subgraph isomorphic to $H$, but the addition of any edge to $G$ results in an $H$-subgraph. The saturation number $\mathrm{sat}(H,n)$ is the minimum number of edges in an $H$-saturated graph on $n$ vertices. Kászonyi and Tuza, in 1986, gave a general upper bound on the saturation number of a graph $H$, but a nontrivial lower bound has remained elusive. In this paper we give a general lower bound on $\mathrm{sat}(H,n)$ and prove that it is asymptotically sharp (up to an additive constant) on a large class of graphs. This class includes all threshold graphs and many graphs for which the saturation number was previously determined exactly. Our work thus gives an asymptotic common generalization of several earlier results. The class also includes disjoint unions of cliques, allowing us to address an open problem of Faudree, Ferrara, Gould, and Jacobson.
Submitted 19 July, 2021; v1 submitted 11 April, 2020;
originally announced April 2020.
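The definition of an $H$-saturated graph in the abstract above can be checked by brute force on very small graphs. The following is only an illustrative sketch, not a method from the paper: the function names `contains_subgraph` and `is_saturated` are hypothetical, vertices are assumed to be labeled `0..n-1`, and the search tries every injective placement of $H$, so it is usable only for tiny inputs.

```python
from itertools import combinations, permutations


def contains_subgraph(g_edges, n, h_edges, h_n):
    """True if the graph on vertices 0..n-1 contains a (not necessarily
    induced) copy of H, where H has vertices 0..h_n-1."""
    g = {frozenset(e) for e in g_edges}
    # Try every injective map from V(H) into V(G).
    for image in permutations(range(n), h_n):
        if all(frozenset((image[u], image[v])) in g for u, v in h_edges):
            return True
    return False


def is_saturated(g_edges, n, h_edges, h_n):
    """True if G is H-free but adding any missing edge creates a copy of H."""
    g = {frozenset(e) for e in g_edges}
    if contains_subgraph(g, n, h_edges, h_n):
        return False
    for u, v in combinations(range(n), 2):
        e = frozenset((u, v))
        if e not in g and not contains_subgraph(g | {e}, n, h_edges, h_n):
            return False
    return True


triangle = [(0, 1), (1, 2), (0, 2)]
star = [(0, i) for i in range(1, 5)]   # K_{1,4}: joining two leaves closes a triangle
path = [(0, 1), (1, 2), (2, 3)]        # adding edge 0-3 gives a 4-cycle, no triangle
star_is_saturated = is_saturated(star, 5, triangle, 3)
path_is_saturated = is_saturated(path, 4, triangle, 3)
```

With $H = K_3$, the star is $K_3$-saturated while the four-vertex path is not, since adding the edge between its endpoints creates no triangle.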
-
Strong coloring 2-regular graphs: Cycle restrictions and partial colorings
Authors:
Jessica McDonald,
Gregory J. Puleo
Abstract:
Let $H$ be a graph with $Δ(H) \leq 2$, and let $G$ be obtained from $H$ by gluing in vertex-disjoint copies of $K_4$. We prove that if $H$ contains at most one odd cycle of length exceeding $3$, or if $H$ contains at most $3$ triangles, then $χ(G) \leq 4$. This proves the Strong Coloring Conjecture for such graphs $H$. For graphs $H$ with $Δ=2$ that are not covered by our theorem, we prove an approximation result towards the conjecture.
Submitted 6 July, 2021; v1 submitted 14 January, 2020;
originally announced January 2020.
-
Upper bounds for inverse domination in graphs
Authors:
Elliot Krop,
Jessica McDonald,
Gregory J. Puleo
Abstract:
In any graph $G$, the domination number $γ(G)$ is at most the independence number $α(G)$. The Inverse Domination Conjecture says that, in any isolate-free $G$, there exists a pair of vertex-disjoint dominating sets $D, D'$ with $|D|=γ(G)$ and $|D'| \leq α(G)$. Here we prove that this statement is true if the upper bound $α(G)$ is replaced by $\frac{3}{2}α(G) - 1$ (and $G$ is not a clique). We also prove that the conjecture holds whenever $γ(G)\leq 5$ or $|V(G)|\leq 16$.
Submitted 12 July, 2019;
originally announced July 2019.
-
Some results on multithreshold graphs
Authors:
Gregory J. Puleo
Abstract:
Jamison and Sprague defined a graph $G$ to be a $k$-threshold graph with thresholds $θ_1 , \ldots, θ_k$ (strictly increasing) if one can assign real numbers $(r_v)_{v \in V(G)}$, called ranks, such that for every pair of vertices $v,w$, we have $vw \in E(G)$ if and only if the inequality $θ_i \leq r_v + r_w$ holds for an odd number of indices $i$. When $k=1$ or $k=2$, the precise choice of thresholds $θ_1, \ldots, θ_k$ does not matter, as a suitable transformation of the ranks transforms a representation with one choice of thresholds into a representation with any other choice of thresholds. Jamison asked whether this remained true for $k \geq 3$ or whether different thresholds define different classes of graphs for such $k$, offering \$50 for a solution of the problem. Letting $C_t$ for $t > 1$ denote the class of $3$-threshold graphs with thresholds $-1, 1, t$, we prove that there are infinitely many distinct classes $C_t$, answering Jamison's question. We also consider some other problems on multithreshold graphs, some of which remain open.
Submitted 30 April, 2019;
originally announced May 2019.
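The parity rule that defines a $k$-threshold graph in the abstract above translates directly into code: a pair $vw$ is an edge exactly when an odd number of thresholds $θ_i$ satisfy $θ_i \leq r_v + r_w$. The sketch below is illustrative only; the function name `multithreshold_edges` and the sample ranks are assumptions, not taken from the paper.

```python
from itertools import combinations


def multithreshold_edges(ranks, thresholds):
    """Build the edge set of the graph represented by the given ranks.

    ranks: dict mapping each vertex to its real-valued rank r_v.
    thresholds: iterable of thresholds theta_1 < ... < theta_k.
    """
    edges = set()
    for v, w in combinations(sorted(ranks), 2):
        s = ranks[v] + ranks[w]
        # Count the thresholds theta_i with theta_i <= r_v + r_w.
        satisfied = sum(1 for t in thresholds if t <= s)
        if satisfied % 2 == 1:
            edges.add((v, w))
    return edges


# Hypothetical example with thresholds -1, 1, t for t = 2.
example_ranks = {0: -1, 1: 0, 2: 1, 3: 2}
example_edges = multithreshold_edges(example_ranks, [-1, 1, 2])
```

For these sample ranks the rank sums $-1$ and $0$ meet one threshold, the sums $2$ and $3$ meet all three, and the sum $1$ (which occurs for two pairs) meets exactly two, so the resulting graph is the 4-cycle with edges $01, 02, 13, 23$.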
-
Motif and Hypergraph Correlation Clustering
Authors:
Pan Li,
Gregory J. Puleo,
Olgica Milenkovic
Abstract:
Motivated by applications in social and biological network analysis, we introduce a new form of agnostic clustering termed~\emph{motif correlation clustering}, which aims to minimize the cost of clustering errors associated with both edges and higher-order network structures. The problem may be succinctly described as follows: Given a complete graph $G$, partition the vertices of the graph so that certain predetermined `important' subgraphs mostly lie within the same cluster, while `less relevant' subgraphs are allowed to lie across clusters. Our contributions are as follows: We first introduce several variants of motif correlation clustering and then show that these clustering problems are NP-hard. We then proceed to describe polynomial-time clustering algorithms that provide constant approximation guarantees for the problems at hand. Despite following the frequently used LP relaxation and rounding procedure, the algorithms involve a sophisticated and carefully designed neighborhood growing step that combines information about both edge and motif structures. We conclude with several examples illustrating the performance of the developed algorithms on synthetic and real networks.
Submitted 5 November, 2018;
originally announced November 2018.
-
Paired Threshold Graphs
Authors:
Vida Ravanmehr,
Gregory J. Puleo,
Sadegh Bolouki,
Olgica Milenkovic
Abstract:
Threshold graphs are recursive deterministic network models that have been proposed for describing certain economic and social interactions. One drawback of this graph family is that it has limited generative attachment rules. To mitigate this problem, we introduce a new class of graphs termed Paired Threshold (PT) graphs described through vertex weights that govern the existence of edges via two inequalities. One inequality imposes the constraint that the sum of weights of adjacent vertices has to exceed a specified threshold. The second inequality ensures that adjacent vertices have a weight difference upper bounded by another threshold. We provide a conceptually simple characterization and decomposition of PT graphs, analyze their forbidden induced subgraphs and present a method for performing vertex weight assignments on PT graphs that satisfy the defining constraints. Furthermore, we describe a polynomial-time algorithm for recognizing PT graphs. We conclude our exposition with an analysis of the intersection number, diameter and clustering coefficient of PT graphs.
Submitted 23 May, 2018; v1 submitted 31 March, 2016;
originally announced March 2016.
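The two inequalities described in the abstract above can be sketched as a small edge-generation routine. This is an illustration under stated assumptions rather than the paper's formal definition: the abstract does not say whether the inequalities are strict, so the code assumes non-strict comparisons, and it assumes that two vertices are adjacent exactly when both conditions hold. The names `paired_threshold_edges`, `sum_threshold`, and `diff_threshold` are hypothetical.

```python
from itertools import combinations


def paired_threshold_edges(weights, sum_threshold, diff_threshold):
    """Edge set generated from vertex weights via two inequalities.

    Assumption (not confirmed by the abstract): v and w are adjacent iff
    weights[v] + weights[w] >= sum_threshold and
    abs(weights[v] - weights[w]) <= diff_threshold.
    """
    edges = set()
    for v, w in combinations(sorted(weights), 2):
        total = weights[v] + weights[w]
        gap = abs(weights[v] - weights[w])
        if total >= sum_threshold and gap <= diff_threshold:
            edges.add((v, w))
    return edges


# Sample weights chosen so no pair sits exactly on a threshold,
# which keeps the result independent of the strictness assumption.
pt_weights = {0: 1, 1: 3, 2: 4, 3: 6}
pt_edges = paired_threshold_edges(pt_weights, sum_threshold=6, diff_threshold=2.5)
```

Here the pair $(0,3)$ has a large enough weight sum but too large a weight difference, so only the edges $12$ and $23$ appear.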
-
A new correlation clustering method for cancer mutation analysis
Authors:
Jack P. Hou,
Amin Emad,
Gregory J. Puleo,
Jian Ma,
Olgica Milenkovic
Abstract:
Cancer genomes exhibit a large number of different alterations that affect many genes in a diverse manner. It is widely believed that these alterations follow combinatorial patterns that have a strong connection with the underlying molecular interaction networks and functional pathways. A better understanding of the generative mechanisms behind the mutation rules and their influence on gene communities is of great importance for the process of driver mutations discovery and for identification of network modules related to cancer development and progression. We developed a new method for cancer mutation pattern analysis based on a constrained form of correlation clustering. Correlation clustering is an agnostic learning method that can be used for general community detection problems in which the number of communities or their structure is not known beforehand. The resulting algorithm, named $C^3$, leverages mutual exclusivity of mutations, patient coverage, and driver network concentration principles; it accepts as its input a user determined combination of heterogeneous patient data, such as that available from TCGA (including mutation, copy number, and gene expression information), and creates a large number of clusters containing mutually exclusive mutated genes in a particular type of cancer. The cluster sizes may be required to obey some useful soft size constraints, without impacting the computational complexity of the algorithm. To test $C^3$, we performed a detailed analysis on TCGA breast cancer and glioblastoma data and showed that our algorithm outperforms the state-of-the-art CoMEt method in terms of discovering mutually exclusive gene modules and identifying driver genes. Our $C^3$ method represents a unique tool for efficient and reliable identification of mutation patterns and driver pathways in large-scale cancer genomics studies.
Submitted 24 January, 2016;
originally announced January 2016.
-
Correlation Clustering and Biclustering with Locally Bounded Errors
Authors:
Gregory J. Puleo,
Olgica Milenkovic
Abstract:
We consider a generalized version of the correlation clustering problem, defined as follows. Given a complete graph $G$ whose edges are labeled with $+$ or $-$, we wish to partition the graph into clusters while trying to avoid errors: $+$ edges between clusters or $-$ edges within clusters. Classically, one seeks to minimize the total number of such errors. We introduce a new framework that allows the objective to be a more general function of the number of errors at each vertex (for example, we may wish to minimize the number of errors at the worst vertex) and provide a rounding algorithm which converts "fractional clusterings" into discrete clusterings while causing only a constant-factor blowup in the number of errors at each vertex. This rounding algorithm yields constant-factor approximation algorithms for the discrete problem under a wide variety of objective functions.
Submitted 24 May, 2016; v1 submitted 26 June, 2015;
originally announced June 2015.
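The notion of errors at each vertex from the abstract above can be made concrete with a short counting routine: a pair is an error if it is a $+$ edge between clusters or a $-$ edge within a cluster, and each error is charged to both endpoints. This sketch only evaluates a given clustering; it is not the rounding algorithm from the paper, and the names `errors_per_vertex`, `plus_edges`, and `cluster_of` are assumptions made for illustration.

```python
from itertools import combinations


def errors_per_vertex(vertices, plus_edges, cluster_of):
    """Count clustering errors at each vertex of a +/- labeled complete graph.

    plus_edges: pairs labeled '+'; every other pair is treated as '-'.
    cluster_of: dict mapping each vertex to its cluster label.
    """
    plus = {frozenset(e) for e in plus_edges}
    errors = {v: 0 for v in vertices}
    for u, v in combinations(vertices, 2):
        same_cluster = cluster_of[u] == cluster_of[v]
        is_plus = frozenset((u, v)) in plus
        # Error: '+' edge across clusters, or '-' edge inside a cluster.
        if same_cluster != is_plus:
            errors[u] += 1
            errors[v] += 1
    return errors


vertices = [0, 1, 2, 3]
plus_edges = [(0, 1), (1, 2), (2, 3)]
cluster_of = {0: "A", 1: "A", 2: "B", 3: "B"}
errs = errors_per_vertex(vertices, plus_edges, cluster_of)
total_errors = sum(errs.values()) // 2   # classical objective
worst_vertex_errors = max(errs.values()) # example of a local objective
```

In this example the only error is the $+$ edge $12$ that crosses clusters, so the classical objective and the worst-vertex objective both equal 1.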
-
Complexity of a Disjoint Matching Problem on Bipartite Graphs
Authors:
Gregory J. Puleo
Abstract:
We consider the following question: given an $(X,Y)$-bigraph $G$ and a set $S \subset X$, does $G$ contain two disjoint matchings $M_1$ and $M_2$ such that $M_1$ saturates $X$ and $M_2$ saturates $S$? When $|S|\geq |X|-1$, this question is solvable by finding an appropriate factor of the graph. In contrast, we show that when $S$ is allowed to be an arbitrary subset of $X$, the problem is NP-hard.
Submitted 19 June, 2015;
originally announced June 2015.
-
Codes for DNA Sequence Profiles
Authors:
Han Mao Kiah,
Gregory J. Puleo,
Olgica Milenkovic
Abstract:
We consider the problem of storing and retrieving information from synthetic DNA media. The mathematical basis of the problem is the construction and design of sequences that may be discriminated based on their collection of substrings observed through a noisy channel. This problem of reconstructing sequences from traces was first investigated in the noiseless setting under the name of "Markov type" analysis. Here, we explain the connection between the reconstruction problem and the problem of DNA synthesis and sequencing, and introduce the notion of a DNA storage channel. We analyze the number of sequence equivalence classes under the channel mapping and propose new asymmetric coding techniques to combat the effects of synthesis and sequencing noise. In our analysis, we make use of restricted de Bruijn graphs and Ehrhart theory for rational polytopes.
Submitted 2 February, 2015;
originally announced February 2015.
-
Correlation Clustering with Constrained Cluster Sizes and Extended Weights Bounds
Authors:
Gregory J. Puleo,
Olgica Milenkovic
Abstract:
We consider the problem of correlation clustering on graphs with constraints on both the cluster sizes and the positive and negative weights of edges. Our contributions are twofold: First, we introduce the problem of correlation clustering with bounded cluster sizes. Second, we extend the regime of weight values for which the clustering may be performed with constant approximation guarantees in polynomial time and apply the results to the bounded cluster size problem.
Submitted 22 May, 2015; v1 submitted 3 November, 2014;
originally announced November 2014.
-
Codes for DNA Storage Channels
Authors:
Han Mao Kiah,
Gregory J. Puleo,
Olgica Milenkovic
Abstract:
We consider the problem of assembling a sequence based on a collection of its substrings observed through a noisy channel. The mathematical basis of the problem is the construction and design of sequences that may be discriminated based on a collection of their substrings observed through a noisy channel. We explain the connection between the sequence reconstruction problem and the problem of DNA synthesis and sequencing, and introduce the notion of a DNA storage channel. We analyze the number of sequence equivalence classes under the channel mapping and propose new asymmetric coding techniques to combat the effects of synthesis and sequencing noise. In our analysis, we make use of restricted de Bruijn graphs and Ehrhart theory for rational polytopes.
Submitted 3 November, 2015; v1 submitted 31 October, 2014;
originally announced October 2014.
-
Computing Similarity Distances Between Rankings
Authors:
Farzad Farnoud,
Lili Su,
Gregory J. Puleo,
Olgica Milenkovic
Abstract:
We address the problem of computing distances between rankings that take into account similarities between candidates. The need for evaluating such distances is governed by applications as diverse as rank aggregation, bioinformatics, social sciences and data storage. The problem may be summarized as follows: Given two rankings and a positive cost function on transpositions that depends on the similarity of the candidates involved, find a smallest cost sequence of transpositions that converts one ranking into another. Our focus is on costs that may be described via special metric-tree structures and on complete rankings modeled as permutations. The presented results include a quadratic-time algorithm for finding a minimum cost decomposition for simple cycles, and a quadratic-time, $4/3$-approximation algorithm for permutations that contain multiple cycles. The proposed methods rely on investigating a newly introduced balancing property of cycles embedded in trees, cycle-merging methods, and shortest path optimization techniques.
Submitted 19 November, 2014; v1 submitted 16 July, 2013;
originally announced July 2013.
-
Revolutionaries and spies: Spy-good and spy-bad graphs
Authors:
Jane V. Butterfield,
Daniel W. Cranston,
Gregory J. Puleo,
Douglas B. West,
Reza Zamani
Abstract:
We study a game on a graph $G$ played by $r$ {\it revolutionaries} and $s$ {\it spies}. Initially, revolutionaries and then spies occupy vertices. In each subsequent round, each revolutionary may move to a neighboring vertex or not move, and then each spy has the same option. The revolutionaries win if $m$ of them meet at some vertex having no spy (at the end of a round); the spies win if they can avoid this forever.
Let $σ(G,m,r)$ denote the minimum number of spies needed to win. To avoid degenerate cases, assume $|V(G)|\ge r-m+1\ge\floor{r/m}\ge 1$. The easy bounds are then $\floor{r/m}\le σ(G,m,r)\le r-m+1$. We prove that the lower bound is sharp when $G$ has a rooted spanning tree $T$ such that every edge of $G$ not in $T$ joins two vertices having the same parent in $T$. As a consequence, $σ(G,m,r)\le γ(G)\floor{r/m}$, where $γ(G)$ is the domination number; this bound is nearly sharp when $γ(G)\le m$.
For the random graph with constant edge-probability $p$, we obtain constants $c$ and $c'$ (depending on $m$ and $p$) such that $σ(G,m,r)$ is near the trivial upper bound when $r<c\ln n$ and at most $c'$ times the trivial lower bound when $r>c'\ln n$. For the hypercube $Q_d$ with $d\ge r$, we have $σ(G,m,r)=r-m+1$ when $m=2$, and for $m\ge 3$ at least $r-39m$ spies are needed.
For complete $k$-partite graphs with partite sets of size at least $2r$, the leading term in $σ(G,m,r)$ is approximately $\frac{k}{k-1}\frac{r}{m}$ when $k\ge m$. For $k=2$, we have $σ(G,2,r)=\bigl\lceil{\frac{\floor{7r/2}-3}5}\bigr\rceil$ and $σ(G,3,r)=\floor{r/2}$, and in general $\frac{3r}{2m}-3\le σ(G,m,r)\le\frac{(1+1/\sqrt3)r}{m}$.
Submitted 26 May, 2012; v1 submitted 13 February, 2012;
originally announced February 2012.