-
Cycles of weight divisible by $k$
Authors:
Ajit A. Diwan
Abstract:
A weighted (directed) graph is a (directed) graph with integer weights assigned to its vertices and edges. The weight of a subgraph is the sum of weights of vertices and edges in the subgraph. The problem of determining the largest order $f(k)$ of a weighted complete directed graph that does not contain a directed cycle of weight divisible by $k$, for an integer $k \ge 2$, was raised by Alon and Krivelevich [J. Graph Theory 98 (2021) 623-629]. They showed that $f(k)$ is $O(k\log k)$ and $f(k) \le 2k-2$ if $k$ is prime. The best bounds known to us are $f(k) \le 2k-2$ for all $k$ and $f(k) < (3k-1)/2$ for prime $k$. It is also known that $f(k) \ge k$ and this is believed to be the correct value. We prove that $f(k) < k+2\Omega(k)$, where $\Omega(k)$ is the number of prime factors, not necessarily distinct, in the prime factorization of $k$.
We also show that any weighted undirected graph of minimum degree $2k-1$ contains a cycle of weight divisible by $k$. This result is proved in the more general setting in which the weights are from a finite abelian group of order $k$, and the cycle has weight equal to the group identity. We conjecture that this holds for undirected graphs with minimum degree $k+1$.
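The definition of $f(k)$ can be explored by brute force on very small cases. The sketch below is only an illustration and is not taken from the paper: it assumes that a directed cycle has length at least 2, and the helper names (`directed_cycles`, `cycle_weight`, `has_divisible_cycle`, `ordered_weights`) are hypothetical. The example weighting gives every edge from a smaller to a larger index weight 1 and every other edge and every vertex weight 0, so a cycle's weight is its number of "forward" edges.

```python
from itertools import combinations, permutations


def directed_cycles(n):
    """Yield each directed cycle of the complete digraph on n vertices once.

    Assumption: cycles have length >= 2. Each cycle is reported as a tuple
    that starts at its smallest vertex.
    """
    for size in range(2, n + 1):
        for subset in combinations(range(n), size):
            first, rest = subset[0], subset[1:]
            for order in permutations(rest):
                yield (first,) + order


def cycle_weight(cycle, vertex_w, edge_w):
    """Sum the weights of the vertices and directed edges on the cycle."""
    total = sum(vertex_w[v] for v in cycle)
    for i, u in enumerate(cycle):
        v = cycle[(i + 1) % len(cycle)]
        total += edge_w[(u, v)]
    return total


def has_divisible_cycle(n, k, vertex_w, edge_w):
    """Return True if some directed cycle has weight divisible by k."""
    return any(
        cycle_weight(c, vertex_w, edge_w) % k == 0 for c in directed_cycles(n)
    )


def ordered_weights(n):
    """Illustrative weighting: forward edges weigh 1, everything else 0."""
    vertex_w = [0] * n
    edge_w = {
        (u, v): 1 if u < v else 0
        for u in range(n)
        for v in range(n)
        if u != v
    }
    return vertex_w, edge_w


if __name__ == "__main__":
    k = 4
    print(has_divisible_cycle(k, k, *ordered_weights(k)))
    print(has_divisible_cycle(k + 1, k, *ordered_weights(k + 1)))
```

With this weighting on $k$ vertices, every cycle has between 1 and $k-1$ forward edges, so no cycle weight is divisible by $k$, which is consistent with the stated bound $f(k) \ge k$. The search is exponential in the number of vertices and is meant only for checking tiny examples.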
Submitted 1 July, 2024;
originally announced July 2024.
-
Planar Cycle-Extendable Graphs
Authors:
Aditya Y Dalwadi,
Kapil R Shenvi Pause,
Ajit A Diwan,
Nishad Kothari
Abstract:
For most problems pertaining to perfect matchings, one may restrict attention to matching covered graphs -- that is, connected nontrivial graphs with the property that each edge belongs to some perfect matching. There is extensive literature on these graphs, which are also known as $1$-extendable graphs (since each edge extends to a perfect matching), including an ear decomposition theorem due to Lovász and Plummer.
A cycle $C$ of a graph $G$ is conformal if $G-V(C)$ has a perfect matching; such cycles play an important role in the study of perfect matchings, especially when investigating the Pfaffian orientation problem. A matching covered graph $G$ is cycle-extendable if -- for each even cycle $C$ -- the cycle $C$ is conformal, or equivalently, each perfect matching of $C$ extends to a perfect matching of $G$, or equivalently, $C$ is the symmetric difference of two perfect matchings of $G$, or equivalently, $C$ extends to an ear decomposition of $G$. In the literature, these are also known as cycle-nice or as $1$-cycle resonant graphs.
Zhang, Wang, Yuan, Ng and Cheng [Discrete Mathematics, 345:7 (2022), 112876] provided a characterization of claw-free cycle-extendable graphs. Guo and Zhang [Discrete Mathematics, 275:1-3 (2004), 151-164] and independently Zhang and Li [Discrete Applied Mathematics, 160:13-14 (2012), 2069-2074], provided characterizations of bipartite planar cycle-extendable graphs. In this paper, we establish a characterization of all planar cycle-extendable graphs -- in terms of $K_2$ and four infinite families.
Submitted 27 June, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
-
Extremal minimal bipartite matching covered graphs
Authors:
Amit Kumar Mallik,
Ajit A. Diwan,
Nishad Kothari
Abstract:
A connected graph, on four or more vertices, is matching covered if every edge is present in some perfect matching. An ear decomposition theorem (similar to the one for $2$-connected graphs) exists for bipartite matching covered graphs due to Hetyei. From the results and proofs of Lovász and Plummer, that rely on Hetyei's theorem, one may deduce that any minimal bipartite matching covered graph has at least $2(m-n+2)$ vertices of degree two (where minimal means that deleting any edge results in a graph that is not matching covered); such a graph is said to be extremal if it attains the stated lower bound.
In this paper, we provide a complete characterization of the class of extremal minimal bipartite matching covered graphs. In particular, we prove that every such graph $G$ is obtained from two copies of a tree devoid of degree two vertices, say $T$ and $T'$, by adding edges -- each of which joins a leaf of $T$ with the corresponding leaf of $T'$.
Apart from the aforementioned bound, there are four other bounds that appear in, or may be deduced from, the work of Lovász and Plummer. Each of these bounds leads to a notion of extremality. In this paper, we obtain a complete characterization of all of these extremal classes and also establish relationships between them. Two of our characterizations are in the same spirit as the one stated above. For the remaining two extremal classes, we reduce each of them to one of the already characterized extremal classes using standard matching theoretic operations.
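The two-tree construction can be checked against the bound $2(m-n+2)$ by direct counting. The sketch below is an illustration only, with hypothetical helper names (`double_tree`, `degree_two_count`, `lower_bound`): it builds the graph from a tree given as an edge list and compares the number of degree-two vertices with the bound. It does not test whether the result is minimal or matching covered.

```python
def double_tree(tree_edges):
    """Join two copies of a tree by an edge between each pair of corresponding leaves.

    Vertices of the result are pairs (v, copy) with copy in {0, 1}.
    """
    degree = {}
    for u, v in tree_edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    if any(d == 2 for d in degree.values()):
        raise ValueError("tree must have no vertices of degree two")
    leaves = [v for v, d in degree.items() if d == 1]
    edges = [((u, 0), (v, 0)) for u, v in tree_edges]
    edges += [((u, 1), (v, 1)) for u, v in tree_edges]
    edges += [((leaf, 0), (leaf, 1)) for leaf in leaves]
    return edges


def degree_two_count(edges):
    """Count the vertices of degree exactly two."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sum(1 for d in degree.values() if d == 2)


def lower_bound(edges):
    """Evaluate 2(m - n + 2) for a graph given as an edge list."""
    vertices = {x for edge in edges for x in edge}
    return 2 * (len(edges) - len(vertices) + 2)


if __name__ == "__main__":
    # A star with three leaves has no vertex of degree two.
    star = [("c", "a"), ("c", "b"), ("c", "d")]
    g = double_tree(star)
    print(degree_two_count(g), lower_bound(g))
```

The agreement is not specific to the star: if the tree has $t$ vertices and $\ell$ leaves, the resulting graph has $n = 2t$ vertices, $m = 2(t-1) + \ell$ edges and exactly $2\ell$ vertices of degree two, and $2(m-n+2) = 2\ell$.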
Submitted 11 April, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants
Authors:
Anuj Diwan,
Eunsol Choi,
David Harwath
Abstract:
We present the first unified study of the efficiency of self-attention-based Transformer variants spanning text, speech and vision. We identify input length thresholds (tipping points) at which efficient Transformer variants become more efficient than vanilla models, using a variety of efficiency metrics (latency, throughput, and memory). To conduct this analysis for speech, we introduce L-HuBERT, a novel local-attention variant of a self-supervised speech model. We observe that these thresholds are (a) much higher than typical dataset sequence lengths and (b) dependent on the metric and modality, showing that choosing the right model depends on modality, task type (long-form vs. typical context) and resource constraints (time vs. memory). By visualising the breakdown of the computational costs for transformer components, we also show that non-self-attention components exhibit significant computational costs. We release our profiling toolkit at https://github.com/ajd12342/profiling-transformers .
Submitted 14 June, 2023;
originally announced June 2023.
-
Colouring planar graphs with a precoloured induced cycle
Authors:
Ajit Diwan
Abstract:
Let $C$ be a cycle and $f : V(C) \rightarrow \{c_1,c_2,\ldots,c_k\}$ a proper $k$-colouring of $C$ for some $k \ge 4$. We say the colouring $f$ is safe if for any planar graph $G$ in which $C$ is an induced cycle, there exists a proper $k$-colouring $f'$ of $G$ such that $f'(v) = f(v)$ for all $v \in V(C)$. The only safe $4$-colouring is any proper colouring of a triangle. We give a simple necessary condition for a $k$-colouring of a cycle to be safe and conjecture that it is sufficient for all $k \ge 4$. The sufficiency for $k=4$ follows from the four colour theorem and we prove it for $k = 5$, independent of the four colour theorem. We show that a stronger condition is sufficient for all $k \ge 4$. As a consequence, it follows that any proper $k$-colouring of a cycle that uses at most $k-3$ distinct colours is safe. Also, any proper $k$-colouring of a cycle of length at most $2k-5$ that uses at most $k-1$ distinct colours is safe.
Submitted 8 June, 2023;
originally announced June 2023.
-
Textless Low-Resource Speech-to-Speech Translation With Unit Language Models
Authors:
Anuj Diwan,
Anirudh Srinivasan,
David Harwath,
Eunsol Choi
Abstract:
Existing speech-to-speech translation models fall into two camps: textless models trained with hundreds of hours of parallel speech data or unsupervised models that leverage text as an intermediate step. Both approaches limit building speech-to-speech translation models for a wide range of languages, as they exclude languages that are primarily spoken and language pairs that lack large-scale parallel speech data. We present a new framework for training textless low-resource speech-to-speech translation (S2ST) systems that only need dozens of hours of parallel speech data. We reformulate S2ST as a unit-to-unit seq2seq translation task, and start by pretraining a model on large-scale monolingual speech data. Then, we finetune it with a small amount of parallel speech data ($20-60$ hours). Lastly, we improve model performance through an unsupervised backtranslation objective. We train and evaluate our models for English-to-German, German-to-English and Marathi-to-English translation on three different domains (European Parliament, Common Voice, and All India Radio) with single-speaker synthesized speech data. Evaluated using the ASR-BLEU metric, our models achieve reasonable performance on all three domains, with some being within 1-2 points of our supervised topline.
Submitted 20 February, 2024; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Continual Learning for On-Device Speech Recognition using Disentangled Conformers
Authors:
Anuj Diwan,
Ching-Feng Yeh,
Wei-Ning Hsu,
Paden Tomasello,
Eunsol Choi,
David Harwath,
Abdelrahman Mohamed
Abstract:
Automatic speech recognition research focuses on training and evaluating on static datasets. Yet, as speech models are increasingly deployed on personal devices, such models encounter user-specific distributional shifts. To simulate this real-world scenario, we introduce LibriContinual, a continual learning benchmark for speaker-specific domain adaptation derived from LibriVox audiobooks, with data corresponding to 118 individual speakers and 6 train splits per speaker of different sizes. Additionally, current speech recognition models and continual learning algorithms are not optimized to be compute-efficient. We adapt a general-purpose training algorithm NetAug for ASR and create a novel Conformer variant called the DisConformer (Disentangled Conformer). This algorithm produces ASR models consisting of a frozen 'core' network for general-purpose use and several tunable 'augment' networks for speaker-specific tuning. Using such models, we propose a novel compute-efficient continual learning algorithm called DisentangledCL. Our experiments show that the DisConformer models significantly outperform baselines on general ASR i.e. LibriSpeech (15.58% rel. WER on test-other). On speaker-specific LibriContinual they significantly outperform trainable-parameter-matched baselines (by 20.65% rel. WER on test) and even match fully finetuned baselines in some settings.
Submitted 13 December, 2022; v1 submitted 2 December, 2022;
originally announced December 2022.
-
Machine Learning enabled models for YouTube Ranking Mechanism and Views Prediction
Authors:
Vandit Gupta,
Akshit Diwan,
Chaitanya Chadha,
Ashish Khanna,
Deepak Gupta
Abstract:
With the continuous increase of internet usage today, everyone is influenced by the power of this technology. Due to this, the rise of applications and games is unstoppable. A major percentage of our population uses these applications for multiple purposes, ranging from education, communication, news, and entertainment to many more. Of these, the applications that make sure the world stays in touch with each other and with current affairs are social media. Social media applications have seen a boom in the last 10 years with the introduction of smartphones and the internet being available at affordable prices. Applications like Twitch and YouTube are some of the best platforms for producing content and expressing talent. It is the goal of every content creator to post the best and most reliable content so that they can gain recognition. It is important to know the methods of achieving popularity easily, which is what this paper proposes to bring to the spotlight. There should be certain parameters based on which the reach of content could be multiplied by a good factor. The proposed research work aims to identify and estimate the reach, popularity, and views of a YouTube video from certain features, using machine learning and AI techniques. A ranking system would also be used, keeping the trending videos in consideration. This would eventually help content creators know how authentic their content is and would encourage healthy competition to make better content before uploading a video to the platform.
Submitted 15 November, 2022;
originally announced November 2022.
-
Zero-shot Video Moment Retrieval With Off-the-Shelf Models
Authors:
Anuj Diwan,
Puyuan Peng,
Raymond J. Mooney
Abstract:
For the majority of the machine learning community, the expensive nature of collecting high-quality human-annotated data and the inability to efficiently finetune very large state-of-the-art pretrained models on limited compute are major bottlenecks for building models for new tasks. We propose a zero-shot simple approach for one such task, Video Moment Retrieval (VMR), that does not perform any additional finetuning and simply repurposes off-the-shelf models trained on other tasks. Our three-step approach consists of moment proposal, moment-query matching and postprocessing, all using only off-the-shelf models. On the QVHighlights benchmark for VMR, we vastly improve performance of previous zero-shot approaches by at least 2.5x on all metrics and reduce the gap between zero-shot and state-of-the-art supervised by over 74%. Further, we also show that our zero-shot approach beats non-pretrained supervised models on the Recall metrics and comes very close on mAP metrics; and that it also performs better than the best pretrained supervised model on shorter moments. Finally, we ablate and analyze our results and propose interesting future directions.
Submitted 3 November, 2022;
originally announced November 2022.
-
Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality
Authors:
Anuj Diwan,
Layne Berry,
Eunsol Choi,
David Harwath,
Kyle Mahowald
Abstract:
Recent visuolinguistic pre-trained models show promising progress on various end tasks such as image retrieval and video captioning. Yet, they fail miserably on the recently proposed Winoground dataset, which challenges models to match paired images and English captions, with items constructed to overlap lexically but differ in meaning (e.g., "there is a mug in some grass" vs. "there is some grass in a mug"). By annotating the dataset using new fine-grained tags, we show that solving the Winoground task requires not just compositional language understanding, but a host of other abilities like commonsense reasoning or locating small, out-of-focus objects in low-resolution images. In this paper, we identify the dataset's main challenges through a suite of experiments on related tasks (probing task, image retrieval task), data augmentation, and manual inspection of the dataset. Our analysis suggests that a main challenge in visuolinguistic models may lie in fusing visual and textual representations, rather than in compositional language understanding. We release our annotation and code at https://github.com/ajd12342/why-winoground-hard .
Submitted 3 December, 2022; v1 submitted 1 November, 2022;
originally announced November 2022.
-
Another simple reformulation of the four color theorem
Authors:
Ajit Diwan
Abstract:
We give a simple reformulation of the four color theorem as a problem on strings over a four letter alphabet.
Submitted 23 August, 2021;
originally announced August 2021.
-
Multilingual and code-switching ASR challenges for low resource Indian languages
Authors:
Anuj Diwan,
Rakesh Vaideeswaran,
Sanket Shah,
Ankita Singh,
Srinivasa Raghavan,
Shreya Khare,
Vinit Unni,
Saurabh Vyas,
Akash Rajpuria,
Chiranjeevi Yarra,
Ashish Mittal,
Prasanta Kumar Ghosh,
Preethi Jyothi,
Kalika Bali,
Vivek Seshadri,
Sunayana Sitaram,
Samarth Bharadwaj,
Jai Nanavati,
Raoul Nanavati,
Karthik Sankaranarayanan,
Tejaswi Seeram,
Basil Abraham
Abstract:
Recently, there is increasing interest in multilingual automatic speech recognition (ASR) where a speech recognition system caters to multiple low resource languages by taking advantage of low amounts of labeled corpora in multiple languages. With multilingualism becoming common in today's world, there has been increasing interest in code-switching ASR as well. In code-switching, multiple languages are freely interchanged within a single sentence or between sentences. The success of low-resource multilingual and code-switching ASR often depends on the variety of languages in terms of their acoustics, linguistic characteristics as well as the amount of data available and how these are carefully considered in building the ASR system. In this challenge, we would like to focus on building multilingual and code-switching ASR systems through two different subtasks related to a total of seven Indian languages, namely Hindi, Marathi, Odia, Tamil, Telugu, Gujarati and Bengali. For this purpose, we provide a total of ~600 hours of transcribed speech data, comprising train and test sets, in these languages including two code-switched language pairs, Hindi-English and Bengali-English. We also provide a baseline recipe for both the tasks with a WER of 30.73% and 32.45% on the test sets of multilingual and code-switching subtasks, respectively.
Submitted 31 March, 2021;
originally announced April 2021.
-
Reduce and Reconstruct: ASR for Low-Resource Phonetic Languages
Authors:
Anuj Diwan,
Preethi Jyothi
Abstract:
This work presents a seemingly simple but effective technique to improve low-resource ASR systems for phonetic languages. By identifying sets of acoustically similar graphemes in these languages, we first reduce the output alphabet of the ASR system using linguistically meaningful reductions and then reconstruct the original alphabet using a standalone module. We demonstrate that this lessens the burden and improves the performance of low-resource end-to-end ASR systems (because only reduced-alphabet predictions are needed) and that it is possible to design a very simple but effective reconstruction module that recovers sequences in the original alphabet from sequences in the reduced alphabet. We present a finite state transducer-based reconstruction module that operates on the 1-best ASR hypothesis in the reduced alphabet. We demonstrate the efficacy of our proposed technique using ASR systems for two Indian languages, Gujarati and Telugu. With access to only 10 hrs of speech data, we obtain relative WER reductions of up to 7% compared to systems that do not use any reduction.
Submitted 3 June, 2021; v1 submitted 19 October, 2020;
originally announced October 2020.
-
On colouring point visibility graphs
Authors:
Ajit Arvind Diwan,
Bodhayan Roy
Abstract:
In this paper we show that it can be decided in polynomial time whether or not the visibility graph of a given point set is 4-colourable, and such a 4-colouring, if it exists, can also be constructed in polynomial time. We show that the problem of deciding whether the visibility graph of a point set is 5-colourable, is NP-complete. We give an example of a point visibility graph that has chromatic number 6 while its clique number is only 4.
Submitted 24 June, 2017; v1 submitted 4 October, 2016;
originally announced October 2016.
-
Partitions of planar point sets into polygons
Authors:
Ajit Arvind Diwan,
Bodhayan Roy
Abstract:
In this paper, we characterize planar point sets that can be partitioned into disjoint polygons of arbitrarily specified sizes. We provide an algorithm to construct such a partition, if it exists, in polynomial time. We show that this problem is equivalent to finding a specified $2$-factor in the visibility graph of the point set. The characterization for the case where all cycles have length $3$ also translates to finding a $K_3$-factor of the visibility graph of the point set. We show that the generalized problem of finding a $K_k$-factor of the visibility graph of a given point set for $k \geq 5$ is NP-hard.
Submitted 18 May, 2016;
originally announced May 2016.
-
On the Maximum Rate of Networked Computation in a Capacitated Network
Authors:
Pooja Vyavahare,
Nutan Limaye,
Ajit A. Diwan,
D. Manjunath
Abstract:
Given a capacitated communication network $\mathcal{N}$ and a function $f$ that needs to be computed on $\mathcal{N},$ we study the problem of generating a computation and communication schedule in $\mathcal{N}$ to maximize the rate of computation of $f$. Shah et al. [IEEE Journal of Selected Areas in Communication, 2013] studied this problem when the computation schema $\mathcal{G}$ for $f$ is a tree. We define the notion of a schedule when $\mathcal{G}$ is a general DAG and show that finding an optimal schedule is equivalent to finding the solution of a packing LP. We prove that approximating the maximum rate is MAX SNP-hard by looking at the packing LP. For this packing LP we prove that solving the separation oracle of its dual is equivalent to solving the LP. The separation oracle of the dual reduces to the problem of finding a minimum cost embedding given $\mathcal{N},\mathcal{G},$ which we prove to be MAX SNP-hard even when $\mathcal{G}$ has bounded degree and bounded edge weights and $\mathcal{N}$ has just three vertices. We present a polynomial time algorithm to compute the maximum rate of function computation when $\mathcal{N}$ has two vertices by reducing the problem to a version of the submodular function minimization problem. For general $\mathcal{N}$ we study a restricted class of schedules and its equivalent packing LP. We observe that for this packing LP also the separation oracle of its dual reduces to finding a minimum cost embedding. A version of this minimum cost embedding problem has been studied in the literature. We present a quadratic integer program for the minimum cost embedding problem and its linear programming relaxation based on the earthmover metric. We also present some approximation algorithms for special classes of $\mathcal{G}.$
Submitted 21 January, 2016; v1 submitted 15 July, 2015;
originally announced July 2015.
-
Four-connected triangulations of planar point sets
Authors:
Ajit Arvind Diwan,
Subir Kumar Ghosh,
Bodhayan Roy
Abstract:
In this paper, we consider the problem of determining in polynomial time whether a given planar point set $P$ of $n$ points admits a 4-connected triangulation. We propose a necessary and sufficient condition for recognizing $P$, and present an $O(n^3)$ algorithm for constructing a 4-connected triangulation of $P$. Thus, our algorithm solves a longstanding open problem in computational geometry and geometric graph theory. We also provide a simple method for constructing a noncomplex triangulation of $P$ which requires $O(n^2)$ steps. This method provides new insight into the structure of 4-connected triangulations of point sets.
Submitted 7 October, 2013;
originally announced October 2013.
-
Component Coloring of Proper Interval Graphs and Split Graphs
Authors:
Ajit Diwan,
Soumitra Pal,
Abhiram Ranade
Abstract:
We introduce a generalization of the well known graph (vertex) coloring problem, which we call the problem of \emph{component coloring of graphs}. Given a graph, the problem is to color the vertices using minimum number of colors so that the size of each connected component of the subgraph induced by the vertices of the same color does not exceed $C$. We give a linear time algorithm for the problem on proper interval graphs. We extend this algorithm to solve two weighted versions of the problem in which vertices have integer weights. In the \emph{splittable} version the weights of vertices can be split into differently colored parts, however, the total weight of a monochromatic component cannot exceed $C$. For this problem on proper interval graphs we give a polynomial time algorithm. In the \emph{non-splittable} version the vertices cannot be split. Using the algorithm for the splittable version we give a 2-approximation algorithm for the non-splittable problem on proper interval graphs which is NP-hard. We also prove that even the unweighted version of the problem is NP-hard for split graphs.
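The constraint in the unweighted problem can be stated as a short validity check. The sketch below is an illustration, not the authors' algorithm: the function names are hypothetical, and the graph is assumed to be given as an adjacency dictionary. It measures the largest connected component induced by any single color and compares it with $C$; it does not attempt to minimize the number of colors.

```python
from collections import deque


def max_monochromatic_component(adj, color):
    """Return the size of the largest connected monochromatic component.

    adj maps each vertex to an iterable of its neighbors; color maps each
    vertex to its color. The search only follows edges between vertices of
    the same color.
    """
    seen = set()
    largest = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen and color[v] == color[start]:
                    seen.add(v)
                    queue.append(v)
        largest = max(largest, size)
    return largest


def is_component_coloring(adj, color, C):
    """Check that no monochromatic component has more than C vertices."""
    return max_monochromatic_component(adj, color) <= C


if __name__ == "__main__":
    # Path 0 - 1 - 2 - 3 colored in two blocks of two vertices each.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    blocks = {0: "red", 1: "red", 2: "blue", 3: "blue"}
    print(is_component_coloring(path, blocks, 2))
    print(is_component_coloring(path, blocks, 1))
```

With $C = 1$ the check reduces to the usual proper vertex coloring condition, which is the sense in which the problem generalizes graph coloring.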
Submitted 3 November, 2012; v1 submitted 16 January, 2012;
originally announced January 2012.
-
On joint triangulations of two sets of points in the plane
Authors:
Ajit Arvind Diwan,
Subir Kumar Ghosh,
Partha Pratim Goswami,
Andrzej Lingas
Abstract:
In this paper, we establish two necessary conditions for a joint triangulation of two sets of $n$ points in the plane and conjecture that they are sufficient. We show that these necessary conditions can be tested in $O(n^3)$ time. For the problem of a joint triangulation of two simple polygons of $n$ vertices, we propose an $O(n^3)$ time algorithm for constructing a joint triangulation using dynamic programming.
Submitted 7 February, 2011;
originally announced February 2011.
-
Generalized Collective Inference with Symmetric Clique Potentials
Authors:
Rahul Gupta,
Sunita Sarawagi,
Ajit A. Diwan
Abstract:
Collective graphical models exploit inter-instance associative dependence to output more accurate labelings. However, existing models support only very limited kinds of associativity, which restricts accuracy gains. This paper makes two major contributions. First, we propose a general collective inference framework that biases data instances to agree on a set of {\em properties} of their labelings. Agreement is encouraged through symmetric clique potentials. We show that rich properties lead to bigger gains, and present a systematic inference procedure for a large class of such properties. The procedure performs message passing on the cluster graph, where property-aware messages are computed with cluster-specific algorithms. This provides an inference-only solution for domain adaptation. Our experiments on bibliographic information extraction illustrate significant test error reduction over unseen domains. Our second major contribution consists of algorithms for computing outgoing messages from clique clusters with symmetric clique potentials. Our algorithms are exact for arbitrary symmetric potentials on binary labels and for max-like and majority-like potentials on multiple labels. For majority potentials, we also provide an efficient Lagrangian Relaxation based algorithm that compares favorably with the exact algorithm. We present a 13/15-approximation algorithm for the NP-hard Potts potential, with runtime sub-quadratic in the clique size. In contrast, the best known previous guarantee for graphs with Potts potentials is only 1/2. We empirically show that our method for Potts potentials is an order of magnitude faster than the best alternatives, and our Lagrangian Relaxation based algorithm for majority potentials beats the best applicable heuristic -- ICM.
Submitted 7 July, 2009; v1 submitted 3 July, 2009;
originally announced July 2009.
-
Circumference, Chromatic Number and Online Coloring
Authors:
Ajit A. Diwan,
Sreyash Kenkre,
Sundar Vishwanathan
Abstract:
Erdős conjectured that if $G$ is a triangle-free graph of chromatic number at least $k\geq 3$, then it contains an odd cycle of length at least $k^{2-o(1)}$ \cite{sudakovverstraete, verstraete}. Nothing better than a linear bound (\cite{gyarfas}, Problem 5.1.55 in \cite{West}) was so far known. We make progress on this conjecture by showing that $G$ contains an odd cycle of length $\Omega(k\log\log k)$. Erdős' conjecture is known to hold for graphs with girth at least 5. We show that if a girth 4 graph is $C_5$-free, then Erdős' conjecture holds. When the number of vertices is not too large we can prove better bounds on $\chi$. We also give bounds on the chromatic number of graphs with at most $r$ cycles of length $1\bmod k$, or at most $s$ cycles of length $2\bmod k$, or no cycles of length $3\bmod k$. Our techniques essentially consist of using a depth-first search tree to decompose the graph into ordered paths, which are then fed to an online coloring algorithm. Using this technique we give simple proofs of some old results, and also obtain several simpler results. We also obtain a lower bound on the number of colors an online coloring algorithm needs to use on triangle-free graphs.
Submitted 10 September, 2008;
originally announced September 2008.
-
Reducing Order Enforcement Cost in Complex Query Plans
Authors:
Ravindra Guravannavar,
S Sudarshan,
Ajit A Diwan,
Ch. Sobhan Babu
Abstract:
Algorithms that exploit sort orders are widely used to implement joins, grouping, duplicate elimination and other set operations. Query optimizers traditionally deal with sort orders by using the notion of interesting orders. The number of interesting orders is unfortunately factorial in the number of participating attributes. Optimizer implementations use heuristics to prune the number of interesting orders, but the quality of the heuristics is unclear. Increasingly complex decision support queries and increasing use of covering indices, which provide multiple alternative sort orders for relations, motivate us to better address the problem of optimization with interesting orders.
We show that even a simplified version of optimization with sort orders is NP-hard and provide principled heuristics for choosing interesting orders. We have implemented the proposed techniques in a Volcano-style cost-based optimizer, and our performance study shows significant improvements in estimated cost. We also executed our plans on a widely used commercial database system, and on PostgreSQL, and found that actual execution times for our plans were significantly better than for plans generated by those systems in several cases.
Submitted 20 November, 2006;
originally announced November 2006.