-
Position Heaps for Parameterized Strings
Authors:
Diptarama,
Takashi Katsura,
Yuhei Otomo,
Kazuyuki Narisawa,
Ayumi Shinohara
Abstract:
We propose a new indexing structure for parameterized strings, called the parameterized position heap. The parameterized position heap is applicable to the parameterized pattern matching problem, where the pattern matches a substring of the text if there exists a bijective mapping from the symbols of the pattern to the symbols of the substring. We propose an online construction algorithm for the parameterized position heap of a text and show that our algorithm runs in linear time with respect to the text size. We also show that, by using the parameterized position heap, we can find all occurrences of a pattern in the text in time linear in the product of the pattern size and the alphabet size.
Submitted 17 April, 2017; v1 submitted 8 February, 2017;
originally announced February 2017.
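Illustration (not from the paper): the matching condition in the abstract above can be checked naively with the standard "previous occurrence" encoding, under the assumption that every symbol is a parameter symbol. Two strings of equal length then match exactly when their encodings are equal. The function names prev_encode, p_match and p_occurrences are hypothetical, and this quadratic baseline is not the parameterized position heap itself.

```python
def prev_encode(s):
    """Replace each symbol by the distance to its previous occurrence (0 if none)."""
    last = {}
    out = []
    for i, ch in enumerate(s):
        out.append(i - last[ch] if ch in last else 0)
        last[ch] = i
    return out


def p_match(x, y):
    """True if some bijection on symbols maps x onto y (all symbols are parameters)."""
    return len(x) == len(y) and prev_encode(x) == prev_encode(y)


def p_occurrences(text, pattern):
    """Naive O(|text| * |pattern|) scan for all parameterized occurrences."""
    m = len(pattern)
    return [i for i in range(len(text) - m + 1) if p_match(text[i:i + m], pattern)]
```

For example, p_occurrences("aabba", "xxy") reports positions 0 and 2, since both "aab" and "bba" can be obtained from "xxy" by renaming symbols one-to-one.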
-
Longest Common Subsequence in at Least $k$ Length Order-Isomorphic Substrings
Authors:
Yohei Ueki,
Diptarama,
Masatoshi Kurihara,
Yoshiaki Matsuoka,
Kazuyuki Narisawa,
Ryo Yoshinaka,
Hideo Bannai,
Shunsuke Inenaga,
Ayumi Shinohara
Abstract:
We consider the longest common subsequence (LCS) problem with the restriction that the common subsequence is required to consist of substrings of length at least $k$. First, we show an $O(mn)$ time algorithm for the problem, which gives a better worst-case running time than existing algorithms, where $m$ and $n$ are the lengths of the input strings. We then focus on the LCS in at least $k$ length order-isomorphic substrings problem, and show that it can also be solved in $O(mn)$ worst-case time by an easy-to-implement algorithm.
Submitted 6 February, 2017; v1 submitted 12 September, 2016;
originally announced September 2016.
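Illustration (not from the paper): one way to reach $O(mn)$ time for the first problem in the abstract above is a dynamic program over three tables. It is sketched here under the assumptions that $k \ge 1$ and that only the length of the answer is wanted. The function name lcs_at_least_k and the table names are hypothetical, and the recurrence may differ in detail from the authors' algorithm.

```python
def lcs_at_least_k(x, y, k):
    """Length of the longest common subsequence of x and y that splits into
    common substrings (blocks) of length at least k. Assumes k >= 1."""
    m, n = len(x), len(y)
    # suf[i][j]:  length of the longest common suffix of x[:i] and y[:j]
    # ext[i][j]:  best value whose last block ends exactly at (i, j), or -1
    # best[i][j]: best value for the prefixes x[:i] and y[:j]
    suf = [[0] * (n + 1) for _ in range(m + 1)]
    ext = [[-1] * (n + 1) for _ in range(m + 1)]
    best = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                suf[i][j] = suf[i - 1][j - 1] + 1
            if suf[i][j] >= k:
                # Start a fresh block of exactly k characters ending here ...
                ext[i][j] = best[i - k][j - k] + k
                if suf[i][j] > k:
                    # ... or lengthen a block that ended one step earlier.
                    ext[i][j] = max(ext[i][j], ext[i - 1][j - 1] + 1)
            best[i][j] = max(best[i - 1][j], best[i][j - 1], ext[i][j])
    return best[m][n]
```

With k = 1 this reduces to the ordinary LCS length. With k = 2, the strings "abcdef" and "abxxef" give 4, from the blocks "ab" and "ef".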
-
Efficient computation of longest single-arm-gapped palindromes in a string
Authors:
Shintaro Narisada,
Diptarama Hendrian,
Kazuyuki Narisawa,
Shunsuke Inenaga,
Ayumi Shinohara
Abstract:
In this paper, we introduce new types of approximate palindromes called single-arm-gapped palindromes (SAGPs for short). An SAGP contains a gap in either its left or right arm, and has the form of either $wguc u^R w^R$ or $wuc u^Rgw^R$, where $w$ and $u$ are non-empty strings, $w^R$ and $u^R$ are respectively the reversed strings of $w$ and $u$, $g$ is a string called a gap, and $c$ is either a single character or the empty string. Here we call $wu$ and $u^R w^R$ the arms of the SAGP, and $|wu|$ the length of the arm. We classify SAGPs into two groups: those which have $ucu^R$ as a maximal palindrome (type-1), and the others (type-2). We propose several algorithms, based on suffix arrays, to compute type-1 SAGPs with longest arms occurring in a given string. Then, we propose a linear-time algorithm, based on suffix trees, to compute all type-1 SAGPs with longest arms. We also show how to compute type-2 SAGPs with longest arms in linear time, and report preliminary experiments on the practical performance of the proposed methods.
Submitted 31 October, 2019; v1 submitted 10 September, 2016;
originally announced September 2016.
-
Detecting regularities on grammar-compressed strings
Authors:
Tomohiro I,
Wataru Matsubara,
Kouji Shimohira,
Shunsuke Inenaga,
Hideo Bannai,
Masayuki Takeda,
Kazuyuki Narisawa,
Ayumi Shinohara
Abstract:
We solve the problems of detecting and counting various forms of regularities in a string represented as a Straight Line Program (SLP). Given an SLP of size $n$ that represents a string $s$ of length $N$, our algorithm computes all runs and squares in $s$ in $O(n^3h)$ time and $O(n^2)$ space, where $h$ is the height of the derivation tree of the SLP. We also show an algorithm to compute all gapped-palindromes in $O(n^3h + gnh\log N)$ time and $O(n^2)$ space, where $g$ is the length of the gap. The key technique of the above solution also allows us to compute the periods and covers of the string in $O(n^2 h)$ time and $O(nh(n+\log^2 N))$ time, respectively.
Submitted 26 April, 2013;
originally announced April 2013.