Showing 1–25 of 25 results for author: Shinohara, A

  1. Algorithms for Galois Words: Detection, Factorization, and Rotation

    Authors: Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: Lyndon words are extensively studied in combinatorics on words -- they play a crucial role in upper bounding the number of runs a word can have [Bannai+, SIAM J. Comput.'17]. We can determine Lyndon words, factorize a word into Lyndon words in lexicographically non-increasing order, and find the Lyndon rotation of a word, all in linear time within constant additional working space. A recent resear…

    Submitted 23 April, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: 16 pages, 3 figures, accepted to CPM 2024

  2. arXiv:2308.05977  [pdf, other]

    cs.DS

    Breaking a Barrier in Constructing Compact Indexes for Parameterized Pattern Matching

    Authors: Kento Iseri, Tomohiro I, Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: A parameterized string (p-string) is a string over an alphabet $(Σ_{s} \cup Σ_{p})$, where $Σ_{s}$ and $Σ_{p}$ are disjoint alphabets for static symbols (s-symbols) and for parameter symbols (p-symbols), respectively. Two p-strings $x$ and $y$ are said to parameterized match (p-match) if and only if $x$ can be transformed into $y$ by applying a bijection on $Σ_{p}$ to every occurrence of p-symbols…

    Submitted 11 August, 2023; originally announced August 2023.

  3. arXiv:2306.10714  [pdf, ps, other]

    cs.DS

    Efficient Parameterized Pattern Matching in Sublinear Space

    Authors: Haruki Ideguchi, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: The parameterized matching problem is a variant of string matching, which is to search for all parameterized occurrences of a pattern $P$ in a text $T$. In considering matching algorithms, the combinatorial natures of strings, especially periodicity, play an important role. In this paper, we analyze the properties of periods of parameterized strings and propose a generalization of Galil and Seifer…

    Submitted 19 June, 2023; originally announced June 2023.

  4. arXiv:2209.12405  [pdf, ps, other]

    cs.DS

    Inferring Strings from Position Heaps in Linear Time

    Authors: Koshiro Kumagai, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: Position heaps are index structures of text strings used for the string matching problem. They are rooted trees whose edges and nodes are labeled and numbered, respectively. This paper is concerned with variants of the inverse problem of position heap construction and gives linear-time algorithms for those problems. The basic problem is to restore a text string from a rooted tree with labeled edge…

    Submitted 12 December, 2022; v1 submitted 26 September, 2022; originally announced September 2022.

    Comments: 10 pages, 5 figures

  5. arXiv:2206.15100  [pdf, ps, other]

    cs.DS

    Computing the Parameterized Burrows--Wheeler Transform Online

    Authors: Daiki Hashimoto, Diptarama Hendrian, Dominik Köppl, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: Parameterized strings are a generalization of strings in that their characters are drawn from two different alphabets, where one is considered to be the alphabet of static characters and the other to be the alphabet of parameter characters. Two parameterized strings are a parameterized match if there is a bijection over all characters such that the bijection transforms one string to the other whil…

    Submitted 30 August, 2022; v1 submitted 30 June, 2022; originally announced June 2022.

    Comments: 13 pages, accepted to SPIRE 2022

  6. arXiv:2202.13284  [pdf, other]

    cs.DS

    Parallel algorithm for pattern matching problems under substring consistent equivalence relations

    Authors: Davaajav Jargalsaikhan, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: Given a text and a pattern over an alphabet, the pattern matching problem searches for all occurrences of the pattern in the text. An equivalence relation $\approx$ is called a substring consistent equivalence relation (SCER), if for two strings $X$ and $Y$, $X \approx Y$ implies $|X| = |Y|$ and $X[i:j] \approx Y[i:j]$ for all $1 \le i \le j \le |X|$. In this paper, we propose an efficient paralle…

    Submitted 27 July, 2022; v1 submitted 26 February, 2022; originally announced February 2022.

  7. In-Place Bijective Burrows-Wheeler Transforms

    Authors: Dominik Köppl, Daiki Hashimoto, Diptarama Hendrian, Ayumi Shinohara

    Abstract: One of the most well-known variants of the Burrows-Wheeler transform (BWT) [Burrows and Wheeler, 1994] is the bijective BWT (BBWT) [Gil and Scott, arXiv 2012], which applies the extended BWT (EBWT) [Mantaci et al., TCS 2007] to the multiset of Lyndon factors of a given text. Since the EBWT is invertible, the BBWT is a bijective transform in the sense that the inverse image of the EBWT restores thi…

    Submitted 27 April, 2020; originally announced April 2020.

    Comments: In proceedings of CPM 2020

  8. arXiv:2003.08097  [pdf, other]

    cs.DS

    Grammar compression with probabilistic context-free grammar

    Authors: Hiroaki Naganuma, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara, Naoki Kobayashi

    Abstract: We propose a new approach for universal lossless text compression, based on grammar compression. In the literature, a target string $T$ has been compressed as a context-free grammar $G$ in Chomsky normal form satisfying $L(G) = \{T\}$. Such a grammar is often called a \emph{straight-line program} (SLP). In this paper, we consider a probabilistic grammar $G$ that generates $T$, but not necessarily…

    Submitted 18 March, 2020; originally announced March 2020.

    Comments: 11 pages, 3 figures, accepted for poster presentation at DCC 2020

  9. arXiv:2002.08004  [pdf, ps, other]

    cs.DS

    Fast and linear-time string matching algorithms based on the distances of $q$-gram occurrences

    Authors: Satoshi Kobayashi, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: Given a text $T$ of length $n$ and a pattern $P$ of length $m$, the string matching problem is a task to find all occurrences of $P$ in $T$. In this study, we propose an algorithm that solves this problem in $O((n + m)q)$ time considering the distance between two adjacent occurrences of the same $q$-gram contained in $P$. We also propose a theoretical improvement of it which runs in $O(n + m)$ tim…

    Submitted 12 April, 2020; v1 submitted 18 February, 2020; originally announced February 2020.

    Comments: 14 pages, accepted to SEA 2020

  10. arXiv:2002.06796  [pdf, other]

    cs.DS

    Detecting $k$-(Sub-)Cadences and Equidistant Subsequence Occurrences

    Authors: Mitsuru Funakoshi, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Ayumi Shinohara

    Abstract: The equidistant subsequence pattern matching problem is considered. Given a pattern string $P$ and a text string $T$, we say that $P$ is an \emph{equidistant subsequence} of $T$ if $P$ is a subsequence of the text such that consecutive symbols of $P$ in the occurrence are equally spaced. We can consider the problem of equidistant subsequences as generalizations of (sub-)cadences. We give bit-paral…

    Submitted 17 February, 2020; originally announced February 2020.

  11. Parameterized DAWGs: efficient constructions and bidirectional pattern searches

    Authors: Katsuhito Nakashima, Noriki Fujisato, Diptarama Hendrian, Yuto Nakashima, Ryo Yoshinaka, Shunsuke Inenaga, Hideo Bannai, Ayumi Shinohara, Masayuki Takeda

    Abstract: Two strings $x$ and $y$ over $Σ\cup Π$ of equal length are said to \emph{parameterized match} (\emph{p-match}) if there is a renaming bijection $f:Σ\cup Π\rightarrow Σ\cup Π$ that is identity on $Σ$ and transforms $x$ to $y$ (or vice versa). The \emph{p-matching} problem is to look for substrings in a text that p-match a given pattern. In this paper, we propose \emph{parameterized suffix automata}…

    Submitted 16 September, 2022; v1 submitted 17 February, 2020; originally announced February 2020.

    Comments: 28 pages, 7 figures

    Journal ref: Theoretical Computer Science (2022)

  12. arXiv:2002.06764  [pdf, ps, other]

    cs.DS

    Computing Covers under Substring Consistent Equivalence Relations

    Authors: Natsumi Kikuchi, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: Covers are a kind of quasiperiodicity in strings. A string $C$ is a cover of another string $T$ if any position of $T$ is inside some occurrence of $C$ in $T$. The shortest and longest cover arrays of $T$ have the lengths of the shortest and longest covers of each prefix of $T$, respectively. The literature has proposed linear-time algorithms computing longest and shortest cover arrays taking bord…

    Submitted 30 July, 2020; v1 submitted 16 February, 2020; originally announced February 2020.

    Comments: 16 pages

  13. Query Learning Algorithm for Residual Symbolic Finite Automata

    Authors: Kaizaburo Chubachi, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: We propose a query learning algorithm for residual symbolic finite automata (RSFAs). Symbolic finite automata (SFAs) are finite automata whose transitions are labeled by predicates over a Boolean algebra, in which a big collection of characters leading the same transition may be represented by a single predicate. Residual finite automata (RFAs) are a special type of non-deterministic finite automa…

    Submitted 17 September, 2019; v1 submitted 20 February, 2019; originally announced February 2019.

    Comments: In Proceedings GandALF 2019, arXiv:1909.05979

    Journal ref: EPTCS 305, 2019, pp. 140-153

  14. arXiv:1902.00216  [pdf, other]

    cs.DS

    An Extension of Linear-size Suffix Tries for Parameterized Strings

    Authors: Katsuhito Nakashima, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: In this paper, we propose a new indexing structure for parameterized strings which we call PLSTs, by generalizing linear-size suffix tries for ordinary strings. Two parameterized strings are said to match if there is a bijection on the symbol set that makes the two coincide. PLSTs are applicable to the parameterized pattern matching problem, which is to decide whether the input parameterized text…

    Submitted 4 September, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: 13 pages, 6 figures

  15. arXiv:1807.11580  [pdf, ps, other]

    cs.FL cs.DS

    Enumerating Cryptarithms Using Deterministic Finite Automata

    Authors: Yuki Nozaki, Diptarama Hendrian, Ryo Yoshinaka, Takashi Horiyama, Ayumi Shinohara

    Abstract: A cryptarithm is a mathematical puzzle where given an arithmetic equation written with letters rather than numerals, a player must discover an assignment of numerals on letters that makes the equation hold true. In this paper, we propose a method to construct a DFA that accepts cryptarithms that admit (unique) solutions for each base. We implemented the method and constructed a DFA for bases…

    Submitted 26 July, 2018; originally announced July 2018.

  16. arXiv:1806.09806  [pdf, other]

    cs.DS

    Linear-Time Online Algorithm Inferring the Shortest Path from a Walk

    Authors: Shintaro Narisada, Diptarama Hendrian, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: We consider the problem of inferring an edge-labeled graph from the sequence of edge labels seen in a walk of that graph. It has been known that this problem is solvable in $O(n \log n)$ time when the targets are path or cycle graphs. This paper presents an online algorithm for the problem of this restricted case that runs in $O(n)$ time, based on Manacher's algorithm for computing all the maximal…

    Submitted 20 February, 2019; v1 submitted 26 June, 2018; originally announced June 2018.

    Comments: 31 pages, 7 figures, extended version of the proceeding paper in SPIRE 2018

  17. Towards an energy measurement of the internal conversion electron in the de-excitation of the Th-229 isomer

    Authors: Simon Stellmer, Yudai Shigekawa, Veronika Rosecker, Georgy A. Kazakov, Yoshitaka Kasamatsu, Yuki Yasuda, Atsushi Shinohara, Thorsten Schumm

    Abstract: The first excited isomeric state of Th-229 has an exceptionally low energy of only a few eV and could form the gateway to high-precision laser spectroscopy of nuclei. The excitation energy of the isomeric state has been inferred from precision gamma spectroscopy, but its uncertainty is still too large to commence laser spectroscopy. Reducing this uncertainty is one of the most pressing challenges…

    Submitted 13 May, 2018; originally announced May 2018.

    Comments: 11 pages, 8 figures

    Journal ref: Phys. Rev. C 98, 014317 (2018)

  18. Efficient Dynamic Dictionary Matching with DAWGs and AC-automata

    Authors: Diptarama Hendrian, Shunsuke Inenaga, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: The dictionary matching is a task to find all occurrences of patterns in a set $D$ (called a dictionary) on a text $T$. The Aho-Corasick-automaton (AC-automaton) is a data structure which enables us to solve the dictionary matching problem in $O(d\logσ)$ preprocessing time and $O(n\logσ+occ)$ matching time, where $d$ is the total length of the patterns in $D$, $n$ is the length of the text, $σ$ is…

    Submitted 20 February, 2019; v1 submitted 9 October, 2017; originally announced October 2017.

    Comments: 20 pages, 4 figures

  19. arXiv:1705.09504  [pdf, other]

    cs.DS

    New Variants of Pattern Matching with Constants and Variables

    Authors: Yuki Igarashi, Diptarama, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: Given a text and a pattern over two types of symbols called constants and variables, the parameterized pattern matching problem is to find all occurrences of substrings of the text that the pattern matches by substituting a variable in the text for each variable in the pattern, where the substitution should be injective. The function matching problem is a variant of it that lifts the injection con…

    Submitted 26 May, 2017; originally announced May 2017.

    Comments: 15 pages, 2 figures

  20. arXiv:1705.09438  [pdf, ps, other]

    cs.DS

    Duel and sweep algorithm for order-preserving pattern matching

    Authors: Davaajav Jargalsaikhan, Diptarama, Ryo Yoshinaka, Ayumi Shinohara

    Abstract: Given a text $T$ and a pattern $P$ over alphabet $Σ$, the classic exact matching problem searches for all occurrences of pattern $P$ in text $T$. Unlike the exact matching problem, order-preserving pattern matching (OPPM) considers the relative order of elements, rather than their real values. In this paper, we propose an efficient algorithm for the OPPM problem using the "duel-and-sweep" paradigm. Our al…

    Submitted 26 May, 2017; originally announced May 2017.

    Comments: 13 pages, 5 figures

  21. arXiv:1702.02321  [pdf, other]

    cs.DS

    Position Heaps for Parameterized Strings

    Authors: Diptarama, Takashi Katsura, Yuhei Otomo, Kazuyuki Narisawa, Ayumi Shinohara

    Abstract: We propose a new indexing structure for parameterized strings, called parameterized position heap. Parameterized position heap is applicable for parameterized pattern matching problem, where the pattern matches a substring of the text if there exists a bijective mapping from the symbols of the pattern to the symbols of the substring. We propose an online construction algorithm of parameterized pos…

    Submitted 17 April, 2017; v1 submitted 8 February, 2017; originally announced February 2017.

    Comments: 14 pages, 4 figures, accepted to CPM 2017

    ACM Class: F.2.2

  22. arXiv:1609.03668  [pdf, other]

    cs.DS

    Longest Common Subsequence in at Least $k$ Length Order-Isomorphic Substrings

    Authors: Yohei Ueki, Diptarama, Masatoshi Kurihara, Yoshiaki Matsuoka, Kazuyuki Narisawa, Ryo Yoshinaka, Hideo Bannai, Shunsuke Inenaga, Ayumi Shinohara

    Abstract: We consider the longest common subsequence (LCS) problem with the restriction that the common subsequence is required to consist of at least $k$ length substrings. First, we show an $O(mn)$ time algorithm for the problem which gives a better worst-case running time than existing algorithms, where $m$ and $n$ are lengths of the input strings. Furthermore, we mainly consider the LCS in at least $k$…

    Submitted 6 February, 2017; v1 submitted 12 September, 2016; originally announced September 2016.

    Comments: 14 pages, 7 figures, contains erratum to Springer's version (SOFSEM 2017)

  23. Efficient computation of longest single-arm-gapped palindromes in a string

    Authors: Shintaro Narisada, Diptarama Hendrian, Kazuyuki Narisawa, Shunsuke Inenaga, Ayumi Shinohara

    Abstract: In this paper, we introduce new types of approximate palindromes called single-arm-gapped palindromes (shortly SAGPs). A SAGP contains a gap in either its left or right arm, which is in the form of either $wguc u^R w^R$ or $wuc u^Rgw^R$, where $w$ and $u$ are non-empty strings, $w^R$ and $u^R$ are respectively the reversed strings of $w$ and $u$, $g$ is a string called a gap, and $c$ is either a s…

    Submitted 31 October, 2019; v1 submitted 10 September, 2016; originally announced September 2016.

    Comments: 19 pages, 11 figures

    Journal ref: Theoretical Computer Science, 2019

  24. arXiv:1304.7067  [pdf, ps, other]

    cs.DS

    Detecting regularities on grammar-compressed strings

    Authors: Tomohiro I, Wataru Matsubara, Kouji Shimohira, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda, Kazuyuki Narisawa, Ayumi Shinohara

    Abstract: We solve the problems of detecting and counting various forms of regularities in a string represented as a Straight Line Program (SLP). Given an SLP of size $n$ that represents a string $s$ of length $N$, our algorithm computes all runs and squares in $s$ in $O(n^3h)$ time and $O(n^2)$ space, where $h$ is the height of the derivation tree of the SLP. We also show an algorithm to compute all gapped-…

    Submitted 26 April, 2013; originally announced April 2013.

  25. arXiv:0804.1214  [pdf, ps, other]

    cs.DM

    New Lower Bounds for the Maximum Number of Runs in a String

    Authors: Kazuhiko Kusano, Wataru Matsubara, Akira Ishino, Hideo Bannai, Ayumi Shinohara

    Abstract: We show a new lower bound for the maximum number of runs in a string. We prove that for any $e > 0$, $(a - e)n$ is an asymptotic lower bound, where $a = 56733/60064 = 0.944542$. It is superior to the previous bound 0.927 given by Franek et al. Moreover, our construction of the strings and the proof is much simpler than theirs.

    Submitted 8 April, 2008; originally announced April 2008.

    ACM Class: G.2.1