Showing 1–30 of 30 results for author: Iliopoulos, C

Searching in archive cs.
  1. arXiv:2007.13471

    cs.DS

    Internal Quasiperiod Queries

    Authors: Maxime Crochemore, Costas Iliopoulos, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, Wiktor Zuba

    Abstract: Internal pattern matching requires one to answer queries about factors of a given string. Many results are known on answering internal period queries, asking for the periods of a given factor. In this paper we investigate (for the first time) internal queries asking for covers (also known as quasiperiods) of a given factor. We propose a data structure that answers such queries in…

    Submitted 27 July, 2020; originally announced July 2020.

    Comments: To appear in the SPIRE 2020 proceedings

  2. arXiv:1908.01664

    cs.DS

    On the cyclic regularities of strings

    Authors: Oluwole Ajala, Miznah Alshammary, Mai Alzamel, Jia Gao, Costas Iliopoulos, Jakub Radoszewski, Wojciech Rytter, Bruce Watson

    Abstract: Regularities in strings are often related to periods and covers, which have extensively been studied, and algorithms for their efficient computation have broad application. In this paper we concentrate on computing cyclic regularities of strings, in particular, we propose several efficient algorithms for computing: (i) cyclic periodicity; (ii) all cyclic periodicity; (iii) maximal local cyclic per…

    Submitted 5 August, 2019; originally announced August 2019.

  3. arXiv:1901.11305

    cs.DS

    Quasi-Linear-Time Algorithm for Longest Common Circular Factor

    Authors: Mai Alzamel, Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, Juliusz Straszyński, Tomasz Waleń, Wiktor Zuba

    Abstract: We introduce the Longest Common Circular Factor (LCCF) problem in which, given strings $S$ and $T$ of length $n$, we are to compute the longest factor of $S$ whose cyclic shift occurs as a factor of $T$. It is a new similarity measure, an extension of the classic Longest Common Factor. We show how to solve the LCCF problem in $O(n \log^5 n)$ time.

    Submitted 31 January, 2019; originally announced January 2019.

    ACM Class: F.2.2
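
As an illustration of the problem statement only, the LCCF of two short strings can be found by brute force. The sketch below is not the paper's $O(n \log^5 n)$ algorithm: it simply compares canonical (lexicographically smallest) rotations of all factors, and the names `canonical_rotation` and `longest_common_circular_factor` are hypothetical.

```python
def canonical_rotation(u: str) -> str:
    """Return the lexicographically smallest rotation of u.

    Two strings are cyclic shifts of each other exactly when
    their canonical rotations are equal.
    """
    return min(u[i:] + u[:i] for i in range(len(u)))


def longest_common_circular_factor(s: str, t: str) -> str:
    """Brute-force LCCF: longest factor of s with a cyclic shift occurring in t."""
    for length in range(min(len(s), len(t)), 0, -1):
        # Canonical rotations of every length-`length` factor of t.
        t_forms = {
            canonical_rotation(t[j:j + length])
            for j in range(len(t) - length + 1)
        }
        for i in range(len(s) - length + 1):
            factor = s[i:i + length]
            if canonical_rotation(factor) in t_forms:
                return factor
    return ""
```

For example, with `s = "abcd"` and `t = "xcdby"` the function returns `"bcd"`, because its cyclic shift `"cdb"` is a factor of `t` and no length-4 factor qualifies.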

  4. arXiv:1810.02099

    cs.DS

    Longest Property-Preserved Common Factor

    Authors: Lorraine A. K. Ayad, Giulia Bernardini, Roberto Grossi, Costas S. Iliopoulos, Nadia Pisanti, Solon P. Pissis, Giovanna Rosone

    Abstract: In this paper we introduce a new family of string processing problems. We are given two or more strings and we are asked to compute a factor common to all strings that preserves a specific property and has maximal length. Here we consider three fundamental string properties: square-free factors, periodic factors, and palindromic factors under three different settings, one per property. In the firs…

    Submitted 4 October, 2018; originally announced October 2018.

    Comments: Extended version of SPIRE 2018 paper

  5. arXiv:1807.11702

    cs.DS

    Efficient Computation of Sequence Mappability

    Authors: Panagiotis Charalampopoulos, Costas S. Iliopoulos, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski, Juliusz Straszyński

    Abstract: In the $(k,m)$-mappability problem, for a given sequence $T$ of length $n$, the goal is to compute a table whose $i$th entry is the number of indices $j \ne i$ such that the length-$m$ substrings of $T$ starting at positions $i$ and $j$ have at most $k$ mismatches. Previous works on this problem focused on heuristics computing a rough approximation of the result or on the case of $k=1$. We present…

    Submitted 16 June, 2021; v1 submitted 31 July, 2018; originally announced July 2018.

    Comments: Accepted to SPIRE 2018

    ACM Class: F.2.2
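
The definition of the $(k,m)$-mappability table can be checked directly on small inputs. The following is a minimal quadratic-time sketch of the definition, not one of the paper's algorithms; the function name `mappability` and the 0-based indexing are assumptions made for illustration.

```python
def mappability(t: str, k: int, m: int) -> list:
    """Naive (k, m)-mappability table of t, using 0-based positions.

    Entry i counts the positions j != i such that the length-m
    substrings starting at i and j differ in at most k positions.
    """
    starts = range(len(t) - m + 1)

    def mismatches(i: int, j: int) -> int:
        # Hamming distance between t[i:i+m] and t[j:j+m].
        return sum(t[i + d] != t[j + d] for d in range(m))

    return [
        sum(1 for j in starts if j != i and mismatches(i, j) <= k)
        for i in starts
    ]
```

For `t = "aabaab"` and `m = 3`, the table is `[1, 0, 0, 1]` for `k = 1` (only the two occurrences of `"aab"` are within one mismatch of each other) and `[3, 3, 3, 3]` for `k = 2`.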

  6. arXiv:1802.06369

    cs.DS

    Linear-Time Algorithm for Long LCF with $k$ Mismatches

    Authors: Panagiotis Charalampopoulos, Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń

    Abstract: In the Longest Common Factor with $k$ Mismatches (LCF$_k$) problem, we are given two strings $X$ and $Y$ of total length $n$, and we are asked to find a pair of maximal-length factors, one of $X$ and the other of $Y$, such that their Hamming distance is at most $k$. Thankachan et al. show that this problem can be solved in $\mathcal{O}(n \log^k n)$ time and $\mathcal{O}(n)$ space for constant $k$.…

    Submitted 18 February, 2018; originally announced February 2018.

    Comments: submitted to CPM 2018

  7. arXiv:1801.04425

    cs.DS

    Longest Common Prefixes with $k$-Errors and Applications

    Authors: Lorraine A. K. Ayad, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis

    Abstract: Although real-world text datasets, such as DNA sequences, are far from being uniformly random, average-case string searching algorithms perform significantly better than worst-case ones in most applications of interest. In this paper, we study the problem of computing the longest prefix of each suffix of a given string of length $n$ over a constant-sized alphabet that occurs elsewhere in the strin…

    Submitted 13 January, 2018; originally announced January 2018.

  8. arXiv:1705.04589

    cs.DS

    How to answer a small batch of RMQs or LCA queries in practice

    Authors: Mai Alzamel, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis

    Abstract: In the Range Minimum Query (RMQ) problem, we are given an array $A$ of $n$ numbers and we are asked to answer queries of the following type: for indices $i$ and $j$ between $0$ and $n-1$, query $\text{RMQ}_A(i,j)$ returns the index of a minimum element in the subarray $A[i..j]$. Answering a small batch of RMQs is a core computational task in many real-world applications, in particular due to the c…

    Submitted 12 May, 2017; originally announced May 2017.

    Comments: Accepted to IWOCA 2017
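
To make the query $\text{RMQ}_A(i,j)$ concrete, here is a minimal sketch using the textbook sparse-table technique ($O(n \log n)$ preprocessing, $O(1)$ per query). This is a standard baseline swapped in for illustration and is not the batch method of the paper; the names `build_sparse_table` and `rmq` are hypothetical.

```python
def build_sparse_table(a):
    """table[l][i] = index of a minimum of a[i : i + 2**l]."""
    n = len(a)
    table = [list(range(n))]  # level 0: every window has length 1
    span = 1
    while 2 * span <= n:
        prev = table[-1]
        row = []
        for i in range(n - 2 * span + 1):
            # Combine the two half-windows of length `span`.
            left, right = prev[i], prev[i + span]
            row.append(left if a[left] <= a[right] else right)
        table.append(row)
        span *= 2
    return table


def rmq(a, table, i, j):
    """Index of a minimum element in a[i..j] (inclusive, 0-based, i <= j)."""
    level = (j - i + 1).bit_length() - 1  # floor(log2(window length))
    # Two overlapping windows of length 2**level cover a[i..j].
    left = table[level][i]
    right = table[level][j - (1 << level) + 1]
    return left if a[left] <= a[right] else right
```

A typical use is to build the table once and then answer each query of the batch with `rmq(a, table, i, j)`; for a very small batch, a direct scan per query may well be cheaper than building the table at all.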

  9. arXiv:1705.04022

    cs.DS

    Faster algorithms for 1-mappability of a sequence

    Authors: Mai Alzamel, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Solon P. Pissis, Jakub Radoszewski, Wing-Kin Sung

    Abstract: In the k-mappability problem, we are given a string x of length n and integers m and k, and we are asked to count, for each length-m factor y of x, the number of other factors of length m of x that are at Hamming distance at most k from y. We focus here on the version of the problem where k = 1. The fastest known algorithm for k = 1 requires time O(mn log n/ log log n) and space O(n). We present t…

    Submitted 11 May, 2017; originally announced May 2017.

  10. arXiv:1705.03385

    cs.DS

    Optimal Computation of Overabundant Words

    Authors: Yannis Almirantis, Panagiotis Charalampopoulos, Jia Gao, Costas S. Iliopoulos, Manal Mohamed, Solon P. Pissis, Dimitris Polychronopoulos

    Abstract: The observed frequency of the longest proper prefix, the longest proper suffix, and the longest infix of a word $w$ in a given sequence $x$ can be used for classifying $w$ as avoided or overabundant. The definitions used for the expectation and deviation of $w$ in this statistical model were described and biologically justified by Brendel et al. (J Biomol Struct Dyn 1986). We have very recently in…

    Submitted 9 May, 2017; originally announced May 2017.

  11. arXiv:1703.08931

    cs.DS

    Palindromic Decompositions with Gaps and Errors

    Authors: Michał Adamczyk, Mai Alzamel, Panagiotis Charalampopoulos, Costas S. Iliopoulos, Jakub Radoszewski

    Abstract: Identifying palindromes in sequences has been an interesting line of research in combinatorics on words and also in computational biology, after the discovery of the relation of palindromes in the DNA sequence with the HIV virus. Efficient algorithms for the factorization of sequences into palindromes and maximal palindromes have been devised in recent years. We extend these studies by allowing ga…

    Submitted 27 March, 2017; originally announced March 2017.

    Comments: accepted to CSR 2017

  12. arXiv:1703.00195

    cs.FL cs.DM

    Two strings at Hamming distance 1 cannot be both quasiperiodic

    Authors: Amihood Amir, Costas S. Iliopoulos, Jakub Radoszewski

    Abstract: We present a generalization of a known fact from combinatorics on words related to periodicity into quasiperiodicity. A string is called periodic if it has a period which is at most half of its length. A string $w$ is called quasiperiodic if it has a non-trivial cover, that is, there exists a string $c$ that is shorter than $w$ and such that every position in $w$ is inside one of the occurrences o…

    Submitted 1 March, 2017; originally announced March 2017.

    Comments: 6 pages, 3 figures
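
The definition of a cover translates into a short check. The sketch below is a naive illustration of the definition rather than anything taken from the paper; `is_cover` and `is_quasiperiodic` are hypothetical names. It relies on the observation that any cover must be a prefix of $w$, since position 0 can only be covered by an occurrence starting there.

```python
def is_cover(c: str, w: str) -> bool:
    """True if every position of w lies inside some occurrence of c in w."""
    if not c or len(c) > len(w):
        return False
    covered_up_to = 0  # positions [0, covered_up_to) are covered so far
    start = w.find(c)
    while start != -1:
        if start > covered_up_to:
            return False  # a gap that no later occurrence can fill
        covered_up_to = start + len(c)
        start = w.find(c, start + 1)  # allow overlapping occurrences
    return covered_up_to == len(w)


def is_quasiperiodic(w: str) -> bool:
    """True if w has a cover strictly shorter than w itself."""
    return any(is_cover(w[:k], w) for k in range(1, len(w)))
```

For instance, `"aba"` covers `"ababaaba"` through its occurrences at positions 0, 2 and 5, so that string is quasiperiodic, whereas `"abaab"` has no non-trivial cover.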

  13. arXiv:1610.08111

    cs.DS

    Efficient Pattern Matching in Elastic-Degenerate Strings

    Authors: Costas Iliopoulos, Ritu Kundu, Solon Pissis

    Abstract: In this paper, we extend the notion of gapped strings to elastic-degenerate strings. An elastic-degenerate string can be seen as an ordered collection of k > 1 seeds (substrings/subpatterns) interleaved by elastic-degenerate symbols such that each elastic-degenerate symbol corresponds to a set of two or more variable length strings. Here, we present an algorithm for solving the pattern matching…

    Submitted 25 October, 2016; originally announced October 2016.

    Comments: 11 pages (without references)

    MSC Class: 68W32

  14. arXiv:1606.08275

    cs.DS

    Near-Optimal Computation of Runs over General Alphabet via Non-Crossing LCE Queries

    Authors: Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Ritu Kundu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń

    Abstract: Longest common extension queries (LCE queries) and runs are ubiquitous in algorithmic stringology. Linear-time algorithms computing runs and preprocessing for constant-time LCE queries have been known for over a decade. However, these algorithms assume a linearly-sortable integer alphabet. A recent breakthrough paper by Bannai et al. (SODA 2015) showed a link between the two notions: all the ru…

    Submitted 27 June, 2016; originally announced June 2016.

    ACM Class: F.2.2

  15. arXiv:1604.08760

    cs.DS

    Optimal Computation of Avoided Words

    Authors: Yannis Almirantis, Panagiotis Charalampopoulos, Jia Gao, Costas S. Iliopoulos, Manal Mohamed, Solon P. Pissis, Dimitris Polychronopoulos

    Abstract: The deviation of the observed frequency of a word $w$ from its expected frequency in a given sequence $x$ is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of $w$, denoted by $std(w)$, effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A wor…

    Submitted 29 April, 2016; originally announced April 2016.

  16. arXiv:1506.04559

    cs.DS

    Linear Algorithm for Conservative Degenerate Pattern Matching

    Authors: Maxime Crochemore, Costas S. Iliopoulos, Ritu Kundu, Manal Mohamed, Fatima Vayani

    Abstract: A degenerate symbol x* over an alphabet A is a non-empty subset of A, and a sequence of such symbols is a degenerate string. A degenerate string is said to be conservative if its number of non-solid symbols is upper-bounded by a fixed positive constant k. We consider here the matching problem of conservative degenerate strings and present the first linear-time algorithm that can find, for given de…

    Submitted 15 June, 2015; originally announced June 2015.

  17. arXiv:1505.04019

    cs.DS

    Linear-Time Superbubble Identification Algorithm for Genome Assembly

    Authors: Ljiljana Brankovic, Costas S. Iliopoulos, Ritu Kundu, Manal Mohamed, Solon P. Pissis, Fatima Vayani

    Abstract: DNA sequencing is the process of determining the exact order of the nucleotide bases of an individual's genome in order to catalogue sequence variation and understand its biological implications. Whole-genome sequencing techniques produce masses of data in the form of short sequences known as reads. Assembling these reads into a whole genome constitutes a major algorithmic challenge. Most assembly…

    Submitted 17 September, 2015; v1 submitted 15 May, 2015; originally announced May 2015.

  18. arXiv:1503.00049

    cs.DS

    Algorithms for Longest Common Abelian Factors

    Authors: Ali Alatabbi, Costas S. Iliopoulos, Alessio Langiu, M. Sohel Rahman

    Abstract: In this paper we consider the problem of computing the longest common abelian factor (LCAF) between two given strings. We present a simple $O(σ n^2)$ time algorithm, where $n$ is the length of the strings and $σ$ is the alphabet size, and a sub-quadratic running time solution for the binary string case, both having linear space requirement. Furthermore, we present a modified algorithm applying so…

    Submitted 27 February, 2015; originally announced March 2015.

    Comments: 13 pages, 4 figures
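
For orientation, the LCAF problem can be stated as a brute-force search. The sketch below assumes the usual meaning of a common abelian factor (a factor of each string, the two being equal up to a permutation of their letters), which the abstract does not spell out. It is not the paper's $O(σ n^2)$ algorithm, and the name `longest_common_abelian_factor` is hypothetical.

```python
def longest_common_abelian_factor(x: str, y: str):
    """Return a pair (u, v) of longest factors of x and y that are
    permutations of each other, or ("", "") if no letter is shared."""
    for length in range(min(len(x), len(y)), 0, -1):
        # Map the sorted letters of each window of x to a start position.
        seen = {
            tuple(sorted(x[i:i + length])): i
            for i in range(len(x) - length + 1)
        }
        for j in range(len(y) - length + 1):
            key = tuple(sorted(y[j:j + length]))
            if key in seen:
                i = seen[key]
                return x[i:i + length], y[j:j + length]
    return "", ""
```

With `x = "abcd"` and `y = "dxba"` this returns `("ab", "ba")`. Sorting each window keeps the sketch short; a sliding count of letters per window would avoid the repeated sorting.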

  19. arXiv:1412.3696

    cs.DS

    Covering Problems for Partial Words and for Indeterminate Strings

    Authors: Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter, Tomasz Waleń

    Abstract: We consider the problem of computing a shortest solid cover of an indeterminate string. An indeterminate string may contain non-solid symbols, each of which specifies a subset of the alphabet that could be present at the corresponding position. We also consider covering partial words, which are a special case of indeterminate strings where each non-solid symbol is a don't care symbol. We prove tha…

    Submitted 11 December, 2014; originally announced December 2014.

    Comments: full version (simplified and corrected); preliminary version appeared at ISAAC 2014; 14 pages, 4 figures

    MSC Class: 68W32 (Primary); 68Q25 (Secondary) ACM Class: F.2.2

  20. arXiv:1406.5480

    cs.DS

    Average-Case Optimal Approximate Circular String Matching

    Authors: Carl Barton, Costas S. Iliopoulos, Solon P. Pissis

    Abstract: Approximate string matching is the problem of finding all factors of a text t of length n that are at a distance at most k from a pattern x of length m. Approximate circular string matching is the problem of finding all factors of t that are at a distance at most k from x or from any of its rotations. In this article, we present a new algorithm for approximate circular string matching under the ed…

    Submitted 25 April, 2016; v1 submitted 20 June, 2014; originally announced June 2014.

  21. arXiv:1312.2381

    cs.DS cs.FL

    A Note on the Longest Common Compatible Prefix Problem for Partial Words

    Authors: Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Alessio Langiu, Jakub Radoszewski, Wojciech Rytter, Bartosz Szreder, Tomasz Waleń

    Abstract: For a partial word $w$ the longest common compatible prefix of two positions $i,j$, denoted $lccp(i,j)$, is the largest $k$ such that $w[i,i+k-1]\uparrow w[j,j+k-1]$, where $\uparrow$ is the compatibility relation of partial words (it is not an equivalence relation). The LCCP problem is to preprocess a partial word in such a way that any query $lccp(i,j)$ about this word can be answered in $O(1)$…

    Submitted 9 December, 2013; originally announced December 2013.
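
A direct, preprocessing-free evaluation of $lccp(i,j)$ may help to fix the definition. The sketch below makes two assumptions that are not in the abstract: the don't-care symbol is written `?`, and positions are 0-based. It answers a query by linear scanning, so it is a baseline for the definition and not the constant-time structure that the paper is about.

```python
HOLE = "?"  # assumed notation for the don't-care symbol


def compatible(a: str, b: str) -> bool:
    """Two symbols are compatible if they are equal or one of them is a hole."""
    return a == b or a == HOLE or b == HOLE


def lccp(w: str, i: int, j: int) -> int:
    """Largest k such that w[i:i+k] and w[j:j+k] are compatible position by position."""
    k = 0
    while i + k < len(w) and j + k < len(w) and compatible(w[i + k], w[j + k]):
        k += 1
    return k
```

On `w = "a?ba?b"`, `lccp(w, 0, 3)` is 3 and `lccp(w, 1, 2)` is 1. Compatibility is not transitive, which is why it is not an equivalence relation: `a` is compatible with `?`, and `?` with `b`, yet `a` and `b` are not compatible.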

  22. arXiv:1309.1981

    cs.DS

    The Swap Matching Problem Revisited

    Authors: Pritom Ahmed, Costas S. Iliopoulos, A. S. M. Sohidull Islam, M. Sohel Rahman

    Abstract: In this paper, we revisit the much studied problem of Pattern Matching with Swaps (Swap Matching problem, for short). We first present a graph-theoretic model, which opens a new and so far unexplored avenue to solve the problem. Then, using the model, we devise two efficient algorithms to solve the swap matching problem. The resulting algorithms are adaptations of the classic shift-and algorithm.…

    Submitted 18 September, 2013; v1 submitted 8 September, 2013; originally announced September 2013.

    Comments: 23 pages, 3 Figures and 17 Tables

  23. arXiv:1305.1744

    cs.DS

    Suffix Tree of Alignment: An Efficient Index for Similar Data

    Authors: Joong Chae Na, Heejin Park, Maxime Crochemore, Jan Holub, Costas S. Iliopoulos, Laurent Mouchard, Kunsoo Park

    Abstract: We consider an index data structure for similar strings. The generalized suffix tree can be a solution for this. The generalized suffix tree of two strings $A$ and $B$ is a compacted trie representing all suffixes in $A$ and $B$. It has $|A|+|B|$ leaves and can be constructed in $O(|A|+|B|)$ time. However, if the two strings are similar, the generalized suffix tree is not efficient because it does…

    Submitted 8 May, 2013; originally announced May 2013.

    Comments: 12 pages

  24. arXiv:1303.6872

    cs.DS

    Order-Preserving Suffix Trees and Their Algorithmic Applications

    Authors: Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Alessio Langiu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Tomasz Walen

    Abstract: Recently Kubica et al. (Inf. Process. Lett., 2013) and Kim et al. (submitted to Theor. Comp. Sci.) introduced order-preserving pattern matching. In this problem we are looking for consecutive substrings of the text that have the same "shape" as a given pattern. These results include a linear-time order-preserving pattern matching algorithm for polynomially-bounded alphabet and an extension of this…

    Submitted 27 March, 2013; originally announced March 2013.

  25. arXiv:1302.4064

    cs.DS

    Order Preserving Matching

    Authors: Jinil Kim, Peter Eades, Rudolf Fleischer, Seok-Hee Hong, Costas S. Iliopoulos, Kunsoo Park, Simon J. Puglisi, Takeshi Tokuyama

    Abstract: We introduce a new string matching problem called order-preserving matching on numeric strings where a pattern matches a text if the text contains a substring whose relative orders coincide with those of the pattern. Order-preserving matching is applicable to many scenarios such as stock price analysis and musical melody matching in which the order relations should be matched instead of the string…

    Submitted 17 February, 2013; originally announced February 2013.

    Comments: 15 pages; submitted to Theoretical Computer Science, 5 Dec 2012; presented at Theo Murphy International Scientific Meeting of the Royal Society on Storage and Indexing of Massive Data, 7 Feb 2013

  26. arXiv:1208.3313

    cs.DS cs.DM

    A Note on Efficient Computation of All Abelian Periods in a String

    Authors: Maxime Crochemore, Costas Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Jakub Pachocki, Jakub Radoszewski, Wojciech Rytter, Wojciech Tyczyński, Tomasz Waleń

    Abstract: We derive a simple efficient algorithm for Abelian periods knowing all Abelian squares in a string. An efficient algorithm for the latter problem was given by Cummings and Smyth in 1997. Along the way, we show an alternative algorithm for Abelian squares. We also obtain a linear time algorithm finding all 'long' Abelian periods. The aim of the paper is a (new) reduction of the problem of all Abelian p…

    Submitted 16 August, 2012; originally announced August 2012.

    ACM Class: F.2.2

  27. arXiv:1207.1307

    cs.DS

    Identifying all abelian periods of a string in quadratic time and relevant problems

    Authors: Michalis Christou, Maxime Crochemore, Costas S. Iliopoulos

    Abstract: Abelian periodicity of strings has been studied extensively over the last years. In 2006 Constantinescu and Ilie defined the abelian period of a string and several algorithms for the computation of all abelian periods of a string were given. In contrast to the classical period of a word, its abelian version is more flexible: factors of the word are considered the same under any internal permutatio…

    Submitted 5 July, 2012; originally announced July 2012.

    Comments: Accepted in the "International Journal of Foundations of Computer Science"

  28. arXiv:1201.6162

    math.CO cs.DS

    Quasiperiodicities in Fibonacci strings

    Authors: Michalis Christou, Maxime Crochemore, Costas Iliopoulos

    Abstract: We consider the problem of finding quasiperiodicities in a Fibonacci string. A factor u of a string y is a cover of y if every letter of y falls within some occurrence of u in y. A string v is a seed of y if it is a cover of a superstring of y. A left seed of a string y is a prefix of y that is a cover of a superstring of y. Similarly, a right seed of a string y is a suffix of y that is a co…

    Submitted 30 January, 2012; originally announced January 2012.

    Comments: In Local Proceedings of "The 38th International Conference on Current Trends in Theory and Practice of Computer Science" (SOFSEM 2012)

  29. arXiv:1104.3153

    cs.DS

    Efficient Seeds Computation Revisited

    Authors: Michalis Christou, Maxime Crochemore, Costas S. Iliopoulos, Marcin Kubica, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, Bartosz Szreder, Tomasz Walen

    Abstract: The notion of the cover is a generalization of a period of a string, and there are linear time algorithms for finding the shortest cover. The seed is a more complicated generalization of periodicity: it is a cover of a superstring of a given string, and the shortest seed problem is of much higher algorithmic difficulty. The problem is not well understood; no linear time algorithm is known. In the…

    Submitted 15 April, 2011; originally announced April 2011.

    Comments: 14 pages, accepted to CPM 2011

  30. arXiv:0907.2157

    cs.DS cs.DM

    On the maximal number of highly periodic runs in a string

    Authors: Maxime Crochemore, Costas Iliopoulos, Marcin Kubica, Jakub Radoszewski, Wojciech Rytter, Tomasz Walen

    Abstract: A run is a maximal occurrence of a repetition $v$ with a period $p$ such that $2p \le |v|$. The maximal number of runs in a string of length $n$ was studied by several authors and it is known to be between $0.944 n$ and $1.029 n$. We investigate highly periodic runs, in which the shortest period $p$ satisfies $3p \le |v|$. We show the upper bound $0.5n$ on the maximal number of such runs in a st…

    Submitted 13 July, 2009; originally announced July 2009.

    Comments: 8 pages, 2 figures
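
The definitions of a run and of a highly periodic run can be explored on small strings with a naive enumeration. The sketch below is a quadratic-time illustration of the definitions, not a method from the paper; the names `runs` and `highly_periodic_runs` and the half-open 0-based interval convention are assumptions.

```python
def runs(w: str):
    """Return all runs of w as sorted (start, end, period) triples.

    w[start:end] is a maximal factor with shortest period `period`
    and length at least 2 * period.
    """
    n = len(w)
    found = {}  # (start, end) -> shortest period of that interval
    for p in range(1, n // 2 + 1):
        i = 0
        while i + p < n:
            if w[i] != w[i + p]:
                i += 1
                continue
            a = i
            while i + p < n and w[i] == w[i + p]:
                i += 1
            # w[a : i + p] has period p and cannot be extended on either side.
            if i - a >= p:  # length (i - a + p) is at least 2p
                # Periods are tried in increasing order, so the first
                # period recorded for an interval is its shortest one.
                found.setdefault((a, i + p), p)
    return sorted((s, e, p) for (s, e), p in found.items())


def highly_periodic_runs(w: str):
    """Runs whose shortest period p satisfies 3p <= length."""
    return [(s, e, p) for (s, e, p) in runs(w) if 3 * p <= e - s]
```

For `"aabaabaa"` the runs are `(0, 2, 1)`, `(0, 8, 3)`, `(3, 5, 1)` and `(6, 8, 1)`, none of which is highly periodic, while `"aaa"` consists of the single highly periodic run `(0, 3, 1)`.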