
Showing 1–20 of 20 results for author: Faro, S

Searching in archive cs.
  1. arXiv:2310.15711  [pdf, other]

    cs.DS

    Efficient Online String Matching through Linked Weak Factors

    Authors: Matthew N. Palmer, Simone Faro, Stefano Scafiti

    Abstract: Online string matching is a computational problem involving the search for patterns or substrings in a large text dataset, with the pattern and text being processed sequentially, without prior access to the entire text. Its relevance stems from applications in data compression, data mining, text editing, and bioinformatics, where rapid and efficient pattern matching is crucial. Various solutions h…

    Submitted 24 October, 2023; originally announced October 2023.

  2. arXiv:2309.01250  [pdf, ps, other]

    cs.DS quant-ph

    Longest Common Substring and Longest Palindromic Substring in $\tilde{\mathcal{O}}(\sqrt{n})$ Time

    Authors: Domenico Cantone, Simone Faro, Arianna Pavone, Caterina Viola

    Abstract: The Longest Common Substring (LCS) and Longest Palindromic Substring (LPS) are classical problems in computer science, representing fundamental challenges in string processing. Both problems can be solved in linear time using a classical model of computation, by means of very similar algorithms, both relying on the use of suffix trees. Very recently, two sublinear algorithms for LCS and LPS in the…

    Submitted 3 September, 2023; originally announced September 2023.

  3. arXiv:2308.11758  [pdf, ps, other]

    cs.DS

    Quantum Circuits for Fixed Substring Matching Problems

    Authors: Domenico Cantone, Simone Faro, Arianna Pavone, Caterina Viola

    Abstract: Quantum computation represents a computational paradigm whose distinctive attributes confer the ability to devise algorithms with asymptotic performance levels significantly superior to those achievable via classical computation. Recent strides have been taken to apply this computational framework in tackling and resolving various issues related to text processing. The resultant solutions demonstr…

    Submitted 22 August, 2023; originally announced August 2023.

  4. arXiv:2303.18063  [pdf, ps, other]

    cs.DS

    The Many Qualities of a New Directly Accessible Compression Scheme

    Authors: Domenico Cantone, Simone Faro

    Abstract: We present a new variable-length computation-friendly encoding scheme, named SFDC (Succinct Format with Direct aCcesibility), that supports direct and fast accessibility to any element of the compressed sequence and achieves compression ratios often higher than those offered by other solutions in the literature. The SFDC scheme provides a flexible and simple representation geared towards either pr…

    Submitted 31 March, 2023; originally announced March 2023.

    Comments: 33 pages

  5. arXiv:2303.03749  [pdf, ps, other]

    cs.PL

    Daml: A Smart Contract Language for Securely Automating Real-World Multi-Party Business Workflows

    Authors: Alexander Bernauer, Sofia Faro, Rémy Hämmerle, Martin Huschenbett, Moritz Kiefer, Andreas Lochbihler, Jussi Mäki, Francesco Mazzoli, Simon Meier, Neil Mitchell, Ratko G. Veprek

    Abstract: Distributed ledger technologies, also known as blockchains for enterprises, promise to significantly reduce the high cost of automating multi-party business workflows. We argue that a programming language for writing such on-ledger logic should satisfy three desiderata: (1) Provide concepts to capture the legal rules that govern real-world business workflows. (2) Include simple means for specifyin…

    Submitted 7 March, 2023; originally announced March 2023.

    ACM Class: D.3.1; F.3.2

  6. arXiv:2101.00718  [pdf, other]

    cs.DS cs.LO

    Text Searching Allowing for Non-Overlapping Adjacent Unbalanced Translocations

    Authors: Domenico Cantone, Simone Faro, Arianna Pavone

    Abstract: In this paper we investigate the approximate string matching problem when the allowed edit operations are non-overlapping unbalanced translocations of adjacent factors. Edit operations of this kind take place when two adjacent substrings of the text swap, resulting in a modified string. The two involved substrings are allowed to be of different lengths. Such large-scale modificati…

    Submitted 3 January, 2021; originally announced January 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:1812.00421

  7. arXiv:1911.01644  [pdf, other]

    cs.DS

    Fast Multiple Pattern Cartesian Tree Matching

    Authors: Geonmo Gu, Siwoo Song, Simone Faro, Thierry Lecroq, Kunsoo Park

    Abstract: Cartesian tree matching is the problem of finding all substrings in a given text which have the same Cartesian trees as that of a given pattern. In this paper, we deal with Cartesian tree matching for the case of multiple patterns. We present two fingerprinting methods, i.e., the parent-distance encoding and the binary encoding. By combining an efficient fingerprinting method and a conventional mu…

    Submitted 5 November, 2019; originally announced November 2019.

    Comments: Submitted to WALCOM 2020

  8. arXiv:1908.05930  [pdf, ps, other]

    cs.DS

    Efficient Online String Matching Based on Characters Distance Text Sampling

    Authors: Simone Faro, Arianna Pavone, Francesco Pio Marino

    Abstract: Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other fields, like natural language processing, information retrieval and computational biology. Sampled string matching is an efficient approach recently introduced in order to overcome the prohibitive space requirements of an index construction, on the one hand, and drastic…

    Submitted 16 August, 2019; originally announced August 2019.

  9. arXiv:1908.04937  [pdf, other]

    cs.DS

    Fast Cartesian Tree Matching

    Authors: Siwoo Song, Cheol Ryu, Simone Faro, Thierry Lecroq, Kunsoo Park

    Abstract: Cartesian tree matching is the problem of finding all substrings of a given text which have the same Cartesian trees as that of a given pattern. So far there is one linear-time solution for Cartesian tree matching, which is based on the KMP algorithm. We improve the running time of the previous solution by introducing new representations. We present the framework of a binary filtration method and…

    Submitted 13 August, 2019; originally announced August 2019.

    Comments: 14 pages, 3 figures, Submitted to SPIRE 2019

  10. arXiv:1812.00421  [pdf, other]

    cs.DS

    Sequence Searching Allowing for Non-Overlapping Adjacent Unbalanced Translocations

    Authors: Domenico Cantone, Simone Faro, Arianna Pavone

    Abstract: Unbalanced translocations are among the most frequent chromosomal alterations, accounting for 30% of all losses of heterozygosity, a major genetic event causing inactivation of tumor suppressor genes. Despite their central role in genomic sequence analysis, little attention has been devoted to the problem of matching sequences allowing for this kind of chromosomal alteration. In this paper we i…

    Submitted 2 December, 2018; originally announced December 2018.

  11. arXiv:1803.02807  [pdf, ps, other]

    cs.DS cs.IR

    Flexible and Efficient Algorithms for Abelian Matching in Strings

    Authors: Simone Faro, Arianna Pavone

    Abstract: The abelian pattern matching problem consists in finding all substrings of a text which are permutations of a given pattern. This problem finds application in many areas and can be solved in linear time by a naive sliding window approach. In this short communication we present a new class of algorithms based on a new efficient fingerprint computation approach, called Heap-Counting, which turns out…

    Submitted 7 March, 2018; originally announced March 2018.

    Comments: This is a short preliminary version of a full paper submitted to an international journal. Most examples, details, lemmas and theorems have been omitted

  12. arXiv:1707.00469  [pdf, ps, other]

    cs.DS cs.IR

    Speeding Up String Matching by Weak Factor Recognition

    Authors: Domenico Cantone, Simone Faro, Arianna Pavone

    Abstract: String matching is the problem of finding all the substrings of a text which match a given pattern. It is one of the most investigated problems in computer science, mainly due to its very diverse applications in several fields. Recently, much research in the string matching field has focused on the efficiency and flexibility of the searching procedure and quite effective techniques have been propo…

    Submitted 3 July, 2017; originally announced July 2017.

    Comments: 11 pages, appeared in proceedings of the Prague Stringology Conference 2017

  13. arXiv:1605.05067  [pdf, other]

    cs.DS

    Exact Online String Matching Bibliography

    Authors: Simone Faro

    Abstract: In this short note we present a comprehensive bibliography for the online exact string matching problem. The problem consists in finding all occurrences of a given pattern in a text. It is an extensively studied problem in computer science, mainly due to its direct applications to such diverse areas as text, image and signal processing, speech analysis and recognition, data compression, informatio…

    Submitted 17 May, 2016; originally announced May 2016.

    Comments: 23 pages

  14. arXiv:1507.00133  [pdf, other]

    cs.CL

    Prior Polarity Lexical Resources for the Italian Language

    Authors: Valeria Borzì, Simone Faro, Arianna Pavone, Sabrina Sansone

    Abstract: In this paper we present SABRINA (Sentiment Analysis: a Broad Resource for Italian Natural language Applications), a manually annotated prior polarity lexical resource for Italian natural language applications in the field of opinion mining and sentiment induction. The resource consists of two different sets, an Italian dictionary of more than 277,000 words tagged with their prior polarity value, a…

    Submitted 1 July, 2015; originally announced July 2015.

    Comments: 10 pages, Accepted to NLPCS 2015, the 12th International Workshop on Natural Language Processing and Cognitive Science

  15. arXiv:1501.04001  [pdf, other]

    cs.DS

    Efficient Algorithms for the Order Preserving Pattern Matching Problem

    Authors: Simone Faro, Oğuzhan Külekci

    Abstract: Given a pattern x of length m and a text y of length n, both over an ordered alphabet, the order-preserving pattern matching problem consists in finding all substrings of the text with the same relative order as the pattern. It is an approximate variant of the well-known exact pattern matching problem which has gained attention in recent years. This interesting problem finds applications in a lot…

    Submitted 16 January, 2015; originally announced January 2015.

    Comments: 16 pages, 3 figures, submitted to SEA 2015 conference

  16. arXiv:1209.6449  [pdf, ps, other]

    cs.IR cs.DS cs.PF

    Fast Packed String Matching for Short Patterns

    Authors: Simone Faro, M. Oguzhan Külekci

    Abstract: Searching for all occurrences of a pattern in a text is a fundamental problem in computer science with applications in many other fields, like natural language processing, information retrieval and computational biology. In the last two decades a general trend has appeared trying to exploit the power of the word RAM model to speed up the performance of classical string matching algorithms. In thi…

    Submitted 28 September, 2012; originally announced September 2012.

    Comments: 15 pages

  17. arXiv:1012.2547  [pdf, ps, other]

    cs.DS

    The Exact String Matching Problem: a Comprehensive Experimental Evaluation

    Authors: Simone Faro, Thierry Lecroq

    Abstract: This paper addresses the online exact string matching problem which consists in finding all occurrences of a given pattern p in a text t. It is an extensively studied problem in computer science, mainly due to its direct applications to such diverse areas as text, image and signal processing, speech analysis and recognition, data compression, information retrieval, computational biology and chemis…

    Submitted 12 December, 2010; originally announced December 2010.

    Comments: 22 pages

  18. arXiv:1012.1338  [pdf, ps, other]

    cs.DS

    On Tuning the Bad-Character Rule: the Worst-Character Rule

    Authors: Domenico Cantone, Simone Faro

    Abstract: In this note we present the worst-character rule, an efficient variation of the bad-character heuristic for the exact string matching problem, first introduced in the well-known Boyer-Moore algorithm. Our proposed rule selects a position relative to the current shift which yields the largest average advancement, according to the character distribution in the text. Experimental results show that…

    Submitted 6 December, 2010; originally announced December 2010.

    Comments: 10 pages

  19. String Matching with Inversions and Translocations in Linear Average Time (Most of the Time)

    Authors: Szymon Grabowski, Simone Faro, Emanuele Giaquinta

    Abstract: We present an efficient algorithm for finding all approximate occurrences of a given pattern $p$ of length $m$ in a text $t$ of length $n$ allowing for translocations of equal-length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an $\mathcal{O}(nm\max(\alpha, \beta))$-time complexity in the worst case and $\mathcal{O}(\max(\alpha, \beta))$-space complexity, where…

    Submitted 1 December, 2010; originally announced December 2010.

    Comments: 9 pages. A slightly shorter version of this manuscript was submitted to Information Processing Letters

  20. arXiv:0810.2390  [pdf, ps, other]

    cs.DS cs.IR

    Efficient Pattern Matching on Binary Strings

    Authors: Simone Faro, Thierry Lecroq

    Abstract: The binary string matching problem consists in finding all the occurrences of a pattern in a text where both strings are built on a binary alphabet. This is an interesting problem in computer science, since binary data are omnipresent in telecom and computer network applications. Moreover, the problem also finds applications in the field of image processing and in pattern matching on compressed t…

    Submitted 15 October, 2008; v1 submitted 14 October, 2008; originally announced October 2008.

    Comments: 12 pages

    ACM Class: F.2.2; H.3.3; E.4