-
Composition and Weight Pushing of Monotonic Subsequential Failure Transducers Representing Probabilistic Models
Authors:
Diana Geneva,
Georgi Shopov,
Stoyan Mihov
Abstract:
We present a construction for the composition of subsequential transducers (representing conditional probabilistic models) with subsequential failure transducers (representing probabilistic models). Under certain conditions, satisfied by the corresponding transduction devices, a more efficient construction is applicable that avoids the creation of unnecessary states. Furthermore, the weights of the resulting failure transducers can be efficiently redistributed via weight pushing in the $\langle \mathbb{R}_+, +, \times, 0, 1 \rangle$ and $\langle \mathbb{R}_+, \max, \times, 0, 1 \rangle$ semirings.
Submitted 21 May, 2020; v1 submitted 20 March, 2020;
originally announced March 2020.
-
Space-Efficient Bimachine Construction Based on the Equalizer Accumulation Principle
Authors:
Stefan Gerdjikov,
Stoyan Mihov,
Klaus U. Schulz
Abstract:
Algorithms found in the literature for building bimachines from functional transducers imitate, in a run of the bimachine, one successful path of the input transducer. Each single bimachine output exactly corresponds to the output of a single transducer transition. Here we introduce an alternative construction principle where bimachine steps take alternative parallel transducer paths into account, maximizing the possible output at each step using a joint view. The size of both the deterministic left and right automaton of the bimachine is restricted by $2^{\vert Q\vert}$ where $\vert Q\vert$ is the number of transducer states. Other bimachine constructions lead to larger subautomata. As a concrete example we present a class of real-time functional transducers with $n+2$ states for which the standard bimachine construction generates a bimachine with at least $\Theta(n!)$ states whereas the construction based on the equalizer accumulation principle leads to $2^n + n + 3$ states. Our construction can be applied to rational functions from free monoids to "mge monoids", a large class of monoids including free monoids, groups, and others that is closed under Cartesian products.
Submitted 27 February, 2018;
originally announced March 2018.
-
Myhill-Nerode Relation for Sequentiable Structures
Authors:
Stefan Gerdjikov,
Stoyan Mihov
Abstract:
Sequentiable structures are a subclass of monoids that generalise the free monoids and the monoid of non-negative real numbers with addition. In this paper we consider functions $f:Σ^*\rightarrow {\cal M}$ and define the Myhill-Nerode relation for these functions. We prove that a function of finite index, $n$, can be represented with a subsequential transducer with $n$ states.
Submitted 8 June, 2017;
originally announced June 2017.
-
Good parts first - a new algorithm for approximate search in lexica and string databases
Authors:
Stefan Gerdjikov,
Stoyan Mihov,
Petar Mitankin,
Klaus U. Schulz
Abstract:
We present a new efficient method for approximate search in electronic lexica. Given an input string (the pattern) and a similarity threshold, the algorithm retrieves all entries of the lexicon that are sufficiently similar to the pattern. Search is organized in subsearches that always start with an exact partial match where a substring of the input pattern is aligned with a substring of a lexicon word. Afterwards this partial match is extended stepwise to larger substrings. For aligning further parts of the pattern with corresponding parts of lexicon entries, more errors are tolerated at each subsequent step. For supporting this alignment order, which may start at any part of the pattern, the lexicon is represented as a structure that enables immediate access to any substring of a lexicon word and permits the extension of such substrings in both directions. Experimental evaluations of the approximate search procedure are given that show significant efficiency improvements compared to existing techniques. Since the technique can be used for large error bounds it offers interesting possibilities for approximate search in special collections of "long" strings, such as phrases, sentences, or book ti
Submitted 3 December, 2015; v1 submitted 4 January, 2013;
originally announced January 2013.
-
Incremental construction of minimal acyclic finite-state automata
Authors:
Jan Daciuk,
Stoyan Mihov,
Bruce Watson,
Richard Watson
Abstract:
In this paper, we describe a new method for constructing minimal, deterministic, acyclic finite-state automata from a set of strings. Traditional methods consist of two phases: the first to construct a trie, the second one to minimize it. Our approach is to construct a minimal automaton in a single phase by adding new strings one by one and minimizing the resulting automaton on-the-fly. We present a general algorithm as well as a specialization that relies upon the lexicographical ordering of the input strings.
Submitted 6 July, 2000;
originally announced July 2000.
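
The single-phase idea in the last abstract (add strings one by one and minimize on the fly, using the lexicographical ordering of the input) can be illustrated with a minimal sketch. This is not the authors' code: the names `State`, `build_minimal_automaton`, `accepts` and `count_states` are hypothetical, and the sketch assumes the input is already sorted. After each new word, the states on the previous word's path that can no longer change are either merged with an equivalent registered state or registered themselves.

```python
import itertools

_ids = itertools.count()  # hypothetical helper: unique ids for states


class State:
    def __init__(self):
        self.id = next(_ids)
        self.edges = {}      # label -> State
        self.final = False

    def signature(self):
        # States are equivalent if they agree on finality and on their
        # (label, target) pairs; targets are canonical when this is called.
        return (self.final,
                tuple(sorted((c, t.id) for c, t in self.edges.items())))


def build_minimal_automaton(sorted_words):
    """Build a minimal acyclic automaton from lexicographically sorted words."""
    register = {}            # signature -> canonical state
    root = State()
    previous = None

    def replace_or_register(state):
        # Only the most recently added child (largest label) can be unregistered.
        label = max(state.edges)
        child = state.edges[label]
        if child.edges:
            replace_or_register(child)
        canonical = register.get(child.signature())
        if canonical is None:
            register[child.signature()] = child
        else:
            state.edges[label] = canonical   # merge with an equivalent state

    for word in sorted_words:
        if previous is not None and word < previous:
            raise ValueError("input must be sorted")
        # Follow the prefix shared with what has been inserted so far.
        state, i = root, 0
        while i < len(word) and word[i] in state.edges:
            state = state.edges[word[i]]
            i += 1
        # The old suffix below this state is now frozen: minimize it.
        if state.edges:
            replace_or_register(state)
        # Append the remaining suffix of the new word as fresh states.
        for ch in word[i:]:
            state.edges[ch] = State()
            state = state.edges[ch]
        state.final = True
        previous = word

    if root.edges:
        replace_or_register(root)
    return root


def accepts(root, word):
    state = root
    for ch in word:
        state = state.edges.get(ch)
        if state is None:
            return False
    return state.final


def count_states(root):
    seen, stack = set(), [root]
    while stack:
        s = stack.pop()
        if s.id not in seen:
            seen.add(s.id)
            stack.extend(s.edges.values())
    return len(seen)
```

For example, building from `["tap", "taps", "top", "tops"]` shares the common suffixes, so the `a` and `o` branches lead to the same state, whereas a plain trie for the same words would need separate states for each branch.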