-
Equal Requests are Asymptotically Hardest for Data Recovery
Authors:
Jüri Lember,
Ago-Erik Riet
Abstract:
In a distributed storage system serving hot data, the data recovery performance becomes important, captured e.g. by the service rate. We give partial evidence for it being hardest to serve a sequence of equal user requests (as in PIR coding regime) both for concrete and random user requests and server contents.
We prove that a constant request sequence is locally hardest to serve: If enough copies of each vector are stored in servers, then if a request sequence with all requests equal can be served then we can still serve it if a few requests are changed.
For random iid server contents, with number of data symbols constant (for simplicity) and the number of servers growing, we show that the maximum number of user requests we can serve divided by the number of servers we need approaches a limit almost surely. For uniform server contents, we show this limit is 1/2, both for sequences of copies of a fixed request and of any requests, so it is at least as hard to serve equal requests as any requests. For iid requests independent from the uniform server contents the limit is at least 1/2 and equal to 1/2 if requests are all equal to a fixed request almost surely, confirming the same.
As a building block, we deduce from a 1952 result of Marshall Hall, Jr. on abelian groups, that any collection of half as many requests as coded symbols in the doubled binary simplex code can be served by this code. This implies the fractional version of the Functional Batch Code Conjecture that allows half-servers.
Submitted 3 May, 2024;
originally announced May 2024.
-
Black hole solutions in scalar-tensor symmetric teleparallel gravity
Authors:
Sebastian Bahamonde,
Jorge Gigante Valcarcel,
Laur Järv,
Joosep Lember
Abstract:
Symmetric teleparallel gravity is constructed with a nonzero nonmetricity tensor while both torsion and curvature are vanishing. In this framework, we find exact scalarised spherically symmetric static solutions in scalar-tensor theories built with a nonminimal coupling between the nonmetricity scalar and a scalar field. It turns out that the Bocharova-Bronnikov-Melnikov-Bekenstein solution has a symmetric teleparallel analogue (in addition to the recently found metric teleparallel analogue), while some other of these solutions describe scalarised black hole configurations that are not known in the Riemannian or metric teleparallel scalar-tensor case. To aid the analysis we also derive no-hair theorems for the theory. Since the symmetric teleparallel scalar-tensor models also include $f(Q)$ gravity, we shortly discuss this case and further prove a theorem which says that by imposing that the metric functions are the reciprocal of each other ($g_{rr}=1/g_{tt}$), the $f(Q)$ gravity theory reduces to the symmetric teleparallel equivalent of general relativity (plus a cosmological constant), and the metric takes the (Anti)de-Sitter-Schwarzschild form.
Submitted 1 September, 2022; v1 submitted 6 June, 2022;
originally announced June 2022.
-
An evolution model with uncountably many alleles
Authors:
Daniela Bertacchi,
Juri Lember,
Fabio Zucca
Abstract:
We study a class of evolution models, where the breeding process involves an arbitrary exchangeable process, allowing for mutations to appear. The population size $n$ is fixed, hence after breeding, selection is applied. Individuals are characterized by their genome, picked inside a set $X$ (which may be uncountable), and there is a fitness associated to each genome. Being less fit implies a higher chance of being discarded in the selection process. The stationary distribution of the process can be described and studied. We are interested in the asymptotic behavior of this stationary distribution as $n$ goes to infinity. Choosing a parameter $λ>0$ to tune the scaling of the fitness when $n$ grows, we prove limiting theorems both for the case when the breeding process does not depend on $n$, and for the case when it is given by a Dirichlet process prior. In both cases, the limit exhibits phase transitions depending on the parameter $λ
Submitted 30 April, 2022;
originally announced May 2022.
-
Hybrid classifiers of pairwise Markov models
Authors:
Kristi Kuljus,
Jüri Lember
Abstract:
The article studies segmentation problem (also known as classification problem) with pairwise Markov models (PMMs). A PMM is a process where the observation process and underlying state sequence form a two-dimensional Markov chain, it is a natural generalization of a hidden Markov model. To demonstrate the richness of the class of PMMs, we examine closer a few examples of rather different types of PMMs: a model for two related Markov chains, a model that allows to model an inhomogeneous Markov chain as a homogeneous one and a semi-Markov model. The segmentation problem assumes that one of the marginal processes is observed and the other one is not, the problem is to estimate the unobserved state path given the observations. The standard state path estimators often used are the so-called Viterbi path (a sequence with maximum state path probability given the observations) or the pointwise maximum a posteriori (PMAP) path (a sequence that maximizes the conditional state probability for given observations pointwise). Both these estimators have their limitations, therefore we derive formulas for calculating the so-called hybrid path estimators which interpolate between the PMAP and Viterbi path. We apply the introduced algorithms to the studied models in order to demonstrate the properties of different segmentation methods, and to illustrate large variation in behaviour of different segmentation methods in different PMMs. The studied examples show that a segmentation method should always be chosen with care by taking into account the particular model of interest.
Submitted 20 March, 2022;
originally announced March 2022.
-
Global portraits of nonminimal teleparallel inflation
Authors:
Laur Järv,
Joosep Lember
Abstract:
We construct the global phase portraits of inflationary dynamics in teleparallel gravity models with a scalar field nonminimally coupled to torsion scalar. The adopted set of variables can clearly distinguish between different asymptotic states as fixed points, including the kinetic and inflationary regimes. The key role in the description of inflation is played by the heteroclinic orbits which run from the asymptotic saddle points to the late time attractor point and are approximated by nonminimal slow roll conditions. To seek the asymptotic fixed points we outline a heuristic method in terms of the "effective potential" and "effective mass", which can be applied for any nonminimally coupled theories. As particular examples we study positive quadratic nonminimal couplings with quadratic and quartic potentials, and note how the portraits differ qualitatively from the known scalar-curvature counterparts. For quadratic models inflation can only occur at small nonminimal coupling to torsion, as for larger coupling the asymptotic de Sitter saddle point disappears from the physical phase space. Teleparallel models with quartic potentials are not viable for inflation at all, since for small nonminimal coupling the asymptotic saddle point exhibits weaker than exponential expansion, and for larger coupling disappears too.
Submitted 2 March, 2022; v1 submitted 29 April, 2021;
originally announced April 2021.
-
Regenerativity of Viterbi process for pairwise Markov models
Authors:
Jüri Lember,
Joonas Sova
Abstract:
For hidden Markov models one of the most popular estimates of the hidden chain is the Viterbi path -- the path maximising the posterior probability. We consider a more general setting, called the pairwise Markov model (PMM), where the joint process consisting of finite-state hidden process and observation process is assumed to be a Markov chain. It has been recently proven that under some conditions the Viterbi path of the PMM can almost surely be extended to infinity, thereby defining the infinite Viterbi decoding of the observation sequence, called the Viterbi process. This was done by constructing a block of observations, called a barrier, which ensures that the Viterbi path goes through a given state whenever this block occurs in the observation sequence. In this paper we prove that the joint process consisting of Viterbi process and PMM is regenerative. The proof involves a delicate construction of regeneration times which coincide with the occurrences of barriers. As one possible application of our theory, some results on the asymptotics of the Viterbi training algorithm are derived.
Submitted 15 March, 2021;
originally announced March 2021.
-
Exponential forgetting of smoothing distributions for pairwise Markov models
Authors:
Jüri Lember,
Joonas Sova
Abstract:
We consider a bivariate Markov chain $Z=\{Z_k\}_{k \geq 1}=\{(X_k,Y_k)\}_{k \geq 1}$ taking values on product space ${\cal Z}={\cal X} \times{ \cal Y}$, where ${\cal X}$ is possibly uncountable space and ${\cal Y}=\{1,\ldots, |{\cal Y}|\}$ is a finite state-space. The purpose of the paper is to find sufficient conditions that guarantee the exponential convergence of smoothing, filtering and predictive probabilities: $$\sup_{n\geq t}\|P(Y_{t:\infty}\in \cdot|X_{l:n})-P(Y_{t:\infty}\in \cdot|X_{s:n}) \|_{\rm TV} \leq K_s α^{t}, \quad \mbox{a.s.}$$ Here $t\geq s\geq l\geq 1$, $K_s$ is $σ(X_{s:\infty})$-measurable finite random variable and $α\in (0,1)$ is fixed. In the second part of the paper, we establish two-sided versions of the above-mentioned convergence. We show that the desired convergences hold under fairly general conditions. A special case of above-mentioned very general model is popular hidden Markov model (HMM). We prove that in HMM-case, our assumptions are more general than all similar mixing-type of conditions encountered in practice, yet relatively easy to verify.
Submitted 9 March, 2021;
originally announced March 2021.
-
Estimating the logarithm of characteristic function and stability parameter for symmetric stable laws
Authors:
Annika Krutto,
Jüri Lember
Abstract:
Let $X_1,\ldots,X_n$ be an i.i.d. sample from a symmetric stable distribution with stability parameter $α$ and scale parameter $γ$. Let $\varphi_n$ be the empirical characteristic function. We prove a uniform large deviation inequality: given precision $ε>0$ and probability $p\in (0,1)$, there exists a universal (depending on $ε$ and $p$ but not depending on $α$ and $γ$) constant $\bar{r}>0$ so that $$P\big(\sup_{u>0:r(u)\leq \bar{r}}|r(u)-\hat{r}(u)|\geq ε\big)\leq p,$$ where $r(u)=(uγ)^α$ and $\hat{r}(u)=-\ln|\varphi_n(u)|$. As an application of the result, we show how it can be used to estimate the unknown stability parameter $α$.
Submitted 10 August, 2020;
originally announced August 2020.
-
MAP segmentation in Bayesian hidden Markov models: a case study
Authors:
Alexey Koloydenko,
Kristi Kuljus,
Jüri Lember
Abstract:
We consider the problem of estimating the maximum posterior probability (MAP) state sequence for a finite state and finite emission alphabet hidden Markov model (HMM) in the Bayesian setup, where both emission and transition matrices have Dirichlet priors. We study a training set consisting of thousands of protein alignment pairs. The training data is used to set the prior hyperparameters for Bayesian MAP segmentation. Since the Viterbi algorithm is not applicable any more, there is no simple procedure to find the MAP path, and several iterative algorithms are considered and compared. The main goal of the paper is to test the Bayesian setup against the frequentist one, where the parameters of HMM are estimated using the training data.
Submitted 17 April, 2020;
originally announced April 2020.
-
An evolutionary model that satisfies detailed balance
Authors:
Jüri Lember,
Chris Watkins
Abstract:
We propose a class of evolutionary models that involves an arbitrary exchangeable process as the breeding process and different selection schemes. In those models, a new genome is born according to the breeding process, and then a genome is removed according to the selection scheme that involves fitness. Thus the population size remains constant. The process evolves according to a Markov chain, and, unlike in many other existing models, the stationary distribution -- so called mutation-selection equilibrium -- can be easily found and studied. The behaviour of the stationary distribution when the population size increases is our main object of interest. Several phase-transition theorems are proved.
Submitted 24 August, 2020; v1 submitted 27 February, 2019;
originally announced February 2019.
-
A stochastic model for the evolution of species with random fitness
Authors:
Daniela Bertacchi,
Juri Lember,
Fabio Zucca
Abstract:
We generalize the evolution model introduced by Guiol, Machado and Schinazi (2010). In our model at odd times a random number X of species is created. Each species is endowed with a random fitness with arbitrary distribution on $[0, 1]$. At even times a random number Y of species is removed, killing the species with lower fitness. We show that there is a critical fitness $f_c$ below which the number of species hits zero i.o. and above which this number goes to infinity. We prove uniform convergence for the distribution of surviving species and describe the phenomena which could not be observed in previous works with uniformly distributed fitness.
Submitted 1 May, 2018; v1 submitted 19 April, 2018;
originally announced April 2018.
-
Estimation of Viterbi path in Bayesian hidden Markov models
Authors:
Jüri Lember,
Dario Gasbarra,
Alexey Koloydenko,
Kristi Kuljus
Abstract:
The article studies different methods for estimating the Viterbi path in the Bayesian framework. The Viterbi path is an estimate of the underlying state path in hidden Markov models (HMMs), which has a maximum posterior probability (MAP). For an HMM with given parameters, the Viterbi path can be easily found with the Viterbi algorithm. In the Bayesian framework the Viterbi algorithm is not applicable and several iterative methods can be used instead. We introduce a new EM-type algorithm for finding the MAP path and compare it with various other methods for finding the MAP path, including the variational Bayes approach and MCMC methods. Examples with simulated data are used to compare the performance of the methods. The main focus is on non-stochastic iterative methods and our results show that the best of those methods work as well or better than the best MCMC methods. Our results demonstrate that when the primary goal is segmentation, then it is more reasonable to perform segmentation directly by considering the transition and emission parameters as nuisance parameters.
Submitted 11 May, 2019; v1 submitted 5 February, 2018;
originally announced February 2018.
-
Quantifying the Estimation Error of Principal Components
Authors:
Raphael Hauser,
Raul Kangro,
Jüri Lember,
Heinrich Matzinger
Abstract:
Principal component analysis is an important pattern recognition and dimensionality reduction tool in many applications. Principal components are computed as eigenvectors of a maximum likelihood covariance $\widehatΣ$ that approximates a population covariance $Σ$, and these eigenvectors are often used to extract structural information about the variables (or attributes) of the studied population. Since PCA is based on the eigendecomposition of the proxy covariance $\widehatΣ$ rather than the ground-truth $Σ$, it is important to understand the approximation error in each individual eigenvector as a function of the number of available samples. The recent results of Koltchinskii and Lounici yield such bounds. In the present paper we sharpen these bounds and show that eigenvectors can often be reconstructed to a required accuracy from a sample of strictly smaller size order.
Submitted 27 October, 2017;
originally announced October 2017.
-
Existence of infinite Viterbi path for pairwise Markov models
Authors:
Jüri Lember,
Joonas Sova
Abstract:
For hidden Markov models one of the most popular estimates of the hidden chain is the Viterbi path -- the path maximising the posterior probability. We consider a more general setting, called the pairwise Markov model, where the joint process consisting of finite-state hidden regime and observation process is assumed to be a Markov chain. We prove that under some conditions it is possible to extend the Viterbi path to infinity for almost every observation sequence, which in turn enables us to define an infinite Viterbi decoding of the observation process, called the Viterbi process. This is done by constructing a block of observations, called a barrier, which ensures that the Viterbi path goes through a given state whenever this block occurs in the observation sequence.
Submitted 12 August, 2017;
originally announced August 2017.
-
Comparison of hidden Markov chain models and hidden Markov random field models in estimation of computed tomography images
Authors:
Kristi Kuljus,
Fekadu L. Bayisa,
David Bolin,
Jüri Lember,
Jun Yu
Abstract:
There is an interest to replace computed tomography (CT) images with magnetic resonance (MR) images for a number of diagnostic and therapeutic workflows. In this article, predicting CT images from a number of magnetic resonance imaging (MRI) sequences using regression approach is explored. Two principal areas of application for estimated CT images are dose calculations in MRI-based radiotherapy treatment planning and attenuation correction for positron emission tomography (PET)/MRI. The main purpose of this work is to investigate the performance of hidden Markov (chain) models (HMMs) in comparison to hidden Markov random field (HMRF) models when predicting CT images of head. Our study shows that HMMs have clear advantages over HMRF models in this particular application. Obtained results suggest that HMMs deserve a further study for investigating their potential in modelling applications where the most natural theoretical choice would be the class of HMRF models.
Submitted 4 May, 2017;
originally announced May 2017.
-
Lower bounds for moments of global scores of pairwise Markov chains
Authors:
Jüri Lember,
Heinrich Matzinger,
Joonas Sova,
Fabio Zucca
Abstract:
Let $X_1,X_2,\ldots$ and $Y_1,Y_2,\ldots$ be two random sequences so that every random variable takes values in a finite set $\mathbb{A}$. We consider a global similarity score $L_n:=L(X_1,\ldots,X_n;Y_1,\ldots,Y_n)$ that measures the homology (relatedness) of words $(X_1,\ldots,X_n)$ and $(Y_1,\ldots,Y_n)$. A typical example of such score is the length of the longest common subsequence. We study the order of central absolute moment $E|L_n-EL_n|^r$ in the case where two-dimensional process $(X_1,Y_1),(X_2,Y_2),\ldots$ is a Markov chain on $\mathbb{A}\times \mathbb{A}$. This is a very general model involving independent Markov chains, hidden Markov models, Markov switching models and many more. Our main result establishes a general condition that guarantees that $E|L_n-EL_n|^r\asymp n^{r\over 2}$. We also perform simulations indicating the validity of the condition.
Submitted 18 February, 2016; v1 submitted 17 February, 2016;
originally announced February 2016.
-
Lower Bounds on the Generalized Central Moments of the Optimal Alignments Score of Random Sequences
Authors:
Ruoting Gong,
Christian Houdré,
Jüri Lember
Abstract:
We present a general approach to the problem of determining tight asymptotic lower bounds for generalized central moments of the optimal alignment score of two independent sequences of i.i.d. random variables. At first, these are obtained under a main assumption for which sufficient conditions are provided. When the main assumption fails, we nevertheless develop a "uniform approximation" method leading to asymptotic lower bounds. Our general results are then applied to the length of the longest common subsequence of binary strings, in which case asymptotic lower bounds are obtained for the moments and the exponential moments of the optimal score. As a byproduct, a local upper bound on the rate function associated with the length of the longest common subsequences of two binary strings is also obtained.
Submitted 24 November, 2016; v1 submitted 19 June, 2015;
originally announced June 2015.
-
New Bounds for Permutation Codes in Ulam Metric
Authors:
Faruk Göloğlu,
Jüri Lember,
Ago-Erik Riet,
Vitaly Skachek
Abstract:
New bounds on the cardinality of permutation codes equipped with the Ulam distance are presented. First, an integer-programming upper bound is derived, which improves on the Singleton-type upper bound in the literature for some lengths. Second, several probabilistic lower bounds are developed, which improve on the known lower bounds for large minimum distances. The results of a computer search for permutation codes are also presented.
Submitted 20 April, 2015;
originally announced April 2015.
-
Optimal alignments of longest common subsequences and their path properties
Authors:
Jüri Lember,
Heinrich Matzinger,
Anna Vollmer
Abstract:
We investigate the behavior of optimal alignment paths for homologous (related) and independent random sequences. An alignment between two finite sequences is optimal if it corresponds to the longest common subsequence (LCS). We prove the existence of lowest and highest optimal alignments and study their differences. High differences between the extremal alignments imply the high variety of all optimal alignments. We present several simulations indicating that the homologous (having the same common ancestor) sequences have typically the distance between the extremal alignments of much smaller size than independent sequences. In particular, the simulations suggest that for the homologous sequences, the growth of the distance between the extremal alignments is logarithmical. The main theoretical results of the paper prove that (under some assumptions) this is the case, indeed. The paper suggests that the properties of the optimal alignment paths characterize the relatedness of the sequences.
Submitted 4 July, 2014;
originally announced July 2014.
-
On the accuracy of the Viterbi alignment
Authors:
Kristi Kuljus,
Jüri Lember
Abstract:
In a hidden Markov model, the underlying Markov chain is usually hidden. Often, the maximum likelihood alignment (Viterbi alignment) is used as its estimate. Although having the biggest likelihood, the Viterbi alignment can behave very untypically by passing states that are at most unexpected. To avoid such situations, the Viterbi alignment can be modified by forcing it not to pass these states. In this article, an iterative procedure for improving the Viterbi alignment is proposed and studied. The iterative approach is compared with a simple bunch approach where a number of states with low probability are all replaced at the same time. It can be seen that the iterative way of adjusting the Viterbi alignment is more efficient and it has several advantages over the bunch approach. The same iterative algorithm for improving the Viterbi alignment can be used in the case of peeping, that is when it is possible to reveal hidden states. In addition, lower bounds for classification probabilities of the Viterbi alignment under different conditions on the model parameters are studied.
Submitted 30 July, 2013;
originally announced July 2013.
-
General approach to the fluctuations problem in random sequence comparison
Authors:
Jüri Lember,
Heinrich Matzinger,
Felipe Torres
Abstract:
We present a general approach to the problem of determining the asymptotic order of the variance of the optimal score between two independent random sequences defined over an arbitrary finite alphabet. Our general approach is based on identifying random variables driving the fluctuations of the optimal score and conveniently choosing functions of them which exhibit certain monotonicity properties. We show how our general approach establishes a common theoretical background for the techniques used by Matzinger et al. in a series of previous articles [6, 8, 20, 24, 26, 37] studying the same problem in special cases. Additionally, we explicitly apply our general approach to study the fluctuations of the optimal score between two random sequences over a finite alphabet (closing the study as initiated in [26]) and of the length of the longest common subsequences between two random sequences with a certain block structure (generalizing part of [37]).
Submitted 21 November, 2012;
originally announced November 2012.
-
Detecting the homology of DNA-sequences based on the variety of optimal alignments: a case study
Authors:
Erik Hirmo,
Jüri Lember,
Heinrich Matzinger
Abstract:
We consider a novel approach of measuring the homology of DNA sequences based on the variety of optimal alignments in the longest common subsequence sense. The proposed approach is compared with BLAST in measuring the homology of four genes.
Submitted 14 October, 2012;
originally announced October 2012.
-
The rate of the convergence of the mean score in random sequence comparison
Authors:
Juri Lember,
Heinrich Matzinger,
Felipe Torres
Abstract:
We consider a general class of super-additive scores measuring the similarity of two independent sequences of $n$ i.i.d. letters from a finite alphabet. Our object of interest is the mean score by letter $l_n$. By the subadditivity $l_n$ is nondecreasing and converges to a limit $l$. We give a simple method of bounding the difference $l-l_n$ and obtaining the rate of convergence. Our result generalizes a previous result of Alexander, where only the special case of the longest common subsequence is considered.
Submitted 17 November, 2010; v1 submitted 11 November, 2010;
originally announced November 2010.
-
A generalized risk approach to path inference based on hidden Markov models
Authors:
Jüri Lember,
Alexey A. Koloydenko
Abstract:
Motivated by the unceasing interest in hidden Markov models (HMMs), this paper re-examines hidden path inference in these models, using primarily a risk-based framework. While the most common maximum a posteriori (MAP), or Viterbi, path estimator and the minimum error, or Posterior Decoder (PD), have long been around, other path estimators, or decoders, have been either only hinted at or applied more recently and in dedicated applications generally unfamiliar to the statistical learning community. Over a decade ago, however, a family of algorithmically defined decoders aiming to hybridize the two standard ones was proposed (Brushe et al., 1998). The present paper gives a careful analysis of this hybridization approach, identifies several problems and issues with it and other previously proposed approaches, and proposes practical resolutions of those. Furthermore, simple modifications of the classical criteria for hidden path recognition are shown to lead to a new class of decoders. Dynamic programming algorithms to compute these decoders in the usual forward-backward manner are presented. A particularly interesting subclass of such estimators can be also viewed as hybrids of the MAP and PD estimators. Similar to previously proposed MAP-PD hybrids, the new class is parameterized by a small number of tunable parameters. Unlike their algorithmic predecessors, the new risk-based decoders are more clearly interpretable, and, most importantly, work "out of the box" in practice, which is demonstrated on some real bioinformatics tasks and data. Some further generalizations and applications are discussed in conclusion.
Submitted 16 April, 2013; v1 submitted 21 July, 2010;
originally announced July 2010.
-
Asymptotic risks of Viterbi segmentation
Authors:
Kristi Kuljus,
Jüri Lember
Abstract:
We consider the maximum likelihood (Viterbi) alignment of a hidden Markov model (HMM). In an HMM, the underlying Markov chain is usually hidden and the Viterbi alignment is often used as the estimate of it. This approach will be referred to as the Viterbi segmentation. The goodness of the Viterbi segmentation can be measured by several risks. In this paper, we prove the existence of asymptotic risks. Being independent of data, the asymptotic risks can be considered as the characteristics of the model that illustrate the long-run behavior of the Viterbi segmentation.
Submitted 13 December, 2010; v1 submitted 18 February, 2010;
originally announced February 2010.
-
On approximation of smoothing probabilities for hidden Markov models
Authors:
J. Lember
Abstract:
We consider the smoothing probabilities of a hidden Markov model (HMM). We show that under fairly general conditions for the HMM, the exponential forgetting still holds, and the smoothing probabilities can be well approximated with the ones of a double-sided HMM. This makes it possible to use ergodic theorems. As an application we consider the pointwise maximum a posteriori segmentation, and show that the corresponding risks converge.
Submitted 10 May, 2011; v1 submitted 24 October, 2009;
originally announced October 2009.
-
Standard deviation of the longest common subsequence
Authors:
Jüri Lember,
Heinrich Matzinger
Abstract:
Let $L_n$ be the length of the longest common subsequence of two independent i.i.d. sequences of Bernoulli variables of length $n$. We prove that the order of the standard deviation of $L_n$ is $\sqrt{n}$, provided the parameter of the Bernoulli variables is small enough. This validates Waterman's conjecture in this situation [Philos. Trans. R. Soc. Lond. Ser. B 344 (1994) 383--390]. The order conjectured by Chvatal and Sankoff [J. Appl. Probab. 12 (1975) 306--315], however, is different.
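For readers who want to experiment with the quantity $L_n$ in this abstract, the following is a minimal sketch of how one sample of it could be computed. It uses the textbook dynamic-programming recursion for the longest common subsequence, not any method from the paper; the function names `lcs_length` and `sample_lcs` and the parameter `p` (the Bernoulli parameter) are illustrative assumptions.

```python
import random


def lcs_length(x, y):
    """Length of the longest common subsequence of sequences x and y.

    Standard O(len(x) * len(y)) dynamic programme that keeps only one
    previous row of the table.
    """
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                # Matching letters extend the best alignment of the prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop the last letter of x or of y.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sample_lcs(n, p, rng):
    """One sample of L_n for two independent Bernoulli(p) sequences of length n.

    Hypothetical helper for simulation only; rng is a random.Random instance.
    """
    x = [int(rng.random() < p) for _ in range(n)]
    y = [int(rng.random() < p) for _ in range(n)]
    return lcs_length(x, y)
```

Repeated calls such as `sample_lcs(1000, 0.1, random.Random(0))` could be used to estimate the empirical standard deviation of $L_n$ for a chosen `n` and `p`; such a simulation is only suggestive and does not replace the asymptotic statement proved in the paper.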
Submitted 29 July, 2009;
originally announced July 2009.
-
A constructive proof of the existence of Viterbi processes
Authors:
J. Lember,
A. Koloydenko
Abstract:
Since the early days of digital communication, hidden Markov models (HMMs) have now been also routinely used in speech recognition, processing of natural languages, images, and in bioinformatics. In an HMM $(X_i,Y_i)_{i\ge 1}$, observations $X_1,X_2,...$ are assumed to be conditionally independent given an ``explanatory'' Markov process $Y_1,Y_2,...$, which itself is not observed; moreover, the conditional distribution of $X_i$ depends solely on $Y_i$. Central to the theory and applications of HMM is the Viterbi algorithm to find {\em a maximum a posteriori} (MAP) estimate $q_{1:n}=(q_1,q_2,...,q_n)$ of $Y_{1:n}$ given observed data $x_{1:n}$. Maximum {\em a posteriori} paths are also known as Viterbi paths or alignments. Recently, attempts have been made to study the behavior of Viterbi alignments when $n\to \infty$. Thus, it has been shown that in some special cases a well-defined limiting Viterbi alignment exists. While innovative, these attempts have relied on rather strong assumptions and involved proofs which are existential. This work proves the existence of infinite Viterbi alignments in a more constructive manner and for a very general class of HMMs.
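The finite-$n$ Viterbi path $q_{1:n}$ mentioned in this abstract can be illustrated with a minimal sketch of the classical Viterbi recursion for a finite-state, finite-alphabet HMM. This is the standard textbook algorithm, not the construction of infinite alignments from the paper; the function name `viterbi` and the argument names `init`, `trans`, and `emit` are assumptions made for the example.

```python
import math


def viterbi(obs, init, trans, emit):
    """Return one maximum a posteriori hidden path for the observations.

    obs   : non-empty list of observation indices x_1, ..., x_n
    init  : init[k]     = P(Y_1 = k)
    trans : trans[j][k] = P(Y_{i+1} = k | Y_i = j)
    emit  : emit[k][x]  = P(X_i = x | Y_i = k)
    Ties are broken in favour of the smallest state index.
    """
    def log(p):
        # Work in log space; zero probabilities become -infinity.
        return math.log(p) if p > 0 else float("-inf")

    K = len(init)
    score = [log(init[k]) + log(emit[k][obs[0]]) for k in range(K)]
    back = []  # back[i][k] = best predecessor of state k at step i + 1
    for x in obs[1:]:
        ptr, new = [], []
        for k in range(K):
            best = max(range(K), key=lambda j: score[j] + log(trans[j][k]))
            ptr.append(best)
            new.append(score[best] + log(trans[best][k]) + log(emit[k][x]))
        back.append(ptr)
        score = new

    # Backtrack from the best final state.
    state = max(range(K), key=lambda k: score[k])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path
```

The recursion keeps, for every hidden state, the score of the best path ending there, which is why the overall maximizer can change as $n$ grows; the question of whether these finite paths stabilize as $n\to\infty$ is the subject of the paper.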
Submitted 14 April, 2008;
originally announced April 2008.
-
The adjusted Viterbi training for hidden Markov models
Authors:
Jüri Lember,
Alexey Koloydenko
Abstract:
The EM procedure is a principal tool for parameter estimation in the hidden Markov models. However, applications replace EM by Viterbi extraction, or training (VT). VT is computationally less intensive, more stable and has more of an intuitive appeal, but VT estimation is biased and does not satisfy the following fixed point property. Hypothetically, given an infinitely large sample and initialized to the true parameters, VT will generally move away from the initial values. We propose adjusted Viterbi training (VA), a new method to restore the fixed point property and thus alleviate the overall imprecision of the VT estimators, while preserving the computational advantages of the baseline VT algorithm. Simulations elsewhere have shown that VA appreciably improves the precision of estimation in both the special case of mixture models and more general HMMs. However, being entirely analytic, the VA correction relies on infinite Viterbi alignments and associated limiting probability distributions. While explicit in the mixture case, the existence of these limiting measures is not obvious for more general HMMs. This paper proves that under certain mild conditions, the required limiting distributions for general HMMs do exist.
Submitted 17 March, 2008;
originally announced March 2008.
-
Nonparametric Bayesian model selection and averaging
Authors:
Subhashis Ghosal,
Jüri Lember,
Aad van der Vaart
Abstract:
We consider nonparametric Bayesian estimation of a probability density $p$ based on a random sample of size $n$ from this density using a hierarchical prior. The prior consists, for instance, of prior weights on the regularity of the unknown density combined with priors that are appropriate given that the density has this regularity. More generally, the hierarchy consists of prior weights on an abstract model index and a prior on a density model for each model index. We present a general theorem on the rate of contraction of the resulting posterior distribution as $n\to \infty$, which gives conditions under which the rate of contraction is the one attached to the model that best approximates the true density of the observations. This shows that, for instance, the posterior distribution can adapt to the smoothness of the underlying density. We also study the posterior distribution of the model index, and find that under the same conditions the posterior distribution gives negligible weight to models that are bigger than the optimal one, and thus selects the optimal model or smaller models that also approximate the true density well. We apply these results to log spline density models, where we show that the prior weights on the regularity index interact with the priors on the models, making the exact rates depend in a complicated way on the priors, but also that the rate is fairly robust to specification of the prior weights.
Submitted 1 February, 2008;
originally announced February 2008.
-
Infinite Viterbi alignments in the two state hidden Markov models
Authors:
J. Lember,
A. Koloydenko
Abstract:
Since the early days of digital communication, Hidden Markov Models (HMMs) have now been routinely used in speech recognition, processing of natural languages, images, and in bioinformatics. An HMM $(X_i,Y_i)_{i\ge 1}$ assumes observations $X_1,X_2,...$ to be conditionally independent given an "explanatory" Markov process $Y_1,Y_2,...$, which itself is not observed; moreover, the conditional distribution of $X_i$ depends solely on $Y_i$. Central to the theory and applications of HMM is the Viterbi algorithm to find {\em a maximum a posteriori} estimate $q_{1:n}=(q_1,q_2,...,q_n)$ of $Y_{1:n}$ given the observed data $x_{1:n}$. Maximum {\em a posteriori} paths are also called Viterbi paths or alignments. Recently, attempts have been made to study the behavior of Viterbi alignments of HMMs with two hidden states when $n$ tends to infinity. It has indeed been shown that in some special cases a well-defined limiting Viterbi alignment exists. While innovative, these attempts have relied on rather strong assumptions. This work proves the existence of infinite Viterbi alignments for virtually any HMM with two hidden states.
Submitted 5 February, 2009; v1 submitted 6 November, 2007;
originally announced November 2007.
-
Adjusted Viterbi training for hidden Markov models
Authors:
J. Lember,
A. Koloydenko
Abstract:
To estimate the emission parameters in hidden Markov models one commonly uses the EM algorithm or its variation. Our primary motivation, however, is the Philips speech recognition system wherein the EM algorithm is replaced by the Viterbi training algorithm. Viterbi training is faster and computationally less involved than EM, but it is also biased and need not even be consistent. We propose an alternative to the Viterbi training -- adjusted Viterbi training -- that has the same order of computational complexity as Viterbi training but gives more accurate estimators. Elsewhere, we studied the adjusted Viterbi training for a special case of mixtures, supporting the theory by simulations. This paper proves the adjusted Viterbi training to be also possible for more general hidden Markov models.
Submitted 14 September, 2007;
originally announced September 2007.
-
Adjusted Viterbi training
Authors:
J. Lember,
A. Koloydenko
Abstract:
We study modifications of the Viterbi Training (VT) algorithm to estimate emission parameters in Hidden Markov Models (HMM) in general, and in mixture models in particular. Motivated by applications of VT to HMM that are used in speech recognition, natural language modeling, image analysis, and bioinformatics, we investigate a possibility of alleviating the inconsistency of VT while controlling the amount of extra computations. Specifically, we propose to enable VT to asymptotically fix the true values of the parameters as does the EM algorithm. This relies on the infinite Viterbi alignment and an associated limiting probability distribution. This paper, however, focuses on mixture models, an important case of HMM, wherein the limiting distribution can always be computed exactly; finding such a limiting distribution for general HMM presents a more challenging task under our ongoing investigation.
A simulation of a univariate Gaussian mixture shows that our central algorithm (VA1) can dramatically improve accuracy without much cost in computation time.
We also present VA2, a more mathematically advanced correction to VT, and verify by simulation its fast convergence and high accuracy; its computational feasibility remains to be investigated in future work.
Submitted 9 October, 2004; v1 submitted 11 June, 2004;
originally announced June 2004.