
Equal Requests are Asymptotically Hardest for Data Recovery

Abstract: In a distributed storage system serving hot data, the data recovery performance becomes important, captured e.g. by the service rate. We give partial evidence for it being hardest to serve a sequence of equal user requests (as in the PIR coding regime) both for concrete and random user requests and server contents. We prove that a constant request sequence is locally hardest to serve: If enough copies of each vector are stored in servers, then if a request sequence with all requests equal can be served then we can still serve it if a few requests are changed. For random iid server contents, with the number of data symbols constant (for simplicity) and the number of servers growing, we show that the maximum number of user requests we can serve divided by the number of servers we need approaches a limit almost surely. For uniform server contents, we show this limit is 1/2, both for sequences of copies of a fixed request and of any requests, so it is at least as hard to serve equal requests as any requests. For iid requests independent from the uniform server contents the limit is at least 1/2 and equal to 1/2 if requests are all equal to a fixed request almost surely, confirming the same. As a building block, we deduce from a 1952 result of Marshall Hall, Jr. on abelian groups, that any collection of half as many requests as coded symbols in the doubled binary simplex code can be served by this code. This implies the fractional version of the Functional Batch Code Conjecture that allows half-servers.

Submitted 3 May, 2024; originally announced May 2024.

Comments: 13 pages

arXiv:2206.02725 [pdf, other]

doi 10.1088/1475-7516/2022/08/082

Black hole solutions in scalar-tensor symmetric teleparallel gravity

Authors: Sebastian Bahamonde, Jorge Gigante Valcarcel, Laur Järv, Joosep Lember

Abstract: Symmetric teleparallel gravity is constructed with a nonzero nonmetricity tensor while both torsion and curvature are vanishing. In this framework, we find exact scalarised spherically symmetric static solutions in scalar-tensor theories built with a nonminimal coupling between the nonmetricity scalar and a scalar field. It turns out that the Bocharova-Bronnikov-Melnikov-Bekenstein solution has a symmetric teleparallel analogue (in addition to the recently found metric teleparallel analogue), while some other of these solutions describe scalarised black hole configurations that are not known in the Riemannian or metric teleparallel scalar-tensor case. To aid the analysis we also derive no-hair theorems for the theory. Since the symmetric teleparallel scalar-tensor models also include $f(Q)$ gravity, we briefly discuss this case and further prove a theorem which says that by imposing that the metric functions are the reciprocal of each other ($g_{rr}=1/g_{tt}$), the $f(Q)$ gravity theory reduces to the symmetric teleparallel equivalent of general relativity (plus a cosmological constant), and the metric takes the (Anti)de-Sitter-Schwarzschild form.

Submitted 1 September, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

Comments: Matches published version in JCAP. 24 pages, 1 figure

Journal ref: JCAP08(2022)082

arXiv:2205.00330 [pdf, ps, other]

An evolution model with uncountably many alleles

Authors: Daniela Bertacchi, Juri Lember, Fabio Zucca

Abstract: We study a class of evolution models, where the breeding process involves an arbitrary exchangeable process, allowing for mutations to appear. The population size $n$ is fixed, hence after breeding, selection is applied. Individuals are characterized by their genome, picked inside a set $X$ (which may be uncountable), and there is a fitness associated to each genome. Being less fit implies a higher chance of being discarded in the selection process. The stationary distribution of the process can be described and studied. We are interested in the asymptotic behavior of this stationary distribution as $n$ goes to infinity. Choosing a parameter $λ>0$ to tune the scaling of the fitness when $n$ grows, we prove limiting theorems both for the case when the breeding process does not depend on $n$, and for the case when it is given by a Dirichlet process prior. In both cases, the limit exhibits phase transitions depending on the parameter $λ$.

Submitted 30 April, 2022; originally announced May 2022.

Comments: 45 pages

MSC Class: 60J05; 60B10; 92D15

arXiv:2203.10574 [pdf, other]

Hybrid classifiers of pairwise Markov models

Authors: Kristi Kuljus, Jüri Lember

Abstract: The article studies the segmentation problem (also known as the classification problem) with pairwise Markov models (PMMs). A PMM is a process where the observation process and underlying state sequence form a two-dimensional Markov chain; it is a natural generalization of a hidden Markov model. To demonstrate the richness of the class of PMMs, we examine more closely a few examples of rather different types of PMMs: a model for two related Markov chains, a model that makes it possible to model an inhomogeneous Markov chain as a homogeneous one, and a semi-Markov model. The segmentation problem assumes that one of the marginal processes is observed and the other one is not; the problem is to estimate the unobserved state path given the observations. The standard state path estimators often used are the so-called Viterbi path (a sequence with maximum state path probability given the observations) or the pointwise maximum a posteriori (PMAP) path (a sequence that maximizes the conditional state probability for given observations pointwise). Both these estimators have their limitations, therefore we derive formulas for calculating the so-called hybrid path estimators which interpolate between the PMAP and Viterbi path. We apply the introduced algorithms to the studied models in order to demonstrate the properties of different segmentation methods, and to illustrate the large variation in behaviour of different segmentation methods in different PMMs. The studied examples show that a segmentation method should always be chosen with care by taking into account the particular model of interest.

Submitted 20 March, 2022; originally announced March 2022.
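
The two standard estimators named in this abstract can be illustrated on an ordinary hidden Markov model, which is a special case of a PMM. The sketch below is not the authors' hybrid algorithm; it only computes the two endpoints (Viterbi and PMAP) that the hybrid estimators interpolate between. The function names and the toy parameters `init`, `trans`, `emit` are assumptions made for illustration.

```python
def viterbi_path(init, trans, emit, obs):
    """State sequence with maximum joint posterior probability given obs."""
    n_states = len(init)
    # delta[s]: largest probability of any state path ending in s so far
    delta = [init[s] * emit[s][obs[0]] for s in range(n_states)]
    back = []
    for o in obs[1:]:
        prev, step_back, delta = delta, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] * trans[r][s])
            step_back.append(best)
            delta.append(prev[best] * trans[best][s] * emit[s][o])
        back.append(step_back)
    # Backtrack from the best final state
    state = max(range(n_states), key=lambda s: delta[s])
    path = [state]
    for step_back in reversed(back):
        state = step_back[state]
        path.append(state)
    return path[::-1]


def pmap_path(init, trans, emit, obs):
    """Pointwise MAP path: maximise each marginal posterior separately."""
    n_states = len(init)
    states = range(n_states)
    # Forward pass: alpha[t][s] = P(obs[0..t], state_t = s)
    alpha = [[init[s] * emit[s][obs[0]] for s in states]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[r] * trans[r][s] for r in states) * emit[s][o]
                      for s in states])
    # Backward pass: beta[t][s] = P(obs[t+1..] | state_t = s)
    beta = [[1.0] * n_states]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(trans[s][r] * emit[r][o] * nxt[r] for r in states)
                        for s in states])
    return [max(states, key=lambda s: a[s] * b[s]) for a, b in zip(alpha, beta)]


# Hypothetical two-state toy model (illustrative values only)
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
emit = [[0.9, 0.1], [0.1, 0.9]]
obs = [0, 0, 1, 1]
v_path = viterbi_path(init, trans, emit, obs)
p_path = pmap_path(init, trans, emit, obs)
```

On this well-separated toy model both estimators agree; the abstract's point is that in other models they can differ substantially, which motivates the hybrid estimators.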

arXiv:2104.14258 [pdf, other]

doi 10.3390/universe7060179

Global portraits of nonminimal teleparallel inflation

Authors: Laur Järv, Joosep Lember

Abstract: We construct the global phase portraits of inflationary dynamics in teleparallel gravity models with a scalar field nonminimally coupled to the torsion scalar. The adopted set of variables can clearly distinguish between different asymptotic states as fixed points, including the kinetic and inflationary regimes. The key role in the description of inflation is played by the heteroclinic orbits which run from the asymptotic saddle points to the late time attractor point and are approximated by nonminimal slow roll conditions. To seek the asymptotic fixed points we outline a heuristic method in terms of the "effective potential" and "effective mass", which can be applied for any nonminimally coupled theories. As particular examples we study positive quadratic nonminimal couplings with quadratic and quartic potentials, and note how the portraits differ qualitatively from the known scalar-curvature counterparts. For quadratic models inflation can only occur at small nonminimal coupling to torsion, as for larger coupling the asymptotic de Sitter saddle point disappears from the physical phase space. Teleparallel models with quartic potentials are not viable for inflation at all, since for small nonminimal coupling the asymptotic saddle point exhibits weaker than exponential expansion, and for larger coupling disappears too.

Submitted 2 March, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

Comments: 22 pages, 14 plots; v2: minor typos corrected, references added, version accepted by "Universe"

Journal ref: Universe 2021, 7(6), 179

arXiv:2103.11821 [pdf, other]

doi 10.1007/s10959-020-01022-z

Regenerativity of Viterbi process for pairwise Markov models

Authors: Jüri Lember, Joonas Sova

Abstract: For hidden Markov models one of the most popular estimates of the hidden chain is the Viterbi path -- the path maximising the posterior probability. We consider a more general setting, called the pairwise Markov model (PMM), where the joint process consisting of finite-state hidden process and observation process is assumed to be a Markov chain. It has been recently proven that under some conditions the Viterbi path of the PMM can almost surely be extended to infinity, thereby defining the infinite Viterbi decoding of the observation sequence, called the Viterbi process. This was done by constructing a block of observations, called a barrier, which ensures that the Viterbi path goes through a given state whenever this block occurs in the observation sequence. In this paper we prove that the joint process consisting of Viterbi process and PMM is regenerative. The proof involves a delicate construction of regeneration times which coincide with the occurrences of barriers. As one possible application of our theory, some results on the asymptotics of the Viterbi training algorithm are derived.

Submitted 15 March, 2021; originally announced March 2021.

Comments: arXiv admin note: substantial text overlap with arXiv:1708.03799

Journal ref: Journal of Theoretical Probability volume 34 (2021)

arXiv:2103.05474 [pdf, ps, other]

Exponential forgetting of smoothing distributions for pairwise Markov models

Authors: Jüri Lember, Joonas Sova

Abstract: We consider a bivariate Markov chain $Z=\{Z_k\}_{k \geq 1}=\{(X_k,Y_k)\}_{k \geq 1}$ taking values on the product space ${\cal Z}={\cal X} \times{ \cal Y}$, where ${\cal X}$ is a possibly uncountable space and ${\cal Y}=\{1,\ldots, |{\cal Y}|\}$ is a finite state-space. The purpose of the paper is to find sufficient conditions that guarantee the exponential convergence of smoothing, filtering and predictive probabilities: $$\sup_{n\geq t}\|P(Y_{t:\infty}\in \cdot|X_{l:n})-P(Y_{t:\infty}\in \cdot|X_{s:n}) \|_{\rm TV} \leq K_s α^{t}, \quad \mbox{a.s.}$$ Here $t\geq s\geq l\geq 1$, $K_s$ is a $σ(X_{s:\infty})$-measurable finite random variable and $α\in (0,1)$ is fixed. In the second part of the paper, we establish two-sided versions of the above-mentioned convergence. We show that the desired convergences hold under fairly general conditions. A special case of the above-mentioned very general model is the popular hidden Markov model (HMM). We prove that in the HMM case, our assumptions are more general than all similar mixing-type conditions encountered in practice, yet relatively easy to verify.

Submitted 9 March, 2021; originally announced March 2021.

arXiv:2008.04423 [pdf, other]

Estimating the logarithm of characteristic function and stability parameter for symmetric stable laws

Authors: Annika Krutto, Jüri Lember

Abstract: Let $X_1,\ldots,X_n$ be an i.i.d. sample from a symmetric stable distribution with stability parameter $α$ and scale parameter $γ$. Let $\varphi_n$ be the empirical characteristic function. We prove a uniform large deviation inequality: given preciseness $ε>0$ and probability $p\in (0,1)$, there exists a universal (depending on $ε$ and $p$ but not depending on $α$ and $γ$) constant $\bar{r}>0$ so that $$P\big(\sup_{u>0:r(u)\leq \bar{r}}|r(u)-\hat{r}(u)|\geq ε\big)\leq p,$$ where $r(u)=(uγ)^α$ and $\hat{r}(u)=-\ln|\varphi_n(u)|$. As an application of the result, we show how it can be used to estimate the unknown stability parameter $α$.

Submitted 10 August, 2020; originally announced August 2020.
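
The quantities $r(u)=(uγ)^α$ and $\hat{r}(u)=-\ln|\varphi_n(u)|$ from this abstract suggest a simple plug-in idea: since $\ln r(u_1)-\ln r(u_2)=α\ln(u_1/u_2)$, two evaluation points determine $α$. The sketch below illustrates that identity only; the two-point estimator, the function names, and the choice $u_1=0.5$, $u_2=1$ are assumptions for illustration and are not claimed to be the estimator studied in the paper.

```python
import cmath
import math
import random


def empirical_log_cf(sample, u):
    """r_hat(u) = -ln|phi_n(u)|, with phi_n the empirical characteristic function."""
    phi = sum(cmath.exp(1j * u * x) for x in sample) / len(sample)
    return -math.log(abs(phi))


def alpha_from_two_points(r1, r2, u1, u2):
    """Solve r(u) = (u * gamma) ** alpha for alpha from values at u1 and u2."""
    return math.log(r1 / r2) / math.log(u1 / u2)


def estimate_alpha(sample, u1=0.5, u2=1.0):
    """Plug the empirical r_hat into the two-point identity (illustrative only)."""
    return alpha_from_two_points(
        empirical_log_cf(sample, u1), empirical_log_cf(sample, u2), u1, u2
    )


# A centred Gaussian is symmetric stable with alpha = 2, so it serves as a check.
rng = random.Random(0)
gaussian_sample = [rng.gauss(0.0, 1.0) for _ in range(20000)]
alpha_hat = estimate_alpha(gaussian_sample)
```

The uniform inequality in the abstract is what would justify controlling the error of $\hat{r}$ simultaneously at all such evaluation points with small $r(u)$.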

arXiv:2004.08336 [pdf, ps, other]

MAP segmentation in Bayesian hidden Markov models: a case study

Authors: Alexey Koloydenko, Kristi Kuljus, Jüri Lember

Abstract: We consider the problem of estimating the maximum posterior probability (MAP) state sequence for a finite state and finite emission alphabet hidden Markov model (HMM) in the Bayesian setup, where both emission and transition matrices have Dirichlet priors. We study a training set consisting of thousands of protein alignment pairs. The training data is used to set the prior hyperparameters for Bayesian MAP segmentation. Since the Viterbi algorithm is not applicable any more, there is no simple procedure to find the MAP path, and several iterative algorithms are considered and compared. The main goal of the paper is to test the Bayesian setup against the frequentist one, where the parameters of HMM are estimated using the training data.

Submitted 17 April, 2020; originally announced April 2020.

arXiv:1902.10834 [pdf, other]

An evolutionary model that satisfies detailed balance

Authors: Jüri Lember, Chris Watkins

Abstract: We propose a class of evolutionary models that involves an arbitrary exchangeable process as the breeding process and different selection schemes. In those models, a new genome is born according to the breeding process, and then a genome is removed according to the selection scheme that involves fitness. Thus the population size remains constant. The process evolves according to a Markov chain, and, unlike in many other existing models, the stationary distribution -- the so-called mutation-selection equilibrium -- can be easily found and studied. The behaviour of the stationary distribution when the population size increases is our main object of interest. Several phase-transition theorems are proved.

Submitted 24 August, 2020; v1 submitted 27 February, 2019; originally announced February 2019.

Comments: 38 pages, 5 figures

arXiv:1804.07208 [pdf, ps, other]

doi 10.1214/18-ECP190

A stochastic model for the evolution of species with random fitness

Authors: Daniela Bertacchi, Juri Lember, Fabio Zucca

Abstract: We generalize the evolution model introduced by Guiol, Machado and Schinazi (2010). In our model at odd times a random number X of species is created. Each species is endowed with a random fitness with arbitrary distribution on $[0, 1]$. At even times a random number Y of species is removed, killing the species with lower fitness. We show that there is a critical fitness $f_c$ below which the number of species hits zero i.o. and above which this number goes to infinity. We prove uniform convergence for the distribution of surviving species and describe the phenomena which could not be observed in previous works with uniformly distributed fitness.

Submitted 1 May, 2018; v1 submitted 19 April, 2018; originally announced April 2018.

Comments: 14 pages, some minor mistakes have been fixed

MSC Class: 60J20; 60J80; 60J15

Journal ref: Electron. Commun. Probab. 23 (2018), no. 88, 1-13

arXiv:1802.01630 [pdf, ps, other]

Estimation of Viterbi path in Bayesian hidden Markov models

Authors: Jüri Lember, Dario Gasbarra, Alexey Koloydenko, Kristi Kuljus

Abstract: The article studies different methods for estimating the Viterbi path in the Bayesian framework. The Viterbi path is an estimate of the underlying state path in hidden Markov models (HMMs), which has a maximum posterior probability (MAP). For an HMM with given parameters, the Viterbi path can be easily found with the Viterbi algorithm. In the Bayesian framework the Viterbi algorithm is not applicable and several iterative methods can be used instead. We introduce a new EM-type algorithm for finding the MAP path and compare it with various other methods for finding the MAP path, including the variational Bayes approach and MCMC methods. Examples with simulated data are used to compare the performance of the methods. The main focus is on non-stochastic iterative methods and our results show that the best of those methods work as well as or better than the best MCMC methods. Our results demonstrate that when the primary goal is segmentation, then it is more reasonable to perform segmentation directly by considering the transition and emission parameters as nuisance parameters.

Submitted 11 May, 2019; v1 submitted 5 February, 2018; originally announced February 2018.

arXiv:1710.10124 [pdf, ps, other]

Quantifying the Estimation Error of Principal Components

Authors: Raphael Hauser, Raul Kangro, Jüri Lember, Heinrich Matzinger

Abstract: Principal component analysis is an important pattern recognition and dimensionality reduction tool in many applications. Principal components are computed as eigenvectors of a maximum likelihood covariance $\widehatΣ$ that approximates a population covariance $Σ$, and these eigenvectors are often used to extract structural information about the variables (or attributes) of the studied population. Since PCA is based on the eigendecomposition of the proxy covariance $\widehatΣ$ rather than the ground-truth $Σ$, it is important to understand the approximation error in each individual eigenvector as a function of the number of available samples. The recent results of Koltchinskii and Lounici yield such bounds. In the present paper we sharpen these bounds and show that eigenvectors can often be reconstructed to a required accuracy from a sample of strictly smaller size order.

Submitted 27 October, 2017; originally announced October 2017.

arXiv:1708.03799 [pdf, other]

Existence of infinite Viterbi path for pairwise Markov models

Authors: Jüri Lember, Joonas Sova

Abstract: For hidden Markov models one of the most popular estimates of the hidden chain is the Viterbi path -- the path maximising the posterior probability. We consider a more general setting, called the pairwise Markov model, where the joint process consisting of finite-state hidden regime and observation process is assumed to be a Markov chain. We prove that under some conditions it is possible to extend the Viterbi path to infinity for almost every observation sequence, which in turn makes it possible to define an infinite Viterbi decoding of the observation process, called the Viterbi process. This is done by constructing a block of observations, called a barrier, which ensures that the Viterbi path goes through a given state whenever this block occurs in the observation sequence.

Submitted 12 August, 2017; originally announced August 2017.

arXiv:1705.01727 [pdf, other]

Comparison of hidden Markov chain models and hidden Markov random field models in estimation of computed tomography images

Authors: Kristi Kuljus, Fekadu L. Bayisa, David Bolin, Jüri Lember, Jun Yu

Abstract: There is interest in replacing computed tomography (CT) images with magnetic resonance (MR) images for a number of diagnostic and therapeutic workflows. In this article, predicting CT images from a number of magnetic resonance imaging (MRI) sequences using a regression approach is explored. Two principal areas of application for estimated CT images are dose calculations in MRI-based radiotherapy treatment planning and attenuation correction for positron emission tomography (PET)/MRI. The main purpose of this work is to investigate the performance of hidden Markov (chain) models (HMMs) in comparison to hidden Markov random field (HMRF) models when predicting CT images of the head. Our study shows that HMMs have clear advantages over HMRF models in this particular application. The obtained results suggest that HMMs deserve further study to investigate their potential in modelling applications where the most natural theoretical choice would be the class of HMRF models.

Submitted 4 May, 2017; originally announced May 2017.

Comments: 17 pages, 5 figures

arXiv:1602.05560 [pdf, other]

Lower bounds for moments of global scores of pairwise Markov chains

Authors: Jüri Lember, Heinrich Matzinger, Joonas Sova, Fabio Zucca

Abstract: Let $X_1,X_2,\ldots$ and $Y_1,Y_2,\ldots$ be two random sequences so that every random variable takes values in a finite set $\mathbb{A}$. We consider a global similarity score $L_n:=L(X_1,\ldots,X_n;Y_1,\ldots,Y_n)$ that measures the homology (relatedness) of words $(X_1,\ldots,X_n)$ and $(Y_1,\ldots,Y_n)$. A typical example of such a score is the length of the longest common subsequence. We study the order of the central absolute moment $E|L_n-EL_n|^r$ in the case where the two-dimensional process $(X_1,Y_1),(X_2,Y_2),\ldots$ is a Markov chain on $\mathbb{A}\times \mathbb{A}$. This is a very general model involving independent Markov chains, hidden Markov models, Markov switching models and many more. Our main result establishes a general condition that guarantees that $E|L_n-EL_n|^r\asymp n^{r\over 2}$. We also perform simulations indicating the validity of the condition.

Submitted 18 February, 2016; v1 submitted 17 February, 2016; originally announced February 2016.

MSC Class: 60K35; 41A25; 60C05
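
The abstract above names the length of the longest common subsequence (LCS) as a typical global score $L_n$. As a minimal sketch of how that particular score is computed for two finite words, the standard dynamic programme is shown below; the function name `lcs_length` is an assumption for illustration, and the general score $L$ in the paper need not be the LCS.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of sequences x and y."""
    # prev[j]: LCS length of the processed prefix of x and y[:j]
    prev = [0] * (len(y) + 1)
    for a in x:
        curr = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                # Extend the best alignment of the two shorter prefixes
                curr.append(prev[j - 1] + 1)
            else:
                # Drop the last letter of one word or the other
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[len(y)]


score = lcs_length("ABCBDAB", "BDCABA")
```

The table is kept one row at a time, so memory is linear in the length of `y` while time is proportional to the product of the two lengths.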

arXiv:1506.06067 [pdf, ps, other]

Lower Bounds on the Generalized Central Moments of the Optimal Alignments Score of Random Sequences

Authors: Ruoting Gong, Christian Houdré, Jüri Lember

Abstract: We present a general approach to the problem of determining tight asymptotic lower bounds for generalized central moments of the optimal alignment score of two independent sequences of i.i.d. random variables. At first, these are obtained under a main assumption for which sufficient conditions are provided. When the main assumption fails, we nevertheless develop a "uniform approximation" method leading to asymptotic lower bounds. Our general results are then applied to the length of the longest common subsequence of binary strings, in which case asymptotic lower bounds are obtained for the moments and the exponential moments of the optimal score. As a byproduct, a local upper bound on the rate function associated with the length of the longest common subsequences of two binary strings is also obtained.

Submitted 24 November, 2016; v1 submitted 19 June, 2015; originally announced June 2015.

Comments: Final version to appear in Journal of Theoretical Probability, 33 pages

MSC Class: 05A05; 60C05; 60F10

arXiv:1504.05100 [pdf, ps, other]

New Bounds for Permutation Codes in Ulam Metric

Authors: Faruk Göloğlu, Jüri Lember, Ago-Erik Riet, Vitaly Skachek

Abstract: New bounds on the cardinality of permutation codes equipped with the Ulam distance are presented. First, an integer-programming upper bound is derived, which improves on the Singleton-type upper bound in the literature for some lengths. Second, several probabilistic lower bounds are developed, which improve on the known lower bounds for large minimum distances. The results of a computer search for permutation codes are also presented.

Submitted 20 April, 2015; originally announced April 2015.

Comments: To be presented at ISIT 2015, 5 pages
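
For readers unfamiliar with the metric in this abstract: the Ulam distance between two permutations of the same $n$ symbols equals $n$ minus the length of their longest common subsequence. A minimal sketch of computing it follows, using relabelling plus a longest-increasing-subsequence pass; the function name `ulam_distance` is an assumption for illustration and nothing here reproduces the paper's bounds.

```python
from bisect import bisect_left


def ulam_distance(sigma, pi):
    """Ulam distance: n minus the LCS length of two permutations of the same set."""
    if len(set(sigma)) != len(sigma) or sorted(sigma) != sorted(pi):
        raise ValueError("inputs must be permutations of the same symbols")
    # Relabel sigma by positions in pi; a common subsequence of sigma and pi
    # becomes an increasing subsequence of the relabelled sequence.
    position = {value: index for index, value in enumerate(pi)}
    relabelled = [position[value] for value in sigma]
    # Patience-sorting computation of the longest increasing subsequence length
    tails = []
    for x in relabelled:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(sigma) - len(tails)


d = ulam_distance([1, 2, 3, 4, 5], [2, 3, 4, 5, 1])
```

The relabelling trick works because the entries of a permutation are distinct, which reduces the LCS computation to an increasing-subsequence problem solvable in $O(n \log n)$ time.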

arXiv:1407.1233 [pdf, ps, other]

doi 10.3150/13-BEJ522

Optimal alignments of longest common subsequences and their path properties

Authors: Jüri Lember, Heinrich Matzinger, Anna Vollmer

Abstract: We investigate the behavior of optimal alignment paths for homologous (related) and independent random sequences. An alignment between two finite sequences is optimal if it corresponds to the longest common subsequence (LCS). We prove the existence of lowest and highest optimal alignments and study their differences. High differences between the extremal alignments imply the high variety of all optimal alignments. We present several simulations indicating that for homologous sequences (those having the same common ancestor) the distance between the extremal alignments is typically much smaller than for independent sequences. In particular, the simulations suggest that for the homologous sequences, the growth of the distance between the extremal alignments is logarithmic. The main theoretical results of the paper prove that (under some assumptions) this is indeed the case. The paper suggests that the properties of the optimal alignment paths characterize the relatedness of the sequences.

Submitted 4 July, 2014; originally announced July 2014.

Comments: Published at http://dx.doi.org/10.3150/13-BEJ522 in the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)

Report number: IMS-BEJ-BEJ522

Journal ref: Bernoulli 2014, Vol. 20, No. 3, 1292-1343

arXiv:1307.7948 [pdf, ps, other]

On the accuracy of the Viterbi alignment

Authors: Kristi Kuljus, Jüri Lember

Abstract: In a hidden Markov model, the underlying Markov chain is usually hidden. Often, the maximum likelihood alignment (Viterbi alignment) is used as its estimate. Although having the biggest likelihood, the Viterbi alignment can behave very untypically by passing states that are at most unexpected. To avoid such situations, the Viterbi alignment can be modified by forcing it not to pass these states. In this article, an iterative procedure for improving the Viterbi alignment is proposed and studied. The iterative approach is compared with a simple bunch approach where a number of states with low probability are all replaced at the same time. It can be seen that the iterative way of adjusting the Viterbi alignment is more efficient and it has several advantages over the bunch approach. The same iterative algorithm for improving the Viterbi alignment can be used in the case of peeping, that is when it is possible to reveal hidden states. In addition, lower bounds for classification probabilities of the Viterbi alignment under different conditions on the model parameters are studied.

Submitted 30 July, 2013; originally announced July 2013.

arXiv:1211.5072 [pdf, ps, other]

General approach to the fluctuations problem in random sequence comparison

Authors: Jüri Lember, Heinrich Matzinger, Felipe Torres

Abstract: We present a general approach to the problem of determining the asymptotic order of the variance of the optimal score between two independent random sequences defined over an arbitrary finite alphabet. Our general approach is based on identifying random variables driving the fluctuations of the optimal score and conveniently choosing functions of them which exhibit certain monotonicity properties. We show how our general approach establishes a common theoretical background for the techniques used by Matzinger et al. in a series of previous articles [6, 8, 20, 24, 26, 37] studying the same problem in special cases. Additionally, we explicitly apply our general approach to study the fluctuations of the optimal score between two random sequences over a finite alphabet (closing the study as initiated in [26]) and of the length of the longest common subsequences between two random sequences with a certain block structure (generalizing part of [37]).

Submitted 21 November, 2012; originally announced November 2012.

Comments: 39 pages

MSC Class: 60K35; 41A25; 60C05

arXiv:1210.3771 [pdf, ps, other]

Detecting the homology of DNA-sequences based on the variety of optimal alignments: a case study

Authors: Erik Hirmo, Jüri Lember, Heinrich Matzinger

Abstract: We consider a novel approach of measuring the homology of DNA sequences based on the variety of optimal alignments in the longest common subsequence sense. The proposed approach is compared with BLAST in measuring the homology of four genes.

Submitted 14 October, 2012; originally announced October 2012.

arXiv:1011.2688 [pdf, other]

The rate of the convergence of the mean score in random sequence comparison

Authors: Juri Lember, Heinrich Matzinger, Felipe Torres

Abstract: We consider a general class of super-additive scores measuring the similarity of two independent sequences of $n$ i.i.d. letters from a finite alphabet. Our object of interest is the mean score by letter $l_n$. By the subadditivity $l_n$ is nondecreasing and converges to a limit $l$. We give a simple method of bounding the difference $l-l_n$ and obtaining the rate of convergence. Our result generalizes a previous result of Alexander, where only the special case of the longest common subsequence is considered.

Submitted 17 November, 2010; v1 submitted 11 November, 2010; originally announced November 2010.

Comments: 13 pages, 1 figure

MSC Class: 60K35; 41A25; 60C05

arXiv:1007.3622 [pdf, other]

A generalized risk approach to path inference based on hidden Markov models

Authors: Jüri Lember, Alexey A. Koloydenko

Abstract: Motivated by the unceasing interest in hidden Markov models (HMMs), this paper re-examines hidden path inference in these models, using primarily a risk-based framework. While the most common maximum a posteriori (MAP), or Viterbi, path estimator and the minimum error, or Posterior Decoder (PD), have long been around, other path estimators, or decoders, have been either only hinted at or applied more recently and in dedicated applications generally unfamiliar to the statistical learning community. Over a decade ago, however, a family of algorithmically defined decoders aiming to hybridize the two standard ones was proposed (Brushe et al., 1998). The present paper gives a careful analysis of this hybridization approach, identifies several problems and issues with it and other previously proposed approaches, and proposes practical resolutions of those. Furthermore, simple modifications of the classical criteria for hidden path recognition are shown to lead to a new class of decoders. Dynamic programming algorithms to compute these decoders in the usual forward-backward manner are presented. A particularly interesting subclass of such estimators can be also viewed as hybrids of the MAP and PD estimators. Similar to previously proposed MAP-PD hybrids, the new class is parameterized by a small number of tunable parameters. Unlike their algorithmic predecessors, the new risk-based decoders are more clearly interpretable, and, most importantly, work "out of the box" in practice, which is demonstrated on some real bioinformatics tasks and data. Some further generalizations and applications are discussed in conclusion.

Submitted 16 April, 2013; v1 submitted 21 July, 2010; originally announced July 2010.

Comments: Section 5: corrected denominators of the scaled beta variables (pp. 27-30), => corrections in claims 1, 3, Prop. 12, bottom of Table 1. Decoder (49), Corol. 14 are generalized to handle 0 probabilities. Notation is more closely aligned with (Bishop, 2006). Details are inserted in eqn-s (43); the positivity assumption in Prop. 11 is explicit. Fixed typing errors in equation (41), Example 2

MSC Class: 60J20; 65C60; 62M05; 60G35; 94A12; 94A05; 90C39

arXiv:1002.3509 [pdf, ps, other]

Asymptotic risks of Viterbi segmentation

Authors: Kristi Kuljus, Jüri Lember

Abstract: We consider the maximum likelihood (Viterbi) alignment of a hidden Markov model (HMM). In an HMM, the underlying Markov chain is usually hidden and the Viterbi alignment is often used as the estimate of it. This approach will be referred to as the Viterbi segmentation. The goodness of the Viterbi segmentation can be measured by several risks. In this paper, we prove the existence of asymptotic risks. Being independent of data, the asymptotic risks can be considered as the characteristics of the model that illustrate the long-run behavior of the Viterbi segmentation.

Submitted 13 December, 2010; v1 submitted 18 February, 2010; originally announced February 2010.

Comments: 23 pages

MSC Class: 60F15; 62M5

arXiv:0910.4636 [pdf, ps, other]

On approximation of smoothing probabilities for hidden Markov models

Authors: J. Lember

Abstract: We consider the smoothing probabilities of a hidden Markov model (HMM). We show that under fairly general conditions for HMM, the exponential forgetting still holds, and the smoothing probabilities can be well approximated with the ones of double sided HMM. This makes it possible to use ergodic theorems. As an application we consider the pointwise maximum a posteriori segmentation, and show that the corresponding risks converge.

Submitted 10 May, 2011; v1 submitted 24 October, 2009; originally announced October 2009.

Comments: submitted to Statistics and Probability Letters

arXiv:0907.5137 [pdf, ps, other]

doi 10.1214/08-AOP436

Standard deviation of the longest common subsequence

Authors: Jüri Lember, Heinrich Matzinger

Abstract: Let $L_n$ be the length of the longest common subsequence of two independent i.i.d. sequences of Bernoulli variables of length $n$. We prove that the order of the standard deviation of $L_n$ is $\sqrt{n}$, provided the parameter of the Bernoulli variables is small enough. This validates Waterman's conjecture in this situation [Philos. Trans. R. Soc. Lond. Ser. B 344 (1994) 383--390]. The order conjectured by Chvatal and Sankoff [J. Appl. Probab. 12 (1975) 306--315], however, is different.

Submitted 29 July, 2009; originally announced July 2009.

Comments: Published in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/08-AOP436

Report number: IMS-AOP-AOP436

MSC Class: 60K35; 41A25 (Primary); 60C05C (Secondary)

Journal ref: Annals of Probability 2009, Vol. 37, No. 3, 1192-1235

arXiv:0804.2138 [pdf, ps, other]

doi 10.1109/TIT.2010.2040897

A constructive proof of the existence of Viterbi processes

Authors: J. Lember, A. Koloydenko

Abstract: Since the early days of digital communication, hidden Markov models (HMMs) have now been also routinely used in speech recognition, processing of natural languages, images, and in bioinformatics. In an HMM $(X_i,Y_i)_{i\ge 1}$, observations $X_1,X_2,...$ are assumed to be conditionally independent given an ``explanatory'' Markov process $Y_1,Y_2,...$, which itself is not observed; moreover, the conditional distribution of $X_i$ depends solely on $Y_i$. Central to the theory and applications of HMM is the Viterbi algorithm to find {\em a maximum a posteriori} (MAP) estimate $q_{1:n}=(q_1,q_2,...,q_n)$ of $Y_{1:n}$ given observed data $x_{1:n}$. Maximum {\em a posteriori} paths are also known as Viterbi paths or alignments. Recently, attempts have been made to study the behavior of Viterbi alignments when $n\to \infty$. Thus, it has been shown that in some special cases a well-defined limiting Viterbi alignment exists. While innovative, these attempts have relied on rather strong assumptions and involved proofs which are existential. This work proves the existence of infinite Viterbi alignments in a more constructive manner and for a very general class of HMMs.

Submitted 14 April, 2008; originally announced April 2008.

Comments: Submitted to the IEEE Transactions on Information Theory, focuses on the proofs of the results presented in arXiv:0709.2317, and arXiv:0803.2394

Journal ref: IEEE Transactions on Information Theory, volume 56, issue 4, 2010, pages 2017 - 2033

arXiv:0803.2394 [pdf, ps, other]

doi 10.3150/07-BEJ105

The adjusted Viterbi training for hidden Markov models

Authors: Jüri Lember, Alexey Koloydenko

Abstract: The EM procedure is a principal tool for parameter estimation in the hidden Markov models. However, applications replace EM by Viterbi extraction, or training (VT). VT is computationally less intensive, more stable and has more of an intuitive appeal, but VT estimation is biased and does not satisfy the following fixed point property. Hypothetically, given an infinitely large sample and initialized to the true parameters, VT will generally move away from the initial values. We propose adjusted Viterbi training (VA), a new method to restore the fixed point property and thus alleviate the overall imprecision of the VT estimators, while preserving the computational advantages of the baseline VT algorithm. Simulations elsewhere have shown that VA appreciably improves the precision of estimation in both the special case of mixture models and more general HMMs. However, being entirely analytic, the VA correction relies on infinite Viterbi alignments and associated limiting probability distributions. While explicit in the mixture case, the existence of these limiting measures is not obvious for more general HMMs. This paper proves that under certain mild conditions, the required limiting distributions for general HMMs do exist.

Submitted 17 March, 2008; originally announced March 2008.

Comments: Published in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm) at http://dx.doi.org/10.3150/07-BEJ105

Report number: IMS-BEJ-BEJ105

Journal ref: Bernoulli 2008, Vol. 14, No. 1, 180-206

arXiv:0802.0069 [pdf, ps, other]

doi 10.1214/07-EJS090

Nonparametric Bayesian model selection and averaging

Authors: Subhashis Ghosal, Jüri Lember, Aad van der Vaart

Abstract: We consider nonparametric Bayesian estimation of a probability density $p$ based on a random sample of size $n$ from this density using a hierarchical prior. The prior consists, for instance, of prior weights on the regularity of the unknown density combined with priors that are appropriate given that the density has this regularity. More generally, the hierarchy consists of prior weights on an abstract model index and a prior on a density model for each model index. We present a general theorem on the rate of contraction of the resulting posterior distribution as $n\to \infty$, which gives conditions under which the rate of contraction is the one attached to the model that best approximates the true density of the observations. This shows that, for instance, the posterior distribution can adapt to the smoothness of the underlying density. We also study the posterior distribution of the model index, and find that under the same conditions the posterior distribution gives negligible weight to models that are bigger than the optimal one, and thus selects the optimal model or smaller models that also approximate the true density well. We apply these results to log spline density models, where we show that the prior weights on the regularity index interact with the priors on the models, making the exact rates depend in a complicated way on the priors, but also that the rate is fairly robust to specification of the prior weights.

Submitted 1 February, 2008; originally announced February 2008.

Comments: Published in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org) at http://dx.doi.org/10.1214/07-EJS090

Report number: IMS-EJS-EJS_2007_90

MSC Class: 62G07; 62G20; 62C10; 65U05; 68T05 (Primary)

Journal ref: Electronic Journal of Statistics 2008, Vol. 2, 63-89

arXiv:0711.0928 [pdf, ps, other]

Infinite Viterbi alignments in the two state hidden Markov models

Authors: J. Lember, A. Koloydenko

Abstract: Since the early days of digital communication, Hidden Markov Models (HMMs) have now been routinely used in speech recognition, processing of natural languages, images, and in bioinformatics. An HMM $(X_i,Y_i)_{i\ge 1}$ assumes observations $X_1,X_2,...$ to be conditionally independent given an "explanatory" Markov process $Y_1,Y_2,...$, which itself is not observed; moreover, the conditional distribution of $X_i$ depends solely on $Y_i$. Central to the theory and applications of HMM is the Viterbi algorithm to find {\em a maximum a posteriori} estimate $q_{1:n}=(q_1,q_2,...,q_n)$ of $Y_{1:n}$ given the observed data $x_{1:n}$. Maximum {\em a posteriori} paths are also called Viterbi paths or alignments. Recently, attempts have been made to study the behavior of Viterbi alignments of HMMs with two hidden states when $n$ tends to infinity. It has indeed been shown that in some special cases a well-defined limiting Viterbi alignment exists. While innovative, these attempts have relied on rather strong assumptions. This work proves the existence of infinite Viterbi alignments for virtually any HMM with two hidden states.

Submitted 5 February, 2009; v1 submitted 6 November, 2007; originally announced November 2007.

Comments: Several minor changes and corrections have been made in the arguments as suggested by anonymous reviewers, which should hopefully improve readability. Abstract has been added

Journal ref: Acta et Commentationes Universitatis Tartuensis de Mathematica, Volume 12, 2008, pp. 109-124

arXiv:0709.2317 [pdf, ps, other]

Adjusted Viterbi training for hidden Markov models

Authors: J. Lember, A. Koloydenko

Abstract: To estimate the emission parameters in hidden Markov models one commonly uses the EM algorithm or its variation. Our primary motivation, however, is the Philips speech recognition system wherein the EM algorithm is replaced by the Viterbi training algorithm. Viterbi training is faster and computationally less involved than EM, but it is also biased and need not even be consistent. We propose an alternative to the Viterbi training -- adjusted Viterbi training -- that has the same order of computational complexity as Viterbi training but gives more accurate estimators. Elsewhere, we studied the adjusted Viterbi training for a special case of mixtures, supporting the theory by simulations. This paper proves the adjusted Viterbi training to be also possible for more general hidden Markov models.

Submitted 14 September, 2007; originally announced September 2007.

Comments: 45 pages, 2 figures

Report number: 07-01

arXiv:math/0406237 [pdf, ps, other]

Adjusted Viterbi training

Authors: J. Lember, A. Koloydenko

Abstract: We study modifications of the Viterbi Training (VT) algorithm to estimate emission parameters in Hidden Markov Models (HMM) in general, and in mixture models in particular. Motivated by applications of VT to HMM that are used in speech recognition, natural language modeling, image analysis, and bioinformatics, we investigate a possibility of alleviating the inconsistency of VT while controlling the amount of extra computations. Specifically, we propose to enable VT to asymptotically fix the true values of the parameters as does the EM algorithm. This relies on infinite Viterbi alignment and an associated limiting probability distribution. This paper, however, focuses on mixture models, an important case of HMM, wherein the limiting distribution can always be computed exactly; finding such limiting distribution for general HMM presents a more challenging task under our ongoing investigation. A simulation of a univariate Gaussian mixture shows that our central algorithm (VA1) can dramatically improve accuracy without much cost in computation time. We also present VA2, a more mathematically advanced correction to VT, verify by simulation its fast convergence and high accuracy; its computational feasibility remains to be investigated in future work.

Submitted 9 October, 2004; v1 submitted 11 June, 2004; originally announced June 2004.

Comments: 15 pages, 1 PostScript figure; in review by "Computational Statistics and Data Analysis"; citation 15 activated 20 pages, 1.5-spaced, citation styled changed to author-year, minor changes in the wording of abstract and introduction, three new references added, one old one removed, table references corrected, submitted to "Statistics and Computing"

MSC Class: 62F12; 68T10; 92D20; 62H12

Showing 1–33 of 33 results for author: Lember, J