-
arXiv:2403.07685
On fluctuations of complexity measures for the FIND algorithm
Abstract: The FIND algorithm (also called Quickselect) is a fundamental algorithm to select ranks or quantiles within a set of data. It was shown by Grübel and Rösler that the number of key comparisons required by Find as a process of the quantiles $α\in[0,1]$ in a natural probabilistic model converges after normalization in distribution within the càdlàg space $D[0,1]$ endowed with the Skorokhod metric. We…
Submitted 12 March, 2024; originally announced March 2024.
Comments: This is an extended abstract later to be replaced by its full paper version
MSC Class: 60F17; 68Q25; 68P10; 60C05
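For orientation, the complexity measure named in this abstract (the number of key comparisons used by FIND) can be sketched as follows. This is only an illustration under stated assumptions, not the model analysed in the paper: keys are assumed distinct, the pivot is assumed to be chosen uniformly at random, and the function name `find_with_comparisons` is hypothetical.

```python
import random


def find_with_comparisons(data, rank, rng=None):
    """Return (element of the given 1-based rank, number of key comparisons).

    Assumptions (not taken from the listing): keys are distinct and the
    pivot is drawn uniformly at random from the current sublist.
    """
    rng = rng or random.Random()
    items = list(data)
    comparisons = 0
    while True:
        idx = rng.randrange(len(items))
        pivot = items[idx]
        rest = items[:idx] + items[idx + 1:]
        # Each remaining key is compared with the pivot once.
        smaller = [x for x in rest if x < pivot]
        larger = [x for x in rest if x > pivot]
        comparisons += len(rest)
        if rank <= len(smaller):
            items = smaller
        elif rank == len(smaller) + 1:
            return pivot, comparisons
        else:
            rank -= len(smaller) + 1
            items = larger


if __name__ == "__main__":
    rng = random.Random(2024)
    keys = list(range(1000))
    rng.shuffle(keys)
    # Comparison counts for a few quantiles alpha, with rank = ceil-like index.
    for alpha in (0.1, 0.5, 0.9):
        rank = max(1, int(alpha * len(keys)))
        print(alpha, find_with_comparisons(keys, rank, rng))
```

Evaluating the count over a grid of quantiles gives one sample path of the kind of process in the rank that the abstract refers to, before any normalization.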
-
Patricia's Bad Distributions
Abstract: The height of a random PATRICIA tree built from independent, identically distributed infinite binary strings with arbitrary diffuse probability distribution $μ$ on $\{0,1\}^\mathbb{N}$ is studied. We show that the expected height grows asymptotically sublinearly in the number of leaves for any such $μ$, but can be made to exceed any specific sublinear growth rate by choosing $μ$ appropriately.
Submitted 18 May, 2024; v1 submitted 8 March, 2024; originally announced March 2024.
Comments: Revised version. Accepted for publication in the proceedings of the 35th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA2024)
MSC Class: 68P05; 60C05; 68R15; 68P10
-
Bivariate change point detection in movement direction and speed
Abstract: Biological movement patterns can sometimes be quasi linear with abrupt changes in direction and speed, as in plastids in root cells investigated here. For the analysis of such changes we propose a new stochastic model for movement along linear structures. Maximum likelihood estimators are provided, and due to serial dependencies of increments, the classical MOSUM statistic is replaced by a moving…
Submitted 4 February, 2024; originally announced February 2024.
MSC Class: Primary 62P10; 60F17; Secondary: 60G15; 60G50
-
arXiv:2202.00081
On solutions of the distributional Bellman equation
Abstract: In distributional reinforcement learning not only expected returns but the complete return distributions of a policy are taken into account. The return distribution for a fixed policy is given as the solution of an associated distributional Bellman equation. In this note we consider general distributional Bellman equations and study existence and uniqueness of their solutions as well as tail prope…
Submitted 26 May, 2023; v1 submitted 31 January, 2022; originally announced February 2022.
Comments: Largely revised version to appear in Electron. Res. Arch. (Special Issue: Mathematics of Machine Learning and Related Topics)
MSC Class: 60E05; 60H25 (Primary) 68T05; 90C40 (Secondary)
-
arXiv:1909.12767
A note on the independence number, domination number and related parameters of random binary search trees and random recursive trees
Abstract: We identify the mean growth of the independence number of random binary search trees and random recursive trees and show normal fluctuations around their means. Similarly we also show normal limit laws for the domination number and variations of it for these two cases of random tree models. Our results are an application of a recent general theorem of Holmgren and Janson on fringe trees in these t…
Submitted 10 February, 2020; v1 submitted 27 September, 2019; originally announced September 2019.
MSC Class: 60C05 (Primary) 05C69; 05C05; 05C80; 60F05; 05C15 (Secondary)
-
Node Profiles of Symmetric Digital Search Trees: Concentration Properties
Abstract: We give a detailed asymptotic analysis of the profiles of random symmetric digital search trees, which are in close connection with the performance of the search complexity of random queries in such trees. While the expected profiles have been analyzed for several decades, the analysis of the variance turns out to be very difficult and challenging, and requires the combination of several different…
Submitted 28 September, 2020; v1 submitted 18 November, 2017; originally announced November 2017.
Comments: The central limit theorem was removed from this version (and moved to a follow-up paper) since the proof in the previous versions was incomplete. Also, the words "Concentration Properties" were added to the title since this part now entirely focuses on such results
MSC Class: 05A16; 60C05; 68Q25; 68P05; 60F05
-
Probabilistic Analysis of the Dual-Pivot Quicksort "Count"
Abstract: Recently, Aumüller and Dietzfelbinger proposed a version of a dual-pivot quicksort, called "Count", which is optimal among dual-pivot versions with respect to the average number of key comparisons required. In this note we provide further probabilistic analysis of "Count". We derive an exact formula for the average number of swaps needed by "Count" as well as an asymptotic formula for the variance…
Submitted 20 October, 2017; originally announced October 2017.
Comments: To appear in the proceedings of Analytic Algorithmics and Combinatorics (ANALCO18)
-
Refined Asymptotics for the Composition of Cyclic Urns
Abstract: A cyclic urn is an urn model for balls of types $0,\ldots,m-1$. The urn starts at time zero with an initial configuration. Then, in each time step, first a ball is drawn from the urn uniformly and independently from the past. If its type is $j$, it is then returned to the urn together with a new ball of type $j+1 \mod m$. The case $m=2$ is the well-known Friedman urn. The composition vector, i.e.,…
Submitted 12 March, 2019; v1 submitted 28 December, 2016; originally announced December 2016.
Comments: arXiv admin note: text overlap with arXiv:1507.08119
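The urn dynamics described in this abstract can be simulated directly. The following is a minimal sketch; the function name `simulate_cyclic_urn`, its parameters, and the default initial configuration (a single ball of type 0) are assumptions made for illustration and are not taken from the paper.

```python
import random


def simulate_cyclic_urn(m, steps, initial=None, rng=None):
    """Return the composition vector of a cyclic urn after `steps` draws.

    counts[j] is the number of balls of type j. The default start (one ball
    of type 0) is an assumption; any nonzero initial configuration works.
    """
    rng = rng or random.Random()
    counts = list(initial) if initial is not None else [1] + [0] * (m - 1)
    for _ in range(steps):
        # Draw a ball uniformly at random: type j has probability counts[j] / total.
        j = rng.choices(range(m), weights=counts)[0]
        # The drawn ball is returned together with a new ball of type (j + 1) mod m.
        counts[(j + 1) % m] += 1
    return counts


if __name__ == "__main__":
    print(simulate_cyclic_urn(m=7, steps=10_000, rng=random.Random(1)))
```

Each step adds exactly one ball, so the composition vector always sums to the initial number of balls plus the number of steps.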
-
Process convergence for the complexity of Radix Selection on Markov sources
Abstract: A fundamental algorithm for selecting ranks from a finite subset of an ordered set is Radix Selection. This algorithm requires the data to be given as strings of symbols over an ordered alphabet, e.g., binary expansions of real numbers. Its complexity is measured by the number of symbols that have to be read. In this paper the model of independent data identically generated from a Markov chain is…
Submitted 2 October, 2017; v1 submitted 8 May, 2016; originally announced May 2016.
Comments: main results significantly improved, 4 figures
MSC Class: 60F17; 60G15; 68P10; 60C05; 68Q25
-
arXiv:1507.08119
The CLT Analogue for Cyclic Urns
Abstract: A cyclic urn is an urn model for balls of types $0,\ldots,m-1$ where in each draw the ball drawn, say of type $j$, is returned to the urn together with a new ball of type $j+1 \mod m$. The case $m=2$ is the well-known Friedman urn. The composition vector, i.e., the vector of the numbers of balls of each type after $n$ steps is, after normalization, known to be asymptotically normal for…
Submitted 29 July, 2015; originally announced July 2015.
Comments: Extended abstract to be replaced later by a full version
-
arXiv:1505.07321
A Limit Theorem for Radix Sort and Tries with Markovian Input
Abstract: Tries are among the most versatile and widely used data structures on words. In particular, they are used in fundamental sorting algorithms such as radix sort which we study in this paper. While the performance of radix sort and tries under a realistic probabilistic model for the generation of words is of significant importance, its analysis, even for simplest memoryless sources, has proved diffic…
Submitted 27 May, 2015; originally announced May 2015.
MSC Class: 60F05; 60C05; 68P10; 68Q25
-
Dependence and phase changes in random $m$-ary search trees
Abstract: We study the joint asymptotic behavior of the space requirement and the total path length (either summing over all root-key distances or over all root-node distances) in random $m$-ary search trees. The covariance turns out to exhibit a change of asymptotic behavior: it is essentially linear when $3\le m\le 13$ but becomes of higher order when $m\ge14$. Surprisingly, the corresponding asymptotic c…
Submitted 25 February, 2016; v1 submitted 21 January, 2015; originally announced January 2015.
Comments: Revised unabridged version of our paper accepted for publication in Random Structures & Algorithms
MSC Class: 60F05; 68Q25; 68P05; 60C05; 05A16
-
arXiv:1404.3672
Analysis of radix selection on Markov sources
Abstract: The complexity of the algorithm Radix Selection is considered for independent data generated from a Markov source. The complexity is measured by the number of bucket operations required and studied as a stochastic process indexed by the ranks; also the case of a uniformly chosen rank is considered. The orders of mean and variance of the complexity and limit theorems are derived. We find weak conve…
Submitted 14 April, 2014; originally announced April 2014.
Comments: To appear in the proceedings of the 25th International Conference on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA14), 2014
MSC Class: 68P10; 60F17; 60G15; 60C05; 68Q25
-
A statistical view on exchanges in Quickselect
Abstract: In this paper we study the number of key exchanges required by Hoare's FIND algorithm (also called Quickselect) when operating on a uniformly distributed random permutation and selecting an independent uniformly distributed rank. After normalization we give a limit theorem where the limit law is a perpetuity characterized by a recursive distributional equation. To make the limit theorem usable for…
Submitted 19 November, 2013; v1 submitted 31 July, 2013; originally announced July 2013.
Comments: Theorem 4.4 revised; accepted for publication in Analytic Algorithmics and Combinatorics (ANALCO14)
MSC Class: Primary 60F05; 68P10; secondary 60C05; 68Q25
-
A Gaussian limit process for optimal FIND algorithms
Abstract: We consider versions of the FIND algorithm where the pivot element used is the median of a subset chosen uniformly at random from the data. For the median selection we assume that subsamples of size asymptotic to $c \cdot n^α$ are chosen, where $0<α\le \frac{1}{2}$, $c>0$ and $n$ is the size of the data set to be split. We consider the complexity of FIND as a process in the rank to be selected and…
Submitted 19 November, 2013; v1 submitted 19 July, 2013; originally announced July 2013.
Comments: revised version
MSC Class: Primary 60F17; 68P10; secondary 60G15; 60C05; 68Q25
-
Average Case and Distributional Analysis of Dual-Pivot Quicksort
Abstract: In 2009, Oracle replaced the long-serving sorting algorithm in its Java 7 runtime library by a new dual-pivot Quicksort variant due to Vladimir Yaroslavskiy. The decision was based on the strikingly good performance of Yaroslavskiy's implementation in running time experiments. At that time, no precise investigations of the algorithm were available to explain its superior performance - on the contr…
Submitted 13 February, 2015; v1 submitted 3 April, 2013; originally announced April 2013.
Comments: v3 is content-wise identical to TALG version
ACM Class: F.2.2; G.2.1; G.3; F.2.3; D.3.2
Journal ref: ACM Transactions on Algorithms 11, 3, Article 22 (Jan 2015)
-
arXiv:1303.3594
A multiple filter test for the detection of rate changes in renewal processes with varying variance
Abstract: Nonstationarity of the event rate is a persistent problem in modeling time series of events, such as neuronal spike trains. Motivated by a variety of patterns in neurophysiological spike train recordings, we define a general class of renewal processes. This class is used to test the null hypothesis of stationary rate versus a wide alternative of renewal processes with finitely many rate changes (c…
Submitted 16 January, 2015; v1 submitted 14 March, 2013; originally announced March 2013.
Comments: Published at http://dx.doi.org/10.1214/14-AOAS782 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOAS-AOAS782
Journal ref: Annals of Applied Statistics 2014, Vol. 8, No. 4, 2027-2067
-
Polya urns via the contraction method
Abstract: We propose an approach to analyze the asymptotic behavior of Pólya urns based on the contraction method. For this, a new combinatorial discrete time embedding of the evolution of the urn into random rooted trees is developed. A decomposition of these trees leads to a system of recursive distributional equations which capture the distributions of the numbers of balls of each color. Ideas from the c…
Submitted 19 November, 2013; v1 submitted 15 January, 2013; originally announced January 2013.
Comments: minor revision; accepted for publication in Combinatorics, Probability & Computing (Special issue dedicated to the memory of Philippe Flajolet)
MSC Class: 60C05; 60F05; 60J05; 68Q25
Journal ref: Combinator. Probab. Comp. 23 (2014) 1148-1186
-
arXiv:1207.4556
Refined Quicksort asymptotics
Abstract: The complexity of the Quicksort algorithm is usually measured by the number of key comparisons used during its execution. When operating on a list of $n$ data, permuted uniformly at random, the appropriately normalized complexity $Y_n$ is known to converge almost surely to a non-degenerate random limit $Y$. This assumes a natural embedding of all $Y_n$ on one probability space, e.g., via random bi…
Submitted 24 January, 2013; v1 submitted 19 July, 2012; originally announced July 2012.
Comments: revised version; title slightly changed; accepted for publication in Random Structures and Algorithms
MSC Class: 60F05; 60F15; 68P10; 68Q25
-
Towards More Realistic Probabilistic Models for Data Structures: The External Path Length in Tries under the Markov Model
Abstract: Tries are among the most versatile and widely used data structures on words. They are pertinent to the (internal) structure of (stored) words and several splitting procedures used in diverse contexts ranging from document taxonomy to IP addresses lookup, from data compression (i.e., Lempel-Ziv'77 scheme) to dynamic hashing, from partial-match queries to speech recognition, from leader election alg…
Submitted 18 September, 2012; v1 submitted 2 July, 2012; originally announced July 2012.
Comments: minor revision; to appear in Proceedings of ACM-SIAM Symposium on Discrete Algorithms (SODA) (2013)
MSC Class: 60F05; 68P05; 68Q25
-
Appendix to "Approximating perpetuities"
Abstract: An algorithm for perfect simulation from the unique solution of the distributional fixed point equation $Y=_d UY + U(1-U)$ is constructed, where $Y$ and $U$ are independent and $U$ is uniformly distributed on $[0,1]$. This distribution comes up as a limit distribution in the probabilistic analysis of the Quickselect algorithm. Our simulation algorithm is based on coupling from the past with a mult…
Submitted 29 July, 2012; v1 submitted 3 March, 2012; originally announced March 2012.
MSC Class: 60J05; 65C05; 68U20; 60E05
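To make the fixed point equation $Y=_d UY + U(1-U)$ concrete, the sketch below iterates the map $y \mapsto Uy + U(1-U)$ with fresh independent uniforms. This is plain forward iteration, which only approximates the solution; it is not the coupling-from-the-past algorithm of the paper and does not produce perfect samples. The function name `approx_perpetuity_sample`, the starting value 0, and the iteration count are assumptions chosen for illustration.

```python
import random


def approx_perpetuity_sample(iterations=60, rng=None):
    """Approximate one draw from the solution of Y =_d U*Y + U*(1-U).

    Plain forward iteration from y = 0 (an assumed starting value); the
    result is approximate, not a perfect sample.
    """
    rng = rng or random.Random()
    y = 0.0
    for _ in range(iterations):
        u = rng.random()  # fresh uniform on [0, 1), independent of y
        y = u * y + u * (1.0 - u)
    return y


if __name__ == "__main__":
    rng = random.Random(42)
    samples = [approx_perpetuity_sample(rng=rng) for _ in range(10_000)]
    # Taking expectations in the equation gives E[Y] = E[Y]/2 + 1/6, so E[Y] = 1/3.
    print(sum(samples) / len(samples))
```

The empirical mean can be checked against $E[Y]=1/3$, which follows from taking expectations on both sides of the equation.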
-
Asymptotic analysis of Hoppe trees
Abstract: We introduce and analyze a random tree model associated to Hoppe's urn. The tree is built successively by adding nodes to the existing tree when starting with the single root node. In each step a node is added to the tree as a child of an existing node where these parent nodes are chosen randomly with probabilities proportional to their weights. The root node has weight $\vartheta>0$, a given fixe…
Submitted 5 July, 2012; v1 submitted 11 February, 2012; originally announced February 2012.
MSC Class: 60F05; 60C05 (Primary) 60G42; 68R05 (Secondary)
-
arXiv:1202.1370
On a functional contraction method
Abstract: Methods for proving functional limit laws are developed for sequences of stochastic processes which allow a recursive distributional decomposition either in time or space. Our approach is an extension of the so-called contraction method to the space $\mathcal{C}[0,1]$ of continuous functions endowed with uniform topology and the space $\mathcal {D}[0,1]$ of càdlàg functions with the Skorokhod topo…
Submitted 9 September, 2015; v1 submitted 7 February, 2012; originally announced February 2012.
Comments: Published at http://dx.doi.org/10.1214/14-AOP919 in the Annals of Probability (http://www.imstat.org/aop/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AOP-AOP919
Journal ref: Annals of Probability 2015, Vol. 43, No. 4, 1777-1822
-
arXiv:1202.1342
A limit process for partial match queries in random quadtrees and $2$-d trees
Abstract: We consider the problem of recovering items matching a partially specified pattern in multidimensional trees (quadtrees and $k$-d trees). We assume the traditional model where the data consist of independent and uniform points in the unit square. For this model, in a structure on $n$ points, it is known that the number of nodes $C_n(ξ)$ to visit in order to report the items matching a random query…
Submitted 5 December, 2013; v1 submitted 6 February, 2012; originally announced February 2012.
Comments: Published at http://dx.doi.org/10.1214/12-AAP912 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org). arXiv admin note: text overlap with arXiv:1107.2231
Report number: IMS-AAP-AAP912
Journal ref: Annals of Applied Probability 2013, Vol. 23, No. 6, 2560-2603
-
Partial match queries in random quadtrees
Abstract: We consider the problem of recovering items matching a partially specified pattern in multidimensional trees (quad trees and k-d trees). We assume the traditional model where the data consist of independent and uniform points in the unit square. For this model, in a structure on $n$ points, it is known that the number of nodes $C_n(ξ)$ to visit in order to report the items matching an independent…
Submitted 12 July, 2011; originally announced July 2011.
Comments: 12 pages, 2 figures
MSC Class: 05A16; 05A15; 05C05; 60C05
-
Approximating Perpetuities
Abstract: We propose and analyze an algorithm to approximate distribution functions and densities of perpetuities. Our algorithm refines an earlier approach based on iterating discretized versions of the fixed point equation that defines the perpetuity. We significantly reduce the complexity of the earlier algorithm. Also one particular perpetuity arising in the analysis of the selection algorithm Quickse…
Submitted 7 November, 2007; originally announced November 2007.
-
arXiv:math/0609385
A functional limit theorem for the profile of search trees
Abstract: We study the profile $X_{n,k}$ of random search trees including binary search trees and $m$-ary search trees. Our main result is a functional limit theorem of the normalized profile $X_{n,k}/\mathbb{E}X_{n,k}$ for $k=\lfloor α\log n\rfloor$ in a certain range of $α$. A central feature of the proof is the use of the contraction method to prove convergence in distribution of certain random analytic…
Submitted 22 January, 2008; v1 submitted 14 September, 2006; originally announced September 2006.
Comments: Published at http://dx.doi.org/10.1214/07-AAP457 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org)
Report number: IMS-AAP-AAP457
MSC Class: 60F17 (Primary); 68Q25; 68P10; 60C05 (Secondary)
Journal ref: Annals of Applied Probability 2008, Vol. 18, No. 1, 288-333
-
arXiv:math/0609350
The size of random fragmentation trees
Abstract: We study a random fragmentation process and its associated random tree. The process has earlier been studied by Dean and Majumdar (J. Phys. A: Math. Gen., vol. 35, L501--L507), who found a phase transition: the number of fragmentations is asymptotically normal in some cases but not in others, depending on the position of roots of a certain characteristic equation. This parallels the behaviour of…
Submitted 13 September, 2006; originally announced September 2006.
-
arXiv:math/0410177
On the contraction method with degenerate limit equation
Abstract: A class of random recursive sequences $(Y_n)$ with slowly varying variances as arising for parameters of random trees or recursive algorithms leads after normalizations to degenerate limit equations of the form $X\stackrel{L}{=}X$. For nondegenerate limit equations the contraction method is a main tool to establish convergence of the scaled sequence to the "unique" solution of the limit equation.…
Submitted 6 October, 2004; originally announced October 2004.
Comments: Published by the Institute of Mathematical Statistics (http://www.imstat.org) in the Annals of Probability (http://www.imstat.org/aop/) at http://dx.doi.org/10.1214/009117904000000171
Report number: IMS-AOP-AOP294
MSC Class: 60F05; 68Q25 (Primary) 68P10 (Secondary)
Journal ref: Annals of Probability 2004, Vol. 32, No. 3B, 2838-2856
-
arXiv:math/0405322
Probabilistic Analysis for Randomized Game Tree Evaluation
Abstract: We give a probabilistic analysis for the randomized game tree evaluation algorithm of Snir. We first show that there exists an input such that the running time, measured as the number of external nodes read by the algorithm, on that input is maximal in stochastic order among all possible inputs. For this worst case input we identify the exact expectation of the number of external nodes read by t…
Submitted 17 May, 2004; originally announced May 2004.
Comments: 10 pages, conference: Third Colloquium on Mathematics and Computer Science
MSC Class: 60F10 (Primary); 60J80; 68W40; 60J85 (Secondary)
-
arXiv:math/0005237
Perfect simulation from the Quicksort limit distribution
Abstract: The weak limit of the normalized number of comparisons needed by the Quicksort algorithm to sort $n$ randomly permuted items is known to be determined implicitly by a distributional fixed-point equation. We give an algorithm for perfect random variate generation from this distribution.
Submitted 23 May, 2000; v1 submitted 23 May, 2000; originally announced May 2000.
Comments: 7 pages. See also http://www.mts.jhu.edu/~fill/, http://www-cgrl.cs.mcgill.ca/~luc/, and http://www.stochastik.uni-freiburg.de/homepages/neininger/ . Submitted for publication in May, 2000
Report number: 603, Department of Mathematical Sciences, The Johns Hopkins University
MSC Class: 65C10 (primary); 65C05; 68U20; 11K45 (secondary)