-
The Harmonic Descent Chain
Authors:
David J. Aldous,
Svante Janson,
Xiaodan Li
Abstract:
The decreasing Markov chain on $\{1,2,3,\ldots\}$ with transition probabilities $p(j,j-i) \propto 1/i$ arises as a key component of the analysis of the beta-splitting random tree model. We give a direct and almost self-contained "probability" treatment of its occupation probabilities, as a counterpart to a more sophisticated but perhaps opaque derivation using a limit continuum tree structure and Mellin transforms.
Submitted 8 May, 2024;
originally announced May 2024.
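As a rough illustration of the chain described in this abstract, the following Python sketch computes exact occupation probabilities for small starting states. It assumes that the step size $i$ ranges over $1,\ldots,j-1$, so that state 1 is absorbing; that range is not spelled out in the abstract, and the function names `transition_probs` and `occupation_probs` are hypothetical. This is a direct recursion for small cases, not the method of the paper.

```python
from fractions import Fraction


def transition_probs(j):
    """Return {target: probability} for one step from state j >= 2.

    Assumption: the step size i ranges over 1..j-1, so the chain stays in
    {1, 2, ...} and state 1 is absorbing.
    """
    weights = {j - i: Fraction(1, i) for i in range(1, j)}
    total = sum(weights.values())  # the harmonic number H_{j-1}
    return {target: w / total for target, w in weights.items()}


def occupation_probs(n):
    """Return a dict a with a[j] = P(chain started at n ever visits j)."""
    a = {n: Fraction(1)}
    # The chain is decreasing, so j can only be entered from a state k > j.
    for j in range(n - 1, 0, -1):
        a[j] = sum(a[k] * transition_probs(k)[j] for k in range(j + 1, n + 1))
    return a
```

Calling `occupation_probs(n)` for small `n` gives exact rational values that can be compared with any asymptotic formula for the occupation probabilities.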
-
Fringe trees of Patricia tries and compressed binary search trees
Authors:
Svante Janson
Abstract:
Most of the results in the paper for Patricia tries have earlier been proved by Jaspar Ischebeck.
We study the distribution of fringe trees in Patricia tries and compressed binary search trees; both cases are random binary trees that have been compressed by deleting vertices of outdegree 1 so that they are random full binary trees. The main results are central limit theorems for the number of fringe trees of a given type, which imply quenched and annealed limit results for the fringe tree distribution; for Patricia tries, this is complicated by periodic oscillations in the usual manner. We also consider extended fringe trees. The results are derived from earlier results for uncompressed tries and binary search trees. In the case of compressed binary search trees, it seems difficult to give a closed formula for the asymptotic fringe tree distribution, but we provide a recursion and give examples.
Submitted 11 June, 2024; v1 submitted 2 May, 2024;
originally announced May 2024.
-
Better-than-average uniform random variables and Eulerian numbers, or: How many candidates should a voter approve?
Authors:
Svante Janson,
Warren D. Smith
Abstract:
Consider $n$ independent random numbers with a uniform distribution on $[0,1]$. The number of them that exceed their mean is shown to have an Eulerian distribution, i.e., it is described by the Eulerian numbers. This is related to, but distinct from, the well known fact that the integer part of the sum of independent random numbers uniform on $[0,1]$ has an Eulerian distribution. One motivation for this problem comes from voting theory.
Submitted 5 March, 2024;
originally announced March 2024.
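The well-known fact mentioned in this abstract (the integer part of a sum of $n$ independent uniform variables has an Eulerian distribution) can be checked exactly for small $n$. The sketch below is only an illustration of that classical fact, not of the paper's new result. It assumes the descent-counting convention for Eulerian numbers, $A(n,k)$ with $0\le k\le n-1$, and uses the standard Irwin-Hall distribution function; the function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def eulerian_row(n):
    """Eulerian numbers A(n, 0), ..., A(n, n-1) for n >= 1.

    Uses the recurrence A(m, k) = (k+1) A(m-1, k) + (m-k) A(m-1, k-1).
    """
    row = [1]  # A(1, 0) = 1
    for m in range(2, n + 1):
        prev = row + [0]  # pad so that prev[m-1] = A(m-1, m-1) = 0
        row = [(k + 1) * prev[k] + (m - k) * (prev[k - 1] if k > 0 else 0)
               for k in range(m)]
    return row


def floor_sum_distribution(n):
    """Exact P(floor(U_1 + ... + U_n) = k) for k = 0..n-1 (Irwin-Hall CDF)."""
    def cdf_at_integer(m):
        # n! * F(m) = sum_{j=0}^{m} (-1)^j C(n, j) (m - j)^n
        s = sum((-1) ** j * comb(n, j) * (m - j) ** n for j in range(m + 1))
        return Fraction(s, factorial(n))
    return [cdf_at_integer(k + 1) - cdf_at_integer(k) for k in range(n)]
```

For each small `n`, `floor_sum_distribution(n)` should agree with `eulerian_row(n)` divided by `factorial(n)`.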
-
On semi-restricted Rock, Paper, Scissors
Authors:
Svante Janson
Abstract:
Spiro, Surya and Zeng (Electron. J. Combin. 2023; arXiv:2207.11272) recently studied a semi-restricted variant of the well-known game Rock, Paper, Scissors; in this variant the game is played for $3n$ rounds, but one of the two players is restricted and has to use each of the three moves exactly $n$ times. They find the optimal strategy, and they show that it results in an expected score for the unrestricted player $Θ(\sqrt{n})$; they conjecture, based on numerical evidence, that the expectation is $\approx 1.46\sqrt{n}$.
We analyse the result of the strategy further and show that the average is $\sim c \sqrt{n}$ with $c=3\sqrt{3}/(2\sqrt{π})\approx 1.466$, verifying the conjecture. We also find the asymptotic distribution of the score, and compute its variance.
Submitted 2 May, 2024; v1 submitted 22 February, 2024;
originally announced February 2024.
-
Almost sure and moment convergence for triangular Pólya urns
Authors:
Svante Janson
Abstract:
We consider triangular Pólya urns and show under very weak conditions a general strong limit theorem of the form $X_{ni}/a_{ni}\to \mathcal{X}_i$ a.s., where $X_{ni}$ is the number of balls of colour $i$ after $n$ draws; the constants $a_{ni}$ are explicit and of the form $n^α\log^γn$; the limit is a.s. positive, and may be either deterministic or random, but is in general unknown.
The result extends to urns with subtractions under weak conditions, but a counterexample shows that some conditions are needed.
For balanced urns we also prove moment convergence in the main results if the replacements have the corresponding moments.
The proofs are based on studying the corresponding continuous-time urn using martingale methods, and showing corresponding results there. We assume for convenience that all replacements have finite second moments.
Submitted 21 March, 2024; v1 submitted 2 February, 2024;
originally announced February 2024.
-
Uncovering a graph
Authors:
Svante Janson
Abstract:
Uncover the vertices of a given graph, deterministic or random, in random order; we consider both a discrete-time and a continuous-time version. We study the evolution of the number of visible edges, and show convergence after normalization to a Gaussian process. This problem was studied by Hackl, Panholzer, and Wagner for the case when the graph is a random labelled tree; we generalize their result to more general graphs, including both other classes of random and non-random trees, and denser graphs. The results are similar in all cases, but some differences can be seen depending on the size of the average degree and of the variance of the vertex degrees.
Submitted 21 December, 2023;
originally announced December 2023.
-
Fringe trees for random trees with given vertex degrees
Authors:
Gabriel Berzunza Ojeda,
Cecilia Holmgren,
Svante Janson
Abstract:
We prove asymptotic normality for the number of fringe subtrees isomorphic to any given tree in uniformly random trees with given vertex degrees. As applications, we also prove corresponding results for random labelled trees with given vertex degrees, for random simply generated trees (or conditioned Galton--Watson trees), and for additive functionals.
The key tool for our work is an extension to the multivariate setting of a theorem by Gao and Wormald (2004), which provides a way to show asymptotic normality by analysing the behaviour of sufficiently high factorial moments.
Submitted 7 December, 2023;
originally announced December 2023.
-
Approximation of Subgraph Counts in the Uniform Attachment Model
Authors:
Johan Björklund,
Cecilia Holmgren,
Svante Janson,
Tiffany Y. Y. Lo
Abstract:
We use Stein's method to obtain distributional approximations of subgraph counts in the uniform attachment model or random directed acyclic graph; we provide also estimates of rates of convergence. In particular, we give uni- and multi-variate Poisson approximations to the counts of cycles, and normal approximations to the counts of unicyclic subgraphs; we also give a partial result for the counts of trees. We further find a class of multicyclic graphs whose subgraph counts are a.s. bounded as $n\to\infty$.
Submitted 7 November, 2023;
originally announced November 2023.
-
On a central limit theorem in renewal theory
Authors:
Svante Janson
Abstract:
Serfozo (2009, Theorem 2.65) gives a useful central limit theorem for processes with regenerative increments. Unfortunately, there is a gap in the proof. We fill this gap, and at the same time we weaken the assumptions. Furthermore, we give conditions for moment convergence in this setting. We give also further results complementing results in Serfozo (2009) on the law of large numbers and estimates for the mean; in particular, we show that there is a gap between conditions for the weak and strong laws of large numbers.
Submitted 22 May, 2023;
originally announced May 2023.
-
Real trees
Authors:
Svante Janson
Abstract:
We survey the definition and some elementary properties of real trees. There are no new results, as far as we know. One purpose is to give a number of different definitions and show the equivalence between them. We discuss also, for example, the four-point inequality, the length measure and the connection to the theory of Gromov hyperbolic spaces. Several examples are given.
Submitted 14 March, 2023;
originally announced March 2023.
-
The Critical Beta-splitting Random Tree II: Overview and Open Problems
Authors:
David J. Aldous,
Svante Janson
Abstract:
In the critical beta-splitting model of a random $n$-leaf rooted tree, clades are recursively (from the root) split into sub-clades, and a clade of $m$ leaves is split into sub-clades containing $i$ and $m-i$ leaves with probabilities $\propto 1/(i(m-i))$. Study of structure theory and explicit quantitative aspects of the model is an active research topic, and this article provides an extensive overview of what is currently known. For many results there are different proofs, probabilistic or analytic, so the model provides a testbed for a "compare and contrast" discussion of techniques. We give some proofs that are not currently available elsewhere, and also give heuristics for some proven results and for some open problems. Our discussion is centered around three "foundational" results.
(i) There is a canonical embedding into a continuous-time model, that is a random tree $\mbox{CTCS}(n)$ on $n$ leaves with real-valued edge lengths, and this model turns out to be more convenient to study. The family $(\mbox{CTCS}(n), n \ge 2)$ is consistent under a "delete random leaf and prune" operation. That leads to an explicit inductive construction of $(\mbox{CTCS}(n), n \ge 2)$ as $n$ increases, and then to a limit structure $\mbox{CTCS}(\infty)$ formalized via exchangeable partitions, in some ways analogous to the Brownian continuum random tree.
(ii) There is a CLT for leaf heights, and the analytic proof can be extended to provide surprisingly precise analysis of other height-related aspects.
(iii) There is an explicit description of the limit {\em fringe distribution} relative to a random leaf, whose graphical representation is essentially the format of the cladogram representation of biological phylogenies.
Many open problems remain.
Submitted 5 July, 2024; v1 submitted 4 March, 2023;
originally announced March 2023.
-
Central limit theorem for components in meandric systems through high moments
Authors:
Svante Janson,
Paul Thévenin
Abstract:
We investigate here the behaviour of a large typical meandric system, proving a central limit theorem for the number of components of given shape. Our main tool is a theorem of Gao and Wormald, that allows us to deduce a central limit theorem from the asymptotics of large moments of our quantities of interest.
Submitted 3 March, 2023;
originally announced March 2023.
-
The number of descendants in a random directed acyclic graph
Authors:
Svante Janson
Abstract:
We consider a well known model of random directed acyclic graphs of order $n$, obtained by recursively adding vertices, where each new vertex has a fixed outdegree $d\ge2$ and the endpoints of the $d$ edges from it are chosen uniformly at random among previously existing vertices.
Our main results concern the number $X$ of vertices that are descendants of $n$. We show that $X/\sqrt n$ converges in distribution; the limit distribution is, up to a constant factor, given by the $d$th root of a Gamma distributed variable $Γ(d/(d-1))$. When $d=2$, the limit distribution can also be described as a chi distribution $χ(4)$. We also show convergence of moments, and find thus the asymptotics of the mean and higher moments.
Submitted 27 February, 2023; v1 submitted 24 February, 2023;
originally announced February 2023.
-
On the Statistics of the Number of Fixed-Dimensional Subcubes in a Random Subset of the n-Dimensional Discrete Unit Cube
Authors:
Svante Janson,
Blair Seidler,
Doron Zeilberger
Abstract:
This paper consists of two independent, but related parts. In the first part we show how to use symbolic computation to derive explicit expressions for the first few moments of the number of implicants that a random Boolean function has, or equivalently the number of fixed-dimensional subcubes contained in a random subset of the $n$-dimensional cube. These explicit expressions suggest, but do not prove, that these random variables are always asymptotically normal.
The second part presents a full, human-generated proof of this asymptotic normality, first proved by Urszula Konieczna.
Submitted 17 February, 2023;
originally announced February 2023.
-
On Knuth's conjecture for back and forward arcs in Depth First Search in a random digraph with geometric outdegree distribution
Authors:
Svante Janson
Abstract:
Donald Knuth, in a draft of a coming volume of The Art of Computer Programming, has recently conjectured that in Depth-First Search of a random digraph with geometric outdegree distribution, the numbers of back and forward arcs have the same distribution.
We show that this conjecture is equivalent to an equality between two generating functions defined by different recursions.
Unfortunately, we have not been able to use this to prove the conjecture, which is still open, but we hope that this note will inspire others to succeed with the conjecture.
Submitted 10 January, 2023;
originally announced January 2023.
-
Depth-First Search performance in a random digraph with geometric outdegree distribution
Authors:
Philippe Jacquet,
Svante Janson
Abstract:
We present an analysis of the depth-first search algorithm in a random digraph model with independent outdegrees having a geometric distribution.
The results include asymptotic results for the depth profile of vertices, the height (maximum depth) and average depth, the number of trees in the forest, the size of the largest and second-largest trees, and the numbers of arcs of different types in the depth-first jungle. Most results are first order. For the height we show an asymptotic normal distribution.
This analysis, proposed by Donald Knuth in his forthcoming volume of The Art of Computer Programming, gives interesting insight into one of the most elegant and efficient algorithms for graph analysis, due to Tarjan.
Submitted 30 December, 2022;
originally announced December 2022.
-
Conditioned Galton-Watson trees: The shape functional, and more on the sum of powers of subtree sizes and its mean
Authors:
James Allen Fill,
Svante Janson,
Stephan Wagner
Abstract:
For a complex number $α$, we consider the sum of the $α$th powers of subtree sizes in Galton--Watson trees conditioned to be of size $n$. Limiting distributions of this functional $X_n(α)$ have been determined for $\Reα\neq 0$, revealing a transition between a complex normal limiting distribution for $\Reα< 0$ and a non-normal limiting distribution for $\Reα> 0$. In this paper, we complete the picture by proving a normal limiting distribution, along with moment convergence, in the missing case $\Reα= 0$. The same results are also established in the case of the so-called shape functional $X_n'(0)$, which is the sum of the logarithms of all subtree sizes; these results were obtained earlier in special cases. Additionally, we prove convergence of all moments in the case $\Reα< 0$, where this result was previously missing, and establish new results about the asymptotic mean for real $α< 1/2$.
A novel feature for $\Reα=0$ is that we find joint convergence for several $α$ to independent limits, in contrast to the cases $\Reα\neq0$, where the limit is known to be a continuous function of $α$. Another difference from the case $\Reα\neq0$ is that there is a logarithmic factor in the asymptotic variance when $\Reα=0$; this holds also for the shape functional.
The proofs are largely based on singularity analysis of generating functions.
Submitted 23 January, 2023; v1 submitted 21 December, 2022;
originally announced December 2022.
-
A note on estimating global subgraph counts by sampling
Authors:
Svante Janson,
Valentas Kurauskas
Abstract:
We give a simple proof of a generalization of an inequality for homomorphism counts by Sidorenko (1994). A special case of our inequality says that if $d_v$ denotes the degree of a vertex $v$ in a graph $G$ and $\textrm{Hom}_Δ(H, G)$ denotes the number of homomorphisms from a connected graph $H$ on $h$ vertices to $G$ which map a particular vertex of $H$ to a vertex $v$ in $G$ with $d_v \ge Δ$, then $ \textrm{Hom}_Δ(H,G) \le \sum_{v\in G} d_v^{h-1}\mathbf{1}_{d_v\ge Δ} $.
We use this inequality to study the minimum sample size needed to estimate the number of copies of $H$ in $G$ by sampling vertices of $G$ at random.
Submitted 20 October, 2022;
originally announced October 2022.
-
Identities and periodic oscillations of divide-and-conquer recurrences splitting at half
Authors:
Hsien-Kuei Hwang,
Svante Janson,
Tsung-Hsi Tsai
Abstract:
We study divide-and-conquer recurrences of the form \begin{equation*}
f(n)
= αf(\lfloor \tfrac n2\rfloor)
+ βf(\lceil \tfrac n2\rceil)
+ g(n) \qquad(n\ge2), \end{equation*} with $g(n)$ and $f(1)$ given, where $α,β\ge0$ with $α+β>0$; such recurrences appear often in analysis of computer algorithms, numeration systems, combinatorial sequences, and related areas. We show that the solution satisfies always the simple \emph{identity} \begin{equation*}
f(n)
= n^{\log_2(α+β)} P(\log_2n) - Q(n) \end{equation*} under an optimum (iff) condition on $g(n)$. This form is not only an identity but also an asymptotic expansion because $Q(n)$ is of a smaller order. Explicit forms for the \emph{continuity} of the periodic function $P$ are provided, together with a few other smoothness properties. We show how our results can be easily applied to many dozens of concrete examples collected from the literature, and how they can be extended in various directions. Our method of proof is surprisingly simple and elementary, but leads to the strongest types of results for all examples to which our theory applies.
Submitted 19 October, 2022;
originally announced October 2022.
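To make the recurrence in this abstract concrete, here is a minimal Python sketch that evaluates $f(n)$ by memoized recursion for user-supplied $α$, $β$, $g$ and $f(1)$. The helper name `solve_recurrence` and the example are assumptions made for illustration. The example takes $α=β=1$, $g(n)=n-1$, $f(1)=0$, which is the classical worst-case comparison count of top-down mergesort, with the known closed form $n\lceil\log_2 n\rceil - 2^{\lceil\log_2 n\rceil} + 1$; it does not compute the functions $P$ and $Q$ of the paper.

```python
from functools import lru_cache


def solve_recurrence(alpha, beta, g, f1):
    """Return a function f with f(1) = f1 and, for n >= 2,
    f(n) = alpha * f(floor(n/2)) + beta * f(ceil(n/2)) + g(n)."""
    @lru_cache(maxsize=None)
    def f(n):
        if n == 1:
            return f1
        # n // 2 is floor(n/2); (n + 1) // 2 is ceil(n/2) for positive integers
        return alpha * f(n // 2) + beta * f((n + 1) // 2) + g(n)
    return f


# Hypothetical example: worst-case number of comparisons in top-down mergesort.
mergesort_worst = solve_recurrence(1, 1, lambda n: n - 1, 0)


def mergesort_closed_form(n):
    """Classical closed form n*ceil(lg n) - 2**ceil(lg n) + 1."""
    c = (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 1
    return n * c - 2 ** c + 1
```

Comparing `mergesort_worst(n)` with `mergesort_closed_form(n)` over a range of `n` is a quick sanity check of the recursion.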
-
Quantitative bounds in the central limit theorem for $m$-dependent random variables
Authors:
Svante Janson,
Luca Pratelli,
Pietro Rigo
Abstract:
For each $n\ge 1$, let $X_{n,1},\ldots,X_{n,N_n}$ be real random variables and $S_n=\sum_{i=1}^{N_n}X_{n,i}$. Let $m_n\ge 1$ be an integer. Suppose $(X_{n,1},\ldots,X_{n,N_n})$ is $m_n$-dependent, $E(X_{ni})=0$, $E(X_{ni}^2)<\infty$ and $σ_n^2:=E(S_n^2)>0$ for all $n$ and $i$. Then, \begin{gather*} d_W\Bigl(\frac{S_n}{σ_n},\,Z\Bigr)\le 30\,\bigl\{c^{1/3}+12\,U_n(c/2)^{1/2}\bigr\}\quad\quad\text{for all }n\ge 1\text{ and }c>0, \end{gather*} where $d_W$ is Wasserstein distance, $Z$ a standard normal random variable and $$U_n(c)=\frac{m_n}{σ_n^2}\,\sum_{i=1}^{N_n}E\Bigl[X_{n,i}^2\,1\bigl\{\abs{X_{n,i}}>c\,σ_n/m_n\bigr\}\Bigr].$$ Among other things, this estimate of $d_W\bigl(S_n/σ_n,\,Z\bigr)$ yields a similar estimate of $d_{TV}\bigl(S_n/σ_n,\,Z\bigr)$ where $d_{TV}$ is total variation distance.
Submitted 12 August, 2022;
originally announced August 2022.
-
The number of occurrences of patterns in a random tree or forest permutation
Authors:
Svante Janson
Abstract:
The classes of tree permutations and forest permutations were defined by Acan and Hitczenko (2016). We study random permutations of a given length from these classes, and in particular the number of occurrences of a fixed pattern in one of these random permutations. The main results show that the distributions of these numbers are asymptotically normal.
The proof uses representations of random tree and forest permutations that enable us to express the number of occurrences of a pattern by a type of $U$-statistics; we then use general limit theorems for the latter.
Submitted 9 March, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
-
Edge coherence in multiplex networks
Authors:
Swati Chandna,
Svante Janson,
Sofia C. Olhede
Abstract:
This paper introduces a nonparametric framework for the setting where multiple networks are observed on the same set of nodes, also known as multiplex networks. Our objective is to provide a simple parameterization which explicitly captures linear dependence between the different layers of networks. For non-Euclidean observations, such as shapes and graphs, the notion of "linear" must be defined appropriately. Taking inspiration from the representation of stochastic processes and the analogy of the multivariate spectral representation of a stochastic process with joint exchangeability of Bernoulli arrays, we introduce the notion of edge coherence as a measure of linear dependence in the graph limit space. Edge coherence is defined for pairs of edges from any two network layers and is the key novel parameter. We illustrate the utility of our approach by eliciting simple models such as a correlated stochastic blockmodel and a correlated inhomogeneous graph limit model.
Submitted 18 February, 2022;
originally announced February 2022.
-
Fluctuations of balanced urns with infinitely many colours
Authors:
Svante Janson,
Cécile Mailler,
Denis Villemonais
Abstract:
In this paper, we prove convergence and fluctuation results for measure-valued Pólya processes (MVPPs, also known as Pólya urns with infinitely-many colours). Our convergence results hold almost surely and in $L^2$, under assumptions that are different from that of other convergence results in the literature. Our fluctuation results are the first second-order results in the literature on MVPPs; they generalise classical fluctuation results from the literature on finitely-many-colour Pólya urns. As in the finitely-many-colour case, the order and shape of the fluctuations depend on whether the "spectral gap is small or large".
To prove these results, we show that MVPPs are stochastic approximations taking values in the set of measures on a measurable space $E$ (the colour space). We then use martingale methods and standard operator theory to prove convergence and fluctuation results for these stochastic approximations.
Submitted 26 November, 2021;
originally announced November 2021.
-
Unicellular maps vs hyperbolic surfaces in large genus: simple closed curves
Authors:
Svante Janson,
Baptiste Louf
Abstract:
We study uniformly random maps with a single face, genus $g$, and size $n$, as $n,g\rightarrow \infty$ with $g = o(n)$, in continuation of several previous works on the geometric properties of "high genus maps". We calculate the number of short simple cycles, and we show convergence of their lengths (after a well-chosen rescaling of the graph distance) to a Poisson process, which happens to be exactly the same as the limit law obtained by Mirzakhani and Petri (2019) when they studied simple closed geodesics on random hyperbolic surfaces under the Weil-Petersson measure as $g\rightarrow \infty$. This leads us to conjecture that these two models are somehow "the same" in the limit, which would allow to translate problems on hyperbolic surfaces in terms of random trees, thanks to a powerful bijection of Chapuy, Féray and Fusy (2013).
Submitted 10 December, 2021; v1 submitted 23 November, 2021;
originally announced November 2021.
-
A central limit theorem for m-dependent variables
Authors:
Svante Janson
Abstract:
We give a simple and general central limit theorem for a triangular array of m-dependent variables. The result requires only a Lindeberg condition and avoids unnecessary extra conditions that have been used earlier. The result applies also to increasing $m=m(n)$, provided the Lindeberg condition is modified accordingly. This improves earlier results by several authors.
Submitted 27 August, 2021;
originally announced August 2021.
-
Asymptotic normality for $m$-dependent and constrained $U$-statistics, with applications to pattern matching in random strings and permutations
Authors:
Svante Janson
Abstract:
We study (asymmetric) $U$-statistics based on a stationary sequence of $m$-dependent variables; moreover, we consider constrained $U$-statistics, where the defining multiple sum only includes terms satisfying some restrictions on the gaps between indices. Results include a law of large numbers and a central limit theorem. Special attention is paid to degenerate cases where, after the standard normalization, the asymptotic variance vanishes; in these cases non-normal limits occur after a different normalization.
The results are motivated by applications to pattern matching in random strings and permutations. We obtain both new results and new proofs of old results.
Submitted 9 March, 2022; v1 submitted 17 June, 2021;
originally announced June 2021.
-
Fluctuations of Subgraph Counts in Graphon Based Random Graphs
Authors:
Bhaswar B. Bhattacharya,
Anirban Chatterjee,
Svante Janson
Abstract:
Given a graphon $W$ and a finite simple graph $H$, with vertex set $V(H)$, denote by $X_n(H, W)$ the number of copies of $H$ in a $W$-random graph on $n$ vertices. The asymptotic distribution of $X_n(H, W)$ was recently obtained by Hladký, Pelekis, and Šileikis (2021) in the case where $H$ is a clique. In this paper, we extend this result to any fixed graph $H$. Towards this we introduce a notion of $H$-regularity of graphons and show that if the graphon $W$ is not $H$-regular, then $X_n(H, W)$ has Gaussian fluctuations with scaling $n^{|V(H)|-\frac{1}{2}}$. On the other hand, if $W$ is $H$-regular, then the fluctuations are of order $n^{|V(H)|-1}$ and the limiting distribution of $X_n(H, W)$ can have both Gaussian and non-Gaussian components, where the non-Gaussian component is a (possibly) infinite weighted sum of centered chi-squared random variables with the weights determined by the spectral properties of a graphon derived from $W$. Our proofs use the asymptotic theory of generalized $U$-statistics developed by Janson and Nowicki (1991). We also investigate the structure of $H$-regular graphons for which either the Gaussian or the non-Gaussian component of the limiting distribution (but not both) is degenerate. Interestingly, there are also $H$-regular graphons $W$ for which both the Gaussian and the non-Gaussian components are degenerate, that is, $X_n(H, W)$ has a degenerate limit even under the scaling $n^{|V(H)|-1}$. We give an example of this degeneracy with $H=K_{1, 3}$ (the 3-star) and also establish non-degeneracy in a few examples. This naturally leads to interesting open questions on higher-order degeneracies.
Submitted 17 January, 2022; v1 submitted 15 April, 2021;
originally announced April 2021.
-
The sum of powers of subtree sizes for conditioned Galton-Watson trees
Authors:
James Allen Fill,
Svante Janson
Abstract:
We study the additive functional $X_n(α)$ on conditioned Galton-Watson trees given, for arbitrary complex $α$, by summing the $α$th power of all subtree sizes. Allowing complex $α$ is advantageous, even for the study of real $α$, since it allows us to use powerful results from the theory of analytic functions in the proofs.
For $\Reα< 0$, we prove that $X_n(α)$, suitably normalized, has a complex normal limiting distribution; moreover, as processes in $α$, the weak convergence holds in the space of analytic functions in the left half-plane. We establish, and prove similar process-convergence extensions of, limiting distribution results for $α$ in various regions of the complex plane. We focus mainly on the case where $\Reα> 0$, for which $X_n(α)$, suitably normalized, has a limiting distribution that is not normal but does not depend on the offspring distribution $ξ$ of the conditioned Galton-Watson tree, assuming only that $E[ξ] = 1$ and $0 < \mathrm{Var} [ξ] < \infty$. Under a weak extra moment assumption on $ξ$, we prove that the convergence extends to moments, ordinary and absolute and mixed, of all orders.
At least when $\Reα> \frac12$, the limit random variable $Y(α)$ can be expressed as a function of a normalized Brownian excursion.
Submitted 6 April, 2021;
originally announced April 2021.
-
Short cycles in high genus unicellular maps
Authors:
Svante Janson,
Baptiste Louf
Abstract:
We study large uniform random maps with one face whose genus grows linearly with the number of edges, which are a model of discrete hyperbolic geometry. In previous works, several hyperbolic geometric features have been investigated. In the present work, we study the number of short cycles in a uniform unicellular map of high genus, and we show that it converges to a Poisson distribution. As a corollary, we obtain the law of the systole of uniform unicellular maps in high genus. We also obtain the asymptotic distribution of the vertex degrees in such a map.
Submitted 3 March, 2021;
originally announced March 2021.
-
Can smooth graphons in several dimensions be represented by smooth graphons on $[0,1]$?
Authors:
Svante Janson,
Sofia Olhede
Abstract:
A graphon that is defined on $[0,1]^d$ and is Hölder$(α)$ continuous for some $d\ge2$ and $α\in(0,1]$ can be represented by a graphon on $[0,1]$ that is Hölder$(α/d)$ continuous. We give examples that show that this reduction in smoothness to $α/d$ is the best possible, for any $d$ and $α$; for $α=1$, the example is a dot product graphon and shows that the reduction is the best possible even for graphons that are polynomials.
A motivation for studying the smoothness of graphon functions is that this represents a key assumption in non-parametric statistical network analysis. Our examples show that making a smoothness assumption in a particular dimension is not equivalent to making it in any other latent dimension.
Submitted 19 January, 2021;
originally announced January 2021.
-
Minimal matchings of point processes
Authors:
Alexander E. Holroyd,
Svante Janson,
Johan Wästlund
Abstract:
Suppose that red and blue points form independent homogeneous Poisson processes of equal intensity in $\mathbb{R}^d$. For a positive (respectively, negative) parameter $γ$ we consider red-blue matchings that locally minimize (respectively, maximize) the sum of $γ$th powers of the edge lengths, subject to locally minimizing the number of unmatched points. The parameter can be viewed as a measure of fairness. The limit $γ\to-\infty$ is equivalent to Gale-Shapley stable matching. We also consider limits as $γ$ approaches $0$, $1-$, $1+$ and $\infty$. We focus on dimension $d=1$. We prove that almost surely no such matching has unmatched points. (This question is open for higher $d$.) For each $γ<1$ we establish that there is almost surely a unique such matching, and that it can be expressed as a finitary factor of the points. Moreover, its typical edge length has finite $r$th moment if and only if $r<1/2$. In contrast, for $γ=1$ there are uncountably many matchings, while for $γ>1$ there are countably many, but it is impossible to choose one in a translation-invariant way. We obtain existence results in higher dimensions (covering many but not all cases). We address analogous questions for one-colour matchings also.
Submitted 13 December, 2020;
originally announced December 2020.
-
On general subtrees of a conditioned Galton-Watson tree
Authors:
Svante Janson
Abstract:
We show that the number of copies of a given rooted tree in a conditioned Galton-Watson tree satisfies a law of large numbers under a minimal moment condition on the offspring distribution.
Submitted 9 November, 2020;
originally announced November 2020.
-
On the probability that a binomial variable is at most its expectation
Authors:
Svante Janson
Abstract:
Consider the probability that a binomial random variable Bi$(n,m/n)$ with integer expectation $m$ is at most its expectation. Chvátal conjectured that for any given $n$, this probability is smallest when $m$ is the integer closest to $2n/3$. We show that this holds when $n$ is large.
Submitted 19 October, 2020; v1 submitted 29 September, 2020;
originally announced September 2020.
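For small $n$, the quantity in the abstract above can be checked by exact enumeration. The following is only an illustrative sketch, not taken from the paper: the function names `prob_at_most_mean` and `minimizing_m` are hypothetical, and exact rational arithmetic is an assumption made here to avoid floating-point ties.

```python
from fractions import Fraction
from math import comb


def prob_at_most_mean(n: int, m: int) -> Fraction:
    """Exact P(X <= m) for X ~ Bi(n, m/n), so that E[X] = m."""
    p = Fraction(m, n)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m + 1))


def minimizing_m(n: int) -> int:
    """Return an m in {0, ..., n} minimizing P(Bi(n, m/n) <= m) (first one on ties)."""
    return min(range(n + 1), key=lambda m: prob_at_most_mean(n, m))


if __name__ == "__main__":
    # Compare the brute-force minimizer with the integer closest to 2n/3.
    for n in range(3, 13):
        print(n, minimizing_m(n), round(2 * n / 3))
```

The loop only prints the two values side by side for inspection; the abstract's theorem concerns large $n$, so nothing is asserted here about any particular small $n$.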
-
The distance profile of rooted and unrooted simply generated trees
Authors:
Gabriel Berzunza Ojeda,
Svante Janson
Abstract:
It is well-known that the height profile of a critical conditioned Galton-Watson tree with finite offspring variance converges, after a suitable normalization, to the local time of a standard Brownian excursion. In this work, we study the distance profile, defined as the profile of all distances between pairs of vertices. We show that after a proper rescaling the distance profile converges to a continuous random function that can be described as the density of distances between random points in the Brownian continuum random tree.
We show that this limiting function a.s. is Hölder continuous of any order $α<1$, and that it is a.e. differentiable. We note that it cannot be differentiable at $0$, but leave as open questions whether it is Lipschitz, and whether it is continuously differentiable on the half-line $(0,\infty)$.
The distance profile is naturally defined also for unrooted trees, in contrast to the height profile, which is designed for rooted trees. This is used in our proof, and we prove the corresponding convergence result for the distance profile of random unrooted simply generated trees. As a minor purpose of the present work, we also formalize the notion of unrooted simply generated trees and include some simple results relating them to rooted simply generated trees, which might be of independent interest.
Submitted 21 June, 2021; v1 submitted 1 September, 2020;
originally announced September 2020.
-
Tree limits and limits of random trees
Authors:
Svante Janson
Abstract:
We explore the tree limits recently defined by Elek and Tardos. In particular, we find tree limits for many classes of random trees. We give general theorems for three classes of conditional Galton-Watson trees and simply generated trees, for split trees and generalized split trees (as defined here), and for trees defined by a continuous-time branching process. These general results include, for example, random labelled trees, ordered trees, random recursive trees, preferential attachment trees, and binary search trees.
Submitted 28 May, 2020;
originally announced May 2020.
-
On the Gromov-Prohorov distance
Authors:
Svante Janson
Abstract:
We survey some basic results on the Gromov-Prohorov distance between metric measure spaces. (We do not claim any new results.)
We give several different definitions and show the equivalence of them. We also show that convergence in the Gromov-Prohorov distance is equivalent to convergence in distribution of the array of distances between finite sets of random points.
Submitted 2 June, 2020; v1 submitted 27 May, 2020;
originally announced May 2020.
-
Continuous time digital search tree and a border aggregation model
Authors:
Svante Janson,
Debleena Thacker
Abstract:
We consider the continuous-time version of the random digital search tree, and construct a coupling with a border aggregation model as studied in Thacker and Volkov (2018), showing a relation between the height of the tree and the time required for aggregation. This relation carries over to the corresponding discrete-time models. As a consequence we find a very precise asymptotic result for the time to aggregation, using recent results by Drmota et al. (2020) for the digital search tree.
Submitted 29 April, 2020;
originally announced April 2020.
-
The space $D$ in several variables: random variables and higher moments
Authors:
Svante Janson
Abstract:
We study the Banach space $D([0,1]^m)$ of functions of several variables that are (in a certain sense) right-continuous with left limits, and extend several results previously known for the standard case $m=1$. We give, for example, a description of the dual space, and we show that a bounded multilinear form always is measurable with respect to the $σ$-field generated by the point evaluations. These results are used to study random functions in the space. (I.e., random elements of the space.) In particular, we give results on existence of moments (in different senses) of such random functions, and we give an application to the Zolotarev distance between two such random functions.
Submitted 1 April, 2020;
originally announced April 2020.
-
Hidden Words Statistics for Large Patterns
Authors:
Svante Janson,
Wojciech Szpankowski
Abstract:
We study here the so-called subsequence pattern matching, also known as hidden pattern matching, in which one searches for a given pattern $w$ of length $m$ as a subsequence in a random text of length $n$. The quantity of interest is the number of occurrences of $w$ as a subsequence (i.e., occurring in not necessarily consecutive text locations). This problem finds many applications, from intrusion detection, to trace reconstruction, to deletion channels, and to DNA-based storage systems. In all of these applications, the pattern $w$ is of variable length. To the best of our knowledge this problem was only tackled for a fixed length $m=O(1)$ [Flajolet, Szpankowski and Vallée, 2006]. In our main result we prove that for $m=o(n^{1/3})$ the number of subsequence occurrences is normally distributed. In addition, we show that under some constraints on the structure of $w$ the asymptotic normality can be extended to $m=o(\sqrt{n})$. For a special pattern $w$ consisting of the same symbol, we indicate that for $m=o(n)$ the distribution of the number of subsequence occurrences is either asymptotically normal or asymptotically log normal. We conjecture that this dichotomy is true for all patterns. We use Hoeffding's projection method for $U$-statistics to prove our findings.
Submitted 21 March, 2020;
originally announced March 2020.
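The quantity studied in the abstract above, the number of occurrences of a pattern $w$ as a (not necessarily consecutive) subsequence of a text, can be computed with a standard dynamic program. The sketch below is an illustration only and is not the paper's method (which is probabilistic); the function name `count_subsequence_occurrences` is hypothetical.

```python
def count_subsequence_occurrences(text: str, pattern: str) -> int:
    """Count the ways to pick positions i_1 < ... < i_m in text spelling out pattern."""
    m = len(pattern)
    # ways[j] = number of embeddings of pattern[:j] into the part of the text read so far.
    ways = [1] + [0] * m
    for ch in text:
        # Iterate j downwards so the current character extends only embeddings
        # that ended strictly before it.
        for j in range(m, 0, -1):
            if pattern[j - 1] == ch:
                ways[j] += ways[j - 1]
    return ways[m]


if __name__ == "__main__":
    print(count_subsequence_occurrences("abab", "ab"))
```

The running time is $O(nm)$ for a text of length $n$ and a pattern of length $m$, which is enough for small simulation experiments on the distribution of the count.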
-
On the independence number of some random trees
Authors:
Svante Janson
Abstract:
We show that for many models of random trees, the independence number divided by the size converges almost surely to a constant as the size grows to infinity; the trees that we consider include random recursive trees, binary and $m$-ary search trees, preferential attachment trees, and others. The limiting constant is computed, analytically or numerically, for several examples. The method is based on Crump-Mode-Jagers branching processes.
Submitted 20 March, 2020; v1 submitted 19 March, 2020;
originally announced March 2020.
-
Central limit theorems for additive functionals and fringe trees in tries
Authors:
Svante Janson
Abstract:
We give general theorems on asymptotic normality for additive functionals of random tries generated by a sequence of independent strings. These theorems are applied to show asymptotic normality of the distribution of random fringe trees in a random trie. Formulas for asymptotic mean and variance are given. In particular, the proportion of fringe trees of size $k$ (defined as number of keys) is asymptotically, ignoring oscillations, $c/(k(k-1))$ for $k\ge2$, where $c=1/(1+H)$ with $H$ the entropy of the digits. Another application gives asymptotic normality of the number of $k$-protected nodes in a random trie. For symmetric tries, it is shown that the asymptotic proportion of $k$-protected nodes (ignoring oscillations) decreases geometrically as $k\to\infty$.
Submitted 5 March, 2020;
originally announced March 2020.
-
To fixate or not to fixate in two-type annihilating branching random walks
Authors:
Daniel Ahlberg,
Simon Griffiths,
Svante Janson
Abstract:
We study a model of competition between two types evolving as branching random walks on $\mathbb{Z}^d$. The two types are represented by red and blue balls respectively, with the rule that balls of different colour annihilate upon contact. We consider initial configurations in which the sites of $\mathbb{Z}^d$ contain one ball each, which are independently coloured red with probability $p$ and blue otherwise. We address the question of \emph{fixation}, referring to the sites eventually settling for a given colour, or not. Under a mild moment condition on the branching rule, we prove that the process will fixate almost surely for $p\neq 1/2$, and that every site will change colour infinitely often almost surely for the balanced initial condition $p=1/2$.
Submitted 21 October, 2020; v1 submitted 21 February, 2020;
originally announced February 2020.
-
Rate of convergence for traditional Pólya urns
Authors:
Svante Janson
Abstract:
Consider a Pólya urn with balls of several colours, where balls are drawn sequentially and each drawn ball immediately is replaced together with a fixed number of balls of the same colour. It is well-known that the proportions of balls of the different colours converge in distribution to a Dirichlet distribution. We show that the rate of convergence is $Θ(1/n)$ in the minimal $L_p$ metric for any $p\in[1,\infty]$, extending a result by Goldstein and Reinert; we further show the same rate for the Lévy distance, while the rate for the Kolmogorov distance depends on the parameters, i.e., on the initial composition of the urn. The method used here differs from the one used by Goldstein and Reinert, and uses direct calculations based on the known exact distributions.
Submitted 21 November, 2019;
originally announced November 2019.
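The urn scheme described in the abstract above (draw a ball, return it together with a fixed number of extra balls of the same colour) is easy to simulate. The following is a minimal sketch for experimentation, not code from the paper; the names `run_polya_urn` and `proportions` and the choice of parameters are assumptions made here.

```python
import random


def run_polya_urn(initial, added, draws, rng):
    """Simulate a Pólya urn and return the final ball counts per colour.

    initial: starting number of balls of each colour
    added:   number of extra balls of the drawn colour added after each draw
    draws:   number of draws
    rng:     a random.Random instance (passed in so runs are reproducible)
    """
    counts = list(initial)
    colours = range(len(counts))
    for _ in range(draws):
        # Draw a colour with probability proportional to its current count.
        colour = rng.choices(colours, weights=counts)[0]
        counts[colour] += added
    return counts


def proportions(counts):
    """Convert ball counts to colour proportions."""
    total = sum(counts)
    return [c / total for c in counts]


if __name__ == "__main__":
    rng = random.Random(2019)
    print(proportions(run_polya_urn([1, 2, 3], added=1, draws=10_000, rng=rng)))
```

Repeating the run with different seeds gives samples whose empirical distribution can be compared with the Dirichlet limit mentioned in the abstract.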
-
On distance covariance in metric and Hilbert spaces
Authors:
Svante Janson
Abstract:
Distance covariance is a measure of dependence between two random variables that take values in two, in general different, metric spaces, see Székely, Rizzo and Bakirov (2007) and Lyons (2013). It is known that the distance covariance, and its generalization $α$-distance covariance, can be defined in several different ways that are equivalent under some moment conditions. The present paper considers four such definitions and finds minimal moment conditions for each of them, together with some partial results when these conditions are not satisfied.
The paper also studies the special case when the variables are Hilbert space valued, and shows under weak moment conditions that two such variables are independent if and only if their ($α$-)distance covariance is 0; this extends results by Lyons (2013) and Dehling et al. (2018+). The proof uses a new definition of distance covariance in the Hilbert space case, generalizing the definition for Euclidean spaces using characteristic functions by Székely, Rizzo and Bakirov (2007).
Submitted 29 October, 2019;
originally announced October 2019.
-
A graphon counter example
Authors:
Svante Janson
Abstract:
We give an example of a graphon such that there is no equivalent graphon with a degree function that is (weakly) increasing.
Submitted 6 September, 2019;
originally announced September 2019.
-
Successive minimum spanning trees
Authors:
Svante Janson,
Gregory B. Sorkin
Abstract:
In a complete graph $K_n$ with edge weights drawn independently from a uniform distribution $U(0,1)$ (or alternatively an exponential distribution $\operatorname{Exp}(1)$), let $T_1$ be the MST (the spanning tree of minimum weight) and let $T_k$ be the MST after deletion of the edges of all previous trees $T_i$, $i<k$. We show that each tree's weight $w(T_k)$ converges in probability to a constant $γ_k$ with $2k-2\sqrt k <γ_k<2k+2\sqrt k$, and we conjecture that $γ_k = 2k-1+o(1)$. The problem is distinct from that of Frieze and Johansson (2018), finding $k$ MSTs of combined minimum weight, and for $k=2$ ours has strictly larger cost.
Our results also hold (and mostly are derived) in a multigraph model where edge weights for each vertex pair follow a Poisson process; here we additionally have $\mathbb E(w(T_k)) \to γ_k$. Thinking of an edge of weight $w$ as arriving at time $t=n w$, Kruskal's algorithm defines forests $F_k(t)$, each initially empty and eventually equal to $T_k$, with each arriving edge added to the first $F_k(t)$ where it does not create a cycle. Using tools of inhomogeneous random graphs we obtain structural results including that $C_1(F_k(t))/n$, the fraction of vertices in the largest component of $F_k(t)$, converges in probability to a function $ρ_k(t)$, uniformly for all $t$, and that a giant component appears in $F_k(t)$ at a time $t=σ_k$. We conjecture that the functions $ρ_k$ tend to time translations of a single function, $ρ_k(2k+x)\toρ_\infty(x)$ as $k \to \infty$, uniformly in $x\in \mathbb R$.
Simulations and numerical computations give estimated values of $γ_k$ for small $k$, and support the conjectures just stated.
Submitted 4 June, 2019;
originally announced June 2019.
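The forest construction described in the abstract above (each arriving edge is added to the first forest in which it does not create a cycle) can be sketched with one union-find structure per forest. This is an illustrative sketch under assumptions made here, not the authors' simulation code: the names `DisjointSet`, `successive_forests` and `random_complete_graph_weights` are hypothetical, and only the first `k_max` forests are tracked.

```python
import itertools
import random


class DisjointSet:
    """Union-find with path halving, used to detect cycles in a forest."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        """Merge the components of a and b; return False if already joined."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def successive_forests(n, weights, k_max):
    """Process edges by increasing weight; put each edge into the first of
    k_max forests where it creates no cycle (edges fitting nowhere are dropped)."""
    forests = [[] for _ in range(k_max)]
    components = [DisjointSet(n) for _ in range(k_max)]
    for (u, v), w in sorted(weights.items(), key=lambda item: item[1]):
        for k in range(k_max):
            if components[k].union(u, v):
                forests[k].append((u, v, w))
                break
    return forests


def random_complete_graph_weights(n, rng):
    """Independent U(0,1) weights on the edges of the complete graph K_n."""
    return {(u, v): rng.random() for u, v in itertools.combinations(range(n), 2)}


if __name__ == "__main__":
    rng = random.Random(2019)
    n = 200
    forests = successive_forests(n, random_complete_graph_weights(n, rng), k_max=3)
    for k, forest in enumerate(forests, start=1):
        print(k, len(forest), sum(w for _, _, w in forest))
```

The first forest is the ordinary Kruskal minimum spanning tree; the printed edge counts show whether the later forests are spanning in a given run, and the printed weights can be compared with the bounds on $γ_k$ stated in the abstract.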
-
Preferential attachment without vertex growth: emergence of the giant component
Authors:
Svante Janson,
Lutz Warnke
Abstract:
We study the following preferential attachment variant of the classical Erdős-Rényi random graph process. Starting with an empty graph on $n$ vertices, new edges are added one-by-one, and each time an edge is chosen with probability roughly proportional to the product of the current degrees of its endpoints (note that the vertex set is fixed). We determine the asymptotic size of the giant component in the supercritical phase, confirming a conjecture of Pittel from 2010. Our proof uses a simple method: we condition on the vertex degrees (of a multigraph variant), and use known results for the configuration model.
Submitted 26 April, 2019;
originally announced April 2019.
-
Strong Convergence of Infinite Color Balanced Urns Under Uniform Ergodicity
Authors:
Antar Bandyopadhyay,
Svante Janson,
Debleena Thacker
Abstract:
We consider the generalization of the Pólya urn scheme with possibly infinitely many colors as introduced in \cite{Th-Thesis, BaTH2014, BaTh2016, BaTh2017}. For countably many colors, we prove almost sure convergence of the urn configuration under a \emph{uniform ergodicity} assumption on the associated Markov chain. The proof uses a stochastic coupling of the sequence of chosen colors with a \emph{branching Markov chain} on a weighted \emph{random recursive tree} as described in \cite{BaTh2017, Sv_2018}. Using this coupling we estimate the covariance between any two selected colors. In particular, we reprove the limit theorem for the classical urn models with finitely many colors.
Submitted 12 April, 2019;
originally announced April 2019.
-
Random graphs with given vertex degrees and switchings
Authors:
Svante Janson
Abstract:
Random graphs with a given degree sequence are often constructed using the configuration model, which yields a random multigraph. We may adjust this multigraph by a sequence of switchings, eventually yielding a simple graph. We show that, assuming essentially a bounded second moment of the degree distribution, this construction with the simplest types of switchings yields a simple random graph with an almost uniform distribution, in the sense that the total variation distance is $o(1)$. This construction can be used to transfer results on distributional convergence from the configuration model multigraph to the uniform random simple graph with the given vertex degrees. As examples, we give a few applications to asymptotic normality. We also show a weaker result yielding contiguity when the maximum degree is too large for the main theorem to hold.
Submitted 31 January, 2019; v1 submitted 28 January, 2019;
originally announced January 2019.
-
Asymptotic normality in random graphs with given vertex degrees
Authors:
Svante Janson
Abstract:
We consider random graphs with a given degree sequence and show, under weak technical conditions, asymptotic normality of the number of components isomorphic to a given tree, first for the random multigraph given by the configuration model and then, by a conditioning argument, for the simple uniform random graph with the given degree sequence. Such conditioning is standard for convergence in probability, but much less straightforward for convergence in distribution as here. The proof uses the method of moments, and is based on a new estimate of mixed cumulants in a case of weakly dependent variables. The result on small components is applied to give a new proof of a recent result by Barbour and Röllin on asymptotic normality of the size of the giant component in the random multigraph; moreover, we extend this to the random simple graph.
Submitted 31 January, 2019; v1 submitted 19 December, 2018;
originally announced December 2018.