-
Inferences for Random Graphs Evolved by Clustering Attachment
Authors:
Natalia Markovich,
Maksim Ryzhov,
Marijus Vaičiulis
Abstract:
The evolution of random undirected graphs by clustering attachment (CA) is investigated, both without node and edge deletion and with uniform node or edge deletion. Theoretical results are obtained for the CA without node and edge deletion when a newly appended node is connected to two existing nodes of the graph at each evolution step. The theoretical results concern (1) the sequence of increments of the consecutive mean clustering coefficients, which is proved to tend to zero; and (2) the sequences of node degrees and triangle counts of any fixed node, which are proved to be submartingales. These results hold for any initial graph. A simulation study is provided for the CA with uniform node or edge deletion and without any deletion. It is shown that (1) the CA leads to light-tailed distributions of node degrees and triangle counts; (2) the average clustering coefficient tends to a constant over time; (3) the mean node degree and the mean triangle count increase over time at a rate depending on the parameters of the CA. The exposition is accompanied by a real data study.
Submitted 1 March, 2024;
originally announced March 2024.
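As a minimal sketch of the quantities studied in the entry above (node degrees, triangle counts and the mean clustering coefficient), the following Python computes them for a small undirected graph and performs one growth step in which a new node is connected to two existing nodes. The function names and the adjacency-set representation are illustrative assumptions, not taken from the paper. The local clustering coefficient uses the standard definition $c_i = 2t_i/(k_i(k_i-1))$, set to zero when $k_i < 2$. The choice of the two target nodes, which is what the clustering attachment rule governs, is left to the caller because the abstract does not state it.

```python
from itertools import combinations

def triangle_counts(adj):
    """Triangles through each node; adj maps node -> set of neighbours."""
    return {
        v: sum(1 for a, b in combinations(sorted(nbrs), 2) if b in adj[a])
        for v, nbrs in adj.items()
    }

def mean_clustering(adj):
    """Average of the local clustering coefficients 2*t/(k*(k-1))."""
    tri = triangle_counts(adj)
    coeffs = []
    for v, nbrs in adj.items():
        k = len(nbrs)
        # Nodes of degree < 2 cannot close a triangle; use 0 by convention.
        coeffs.append(2 * tri[v] / (k * (k - 1)) if k >= 2 else 0.0)
    return sum(coeffs) / len(coeffs)

def append_node(adj, new, a, b):
    """One evolution step: attach `new` to two existing nodes a and b.

    How a and b are selected is an assumption left to the caller.
    """
    adj[new] = {a, b}
    adj[a].add(new)
    adj[b].add(new)

# Initial graph: a single triangle on nodes 0, 1, 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
append_node(adj, 3, 0, 1)  # node 3 closes a second triangle with 0 and 1
```

Tracking `mean_clustering(adj)` before and after each call to `append_node` gives the sequence of increments whose limit the abstract describes.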
-
Investigation of triangle counts in graphs evolved by uniform clustering attachment
Authors:
N. M. Markovich,
M. Vaičiulis
Abstract:
The clustering attachment model introduced by Bagrow and Brockmann (2013) may be used as an evolution tool for random networks. We propose a new clustering attachment model that can be considered as the limit of the former model as the model parameter $α$ tends to zero. We focus on the total triangle count, which is considered in the literature to be an important characteristic of network clustering. It is proved that the total triangle count tends to infinity almost surely for the proposed model. A simulation study is used to model sequences of triangle counts. It is based on the interpretation of the clustering attachment as a generalized Pólya-Eggenberger urn model, which is introduced here for the first time.
Submitted 21 January, 2024;
originally announced January 2024.
-
Extremal properties of evolving networks: local dependence and heavy tails
Authors:
Natalia Markovich
Abstract:
A network evolution with predicted tail and extremal indices of PageRank and the Max-Linear Model, used as node influence indices in random graphs, is considered. The tail index measures the heaviness of the distribution tail. The extremal index is a measure of clustering (or local dependence) of the stochastic process. A cluster is a set of consecutive exceedances of the process over a sufficiently high threshold. Our recent results concerning sums and maxima of non-stationary random length sequences of regularly varying random variables are extended to random graphs. Starting with a set of connected stationary seed communities as a hot spot and ranking them with regard to their tail indices, the tail and extremal indices of new nodes that are appended to the network may be determined. This procedure allows us to predict a temporal network evolution in terms of tail and extremal indices. The extremal index determines the limiting distributions of the maximum of the PageRank and the Max-Linear Model of newly attached nodes. The exposition is accompanied by algorithms and examples. To validate our theoretical results, a simulation study and a real data study concerning linear preferential attachment as a tool for network growth are provided.
Submitted 24 November, 2022;
originally announced November 2022.
-
Weighted maxima and sums of non-stationary random length sequences in heavy-tailed models
Authors:
Natalia Markovich
Abstract:
The sums and maxima of weighted non-stationary random length sequences of regularly varying random variables may have the same tail and extremal indices (Markovich and Rodionov, 2020). The main constraints are that there exists a unique series in a scheme of series with the minimum tail index, that the tail of the term number is lighter than the tail of the terms, and that the weights are positive constants. These assumptions are changed here: a bounded random number of series is allowed to have the minimum tail index, the tail of the term number may be heavier than the tail of the terms, and the weights may be real-valued. We then derive the tail and extremal indices of the weighted non-stationary random length sequences under the new assumptions.
Submitted 18 September, 2022;
originally announced September 2022.
-
Extremes of Sums and Maxima with Application to Random Networks
Authors:
Natalia Markovich
Abstract:
The sums and maxima of non-stationary random length sequences of regularly varying random variables may have the same tail and extremal indices (Markovich and Rodionov, 2020). The main constraint is that there exists a unique series in a scheme of series with the minimum tail index. The result is now revised to allow a random bounded number of series to have the minimum tail index. This new result is applied to random networks.
Submitted 7 October, 2021;
originally announced October 2021.
-
Threshold selection for extremal index estimation
Authors:
Natalia M. Markovich,
Igor V. Rodionov
Abstract:
We propose a new threshold selection method for the nonparametric estimation of the extremal index of stochastic processes. The so-called discrepancy method was proposed as a data-driven smoothing tool for the estimation of a probability density function. It is now modified to select the threshold parameter of an extremal index estimator. To this end, a specific normalization of the discrepancy statistic based on the Cramér-von Mises-Smirnov statistic $ω^2$ is calculated from the $k$ largest order statistics instead of the entire sample. Its asymptotic distribution as $k\to\infty$ is proved to be the same as the $ω^2$-distribution. The quantiles of the latter distribution are used as discrepancy values. The rate of convergence of an extremal index estimate coupled with the discrepancy method is derived. The discrepancy method is used for automatic threshold selection for the intervals and $K$-gaps estimators, and it may be applied to other estimators of the extremal index.
Submitted 4 September, 2020;
originally announced September 2020.
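For orientation, the intervals estimator named in the entry above can be sketched for a fixed threshold as follows. This uses the commonly cited form due to Ferro and Segers (2003), stated from general knowledge rather than from the listed paper; the function name `intervals_estimator` and the plain-list input are illustrative assumptions. The threshold `u` is supplied by hand here, whereas the paper's contribution is the data-driven choice of that threshold, which this sketch does not implement.

```python
def intervals_estimator(x, u):
    """Intervals estimate of the extremal index of series x at threshold u.

    Assumed form (Ferro and Segers, 2003): with inter-exceedance times T_i,
    theta = min(1, 2*(sum T_i)^2 / (m * sum T_i^2))            if max T_i <= 2
    theta = min(1, 2*(sum (T_i-1))^2 / (m * sum (T_i-1)(T_i-2))) otherwise,
    where m is the number of inter-exceedance times.
    """
    times = [i for i, v in enumerate(x) if v > u]
    if len(times) < 2:
        raise ValueError("need at least two exceedances of u")
    gaps = [b - a for a, b in zip(times, times[1:])]
    m = len(gaps)
    if max(gaps) <= 2:
        num = 2 * sum(gaps) ** 2
        den = m * sum(t * t for t in gaps)
    else:
        num = 2 * sum(t - 1 for t in gaps) ** 2
        den = m * sum((t - 1) * (t - 2) for t in gaps)
    return min(1.0, num / den)

# Toy series: three clusters of three consecutive exceedances each.
x = [0.0] * 23
for t in (0, 1, 2, 10, 11, 12, 20, 21, 22):
    x[t] = 5.0
theta = intervals_estimator(x, u=1.0)
```

Values of `theta` well below 1 indicate that exceedances arrive in clusters; a value of 1 corresponds to no clustering at the chosen threshold.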
-
Modification of Moment-Based Tail Index Estimator: Sums versus Maxima
Authors:
Natalia Markovich,
Marijus Vaičiulis
Abstract:
In this paper we continue the investigation of the SRCEN estimator of the extreme value index $γ$ (or the tail index $α=1/γ$) proposed in \cite{MCE} for $γ>1/2$. We propose a new estimator based on the local maximum. This, in fact, is a modification of the SRCEN estimator to the case $γ>0$. We establish the consistency and asymptotic normality of the newly proposed estimator for i.i.d. data. Also, a short discussion on the comparison of the estimators is included.
Submitted 9 October, 2017;
originally announced October 2017.
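To make the relation $α = 1/γ$ in the entry above concrete, here is a minimal sketch using the classical Hill estimator as a stand-in. It is not the SRCEN estimator or its modification, whose definitions the abstract does not give; the function names and the choice of `k` are illustrative assumptions.

```python
import math

def hill_estimator(sample, k):
    """Hill estimate of the extreme value index gamma.

    Uses the k largest values of a positive sample:
    gamma = (1/k) * sum_{i=1..k} log(X_(n-i+1) / X_(n-k)).
    """
    xs = sorted(sample, reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < len(sample)")
    threshold = xs[k]  # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the Hill estimator needs positive observations")
    return sum(math.log(v / threshold) for v in xs[:k]) / k

def tail_index(sample, k):
    """Tail index alpha = 1 / gamma (undefined if the estimate of gamma is 0)."""
    return 1.0 / hill_estimator(sample, k)

# Toy data: log-spacings of exactly 1 between consecutive order statistics.
sample = [math.exp(i) for i in range(5)]
gamma_hat = hill_estimator(sample, k=2)
alpha_hat = tail_index(sample, k=2)
```

In practice the estimate is sensitive to `k`, so it is usually inspected over a range of values rather than at a single one.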
-
Clustering and Hitting Times of Threshold Exceedances and Applications
Authors:
Natalia Markovich
Abstract:
We investigate exceedances of a process over a sufficiently high threshold. The exceedances determine the risk of hazardous events such as climate catastrophes, huge insurance claims, and loss and delay in telecommunication networks. Due to dependence, such exceedances tend to occur in clusters. The cluster structure of social networks is caused by dependence (social relationships and interests) between nodes and possibly heavy-tailed distributions of the node degrees. A minimal time to reach a large node determines the first hitting time. We derive an asymptotically equivalent distribution and a limit expectation of the first hitting time to exceed the threshold $u_n$ as the sample size $n$ tends to infinity. The results can be extended to the second and, generally, to the $k$th ($k>2$) hitting times. Applications in large-scale networks such as social, telecommunication and recommender systems are discussed.
Submitted 30 September, 2017;
originally announced October 2017.
-
Extremes in Random Graphs Models of Complex Networks
Authors:
Natalia Markovich
Abstract:
In the analysis of Web communication, social and complex networks, quickly finding the most influential nodes in a network graph is an important research problem. We use two indices of the influence of those nodes, namely PageRank and a Max-linear model. We consider PageRank as an autoregressive process with a random number of random coefficients that depend on the ranks of incoming nodes and their out-degrees, and assume that the coefficients are independent and distributed with a regularly varying tail and the same tail index. It is then proved that the tail index and the extremal index are the same for both PageRank and the Max-linear model, and the values of these indices are found. The results are based on the study of random sequences of a random length and the comparison of the distributions of their maxima and linear combinations.
Submitted 5 April, 2017;
originally announced April 2017.
-
Extremes Control of Complex Systems With Applications to Social Network
Authors:
Natalia Markovich
Abstract:
Control and risk assessment in complex information systems require taking into account extremes arising from nodes with large node degrees. Various sampling techniques, such as a PageRank random walk, a Metropolis-Hastings Markov chain and others, serve to collect information about the nodes. The paper contributes to the comparison of sampling techniques in complex networks by means of the first hitting time, that is, the minimal time required to reach a large node. Both the mean and the distribution of the first hitting time are shown to be determined by the so-called extremal index. The latter is a dependence measure of extremes and also reflects the cluster structure of the network. The clustering is caused by dependence between nodes and heavy-tailed distributions of their degrees. Based on extreme value theory, we estimate the mean and the distribution of the first hitting time and the distribution of node degrees from real data from social networks. We demonstrate the heaviness of the tails of these data using appropriate tools. The same methodology can be applied to other complex networks such as peer-to-peer telecommunication systems.
Submitted 17 February, 2015;
originally announced February 2015.
-
Hitting times of threshold exceedances and their distributions
Authors:
Natalia Markovich
Abstract:
We investigate exceedances of a process over a sufficiently high threshold. The exceedances determine the risk of hazardous events such as climate catastrophes, huge insurance claims, and loss and delay in telecommunication networks. Due to dependence, such exceedances tend to occur in clusters. The cluster structure of social networks is caused by dependence (social relationships and interests) between nodes and possibly heavy-tailed distributions of the node degrees. A minimal time to reach a large node determines the first hitting time. We derive an asymptotically equivalent distribution and a limit expectation of the first hitting time to exceed the threshold $u_n$ as the sample size $n$ tends to infinity. The results can be extended to the second and, generally, to the $k$th ($k>2$) hitting times.
Submitted 7 January, 2015;
originally announced January 2015.