-
On the empirical spectral distribution of large wavelet random matrices based on mixed-Gaussian fractional measurements in moderately high dimensions
Authors:
Patrice Abry,
Gustavo Didier,
Oliver Orejola,
Herwig Wendt
Abstract:
In this paper, we characterize the convergence of the (rescaled logarithmic) empirical spectral distribution of wavelet random matrices. We assume a moderately high-dimensional framework where the sample size $n$, the dimension $p(n)$ and, for a fixed integer $j$, the scale $a(n)2^j$ go to infinity in such a way that $\lim_{n \rightarrow \infty}p(n)\cdot a(n)/n = \lim_{n \rightarrow \infty} o(\sqrt{a(n)/n})= 0$. We suppose the underlying measurement process is a random scrambling of a sample of size $n$ of a growing number $p(n)$ of fractional processes. Each of the latter processes is a fractional Brownian motion conditionally on a randomly chosen Hurst exponent. We show that the (rescaled logarithmic) empirical spectral distribution of the wavelet random matrices converges weakly, in probability, to the distribution of Hurst exponents.
Submitted 5 January, 2024;
originally announced January 2024.
-
Multivariate selfsimilarity: Multiscale eigen-structures for selfsimilarity parameter estimation
Authors:
Charles-Gérard Lucas,
Gustavo Didier,
Herwig Wendt,
Patrice Abry
Abstract:
Scale-free dynamics, formalized by selfsimilarity, provides a versatile paradigm massively and ubiquitously used to model temporal dynamics in real-world data. However, its practical use has mostly remained univariate so far. By contrast, modern applications often demand multivariate data analysis. Accordingly, models for multivariate selfsimilarity were recently proposed. Nevertheless, they have remained rarely used in practice because of a lack of available robust estimation procedures for the vector of selfsimilarity parameters. Building upon recent mathematical developments, the present work puts forth an efficient estimation procedure based on the theoretical study of the multiscale eigenstructure of the wavelet spectrum of multivariate selfsimilar processes. The estimation performance is studied theoretically in the asymptotic limits of large scale and sample sizes, and computationally for finite-size samples. As a practical outcome, a fully operational and documented multivariate signal processing estimation toolbox is made freely available and is ready for practical use on real-world data. Its potential benefits are illustrated in epileptic seizure prediction from multi-channel EEG data.
Submitted 2 April, 2024; v1 submitted 6 November, 2023;
originally announced November 2023.
-
On the surprising effectiveness of a simple matrix exponential derivative approximation, with application to global SARS-CoV-2
Authors:
Gustavo Didier,
Nathan E. Glatt-Holtz,
Andrew J. Holbrook,
Andrew F. Magee,
Marc A. Suchard
Abstract:
The continuous-time Markov chain (CTMC) is the mathematical workhorse of evolutionary biology. Learning CTMC model parameters using modern, gradient-based methods requires the derivative of the matrix exponential evaluated at the CTMC's infinitesimal generator (rate) matrix. Motivated by the derivative's extreme computational complexity as a function of state space cardinality, recent work demonstrates the surprising effectiveness of a naive, first-order approximation for a host of problems in computational biology. In response to this empirical success, we obtain rigorous deterministic and probabilistic bounds for the error accrued by the naive approximation and establish a "blessing of dimensionality" result that is universal for a large class of rate matrices with random entries. Finally, we apply the first-order approximation within surrogate-trajectory Hamiltonian Monte Carlo for the analysis of the early spread of SARS-CoV-2 across 44 geographic regions that comprise a state space of unprecedented dimensionality for unstructured (flexible) CTMC models within evolutionary biology.
Submitted 6 December, 2023; v1 submitted 27 June, 2023;
originally announced June 2023.
-
Wavelet eigenvalue regression in high dimensions
Authors:
Patrice Abry,
B. Cooper Boniece,
Gustavo Didier,
Herwig Wendt
Abstract:
In this paper, we construct the wavelet eigenvalue regression methodology in high dimensions. We assume that possibly non-Gaussian, finite-variance $p$-variate measurements are made of a low-dimensional $r$-variate ($r \ll p$) fractional stochastic process with non-canonical scaling coordinates and in the presence of additive high-dimensional noise. The measurements are correlated both time-wise and between rows. Building upon the asymptotic and large scale properties of wavelet random matrices in high dimensions, the wavelet eigenvalue regression is shown to be consistent and, under additional assumptions, asymptotically Gaussian in the estimation of the fractal structure of the system. We further construct a consistent estimator of the effective dimension $r$ of the system that significantly increases the robustness of the methodology. The estimation performance over finite samples is studied by means of simulations.
Submitted 29 July, 2022; v1 submitted 8 August, 2021;
originally announced August 2021.
-
The generalized Langevin equation in harmonic potentials: Anomalous diffusion and equipartition of energy
Authors:
Gustavo Didier,
Hung D. Nguyen
Abstract:
We consider the generalized Langevin equation (GLE) in a harmonic potential with power law decay memory. We study the anomalous diffusion of the particle's displacement and velocity. By comparison with the free particle situation in which the velocity was previously shown to be either diffusive or subdiffusive, we find that, when trapped in a harmonic potential, the particle's displacement may either be diffusive or superdiffusive. Under slightly stronger assumptions on the memory kernel, namely, for kernels related to the broad class of completely monotonic functions, we show that both the free particle and the harmonically bounded GLE satisfy the equipartition of energy condition. This generalizes previously known results for the GLE under particular kernel instances such as the generalized Rouse kernel or (exactly) a power law function.
Submitted 9 March, 2022; v1 submitted 8 March, 2021;
originally announced March 2021.
-
On high-dimensional wavelet eigenanalysis
Authors:
Patrice Abry,
B. Cooper Boniece,
Gustavo Didier,
Herwig Wendt
Abstract:
In this paper, we characterize the asymptotic and large scale behavior of the eigenvalues of wavelet random matrices in high dimensions. We assume that possibly non-Gaussian, finite-variance $p$-variate measurements are made of a low-dimensional $r$-variate ($r \ll p$) fractional stochastic process with non-canonical scaling coordinates and in the presence of additive high-dimensional noise. The measurements are correlated both time-wise and between rows. We show that the $r$ largest eigenvalues of the wavelet random matrices, when appropriately rescaled, converge in probability to scale-invariant functions in the high-dimensional limit. By contrast, the remaining $p-r$ eigenvalues remain bounded in probability. Under additional assumptions, we show that the $r$ largest log-eigenvalues of wavelet random matrices exhibit asymptotically Gaussian distributions. The results have direct consequences for statistical inference.
Submitted 9 June, 2024; v1 submitted 10 February, 2021;
originally announced February 2021.
-
On operator fractional Lévy motion: integral representations and time reversibility
Authors:
Benjamin Cooper Boniece,
Gustavo Didier
Abstract:
In this paper, we construct operator fractional Lévy motion (ofLm), a broad class of non-Gaussian stochastic processes that are covariance operator self-similar, have wide-sense stationary increments and display infinitely divisible marginal distributions. The ofLm class generalizes the univariate fractional Lévy motion as well as the multivariate operator fractional Brownian motion (ofBm). The ofLm class can be divided into two types, namely, moving average (maofLm) and real harmonizable (rhofLm), both of which share the covariance structure of ofBm under assumptions. We show that maofLm and rhofLm admit stochastic integral representations in the time and Fourier domains, and establish their distinct small- and large-scale limiting behavior. We characterize time reversibility for ofLm through parametric conditions related to its Lévy measure, starting from a framework for the uniqueness of finite second moment, multivariate stochastic integral representations. In particular, we show that, under non-Gaussianity, the parametric conditions for time reversibility are generally more restrictive than those for the Gaussian case (ofBm).
Submitted 15 June, 2021; v1 submitted 10 January, 2021;
originally announced January 2021.
-
Asymptotic theory for the detection of mixing in anomalous diffusion
Authors:
Kui Zhang,
Gustavo Didier
Abstract:
In this paper, we develop asymptotic theory for the mixing detection methodology proposed by M. Magdziarz and A. Weron [Physical Review E, 84:051138 (2011)]. The assumptions cover a broad family of Gaussian stochastic processes including fractional Gaussian noise and the fractional Ornstein-Uhlenbeck process. We show that the asymptotic distribution and convergence rates of the detection statistic may be, respectively, Gaussian or non-Gaussian and standard or nonstandard depending on the diffusion exponent. The results pave the way for mixing detection based on a single observed sample path and by means of robust hypothesis testing.
Submitted 16 March, 2021; v1 submitted 28 July, 2020;
originally announced July 2020.
-
On multivariate fractional random fields: tempering and operator-stable laws
Authors:
G. Didier,
S. Kanamori,
F. Sabzikar
Abstract:
In this paper, we define a new and broad family of vector-valued random fields called tempered operator fractional operator-stable random fields (TRF, for short). TRF is typically non-Gaussian and generalizes tempered fractional stable stochastic processes. TRF comprises moving average and harmonizable-type subclasses that are constructed by tempering (matrix-) homogeneous, matrix-valued kernels in time- and Fourier-domain stochastic integrals with respect to vector-valued, strictly operator-stable random measures. We establish the existence and fundamental properties of TRF. Assuming both Gaussianity and isotropy, we show the equivalence between certain moving average and harmonizable subclasses of TRF. In addition, we establish sample path properties in the scalar-valued case for several Gaussian instances.
Submitted 21 February, 2020;
originally announced February 2020.
-
On fractional Lévy processes: tempering, sample path properties and stochastic integration
Authors:
Benjamin Cooper Boniece,
Gustavo Didier,
Farzad Sabzikar
Abstract:
We define two new classes of stochastic processes, called tempered fractional Lévy process of the first and second kinds (TFLP and TFLP $I\!I$, respectively). TFLP and TFLP $I\!I$ make up very broad finite-variance, generally non-Gaussian families of transient anomalous diffusion models that are constructed by exponentially tempering the power law kernel in the moving average representation of a fractional Lévy process. Accordingly, the increment processes of TFLP and TFLP $I\!I$ display semi-long range dependence. We establish the sample path properties of TFLP and TFLP $I\!I$. We further use a flexible framework of tempered fractional derivatives and integrals to develop the theory of stochastic integration with respect to TFLP and TFLP $I\!I$, which may not be semimartingales depending on the value of the memory parameter and choice of marginal distribution.
Submitted 1 October, 2019;
originally announced October 2019.
-
Asymptotic analysis of the mean squared displacement under fractional memory kernels
Authors:
Gustavo Didier,
Hung D. Nguyen
Abstract:
The generalized Langevin equation (GLE) is a universal model for particle velocity in a viscoelastic medium. In this paper, we consider the GLE family with fractional memory kernels. We show that, in the critical regime where the memory kernel decays like $1/t$ for large $t$, the mean squared displacement (MSD) of particle motion grows linearly in time up to a slowly varying (logarithm) term. Moreover, we establish the well-posedness of the GLE in this regime. This solves an open question from [Mckinley 2018 Anomalous] and completes the answer to the conjecture put forward in [Morgado 2002 Relation] on the relationship between memory kernel decay and anomalously diffusive behavior. Under slightly stronger assumptions on the memory kernel, we construct an Abelian-Tauberian framework that leads to robust bounds on the deviation of the MSD around its asymptotic trend. This bridges the gap between the GLE memory kernel and the spectral density of anomalously diffusive particle motion characterized in [Didier 2017 Asymptotic].
Submitted 8 March, 2021; v1 submitted 9 January, 2019;
originally announced January 2019.
-
Tempered fractional Brownian motion: wavelet estimation, modeling and testing
Authors:
B. Cooper Boniece,
Gustavo Didier,
Farzad Sabzikar
Abstract:
The Davenport spectrum is a modification of the classical Kolmogorov spectrum for the inertial range of turbulence that accounts for non-scaling low frequency behavior. Like the classical fractional Brownian motion vis-à-vis the Kolmogorov spectrum, tempered fractional Brownian motion (tfBm) is a canonical model that displays the Davenport spectrum. The autocorrelation of the increments of tfBm displays semi-long range dependence (hyperbolic and quasi-exponential decays over moderate and large scales, respectively), a phenomenon that has been observed in a wide range of applications from wind speeds to geophysics to finance. In this paper, we use wavelets to construct the first estimation method for tfBm and a simple and computationally efficient test for fBm vs tfBm alternatives. The properties of the wavelet estimator and test are mathematically and computationally established. An application of the methodology to the analysis of geophysical flow data shows that tfBm provides a much closer fit than fBm.
Submitted 14 August, 2018;
originally announced August 2018.
-
Fluid heterogeneity detection based on the asymptotic distribution of the time-averaged mean squared displacement in single particle tracking experiments
Authors:
Kui Zhang,
Katelyn P. R. Crizer,
Mark H. Schoenfisch,
David B. Hill,
Gustavo Didier
Abstract:
A tracer particle is called anomalously diffusive if its mean squared displacement grows approximately as $\sigma^2 t^\alpha$ as a function of time $t$ for some constant $\sigma^2$, where the diffusion exponent satisfies $\alpha \neq 1$. In this article, we use recent results on the asymptotic distribution of the time-averaged mean squared displacement (Didier and Zhang (2017)) to construct statistical tests for detecting physical heterogeneity in viscoelastic fluid samples starting from one or multiple observed anomalously diffusive paths. The methods are asymptotically valid for the range $0 < \alpha < 3/2$ and involve a mathematical characterization of time-averaged mean squared displacement bias and the effect of correlated disturbance errors. The assumptions on particle motion cover a broad family of fractional Gaussian processes, including fractional Brownian motion and many fractional instances of the generalized Langevin equation framework. We apply the proposed methods to experimental data from treated $P.\ aeruginosa$ biofilms generated by the collaboration of the Hill and Schoenfisch Labs at UNC-Chapel Hill.
Submitted 5 September, 2018; v1 submitted 12 May, 2018;
originally announced May 2018.
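The time-averaged mean squared displacement mentioned in the abstract above can be illustrated with a minimal sketch. It computes the statistic for a single one-dimensional path and estimates the diffusion exponent $\alpha$ by an ordinary least-squares fit on a log-log scale. The function names `tamsd` and `estimate_alpha`, the choice of lags, and the plain least-squares fit are assumptions made for illustration; this is not the heterogeneity test constructed in the paper.

```python
import math
import random


def tamsd(path, lag):
    """Time-averaged mean squared displacement of a 1-D path at an integer lag."""
    n = len(path)
    if not 1 <= lag < n:
        raise ValueError("lag must satisfy 1 <= lag < len(path)")
    total = sum((path[i + lag] - path[i]) ** 2 for i in range(n - lag))
    return total / (n - lag)


def estimate_alpha(path, lags):
    """Least-squares slope of log(TAMSD) against log(lag).

    Assumes the TAMSD is strictly positive at every lag
    (a constant path would make the logarithm undefined).
    """
    xs = [math.log(lag) for lag in lags]
    ys = [math.log(tamsd(path, lag)) for lag in lags]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Illustrative input: a simple Gaussian random walk (ordinary diffusion).
    random.seed(0)
    position, walk = 0.0, []
    for _ in range(5000):
        position += random.gauss(0.0, 1.0)
        walk.append(position)
    print("estimated alpha:", estimate_alpha(walk, [1, 2, 4, 8, 16]))
```

For an ordinary random walk the fitted slope should lie near 1, while a path moving at constant velocity gives a slope of 2, which makes the latter a convenient deterministic sanity check.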
-
Wavelet eigenvalue regression for $n$-variate operator fractional Brownian motion
Authors:
Patrice Abry,
Gustavo Didier
Abstract:
In this contribution, we extend the methodology proposed in Abry and Didier (2017) to obtain the first joint estimator of the real parts of the Hurst eigenvalues of $n$-variate OFBM. The procedure consists of a wavelet regression on the log-eigenvalues of the sample wavelet spectrum. The estimator is shown to be consistent for any time reversible OFBM and, under stronger assumptions, also asymptotically normal starting from either continuous or discrete time measurements. Simulation studies establish the finite sample effectiveness of the methodology and illustrate its benefits compared to univariate-like (entrywise) analysis. As an application, we revisit the well-known self-similar character of Internet traffic by applying the proposed methodology to 4-variate time series of modern, high quality Internet traffic data. The analysis reveals the presence of a rich multivariate self-similarity structure.
Submitted 10 August, 2017;
originally announced August 2017.
-
Multivariate Hadamard self-similarity: testing fractal connectivity
Authors:
Herwig Wendt,
Gustavo Didier,
Sébastien Combrexelle,
Patrice Abry
Abstract:
While scale invariance is commonly observed in each component of real world multivariate signals, it is also often the case that the inter-component correlation structure is not fractally connected, i.e., its scaling behavior is not determined by that of the individual components. To model this situation in a versatile manner, we introduce a class of multivariate Gaussian stochastic processes called Hadamard fractional Brownian motion (HfBm). Its theoretical study sheds light on the issues raised by the joint requirement of entry-wise scaling and departures from fractal connectivity. An asymptotically normal wavelet-based estimator for its scaling parameter, called the Hurst matrix, is proposed, as well as asymptotically valid confidence intervals. The latter are accompanied by original finite sample procedures for computing confidence intervals and testing fractal connectivity from one single and finite size observation. Monte Carlo simulation studies are used to assess the estimation performance as a function of the (finite) sample size, and to quantify the impact of omitting wavelet cross-correlation terms. The simulation studies are shown to validate the use of approximate confidence intervals, together with the significance level and power of the fractal connectivity test. The test performance and properties are further studied as functions of the HfBm parameters.
Submitted 16 January, 2017;
originally announced January 2017.
-
Domain and range symmetries of operator fractional Brownian fields
Authors:
Gustavo Didier,
Mark M. Meerschaert,
Vladas Pipiras
Abstract:
An operator fractional Brownian field (OFBF) is a Gaussian, stationary increment $\mathbb{R}^n$-valued random field on $\mathbb{R}^m$ that satisfies the operator self-similarity property $\{X(c^E t)\}_{t \in \mathbb{R}^m} \stackrel{\mathcal{L}}{=} \{c^H X(t)\}_{t \in \mathbb{R}^m}$, $c > 0$, for two matrix exponents $(E,H)$. In this paper, we characterize the domain and range symmetries of OFBF, respectively, as maximal groups with respect to equivalence classes generated by orbits and, based on a new anisotropic polar-harmonizable representation of OFBF, as intersections of centralizers. We also describe the sets of possible pairs of domain and range symmetry groups in dimensions $(m,1)$ and $(2,2)$.
Submitted 4 September, 2016;
originally announced September 2016.
-
Non-Linear Wavelet Regression and Branch & Bound Optimization for the Full Identification of Bivariate Operator Fractional Brownian Motion
Authors:
Jordan Frecon,
Gustavo Didier,
Nelly Pustelnik,
Patrice Abry
Abstract:
Self-similarity is widely considered the reference framework for modeling the scaling properties of real-world data. However, most theoretical studies and their practical use have remained univariate. Operator Fractional Brownian Motion (OfBm) was recently proposed as a multivariate model for self-similarity. Yet it has remained seldom used in applications because of serious issues that appear in the joint estimation of its numerous parameters. While the univariate fractional Brownian motion requires the estimation of two parameters only, its mere bivariate extension already involves 7 parameters which are very different in nature. The present contribution proposes a method for the full identification of bivariate OfBm (i.e., the joint estimation of all parameters) through an original formulation as a non-linear wavelet regression coupled with a custom-made Branch & Bound numerical scheme. The estimation performance (consistency and asymptotic normality) is mathematically established and numerically assessed by means of Monte Carlo experiments. The impact of the parameters defining OfBm on the estimation performance as well as the associated computational costs are also thoroughly investigated.
Submitted 28 August, 2016;
originally announced August 2016.
-
Exponents of operator self-similar random fields
Authors:
Gustavo Didier,
Mark M. Meerschaert,
Vladas Pipiras
Abstract:
If $X(c^E t)$ and $c^H X(t)$ have the same finite-dimensional distributions for some linear operators $E$ and $H$, we say that the random vector field $X(t)$ is operator self-similar. The exponents $E$ and $H$ are not unique in general, due to symmetry. This paper characterizes the possible set of range exponents $H$ for a given domain exponent, and conversely, the set of domain exponents $E$ for a given range exponent.
Submitted 16 August, 2016;
originally announced August 2016.
-
Two-step wavelet-based estimation for mixed Gaussian fractional processes
Authors:
Patrice Abry,
Gustavo Didier,
Hui Li
Abstract:
A mixed Gaussian fractional process $\{Y(t)\}_{t \in {\Bbb R}} = \{PX(t)\}_{t \in {\Bbb R}}$ is a multivariate stochastic process obtained by pre-multiplying a vector of independent, Gaussian fractional process entries $X$ by a nonsingular matrix $P$. It is interpreted that $Y$ is observable, while $X$ is a hidden process occurring in an (unknown) system of coordinates $P$. Mixed processes naturally arise as approximations to solutions of physically relevant classes of multivariate fractional SDEs under aggregation. We propose a semiparametric two-step wavelet-based method for estimating both the demixing matrix $P^{-1}$ and the memory parameters of $X$. The asymptotic normality of the estimators is established both in continuous and discrete time. Monte Carlo experiments show that the finite sample estimation performance is comparable to that of parametric methods, while being very computationally efficient. As applications, we model a bivariate time series of annual tree ring width measurements, and establish the asymptotic normality of the eigenstructure of sample wavelet matrices.
Submitted 10 August, 2017; v1 submitted 18 July, 2016;
originally announced July 2016.
-
Designing optimal- and fast-on-average pattern matching algorithms
Authors:
Gilles Didier,
Laurent Tichit
Abstract:
Given a pattern $w$ and a text $t$, the speed of a pattern matching algorithm over $t$ with regard to $w$ is the ratio of the length of $t$ to the number of text accesses performed to search for $w$ in $t$. We first propose a general method for computing the limit of the expected speed of pattern matching algorithms, with regard to $w$, over iid texts. Next, we show how to determine the greatest speed which can be achieved among a large class of algorithms, together with an algorithm achieving this speed. Since the complexity of this determination makes it impossible to deal with patterns of length greater than 4, we propose a polynomial heuristic. Finally, our approaches are compared with 9 pre-existing pattern matching algorithms from both a theoretical and a practical point of view, i.e. both in terms of limit expected speed on iid texts, and in terms of observed average speed on real data. In all cases, the pre-existing algorithms are outperformed.
Submitted 25 November, 2016; v1 submitted 28 April, 2016;
originally announced April 2016.
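As a rough illustration of the speed defined in the abstract above (text length divided by the number of text accesses), the following sketch counts text accesses for two classical matchers. The function names, the choice of the naive and Horspool algorithms, and the uniform iid text model are assumptions made for illustration; they are not the optimal algorithms or the heuristic proposed in the paper.

```python
import random


def naive_accesses(w, t):
    """Naive left-to-right matcher. Returns (match positions, text accesses)."""
    m, n = len(w), len(t)
    matches, accesses = [], 0
    for i in range(n - m + 1):
        j = 0
        while j < m:
            accesses += 1  # one read of t[i + j]
            if t[i + j] != w[j]:
                break
            j += 1
        if j == m:
            matches.append(i)
    return matches, accesses


def horspool_accesses(w, t):
    """Horspool matcher. Returns (match positions, text accesses)."""
    m, n = len(w), len(t)
    # Shift for a character: distance from its rightmost occurrence
    # among the first m - 1 pattern characters to the pattern's end.
    shift = {c: m - 1 - k for k, c in enumerate(w[:-1])}
    matches, accesses = [], 0
    i = 0
    while i + m <= n:
        j = m - 1
        while j >= 0:
            accesses += 1  # one read of t[i + j]
            if t[i + j] != w[j]:
                break
            j -= 1
        if j < 0:
            matches.append(i)
        # t[i + m - 1] was already read above, so it is not counted again.
        i += shift.get(t[i + m - 1], m)
    return matches, accesses


def speed(text_length, accesses):
    """Speed as defined in the abstract: text length per text access."""
    return text_length / accesses


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical uniform iid text over {a, c, g, t}
    text = "".join(rng.choice("acgt") for _ in range(100_000))
    for name, algo in [("naive", naive_accesses), ("horspool", horspool_accesses)]:
        _, acc = algo("gattaca", text)
        print(name, speed(len(text), acc))
```

Averaging such observed speeds over many iid texts gives an empirical counterpart of the limit expected speed that the paper computes exactly.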
-
Optimal pattern matching algorithms
Authors:
Gilles Didier
Abstract:
We study a class of finite state machines, called $w$-matching machines, which make it possible to simulate the behavior of pattern matching algorithms while searching for a pattern $w$. They can be used to compute the asymptotic speed, i.e. the limit of the expected ratio of the number of text accesses to the length of the text, of algorithms while parsing an iid text to find the pattern $w$.
Defining the order of a matching machine or of an algorithm as the maximum difference between the current and accessed positions during a search (standard algorithms are generally of order $|w|$), we show that, given a pattern $w$, an order $k$ and an iid model, there exists an optimal $w$-matching machine, i.e. one with the greatest asymptotic speed under the model among all the machines of order $k$, whose set of states belongs to a finite and enumerable set.
This shows that it is possible to determine: 1) the greatest asymptotic speed among a large class of algorithms, with regard to a pattern and an iid model, and 2) a $w$-matching machine, and thus an algorithm, achieving this speed.
Submitted 2 May, 2016; v1 submitted 28 April, 2016;
originally announced April 2016.
-
The asymptotic distribution of the pathwise mean squared displacement in single particle tracking experiments
Authors:
Gustavo Didier,
Kui Zhang
Abstract:
Recent advances in light microscopy have spawned new research frontiers in microbiology by working around the diffraction barrier and allowing for the observation of nanometric biological structures. Microrheology is the study of the properties of complex fluids, such as those found in biology, through the dynamics of small embedded particles, typically latex beads. Statistics based on the recorded sample paths are then used by biophysicists to infer rheological properties of the fluid. In the biophysical literature, the main statistic for characterizing diffusivity is the so-called mean squared displacement (MSD) of the tracer particles. Notwithstanding the central role played by the MSD, its asymptotic distribution in different cases has not yet been established. In this paper, we tackle this problem. We take a pathwise approach and assume that the particle movement follows a Gaussian, stationary-increment stochastic process. We show that as the sample and the increment lag sizes go to infinity, the MSD displays Gaussian or non-Gaussian limiting distributions, as well as distinct convergence rates, depending on the diffusion exponent parameter. We illustrate our results analytically and computationally based on fractional Brownian motion and the (integrated) fractional Ornstein-Uhlenbeck process.
Submitted 26 July, 2016; v1 submitted 23 July, 2015;
originally announced July 2015.
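For readers unfamiliar with the statistic discussed in the abstract above, here is a minimal sketch of a pathwise (time-averaged) MSD computed from a single recorded trajectory. The function name and the normalization by the number of available increments are assumptions; the paper's exact definition is not given in the abstract and may differ.

```python
def pathwise_msd(path, lag):
    """Average squared displacement over all increments of size `lag`.

    `path` is a sequence of positions X_0, ..., X_{n-1} observed at
    equally spaced times; `lag` is the increment size in sampling steps.
    """
    n = len(path)
    if not 1 <= lag < n:
        raise ValueError("lag must satisfy 1 <= lag < len(path)")
    squared = [(path[k + lag] - path[k]) ** 2 for k in range(n - lag)]
    return sum(squared) / len(squared)


if __name__ == "__main__":
    # A deterministic straight-line path: displacement over lag h is h,
    # so the MSD grows like h squared.
    line = [0.0, 1.0, 2.0, 3.0, 4.0]
    print(pathwise_msd(line, 1), pathwise_msd(line, 2))
```

In practice the MSD is evaluated over a range of lags and plotted on a log-log scale, where its slope estimates the diffusion exponent mentioned in the abstract.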
-
Wavelet estimation for operator fractional Brownian motion
Authors:
Patrice Abry,
Gustavo Didier
Abstract:
Operator fractional Brownian motion (OFBM) is the natural vector-valued extension of the univariate fractional Brownian motion. Instead of a scalar parameter, the law of an OFBM scales according to a Hurst matrix that affects every component of the process. In this paper, we develop the wavelet analysis of OFBM, as well as a new estimator for the Hurst matrix of bivariate OFBM. For OFBM, the univariate-inspired approach of analyzing the entry-wise behavior of the wavelet spectrum as a function of the (wavelet) scales is fraught with difficulties stemming from mixtures of power laws. The proposed approach consists of considering the evolution along scales of the eigenstructure of the wavelet spectrum. This is shown to yield consistent and asymptotically normal estimators of the Hurst eigenvalues, and also of the coordinate system itself under assumptions. A simulation study is included to demonstrate the good performance of the estimators under finite sample sizes.
Submitted 15 September, 2015; v1 submitted 24 January, 2015;
originally announced January 2015.
-
On integral representations of operator fractional Brownian fields
Authors:
Changryong Baek,
Gustavo Didier,
Vladas Pipiras
Abstract:
Operator fractional Brownian fields (OFBFs) are Gaussian, stationary-increment vector random fields that satisfy the operator self-similarity relation $\{X(c^{E}t)\}_{t \in \mathbb{R}^m} \stackrel{\mathcal{L}}{=} \{c^{H}X(t)\}_{t \in \mathbb{R}^m}$. We establish a general harmonizable representation (Fourier domain stochastic integral) for OFBFs. Under additional assumptions, we also show how the harmonizable representation can be reexpressed as a moving average stochastic integral, thus answering an open problem described in Bierme et al. (2007), "Operator scaling stable random fields", Stochastic Processes and their Applications 117, 312--332.
Submitted 23 May, 2014; v1 submitted 24 March, 2014;
originally announced March 2014.
-
On the vaguelet and Riesz properties of $L^2$-unbounded transformations of orthogonal wavelet bases
Authors:
Gustavo Didier,
Stéphane Jaffard,
Vladas Pipiras
Abstract:
In this work, we prove that certain $L^2$-unbounded transformations of orthogonal wavelet bases generate vaguelets. The $L^2$-unbounded functions involved in the transformations are assumed to be quasi-homogeneous at high frequencies. We provide natural examples of functions which are not quasi-homogeneous and for which the resulting transformations are not vaguelets. We also address the related question of whether the considered family of functions is a Riesz basis in $L^2(\mathbb{R})$. The Riesz property could be deduced directly from the results available in the literature or, as we outline, by using the vaguelet property in the context of this work. The considered families of functions arise in wavelet-based decompositions of stochastic processes with uncorrelated coefficients.
Submitted 13 March, 2013; v1 submitted 9 October, 2012;
originally announced October 2012.
-
On the wavelet-based simulation of anomalous diffusion
Authors:
Gustavo Didier,
John Fricks
Abstract:
The characterization of particle diffusion is a classical problem in physics and probability theory. The field of microrheology is based on experiments in which microscopic tracer beads are placed into a non-Newtonian fluid and tracked using high speed video capture. The modeling of the behavior of these beads is now an active scientific area which demands multiple stochastic and statistical methods.
We propose an approximate wavelet-based simulation technique for two classes of continuous time anomalous diffusion models, the fractional Ornstein-Uhlenbeck process and the fractional generalized Langevin equation. The proposed algorithm is an iterative method that provides approximate discretizations that converge quickly and in an appropriate sense to the continuous time target process. As compared to previous works, it covers cases where the natural discretization of the target process does not have a closed form in the time domain. Moreover, we propose smoothing procedures to speed up the time domain decay of the filters.
Submitted 2 July, 2012; v1 submitted 20 February, 2012;
originally announced February 2012.
-
Statistical Challenges in Microrheology
Authors:
Gustavo Didier,
Scott McKinley,
David B. Hill,
John Fricks
Abstract:
Microrheology is the study of the properties of a complex fluid through the diffusion dynamics of small particles, typically latex beads, moving through that material. Currently, it is the dominant technique in the study of the physical properties of biological fluids, of the material properties of membranes or the cytoplasm of cells, or of the entire cell. The theoretical underpinning of microrheology was given in Mason and Weitz (Physical Review Letters; 1995), who introduced a framework for the use of path data of diffusing particles to infer viscoelastic properties of their fluid environment. The multi-particle tracking techniques that were subsequently developed have presented numerous challenges for experimentalists and theoreticians. This paper describes some specific challenges that await the attention of statisticians and applied probabilists. We describe relevant aspects of the physical theory, current inferential efforts and simulation aspects of a central model for the dynamics of nano-scale particles in viscoelastic fluids, the generalized Langevin equation.
Submitted 9 February, 2012; v1 submitted 28 January, 2012;
originally announced January 2012.
-
Integral representations and properties of operator fractional Brownian motions
Authors:
Gustavo Didier,
Vladas Pipiras
Abstract:
Operator fractional Brownian motions (OFBMs) are (i) Gaussian, (ii) operator self-similar and (iii) stationary increment processes. They are the natural multivariate generalizations of the well-studied fractional Brownian motions. Because of the possible lack of time-reversibility, the defining properties (i)--(iii) do not, in general, characterize the covariance structure of OFBMs. To circumvent this problem, the class of OFBMs is characterized here by means of their integral representations in the spectral and time domains. For the spectral domain representations, this involves showing how the operator self-similarity shapes the spectral density in the general representation of stationary increment processes. The time domain representations are derived by using primary matrix functions and taking the Fourier transforms of the deterministic spectral domain kernels. Necessary and sufficient conditions for OFBMs to be time-reversible are established in terms of their spectral and time domain representations. It is also shown that the spectral density of the stationary increments of an OFBM has a rigid structure, here called the dichotomy principle. The notion of operator Brownian motions is also explored.
Submitted 9 February, 2011;
originally announced February 2011.
-
Exponents, symmetry groups and classification of operator fractional Brownian motions
Authors:
Gustavo Didier,
Vladas Pipiras
Abstract:
Operator fractional Brownian motions (OFBMs) are zero mean, operator self-similar (o.s.s.), Gaussian processes with stationary increments. They generalize univariate fractional Brownian motions to the multivariate context. It is well-known that the so-called symmetry group of an o.s.s. process is conjugate to subgroups of the orthogonal group. Moreover, by a celebrated result of Hudson and Mason, the set of all exponents of an operator self-similar process can be related to the tangent space of its symmetry group.
In this paper, we revisit and study both the symmetry groups and exponent sets for the class of OFBMs based on their spectral domain integral representations. A general description of the symmetry groups of OFBMs in terms of subsets of centralizers of the spectral domain parameters is provided. OFBMs with symmetry groups of maximal and minimal types are studied in any dimension. In particular, it is shown that OFBMs have minimal symmetry groups (and thus, unique exponents) in general, in the topological sense. Finer classification results of OFBMs, based on the explicit construction of their symmetry groups, are given in the lower dimensions 2 and 3. It is also shown that the parametrization of spectral domain integral representations is, in a suitable sense, not affected by the multiplicity of exponents, whereas the same is not true for time domain integral representations.
Submitted 24 January, 2011;
originally announced January 2011.
-
On the Behrens--Fisher problem: A globally convergent algorithm and a finite-sample study of the Wald, LR and LM Tests
Authors:
Alexandre Belloni,
Gustavo Didier
Abstract:
In this paper we provide a provably convergent algorithm for the multivariate Gaussian Maximum Likelihood version of the Behrens--Fisher Problem. Our work builds upon a formulation of the log-likelihood function proposed by Buot and Richards. Instead of focusing on the first order optimality conditions, the algorithm aims directly for the maximization of the log-likelihood function itself to achieve a global solution. A convergence proof and complexity estimates are provided for the algorithm. Computational experiments illustrate the applicability of such methods to high-dimensional data. We also discuss how to extend the proposed methodology to a broader class of problems. We establish a systematic algebraic relation between the Wald, Likelihood Ratio and Lagrange Multiplier Tests ($W\geq \mathit{LR}\geq \mathit{LM}$) in the context of the Behrens--Fisher Problem. Moreover, we use our algorithm to computationally investigate the finite-sample size and power of the Wald, Likelihood Ratio and Lagrange Multiplier Tests, which previously were only available through asymptotic results. The methods developed here are applicable to much higher dimensional settings than the ones available in the literature. This allows us to better capture the role of high dimensionality on the actual size and power of the tests for finite samples.
Submitted 5 November, 2008;
originally announced November 2008.