-
Rate of Estimation for the Stationary Distribution of Stochastic Damping Hamiltonian Systems with Continuous Observations
Authors:
Sylvain Delattre,
Arnaud Gloter,
Nakahiro Yoshida
Abstract:
We study the problem of the non-parametric estimation for the density $π$ of the stationary distribution of a stochastic two-dimensional damping Hamiltonian system $(Z_t)_{t\in[0,T]}=(X_t,Y_t)_{t \in [0,T]}$. From the continuous observation of the sampling path on $[0,T]$, we study the rate of estimation for $π(x_0,y_0)$ as $T \to \infty$. We show that kernel based estimators can achieve the rate $T^{-v}$ for some explicit exponent $v \in (0,1/2)$. One finding is that the rate of estimation depends on the smoothness of $π$ and is completely different from the rate appearing in the standard i.i.d.\ setting or in the case of two-dimensional non degenerate diffusion processes. In particular, this rate also depends on $y_0$. Moreover, we obtain a minimax lower bound on the $L^2$-risk for pointwise estimation, with the same rate $T^{-v}$, up to $\log(T)$ terms.
Submitted 28 January, 2020;
originally announced January 2020.
-
Estimation via length-constrained generalized empirical principal curves under small noise
Authors:
Sylvain Delattre,
Aurélie Fischer
Abstract:
In this paper, we propose a method to build a sequence of generalized empirical principal curves, with selected length, so that, in Hausdorff distance, the images of the estimating principal curves converge in probability to the image of g.
Submitted 15 November, 2019;
originally announced November 2019.
-
Estimating minimum effect with outlier selection
Authors:
Alexandra Carpentier,
Sylvain Delattre,
Etienne Roquain,
Nicolas Verzelen
Abstract:
We introduce one-sided versions of Huber's contamination model, in which corrupted samples tend to take larger values than uncorrupted ones. Two intertwined problems are addressed: estimation of the mean of uncorrupted samples (minimum effect) and selection of corrupted samples (outliers). Regarding the minimum effect estimation, we derive the minimax risks and introduce estimators that adapt to the unknown number of contaminations. Interestingly, the optimal convergence rate differs markedly from that in the classical Huber contamination model. Also, our analysis uncovers the effect of particular structural assumptions on the distribution of the contaminated samples. As for the problem of selecting the outliers, we formulate the problem in a multiple testing framework for which the location/scaling of the null hypotheses is unknown. We rigorously prove how estimating the null hypothesis is possible while maintaining a theoretical guarantee on the number of falsely selected outliers, either through false discovery rate (FDR) or post hoc bounds. As a by-product, we address a long-standing open issue on FDR control under equi-correlation, which reinforces the interest of removing dependency when performing multiple testing.
Submitted 21 September, 2018;
originally announced September 2018.
-
On principal curves with a length constraint
Authors:
Sylvain Delattre,
Aurélie Fischer
Abstract:
Principal curves are defined as parametric curves passing through the "middle" of a probability distribution in R^d. In addition to the original definition based on self-consistency, several points of view have been considered, among which a least-squares type constrained minimization problem. In this paper, we are interested in theoretical properties satisfied by a constrained principal curve associated to a probability distribution with second-order moment. We study open and closed principal curves f:[0,1]-->R^d with length at most L and show in particular that they have finite curvature whenever the probability distribution is not supported on the range of a curve with length L. We derive from the order 1 condition, expressing that a curve is a critical point for the criterion, an equation involving the curve, its curvature, as well as a random variable playing the role of the curve parameter. This equation allows us to show that a constrained principal curve in dimension 2 has no multiple point.
Submitted 14 October, 2019; v1 submitted 5 July, 2017;
originally announced July 2017.
-
On Monte-Carlo tree search for deterministic games with alternate moves and complete information
Authors:
Sylvain Delattre,
Nicolas Fournier
Abstract:
We consider a deterministic game with alternate moves and complete information, whose outcome is always the victory of one of the two opponents. We assume that this game is the realization of a random model enjoying some independence properties. We consider algorithms in the spirit of Monte-Carlo Tree Search, to estimate as well as possible the minimax value of a given position: this consists in simulating, successively, $n$ well-chosen matches, starting from this position. We build an algorithm, which is optimal, step by step, in some sense: once the $n$ first matches are simulated, the algorithm decides from the statistics furnished by the $n$ first matches (and the a priori we have on the game) how to simulate the $(n+1)$-th match in such a way that the increase of information concerning the minimax value of the position under study is maximal. This algorithm is remarkably quick. We prove that our step by step optimal algorithm is not globally optimal and that it always converges in a finite number of steps, even if the a priori we have on the game is completely irrelevant. We finally test our algorithm, against MCTS, on Pearl's game and, with a very simple and universal a priori, on the games Connect Four and some variants. The numerical results are rather disappointing. We however exhibit some situations in which our algorithm seems efficient.
Submitted 24 January, 2018; v1 submitted 15 April, 2017;
originally announced April 2017.
-
A note on dynamical models on random graphs and Fokker-Planck equations
Authors:
Sylvain Delattre,
Giambattista Giacomin,
Eric Luçon
Abstract:
We address the issue of the proximity of interacting diffusion models on large graphs with a uniform degree property and a corresponding mean field model, i.e. a model on the complete graph with a suitably renormalized interaction parameter. Examples include Erdős-Rényi graphs with edge probability $p_n$, where $n$ is the number of vertices, such that $\lim_{n \to \infty}p_n n= \infty$. The purpose of this note is twofold: (1) to establish this proximity on finite time horizon, by exploiting the fact that both systems are accurately described by a Fokker-Planck PDE (or, equivalently, by a nonlinear diffusion process) in the $n=\infty$ limit; (2) to remark that in reality this result is unsatisfactory when it comes to applying it to systems with $N$ large but finite, for example the values of $N$ that can be reached in simulations or that correspond to the typical number of interacting units in a biological system.
Submitted 23 October, 2016; v1 submitted 18 July, 2016;
originally announced July 2016.
-
On the Kozachenko-Leonenko entropy estimator
Authors:
Nicolas Fournier,
Sylvain Delattre
Abstract:
We study in detail the bias and variance of the entropy estimator proposed by Kozachenko and Leonenko for a large class of densities on $\mathbb{R}^d$. We then use the work of Bickel and Breiman to prove a central limit theorem in dimensions $1$ and $2$. In higher dimensions, we provide an expansion of the bias in terms of powers of $N^{-2/d}$. This allows us to use a Richardson extrapolation to build, in any dimension, an estimator satisfying a central limit theorem and for which we can give some explicit (asymptotic) confidence intervals.
Submitted 24 February, 2016;
originally announced February 2016.
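For orientation, here is a minimal sketch of a nearest-neighbour entropy estimate in the commonly cited Kozachenko-Leonenko form, $H_N = \frac{d}{N}\sum_i \log R_i + \log V_d + \log(N-1) + γ$, where $R_i$ is the distance from the $i$-th sample to its nearest neighbour, $V_d$ is the volume of the unit ball and $γ$ is Euler's constant. The exact normalisation used in the paper is not stated in the abstract, so this form, the function names `unit_ball_volume` and `kl_entropy`, and the brute-force neighbour search are all assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def unit_ball_volume(d):
    """Volume of the unit Euclidean ball in R^d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def kl_entropy(points):
    """Nearest-neighbour entropy estimate (in nats).

    `points` is a list of at least two distinct d-dimensional tuples.
    Assumed form: (d/N) * sum(log R_i) + log V_d + log(N - 1) + gamma.
    """
    n = len(points)
    d = len(points[0])
    log_sum = 0.0
    for i, x in enumerate(points):
        # Brute-force distance from x to its nearest neighbour among the other points.
        r = min(math.dist(x, y) for j, y in enumerate(points) if j != i)
        log_sum += math.log(r)
    return d * log_sum / n + math.log(unit_ball_volume(d)) + math.log(n - 1) + EULER_GAMMA
```

The quadratic-time neighbour search keeps the sketch dependency-free; a spatial index would be the usual choice for large $N$. The bias expansion and Richardson extrapolation mentioned in the abstract are not reproduced here.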
-
Statistical inference versus mean field limit for Hawkes processes
Authors:
Sylvain Delattre,
Nicolas Fournier
Abstract:
We consider a population of $N$ individuals, of which we observe the number of actions as time evolves. For each couple of individuals $(i,j)$, $j$ may or may not influence $i$, which we model by i.i.d. Bernoulli$(p)$-random variables, for some unknown parameter $p\in (0,1]$. Each individual acts autonomously at some unknown rate $μ>0$ and acts by mimetism at some rate depending on the number of recent actions of the individuals which influence him, the age of these actions being taken into account through an unknown function $\varphi$ (roughly, decreasing and with fast decay). The goal of this paper is to estimate $p$, which is the main characteristic of the graph of interactions, in the asymptotic $N\to\infty$, $t\to\infty$. The main issue is that the mean field limit (as $N \to \infty$) of this model is unidentifiable, in that it only depends on the parameters $μ$ and $p\varphi$. Fortunately, this mean field limit is not valid for large times. We distinguish the subcritical case, where, roughly, the mean number $m_t$ of actions per individual increases linearly and the supercritical case, where $m_t$ increases exponentially. Although the nuisance parameter $\varphi$ is non-parametric, we are able, in both cases, to estimate $p$ without estimating $\varphi$ in a nonparametric way, with a precision of order $N^{-1/2}+N^{1/2}m_t^{-1}$, up to some arbitrarily small loss. We explain, using a Gaussian toy model, the reason why this rate of convergence might be (almost) optimal.
Submitted 27 January, 2016; v1 submitted 10 July, 2015;
originally announced July 2015.
-
Asymptotic lower bounds in estimating jumps
Authors:
Emmanuelle Clément,
Sylvain Delattre,
Arnaud Gloter
Abstract:
We study the problem of the efficient estimation of the jumps for stochastic processes. We assume that the stochastic jump process $(X_t)_{t\in[0,1]}$ is observed discretely, with a sampling step of size $1/n$. In the spirit of Hajek's convolution theorem, we show some lower bounds for the estimation error of the sequence of the jumps $(ΔX_{T_k})_k$. As an intermediate result, we prove a LAMN property, with rate $\sqrt{n}$, when the marks of the underlying jump component are deterministic. We then deduce a convolution theorem, with an explicit asymptotic minimal variance, in the case where the marks of the jump component are random. To prove that this lower bound is optimal, we show that a threshold estimator of the sequence of jumps $(ΔX_{T_k})_k$ based on the discrete observations reaches the minimal variance of the previous convolution theorem.
Submitted 1 July, 2014;
originally announced July 2014.
-
High dimensional Hawkes processes
Authors:
Sylvain Delattre,
Nicolas Fournier,
Marc Hoffmann
Abstract:
We generalise the construction of multivariate Hawkes processes to a possibly infinite network of counting processes on a directed graph $\mathbb G$. The process is constructed as the solution to a system of Poisson driven stochastic differential equations, for which we prove pathwise existence and uniqueness under some reasonable conditions.
We next investigate how to approximate a standard $N$-dimensional Hawkes process by a simple inhomogeneous Poisson process in the mean-field framework where each pair of individuals interact in the same way, in the limit $N \rightarrow \infty$. In the so-called linear case for the interaction, we further investigate the large time behaviour of the process. We study in particular the stability of the central limit theorem when exchanging the limits $N, T\rightarrow \infty$ and exhibit different possible behaviours.
We finally consider the case $\mathbb G = \mathbb Z^d$ with nearest neighbour interactions. In the linear case, we prove some (large time) laws of large numbers and exhibit different behaviours, reminiscent of the infinite setting. Finally we study the propagation of a {\it single impulsion} started at a given point of $\mathbb Z^d$ at time $0$. We compute the probability of extinction of such an impulsion and, in some particular cases, we can accurately describe how it propagates to the whole space.
Submitted 23 March, 2014;
originally announced March 2014.
-
New procedures controlling the false discovery proportion via Romano-Wolf's heuristic
Authors:
Sylvain Delattre,
Etienne Roquain
Abstract:
The false discovery proportion (FDP) is a convenient way to account for false positives when a large number $m$ of tests are performed simultaneously. Romano and Wolf [Ann. Statist. 35 (2007) 1378-1408] have proposed a general principle that builds FDP controlling procedures from $k$-family-wise error rate controlling procedures while incorporating dependencies in an appropriate manner; see Korn et al. [J. Statist. Plann. Inference 124 (2004) 379-398]; Romano and Wolf (2007). However, the theoretical validity of the latter is still largely unknown. This paper provides a careful study of this heuristic: first, we extend this approach by using a notion of "bounding device" that allows us to cover a wide range of critical values, including those that adapt to $m_0$, the number of true null hypotheses. Second, the theoretical validity of the latter is investigated both nonasymptotically and asymptotically. Third, we introduce suitable modifications of this heuristic that provide new methods, overcoming the existing procedures with a proven FDP control.
Submitted 5 June, 2015; v1 submitted 16 November, 2013;
originally announced November 2013.
-
Estimating the efficient price from the order flow: a Brownian Cox process approach
Authors:
Sylvain Delattre,
Christian Y. Robert,
Mathieu Rosenbaum
Abstract:
At the ultra high frequency level, the notion of price of an asset is very ambiguous. Indeed, many different prices can be defined (last traded price, best bid price, mid price,...). Thus, in practice, market participants face the problem of choosing a price when implementing their strategies. In this work, we propose a notion of efficient price which seems relevant in practice. Furthermore, we provide a statistical methodology that makes it possible to estimate this price from the order flow.
Submitted 11 April, 2013; v1 submitted 14 January, 2013;
originally announced January 2013.
-
On empirical distribution function of high-dimensional Gaussian vector components with an application to multiple testing
Authors:
Sylvain Delattre,
Etienne Roquain
Abstract:
This paper introduces a new framework to study the asymptotic behavior of the empirical distribution function (e.d.f.) of Gaussian vector components, whose correlation matrix $Γ^{(m)}$ is dimension-dependent. Hence, by contrast with the existing literature, the vector is not assumed to be stationary. Rather, we make a "vanishing second order" assumption ensuring that the covariance matrix $Γ^{(m)}$ is not too far from the identity matrix, while the behavior of the e.d.f. is affected by $Γ^{(m)}$ only through the sequence $γ_m=m^{-2} \sum_{i\neq j} Γ_{i,j}^{(m)}$, as $m$ grows to infinity. This result recovers some of the previous results for stationary long-range dependencies while it also applies to various, high-dimensional, non-stationary frameworks, for which the most correlated variables are not necessarily next to each other. Finally, we present an application of this work to the multiple testing problem, which was the initial statistical motivation for developing such a methodology.
Submitted 4 May, 2013; v1 submitted 9 October, 2012;
originally announced October 2012.
-
Blockwise SVD with error in the operator and application to blind deconvolution
Authors:
S. Delattre,
M. Hoffmann,
D. Picard,
T. Vareschi
Abstract:
We consider linear inverse problems in a nonparametric statistical framework. Both the signal and the operator are unknown and subject to measurement errors. We establish minimax rates of convergence under squared error loss when the operator admits a blockwise singular value decomposition (blockwise SVD) and the smoothness of the signal is measured in a Sobolev sense. We construct a nonlinear procedure adapting simultaneously to the unknown smoothness of both the signal and the operator and achieving the optimal rate of convergence to within logarithmic terms. When the noise level in the operator is dominant, by taking full advantage of the blockwise SVD property, we demonstrate that the block SVD procedure outperforms classical methods based on Galerkin projection or nonlinear wavelet thresholding. We subsequently apply our abstract framework to the specific case of blind deconvolution on the torus and on the sphere.
Submitted 13 April, 2012;
originally announced April 2012.
-
Testing the finiteness of the support of a distribution: a statistical look at Tsirelson's equation
Authors:
Sylvain Delattre,
Mathieu Rosenbaum
Abstract:
We consider the following statistical problem: based on an i.i.d. sample of size n of integer-valued random variables with common law m, is it possible to test whether or not the support of m is finite as n goes to infinity? This question is in particular connected to a simple case of Tsirelson's equation, for which it is natural to distinguish between two main configurations, the first one leading only to laws with finite support, and the second one including laws with infinite support. We show that it is in fact not possible to discriminate between the two situations, even using a very weak notion of statistical test.
Submitted 25 February, 2012;
originally announced February 2012.
-
Scaling limits for Hawkes processes and application to financial statistics
Authors:
Emmanuel Bacry,
Sylvain Delattre,
Marc Hoffmann,
Jean François Muzy
Abstract:
We prove a law of large numbers and a functional central limit theorem for multivariate Hawkes processes observed over a time interval $[0,T]$ in the limit $T \rightarrow \infty$. We further exhibit the asymptotic behaviour of the covariation of the increments of the components of a multivariate Hawkes process, when the observations are imposed by a discrete scheme with mesh $Δ$ over $[0,T]$ up to some further time shift $τ$. The behaviour of this functional depends on the relative size of $Δ$ and $τ$ with respect to $T$ and makes it possible to give a full account of the second-order structure. As an application, we develop our results in the context of financial statistics. We introduced in a previous work a microscopic stochastic model for the variations of a multivariate financial asset, based on Hawkes processes and confined to live on a tick grid. We derive and characterise the exact macroscopic diffusion limit of this model and show in particular its ability to reproduce important empirical stylised facts such as the Epps effect and the lead-lag effect. Moreover, our approach makes it possible to track these effects across scales in rigorous mathematical terms.
Submitted 3 February, 2012;
originally announced February 2012.
-
Testing over a continuum of null hypotheses with False Discovery Rate control
Authors:
Gilles Blanchard,
Sylvain Delattre,
Etienne Roquain
Abstract:
We consider statistical hypothesis testing simultaneously over a fairly general, possibly uncountably infinite, set of null hypotheses, under the assumption that a suitable single test (and corresponding $p$-value) is known for each individual hypothesis. We extend to this setting the notion of false discovery rate (FDR) as a measure of type I error. Our main result studies specific procedures based on the observation of the $p$-value process. Control of the FDR at a nominal level is ensured either under arbitrary dependence of $p$-values, or under the assumption that the finite dimensional distributions of the $p$-value process have positive correlations of a specific type (weak PRDS). Both cases generalize existing results established in the finite setting. The interest of this approach is demonstrated in several non-parametric examples: testing the mean/signal in a Gaussian white noise model, testing the intensity of a Poisson process and testing the c.d.f. of i.i.d. random variables.
Submitted 6 February, 2014; v1 submitted 17 October, 2011;
originally announced October 2011.
-
Modeling microstructure noise with mutually exciting point processes
Authors:
E. Bacry,
S. Delattre,
M. Hoffmann,
J. F. Muzy
Abstract:
We introduce a new stochastic model for the variations of asset prices at the tick-by-tick level in dimension 1 (for a single asset) and 2 (for a pair of assets). The construction is based on marked point processes and relies on linear self and mutually exciting stochastic intensities as introduced by Hawkes. We associate a counting process with the positive and negative jumps of an asset price. By suitably coupling the stochastic intensities of upward and downward changes of prices for several assets simultaneously, we can reproduce microstructure noise (i.e. strong microscopic mean reversion at the level of seconds to a few minutes) and the Epps effect (i.e. the decorrelation of the increments in microscopic scales) while preserving a standard Brownian diffusion behaviour on large scales. More effectively, we obtain analytical closed-form formulae for the mean signature plot and the correlation of two price increments that make it possible to track, across scales, the effect of the mean-reversion up to the diffusive limit of the model. We show that the theoretical results are consistent with empirical fits on futures Euro-Bund and Euro-Bobl in several situations.
Submitted 18 January, 2011;
originally announced January 2011.
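To make the coupling of upward and downward intensities concrete, the following sketch evaluates two linear Hawkes intensities in which each jump direction is excited by past jumps of the opposite direction, one simple way to produce mean reversion. The exponential kernel $α e^{-βu}$, the purely cross-exciting structure, and the names `excitation` and `price_intensities` are assumptions chosen for illustration; the abstract does not specify the kernel or the exact coupling.

```python
import math


def excitation(t, event_times, alpha, beta):
    """Sum of alpha * exp(-beta * (t - s)) over past events s < t (assumed exponential kernel)."""
    return sum(alpha * math.exp(-beta * (t - s)) for s in event_times if s < t)


def price_intensities(t, up_times, down_times, mu, alpha, beta):
    """Intensities at time t of the upward and downward jump counting processes.

    Assumed cross-excitation: past downward jumps raise the upward intensity
    and past upward jumps raise the downward intensity.
    """
    lam_up = mu + excitation(t, down_times, alpha, beta)
    lam_down = mu + excitation(t, up_times, alpha, beta)
    return lam_up, lam_down
```

With a single upward jump at time 0 and a single downward jump at time 1, calling `price_intensities(2.0, [0.0], [1.0], 0.5, 1.0, 1.0)` gives a larger upward than downward intensity, since the more recent jump was downward.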
-
Nonparametric regression with martingale increment errors
Authors:
Sylvain Delattre,
Stéphane Gaïffas
Abstract:
We consider the problem of adaptive estimation of the regression function in a framework where we replace ergodicity assumptions (such as independence or mixing) by another structural assumption on the model. Namely, we propose adaptive upper bounds for kernel estimators with data-driven bandwidth (Lepski's selection rule) in a regression model where the noise is a martingale increment. It includes, as very particular cases, the usual i.i.d. regression and auto-regressive models. The cornerstone tool for this study is a new result for self-normalized martingales, called "stability", which is of independent interest. In a first part, we only use the martingale increment structure of the noise. We give an adaptive upper bound using a random rate, that involves the occupation time near the estimation point. Thanks to this approach, the theoretical study of the statistical procedure is disconnected from usual ergodicity properties like mixing. Then, in a second part, we make a link with the usual minimax theory of deterministic rates. Under a beta-mixing assumption on the covariates process, we prove that the random rate considered in the first part is equivalent, with large probability, to a deterministic rate which is the usual minimax adaptive one.
Submitted 29 October, 2010;
originally announced October 2010.
-
On the false discovery proportion convergence under Gaussian equi-correlation
Authors:
Sylvain Delattre,
Etienne Roquain
Abstract:
We study the convergence of the false discovery proportion (FDP) of the Benjamini-Hochberg procedure in the Gaussian equi-correlated model, when the correlation $ρ_m$ converges to zero as the number of hypotheses $m$ grows to infinity. By contrast with the standard convergence rate $m^{1/2}$ holding under independence, this study shows that the FDP converges to the false discovery rate (FDR) at rate $\{\min(m,1/ρ_m)\}^{1/2}$ in this equi-correlated model.
Submitted 8 July, 2010;
originally announced July 2010.
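As a reminder of the objects involved, the sketch below implements the standard Benjamini-Hochberg step-up rule (reject the $k$ smallest $p$-values, where $k$ is the largest rank with $p_{(k)} \le αk/m$) and computes the FDP of the resulting rejection set when the set of true nulls is known, as in a simulation. The function names `benjamini_hochberg` and `false_discovery_proportion` are illustrative, and nothing here reproduces the asymptotic analysis of the paper.

```python
def benjamini_hochberg(p_values, alpha):
    """Indices rejected by the Benjamini-Hochberg step-up procedure at level alpha."""
    m = len(p_values)
    # Hypothesis indices ordered by increasing p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Step-up: remember the largest rank whose p-value is below its threshold.
        if p_values[i] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


def false_discovery_proportion(rejected, true_nulls):
    """Fraction of rejected hypotheses that are true nulls (0 if nothing is rejected)."""
    if not rejected:
        return 0.0
    false_rejections = sum(1 for i in rejected if i in true_nulls)
    return false_rejections / len(rejected)
```

For example, with $p$-values `[0.02, 0.03, 0.035, 0.9]` and `alpha = 0.05`, only the third-smallest $p$-value falls below its threshold, yet the step-up rule rejects the first three hypotheses.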