Search | arXiv e-print repository

Characterizing the fourth-moment phenomenon of monochromatic subgraph counts via influences

Abstract: We investigate the distribution of monochromatic subgraph counts in random vertex $2$-colorings of large graphs. We give sufficient conditions for the asymptotic normality of these counts and demonstrate their essential necessity (particularly for monochromatic triangles). Our approach refines the fourth-moment theorem to establish new, local influence-based conditions for asymptotic normality; these findings more generally provide insight into fourth-moment phenomena for a broader class of Rademacher and Gaussian polynomials.

Submitted 20 March, 2024; originally announced March 2024.

Comments: 29 pages

MSC Class: 60C05

arXiv:2312.13554 [pdf, ps, other]

Time Lower Bounds for the Metropolis Process and Simulated Annealing

Authors: Zongchen Chen, Dan Mikulincer, Daniel Reichman, Alexander S. Wein

Abstract: The Metropolis process (MP) and Simulated Annealing (SA) are stochastic local search heuristics that are often used in solving combinatorial optimization problems. Despite significant interest, there are very few theoretical results regarding the quality of approximation obtained by MP and SA (with polynomially many iterations) for NP-hard optimization problems. We provide rigorous lower bounds for MP and SA with respect to the classical maximum independent set problem when the algorithms are initialized from the empty set. We establish the existence of a family of graphs for which both MP and SA fail to find approximate solutions in polynomial time. More specifically, we show that for any $\varepsilon \in (0,1)$ there are $n$-vertex graphs for which the probability SA (when limited to polynomially many iterations) will approximate the optimal solution within ratio $Ω\left(\frac{1}{n^{1-\varepsilon}}\right)$ is exponentially small. Our lower bounds extend to graphs of constant average degree $d$, illustrating the failure of MP to achieve an approximation ratio of $Ω\left(\frac{\log (d)}{d}\right)$ in polynomial time. In some cases, our impossibility results also go beyond Simulated Annealing and apply even when the temperature is chosen adaptively. Finally, we prove time lower bounds when the inputs to these algorithms are bipartite graphs, and even trees, which are known to admit polynomial-time algorithms for the independent set problem.

Submitted 20 December, 2023; originally announced December 2023.

Comments: 44 pages

arXiv:2305.03786 [pdf, ps, other]

Transportation onto log-Lipschitz perturbations

Authors: Max Fathi, Dan Mikulincer, Yair Shenfeld

Abstract: We establish sufficient conditions for the existence of globally Lipschitz transport maps between probability measures and their log-Lipschitz perturbations, with dimension-free bounds. Our results include Gaussian measures on Euclidean spaces and uniform measures on spheres as source measures. More generally, we prove results for source measures on manifolds satisfying strong curvature assumptions. These seem to be the first examples of dimension-free Lipschitz transport maps in non-Euclidean settings, which are moreover sharp on the sphere. We also present some applications to functional inequalities, including a new dimension-free Gaussian isoperimetric inequality for log-Lipschitz perturbations of the standard Gaussian measure. Our proofs are based on the Langevin flow construction of transport maps of Kim and Milman.

Submitted 11 December, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

Comments: 20 pages. Minor changes

MSC Class: 39B62 (Primary) 49Q22; 49R05; 35P15 (Secondary)

arXiv:2211.12301 [pdf, ps, other]

Is this correct? Let's check!

Authors: Omri Ben-Eliezer, Dan Mikulincer, Elchanan Mossel, Madhu Sudan

Abstract: Societal accumulation of knowledge is a complex process. The correctness of new units of knowledge depends not only on the correctness of new reasoning, but also on the correctness of old units that the new one builds on. The errors in such accumulation processes are often remedied by error correction and detection heuristics. Motivating examples include the scientific process based on scientific publications, and software development based on libraries of code. Natural processes that aim to keep errors under control, such as peer review in scientific publications, and testing and debugging in software development, would typically check existing pieces of knowledge -- both for the reasoning that generated them and the previous facts they rely on. In this work, we present a simple process that models such accumulation of knowledge and study the persistence (or lack thereof) of errors. We consider a simple probabilistic model for the generation of new units of knowledge based on the preferential attachment growth model, which additionally allows for errors. Furthermore, the process includes checks aimed at catching these errors. We investigate when effects of errors persist forever in the system (with positive probability) and when they get rooted out completely by the checking process. The two basic parameters associated with the checking process are the probability of conducting a check and the depth of the check. We show that errors are rooted out if checks are sufficiently frequent and sufficiently deep. In contrast, shallow or infrequent checks are insufficient to root out errors.

Submitted 17 June, 2024; v1 submitted 22 November, 2022; originally announced November 2022.

Comments: 30 pages, minor changes and fixes

arXiv:2208.07438 [pdf, other]

Archimedes Meets Privacy: On Privately Estimating Quantiles in High Dimensions Under Minimal Assumptions

Authors: Omri Ben-Eliezer, Dan Mikulincer, Ilias Zadik

Abstract: The last few years have seen a surge of work on high dimensional statistics under privacy constraints, mostly following two main lines of work: the ``worst case'' line, which does not make any distributional assumptions on the input data; and the ``strong assumptions'' line, which assumes that the data is generated from specific families, e.g., subgaussian distributions. In this work we take a middle ground, obtaining new differentially private algorithms with polynomial sample complexity for estimating quantiles in high-dimensions, as well as estimating and sampling points of high Tukey depth, all working under very mild distributional assumptions. From the technical perspective, our work relies upon deep robustness results in the convex geometry literature, demonstrating how such results can be used in a private context. Our main object of interest is the (convex) floating body (FB), a notion going back to Archimedes, which is a robust and well studied high-dimensional analogue of the interquantile range. We show how one can privately, and with polynomially many samples, (a) output an approximate interior point of the FB -- e.g., ``a typical user'' in a high-dimensional database -- by leveraging the robustness of the Steiner point of the FB; and at the expense of polynomially many more samples, (b) produce an approximate uniform sample from the FB, by constructing a private noisy projection oracle.

Submitted 15 August, 2022; originally announced August 2022.

Comments: 38 pages, 1 figure

arXiv:2208.06508 [pdf, ps, other]

Noise stability on the Boolean hypercube via a renormalized Brownian motion

Authors: Ronen Eldan, Dan Mikulincer, Prasad Raghavendra

Abstract: We consider a variant of the classical notion of noise on the Boolean hypercube which gives rise to a new approach to inequalities regarding noise stability. We use this approach to give a new proof of the Majority is Stablest theorem by Mossel, O'Donnell, and Oleszkiewicz, improving the dependence of the bound on the maximal influence of the function from logarithmic to polynomial. We also show that a variant of the conjecture by Courtade and Kumar regarding the most informative Boolean function, where the classical noise is replaced by our notion, holds true. Our approach is based on a stochastic construction that we call the renormalized Brownian motion, which facilitates the use of inequalities in Gaussian space in the analysis of Boolean functions.

Submitted 12 August, 2022; originally announced August 2022.

Comments: 21 pages

arXiv:2207.05275 [pdf, other]

Size and depth of monotone neural networks: interpolation and approximation

Authors: Dan Mikulincer, Daniel Reichman

Abstract: We study monotone neural networks with threshold gates where all the weights (other than the biases) are non-negative. We focus on the expressive power and efficiency of representation of such networks. Our first result establishes that every monotone function over $[0,1]^d$ can be approximated within arbitrarily small additive error by a depth-4 monotone network. When $d > 3$, we improve upon the previous best-known construction which has depth $d+1$. Our proof goes by solving the monotone interpolation problem for monotone datasets using a depth-4 monotone threshold network. In our second main result we compare size bounds between monotone and arbitrary neural networks with threshold gates. We find that there are monotone real functions that can be computed efficiently by networks with no restriction on the gates whereas monotone networks approximating these functions need exponential size in the dimension.

Submitted 29 April, 2024; v1 submitted 11 July, 2022; originally announced July 2022.

Comments: 25 pages, 1 Figure; improved inapproximability results and added general reduction results

arXiv:2203.11863 [pdf, other]

doi 10.1137/1.9781611977554

Integrality Gaps for Random Integer Programs via Discrepancy

Authors: Sander Borst, Daniel Dadush, Dan Mikulincer

Abstract: We prove new bounds on the additive gap between the value of a random integer program $\max c^Tx,\ Ax\leq b,\ x\in\{0,1\}^n$ with $m$ constraints and that of its linear programming relaxation for a wide range of distributions on $(A,b,c)$. We are motivated by the work of Dey, Dubey, and Molinaro (SODA '21), who gave a framework for relating the size of Branch-and-Bound (B&B) trees to additive integrality gaps. Dyer and Frieze (MOR '89) and Borst et al. (Mathematical Programming '22), respectively, showed that for certain random packing and Gaussian IPs, where the entries of $A,c$ are independently distributed according to either the uniform distribution on $[0,1]$ or the Gaussian distribution $\mathcal{N}(0,1)$, the integrality gap is bounded by $O_m(\log^2 n / n)$ with probability at least $1-1/n-e^{-Ω_m(1)}$. In this paper, we generalize these results to the case where the entries of $A$ are uniformly distributed on an integer interval (e.g., entries in $\{-1,0,1\}$), and where the columns of $A$ are distributed according to an isotropic logconcave distribution. Second, we substantially improve the success probability to $1-1/poly(n)$, compared to constant probability in prior works (depending on $m$). Leveraging the connection to Branch-and-Bound, our gap results imply that for these IPs B&B trees have size $n^{poly(m)}$ with high probability (i.e., polynomial for fixed $m$), which significantly extends the class of IPs for which B&B is known to be polynomial. Our main technical contribution is a new linear discrepancy theorem for random matrices. Our theorem gives general conditions under which a target vector is equal to or very close to a $\{0,1\}$ combination of the columns of a random matrix $A$. The proof uses a Fourier analytic approach, building on work of Hoberg and Rothvoss (SODA '19) and Franks and Saks (RSA '20).

Submitted 11 April, 2023; v1 submitted 22 March, 2022; originally announced March 2022.

arXiv:2201.01382 [pdf, ps, other]

On the Lipschitz properties of transportation along heat flows

Authors: Dan Mikulincer, Yair Shenfeld

Abstract: We prove new Lipschitz properties for transport maps along heat flows, constructed by Kim and Milman. For (semi)-log-concave measures and Gaussian mixtures, our bounds have several applications: eigenvalues comparisons, dimensional functional inequalities, and domination of distribution functions.

Submitted 30 June, 2022; v1 submitted 4 January, 2022; originally announced January 2022.

Comments: 18 pages. Minor fixes

MSC Class: 39B62 (Primary) 49Q22; 49R05; 35P15 (Secondary)

arXiv:2111.11521 [pdf, ps, other]

The Brownian transport map

Authors: Dan Mikulincer, Yair Shenfeld

Abstract: Contraction properties of transport maps between probability measures play an important role in the theory of functional inequalities. The actual construction of such maps, however, is a non-trivial task and, so far, relies mostly on the theory of optimal transport. In this work, we take advantage of the infinite-dimensional nature of the Gaussian measure and construct a new transport map, based on the Föllmer process, which pushes forward the Wiener measure onto probability measures on Euclidean spaces. Utilizing the tools of the Malliavin and stochastic calculus in Wiener space, we show that this Brownian transport map is a contraction in various settings where the analogous questions for optimal transport maps are open. The contraction properties of the Brownian transport map enable us to prove functional inequalities in Euclidean spaces, which are either completely new or improve on current results. Further and related applications of our contraction results are the existence of Stein kernels with desirable properties (which lead to new central limit theorems), as well as new insights into the Kannan--Lovász--Simonovits conjecture. We go beyond the Euclidean setting and address the problem of contractions on the Wiener space itself. We show that optimal transport maps and causal optimal transport maps (which are related to Brownian transport maps) between the Wiener measure and other target measures on Wiener space exhibit very different behaviors.

Submitted 26 April, 2024; v1 submitted 22 November, 2021; originally announced November 2021.

Comments: 49 pages, some edits to the introduction and minor fixes

MSC Class: 49Q22 (Primary) 39B62; 60B11 (Secondary)

arXiv:2108.04268 [pdf, ps, other]

Anti-concentration of polynomials: dimension-free covariance bounds and decay of Fourier coefficients

Authors: Itay Glazer, Dan Mikulincer

Abstract: We study random variables of the form $f(X)$, when $f$ is a degree $d$ polynomial, and $X$ is a random vector on $\mathbb{R}^{n}$, motivated towards a deeper understanding of the covariance structure of $X^{\otimes d}$. For applications, the main interest is to bound $\mathrm{Var}(f(X))$ from below, assuming a suitable normalization on the coefficients of $f$. Our first result applies when $X$ has independent coordinates, and we establish dimension-free bounds. We also show that the assumption of independence can be relaxed and that our bounds carry over to uniform measures on isotropic $L_{p}$ balls. Moreover, in the case of the Euclidean ball, we provide an orthogonal decomposition of $\mathrm{Cov}(X^{\otimes d})$. Finally, we utilize the connection between anti-concentration and decay of Fourier coefficients to prove a high-dimensional analogue of the van der Corput lemma, thus partially answering a question posed by Carbery and Wright.

Submitted 14 July, 2022; v1 submitted 9 August, 2021; originally announced August 2021.

Comments: 28 pages. Lemma 8 is updated in Version 2

MSC Class: 52A23 (Primary) 46B07; 46M05; 42B10 (Secondary)

arXiv:2102.08668 [pdf, ps, other]

Non-asymptotic approximations of neural networks by Gaussian processes

Authors: Ronen Eldan, Dan Mikulincer, Tselil Schramm

Abstract: We study the extent to which wide neural networks may be approximated by Gaussian processes when initialized with random weights. It is a well-established fact that as the width of a network goes to infinity, its law converges to that of a Gaussian process. We make this quantitative by establishing explicit convergence rates for the central limit theorem in an infinite-dimensional functional space, metrized with a natural transportation distance. We identify two regimes of interest; when the activation function is polynomial, its degree determines the rate of convergence, while for non-polynomial activations, the rate is governed by the smoothness of the function.

Submitted 17 February, 2021; originally announced February 2021.

Comments: 18 pages

arXiv:2010.14178 [pdf, ps, other]

Stability estimates for invariant measures of diffusion processes, with applications to stability of moment measures and Stein kernels

Authors: Max Fathi, Dan Mikulincer

Abstract: We investigate stability of invariant measures of diffusion processes with respect to $L^p$ distances on the coefficients, under an assumption of log-concavity. The method is a variant of a technique introduced by Crippa and De Lellis to study transport equations. As an application, we prove a partial extension of an inequality of Ledoux, Nourdin and Peccati relating transport distances and Stein discrepancies to a non-Gaussian setting via the moment map construction of Stein kernels.

Submitted 27 October, 2020; originally announced October 2020.

Comments: 29 pages, comments are welcome

arXiv:2006.15574 [pdf, ps, other]

doi 10.1017/S0963548322000098

Community detection and percolation of information in a geometric setting

Authors: Ronen Eldan, Dan Mikulincer, Hester Pieters

Abstract: We make the first steps towards generalizing the theory of stochastic block models, in the sparse regime, towards a model where the discrete community structure is replaced by an underlying geometry. We consider a geometric random graph over a homogeneous metric space where the probability of two vertices to be connected is an arbitrary function of the distance. We give sufficient conditions under which the locations can be recovered (up to an isomorphism of the space) in the sparse regime. Moreover, we define a geometric counterpart of the model of flow of information on trees, due to Mossel and Peres, in which one considers a branching random walk on a sphere and the goal is to recover the location of the root based on the locations of leaves. We give some sufficient conditions for percolation and for non-percolation of information in this model.

Submitted 1 July, 2022; v1 submitted 28 June, 2020; originally announced June 2020.

Comments: 23 pages. Changes to Lemma 16

arXiv:2006.02855 [pdf, ps, other]

Network size and weights size for memorization with two-layers neural networks

Authors: Sébastien Bubeck, Ronen Eldan, Yin Tat Lee, Dan Mikulincer

Abstract: In 1988, Eric B. Baum showed that two-layers neural networks with threshold activation function can perfectly memorize the binary labels of $n$ points in general position in $\mathbb{R}^d$ using only $\ulcorner n/d \urcorner$ neurons. We observe that with ReLU networks, using four times as many neurons one can fit arbitrary real labels. Moreover, for approximate memorization up to error $ε$, the neural tangent kernel can also memorize with only $O\left(\frac{n}{d} \cdot \log(1/ε) \right)$ neurons (assuming that the data is well dispersed too). We show however that these constructions give rise to networks where the magnitude of the neurons' weights are far from optimal. In contrast we propose a new training procedure for ReLU networks, based on complex (as opposed to real) recombination of the neurons, for which we show approximate memorization with both $O\left(\frac{n}{d} \cdot \frac{\log(1/ε)}ε\right)$ neurons, as well as nearly-optimal size of the weights.

Submitted 3 November, 2020; v1 submitted 4 June, 2020; originally announced June 2020.

Comments: 27 pages

arXiv:2002.10846 [pdf, ps, other]

A CLT in Stein's distance for generalized Wishart matrices and higher order tensors

Authors: Dan Mikulincer

Abstract: We study the central limit theorem for sums of independent tensor powers, $\frac{1}{\sqrt{d}}\sum\limits_{i=1}^d X_i^{\otimes p}$. We focus on the high-dimensional regime where $X_i \in \mathbb{R}^n$ and $n$ may scale with $d$. Our main result is a proposed threshold for convergence. Specifically, we show that, under some regularity assumption, if $n^{2p-1}\gg d$, then the normalized sum converges to a Gaussian. The results apply, among others, to symmetric uniform log-concave measures and to product measures. This generalizes several results found in the literature. Our main technique is a novel application of optimal transport to Stein's method which accounts for the low dimensional structure which is inherent in $X_i^{\otimes p}$.

Submitted 4 November, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

Comments: 22 pages. Added a section about non-homogenous sums and correlated Wishart matrices

arXiv:2001.02968 [pdf, other]

How to trap a gradient flow

Authors: Sébastien Bubeck, Dan Mikulincer

Abstract: We consider the problem of finding an $\varepsilon$-approximate stationary point of a smooth function on a compact domain of $\mathbb{R}^d$. In contrast with dimension-free approaches such as gradient descent, we focus here on the case where $d$ is finite, and potentially small. This viewpoint was explored in 1993 by Vavasis, who proposed an algorithm which, for any fixed finite dimension $d$, improves upon the $O(1/\varepsilon^2)$ oracle complexity of gradient descent. For example for $d=2$, Vavasis' approach obtains the complexity $O(1/\varepsilon)$. Moreover for $d=2$ he also proved a lower bound of $Ω(1/\sqrt{\varepsilon})$ for deterministic algorithms (we extend this result to randomized algorithms). Our main contribution is an algorithm, which we call gradient flow trapping (GFT), and the analysis of its oracle complexity. In dimension $d=2$, GFT closes the gap with Vavasis' lower bound (up to a logarithmic factor), as we show that it has complexity $O\left(\sqrt{\frac{\log(1/\varepsilon)}{\varepsilon}}\right)$. In dimension $d=3$, we show a complexity of $O\left(\frac{\log(1/\varepsilon)}{\varepsilon}\right)$, improving upon Vavasis' $O\left(1 / \varepsilon^{1.2} \right)$. In higher dimensions, GFT has the remarkable property of being a logarithmic parallel depth strategy, in stark contrast with the polynomial depth of gradient descent or Vavasis' algorithm. In this higher dimensional regime, the total work of GFT improves quadratically upon the only other known polylogarithmic depth strategy for this problem, namely naive grid search. We augment this result with another algorithm, named cut and flow (CF), which improves upon Vavasis' algorithm in any fixed dimension.

Submitted 30 December, 2020; v1 submitted 9 January, 2020; originally announced January 2020.

Comments: 25 pages, 5 figures. Added an improved algorithm for dimensions > 3

arXiv:1906.05904 [pdf, ps, other]

Stability of Talagrand's Gaussian transport-entropy inequality via the Föllmer process

Authors: Dan Mikulincer

Abstract: We establish a dimension-free improvement of Talagrand's Gaussian transport-entropy inequality, under the assumption that the measures satisfy a Poincaré inequality. We also study stability of the inequality, in terms of relative entropy, when restricted to measures whose covariance matrix trace is smaller than the ambient dimension. In case the covariance matrix is strictly smaller than the identity, we give dimension-free estimates which depend on its eigenvalues. To complement our results, we show that our conditions cannot be relaxed, and that there exist measures with covariance larger than the identity, for which the inequality is not stable, in relative entropy. To deal with these examples, we show that, without any assumptions, one can always get quantitative stability estimates in terms of relative entropy to Gaussian mixtures. Our approach gives rise to a new point of view which sheds light on the hierarchy between Fisher information, entropy, and transportation distance, and may be of independent interest. In particular, it implies that the described results apply verbatim to the log-Sobolev inequality and improve upon some known estimates in the literature.

Submitted 25 April, 2021; v1 submitted 13 June, 2019; originally announced June 2019.

Comments: 23 pages. Changed the formulation of Lemma 4

arXiv:1903.07140 [pdf, ps, other]

doi 10.1007/s00440-020-00967-w

Stability of the Shannon-Stam inequality via the Föllmer process

Authors: Ronen Eldan, Dan Mikulincer

Abstract: We prove stability estimates for the Shannon-Stam inequality (also known as the entropy-power inequality) for log-concave random vectors in terms of entropy and transportation distance. In particular, we give the first stability estimate for general log-concave random vectors in the following form: for log-concave random vectors $X,Y \in \mathbb{R}^d$, the deficit in the Shannon-Stam inequality is bounded from below by the expression $$ C \left(\mathrm{D}\left(X||G\right) + \mathrm{D}\left(Y||G\right)\right), $$ where $\mathrm{D}\left( \cdot ~ ||G\right)$ denotes the relative entropy with respect to the standard Gaussian and the constant $C$ depends only on the covariance structures and the spectral gaps of $X$ and $Y$. In the case of uniformly log-concave vectors our analysis gives dimension-free bounds. Our proofs are based on a new approach which uses an entropy-minimizing process from stochastic control theory.

Submitted 17 March, 2019; originally announced March 2019.

Comments: 24 pages

Journal ref: Probab. Theory Relat. Fields 177, 891-922 (2020)

arXiv:1806.09087 [pdf, ps, other]

The CLT in high dimensions: quantitative bounds via martingale embedding

Authors: Ronen Eldan, Dan Mikulincer, Alex Zhai

Abstract: We introduce a new method for obtaining quantitative convergence rates for the central limit theorem (CLT) in a high dimensional setting. Using our method, we obtain several new bounds for convergence in transportation distance and entropy, and in particular: (a) We improve the best known bound, obtained by the third named author, for convergence in quadratic Wasserstein transportation distance for bounded random vectors; (b) We derive the first non-asymptotic convergence rate for the entropic CLT in arbitrary dimension, for general log-concave random vectors; (c) We give an improved bound for convergence in transportation distance under a log-concavity assumption and improvements for both metrics under the assumption of strong log-concavity. Our method is based on martingale embeddings and specifically on the Skorokhod embedding constructed by the first named author.

Submitted 7 September, 2020; v1 submitted 24 June, 2018; originally announced June 2018.

Comments: 37 pages

arXiv:1609.02490 [pdf, ps, other]

Information and dimensionality of anisotropic random geometric graphs

Authors: Ronen Eldan, Dan Mikulincer

Abstract: This paper deals with the problem of detecting non-isotropic high-dimensional geometric structure in random graphs. Namely, we study a model of a random geometric graph in which vertices correspond to points generated randomly and independently from a non-isotropic $d$-dimensional Gaussian distribution, and two vertices are connected if the distance between them is smaller than some pre-specified threshold. We derive new notions of dimensionality which depend upon the eigenvalues of the covariance of the Gaussian distribution. If $α$ denotes the vector of eigenvalues, and $n$ is the number of vertices, then the quantities $\left(\frac{||α||_2}{||α||_3}\right)^6/n^3$ and $\left(\frac{||α||_2}{||α||_4}\right)^4/n^3$ determine upper and lower bounds for the possibility of detection. This generalizes a recent result by Bubeck, Ding, Rácz and the first named author from [BDER14] which shows that the quantity $d/n^3$ determines the boundary of detection for isotropic geometry. Our methods involve Fourier analysis and the theory of characteristic functions to investigate the underlying probabilities of the model. The proof of the lower bound uses information theoretic tools, based on the method presented in [BG15].

Submitted 23 February, 2020; v1 submitted 8 September, 2016; originally announced September 2016.

Comments: 38 pages

Showing 1–21 of 21 results for author: Mikulincer, D