Shifted Composition II: Shift Harnack Inequalities
and Curvature Upper Bounds

Jason M. Altschuler
UPenn
[email protected]

Sinho Chewi
IAS
[email protected]

Abstract

We apply the shifted composition rule—an information-theoretic principle introduced in our earlier work [scr1]—to establish shift Harnack inequalities for the Langevin diffusion. We obtain sharp constants for these inequalities for the first time, allowing us to investigate their relationship with other properties of the diffusion. Namely, we show that they are equivalent to a sharp “local gradient-entropy” bound, and that they imply curvature upper bounds in a compelling reflection of the Bakry–Émery theory of curvature lower bounds. Finally, we show that the local gradient-entropy inequality implies optimal concentration of the score, a.k.a. the logarithmic gradient of the density.

1 Introduction

This paper is a sequel to our earlier work [scr1], in which we introduced an information-theoretic principle called the shifted composition rule. Briefly, this principle allows for bounding information-theoretic divergences, such as the Kullback–Leibler (KL) or Rényi divergence, between the marginal laws of two stochastic processes through the introduction of a third, auxiliary process. In that paper, we applied the shifted composition rule to provide, among other results, information-theoretic proofs of F.-Y. Wang’s celebrated dimension-free Harnack inequalities for diffusions [Wang1997LSINoncompact] via their dual formulation as a certain family of reverse transport inequalities.

The Harnack inequalities therein encode regularity for Kolmogorov’s backward equation. The aim of the present paper is to apply our information-theoretic framework to the dual problem of regularity for Kolmogorov’s forward equation, i.e., the Fokker–Planck equation. To describe the problem setting and our results, we introduce some basic concepts.

Harnack and shift Harnack inequalities.

For concreteness, let $V:\mathbb{R}\to\mathbb{R}$ be a smooth potential and consider the Langevin diffusion

$$\mathrm{d}X_{t}=-\nabla V(X_{t})\,\mathrm{d}t+\sqrt{2}\,\mathrm{d}B_{t}\,, \tag{1.1}$$

where ${(B_{t})}_{t\geqslant 0}$ is a standard Brownian motion on $\mathbb{R}$ . Following the storied tradition of functional analysis, we study the regularity of this process through its Markov semigroup ${(P_{t})}_{t\geqslant 0}$ , which acts on any (reasonable) function $f:\mathbb{R}\to\mathbb{R}$ via $P_{t}f(x)\coloneqq\mathbb{E}[f(X_{t})\mid X_{0}=x]$ . Classically, under a curvature lower bound of the form $\nabla^{2}V\succeq\alpha I$ , $\alpha\in\mathbb{R}$ (also called the curvature-dimension condition, or the Bakry–Émery criterion), one obtains Harnack inequalities of the form

$${P_{t}f(x)}^{p}\leqslant C(p,t,x,y)\,P_{t}(f^{p})(y)\,,\qquad\text{for all}~x,y\in\mathbb{R}\,,~t>0\,,~\text{and}~f:\mathbb{R}\to\mathbb{R}_{\geqslant 0}\,. \tag{1.2}$$

Here, $p>1$ and $C(p,t,x,y)>0$ is an appropriate constant. Such inequalities witness the regularizing effect of the semigroup: they imply that for every $t>0$ , the semigroup $P_{t}$ maps bounded functions into differentiable ones. If the semigroup is described by transition densities $(t,x,y)\mapsto p_{t}(x,y)$ , then this amounts to regularity properties with respect to the backward variable $x$ , i.e., with respect to perturbations to the initial condition of the diffusion. Moreover, the sharp Harnack inequalities imply back the curvature lower bound $\nabla^{2}V\succeq\alpha I$ ; see [Wang04EquivHarnack, Wang10HarnackBoundary, bakry2014analysis, Wang14Diffusion] or [scr1, §6] for further discussion.
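The semigroup $P_{t}f(x)=\mathbb{E}[f(X_{t})\mid X_{0}=x]$ appearing in (1.2) can be approximated by simulation. The following is a minimal sketch, not the method used in this paper: it discretizes (1.1) with the Euler–Maruyama scheme and averages over independent paths. The quadratic potential $V(x)=\alpha x^{2}/2$, the function names, and the step and sample counts are illustrative assumptions.

```python
import math
import random


def grad_V(x, alpha=1.0):
    """Gradient of the illustrative potential V(x) = alpha * x**2 / 2 (an assumption)."""
    return alpha * x


def langevin_endpoint(x0, t, n_steps, rng):
    """One Euler-Maruyama sample of X_t for dX = -V'(X) dt + sqrt(2) dB, X_0 = x0."""
    h = t / n_steps
    x = x0
    for _ in range(n_steps):
        # Drift step plus Gaussian increment with variance 2h.
        x += -grad_V(x) * h + math.sqrt(2.0 * h) * rng.gauss(0.0, 1.0)
    return x


def semigroup_mc(f, x0, t, n_paths=20000, n_steps=100, seed=0):
    """Monte Carlo estimate of P_t f(x0) = E[f(X_t) | X_0 = x0]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += f(langevin_endpoint(x0, t, n_steps, rng))
    return total / n_paths


if __name__ == "__main__":
    # With alpha = 1, X_t | X_0 = x is Gaussian with mean exp(-t) x and
    # variance 1 - exp(-2t), so for f(y) = y**2 the exact value is
    # exp(-2t) x**2 + 1 - exp(-2t), which equals 1 when x = 1.
    print(semigroup_mc(lambda y: y * y, x0=1.0, t=1.0))
```

The estimate carries both a Monte Carlo error and a discretization bias of order the step size, so it should only be compared with the closed form up to a loose tolerance.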

To address the regularity of Kolmogorov’s forward equation, the seminal work of F.-Y. Wang in [Wang14ShiftHarnack] introduced the family of shift Harnack inequalities, written

$${P_{t}(f(\cdot+v))}^{p}\leqslant C(p,t,v)\,P_{t}(f^{p})\,,\qquad\text{for all}~v\in\mathbb{R}\,,~t>0\,,~\text{and}~f:\mathbb{R}\to\mathbb{R}_{\geqslant 0}\,. \tag{1.3}$$

Shift Harnack inequalities imply, for example, the existence of the transition densities ${(p_{t})}_{t\geqslant 0}$ with respect to the Lebesgue measure, and they also entail gradient bounds for the Lebesgue density $p_{t}(x,y)$ with respect to the forward variable $y$ (a.k.a. the terminal condition for the diffusion). See §LABEL:sec:shift_harnack_bg for further background and discussion. Note that the Harnack (1.2) and shift Harnack (1.3) inequalities differ because the Langevin semigroup does not commute with convolutions.
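As a numerical sanity check of (1.3), the sketch below evaluates both sides for the one-dimensional Ornstein–Uhlenbeck semigroup, i.e., $V(x)=\beta x^{2}/2$, for which $X_{t}\mid X_{0}=x$ is Gaussian with mean $e^{-\beta t}x$ and variance $(1-e^{-2\beta t})/\beta$. The constant used is the one stated in $(\mathsf{SH}_{p})$ of Theorem 1.1. The choice of potential, the function names, the trapezoidal quadrature, and the test functions are our own illustrative assumptions and play no role in the proofs.

```python
import math


def ou_semigroup(f, x, t, beta, n=4001, zmax=12.0):
    """Approximate P_t f(x) for dX = -beta X dt + sqrt(2) dB (1-D OU, an assumed example).

    Uses X_t | X_0 = x ~ N(m, s^2) and a trapezoidal rule in the
    standardized variable z over [-zmax, zmax].
    """
    m = math.exp(-beta * t) * x
    s = math.sqrt((1.0 - math.exp(-2.0 * beta * t)) / beta)
    h = 2.0 * zmax / (n - 1)
    total = 0.0
    for i in range(n):
        z = -zmax + i * h
        w = 0.5 if i in (0, n - 1) else 1.0  # trapezoid endpoint weights
        total += w * f(m + s * z) * math.exp(-0.5 * z * z)
    return total * h / math.sqrt(2.0 * math.pi)


def shift_harnack_ratio(f, x, v, t, beta, p):
    """Return (P_t f(. + v)(x))^p / (C * P_t(f^p)(x)) with C taken from (SH_p)."""
    lhs = ou_semigroup(lambda y: f(y + v), x, t, beta) ** p
    log_c = beta * p * v * v / (2.0 * (p - 1.0) * (1.0 - math.exp(-2.0 * beta * t)))
    rhs = math.exp(log_c) * ou_semigroup(lambda y: f(y) ** p, x, t, beta)
    return lhs / rhs


if __name__ == "__main__":
    x, v, t, beta, p = 0.3, 0.7, 0.5, 1.0, 2.0
    # A generic positive test function: the ratio should not exceed 1.
    print(shift_harnack_ratio(lambda y: 1.0 / (1.0 + y * y), x, v, t, beta, p))
    # Exponential test function with the exponent that saturates Hoelder's inequality.
    s2 = (1.0 - math.exp(-2.0 * beta * t)) / beta
    c = v / ((p - 1.0) * s2)
    print(shift_harnack_ratio(lambda y: math.exp(c * y), x, v, t, beta, p))
```

In this Gaussian example, the exponential test function $f(y)=\exp(cy)$ with $c=v/((p-1)\sigma_{t}^{2})$ and $\sigma_{t}^{2}=(1-e^{-2\beta t})/\beta$ makes Hölder's inequality tight, so the second ratio equals $1$ up to quadrature error, whereas other positive $f$ give a ratio below $1$.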

In his original paper [Wang14ShiftHarnack], F.-Y. Wang established a number of applications and equivalences for shift Harnack inequalities. (See also his monographs [Wang13HarnackSPDE, Wang14Diffusion] for comprehensive expositions and further applications.) Through the use of coupling arguments, he then established integration by parts formulas and shift Harnack inequalities for (degenerate) stochastic Hamiltonian systems and some SPDEs. Subsequent work extended his techniques to an abundance of settings, including SDEs driven by fractional Brownian motion [Fan15FBM], SDEs driven by jump processes [Wang16IBPJumps], other examples of SPDEs [Zhang16ShiftHarnack, LvHua21HarnackSPDEs], McKean–Vlasov equations [Wang18Landau, HuaWan19DistDepSingular], and SDEs with irregular coefficients [Huang19HarnackIntegrable, HuaWan19DistDepSingular, LvHua21HarnackSPDEs].

Our starting point is the equivalent reformulation of the shift Harnack inequality (1.3), via Hölder duality, in the form of a reverse transport inequality:

$$\mathsf{R}_{q}(\delta_{x}P_{t}*\delta_{v}\mathbin{\|}\delta_{x}P_{t})\leqslant C^{\prime}(p,t,v)\,,\qquad\text{for all}~x\in\mathbb{R}\,,~t>0\,, \tag{1.4}$$

see §LABEL:ssec:prelim:duality. Here, $\mathsf{R}_{q}$ is the Rényi divergence of order $q\coloneqq\tfrac{p}{p-1}$ (see §LABEL:ssec:prelim:info for information-theoretic preliminaries). We apply the shifted composition rule to provide information-theoretic arguments, in discrete time, to establish (1.4) and hence (1.3). In this respect, our development parallels our earlier work [scr1] at least at a syntactic level, and we organize our paper accordingly to emphasize these similarities. However, despite the syntactic similarity, the problem of forward regularity is substantially different from that of backward regularity, and the former is far less well-understood.
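To see what (1.4) looks like in a case that can be computed by hand, consider the Ornstein–Uhlenbeck example $V(x)=\tfrac{\beta}{2}x^{2}$ with $\beta>0$. This choice of potential is an illustrative assumption, and we use the standard definition $\mathsf{R}_{q}(\mu\mathbin{\|}\nu)=\tfrac{1}{q-1}\log\int\bigl(\tfrac{\mathrm{d}\mu}{\mathrm{d}\nu}\bigr)^{q}\,\mathrm{d}\nu$. Here $\delta_{x}P_{t}$ is Gaussian, and convolving with $\delta_{v}$ moves only its mean:

$$\delta_{x}P_{t}=\mathcal{N}\bigl(e^{-\beta t}x,\,\sigma_{t}^{2}\bigr)\,,\qquad\delta_{x}P_{t}*\delta_{v}=\mathcal{N}\bigl(e^{-\beta t}x+v,\,\sigma_{t}^{2}\bigr)\,,\qquad\sigma_{t}^{2}\coloneqq\frac{1-\exp(-2\beta t)}{\beta}\,.$$

Since the Rényi divergence of order $q$ between two Gaussians with common variance $\sigma^{2}$ and means differing by $v$ is $q\,\lVert v\rVert^{2}/(2\sigma^{2})$, we obtain

$$\mathsf{R}_{q}(\delta_{x}P_{t}*\delta_{v}\mathbin{\|}\delta_{x}P_{t})=\frac{q\,\lVert v\rVert^{2}}{2\sigma_{t}^{2}}=\frac{\beta q\,\lVert v\rVert^{2}}{2\,(1-\exp(-2\beta t))}\,.$$

In this example, the left-hand side of (1.4) does not depend on $x$, remains bounded as $t\to\infty$ (with limit $\beta q\,\lVert v\rVert^{2}/2$), and coincides with the right-hand side of $(\mathsf{SRT}_{q})$ in Theorem 1.1, which is consistent with the constant there being sharp.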

Relationship with curvature upper bounds.

Indeed, whereas F.-Y. Wang’s original Harnack inequalities and the celebrated web of equivalences around the curvature-dimension condition have been well-understood for at least two decades, the understanding of shift Harnack inequalities is relatively nascent (see, for example, [Wang14ShiftHarnack] for a discussion of the greater challenges faced in the forward regularity setting). A significant point of departure is that the optimal constants $C(p,t,v)$ in (1.3) are not sharply characterized. In fact, the bounds established in the literature involve constants $C(p,t,v)$ which diverge to infinity as $t\to\infty$ . Consequently, such bounds do not yield meaningful information about the regularity of the stationary distribution.

One of the primary contributions of this paper is to prove shift Harnack inequalities with optimal constants. With these sharp inequalities in hand, we are then in a position to investigate the possibility of developing equivalences for the shift Harnack inequalities. Towards this end, we prove the following chain of implications.

Theorem 1.1.

Let ${(P_{t})}_{t\geqslant 0}$ denote the Markov semigroup corresponding to the Langevin diffusion with potential $V$ . Let $\beta>0$ and $p,q>1$ . Consider the following properties.

$(\mathsf{CurvBdd})$

The two-sided curvature bound $-\beta I\preceq\nabla^{2}V\preceq\beta I$ holds.

$(\mathsf{LGE})$

The local gradient-entropy bound

$$\frac{\lVert{P_{t}\nabla f}\rVert^{2}}{P_{t}f}\leqslant\frac{2\beta}{1-\exp(-2\beta t)}\,\bigl\{P_{t}(f\log f)-P_{t}f\log P_{t}f\bigr\}$$

holds for all $t>0$ and all smooth $f:\mathbb{R}\to\mathbb{R}_{>0}$ .

$(\mathsf{SH}_{p})$

The shift Harnack inequality

$${P_{t}(f(\cdot+v))}^{p}\leqslant\exp\Bigl(\frac{\beta p\,\lVert{v}\rVert^{2}}{2\,(p-1)\,(1-\exp(-2\beta t))}\Bigr)\,P_{t}(f^{p})$$

holds for all $v\in\mathbb{R}$ , all $t>0$ , and all $f:\mathbb{R}\to\mathbb{R}_{>0}$ .

$(\mathsf{SRT}_{q})$

The shift reverse transport inequality

$$\mathsf{R}_{q}(\delta_{x}P_{t}*\delta_{v}\mathbin{\|}\delta_{x}P_{t})\leqslant\frac{\beta q\,\lVert{v}\rVert^{2}}{2\,(1-\exp(-2\beta t))}$$

holds for all $x,v\in\mathbb{R}$ and all $t>0$ .

$(\mathsf{SH}_{\rm log})$

The shift log-Harnack inequality

$$P_{t}(f(\cdot+v))\leqslant\log P_{t}(\exp f)+\frac{\beta\,\lVert{v}\rVert^{2}}{2\,(1-\exp(-2\beta t))}$$