Regularized estimation of Monge-Kantorovich quantiles for spherical data

Bernard Bercu, Jérémie Bigot and Gauthier Thurin
Université de Bordeaux
Institut de Mathématiques de Bordeaux et CNRS (UMR 5251)
Abstract.

Tools from optimal transport (OT) theory have recently been used to define a notion of quantile function for directional data. In practice, regularization is mandatory for applications that require out-of-sample estimates. To this end, we introduce a regularized estimator built from entropic optimal transport, by extending the definition of the entropic map to the spherical setting. We propose a stochastic algorithm to directly solve a continuous OT problem between the uniform distribution and a target distribution, by expanding Kantorovich potentials in the basis of spherical harmonics. In addition, we define the directional Monge-Kantorovich depth, a companion concept for OT-based quantiles. We show that it benefits from desirable properties related to Liu-Zuo-Serfling axioms for the statistical analysis of directional data. Building on our regularized estimators, we illustrate the benefits of our methodology for data analysis.

The authors gratefully acknowledge financial support from the Agence Nationale de la Recherche (MaSDOL grant ANR-19-CE23-0017).

Keywords: Spherical data; Directional statistics; Monge-Kantorovich quantiles; Entropic Optimal Transport; Spherical harmonics; Fast Fourier transform; Stochastic optimisation.

AMS classifications: 62H12, 62G20, 62L20.

1. Introduction

1.1. Quantiles for directional data using optimal transport

In various situations, data naturally correspond to directions that are modeled as observations belonging to the circle or the unit $d$-sphere $\mathbb{S}^{d-1}$ for $d \geq 2$. Such observations, referred to as directional data, can be found in various applications including wildfires [2], gene expressions [23], or cosmology [55], to name but a few. Directional statistics [54, 47, 48, 68] is the field that brings together the corresponding models, methods and applications for statistical inference.

In this paper, we focus on the concept of quantiles for directional data. For real random variables, the notion of quantile is a well-established statistical concept thanks to the canonical ordering of observations on the real line. Beyond the setting of distributions with rotational symmetry [46], the absence of a canonical ordering on $\mathbb{S}^{d-1}$ makes the definition of quantiles for directional data more involved.

A recent line of research in nonparametric statistics [12, 36, 37, 35] deals with the use of the theory of optimal transport (OT) to define Monge-Kantorovich (MK) quantiles for multivariate data, which are also referred to as center-outward quantiles. Desirable properties of MK quantiles include ancillarity, distribution-freeness of associated ranks, and consistency with the univariate setting [36], together with connections to Tukey’s celebrated notion of statistical depth [12] and to the Mahalanobis distance [34]. The concept of MK quantiles has also proven to be fruitful in many applications, including statistical testing [32, 39, 63, 76, 77], regression [10, 19], risk measurement [4], and Lorenz maps [24, 38].

This approach has recently been applied in [37] to obtain a new notion of center-outward quantiles for directional data. Starting from independent and identically distributed (i.i.d.) directional data $X_1,\cdots,X_n$ sampled from a target measure $\nu$, the main idea in [37] is to define an empirical quantile function $\mathbf{Q}_n$ as an OT map from $\mu_{\mathbb{S}^{d-1}}$, the uniform probability distribution on $\mathbb{S}^{d-1}$, towards the empirical measure $\widehat{\nu}_n=\frac{1}{n}\sum_{i=1}^{n}\delta_{X_i}$, with a transport cost equal to the squared Riemannian distance. In this manner, if $U$ denotes a random variable with distribution $\mu_{\mathbb{S}^{d-1}}$, the random vector $\mathbf{Q}_n(U)$ follows the empirical distribution of the observations, which is consistent with the standard univariate quantile function.
Many statistical properties of these directional quantiles and related notions of ranks, signs and MANOVA are then investigated in [37]. Various numerical experiments reported in [37] also illustrate the benefits of OT to compute relevant quantiles for directional data.

1.2. Main contributions

To compute the estimator $\mathbf{Q}_n$, it is proposed in [37] to first approximate the uniform measure $\mu_{\mathbb{S}^{d-1}}$ by a “regular” grid of $n$ points over the unit sphere $\mathbb{S}^{d-1}$. Then, the empirical quantile function is defined through the discrete OT problem between this $n$-point grid and $\widehat{\nu}_n$. However, the computational cost of finding a numerical solution to a discrete OT problem is known to scale cubically in the number of observations [18]. Moreover, being a matching between two discrete distributions, the resulting quantile function does not provide out-of-sample estimates, which are desirable in many statistical applications.

To circumvent these issues, we propose to adapt the stochastic algorithm developed in [7] for multivariate data in $\mathbb{R}^d$ that are not assumed to belong to the unit $d$-sphere. The method in [7] benefits from entropic regularization [16], which is well known for its computational advantages [18]. By expanding Kantorovich dual potentials in their series of Fourier coefficients, each iteration of the stochastic algorithm in [7] reduces to the use of the Fast Fourier Transform (FFT). On the sphere $\mathbb{S}^2$, dual potentials can instead be parameterized via spherical harmonic coefficients, spherical harmonics being the analog of the Fourier basis for square-integrable functions on $\mathbb{S}^2$. In this manner, using a sequence of random variables $X_1,\ldots,X_n$ sampled from a target distribution $\nu$ supported on $\mathbb{S}^2$, we construct a stochastic algorithm in the space of spherical harmonic coefficients. In practice, this algorithm depends on the choice of a grid of points in $\mathbb{S}^2$ of size $\mathcal{O}(p^2)$ to implement an FFT on $\mathbb{S}^2$ [84].
The computational cost at each iteration is thus of order $\mathcal{O}(p^2\log^2(p))$ [45].

Furthermore, we also derive an estimator of the quantile function for spherical data that is smoother than $\mathbf{Q}_n$. This regularized quantile function is a spherical counterpart of the entropic map [69], following classical results on entropic optimal transport in the Euclidean space $\mathbb{R}^d$.

Finally, we introduce the new notion of MK directional statistical depth, in accordance with the Euclidean MK depth [12]. We discuss its properties relative to the traditional Liu-Zuo-Serfling axioms for the statistical analysis of directional data. Moreover, we study statistical applications built from it, and provide a comparison with other estimators, to better highlight the potential of entropic regularization for spherical quantile estimation and out-of-sample estimation.

1.3. Organization of the paper

In Section 2, we discuss related works on alternative definitions and estimators of quantiles for directional data. The definitions of OT-based directional distribution and quantile functions on $\mathbb{S}^2$ are given in Section 3. Our approach to obtain regularized estimators of MK quantiles on $\mathbb{S}^2$ is detailed in Section 4. A study of the MK directional statistical depth is proposed in Section 5 and illustrated with simulated data. Numerical experiments are reported in Section 6 to highlight the benefits of entropic regularization. Concluding remarks and discussions on this work are proposed in Section 7. Finally, mathematical details and proofs are deferred to technical appendices at the end of the paper.

2. Related works

2.1. Directional quantiles and depth

Very much related to multivariate quantiles is the idea of statistical depth and the center-outward ordering of data, with a long history dating back to Tukey’s work [81]. The issue of picturing data, in Tukey’s terminology, has since gained so much attention that we cannot provide an exhaustive survey, and we rather refer to [50, 3, 75, 60]. The first notion of directional depth function was introduced in [78], followed by the work of [51] that developed three different approaches. The properties of the latter have been studied and applied for inference in [71, 1]. The required computational effort led the authors of [46] to build the angular Mahalanobis depth, and, in doing so, they provided the first concept of directional quantiles. Despite appealing properties, the obtained contours are constrained to be rotationally symmetric, motivating the elliptic counterpart from [41]. Still, the elliptic assumption is a strong one, as discussed in [37]. To address either this lack of adaptiveness or the computational burden of the previous references, distance-based depths were proposed in [66], even though they are not explicitly related to the notion of quantiles. These directional depth functions can be applied, for instance, in data analysis and inference [51, 78, 1, 44], classification [66, 51, 21, 22, 61, 44] or clustering [65]. In recent years, two concepts of multivariate quantiles have emerged in $\mathbb{R}^d$, namely the spatial quantiles [11] and the center-outward ones [12, 36], both gathering most of the commonly sought-after properties. More importantly here, these promising ideas have successfully been extended to directional data [44, 37], improving on the lack of adaptiveness of Mahalanobis quantiles [46], with desirable asymptotic results inherited from the formalism of quantiles.

In a nutshell, existing concepts of directional quantiles include Mahalanobis quantiles, spatial quantiles and Monge-Kantorovich ones. On the one hand, a statistical depth associated with a notion of quantiles can benefit from the best of both worlds, that is adaptivity to the underlying geometry and consistency of empirical versions, as argued for instance in [44]. On the other hand, in comparison with other directional quantiles, Monge-Kantorovich ones present an additional descriptive power inherited from the fact that $\mathbf{Q}(U)\sim\nu$. A direct consequence is that $\mathbf{Q}$ must contain all the available information, which is appealing for the purpose of summarizing unknown features of multivariate data. Moreover, these concepts provide a curvilinear coordinate system within the support of the distribution of interest $\nu$ [37], which is a promising way to render the available information.

2.2. Regularized center-outward quantiles

In [37], the estimation of directional quantiles amounts to a discrete OT problem, between a regular grid and the observed sample, in a similar fashion to what is done for Euclidean data in [12]. However, such an estimator is piecewise constant, restricted to taking its values in the set of observations. As in the Euclidean setting $\mathbb{R}^d$, this raises issues in a number of situations that require smoothness of the quantile function, for instance to compute volumes of quantile regions [4]. There, regularization naturally enters the picture, either after the estimation of unregularized OT [36, 4], or with entropic OT (EOT) [7] building on the entropic map [69]. To address the same issue in the present directional context, we extend the entropic map to the non-Euclidean setting, similarly to what is done in [17] for general costs in $\mathbb{R}^d$. To the best of our knowledge, this extension is new, and it benefits from an explicit formulation that is tractable in linear time, given the dual potentials solving EOT. This belongs to the line of work estimating OT maps on manifolds, see e.g. [28, 14, 67].

3. Directional distribution and quantile functions on the 2-sphere based on optimal transport

In this section, we introduce the main definitions of OT-based distribution and quantile functions for spherical data, beginning with notation related to spherical harmonics and differentiation on $\mathbb{S}^2$.

3.1. Context

The unit $2$-sphere is defined by $\mathbb{S}^2=\{x\in\mathbb{R}^3:\|x\|=1\}$. The points $x\in\mathbb{S}^2$ can be written in spherical coordinates, with longitude $\phi\in[-\pi,\pi]$ and colatitude $\theta\in[0,\pi]$, as

$$x=\Phi(\theta,\phi):=(\cos\phi\sin\theta,\ \sin\phi\sin\theta,\ \cos\theta). \tag{3.1}$$
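As an illustration, the parameterization (3.1) can be sketched in a few lines of Python. The function names `sph_to_cart` and `cart_to_sph` are hypothetical helpers introduced here for illustration only; the inverse map is a standard choice that is only well defined away from the poles.

```python
import math

def sph_to_cart(theta, phi):
    """Map colatitude theta in [0, pi] and longitude phi in [-pi, pi]
    to a point of the unit 2-sphere, following (3.1)."""
    return (math.cos(phi) * math.sin(theta),
            math.sin(phi) * math.sin(theta),
            math.cos(theta))

def cart_to_sph(x):
    """Recover (theta, phi) from a unit vector x (hypothetical helper).
    The longitude is arbitrary at the poles, where atan2(0, 0) returns 0."""
    x1, x2, x3 = x
    theta = math.acos(max(-1.0, min(1.0, x3)))  # clamp against rounding
    phi = math.atan2(x2, x1)
    return theta, phi
```

Clamping the third coordinate before calling `math.acos` guards against rounding errors that would otherwise push the argument slightly outside $[-1,1]$.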

On the $2$-sphere, the geodesic distance is $d(x,y)=\arccos(\langle x,y\rangle)$, and the squared Riemannian distance is

$$c(x,y)=\frac{1}{2}d(x,y)^2. \tag{3.2}$$
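A minimal sketch of the geodesic distance and of the cost (3.2) is given below, assuming points are stored as unit vectors of $\mathbb{R}^3$. The names `geodesic_distance` and `cost` are illustrative and not taken from any reference implementation.

```python
import math

def geodesic_distance(x, y):
    """Geodesic distance d(x, y) = arccos(<x, y>) between unit vectors."""
    inner = sum(a * b for a, b in zip(x, y))
    inner = max(-1.0, min(1.0, inner))  # clamp against rounding errors
    return math.acos(inner)

def cost(x, y):
    """Transport cost c(x, y) = d(x, y)^2 / 2, as in (3.2)."""
    return 0.5 * geodesic_distance(x, y) ** 2
```

For instance, two antipodal points are at distance $\pi$, so that the cost between them equals $\pi^2/2$, the largest value that $c$ can take.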

Both $d$ and $c$ are continuous and bounded as $d(x,y)\in[0,\pi]$. Moreover, $(\mathbb{S}^2,d)$ is a separable complete metric space, with Borel $\sigma$-algebra $\mathcal{B}^2$. The surface measure $\sigma_{\mathbb{S}^2}$ on $\mathbb{S}^2$ is given by

$$\int_{\mathbb{S}^2}f(x)\,d\sigma_{\mathbb{S}^2}(x)=\int_0^{\pi}\int_{-\pi}^{\pi}f(\Phi(\theta,\phi))\sin\theta\,d\phi\,d\theta,$$

and the uniform measure on $\mathbb{S}^2$ is $\mu_{\mathbb{S}^2}=\frac{1}{4\pi}\sigma_{\mathbb{S}^2}$. The space of all equivalence classes of square-integrable functions on $\mathbb{S}^2$ is denoted by $L^2(\mathbb{S}^2)$. We define the spherical harmonic function of degree $l\in\mathbb{N}^*$ and order $m\in\{-l,\cdots,l\}$ by

$$Y_l^m(x)=Y_l^m(\Phi(\theta,\phi))=\sqrt{\frac{2l+1}{4\pi}\frac{(l-m)!}{(l+m)!}}\,P_l^m(\cos\theta)e^{im\phi},$$

where the associated Legendre functions $P_l^m:[-1,1]\rightarrow\mathbb{R}$ verify, for $l\in\mathbb{N}^*$ and $m\geq 0$,

$$P_l^m(t)=\frac{(-1)^m}{2^l l!}(1-t^2)^{m/2}\frac{d^{l+m}(t^2-1)^l}{dt^{l+m}}\quad\text{and}\quad P_l^{-m}(t)=(-1)^m\frac{(l-m)!}{(l+m)!}P_l^m(t).$$

Importantly, the spherical harmonics form an orthonormal basis of $L^2(\mathbb{S}^2)$, so that every function $f\in L^2(\mathbb{S}^2)$ is uniquely decomposed, for $x\in\mathbb{S}^2$, as

$$f(x)=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\bar{f}_l^m Y_l^m(x), \tag{3.3}$$

where the sequence of spherical harmonic coefficients $\bar{f}=(\bar{f}_l^m)$ verifies

$$\bar{f}_l^m=\frac{1}{4\pi}\int_{\mathbb{S}^2}f(x)\overline{Y_l^m(x)}\,d\sigma_{\mathbb{S}^2}(x). \tag{3.4}$$

We refer to [13] for an introduction to Fourier analysis on the sphere.
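To make the surface measure and the normalization of the spherical harmonics more concrete, the sketch below approximates integrals against $\sigma_{\mathbb{S}^2}$ by a midpoint rule in $(\theta,\phi)$ and evaluates inner products of $Y_1^0$ and $Y_1^1$, whose closed forms follow from the definitions above. This naive quadrature is only an illustration under our own assumptions; it is not the fast spherical transform used in the paper, and the names `sphere_integral`, `inner_product`, `Y10` and `Y11` are hypothetical.

```python
import cmath
import math

def sphere_integral(f, n_theta=100, n_phi=200):
    """Midpoint-rule approximation of the integral of f(theta, phi)
    against the surface measure sin(theta) dphi dtheta."""
    h_theta = math.pi / n_theta
    h_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * h_theta
        row = 0.0
        for j in range(n_phi):
            phi = -math.pi + (j + 0.5) * h_phi
            row += f(theta, phi)
        total += row * math.sin(theta)
    return total * h_theta * h_phi

def Y10(theta, phi):
    """Y_1^0 = sqrt(3 / (4 pi)) cos(theta), since P_1^0(t) = t."""
    return math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta)

def Y11(theta, phi):
    """Y_1^1 = -sqrt(3 / (8 pi)) sin(theta) e^{i phi},
    since P_1^1(t) = -(1 - t^2)^{1/2}."""
    return -math.sqrt(3.0 / (8.0 * math.pi)) * math.sin(theta) * cmath.exp(1j * phi)

def inner_product(f, g):
    """Approximate the integral of f * conj(g) against the surface measure."""
    return sphere_integral(lambda t, p: f(t, p) * complex(g(t, p)).conjugate())

area = sphere_integral(lambda t, p: 1.0)  # close to 4 * pi
norm_Y10 = inner_product(Y10, Y10)        # close to 1
cross = inner_product(Y10, Y11)           # close to 0
```

The total mass of $\sigma_{\mathbb{S}^2}$ is $4\pi$, which is why the uniform measure carries the factor $\frac{1}{4\pi}$.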

Below, we introduce some notation from differential geometry [79], which can be found for instance in [25]. At any point $x\in\mathbb{S}^2$, the tangent space is $\mathcal{T}_x\mathbb{S}^2=\{y\in\mathbb{R}^3:\langle x,y\rangle=0\}$, and the associated orthogonal projection $\rho_x:\mathbb{R}^3\rightarrow\mathcal{T}_x\mathbb{S}^2$ verifies

$$\rho_x\xi=(I-xx^T)\xi=\xi-\langle\xi,x\rangle x. \tag{3.5}$$

The exponential map at $x\in\mathbb{S}^2$, $\text{Exp}_x:\mathcal{T}_x\mathbb{S}^2\rightarrow\mathbb{S}^2$, has the explicit form

$$\text{Exp}_x(v)=\cos(\|v\|)x+\sin(\|v\|)\frac{v}{\|v\|},$$

and its inverse $\text{Log}_x$ is given by

$$\text{Log}_x(z)=\frac{d(x,z)}{\sqrt{1-\langle x,z\rangle^2}}\rho_x z=d(x,z)\frac{\rho_x(z-x)}{\|\rho_x(z-x)\|}.$$
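The projection (3.5) and the maps $\text{Exp}_x$ and $\text{Log}_x$ can be sketched as follows. This is a minimal illustration with hypothetical function names; the handling of the degenerate cases ($v=0$, $z=x$, and $z=-x$ where the logarithm is not uniquely defined) is our own assumption and is not specified in the text.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def project(x, xi):
    """Orthogonal projection (3.5) of xi onto the tangent space at x."""
    s = dot(xi, x)
    return tuple(a - s * b for a, b in zip(xi, x))

def exp_map(x, v):
    """Exponential map at x applied to a tangent vector v."""
    nv = norm(v)
    if nv < 1e-15:  # Exp_x(0) = x
        return tuple(x)
    return tuple(math.cos(nv) * a + math.sin(nv) * b / nv for a, b in zip(x, v))

def log_map(x, z):
    """Inverse of the exponential map at x, evaluated at z != -x."""
    inner = max(-1.0, min(1.0, dot(x, z)))
    p = project(x, z)
    n_p = norm(p)  # equals sqrt(1 - <x, z>^2) for unit vectors
    if n_p < 1e-15:
        if inner > 0.0:  # z = x
            return (0.0, 0.0, 0.0)
        raise ValueError("Log_x(z) is not uniquely defined at the antipode z = -x")
    d = math.acos(inner)
    return tuple(d * a / n_p for a in p)
```

By construction, `log_map(x, z)` is tangent at `x`, has norm $d(x,z)$, and `exp_map(x, log_map(x, z))` recovers `z` up to rounding errors.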

We take the extrinsic viewpoint for the manifold $\mathbb{S}^2$ embedded in $\mathbb{R}^3$, so that the Riemannian gradient is given by orthogonally projecting Euclidean derivatives onto the tangent space. For a smooth function $f:\mathbb{S}^2\rightarrow\mathbb{R}$, its Riemannian gradient $\nabla f(x)$ at $x$ is thus defined by

$$\nabla f(x)=\rho_x Df(x),\qquad\text{where}\qquad Df(x)=\Big(\frac{\partial f(x)}{\partial x_i}\Big)_{i\in\{1,2,3\}}. \tag{3.6}$$

The Riemannian Hessian $\nabla^2 f(x):\mathcal{T}_x\mathbb{S}^2\rightarrow\mathcal{T}_x\mathbb{S}^2$ at $x$ is defined by the same token,

$$\nabla^2 f(x)=\rho_x\Big[D^2 f(x)-\langle Df(x),x\rangle I\Big]\qquad\text{with}\qquad D^2 f(x)=\Big(\frac{\partial^2 f(x)}{\partial x_i\partial x_j}\Big)_{i,j\in\{1,2,3\}}. \tag{3.7}$$
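The formulas (3.6) and (3.7) can be checked on a simple example. The sketch below uses the hypothetical test function $f(x)=\langle a,x\rangle$, for which $Df(x)=a$ and $D^2f(x)=0$, so that the Riemannian gradient is $\rho_x a$ and the Riemannian Hessian applied to a tangent vector $v$ is $-\langle a,x\rangle v$. All function names are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(x, xi):
    """Orthogonal projection (3.5) of xi onto the tangent space at x."""
    s = dot(xi, x)
    return tuple(a - s * b for a, b in zip(xi, x))

def riemannian_gradient(x, euclidean_grad):
    """Riemannian gradient (3.6): projection of the Euclidean gradient Df(x)."""
    return project(x, euclidean_grad)

def riemannian_hessian_apply(x, euclidean_grad, euclidean_hess, v):
    """Riemannian Hessian (3.7) applied to a tangent vector v at x."""
    hess_v = tuple(dot(row, v) for row in euclidean_hess)  # D^2 f(x) v
    s = dot(euclidean_grad, x)                             # <Df(x), x>
    return project(x, tuple(h - s * c for h, c in zip(hess_v, v)))

# Hypothetical example: f(x) = <a, x>, so Df(x) = a and D^2 f(x) = 0.
a = (1.0, 2.0, 3.0)
x = (0.0, 0.0, 1.0)
v = (1.0, 0.0, 0.0)  # tangent at x, since <x, v> = 0
zero_hess = [(0.0, 0.0, 0.0)] * 3
grad = riemannian_gradient(x, a)
hess_v = riemannian_hessian_apply(x, a, zero_hess, v)
```

In this example, `grad` is orthogonal to `x`, and $\langle \nabla f(x),v\rangle$ coincides with the derivative of $t\mapsto f(\text{Exp}_x(tv))$ at $t=0$.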

3.2. Main definitions

Here, we fix the definitions of directional distribution and quantile functions as introduced in [37]. Given $\mu$ and $\nu$ two probability measures supported on $\mathbb{S}^2$, we say that $T$ pushes forward $\mu$ to $\nu$, denoted by $T_{\#}\mu=\nu$, if, for each measurable $B\in\mathcal{B}^2$,

$$\nu(B)=\mu(T^{-1}(B)).$$

Then, for the quadratic cost $c$ defined in (3.2), Monge’s formulation of the OT problem between $\mu$ and $\nu$ reads

$$\min_{T:T_{\#}\mu=\nu}\mathbb{E}_{X\sim\mu}\left[c(X,T(X))\right]. \tag{3.8}$$

The solution of (3.8) is referred to as a Monge map, while $\mu$ and $\nu$ are called the reference and target measures, respectively. Before considering the existence of Monge maps, we recall the key definition of $c$-transforms as stated in [57], which is equivalent to the formulation of [37][Definition 2].

Definition 3.1.

Given a function $\psi:\mathbb{S}^2\rightarrow\mathbb{R}$, its $c$-transform is defined by

$$\psi^{c}(y)=\inf_{x\in\mathbb{S}^2}\{c(x,y)-\psi(x)\}.$$

Then, $\psi$ is said to be $c$-concave when $\psi^{cc}:=(\psi^{c})^{c}=\psi$.
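To make Definition 3.1 concrete, the following Python sketch evaluates a discrete $c$-transform, where the infimum over $\mathbb{S}^2$ is replaced by a minimum over a finite set of points. It assumes that the cost in (3.2), which is defined outside this section, is the squared geodesic distance $c(x,y)=d(x,y)^2/2$; the function names `geodesic_cost` and `c_transform` are hypothetical and only serve the illustration.

```python
import numpy as np

def geodesic_cost(x, y):
    """Assumed cost c(x, y) = d(x, y)^2 / 2, with d the great-circle distance."""
    inner = np.clip(x @ y.T, -1.0, 1.0)
    return 0.5 * np.arccos(inner) ** 2

def c_transform(vals, cost):
    """Discrete c-transform: out[j] = min_i { cost[i, j] - vals[i] }."""
    return np.min(cost - vals[:, None], axis=0)

rng = np.random.default_rng(0)
grid = rng.normal(size=(200, 3))
grid /= np.linalg.norm(grid, axis=1, keepdims=True)  # points on the sphere

cost = geodesic_cost(grid, grid)        # cost[i, j] = c(x_i, y_j)
psi = rng.normal(size=200)              # arbitrary values psi(x_i), not c-concave
psi_c = c_transform(psi, cost)          # psi^c on the grid
psi_cc = c_transform(psi_c, cost.T)     # (psi^c)^c on the grid
psi_ccc = c_transform(psi_cc, cost)     # triple transform
```

On such a grid, one can check that $\psi^{cc}\geq\psi$ and that $\psi^{ccc}=\psi^{c}$, so that $\psi^{cc}$ is $c$-concave with respect to the discretized cost even when $\psi$ is not.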

As summarized in [37][Proposition 1], $c$-concavity is related to optimality in Monge’s OT problem (3.8). For our continuous and bounded cost $c$, a sufficient assumption for the existence of a Monge map is that the reference measure belongs to $\mathbf{B}_{2}$, the family of $\sigma_{\mathbb{S}^2}$-absolutely continuous distributions with densities bounded away from $0$ and $\infty$, see [57][Theorem 9], which is the case for the uniform measure $\mu_{\mathbb{S}^2}$. This enables the definition of directional distribution and quantile functions, first introduced in [37].

Definition 3.2.

The directional MK quantile function of an arbitrary probability measure $\nu$ is the $\mu_{\mathbb{S}^2}$-a.s. unique map $\mathbf{Q}:\mathbb{S}^2\rightarrow\mathbb{S}^2$ such that $\mathbf{Q}_{\#}\mu_{\mathbb{S}^2}=\nu$ and there exists a $c$-concave differentiable mapping $\psi:\mathbb{S}^2\rightarrow\mathbb{R}$ such that, $\sigma_{\mathbb{S}^2}$-a.e.,

$$\mathbf{Q}(x)=\text{Exp}_{x}(-\nabla\psi(x)).$$

In addition, the directional MK distribution function of $\nu$ is given by

$$\mathbf{F}(x)=\text{Exp}_{x}(-\nabla\psi^{c}(x)).$$
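The map $x\mapsto\text{Exp}_{x}(-\nabla\psi(x))$ of Definition 3.2 can be evaluated numerically with the closed-form exponential map of the unit sphere, $\text{Exp}_{x}(v)=\cos(\|v\|)x+\sin(\|v\|)v/\|v\|$ for a tangent vector $v$ at $x$. The Python sketch below is a minimal illustration under two assumptions: the Riemannian gradient is obtained by projecting the Euclidean gradient onto the tangent space at $x$, and the potential $\psi(x)=a\langle x,p\rangle$ is a toy choice whose $c$-concavity is not verified here. All function names are hypothetical.

```python
import numpy as np

def tangent_projection(x, w):
    """Project w onto the tangent space of the sphere at the unit vector x."""
    return w - np.dot(x, w) * x

def sphere_exp(x, v, tol=1e-12):
    """Exponential map of the unit sphere at x, applied to a tangent vector v."""
    norm_v = np.linalg.norm(v)
    if norm_v < tol:
        return x.copy()
    return np.cos(norm_v) * x + np.sin(norm_v) * v / norm_v

def gradient_map(x, euclid_grad):
    """Evaluate Exp_x(-grad psi(x)), with grad psi the projected Euclidean gradient."""
    riem_grad = tangent_projection(x, euclid_grad(x))
    return sphere_exp(x, -riem_grad)

# Toy potential psi(x) = a * <x, p>, whose Euclidean gradient is a * p.
p = np.array([0.0, 0.0, 1.0])
a = 0.3
x = np.array([1.0, 0.0, 0.0])
q = gradient_map(x, lambda z: a * p)
```

In this example, the point $x$ on the equator is moved along a meridian by a geodesic distance equal to $\|\nabla\psi(x)\|=a$, and the image remains on the sphere.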

As soon as $\nu$ belongs to $\mathbf{B}_{2}$, [57][Corollary 10] ensures that $\mathbf{F}=\mathbf{Q}^{-1}$ almost everywhere, whereas $\mathbf{Q}^{-1}$ might not exist if $\nu$ is not absolutely continuous. Compared to [37], our definition begins with $\mathbf{Q}$ instead of $\mathbf{F}$, and it does not require absolute continuity of $\nu$. This follows developments from [32] for measures supported in $\mathbb{R}^{d}$.

Remark 3.1 (Regularity).

The regularity of OT maps on the sphere is a delicate subject that has inspired a number of works, including [83, 52, 20, 53]. Firstly, any $c$-concave potential $\psi$ is twice differentiable almost everywhere [15][Proposition 3.14]. For further regularity, an appropriate requirement is that the underlying measures are smooth and belong to $\mathbf{B}_{2}$. In particular, if, at a minimum, $\nu$ has density $f\in C^{1,1}(\mathbb{S}^2)$ with respect to $\sigma_{\mathbb{S}^2}$, then the MK quantile function $\mathbf{Q}$ belongs to $C^{2,\beta}(\mathbb{S}^2)$ for all $\beta\in\,]0,1[$, see [53] for more details.

Remark 3.2 (Gradient mappings).

A gradient mapping is built from a $c$-concave potential $\psi$, through $x\mapsto\text{Exp}_{x}(-\nabla\psi(x))$. A statistical model based on convex combinations of such maps was introduced in [73], and further used for directional data in [74]. With the viewpoint of [12, 36, 37], this amounts to a barycenter model for MK quantile functions.

In view of the proof of [57][Theorem 9], $\psi$ and its $c$-transform $\psi^{c}$ in Definition 3.2 maximize the dual version of Kantorovich’s problem, in the sense that

$$(\psi,\psi^{c})\in\operatorname*{argmax}_{(u,v)\in\text{Lip}_{c}}\int_{\mathbb{S}^2}u(x)\,d\mu_{\mathbb{S}^2}(x)+\int_{\mathbb{S}^2}v(y)\,d\nu(y), \tag{3.9}$$

with

$$\text{Lip}_{c}=\left\{u,v:\mathbb{S}^2\rightarrow\mathbb{R}\text{ continuous };u(x)+v(y)\leq c(x,y)\right\}.$$

The semi-dual version of (3.9) refers to the optimisation over a single dual variable $u$ (resp. $v$), the other being taken as $u^{c}$ (resp. $v^{c}$). Our proposal is to build upon the semi-dual problem to tackle the issue of finding regularized estimators for $\mathbf{F}$ and $\mathbf{Q}$.

Before that, we recall the definitions of directional quantile contours and regions from [37], which simplify in dimension $d=3$. A central point in $\mathbb{S}^2$ must be chosen for the uniform distribution $\mu_{\mathbb{S}^2}$, in view of defining nested regions with $\mu_{\mathbb{S}^2}$-content $\tau\in[0,1]$. In our Riemannian framework, a well-suited notion of central point is the Fréchet median

$$\theta_{M}=\operatorname*{arg\,min}_{z\in\mathbb{S}^2}\mathbb{E}_{Z\sim\nu}[d(Z,z)], \tag{3.10}$$

which can be computed with the Python package geomstats [59]. Then, the spherical cap with $\mu_{\mathbb{S}^2}$-probability $\tau\in[0,1]$ centered at $\mathbf{F}(\theta_{M})$ is

$$\mathbb{C}^{U}_{\tau}=\left\{x\in\mathbb{S}^2:\langle x,\mathbf{F}(\theta_{M})\rangle\geq 1-2\tau\right\},$$

with boundary $\mathcal{C}^{U}_{\tau}=\left\{x\in\mathbb{S}^2:\langle x,\mathbf{F}(\theta_{M})\rangle=1-2\tau\right\}$ a parallel of order $\tau$. This defines a rotated version of the usual latitude-longitude coordinate system (3.1), with respect to the pole $\mathbf{F}(\theta_{M})$, as follows. Any $x\in\mathbb{S}^2$ decomposes into

$$x=\langle x,\mathbf{F}(\theta_{M})\rangle\mathbf{F}(\theta_{M})+\sqrt{1-\langle x,\mathbf{F}(\theta_{M})\rangle^{2}}\,\mathbf{S}_{\mathbf{F}(\theta_{M})}(x), \tag{3.11}$$

where $\langle x,\mathbf{F}(\theta_{M})\rangle$ is a latitude, constant over the parallel $\mathcal{C}^{U}_{\tau}$, while the directional sign

$$\mathbf{S}_{\mathbf{F}(\theta_{M})}(x)=\frac{x-\langle x,\mathbf{F}(\theta_{M})\rangle\mathbf{F}(\theta_{M})}{\|x-\langle x,\mathbf{F}(\theta_{M})\rangle\mathbf{F}(\theta_{M})\|} \tag{3.12}$$

is a longitude, with the convention $\mathbf{0}/0=\mathbf{0}$ for $x=\pm\mathbf{F}(\theta_{M})$. The unit vector $\mathbf{S}_{\mathbf{F}(\theta_{M})}(x)$ takes values on the rotated equator $\mathcal{C}^{U}_{1/2}$, and thus allows one to characterize meridians crossing $s\in\mathcal{C}^{U}_{1/2}$ through $\mathcal{M}_{s}^{U}=\{x\in\mathbb{S}^2:\mathbf{S}_{\mathbf{F}(\theta_{M})}(x)=s\}$.
For ease of understanding, we take the example of $\mathbf{F}(\theta_{M})=(0,0,1)^{T}$: taking $x\in\mathcal{C}_{\tau}^{U}$ such that $\langle x,\mathbf{F}(\theta_{M})\rangle=1-2\tau$ is equivalent to $x_{3}=1-2\tau$. Thus, we retrieve that for a fixed $\tau$, quantile contours $\mathcal{C}_{\tau}^{U}$ indeed have a fixed latitude in the classical longitude-latitude system (3.1). In fact, one can use contours of constant latitude in the system (3.1) to discretize a reference quantile contour oriented towards $\mathbf{F}(\theta_{M})$, by choosing the appropriate rotation matrix $\mathbf{O}$ that sends $(0,0,1)^{T}$ to $\mathbf{F}(\theta_{M})$. Numerically, it can be computed using Rodrigues’ rotation formula, for instance.
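The construction above can be sketched in Python as follows. This is only an illustration under stated assumptions: the pole plays the role of $\mathbf{F}(\theta_{M})$ and is chosen arbitrarily, and the function names `rotation_to`, `reference_contour` and `latitude_and_sign` are hypothetical. The rotation $\mathbf{O}$ is obtained from Rodrigues’ formula $\mathbf{O}=I+\sin\theta\,K+(1-\cos\theta)K^{2}$, with $K$ the cross-product matrix of the unit rotation axis.

```python
import numpy as np

def rotation_to(pole):
    """Rotation O with O @ e3 = pole, from Rodrigues' rotation formula."""
    e3 = np.array([0.0, 0.0, 1.0])
    axis = np.cross(e3, pole)
    sin_t, cos_t = np.linalg.norm(axis), np.dot(e3, pole)
    if sin_t < 1e-12:
        # pole = e3: identity; pole = -e3: half-turn about the first axis.
        return np.eye(3) if cos_t > 0 else np.diag([1.0, -1.0, -1.0])
    k = axis / sin_t
    K = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    return np.eye(3) + sin_t * K + (1.0 - cos_t) * (K @ K)

def reference_contour(tau, pole, n_points=100):
    """Discretize the parallel {x : <x, pole> = 1 - 2 * tau}."""
    height = 1.0 - 2.0 * tau
    radius = np.sqrt(1.0 - height**2)
    phi = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    pts = np.stack([radius * np.cos(phi), radius * np.sin(phi),
                    np.full(n_points, height)], axis=1)
    return pts @ rotation_to(pole).T  # rotate each row towards the pole

def latitude_and_sign(x, pole):
    """Latitude <x, pole> and directional sign (3.12), with 0/0 = 0."""
    lat = np.dot(x, pole)
    resid = x - lat * pole
    nrm = np.linalg.norm(resid)
    sign = resid / nrm if nrm > 1e-12 else np.zeros(3)
    return lat, sign

pole = np.array([1.0, 2.0, 2.0]) / 3.0   # arbitrary stand-in for F(theta_M)
contour = reference_contour(0.25, pole)
lat, sign = latitude_and_sign(contour[0], pole)
rebuilt = lat * pole + np.sqrt(1.0 - lat**2) * sign   # decomposition (3.11)

# Monte Carlo check that the cap of order tau has uniform mass close to tau.
rng = np.random.default_rng(0)
z = rng.normal(size=(200_000, 3))
unif = z / np.linalg.norm(z, axis=1, keepdims=True)
cap_mass = np.mean(unif @ pole >= 1.0 - 2.0 * 0.25)
```

The last lines rely on the fact that the height $\langle x,\mathbf{F}(\theta_{M})\rangle$ of a uniformly distributed point is itself uniform on $[-1,1]$, which explains the threshold $1-2\tau$ in the definition of the spherical cap.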

The image by $\mathbf{Q}$ of the parallel/meridian system (3.11) provides curvilinear parallels $\mathbf{Q}(\mathcal{C}^{U}_{\tau})$ and curvilinear meridians $\mathbf{Q}(\mathcal{M}_{s}^{U})$ adapted to the geometry of the support of $\nu$, giving rise to suitable directional concepts of quantile contours and signs [37]. Intuitively, a data-adaptive change of coordinates must retain all the available information, in a simpler form that is amenable to summarization.

Definition 3.3 (Quantile contours, regions and signs).

Let $\nu\in\mathbf{B}_{2}$, with directional quantile function $\mathbf{Q}$. Then,

  • the quantile contour of order $\tau\in[0,1]$ is $\mathcal{C}_{\tau}=\mathbf{Q}(\mathcal{C}^{U}_{\tau})$,

  • the quantile region of order $\tau\in[0,1]$ is $\mathbb{C}_{\tau}=\mathbf{Q}(\mathbb{C}^{U}_{\tau})$,

  • the sign curve associated with $s\in\mathcal{C}^{U}_{1/2}$ is $\mathbf{Q}(\mathcal{M}_{s}^{U})$.

Since $\mathbf{Q}$ is a push-forward mapping, $\mathbf{Q}_{\#}\mu_{\mathbb{S}^2}=\nu$, the $\nu$-probability content of $\mathbb{C}_{\tau}$ is $\tau$. Moreover, the quantile contours of $\nu\in\mathbf{B}_{2}$ are continuous and the quantile regions are closed, connected and nested, as stated in [37]. Invariance properties shown in [37] are gathered in Appendix A, for the sake of completeness. There, it is also mentioned that, from [74][Lemma 1], a convex combination of $c$-concave functions is itself $c$-concave.

4. Regularized estimation of Monge-Kantorovich quantiles on the 2-sphere

4.1. Entropic OT on the 2-sphere

We now introduce an algorithm based on the spherical Fourier transform to solve the regularized Kantorovich problem on the $2$-sphere. In the Euclidean case, Kantorovich’s problem is known to be easier to solve than Monge’s problem [18]. Moreover, since the founding work of [16], adding an entropic regularization term to (3.9) has been a cornerstone for the development of OT-based methods in statistics and machine learning. In [31], rewriting the dual objective function allowed the introduction of provably convergent stochastic algorithms, see also [6]. For arbitrary measures (not only discrete ones), the dual variables can no longer be viewed as finite-dimensional vectors. Therefore, one requires nonparametric families of dual functions, as proposed in [31] with reproducing kernel Hilbert spaces or in [72] with deep neural networks. In our previous work [7], we suggested the use of Fourier series in the specific context of center-outward quantiles to take advantage of the knowledge of the reference measure. The resulting algorithm directly targets the continuous OT problem between the continuous reference measure and the underlying $\nu$, instead of the discrete or semi-discrete OT problem towards the empirical measure $\widehat{\nu}_{n}$. The idea that we introduce in this paper for spherical distributions is in the same spirit. We mention that several algorithms exist to solve the unregularized OT problem on the sphere, see [40] and the references therein. Also, the network simplex and Sinkhorn algorithms depend only on the cost matrix, so they can readily be adapted to any space. While these discrete solvers require the storage of the cost matrix, of size $n^{2}$ for two samples of size $n$, stochastic algorithms are designed to avoid it.

In the context of spherical quantiles, one must solve an OT problem where the reference measure is $\mu_{\mathbb{S}^2}$, the uniform probability measure on the sphere. Here, we consider the formulation from [31, 30] for OT between $\mu_{\mathbb{S}^2}$ and $\nu$, regularized by relative entropy with respect to the product measure $\mu_{\mathbb{S}^2}\otimes\nu$. The semi-dual version of EOT between $\mu_{\mathbb{S}^2}$ and $\nu$ from [30][Proposition 12], for $\varepsilon>0$ a regularization parameter, reads

$$\max_{u\in L^{\infty}(\mathbb{S}^2)}\int_{\mathbb{S}^2}u(x)\,d\mu_{\mathbb{S}^2}(x)+\int_{\mathbb{S}^2}u^{c,\varepsilon}(y)\,d\nu(y), \tag{4.1}$$

with $u^{c,\varepsilon}$ the smooth conjugate of $u$ defined by

$$u^{c,\varepsilon}(y)=-\varepsilon\log\left(\int_{\mathbb{S}^2}\exp\Big(\frac{u(x)-c(x,y)}{\varepsilon}\Big)d\mu_{\mathbb{S}^2}(x)\right). \tag{4.2}$$

The smooth conjugate is the entropic counterpart of Definition 3.1. For bounded costs, the problem (4.1) admits a solution in $L^{\infty}$, unique up to additive constants [30][Theorem 7]. To ensure uniqueness, we impose that $\int_{\mathbb{S}^2}u(x)\,d\mu_{\mathbb{S}^2}(x)=0$, so that the optimisation problem (4.1) becomes

$$\max_{u\in L^{\infty}(\mathbb{S}^2)}\int_{\mathbb{S}^2}u^{c,\varepsilon}(y)\,d\nu(y).$$
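As an illustration of the smooth conjugate (4.2), the Python sketch below replaces the integral with respect to $\mu_{\mathbb{S}^2}$ by an average over uniform samples, and evaluates it with a log-sum-exp shift for numerical stability. It is a minimal sketch rather than the algorithm of this paper: the cost is assumed to be $c(x,y)=d(x,y)^2/2$ since (3.2) lies outside this section, the potential $u(x)=0.2\,x_{3}$ is a toy choice, and the function names are hypothetical.

```python
import numpy as np

def geodesic_cost(x, y):
    """Assumed cost c(x, y) = d(x, y)^2 / 2, with d the great-circle distance."""
    return 0.5 * np.arccos(np.clip(x @ y.T, -1.0, 1.0)) ** 2

def smooth_conjugate(u_vals, cost, eps):
    """Sample version of (4.2): -eps * log(mean_i exp((u(x_i) - c(x_i, y_j)) / eps))."""
    a = (u_vals[:, None] - cost) / eps
    a_max = a.max(axis=0)                     # shift that stabilizes the exponentials
    return -eps * (a_max + np.log(np.mean(np.exp(a - a_max), axis=0)))

rng = np.random.default_rng(0)
n, eps = 2000, 0.1

# Uniform samples x_i on the sphere, and a few evaluation points y_j.
x = rng.normal(size=(n, 3))
x /= np.linalg.norm(x, axis=1, keepdims=True)
y = rng.normal(size=(5, 3))
y /= np.linalg.norm(y, axis=1, keepdims=True)

u_vals = 0.2 * x[:, 2]                        # toy potential u(x) = 0.2 * x_3
cost = geodesic_cost(x, y)                    # cost[i, j] = c(x_i, y_j)
u_ce = smooth_conjugate(u_vals, cost, eps)    # approximates u^{c,eps}(y_j)
hard = np.min(cost - u_vals[:, None], axis=0) # discrete c-transform of u
```

For this sample version, the smooth conjugate lies between the discrete $c$-transform and the same quantity shifted by $\varepsilon\log n$, which reflects that it is a smoothed minimum. It also satisfies $(u+k)^{c,\varepsilon}=u^{c,\varepsilon}-k$ for any constant $k$, which is why an identifiability condition on $u$ is needed.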

It is well-known, see e.g. [64, 31], that $\mathbf{u}_{\varepsilon}$ is a solution of (4.1) if and only if

$$\mathbf{u}_{\varepsilon}=((\mathbf{u}_{\varepsilon})^{c,\varepsilon})^{c,\varepsilon}. \tag{4.3}$$

Note that a Lipschitz continuous function on $\mathbb{S}^2$ equals its spherical Fourier series (3.3) pointwise [58][Theorem 5.26]. One can find in [57][Lemma 2] that the true unregularized potentials are Lipschitz, because $\mathbb{S}^2$ has a finite diameter $|\mathbb{S}^2|=\pi$. Furthermore, the same holds in the regularized case $\varepsilon>0$, from the optimality condition (4.3), see Proposition 12 from [26][Appendix B] or [64][Lemma 3.1]. Consequently, we propose to parameterize the dual variable in (4.1) by its spherical harmonic coefficients. For a given $\varepsilon>0$, we consider the optimal sequence of coefficients $\bar{\mathbf{u}}_{\varepsilon}$ defined as the solution of the following stochastic convex minimisation problem

$$\bar{\mathbf{u}}_{\varepsilon}=\operatorname*{argmin}_{\bar{\mathbf{u}}\in\ell_{1}}H_{\varepsilon}(\bar{\mathbf{u}})\quad\mbox{with}\quad H_{\varepsilon}(\bar{\mathbf{u}})=\mathbb{E}\left[h_{\varepsilon}(\bar{\mathbf{u}},X)\right] \tag{4.4}$$

where $X$ is a random vector with distribution $\nu$, $\bar{\mathbf{u}}=(\bar{\mathbf{u}}_{l}^{m})_{l\geq 1}$ and

$$h_{\varepsilon}(\bar{\mathbf{u}},x)=-u^{c,\varepsilon}(x)\quad\mbox{with}\quad u(z)=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\bar{\mathbf{u}}_{l}^{m}Y_{l}^{m}(z).$$

Note that the spherical harmonic coefficient $\bar{\mathbf{u}}_{0}^{0}$ equals $0$ because of the identifiability condition $\int_{\mathbb{S}^2}u(x)\,d\mu_{\mathbb{S}^2}(x)=0$.

We shall now discuss the equivalence between (4.4) and the original problem (4.1). On the $2$-sphere, the series of spherical harmonics of a continuously differentiable function is uniformly convergent, see [43][p. 259] or [42]. To obtain the stronger result that the sequence of spherical harmonics belongs to $\ell_{1}$, the function needs to be twice continuously differentiable [42][Theorem 2]. Regarding the unregularized Kantorovich potentials, such differentiability requires smoothness of the measures involved, as highlighted in Remark 3.1. A sufficient condition when the reference measure is $\mu_{\mathbb{S}^2}$ is that the density of $\nu$ is differentiable and bounded above and below by positive constants, see [53][Corollary 6.2]. It appears that this property holds for the regularized potential solving (4.1), without any continuity condition for $\nu$.

Proposition 4.1.

Let $\mathbf{u}_{\varepsilon}$ be a solution of (4.1). Then, $\mathbf{u}_{\varepsilon}$ is twice continuously differentiable, and, as a byproduct, its series of spherical harmonics belongs to $\ell_{1}$.

Consequently, the problem (4.4) is equivalent to the original one (4.1). The main virtue of this parameterization is that partial derivatives of $h_{\varepsilon}$ with respect to the parameters $\bar{\mathbf{u}}_{l}^{m}$ can be derived easily, which is appealing in view of a stochastic gradient scheme. Here, the objective $h_{\varepsilon}(\cdot,x)$ is of the same mathematical nature as in [7], so that it is differentiable and the following property holds.

Proposition 4.2.

For every $x\in\mathbb{S}^{2}$, the function $h_{\varepsilon}(\cdot,x):\ell_{1}\rightarrow\mathbb{R}$ is Fréchet differentiable and its differential $D_{\bar{\mathbf{u}}}h_{\varepsilon}(\bar{\mathbf{u}},x)$ belongs to the dual Banach space $(\ell_{\infty},\|\cdot\|_{\ell_{\infty}})$ where $\|\bar{\mathbf{u}}\|_{\ell_{\infty}}=\sup_{l,m}|\bar{\mathbf{u}}_{l}^{m}|$. The components of $D_{\bar{\mathbf{u}}}h_{\varepsilon}(\bar{\mathbf{u}},x)$ are the partial derivatives

$$\frac{\partial h_{\varepsilon}(\bar{\mathbf{u}},x)}{\partial\bar{\mathbf{u}}_{l}^{m}}=\frac{1}{4\pi}\int_{\mathbb{S}^{2}}g_{\bar{\mathbf{u}},x}(z)\,Y_{l}^{m}(z)\,d\sigma_{\mathbb{S}^{2}}(z),\tag{4.5}$$

which are the spherical harmonic coefficients of the function

$$g_{\bar{\mathbf{u}},x}(z)=\frac{\exp\left(\frac{u(z)-c(z,x)}{\varepsilon}\right)}{\int\exp\left(\frac{u(y)-c(y,x)}{\varepsilon}\right)d\mu_{\mathbb{S}^{2}}(y)}\quad\text{with}\quad u(z)=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\bar{\mathbf{u}}_{l}^{m}Y_{l}^{m}(z).\tag{4.6}$$

For $(X_{n})$ a sequence of independent random vectors with distribution $\nu$, we consider the stochastic algorithm in the Banach space $(\ell_{1},\|\cdot\|_{\ell_{1}})$ defined, for all $n\geq 0$, by

$$\widehat{u}_{n+1}=\widehat{u}_{n}-\gamma_{n}WD_{\widehat{u}}h_{\varepsilon}(\widehat{u}_{n},X_{n+1})\tag{4.7}$$

where $\gamma_{n}=\gamma n^{-\alpha}$ is a decreasing sequence of positive numbers with $1/2<\alpha<1$ and $\gamma>0$. Because $\ell_{1}$ differs from its dual space $\ell_{\infty}$, the linear operator $W$ is defined by

$$\left\{\begin{array}{ccc}W:(\ell_{\infty},\|\cdot\|_{\ell_{\infty}})&\to&(\ell_{1},\|\cdot\|_{\ell_{1}})\\ \bar{v}=(\bar{v}_{l}^{m})&\mapsto&\bar{w}\odot\bar{v}=(\bar{w}_{l}^{m}\bar{v}_{l}^{m})\end{array}\right.$$

where $\bar{w}=(\bar{w}_{l}^{m})$ is a deterministic sequence of positive weights satisfying the condition

$$\|\bar{w}\|_{\ell_{1}}=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\bar{w}_{l}^{m}<+\infty.\tag{4.8}$$
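The role of condition (4.8) can be made explicit with a one-line bound, which is a standard Hölder-type estimate and not a statement taken from the paper: for any $\bar{v}\in\ell_{\infty}$,

$$\|W\bar{v}\|_{\ell_{1}}=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\bar{w}_{l}^{m}\,|\bar{v}_{l}^{m}|\leq\|\bar{v}\|_{\ell_{\infty}}\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\bar{w}_{l}^{m}=\|\bar{w}\|_{\ell_{1}}\|\bar{v}\|_{\ell_{\infty}}.$$

Hence $W$ is a bounded linear operator from $\ell_{\infty}$ to $\ell_{1}$, so that the update in (4.7) keeps the iterates in $\ell_{1}$ even though the differential itself only lives in $\ell_{\infty}$.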

In all the experiments carried out in this paper, we use the sequence $\bar{w}_{l}^{m}=(l^{2}+m^{2})^{-1}$.
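To fix ideas, one iteration of (4.7) can be sketched on a toy truncation that keeps only the degree $l=1$ coefficients. This is a minimal illustration under several assumptions that are not taken from the paper: the cost is $c(z,x)=d(z,x)^{2}/2$ (as suggested by the form of (4.12)), the uniform measure is approximated by an equal-weight Fibonacci lattice, the harmonics are the $4\pi$-normalized real ones with an arbitrary sign convention, and all function names are hypothetical. The actual procedure relies on pyshtools transforms instead of this hand-written quadrature.

```python
import numpy as np

def fibonacci_sphere(n):
    """Roughly uniform points on S^2; equal weights approximate mu."""
    k = np.arange(n) + 0.5
    z = 1.0 - 2.0 * k / n
    phi = np.pi * (1.0 + np.sqrt(5.0)) * k
    r = np.sqrt(1.0 - z**2)
    return np.column_stack((r * np.cos(phi), r * np.sin(phi), z))

def basis_l1(pts):
    """4pi-normalized real harmonics of degree 1, ordered m = -1, 0, 1."""
    return np.sqrt(3.0) * pts[:, [1, 2, 0]]

# Weights (l^2 + m^2)^{-1} for l = 1 and m = -1, 0, 1.
WEIGHTS_L1 = np.array([0.5, 1.0, 0.5])

def softmax_density(coef, x, grid, eps):
    """Discrete version of g_{u,x} in (4.6), assuming c(z, x) = d(z, x)^2 / 2."""
    u = basis_l1(grid) @ coef
    cost = 0.5 * np.arccos(np.clip(grid @ x, -1.0, 1.0)) ** 2
    a = (u - cost) / eps
    g = np.exp(a - a.max())          # subtract the max for numerical stability
    return g / g.mean()              # the grid average of g equals 1

def sgd_step(coef, x, n, grid, eps, gamma=1.0, alpha=0.75):
    """One update of (4.7) with step gamma * n^(-alpha), for n >= 1."""
    g = softmax_density(coef, x, grid, eps)
    # Quadrature version of (4.5): harmonic coefficients of g.
    grad = (basis_l1(grid) * g[:, None]).mean(axis=0)
    return coef - gamma * n ** (-alpha) * WEIGHTS_L1 * grad
```

In use, one would draw $X_{n+1}$ from $\nu$ at each iteration and call `sgd_step` repeatedly, starting from a zero coefficient vector. The degree-zero coefficient is left out because it is pinned to $0$ by the identifiability condition.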

For a given regularization parameter $\varepsilon>0$, a regularized estimator of the optimal potential $\mathbf{u}_{\varepsilon}(x)=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\bar{\mathbf{u}}_{\varepsilon,l}^{m}Y_{l}^{m}(x)$ is naturally given by

$$\widehat{\mathbf{u}}_{\varepsilon,n}(x)=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\widehat{u}_{n,l}^{m}Y_{l}^{m}(x).\tag{4.9}$$

From a practical point of view, the stochastic sequence (4.7) must be discretized. To do so, one must consider a grid of $p^{2}$ points on the $2$-sphere and the associated spherical harmonic coefficients. We emphasize that this discretization takes place in the space of frequencies, in order to take advantage of implicit interpolation in this space. The Python library pyshtools [84] implements spherical harmonic transforms and reconstructions. Our numerical procedure builds upon it, as the stochastic algorithm (4.7) requires computing the spherical harmonic coefficients of the function $g_{\widehat{u},y}(x)$ in (4.6), which relies on $u(x)$ reconstructed from $\widehat{u}$ thanks to the inverse Fourier transform on the $2$-sphere. With the help of the fast routine within pyshtools, the computational cost at each iteration of (4.7) is of order $\mathcal{O}\left(p^{2}\log^{2}(p)\right)$ [84]. Finally, the estimator $\widehat{\mathbf{u}}_{\varepsilon,n}(x)$ in (4.9) can be quickly recovered for any $x$ thanks to the Python library sphericart [9].

Remark 4.1.

To study the convergence of the stochastic algorithm (4.7), we could adapt the theoretical results in [7] to our setting. Importantly, the consistency results obtained in [7] remain valid in the present spherical context, because they depend neither on the orthonormal basis nor on the cost $c$. However, such a study is beyond the scope of this paper.

4.2. Regularized distribution and quantile functions

We now introduce the regularized counterpart of Definition 3.2. When dealing with measures supported on $\mathbb{R}^{d}$, one can use the entropic map, defined as the barycentric projection of the entropic optimal plan, see e.g. [69, 70]. At first sight, this requires a notion of average on the sphere, as done in [28] from the unregularized empirical OT plan. Nonetheless, this map is alternatively characterized by analogy with Brenier's theorem [69][Proposition 2], whose building block is the gradient of a Kantorovich potential. Based on such differentiation, entropic maps were introduced in [17] for OT problems involving general convex costs in $\mathbb{R}^{d}$. Because this enforces the structure of optimality, we pursue this idea for our non-Euclidean setting. Note that EOT has already been considered in the specific setting of MK quantiles in $\mathbb{R}^{d}$ [7, 56, 10], both for smoothing and computational purposes. Our numerical experiments in Section 6 flesh out the empirical benefits and shortcomings when varying $\varepsilon$.

Definition 4.1.

Let $\nu$ be an arbitrary probability measure supported on $\mathbb{S}^{2}$, and $\mathbf{u}_{\varepsilon}:\mathbb{S}^{2}\rightarrow\mathbb{R}$ be a solution of (4.1) between $\mu_{\mathbb{S}^{2}}$ and $\nu$. Then, the regularized distribution function of $\nu$ is given by

$$\mathbf{F}_{\varepsilon}(z)=\text{Exp}_{z}(-\nabla\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)),\tag{4.10}$$

and the regularized quantile function of $\nu$ is

$$\mathbf{Q}_{\varepsilon}(x)=\text{Exp}_{x}(-\nabla\mathbf{u}_{\varepsilon}(x)).\tag{4.11}$$

This requires the differentiation of entropic Kantorovich potentials. For a given regularization parameter $\varepsilon>0$, partial derivatives can be retrieved by

$$\frac{\partial\mathbf{u}_{\varepsilon}(x)}{\partial x_{i}}=\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\bar{\mathbf{u}}_{l}^{m}\frac{\partial Y_{l}^{m}(x)}{\partial x_{i}},$$

where the package sphericart [9] allows the computation of $\frac{\partial Y_{l}^{m}(x)}{\partial x_{i}}$ easily. The Riemannian gradient $\nabla\mathbf{u}_{\varepsilon}$ follows using (3.6). But because this may lead to numerical instabilities, we suggest instead to make use of the first-order conditions (4.3). With discrete counterparts in practice, replacing a pair of potentials $(u,v)$ by $(v^{c,\varepsilon},u^{c,\varepsilon})$ improves the objective. While (4.2) depends on the measure $\mu_{\mathbb{S}^{2}}$, its symmetric version (4.3) depends on $\nu$ instead. Notably, Sinkhorn's algorithm amounts to alternating such smooth conjugates [30][Proposition 10]. The next proposition gives a generalized entropic map on the hypersphere $\mathbb{S}^{2}$.

Proposition 4.3.

Denote by

$$g_{\varepsilon}(x,z)=\frac{-d(x,z)}{\sqrt{1-\langle x,z\rangle^{2}}}\exp\Big(\frac{\mathbf{u}_{\varepsilon}(x)-c(x,z)+\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)}{\varepsilon}\Big).\tag{4.12}$$

Then, the Euclidean partial derivatives of $\mathbf{u}_{\varepsilon}$ admit the closed-form expression

$$\partial_{x_{i}}\mathbf{u}_{\varepsilon}(x)=\int z_{i}\,g_{\varepsilon}(x,z)\,d\nu(z).\tag{4.13}$$

Similarly, the Euclidean gradient of $\mathbf{u}_{\varepsilon}^{c,\varepsilon}$ verifies

$$\partial_{z_{i}}\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)=\int x_{i}\,g_{\varepsilon}(x,z)\,d\mu_{\mathbb{S}^{2}}(x).\tag{4.14}$$
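Formula (4.13) can be checked numerically on an empirical measure. The sketch below is only an illustration under stated assumptions: $\nu$ is replaced by a handful of sample points, the cost is taken as $c(x,z)=d(x,z)^{2}/2$ with $d(x,z)=\arccos\langle x,z\rangle$ (consistent with the factor appearing in (4.12)), and $\mathbf{u}_{\varepsilon}$ is obtained from given values $v_{j}=\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z_{j})$ through a log-sum-exp smooth conjugate, which is our reading of the first-order conditions. The function names are hypothetical.

```python
import numpy as np

def potential(x, samples, v, eps):
    """u(x) = -eps * log( mean_j exp((v_j - c(x, z_j)) / eps) )."""
    cost = 0.5 * np.arccos(samples @ x) ** 2
    a = (v - cost) / eps
    m = a.max()                                   # log-sum-exp stabilization
    return -eps * (m + np.log(np.mean(np.exp(a - m))))

def grad_potential(x, samples, v, eps):
    """Euclidean gradient of u at x, following (4.12)-(4.13).

    Assumes x is neither a sample point nor the antipode of one,
    so that 1 - <x, z>^2 stays positive.
    """
    s = samples @ x
    d = np.arccos(s)
    a = (v - 0.5 * d**2) / eps
    plan = np.exp(a - a.max())
    plan /= plan.sum()                            # plan density times 1/n
    factor = -d / np.sqrt(1.0 - s**2)
    return (plan * factor) @ samples
```

Comparing `grad_potential` with central finite differences of `potential` along each Euclidean coordinate gives agreement up to discretization error, which is a convenient sanity check before projecting onto the tangent space.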

Combining Proposition 4.3 with (3.6) yields the following corollary, recalling that

$$(x,z)\mapsto\exp\Big(\frac{\mathbf{u}_{\varepsilon}(x)-c(x,z)+\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)}{\varepsilon}\Big)$$

is the density of the optimal entropic plan with respect to $\mu\otimes\nu$, see [64].

Corollary 4.1.

The regularized distribution and quantile functions of $\nu$ on $\mathbb{S}^{2}$ admit closed-form expressions through

$$\mathbf{Q}_{\varepsilon}(x)=\text{Exp}_{x}\int\text{Log}_{x}(z)\exp\Big(\frac{\mathbf{u}_{\varepsilon}(x)-c(x,z)+\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)}{\varepsilon}\Big)d\nu(z),$$

and

$$\mathbf{F}_{\varepsilon}(z)=\text{Exp}_{z}\int\text{Log}_{z}(x)\exp\Big(\frac{\mathbf{u}_{\varepsilon}(x)-c(x,z)+\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)}{\varepsilon}\Big)d\mu_{\mathbb{S}^{2}}(x).$$
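The expression of $\mathbf{Q}_{\varepsilon}$ in Corollary 4.1 can be sketched for an empirical measure as a weighted average of logarithms mapped back to the sphere. The code below is a hedged illustration rather than the authors' implementation: it assumes $c(x,z)=d(x,z)^{2}/2$, it takes the values $v_{j}=\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z_{j})$ as given, and it recovers the plan density at $x$ by normalizing $\exp((v_{j}-c(x,z_{j}))/\varepsilon)$, which amounts to evaluating $\mathbf{u}_{\varepsilon}(x)$ through a smooth conjugate. All function names are hypothetical.

```python
import numpy as np

def log_map(x, z):
    """Log_x(z_j) for each row z_j: tangent vector at x of length d(x, z_j).

    Returns 0 when z_j = x; the antipodal case (cut locus) is also mapped
    to 0 here, although Log_x is not defined there.
    """
    cos_t = np.clip(z @ x, -1.0, 1.0)
    theta = np.arccos(cos_t)
    perp = z - cos_t[:, None] * x                 # component orthogonal to x
    norm = np.linalg.norm(perp, axis=1)
    scale = np.divide(theta, norm, out=np.zeros_like(theta), where=norm > 1e-12)
    return scale[:, None] * perp

def exp_map(x, v):
    """Exp_x(v) for a tangent vector v at x on the unit sphere."""
    t = np.linalg.norm(v)
    if t < 1e-12:
        return x
    return np.cos(t) * x + np.sin(t) * v / t

def entropic_quantile(x, samples, v, eps):
    """Discrete analogue of Q_eps(x) in Corollary 4.1."""
    cost = 0.5 * np.arccos(np.clip(samples @ x, -1.0, 1.0)) ** 2
    a = (v - cost) / eps
    w = np.exp(a - a.max())
    w /= w.sum()                                  # plan weights at x
    return exp_map(x, w @ log_map(x, samples))
```

With a single sample point the weights collapse to one and the map returns that point, while for larger $\varepsilon$ the weights spread out and the output moves towards a tangent-space average of the samples.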
Remark 4.2.

It should be noted that $\mathbf{F}_{\varepsilon}$, resp. $\mathbf{Q}_{\varepsilon}$, no longer pushes $\nu$ forward to $\mu_{\mathbb{S}^{2}}$, resp. $\mu_{\mathbb{S}^{2}}$ forward to $\nu$. However, they are expected to be close to their unregularized counterparts for small values of $\varepsilon>0$, as studied, for the quadratic cost in $\mathbb{R}^{d}$, in [33, 69, 80]. The limit $\varepsilon\rightarrow 0$ has been considered outside the Euclidean setting [5, 8, 64], although not directly for the generalized entropic map itself. In particular, up to some sequence $(\varepsilon_{k})$ such that $\lim_{k\rightarrow+\infty}\varepsilon_{k}=0$, [64][Proposition 3.2] gives us the uniform convergence of potentials $(\mathbf{u}_{\varepsilon_{k}},\mathbf{u}_{\varepsilon_{k}}^{c,\varepsilon_{k}})$ on compact subsets of $\mathbb{S}^{2}$, towards $(\psi,\psi^{c})$ solving (3.9).

From Corollary 4.1, $\mathbf{F}_{\varepsilon}$ and $\mathbf{Q}_{\varepsilon}$ can be seen as weighted averages in the tangent space. We argue that this can be beneficial in practice, because of the regularity it induces. Indeed, second-order derivatives are given hereafter, which entails the continuity of $\mathbf{F}_{\varepsilon}$ and $\mathbf{Q}_{\varepsilon}$.

Proposition 4.4.

The potential $\mathbf{u}_{\varepsilon}$ is twice-differentiable everywhere, and

$$\frac{\partial^{2}\mathbf{u}_{\varepsilon}}{\partial x_{i}\partial x_{j}}(x)=\int\tilde{c}_{ij}(x,z)\exp\Big(\frac{\mathbf{u}_{\varepsilon}(x)-c(x,z)+\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)}{\varepsilon}\Big)d\nu(z)+\frac{\partial_{x_{i}}\mathbf{u}_{\varepsilon}(x)\,\partial_{x_{j}}\mathbf{u}_{\varepsilon}(x)}{\varepsilon},$$

where

$$\tilde{c}_{ij}(x,z)=\frac{\partial^{2}c(x,z)}{\partial x_{i}\partial x_{j}}-\frac{\partial_{x_{i}}c(x,z)\,\partial_{x_{j}}c(x,z)}{\varepsilon}.$$

Besides, the same holds for $\mathbf{u}_{\varepsilon}^{c,\varepsilon}$ and

$$\frac{\partial^{2}\mathbf{u}_{\varepsilon}^{c,\varepsilon}}{\partial z_{i}\partial z_{j}}(z)=\int\overline{c}_{ij}(x,z)\exp\Big(\frac{\mathbf{u}_{\varepsilon}(x)-c(x,z)+\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)}{\varepsilon}\Big)d\mu_{\mathbb{S}^{2}}(x)+\frac{\partial_{z_{i}}\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)\,\partial_{z_{j}}\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)}{\varepsilon},$$

with

$$\overline{c}_{ij}(x,z)=\frac{\partial^{2}c(x,z)}{\partial_{z_{i}}\partial_{z_{j}}}-\frac{\partial_{z_{i}}c(x,z)\,\partial_{z_{j}}c(x,z)}{\varepsilon}.$$
Remark 4.3.

One can find, e.g., in [25] that the first-order derivatives of the cost $c$ satisfy

$$\partial_{x_{i}}c(x,z)\,\partial_{x_{j}}c(x,z)=\frac{d(x,z)^{2}}{1-\langle x,z\rangle^{2}}z_{i}z_{j},$$

and that the second-order derivatives read

$$\frac{\partial^{2}c(x,z)}{\partial_{x_{i}}\partial_{x_{j}}}=d(x,z)\frac{\langle x,z\rangle}{\sqrt{1-\langle x,z\rangle^{2}}}\Big(\mathds{1}_{i=j}-\frac{1}{1-\langle x,z\rangle^{2}}z_{i}z_{j}\Big)+\frac{1}{1-\langle x,z\rangle^{2}}z_{i}z_{j}.$$
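
As a sanity check on the first-order identity of this remark, the following sketch compares it with central finite differences. It assumes that the cost is $c(x,z)=d(x,z)^{2}/2$ with $d(x,z)=\arccos\langle x,z\rangle$, and that derivatives are taken in the ambient coordinates of $\mathbb{R}^{3}$; both are assumptions of this illustration, and the helper names are hypothetical.

```python
import numpy as np

def cost(x, z):
    """Assumed cost c(x, z) = d(x, z)^2 / 2, with d the geodesic distance."""
    return 0.5 * np.arccos(np.dot(x, z)) ** 2

def grad_x_cost(x, z):
    """Ambient gradient in x: -d(x, z) / sqrt(1 - <x, z>^2) * z."""
    t = np.dot(x, z)
    return -np.arccos(t) / np.sqrt(1.0 - t ** 2) * z

def numerical_grad(f, x, h=1e-6):
    """Central finite differences, coordinate by coordinate."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

x = np.array([1.0, 0.0, 0.0])
z = np.array([0.6, 0.0, 0.8])
t = np.dot(x, z)

g = grad_x_cost(x, z)
g_num = numerical_grad(lambda y: cost(y, z), x)

# Products of first-order derivatives versus the right-hand side of the remark.
lhs = np.outer(g, g)
rhs = np.arccos(t) ** 2 / (1.0 - t ** 2) * np.outer(z, z)
```

The check only covers the product of first-order derivatives; the second-order formula is not verified here.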

4.3. Regularized empirical distribution and quantile functions

Suppose that the estimator $\widehat{\mathbf{u}}_{\varepsilon,n}$, defined in (4.9), has been computed using the stochastic algorithm (4.7) from i.i.d. observations $X_{1},\cdots,X_{n}$ sampled from $\nu$ supported on $\mathbb{S}^{2}$. To obtain a regularized quantile function, the empirical counterpart of Corollary 4.1 would involve integrals with respect to $\mu_{\mathbb{S}^{2}}$ to compute the smooth conjugate of $\widehat{\mathbf{u}}_{\varepsilon,n}$. To circumvent this issue, we consider a random sample $U_{1},\cdots,U_{N}$ uniformly drawn on $\mathbb{S}^{2}$, and we define the following estimator (as an approximation of $\widehat{\mathbf{u}}_{\varepsilon,n}^{c,\varepsilon}$)

(4.15) $$\widehat{\mathbf{u}}_{N,n}^{c,\varepsilon}(z)=-\varepsilon\log\frac{1}{N}\sum_{i=1}^{N}\exp\Big(\frac{\widehat{\mathbf{u}}_{\varepsilon,n}(U_{i})-c(U_{i},z)}{\varepsilon}\Big).$$
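
The Monte Carlo approximation (4.15) is a log-mean-exp, which can overflow for small $\varepsilon$ if evaluated naively. The sketch below shows one numerically stable way to evaluate it. It is only an illustration: the cost $c=d^{2}/2$, the helper names, and the toy potential values `u_vals` (standing in for $\widehat{\mathbf{u}}_{\varepsilon,n}(U_{i})$) are assumptions, not the authors' implementation.

```python
import numpy as np

def sample_uniform_sphere(N, rng):
    """Draw N points uniformly on S^2 by normalizing Gaussian vectors."""
    v = rng.standard_normal((N, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def cost(U, z):
    """Assumed cost c(x, z) = d(x, z)^2 / 2, with d the geodesic distance."""
    return 0.5 * np.arccos(np.clip(U @ z, -1.0, 1.0)) ** 2

def smooth_conjugate(u_vals, U, z, eps):
    """Monte Carlo smooth conjugate (4.15), via a stabilized log-mean-exp."""
    a = (u_vals - cost(U, z)) / eps
    m = a.max()  # subtract the maximum before exponentiating
    log_mean_exp = m + np.log(np.mean(np.exp(a - m)))
    return -eps * log_mean_exp

rng = np.random.default_rng(0)
U = sample_uniform_sphere(2000, rng)
u_vals = 0.1 * U[:, 2]  # hypothetical values of the potential at the U_i
z = np.array([0.0, 0.0, 1.0])
eps = 0.1
value = smooth_conjugate(u_vals, U, z, eps)
```

Adding a constant to the potential shifts the conjugate by the opposite constant, which gives a quick consistency check for any implementation.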

Then, thanks to (4.3), we remark that,

(4.16) $$\exp\Big(\frac{\mathbf{u}_{\varepsilon}(x)-c(x,z)+\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)}{\varepsilon}\Big)=\frac{\exp\Big(\frac{\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)-c(x,z)}{\varepsilon}\Big)}{\int\exp\Big(\frac{\mathbf{u}_{\varepsilon}^{c,\varepsilon}(z)-c(x,z)}{\varepsilon}\Big)d\nu(z)}.$$

Hence, plugging (4.15) into (4.16), we propose the following estimator for the regularized quantile function $\mathbf{Q}_{\varepsilon}$ defined in Corollary 4.1,

(4.17) $$\hat{\mathbf{Q}}_{N,n}^{\varepsilon}(x)=\text{Exp}_{x}\Big(\sum_{i=1}^{n}\hat{g}_{N,n}^{\varepsilon}(x,X_{i})\,\text{Log}_{x}(X_{i})\Big),$$

where

$$\hat{g}_{N,n}^{\varepsilon}(x,z)=\frac{\exp\Big(\frac{\widehat{\mathbf{u}}_{N,n}^{c,\varepsilon}(z)-c(x,z)}{\varepsilon}\Big)}{\sum_{j=1}^{n}\exp\Big(\frac{\widehat{\mathbf{u}}_{N,n}^{c,\varepsilon}(X_{j})-c(x,X_{j})}{\varepsilon}\Big)}.$$
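
The estimator (4.17) is a softmax-weighted average of the tangent vectors $\text{Log}_{x}(X_{i})$, mapped back to the sphere by $\text{Exp}_{x}$. A minimal sketch follows, assuming the standard exponential and logarithm maps of the unit sphere and the cost $c=d^{2}/2$. The array `uc_vals` is a hypothetical placeholder for the values $\widehat{\mathbf{u}}_{N,n}^{c,\varepsilon}(X_{i})$, assumed precomputed from (4.15).

```python
import numpy as np

def log_map(x, y):
    """Logarithm map on the unit sphere: tangent vector at x pointing to y."""
    t = np.clip(np.dot(x, y), -1.0, 1.0)
    v = y - t * x
    nv = np.linalg.norm(v)
    if nv < 1e-12:  # y equals x (or is antipodal, where Log is not unique)
        return np.zeros_like(x)
    return np.arccos(t) * v / nv

def exp_map(x, v):
    """Exponential map on the unit sphere, for a tangent vector v at x."""
    nv = np.linalg.norm(v)
    if nv < 1e-12:
        return x.copy()
    return np.cos(nv) * x + np.sin(nv) * v / nv

def quantile_estimate(x, X, uc_vals, eps):
    """Sketch of (4.17): Exp_x of the weighted sum of Log_x(X_i)."""
    c = 0.5 * np.arccos(np.clip(X @ x, -1.0, 1.0)) ** 2  # assumed cost d^2 / 2
    a = (uc_vals - c) / eps
    w = np.exp(a - a.max())
    w /= w.sum()  # weights g(x, X_i), normalized over the sample
    v = sum(wi * log_map(x, Xi) for wi, Xi in zip(w, X))
    return exp_map(x, v)

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)
uc_vals = np.zeros(200)  # hypothetical conjugate values at the observations
x = np.array([0.0, 0.0, 1.0])
q = quantile_estimate(x, X, uc_vals, eps=0.1)
```

With a single observation the weight equals one, so the output reduces to that observation, which is a convenient unit test.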

By the same token, an estimator of $\mathbf{F}_{\varepsilon}$ is given by

(4.18) $$\hat{\mathbf{F}}_{N,n}^{\varepsilon}(z)=\text{Exp}_{z}\Big(\sum_{i=1}^{N}\tilde{g}_{N,n}^{\varepsilon}(U_{i},z)\,\text{Log}_{z}(U_{i})\Big),$$

with

$$\tilde{g}_{N,n}^{\varepsilon}(x,z)=\frac{\exp\Big(\frac{\widehat{\mathbf{u}}_{\varepsilon,n}(x)-c(x,z)}{\varepsilon}\Big)}{\sum_{j=1}^{N}\exp\Big(\frac{\widehat{\mathbf{u}}_{\varepsilon,n}(U_{j})-c(U_{j},z)}{\varepsilon}\Big)}.$$

Note that the empirical version of unregularized MK quantiles proposed in [37] relies on discrete OT, yielding a bijection between a reference grid of points in $\mathbb{S}^{2}$ and the samples. This is beneficial for statistical testing, where distribution-freeness of the ranks is highly desirable. In contrast, regularization yields, even empirically, smooth maps that are not constrained to take values in the set of observed data, which is crucial for the descriptive analysis of Section 5.

Besides, the estimation of contours in [37] requires solving two different discrete OT problems. The first one estimates the central point $\mathbf{F}(\theta_{M})$, whereas the second involves a grid oriented towards the estimate of $\mathbf{F}(\theta_{M})$, to render MK contours. In contrast, with our algorithm targeting continuous OT, there is no need to solve two different OT problems, as the estimate $\widehat{\mathbf{u}}_{\varepsilon,n}$ yields both $\hat{\mathbf{F}}_{N,n}^{\varepsilon}$ and $\hat{\mathbf{Q}}_{N,n}^{\varepsilon}$, and a fortiori $\hat{\mathbf{F}}_{N,n}^{\varepsilon}(\theta_{M})$.

5. Depth-based data analysis

This section is devoted to a companion concept of directional MK quantiles, the MK statistical depth. We state a directional definition and discuss its properties. We then introduce descriptive tools in the spirit of those presented in [50] in the Euclidean setting.

For the sake of completeness, we first study the Euclidean setting $\mathbb{R}^{d}$ before the directional one, which is of particular interest to us. Indeed, to the best of our knowledge, the results that we derive below do not appear as such in the literature.

5.1. Euclidean setting

We begin with the main definitions taken from [12]. Our chosen reference measure, denoted by $U_{d}$, is the distribution of the random vector $R\Phi$, for $R$ and $\Phi$ independently and uniformly drawn from $[0,1]$ and from the unit hypersphere $\mathbb{S}^{d-1}=\{\varphi\in\mathbb{R}^{d}:\|\varphi\|=1\}$, respectively, as originally proposed in [12, 36] to define MK quantiles in $\mathbb{R}^{d}$. Note that the MK distribution function, the inverse of the MK quantile function, might not exist, e.g. if $\nu$ is discrete. Following [32], this is tackled with the Legendre-Fenchel dual of a convex function $\psi$, given by $\psi^{*}(x)=\sup_{u\in\mathbb{R}^{d}}\{\langle x,u\rangle-\psi(u)\}$.

Definition 5.1.

Let $\nu$ be an arbitrary probability measure on $\mathbb{R}^{d}$. Its MK quantile function is the unique $\mathbf{Q}=\nabla\psi$ for some convex $\psi:\mathbb{R}^{d}\rightarrow\mathbb{R}$ such that $\mathbf{Q}_{\#}U_{d}=\nu$. Then,

  (1) The MK $\alpha$-quantile contour is the image by $\mathbf{Q}$ of the hypersphere

    $$\mathcal{S}(\alpha)=\{u\in\mathbb{R}^{d}:\|u\|=\alpha\}.$$

  (2) The sign curve associated to $u\in\mathbb{B}(0,1)$ is the image by $\mathbf{Q}$ of the radius

    $$L_{u}=\Big\{t\frac{u}{\|u\|}:t\in[0,1]\Big\}.$$

  (3) The MK depth of $x\in\mathbb{R}^{d}$ is the depth of $\nabla\psi^{*}(x)$ under Tukey's depth [81],

    $$D_{\nu}(x)=D_{U_{d}}^{Tukey}\Big(\nabla\psi^{*}(x)\Big).$$

The Liu-Zuo-Serfling axioms [49, 85] describe desirable properties for depth concepts. The MK depth relaxes some of them, to reach more relevant contours [12]. Firstly, the MK depth corresponds to Tukey depth for elliptical families [12]. Moreover, it benefits from invariance properties [32, Lemmas A.7, A.8] with respect to scaling (multiplication by a positive constant), translations, and orthogonal transformations (multiplication by an orthogonal matrix). Note that affine invariance does not hold. Another axiom is linear monotonicity relative to the deepest points, that is, $D_{\nu}(x)\leq D_{\nu}((1-t)x_{0}+tx)$ for all $t\in[0,1]$ if $x_{0}$ is a deepest point. This is not fulfilled by the MK depth [12], although it satisfies a similar property along sign curves, as we shall now see. Proofs of the following results are deferred to the Appendix.

Proposition 5.1 (Curvilinear monotonicity relative to the deepest points).

Assume that $\nu$ is continuous. The MK depth is monotonically decreasing along sign curves, that is, for each $u\in\mathbb{B}(0,1)$ and $t\in[0,1]$,

$$D_{\nu}(\mathbf{Q}(u))\leq D_{\nu}(\mathbf{Q}(tu)).$$

This corresponds to the classical linear monotonicity for distributions with straight sign curves, including spherical families, owing to the particular form of the MK quantile function in this setting, taken from [12].

Corollary 5.1.

For spherically symmetric distributions, sign curves are straight lines, and the MK depth satisfies linear monotonicity relative to the deepest point: for any $x$ in the support of $\nu$, and for $x_{0}=\mathbb{E}(X)$ the deepest point of $\nu$,

(5.1) $$\forall t\in[0,1],\quad D_{\nu}(x)\leq D_{\nu}((1-t)x_{0}+tx).$$

We now turn to the properties of the directional MK depth.

5.2. Directional setting

Using the same ideas, one can define the MK depth on the sphere through any statistical depth with respect to the uniform distribution $\mu_{\mathbb{S}^{2}}$ oriented towards $\mathbf{F}(\theta_{M})$, and the simplest choice is surely to consider the proximity to $\mathbf{F}(\theta_{M})$.

Definition 5.2.

Let $\nu$ be an arbitrary probability measure on $\mathbb{S}^{2}$, with directional distribution function $\mathbf{F}$. The directional MK depth of $x\in\mathbb{S}^{2}$ is defined by

$$D_{\nu}(x)=1-d(\mathbf{F}(x),\mathbf{F}(\theta_{M}))/\pi.$$
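
Once $\mathbf{F}(x)$ and $\mathbf{F}(\theta_{M})$ are available, the depth of Definition 5.2 is a one-line computation. The sketch below assumes that $d$ is the geodesic distance $\arccos\langle\cdot,\cdot\rangle$ on $\mathbb{S}^{2}$; the function name and its inputs are hypothetical, and in practice the two arguments would come from an estimate of $\mathbf{F}$.

```python
import numpy as np

def directional_mk_depth(F_x, F_center):
    """Depth 1 - d(F(x), F(theta_M)) / pi, with d the geodesic distance."""
    cos_angle = np.clip(np.dot(F_x, F_center), -1.0, 1.0)
    return 1.0 - np.arccos(cos_angle) / np.pi

center = np.array([0.0, 0.0, 1.0])  # stands for F(theta_M)
depth_center = directional_mk_depth(center, center)
depth_equator = directional_mk_depth(np.array([1.0, 0.0, 0.0]), center)
depth_antipode = directional_mk_depth(-center, center)
```

The three values illustrate the range of the depth: it is maximal when $\mathbf{F}(x)$ coincides with the center, equals one half on the equator of the reference sphere, and vanishes at the antipode.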

Regarding the $\mathbb{S}^{2}$-adapted versions of the Liu-Zuo-Serfling axioms [49, 85], the directional MK depth behaves like its Euclidean counterpart. We begin with the four classical properties that are direct spherical counterparts of the Euclidean axioms, see e.g. [46]. Affine invariance is replaced on $\mathbb{S}^{2}$ by rotational invariance, which holds by [37], see also Proposition A.1. Moreover, it is straightforward that $D_{\nu}$ attains its maximum at the center $\mathbf{F}(\theta_{M})$, and that it vanishes at $-\mathbf{F}(\theta_{M})$, the spherical counterpart of infinity. Finally, monotonicity along great circles is not fulfilled, but it is replaced in the same data-adaptive fashion as in $\mathbb{R}^{d}$.

Proposition 5.2 (Curvilinear monotonicity relative to the deepest points).

Assume that $\nu$ is continuous. The directional MK depth is monotonically decreasing along sign curves. For each $x\in\mathbb{S}^{2}$ and $t\in[\langle x,\mathbf{F}(\theta_{M})\rangle,1]$, let $x_{t}\in\mathcal{M}_{s}^{U}$, for $s=\mathbf{S}_{\mathbf{F}(\theta_{M})}(x)$, such that

(5.2) $$x_{t}=t\,\mathbf{F}(\theta_{M})+\sqrt{1-t^{2}}\,s.$$

Then,

$$D_{\nu}(\mathbf{Q}(x))\leq D_{\nu}(\mathbf{Q}(x_{t})).$$
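
The parametrization (5.2) can be explored numerically in the reference domain. The sketch below assumes that the directional sign $\mathbf{S}_{\theta}(x)$ is the normalized projection of $x$ onto the plane orthogonal to $\theta$; this form is an assumption of the illustration, as it is not restated in this section. If moreover $\mathbf{F}(\mathbf{Q}(x))=x$, the depth of $\mathbf{Q}(x_{t})$ reduces to $1-d(x_{t},\mathbf{F}(\theta_{M}))/\pi$, so the monotonicity amounts to the geodesic distance to the pole decreasing in $t$.

```python
import numpy as np

def directional_sign(pole, x):
    """Assumed form of S_pole(x): normalized projection of x orthogonal to pole."""
    p = x - np.dot(x, pole) * pole
    return p / np.linalg.norm(p)

def meridian_point(pole, s, t):
    """Point x_t of (5.2): t * pole + sqrt(1 - t^2) * s."""
    return t * pole + np.sqrt(1.0 - t ** 2) * s

pole = np.array([0.0, 0.0, 1.0])  # plays the role of F(theta_M)
x = np.array([0.6, 0.0, -0.8])
s = directional_sign(pole, x)

# Move from x (t = <x, pole>) towards the pole (t = 1) along the meridian.
ts = np.linspace(np.dot(x, pole), 1.0, 50)
pts = np.array([meridian_point(pole, s, t) for t in ts])
dist_to_pole = np.arccos(np.clip(pts @ pole, -1.0, 1.0))
depth_along_curve = 1.0 - dist_to_pole / np.pi
```

The points `pts` stay on the unit sphere, start at $x$, and the depth values in `depth_along_curve` are non-decreasing in $t$.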

Explicit formulations for rotationally invariant distributions are given in [37], and recalled in Appendix B. In particular, they show that MK quantile contours coincide with Mahalanobis ones [46] for such distributions, so that the following is straightforward.

Corollary 5.2.

For rotationally invariant distributions, sign curves are great circles. Thus, the MK depth verifies linear monotonicity along great circles, relative to the deepest point.

Other desirable axioms have been put forward recently in [62, 61], namely the upper semi-continuity and the non-rigidity of central regions. Upper semi-continuity holds as soon as the MK distribution function $\mathbf{F}$ is continuous, thus at least for $\nu\in\mathbf{B}_{2}$. Moreover, when $\nu$ is arbitrary, taking the regularized directional MK depth built from $\mathbf{F}_{\varepsilon}$, for $\varepsilon>0$, imposes continuity, which may motivate such a regularized estimator. Lastly, the non-rigidity of central regions states that quantile regions are not restricted to be spherical caps, which is readily true for the MK depth. In fact, its adaptivity to the underlying support is one of its main features, and it can be seen as a stronger non-rigidity axiom, requiring that $\mathbf{Q}(U)\sim\nu$ as soon as $U\sim\mu$. Furthermore, Proposition 5.2 and Corollary 5.2 shed some light on the non-verified axiom of monotonicity along great circles. Our results suggest that the directional MK depth relaxes these axioms when necessary, e.g. for complex distributions such as mixtures, whereas the axioms are fulfilled for distributions for which it is useful, in particular for rotationally invariant ones.

5.3. Descriptive tools

The seminal paper [50] gathers descriptive tools based on data depths. Monge-Kantorovich analogs already exist for data in $\mathbb{R}^{d}$, and we shall now extend some of them to the directional setting. We stress that the ability of our regularized estimator to interpolate between data points is crucial for $(i)$ smooth contours in practice and $(ii)$ computing volumes of quantile regions.

5.3.1. Representative plots

Firstly, [50] studies representative plots for bivariate data, and the MK analog is given by the descriptive plots from [36, 37], the latter with the added information of sign curves. Figure 1 illustrates this on a tangent von Mises-Fisher distribution [29] and on a mixture of two von Mises-Fisher distributions, with the help of our empirical regularized quantile function. One can observe that the shapes of the distributions are well recovered.

Figure 1. Regularized quantile contours of levels $\{0.1,0.25,0.5,0.75,0.9\}$ and associated sign curves, with $\varepsilon=10^{-1}$.

5.3.2. Scale or dispersion

Hereafter, we present a graphical tool to describe the amount of dispersion, called the scale curve in [50], whose MK analog has been introduced in [4] for Euclidean data. We mention that, in [50], this tool is also used to compare the variance of vector-valued estimators. Put simply, given any level $\alpha\in[0,1]$, we consider the volumes $V(\alpha)$ of MK quantile regions. Plotting such volumes with respect to $\alpha\in[0,1]$ yields a scale curve [50]. The faster it grows, the greater the dispersion. Thus, if the scale curve of $\nu_{1}$ is consistently above that of $\nu_{2}$, then $\nu_{1}$ is more spread out than $\nu_{2}$. On $\mathbb{S}^{2}$, the volume is bounded, so we consider the normalized measure $\mu_{\mathbb{S}^{2}}$ instead of $\sigma_{\mathbb{S}^{2}}$. Define

$$V(\alpha)=\int_{\mathcal{C}_{\alpha}}d\mu_{\mathbb{S}^{2}}(x)=\int_{\mathbb{S}^{2}}\mathds{1}_{\{x\in\mathcal{C}_{\alpha}\}}d\mu_{\mathbb{S}^{2}}(x)=\int_{\mathbb{S}^{2}}\mathds{1}_{\{\langle\mathbf{F}(x),\mathbf{F}(\theta_{M})\rangle\geq 1-2\alpha\}}d\mu_{\mathbb{S}^{2}}(x).$$

This can be estimated with a sample U1,,UNsubscript𝑈1subscript𝑈𝑁U_{1},\cdots,U_{N}italic_U start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , ⋯ , italic_U start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT from μ𝕊2subscript𝜇superscript𝕊2\mu_{\mathbb{S}^{2}}italic_μ start_POSTSUBSCRIPT blackboard_S start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT, by the proportion

Vε,n(α)=1Ni=1N𝟙{𝐅^N,nε(Ui),𝐅(θM)12α}.subscript𝑉𝜀𝑛𝛼1𝑁superscriptsubscript𝑖1𝑁subscript1superscriptsubscript^𝐅𝑁𝑛𝜀subscript𝑈𝑖𝐅subscript𝜃𝑀12𝛼V_{\varepsilon,n}(\alpha)=\frac{1}{N}\sum_{i=1}^{N}\mathds{1}_{\{\langle\hat{% \mathbf{F}}_{N,n}^{\varepsilon}(U_{i}),\mathbf{F}(\theta_{M})\rangle\geq 1-2% \alpha\}}.italic_V start_POSTSUBSCRIPT italic_ε , italic_n end_POSTSUBSCRIPT ( italic_α ) = divide start_ARG 1 end_ARG start_ARG italic_N end_ARG ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT blackboard_1 start_POSTSUBSCRIPT { ⟨ over^ start_ARG bold_F end_ARG start_POSTSUBSCRIPT italic_N , italic_n end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ε end_POSTSUPERSCRIPT ( italic_U start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) , bold_F ( italic_θ start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ) ⟩ ≥ 1 - 2 italic_α } end_POSTSUBSCRIPT .

On the left-hand side of Figure 2, we draw the scale curves of von Mises-Fisher distributions with varying concentration parameter $\kappa\in\{1,2,5,15\}$, which controls the dispersion of samples. The curves clearly show that the lower the value of $\kappa$, the more spread out the underlying distribution.
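In the rotationally symmetric case, curves of this kind can be approximated without any OT solver, because the population map $\mathbf{F}$ is available in closed form (Appendix B). The following Python sketch works under that assumption: it replaces the estimator $\hat{\mathbf{F}}_{N,n}^{\varepsilon}$ by the closed-form $\mathbf{F}$ of a von Mises-Fisher distribution with $\theta_{M}=(0,0,1)^{T}$. The function names (`sample_uniform_sphere`, `vmf_f_star`, `scale_curve`) are illustrative and are not taken from the authors' implementation.

```python
import numpy as np

def sample_uniform_sphere(n, rng):
    """Draw n points from the uniform distribution on the unit 2-sphere."""
    g = rng.standard_normal((n, 3))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def vmf_f_star(r, kappa):
    """F_f^*(r) = 2 F_f(r) - 1 for f(s) = exp(kappa * s), r in [-1, 1]."""
    cdf = (np.exp(kappa * r) - np.exp(-kappa)) / (np.exp(kappa) - np.exp(-kappa))
    return 2.0 * cdf - 1.0

def scale_curve(alphas, kappa, n_mc=200_000, seed=0):
    """Monte Carlo proportion of uniform points falling in each quantile region."""
    rng = np.random.default_rng(seed)
    u = sample_uniform_sphere(n_mc, rng)
    # With theta_M = (0, 0, 1) and F(theta_M) = theta_M, the inner product
    # <F(U_i), F(theta_M)> is the third coordinate of F(U_i), i.e. F_f^*(U_i[2]).
    inner = vmf_f_star(u[:, 2], kappa)
    return np.array([np.mean(inner >= 1.0 - 2.0 * a) for a in alphas])

alphas = np.linspace(0.0, 1.0, 21)
curves = {kappa: scale_curve(alphas, kappa) for kappa in (1, 2, 5, 15)}
```

Since the third coordinate of a uniform point is uniform on $[-1,1]$, this toy setting also admits the exact value $V(\alpha)=(1-Q_{f}(1-\alpha))/2$, which can serve as a check of the Monte Carlo proportion.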

Besides, univariate order statistics (and equivalently, quantiles) are fundamental to analyse the presence of outliers. In our spherical setting, the scale curve conveniently summarizes this type of information, as illustrated on the right-hand side of Figure 2. We consider $n=500$ observations coming from three identical von Mises-Fisher distributions with concentration parameter $\kappa=15$ and mean $(0,1,0)^{T}$, but each with a certain number $N\in\{5,20,50\}$ of outliers located near the North Pole $(0,0,1)^{T}$. It appears that the dispersion of quantile regions up to the order $\alpha\approx 0.8$ is identical, whereas the dispersion increases with the number of outliers for peripheral quantile regions, which is precisely the expected behavior.

Figure 2. Scale curves of (left) von Mises-Fisher distributions for various $\kappa$ and (right) empirical Fréchet means and medians.

6. Numerical experiments

Our numerical experiments first qualitatively compare our regularized estimator of the MK quantile function with other existing notions of spherical quantiles. We then study the influence of the regularization strength $\varepsilon>0$ on the mean-squared error (MSE) with known ground truth, and on the smoothness of regularized quantile contours.

6.1. Other concepts of quantiles

In Figure 3, we display a visual comparison of existing concepts of quantiles on the sphere, focusing on mixtures of von Mises-Fisher distributions. It can be observed from Figure 3 that Mahalanobis quantile regions [46] are concentric spherical caps, whereas spatial quantiles [44] and our regularized MK quantiles can exhibit more complex shapes that better fit the geometry of the data. In that respect, spatial and regularized MK quantiles, the latter obtained through our entropically regularized estimator $\mathbf{Q}_{\varepsilon}$, are both more satisfactory. For the spatial quantiles, our naive implementation forces them to belong to the data sample. For each notion of quantiles, $100$ points are drawn along each contour and linked by straight lines. We emphasize that spatial quantiles are not indexed by their probability content, as opposed to the MK ones [37]. Because entropic MK quantiles interpolate between data points, contours cross the void between mixture components. A careful inspection shows that the number of points per contour within this void is much lower than in the high-density areas. This illustrates how the variation of mass, that is, the underlying geometry, is captured by our regularized estimator.

Figure 3. Quantile contours of levels $\{0.1,0.25,0.5,0.75,0.9\}$ for different notions of spherical quantiles and data sampled from a mixture of two (first row) and three (second row) von Mises-Fisher distributions. Panels: (a), (d) Mahalanobis quantiles; (b), (e) Spatial quantiles; (c), (f) Monge-Kantorovich quantiles.

6.2. Estimation of OT maps

In Figure 4, we study the influence of the regularization parameter on the estimation of known quantile contours, which were described in [37] and are recalled in Appendix B for the sake of completeness. The mean-squared error from uniform samples $(x_{i})\subset\mathbb{S}^{2}$ is

\[
\mathcal{R}_{n}(\widehat{Q})=\frac{1}{n}\sum_{i=1}^{n}c(Q(x_{i}),\widehat{Q}(x_{i})),
\]

for $\widehat{Q}$ denoting either our regularized MK quantile estimator or the unregularized one proposed in [37]. Samples of size $n=500$ are drawn from the uniform $\mu_{\mathbb{S}^{2}}$ and from a von Mises-Fisher distribution of location $(0,0,1)^{T}$ and concentration $\kappa=10$. For several values of $\varepsilon$, the regularized $\mathbf{Q}_{\varepsilon}$ is estimated, and the unregularized estimator $\widehat{\mathbf{Q}}_{0}$ from [37] is computed. Each resulting estimator is compared to the ground truth by $\mathcal{R}_{n}(\widehat{Q})$, computed on the uniform sample $(x_{i})\subset\mathbb{S}^{2}$ of size $n$. Repeating this experiment $50$ times, we obtain a boxplot of MSE values for each value of the regularization parameter $\varepsilon$. The results are reported in Figure 4, where $\varepsilon=0$ refers to the MSE of $\widehat{\mathbf{Q}}_{0}$. The dashed horizontal line indicates the median value of the MSE of $\widehat{\mathbf{Q}}_{0}$.
It can be observed that entropic regularization significantly improves the estimation of the quantile map compared to the unregularized estimator, in particular for values around $\varepsilon\approx 0.09$.
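To make the criterion $\mathcal{R}_{n}$ concrete, the following Python sketch evaluates it for two arrays of points on the sphere. It assumes the cost $c(x,y)=\frac{1}{2}d(x,y)^{2}$ with $d$ the geodesic distance, which is consistent with (C.3). The function names and the toy "estimator" (a small rotation of the reference points) are purely illustrative.

```python
import numpy as np

def geodesic_distance(x, y):
    """Great-circle distance between matching rows of x and y (unit vectors)."""
    inner = np.clip(np.sum(x * y, axis=-1), -1.0, 1.0)
    return np.arccos(inner)

def transport_cost(x, y):
    """Assumed cost c(x, y) = d(x, y)^2 / 2."""
    return 0.5 * geodesic_distance(x, y) ** 2

def empirical_risk(q_true, q_hat):
    """R_n(Q_hat): average cost between Q(x_i) and Q_hat(x_i)."""
    return float(np.mean(transport_cost(q_true, q_hat)))

# Toy usage: compare points with a copy rotated by 0.1 rad about the z-axis.
rng = np.random.default_rng(0)
pts = rng.standard_normal((500, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)
angle = 0.1
rot = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                [np.sin(angle), np.cos(angle), 0.0],
                [0.0, 0.0, 1.0]])
risk = empirical_risk(pts, pts @ rot.T)
```

In an actual experiment, `q_true` would hold the ground-truth values $Q(x_{i})$ from (B.3) and `q_hat` the values returned by the estimator under study.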

Figure 4. Mean squared error $\mathcal{R}_{n}(\widehat{\mathbf{Q}}_{\varepsilon})$ as a function of the regularization parameter $\varepsilon\in[0,0.2]$. The horizontal dashed line is the value of $\mathcal{R}_{n}(\widehat{\mathbf{Q}}_{0})$.

6.3. Qualitative effect of regularization

In Figure 5, we visually compare regularized and unregularized MK spherical quantile contours of orders $24.4\%,48.8\%,75.6\%$ on the same von Mises-Fisher distribution as in Figure 4. Such uncommon probability contents are inherent to empirical unregularized contours because the number of contours as well as their size depends on the sample size, which is here fixed at $n=2001$. Ground-truth contours deduced from (B.3) are presented together with unregularized and regularized ones, for $\varepsilon\in\{0.01,0.05,1\}$. Each contour contains $100$ points, linked by straight lines. For $\varepsilon=0.01$, contours adapt too closely to the finite-sample data, causing errors as ground-truth contours are smoother. For $\varepsilon=1$, contours are smoother, but the bias between $\widehat{\mathbf{Q}}_{\varepsilon}$ and the underlying ground truth is too large. For the well-chosen $\varepsilon=0.05$, the trade-off between regularity and low bias yields the best estimation. This sheds some light on the behavior of regularization. The lower the $\varepsilon$, the more adapted $\widehat{\mathbf{Q}}_{\varepsilon}$ is to the finite-sample data and its irregularities. Larger values of $\varepsilon$ induce smoother contours, as a byproduct of a greater regularity of $\widehat{\mathbf{Q}}_{\varepsilon}$. This emphasizes the need to calibrate the regularization strength.

Figure 5. Empirical regularized quantile contours in blue, unregularized ones in orange, and ground truth in green. Panels: (a) $\varepsilon=0.01$; (b) $\varepsilon=0.05$; (c) $\varepsilon=1$.

7. Concluding remarks

The major limitation of discrete OT for quantile estimation is that it results in a matching between samples, instead of a function $Q$ able to provide out-of-sample estimates $Q(x)$. In the present paper, we showed that entropic regularization can be used as an alternative, particularly when the focus is on the quantile contours or the volumes of quantile regions. Still, we emphasize that entropic regularization loses the distribution-freeness of the associated ranks, compared to solving the discrete-discrete OT problem as in [37]. Because this property is crucial for rank-based statistical testing, the choice between OT and EOT should depend on the task at hand.

Our regularized quantile function is an entropic map, which has been generalized here beyond the Euclidean setting, building on the particular structure of the $2$-sphere. Our numerical scheme leverages the existence of spherical Fourier series to construct a stochastic gradient descent that solves continuous OT in the limit of the iterations. This is particularly useful when the number of observations $n$ is large and prevents the storage of the cost matrix. Numerical experiments revealed the ability of entropically regularized quantiles to improve the mean-squared error when the ground truth is known, and showed the potential of this approach for the analysis of spherical data.

Appendix

Appendix A Invariance properties

Invariance properties of empirical versions of $\mathbf{F}$ and $\mathbf{Q}$ were shown in [37]. The same holds in fact for the population counterparts, with the same argument: the transport problem (3.8) inherits invariance from the Riemannian distance.

Proposition A.1.

In dimension $d$, let $\mathbf{O}$ be a $d\times d$ orthogonal matrix and let $\nu\in\mathbf{B}_{2}$. Denote by $\mathbf{O}_{\#}\nu$ the distribution of $\mathbf{O}Z$ if $Z\sim\nu$, and by $\mathbf{F}_{Z},\mathbf{Q}_{Z}$ (resp. $\mathbf{F}_{\mathbf{O}Z},\mathbf{Q}_{\mathbf{O}Z}$) the distribution and quantile functions of $\nu$ (resp. $\mathbf{O}_{\#}\nu$). Then,

\[
\mathbf{F}_{\mathbf{O}Z}(\mathbf{O}z)=\mathbf{O}\mathbf{F}(z)
\]

and

\[
\mathbf{Q}_{\mathbf{O}Z}(\mathbf{O}z)=\mathbf{O}\mathbf{Q}(z).
\]
Proof.

Note that the Kantorovich problem, equivalent to (3.8) when $\nu\in\mathbf{B}_{2}$, minimizes

\[
\int_{\mathbb{S}^{2}}\int_{\mathbb{S}^{2}}c(x,y)d\pi(x,y),
\]

over the set of joint probabilities $\pi$ supported on $\mathbb{S}^{2}\times\mathbb{S}^{2}$ with marginals $\mu_{\mathbb{S}^{2}},\nu$, see e.g. [37]. Because $c(\mathbf{O}x,\mathbf{O}y)=c(x,y)$, the transport problem between $\mu_{\mathbb{S}^{2}}$ and $\nu$ is equivalent to the one between $\mathbf{O}_{\#}\mu_{\mathbb{S}^{2}}$ and $\mathbf{O}_{\#}\nu$, where $\mathbf{O}_{\#}\mu_{\mathbb{S}^{2}}=\mu_{\mathbb{S}^{2}}$ by rotation invariance of the uniform distribution. Going back to Monge's problem (3.8), it immediately follows that the Monge map $T$, which satisfies $T_{\#}\mu_{\mathbb{S}^{2}}=\nu$, verifies

\[
\mathbf{O}T(z)=T_{\mathbf{O}Z}(\mathbf{O}z).
\]

Up to exchanging the roles of the reference and target measures, the result follows. ∎
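Proposition A.1 can be checked numerically in the rotationally symmetric case, where $\mathbf{F}$ has the closed form (B.1). The Python sketch below assumes a von Mises-Fisher distribution, for which rotating the distribution by $\mathbf{O}$ amounts to replacing $\theta_{M}$ by $\mathbf{O}\theta_{M}$. It also assumes that $S_{\theta_{M}}(z)$ denotes the normalized projection of $z$ onto the plane orthogonal to $\theta_{M}$, a definition that is not restated in this appendix. Function names are illustrative.

```python
import numpy as np

def vmf_f_star(r, kappa):
    """F_f^*(r) = 2 F_f(r) - 1 for f(s) = exp(kappa * s)."""
    cdf = (np.exp(kappa * r) - np.exp(-kappa)) / (np.exp(kappa) - np.exp(-kappa))
    return 2.0 * cdf - 1.0

def vmf_distribution_map(z, theta, kappa):
    """Closed form (B.1) for a vMF law with location theta (unit vectors)."""
    r = z @ theta
    tangent = z - r * theta                 # component of z orthogonal to theta
    s = tangent / np.linalg.norm(tangent)   # assumed definition of S_theta(z)
    fs = vmf_f_star(r, kappa)
    return fs * theta + np.sqrt(1.0 - fs ** 2) * s

rng = np.random.default_rng(1)
# Random orthogonal matrix obtained from a QR decomposition.
O, _ = np.linalg.qr(rng.standard_normal((3, 3)))
theta = np.array([0.0, 0.0, 1.0])
z = np.array([0.6, 0.0, 0.8])
kappa = 5.0

lhs = vmf_distribution_map(O @ z, O @ theta, kappa)  # F_{OZ}(O z)
rhs = O @ vmf_distribution_map(z, theta, kappa)      # O F(z)
```

Both sides agree up to rounding errors, since the closed form only involves inner products and projections, which are preserved by orthogonal matrices.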

The following corollary is straightforward.

Corollary A.1.

For any $\tau\in[0,1]$, $\mathbf{O}\mathcal{C}_{\tau}=\mathbf{Q}_{\mathbf{O}Z}(\mathbf{O}\mathcal{C}^{U}_{\tau})$.

From [74][Lemma 1] or [27][Theorem 3.2], a convex combination of $c$-concave functions is itself $c$-concave, giving rise to the following immediate consequence. We also refer to [15][Section 5].

Lemma A.1 (Interpolation).

Let $\nu_{1},\nu_{2}$ be directional probability distributions with given Kantorovich potentials $\psi_{1},\psi_{2}$ and MK quantile functions $\mathbf{Q}_{1},\mathbf{Q}_{2}$, respectively. For any $t\in[0,1]$, let $\psi_{t}=t\psi_{1}+(1-t)\psi_{2}$. Then, the interpolation

\[
\mathbf{Q}_{t}(x)=\text{Exp}_{x}(-\nabla\psi_{t}(x))
\]

is the directional MK quantile function of the distribution $\nu_{t}$ defined by $\nu_{t}={\mathbf{Q}_{t}}_{\#}\mu_{\mathbb{S}^{2}}$.
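As a hedged illustration of Lemma A.1, the Python sketch below implements the exponential map of the unit sphere, $\text{Exp}_{x}(v)=\cos(\|v\|)x+\sin(\|v\|)v/\|v\|$ for a tangent vector $v\perp x$, and applies it to the interpolated gradient $\nabla\psi_{t}=t\nabla\psi_{1}+(1-t)\nabla\psi_{2}$. The gradients `g1` and `g2` are placeholder values chosen for the example. In practice they would be obtained from estimated potentials.

```python
import numpy as np

def exp_map(x, v, eps=1e-12):
    """Exponential map on the unit sphere at x, for v tangent at x (v . x = 0)."""
    norm_v = np.linalg.norm(v)
    if norm_v < eps:
        return x.copy()
    return np.cos(norm_v) * x + np.sin(norm_v) * v / norm_v

def interpolated_quantile(x, grad_psi1, grad_psi2, t):
    """Q_t(x) = Exp_x(-grad psi_t(x)) with psi_t = t psi_1 + (1 - t) psi_2."""
    grad_t = t * grad_psi1 + (1.0 - t) * grad_psi2
    return exp_map(x, -grad_t)

x = np.array([0.0, 0.0, 1.0])
g1 = np.array([0.3, 0.0, 0.0])   # hypothetical tangent gradient of psi_1 at x
g2 = np.array([0.0, -0.5, 0.0])  # hypothetical tangent gradient of psi_2 at x
q_half = interpolated_quantile(x, g1, g2, 0.5)
```

For $t=1$ and $t=0$ the function returns $\mathbf{Q}_{1}(x)$ and $\mathbf{Q}_{2}(x)$ respectively, and the geodesic distance between $x$ and $\text{Exp}_{x}(v)$ equals $\|v\|$ as long as $\|v\|<\pi$.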

Appendix B Explicit forms

Closed-form expressions of $\mathbf{F}$ for rotationally invariant distributions were given in [37] and simplify in dimension $d=3$, which allows one to deduce the inverse map $\mathbf{Q}$. Let $Z\sim\nu$ be such a random vector with axis $\pm\theta_{M}$. Then, assume that $\nu$ has density

\[
z\in\mathbb{S}^{2}\mapsto c_{f}f(z^{T}\theta_{M}),
\]

for $f$ some positive angular function and $c_{f}$ a normalizing constant. For $r\in[-1,1]$, denote by

\[
F_{f}(r)=\int_{-1}^{r}f(s)ds\Big/\int_{-1}^{1}f(s)ds
\]

the distribution function of $Z^{T}\theta_{M}$ and by $Q_{f}=F_{f}^{-1}$ its quantile function. Then, letting $F_{f}^{*}(r)=2F_{f}(r)-1$, the directional distribution function of $Z$ writes

\[
\mathbf{F}(z)=F_{f}^{*}(z^{T}\theta_{M})\theta_{M}+\sqrt{1-F_{f}^{*}(z^{T}\theta_{M})^{2}}\,S_{\theta_{M}}(z). \tag{B.1}
\]

For instance, taking $f(s)=\exp(\kappa s)$ corresponds to the von Mises-Fisher distribution with location parameter $\theta_{M}$ and concentration parameter $\kappa\in\mathbb{R}_{+}$. Crucially, the transport (B.1) reduces to univariate transport along the axis $\theta_{M}=\mathbf{F}(\theta_{M})$. If $\theta_{M}=(0,0,1)^{T}$, which can be assumed up to some rotation thanks to Proposition A.1 and Corollary A.1, this corresponds to changing the latitude w.r.t. the usual coordinate system (3.1). Indeed, as soon as $\theta_{M}=(0,0,1)^{T}$,

\[
\mathbf{F}(z)=F_{f}^{*}(z_{3})\theta_{M}+\sqrt{1-F_{f}^{*}(z_{3})^{2}}\,\frac{(z_{1},z_{2},0)^{T}}{\|(z_{1},z_{2},0)^{T}\|}.
\]

The third coordinate is changed to $F_{f}^{*}(z_{3})$, and the other coordinates are adapted to the constraint $\mathbf{F}(z)\in\mathbb{S}^{2}$. This rewrites, in accordance with (3.1),

\[
\mathbf{F}(z)=\mathbf{F}(\Phi(\theta,\phi))=\Phi(\overline{\theta},\phi)\quad\mbox{for}\quad\overline{\theta}=\arccos\Big(F_{f}^{*}(z_{3})\Big). \tag{B.2}
\]

Consequently, to get the inverse map $\mathbf{Q}=\mathbf{F}^{-1}$, it suffices to change the pseudo-latitude of $\mathbf{F}(z)\in\mathcal{C}_{\tau}^{U}$ w.r.t. the axis $\pm\theta_{M}$. If $x=\mathbf{F}(z)$, then $x_{3}=F_{f}^{*}(z_{3})$ and $z_{3}=(F_{f}^{*})^{-1}(x_{3})$, that is

\[
\mathbf{Q}(x)=\mathbf{Q}(\Phi(\theta,\phi))=\Phi(\tilde{\theta},\phi)\quad\mbox{for}\quad\tilde{\theta}=\arccos\Big((F_{f}^{*})^{-1}(x_{3})\Big). \tag{B.3}
\]

As highlighted in [37], this shows that MK quantile contours coincide with Mahalanobis ones from [46] under the rotationally symmetric model.
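For the von Mises-Fisher case with $\theta_{M}=(0,0,1)^{T}$, the maps (B.2) and (B.3) can be written explicitly, since $F_{f}$ and $Q_{f}$ have elementary expressions when $f(s)=\exp(\kappa s)$. The following Python sketch is one possible implementation under these assumptions (function names are illustrative). It only changes the third coordinate and rescales the first two so that the image stays on the sphere, and it is undefined at the two poles.

```python
import numpy as np

def f_star(r, kappa):
    """F_f^*(r) = 2 F_f(r) - 1 for f(s) = exp(kappa * s)."""
    cdf = (np.exp(kappa * r) - np.exp(-kappa)) / (np.exp(kappa) - np.exp(-kappa))
    return 2.0 * cdf - 1.0

def f_star_inv(x3, kappa):
    """Inverse of F_f^*, i.e. z3 = Q_f((x3 + 1) / 2)."""
    p = 0.5 * (x3 + 1.0)
    return np.log(np.exp(-kappa) + p * (np.exp(kappa) - np.exp(-kappa))) / kappa

def _change_latitude(pts, new_third):
    """Keep the longitude of each row and set its third coordinate to new_third."""
    horiz = pts[:, :2]
    horiz = horiz / np.linalg.norm(horiz, axis=1, keepdims=True)
    radius = np.sqrt(np.clip(1.0 - new_third ** 2, 0.0, None))
    return np.column_stack([radius[:, None] * horiz, new_third])

def vmf_F(z, kappa):
    """Distribution function (B.2) for theta_M = (0, 0, 1)."""
    return _change_latitude(z, f_star(z[:, 2], kappa))

def vmf_Q(x, kappa):
    """Quantile function (B.3) for theta_M = (0, 0, 1)."""
    return _change_latitude(x, f_star_inv(x[:, 2], kappa))

# Contour of level tau: image by Q of the parallel {x : x_3 = 1 - 2 tau}.
tau, kappa = 0.5, 10.0
phi = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
x3 = 1.0 - 2.0 * tau
ref = np.column_stack([np.sqrt(1.0 - x3 ** 2) * np.cos(phi),
                       np.sqrt(1.0 - x3 ** 2) * np.sin(phi),
                       np.full_like(phi, x3)])
contour = vmf_Q(ref, kappa)
```

All points of `contour` share the same third coordinate, so the contour is a parallel around $\theta_{M}$, which matches the coincidence with Mahalanobis contours noted above.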

Appendix C Proofs: Entropic maps

C.1. Proof of Proposition 4.3

Denoting by v=𝐮εc,ε𝑣superscriptsubscript𝐮𝜀𝑐𝜀v=\mathbf{u}_{\varepsilon}^{c,\varepsilon}italic_v = bold_u start_POSTSUBSCRIPT italic_ε end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_c , italic_ε end_POSTSUPERSCRIPT, rewriting (4.3) gives

𝐮ε(x)=εlogexp(v(z)c(x,z)ε)𝑑ν(z).subscript𝐮𝜀𝑥𝜀𝑣𝑧𝑐𝑥𝑧𝜀differential-d𝜈𝑧\mathbf{u}_{\varepsilon}(x)=-\varepsilon\log\int\exp\Big{(}\frac{v(z)-c(x,z)}{% \varepsilon}\Big{)}d\nu(z).bold_u start_POSTSUBSCRIPT italic_ε end_POSTSUBSCRIPT ( italic_x ) = - italic_ε roman_log ∫ roman_exp ( divide start_ARG italic_v ( italic_z ) - italic_c ( italic_x , italic_z ) end_ARG start_ARG italic_ε end_ARG ) italic_d italic_ν ( italic_z ) .

By the chain rule,

(C.1) xi𝐮ε(x)=εxiJ(x)J(x)forJ(x)=exp(v(z)c(x,z)ε)𝑑ν(z).formulae-sequencesubscriptsubscript𝑥𝑖subscript𝐮𝜀𝑥𝜀subscriptsubscript𝑥𝑖𝐽𝑥𝐽𝑥for𝐽𝑥𝑣𝑧𝑐𝑥𝑧𝜀differential-d𝜈𝑧\partial_{x_{i}}\mathbf{u}_{\varepsilon}(x)=-\varepsilon\frac{\partial_{x_{i}}% J(x)}{J(x)}\quad\mbox{for}\quad J(x)=\int\exp\Big{(}\frac{v(z)-c(x,z)}{% \varepsilon}\Big{)}d\nu(z).∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT bold_u start_POSTSUBSCRIPT italic_ε end_POSTSUBSCRIPT ( italic_x ) = - italic_ε divide start_ARG ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_J ( italic_x ) end_ARG start_ARG italic_J ( italic_x ) end_ARG for italic_J ( italic_x ) = ∫ roman_exp ( divide start_ARG italic_v ( italic_z ) - italic_c ( italic_x , italic_z ) end_ARG start_ARG italic_ε end_ARG ) italic_d italic_ν ( italic_z ) .

We now turn to the differentiation of J𝐽Jitalic_J. As shown in [64][Lemma 2.1],

infx𝕊2{c(x,z)𝐮ε(x)}v(z)c(x,z)𝑑μ𝕊2(x).subscriptinfimum𝑥superscript𝕊2𝑐𝑥𝑧subscript𝐮𝜀𝑥𝑣𝑧𝑐𝑥𝑧differential-dsubscript𝜇superscript𝕊2𝑥\inf_{x\in{\mathbb{S}^{2}}}\{c(x,z)-\mathbf{u}_{\varepsilon}(x)\}\leq v(z)\leq% \int c(x,z)d\mu_{\mathbb{S}^{2}}(x).roman_inf start_POSTSUBSCRIPT italic_x ∈ blackboard_S start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT { italic_c ( italic_x , italic_z ) - bold_u start_POSTSUBSCRIPT italic_ε end_POSTSUBSCRIPT ( italic_x ) } ≤ italic_v ( italic_z ) ≤ ∫ italic_c ( italic_x , italic_z ) italic_d italic_μ start_POSTSUBSCRIPT blackboard_S start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ( italic_x ) .

By boundedness of 𝕊2superscript𝕊2{\mathbb{S}^{2}}blackboard_S start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, v𝑣vitalic_v is bounded and so is the integrand in (C.1). As we deal with probability measures, this justifies using the differentiation under the integral sign, that is

xiJ(x)=xiexp(v(z)c(x,z)ε)dν(z),subscriptsubscript𝑥𝑖𝐽𝑥subscriptsubscript𝑥𝑖𝑣𝑧𝑐𝑥𝑧𝜀𝑑𝜈𝑧\partial_{x_{i}}J(x)=\int\partial_{x_{i}}\exp\Big{(}\frac{v(z)-c(x,z)}{% \varepsilon}\Big{)}d\nu(z),∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_J ( italic_x ) = ∫ ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_exp ( divide start_ARG italic_v ( italic_z ) - italic_c ( italic_x , italic_z ) end_ARG start_ARG italic_ε end_ARG ) italic_d italic_ν ( italic_z ) ,

which induces

(C.2) xiJ(x)=xic(x,z)εexp(v(z)c(x,z)ε)dν(z).subscriptsubscript𝑥𝑖𝐽𝑥subscriptsubscript𝑥𝑖𝑐𝑥𝑧𝜀𝑣𝑧𝑐𝑥𝑧𝜀𝑑𝜈𝑧\partial_{x_{i}}J(x)=\int-\frac{\partial_{x_{i}}c(x,z)}{\varepsilon}\exp\Big{(% }\frac{v(z)-c(x,z)}{\varepsilon}\Big{)}d\nu(z).∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_J ( italic_x ) = ∫ - divide start_ARG ∂ start_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_c ( italic_x , italic_z ) end_ARG start_ARG italic_ε end_ARG roman_exp ( divide start_ARG italic_v ( italic_z ) - italic_c ( italic_x , italic_z ) end_ARG start_ARG italic_ε end_ARG ) italic_d italic_ν ( italic_z ) .

Fix $z\in\mathbb{S}^2$, so

\[ \partial_{x_i}c(x,z) = d(x,z)\,\partial_{x_i}d(x,z) \quad\mbox{and}\quad \partial_{x_i}d(x,z) = \frac{-1}{\sqrt{1-\langle x,z\rangle^2}}\,z_i. \tag{C.3} \]

Combining (C.2) with (C.3),

\[ \partial_{x_i}J(x) = \int \frac{z_i}{\varepsilon}\,\frac{d(x,z)}{\sqrt{1-\langle x,z\rangle^2}}\exp\Big(\frac{v(z)-c(x,z)}{\varepsilon}\Big)\,d\nu(z). \tag{C.4} \]

Plugging (C.4) in (C.1) gives (4.14), where $g_\varepsilon$ defined in (4.12) shows up because, by properties of $\exp$,

\[ \exp\Big(\frac{v(z)-c(x,z)+\mathbf{u}_\varepsilon(x)}{\varepsilon}\Big) = \frac{\exp\Big(\frac{v(z)-c(x,z)}{\varepsilon}\Big)}{\int\exp\Big(\frac{v(y)-c(x,y)}{\varepsilon}\Big)\,d\nu(y)}. \tag{C.5} \]

Using also that,

\[ \exp\Big(\frac{\mathbf{u}_\varepsilon(x)-c(x,z)+\mathbf{u}_\varepsilon^{c,\varepsilon}(z)}{\varepsilon}\Big) = \frac{\exp\Big(\frac{\mathbf{u}_\varepsilon(x)-c(x,z)}{\varepsilon}\Big)}{\int\exp\Big(\frac{\mathbf{u}_\varepsilon(y)-c(z,y)}{\varepsilon}\Big)\,d\mu_{\mathbb{S}^2}(y)}, \tag{C.6} \]

the same arguments on $\mathbf{u}_\varepsilon^{c,\varepsilon}(z) = -\varepsilon\log\int\exp\Big(\frac{\mathbf{u}_\varepsilon(x)-c(x,z)}{\varepsilon}\Big)\,d\mu_{\mathbb{S}^2}(x)$ yield (4.13). $\square$
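As a numerical sanity check of the computation above, the following sketch compares the closed-form Euclidean gradient obtained from (C.3)-(C.5) with central finite differences. It is an illustration only, under assumptions that are not restated in this appendix: the cost is taken as $c(x,z)=d(x,z)^2/2$ with $d(x,z)=\arccos\langle x,z\rangle$ (consistent with (C.3)), the relation $\mathbf{u}_\varepsilon(x)=-\varepsilon\log J(x)$ is read off (C.5), and $\nu$ is replaced by an empirical measure on a few hypothetical points `zs` with arbitrary potential values `v`. All function names are illustrative.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def normalize(p):
    n = math.sqrt(dot(p, p))
    return [pi / n for pi in p]

def cost(x, z):
    # Assumed cost: c(x, z) = d(x, z)^2 / 2 with d(x, z) = arccos(<x, z>).
    return 0.5 * math.acos(dot(x, z)) ** 2

def grad_cost(x, z):
    # (C.3): d_{x_i} c = d * d_{x_i} d, with d_{x_i} d = -z_i / sqrt(1 - <x, z>^2).
    s = dot(x, z)
    coef = -math.acos(s) / math.sqrt(1.0 - s * s)
    return [coef * zi for zi in z]

def gibbs_weights(x, zs, v, eps):
    # Right-hand side of (C.5) for nu = (1/n) sum_k delta_{z_k}; the maximum
    # is subtracted before exponentiating for numerical stability.
    a = [(vk - cost(x, zk)) / eps for zk, vk in zip(zs, v)]
    m = max(a)
    e = [math.exp(ak - m) for ak in a]
    total = sum(e)
    return [ek / total for ek in e]

def u_eps(x, zs, v, eps):
    # u_eps(x) = -eps * log( (1/n) sum_k exp((v_k - c(x, z_k)) / eps) ).
    a = [(vk - cost(x, zk)) / eps for zk, vk in zip(zs, v)]
    m = max(a)
    return -eps * (m + math.log(sum(math.exp(ak - m) for ak in a) / len(a)))

def grad_u_eps(x, zs, v, eps):
    # Closed form: sum_k w_k * grad_x c(x, z_k), with w_k the Gibbs weights.
    w = gibbs_weights(x, zs, v, eps)
    g = [grad_cost(x, zk) for zk in zs]
    return [sum(wk * gk[i] for wk, gk in zip(w, g)) for i in range(3)]

def finite_diff_grad(f, x, h=1e-5):
    # Central finite differences in the ambient coordinates of R^3.
    out = []
    for i in range(3):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        out.append((f(xp) - f(xm)) / (2 * h))
    return out

# Hypothetical data: one evaluation point and five support points of nu.
x = normalize([1.0, 2.0, 2.0])
zs = [normalize(p) for p in ([1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [-1, 1, 1])]
v = [0.1, -0.2, 0.0, 0.3, -0.1]
eps = 0.5

exact = grad_u_eps(x, zs, v, eps)
approx = finite_diff_grad(lambda p: u_eps(p, zs, v, eps), x)
max_err = max(abs(a - b) for a, b in zip(exact, approx))
print(max_err)  # small: the two gradients agree up to discretisation error
```

The support points are chosen away from $\pm x$, where the factor $1/\sqrt{1-\langle x,z\rangle^2}$ in (C.3) is well defined.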

C.2. Proof of Corollary 4.1

Using the Euclidean derivatives obtained in Proposition 4.3,

\[ \nabla\mathbf{u}_\varepsilon(x) = \rho_x\int z\,g_\varepsilon(x,z)\,d\nu(z) \quad\mbox{and}\quad \nabla\mathbf{u}_\varepsilon^{c,\varepsilon}(z) = \rho_z\int x\,g_\varepsilon(x,z)\,d\mu_{\mathbb{S}^2}(x). \]

But this is equivalent to

\[ \nabla\mathbf{u}_\varepsilon(x) = \int -\frac{d(x,z)}{\sqrt{1-\langle x,z\rangle^2}}\,\rho_x(z)\exp\Big(\frac{\mathbf{u}_\varepsilon(x)-c(x,z)+\mathbf{u}_\varepsilon^{c,\varepsilon}(z)}{\varepsilon}\Big)\,d\nu(z), \]

and

\[ \nabla\mathbf{u}_\varepsilon^{c,\varepsilon}(z) = \int -\frac{d(x,z)}{\sqrt{1-\langle x,z\rangle^2}}\,\rho_z(x)\exp\Big(\frac{\mathbf{u}_\varepsilon(x)-c(x,z)+\mathbf{u}_\varepsilon^{c,\varepsilon}(z)}{\varepsilon}\Big)\,d\mu_{\mathbb{S}^2}(x). \]

Here, one recognizes the explicit expression of $\text{Log}_x = \text{Exp}_x^{-1}$, the inverse of the exponential map (see for instance [25]), which gives

\[ \nabla\mathbf{u}_\varepsilon(x) = -\int \text{Log}_x(z)\exp\Big(\frac{\mathbf{u}_\varepsilon(x)-c(x,z)+\mathbf{u}_\varepsilon^{c,\varepsilon}(z)}{\varepsilon}\Big)\,d\nu(z), \]

and

\[ \nabla\mathbf{u}_\varepsilon^{c,\varepsilon}(z) = -\int \text{Log}_z(x)\exp\Big(\frac{\mathbf{u}_\varepsilon(x)-c(x,z)+\mathbf{u}_\varepsilon^{c,\varepsilon}(z)}{\varepsilon}\Big)\,d\mu_{\mathbb{S}^2}(x). \]

$\square$
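The identification of the logarithm map in this proof can be illustrated numerically. The sketch below assumes that $\rho_x(z)=z-\langle x,z\rangle x$ is the orthogonal projection of $z$ onto the tangent plane at $x$ (this definition is not restated in this appendix), and uses the standard closed form of the exponential map on the unit sphere. It checks that $\text{Log}_x(z)$ is tangent at $x$, has norm $d(x,z)$, and is mapped back to $z$ by $\text{Exp}_x$. Function names are illustrative.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def normalize(p):
    n = math.sqrt(dot(p, p))
    return [pi / n for pi in p]

def proj_tangent(x, z):
    # rho_x(z): orthogonal projection of z onto the tangent plane at x
    # (assumed definition of rho_x).
    s = dot(x, z)
    return [zi - s * xi for zi, xi in zip(z, x)]

def log_map(x, z):
    # Log_x(z) = d(x, z) / sqrt(1 - <x, z>^2) * rho_x(z), for z different from x and -x.
    s = dot(x, z)
    coef = math.acos(s) / math.sqrt(1.0 - s * s)
    return [coef * p for p in proj_tangent(x, z)]

def exp_map(x, w):
    # Exp_x(w) = cos(|w|) x + sin(|w|) w / |w| for a tangent vector w at x.
    n = math.sqrt(dot(w, w))
    if n == 0.0:
        return list(x)
    return [math.cos(n) * xi + math.sin(n) * wi / n for xi, wi in zip(x, w)]

# Two arbitrary points of the unit sphere.
x = normalize([1.0, 2.0, 2.0])
z = normalize([-1.0, 1.0, 1.0])

w = log_map(x, z)
z_back = exp_map(x, w)
geodesic = math.acos(dot(x, z))

print(math.sqrt(dot(w, w)) - geodesic)              # close to 0: |Log_x(z)| = d(x, z)
print(dot(w, x))                                    # close to 0: Log_x(z) is tangent at x
print(max(abs(a - b) for a, b in zip(z, z_back)))   # close to 0: Exp_x(Log_x(z)) = z
```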

C.3. Proof of Proposition 4.4

The same computation can be found in [30][Lemma 3], except that, at the very end of our proof, one recognizes in the result the partial derivatives of $\mathbf{u}_\varepsilon$ given in (4.13). First of all, $g_\varepsilon$ is bounded, by [64][Lemma 2.1] and the compactness of $\mathbb{S}^2$. Thus, one can differentiate (4.13) under the integral sign, and

\[ \frac{\partial^2\mathbf{u}_\varepsilon}{\partial_{x_i}\partial_{x_j}}(x) = \int \partial_{x_j} z_i g_\varepsilon(x,z)\,d\nu(z). \tag{C.7} \]

In order to apply the classical rules of differentiation, note that $z_i g_\varepsilon(x,z) = (\partial_{x_i}c(x,z))\,G(x,z)$, for

\[ G(x,z) = \exp\Big(\frac{\mathbf{u}_\varepsilon(x)-c(x,z)+\mathbf{u}_\varepsilon^{c,\varepsilon}(z)}{\varepsilon}\Big). \]

Besides,

\[ \partial_{x_j}G(x,z) = \frac{1}{\varepsilon}G(x,z)\,\partial_{x_j}\big(\mathbf{u}_\varepsilon(x)-c(x,z)\big). \]

As a byproduct,

\begin{align}
\partial_{x_j} z_i g_\varepsilon(x,z) &= \frac{\partial^2 c(x,z)}{\partial_{x_i}\partial_{x_j}}G(x,z) + \partial_{x_i}c(x,z)\,\frac{1}{\varepsilon}G(x,z)\Big(\partial_{x_j}\mathbf{u}_\varepsilon(x)-\partial_{x_j}c(x,z)\Big), \tag{C.8}\\
&= G(x,z)\Big(\frac{\partial^2 c(x,z)}{\partial_{x_i}\partial_{x_j}} + \partial_{x_i}c(x,z)\,\frac{\partial_{x_j}\mathbf{u}_\varepsilon(x)-\partial_{x_j}c(x,z)}{\varepsilon}\Big). \tag{C.9}
\end{align}

Plugging (C.9) in (C.7) and using that $z_i g_\varepsilon(x,z) = (\partial_{x_i}c(x,z))\,G(x,z)$ when rearranging,

\begin{align*}
\frac{\partial^2\mathbf{u}_\varepsilon}{\partial_{x_i}\partial_{x_j}}(x) &= \int G(x,z)\Big(\frac{\partial^2 c(x,z)}{\partial_{x_i}\partial_{x_j}} - \frac{\partial_{x_i}c(x,z)\,\partial_{x_j}c(x,z)}{\varepsilon}\Big) + \frac{\partial_{x_j}\mathbf{u}_\varepsilon(x)}{\varepsilon}\,z_i g_\varepsilon(x,z)\,d\nu(z),\\
&= \int G(x,z)\Big(\frac{\partial^2 c(x,z)}{\partial_{x_i}\partial_{x_j}} - \frac{\partial_{x_i}c(x,z)\,\partial_{x_j}c(x,z)}{\varepsilon}\Big)\,d\nu(z) + \frac{1}{\varepsilon}\,\partial_{x_i}\mathbf{u}_\varepsilon(x)\,\partial_{x_j}\mathbf{u}_\varepsilon(x),
\end{align*}

where we also used the explicit derivatives of $\mathbf{u}_\varepsilon$ from (4.13). The Hessian of $\mathbf{u}_\varepsilon^{c,\varepsilon}$ follows by symmetry.

C.4. Proof of Proposition 4.1

From Proposition 4.4, $\mathbf{u}_\varepsilon$ is twice differentiable everywhere, which gives the continuity of $\mathbf{Q}_\varepsilon$ and of $\partial_{x_i}\mathbf{u}_\varepsilon$. In the expression of the second-order partial derivatives given in (4.4), the term $\frac{1}{\varepsilon}\partial_{x_i}\mathbf{u}_\varepsilon(x)\,\partial_{x_j}\mathbf{u}_\varepsilon(x)$ is thus continuous. The remaining term takes the form of a parameter-dependent integral, whose integrand is continuous and bounded. Thus, the result follows by a direct application of the theorem of continuity under the integral sign, and by using the property that the sequence of spherical harmonic coefficients belongs to $\ell_1$ for functions that are twice continuously differentiable, see [42][Theorem 2].

Appendix D Proofs: Directional MK depth

D.1. Proof of Proposition 5.1

Recall that Tukey's depth satisfies linear monotonicity relative to the deepest point [85]. As the origin is the deepest point for $U_d$, this reads, for any $t\in[0,1]$,

\[ D_{U_d}^{Tukey}(u) \leq D_{U_d}^{Tukey}(tu). \tag{D.1} \]

From Definition 5.1, $D_\nu(\mathbf{Q}(u)) = D_{U_d}^{Tukey}(\nabla\psi^*\circ\mathbf{Q}(u))$. By continuity of $\nu$, $\nabla\psi^*\circ\mathbf{Q}(u) = u$ a.e., see [32] or [82][Theorem 2.12 and Corollary 2.3]. The result then follows from (D.1).
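Inequality (D.1) can be observed empirically. The sketch below is a rough Monte Carlo illustration, not part of the proof: it assumes $d=2$ and samples $U_2$ as a uniform direction on the circle multiplied by an independent radius uniform on $[0,1]$ (the spherical uniform reference of [12]), and it approximates Tukey's halfspace depth by minimising over a finite grid of directions. The sample size, the seed, and the point `u` are arbitrary choices.

```python
import math
import random

def sample_spherical_uniform(n, rng):
    # Assumed reference law U_2: uniform direction times independent U[0, 1] radius.
    pts = []
    for _ in range(n):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        r = rng.random()
        pts.append((r * math.cos(theta), r * math.sin(theta)))
    return pts

def tukey_depth(p, pts, directions):
    # Empirical halfspace depth restricted to a grid of directions: the
    # smallest fraction of sample points in a closed halfspace whose
    # boundary passes through p.
    depth = 1.0
    for w in directions:
        thr = p[0] * w[0] + p[1] * w[1]
        count = sum(1 for q in pts if q[0] * w[0] + q[1] * w[1] >= thr)
        depth = min(depth, count / len(pts))
    return depth

rng = random.Random(0)
pts = sample_spherical_uniform(1000, rng)
directions = [(math.cos(2.0 * math.pi * k / 60), math.sin(2.0 * math.pi * k / 60))
              for k in range(60)]

u = (0.8, 0.0)
ts = [0.0, 0.25, 0.5, 0.75, 1.0]
depths = [tukey_depth((t * u[0], t * u[1]), pts, directions) for t in ts]
for t, d in zip(ts, depths):
    print(t, d)  # the depth of t*u is never below the depth of u (t = 1)
```

Moving from $u$ towards the origin along the segment $\{tu : t\in[0,1]\}$ does not decrease the estimated depth, which is the content of (D.1).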

D.2. Proof of Corollary 5.1

Let $X$ be a random vector associated with a spherically symmetric distribution, for which $\mathbb{E}(X)$ and the deepest point shall coincide. From [12], the MK distribution function of $X$ is known. By inverting it, we get its quantile function

\[ \mathbf{Q}(u) = \frac{u}{\|u\|}\,G^{-1}(\|u\|) + \mathbb{E}(X), \]

where $G$ is the univariate distribution function of the radial part $\|X-\mathbb{E}(X)\|$. Because $\|X\|\geq 0$ a.s. and $G^{-1}$ is increasing, $G^{-1}(t\|u\|)/G^{-1}(\|u\|)\in[0,1]$ and

\[ \mathbf{Q}(tu) = \frac{u}{\|u\|}\,G^{-1}(t\|u\|) + \mathbb{E}(X) = \frac{G^{-1}(t\|u\|)}{G^{-1}(\|u\|)}\Big(\mathbf{Q}(u)-\mathbb{E}(X)\Big) + \mathbb{E}(X). \]

Setting $\delta_t = G^{-1}(t\|u\|)/G^{-1}(\|u\|)$, this reads $\mathbf{Q}(tu) = \delta_t\mathbf{Q}(u) + (1-\delta_t)\mathbb{E}(X)$. Besides, $\delta_t$ takes all values between $0$ and $1$ for $t\in[0,1]$. Combined with Proposition 5.1, this yields

\[ \forall u\in\mathbb{B}(0,1),\ \forall\delta\in[0,1],\quad D_\nu(\mathbf{Q}(u)) \leq D_\nu\big(\delta\mathbf{Q}(u) + (1-\delta)\mathbb{E}(X)\big). \]

But any $x$ in the support of $\nu$ can be written as $\mathbf{Q}(u)$ for $u = \mathbf{F}(x)$, which gives (5.1).
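The convex-combination identity used in this proof can be checked on a concrete case. The sketch below takes, as an assumed example that is not in the original text, $X\sim\mathcal{N}(m, I_2)$ in dimension 2, whose radial part $\|X-m\|$ follows the Rayleigh law $G(r)=1-\exp(-r^2/2)$, so that $G^{-1}(p)=\sqrt{-2\log(1-p)}$. It verifies that $\mathbf{Q}(tu)=\delta_t\mathbf{Q}(u)+(1-\delta_t)\mathbb{E}(X)$ with $\delta_t\in[0,1]$. The mean `mean` and the point `u` are arbitrary.

```python
import math

def radial_quantile(p):
    # G^{-1} for the Rayleigh law G(r) = 1 - exp(-r^2 / 2), the radial part
    # of a standard Gaussian in dimension 2 (assumed example).
    return math.sqrt(-2.0 * math.log(1.0 - p))

def mk_quantile(u, mean):
    # Q(u) = u / |u| * G^{-1}(|u|) + E(X), with Q(0) = E(X).
    norm_u = math.hypot(u[0], u[1])
    if norm_u == 0.0:
        return tuple(mean)
    r = radial_quantile(norm_u)
    return (u[0] / norm_u * r + mean[0], u[1] / norm_u * r + mean[1])

mean = (1.0, -2.0)   # hypothetical E(X)
u = (0.3, 0.4)       # a point of the open unit ball, |u| = 0.5
norm_u = math.hypot(*u)
q_u = mk_quantile(u, mean)

rows = []
for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
    delta = radial_quantile(t * norm_u) / radial_quantile(norm_u)
    lhs = mk_quantile((t * u[0], t * u[1]), mean)
    rhs = tuple(delta * q + (1.0 - delta) * m for q, m in zip(q_u, mean))
    rows.append((t, delta, lhs, rhs))
    print(t, delta, lhs, rhs)  # lhs and rhs agree up to rounding
```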

D.3. Proof of Proposition 5.2

Fix $x\in\mathbb{S}^2$. From the decomposition (3.11), $s = \mathbf{S}_{\mathbf{F}(\theta_M)}(x)$ is the directional sign associated to $x$. For $t\in[-1,1]$, let $x_t\in\mathcal{M}_s^U$ be a parameterization of the reference sign curve associated to $s$, as in (5.2). One immediately notes that $\langle x_t,\mathbf{F}(\theta_M)\rangle = t$, and $x_t = x$ for $t = \langle x,\mathbf{F}(\theta_M)\rangle$. Besides, $D_\nu(\mathbf{Q}(x_t)) = 1 - d(x_t,\mathbf{F}(\theta_M))/\pi = 1-\arccos(t)/\pi$, so, as soon as $t\geq\langle x,\mathbf{F}(\theta_M)\rangle$,

\[ D_\nu(\mathbf{Q}(x_t)) \geq D_\nu(\mathbf{Q}(x)). \]

D.4. Proof of Corollary 5.2

For any $x\in\mathbb{S}^2$ and $t\in[\langle x,\mathbf{F}(\theta_M)\rangle, 1]$, consider a parametrization of the sign curve associated to $x$ as $x_t = t\,\mathbf{F}(\theta_M) + \sqrt{1-t^2}\,\mathbf{S}_{\mathbf{F}(\theta_M)}(x)$. For the explicit $\mathbf{F}$ for rotationally invariant distributions given in (B.1), $\mathbf{F}(\theta_M) = \theta_M$, thus $x_t = t\,\theta_M + \sqrt{1-t^2}\,\mathbf{S}_{\theta_M}(x)$, and sign curves are great circles. $\square$
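The parametrization $x_t = t\,\theta_M + \sqrt{1-t^2}\,\mathbf{S}_{\theta_M}(x)$ can be explored numerically. The sketch below assumes that the directional sign is the normalised projection of $x$ onto the tangent plane at $\theta_M$, namely $(x-\langle x,\theta_M\rangle\theta_M)/\|x-\langle x,\theta_M\rangle\theta_M\|$ (the decomposition (3.11) is not restated here, so this form is an assumption). It checks that $x_t$ stays on the unit sphere, that $\langle x_t,\theta_M\rangle=t$, that $x_t=x$ at $t=\langle x,\theta_M\rangle$, that the curve lies in the plane spanned by $\theta_M$ and $x$ (a great circle), and that the depth value $1-\arccos(t)/\pi$ from the proof of Proposition 5.2 does not decrease along it. The points `theta` and `x` are arbitrary.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def normalize(p):
    n = math.sqrt(dot(p, p))
    return [pi / n for pi in p]

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def directional_sign(theta, x):
    # Assumed form of the sign: normalised projection of x onto the tangent
    # plane at theta (requires x different from theta and -theta).
    s = dot(x, theta)
    return normalize([xi - s * ti for xi, ti in zip(x, theta)])

def sign_curve(theta, x, t):
    # x_t = t * theta + sqrt(1 - t^2) * S_theta(x)
    sgn = directional_sign(theta, x)
    c = math.sqrt(1.0 - t * t)
    return [t * ti + c * si for ti, si in zip(theta, sgn)]

def depth_on_curve(t):
    # Depth value 1 - arccos(t) / pi along the sign curve.
    return 1.0 - math.acos(t) / math.pi

theta = [0.0, 0.0, 1.0]            # plays the role of theta_M
x = normalize([1.0, 2.0, 2.0])
t0 = dot(x, theta)                 # here t0 = 2/3
normal = cross(theta, x)           # normal to the plane spanned by theta and x

ts = [t0, 0.75, 0.85, 0.95, 1.0]
curve = [sign_curve(theta, x, t) for t in ts]
depths = [depth_on_curve(t) for t in ts]
for t, xt, d in zip(ts, curve, depths):
    print(t, xt, d)
```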

References

  • [1] C. Agostinelli and M. Romanazzi, Nonparametric analysis of directional data based on data depth, Environmental and ecological statistics, 20 (2013), pp. 253–270.
  • [2] J. Ameijeiras-Alonso and R. M. Crujeiras, Directional statistics for wildfires, in Applied directional statistics, Chapman and Hall/CRC, 2018, pp. 203–226.
  • [3] V. Barnett, The ordering of multivariate data, Journal of the Royal Statistical Society: Series A (General), 139 (1976), pp. 318–344.
  • [4] J. Beirlant, S. Buitendag, E. del Barrio, M. Hallin, and F. Kamper, Center-outward quantiles and the measurement of multivariate risk, Insurance: Mathematics and Economics, 95 (2020), pp. 79–100.
  • [5] J.-D. Benamou, W. L. Ijzerman, and G. Rukhaia, An entropic optimal transport numerical approach to the reflector problem, Methods and Applications of Analysis, (2020).
  • [6] B. Bercu and J. Bigot, Asymptotic distribution and convergence rates of stochastic algorithms for entropic optimal transportation between probability measures, The Annals of Statistics, 49 (2021), pp. 968–987.
  • [7] B. Bercu, J. Bigot, and G. Thurin, Stochastic optimal transport in Banach spaces for regularized estimation of multivariate quantiles. arXiv, 2023.
  • [8] E. Bernton, P. Ghosal, and M. Nutz, Entropic optimal transport: Geometry and large deviations, Duke Mathematical Journal, 171 (2022), pp. 3363–3400.
  • [9] F. Bigi, G. Fraux, N. J. Browning, and M. Ceriotti, Fast evaluation of spherical harmonics with sphericart, The Journal of Chemical Physics, 159 (2023).
  • [10] G. Carlier, V. Chernozhukov, G. De Bie, and A. Galichon, Vector quantile regression and optimal transport, from theory to numerics, Empirical Economics, 62 (2022), pp. 35–62.
  • [11] P. Chaudhuri, On a geometric notion of quantiles for multivariate data, Journal of the American statistical association, 91 (1996), pp. 862–872.
  • [12] V. Chernozhukov, A. Galichon, M. Hallin, and M. Henry, Monge–Kantorovich depth, quantiles, ranks and signs, The Annals of Statistics, 45 (2017), pp. 223–256.
  • [13] G. S. Chirikjian and A. B. Kyatkin, Engineering applications of noncommutative harmonic analysis: with emphasis on rotation and motion groups, CRC press, 2000.
  • [14] S. Cohen, B. Amos, and Y. Lipman, Riemannian convex potential maps, in International Conference on Machine Learning, PMLR, 2021, pp. 2028–2038.
  • [15] D. Cordero-Erausquin, R. J. McCann, and M. Schmuckenschläger, A Riemannian interpolation inequality à la Borell, Brascamp and Lieb, Inventiones mathematicae, 146 (2001), pp. 219–257.
  • [16] M. Cuturi, Sinkhorn distances: Lightspeed computation of optimal transport, Advances in Neural Information Processing Systems, 26 (2013).
  • [17] M. Cuturi, M. Klein, and P. Ablin, Monge, Bregman and Occam: Interpretable optimal transport in high-dimensions with feature-sparse maps, vol. 202 of Proceedings of Machine Learning Research, 2023, pp. 6671–6682.
  • [18] M. Cuturi and G. Peyré, Computational optimal transport, Foundations and Trends® in Machine Learning, 11 (2019), pp. 355–607.
  • [19] E. del Barrio, A. González Sanz, and M. Hallin, Nonparametric multiple-output center-outward quantile regression, Journal of the American Statistical Association, (2024), pp. 1–43.
  • [20] P. Delanoë and G. Loeper, Gradient estimates for potentials of invertible gradient–mappings on the sphere, Calculus of Variations and Partial Differential Equations, 26 (2006), pp. 297–311.
  • [21] H. Demni, A. Messaoud, and G. C. Porzio, The cosine depth distribution classifier for directional data, Applications in Statistical Computing: From Music Data Analysis to Industrial Quality Improvement, (2019), pp. 49–60.
  • [22] H. Demni and G. C. Porzio, Directional DD-classifiers under non-rotational symmetry, in 2021 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), 2021, pp. 1–6.
  • [23] J.-L. Dortet-Bernadet and N. Wicker, Model-based clustering on the unit sphere with an illustration using gene expression profiles, Biostatistics, 9 (2008), pp. 66–80.
  • [24] Y. Fan, M. Henry, B. Pass, and J. A. Rivero, Lorenz map, inequality ordering and curves based on multidimensional rearrangements. arXiv, 2022.
  • [25] O. Ferreira, A. Iusem, and S. Németh, Concepts and techniques of optimization on the sphere, Top, 22 (2014), pp. 1148–1170.
  • [26] J. Feydy, T. Séjourné, F.-X. Vialard, S.-i. Amari, A. Trouvé, and G. Peyré, Interpolating between optimal transport and MMD using Sinkhorn divergences, in The 22nd International Conference on Artificial Intelligence and Statistics, PMLR, 2019, pp. 2681–2690.
  • [27] A. Figalli, Y.-H. Kim, and R. J. McCann, When is multidimensional screening a convex program?, Journal of Economic Theory, 146 (2011), pp. 454–478.
  • [28] M. Frungillo, Discrete approximation of optimal transport on compact spaces, arXiv preprint arXiv:2401.14538, (2024).
  • [29] E. García-Portugués, D. Paindaveine, and T. Verdebout, On optimal tests for rotational symmetry against new classes of hyperspherical distributions, Journal of the American Statistical Association, 115 (2020), pp. 1873–1887.
  • [30] A. Genevay, Entropy-regularized Optimal Transport for Machine Learning, PhD thesis, Université Paris sciences et lettres, 2019.
  • [31] A. Genevay, M. Cuturi, G. Peyré, and F. Bach, Stochastic Optimization for Large-scale Optimal Transport, Advances in neural information processing systems, 29 (2016).
  • [32] P. Ghosal and B. Sen, Multivariate ranks and quantiles using optimal transport: Consistency, rates, and nonparametric testing, The Annals of Statistics, 50 (2022), pp. 1012–1037.
  • [33] Z. Goldfeld, K. Kato, G. Rioux, and R. Sadhu, Limit theorems for entropic optimal transport maps and Sinkhorn divergence, Electronic Journal of Statistics, 18 (2024), pp. 980–1041.
  • [34] M. Hallin, From Mahalanobis to Bregman via Monge and Kantorovich: Towards a “general generalized distance”, Sankhya B, 80 (2018), pp. 135–146.
  • [35] M. Hallin, Measure Transportation and Statistical Decision Theory, Annual Review of Statistics and Its Application, 9 (2022), pp. 401–424.
  • [36] M. Hallin, E. del Barrio, J. Cuesta-Albertos, and C. Matrán, Distribution and quantile functions, ranks and signs in dimension d: A measure transportation approach, The Annals of Statistics, 49 (2021), pp. 1139–1165.
  • [37] M. Hallin, H. Liu, and T. Verdebout, Nonparametric measure-transportation-based methods for directional data, Journal of the Royal Statistical Society Series B: Statistical Methodology, (2024).
  • [38] M. Hallin and G. Mordant, Center-outward multiple-output Lorenz curves and Gini indices: a measure transportation approach, Working Papers ECARES, ULB – Université Libre de Bruxelles, 2022.
  • [39] M. Hallin, D. Vecchia, and H. Liu, Rank-based testing for semiparametric VAR models: A measure transportation approach, Bernoulli, 29 (2023).
  • [40] B. F. Hamfeldt and A. G. Turnquist, A convergence framework for optimal transport on the sphere, Numerische Mathematik, 151 (2022), pp. 627–657.
  • [41] K. Hauch and C. Redenbach, Quantiles and depth for directional data from elliptically symmetric distributions, arXiv preprint arXiv:2210.06098, (2022).
  • [42] H. Kalf, On the expansion of a function in terms of spherical harmonics in arbitrary dimensions, Bulletin of the Belgian Mathematical Society-Simon Stevin, 2 (1995), pp. 361–380.
  • [43] O. D. Kellogg, Foundations of potential theory, vol. 31, Springer Science & Business Media, 2012.
  • [44] D. Konen and D. Paindaveine, Spatial quantiles on the hypersphere, The Annals of Statistics, 51 (2023), pp. 2221–2245.
  • [45] S. Kunis and D. Potts, Fast spherical Fourier algorithms, Journal of Computational and Applied Mathematics, 161 (2003), pp. 75–98.
  • [46] C. Ley, C. Sabbah, and T. Verdebout, A new concept of quantiles for directional data and the angular Mahalanobis depth, Electronic Journal of Statistics [electronic only], 8 (2014).
  • [47] C. Ley and T. Verdebout, Modern directional statistics, CRC Press, 2017.
  • [48] C. Ley and T. Verdebout, Applied directional statistics: modern methods and case studies, CRC Press, 2018.
  • [49] R. Y. Liu, On a notion of data depth based on random simplices, The Annals of Statistics, (1990), pp. 405–414.
  • [50] R. Y. Liu, J. M. Parelius, and K. Singh, Multivariate analysis by data depth: descriptive statistics, graphics and inference (with discussion and a rejoinder by Liu and Singh), The Annals of Statistics, 27 (1999), pp. 783–858.
  • [51] R. Y. Liu and K. Singh, Ordering directional data: concepts of data depth on circles and spheres, The Annals of Statistics, 20 (1992), pp. 1468–1484.
  • [52] G. Loeper, Regularity of optimal maps on the sphere: The quadratic cost and the reflector antenna, Archive for rational mechanics and analysis, 199 (2011), pp. 269–289.
  • [53] G. Loeper and C. Villani, Regularity of optimal transport in curved geometry: The nonfocal case, Duke Mathematical Journal, 151 (2010), pp. 431–485.
  • [54] K. V. Mardia and P. E. Jupp, Directional statistics, vol. 2, Wiley Online Library, 2000.
  • [55] D. Marinucci, D. Pietrobon, A. Balbi, P. Baldi, P. Cabella, G. Kerkyacharian, P. Natoli, D. Picard, and N. Vittorio, Spherical needlets for cosmic microwave background data analysis, Monthly Notices of the Royal Astronomical Society, 383 (2008), pp. 539–545.
  • [56] S. B. Masud, M. Werenski, J. M. Murphy, and S. Aeron, Multivariate soft rank via entropic optimal transport: sample efficiency and generative modeling, Journal of Machine Learning Research, 24 (2023), pp. 1–65.
  • [57] R. J. McCann, Polar factorization of maps on Riemannian manifolds, Geometric & Functional Analysis GAFA, 11 (2001), pp. 589–608.
  • [58] V. Michel, Lectures on constructive approximation: Fourier, spline, and wavelet methods on the real line, the sphere, and the ball, Springer Science & Business Media, 2012.
  • [59] N. Miolane, N. Guigui, A. L. Brigant, J. Mathe, B. Hou, Y. Thanwerdas, S. Heyder, O. Peltre, N. Koep, H. Zaatiti, H. Hajri, Y. Cabanes, T. Gerald, P. Chauchat, C. Shewmake, D. Brooks, B. Kainz, C. Donnat, S. Holmes, and X. Pennec, Geomstats: A Python package for Riemannian geometry in machine learning, Journal of Machine Learning Research, 21 (2020), pp. 1–9.
  • [60] K. Mosler, Depth statistics, Robustness and complex data structures: Festschrift in Honour of Ursula Gather, (2013), pp. 17–34.
  • [61] S. Nagy, H. Demni, D. Buttarazzi, and G. C. Porzio, Theory of angular depth for classification of directional data, Advances in Data Analysis and Classification, (2023).
  • [62] S. Nagy and P. Laketa, Theoretical properties of angular halfspace depth, arXiv preprint arXiv:2402.08285, (2024).
  • [63] Z. Niu and B. B. Bhattacharya, Distribution-free joint independence testing and robust independent component analysis using optimal transport. arXiv, 2022.
  • [64] M. Nutz and J. Wiesel, Entropic optimal transport: convergence of potentials, Probability Theory and Related Fields, 184 (2022), pp. 401–424.
  • [65] G. Pandolfo and A. D’ambrosio, Clustering directional data through depth functions, Computational Statistics, 38 (2023), pp. 1487–1506.
  • [66] G. Pandolfo, D. Paindaveine, and G. C. Porzio, Distance-based depths for directional data, Canadian Journal of Statistics, 46 (2018), pp. 593–609.
  • [67] M. Pegoraro, S. Vedula, A. A. Rosenberg, I. Tallini, E. Rodolà, and A. Bronstein, Vector quantile regression on manifolds, in ICML Workshop on New Frontiers in Learning, Control, and Dynamical Systems, 2023.
  • [68] A. Pewsey and E. García-Portugués, Recent advances in directional statistics, Test, 30 (2021), pp. 1–58.
  • [69] A.-A. Pooladian and J. Niles-Weed, Entropic estimation of optimal transport maps, 2021.
  • [70] P. Rigollet and A. J. Stromme, On the sample complexity of entropic optimal transport. arXiv, 2022.
  • [71] P. J. Rousseeuw and A. Struyf, Characterizing angular symmetry and regression symmetry, Journal of Statistical Planning and Inference, 122 (2004), pp. 161–173.
  • [72] V. Seguy, B. B. Damodaran, R. Flamary, N. Courty, A. Rolet, and M. Blondel, Large-Scale Optimal Transport and Mapping Estimation, in ICLR 2018 - International Conference on Learning Representations, 2018, pp. 1–15.
  • [73] T. Sei, Gradient modeling for multivariate quantitative data, Annals of the Institute of Statistical Mathematics, 63 (2011), pp. 675–688.
  • [74] T. Sei, A Jacobian inequality for gradient maps on the sphere and its application to directional statistics, Communications in Statistics-Theory and Methods, 42 (2013), pp. 2525–2542.
  • [75] R. Serfling, Depth functions in nonparametric multivariate inference, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 72 (2006), p. 1.
  • [76] H. Shi, M. Drton, M. Hallin, and F. Han, Center-outward sign- and rank-based quadrant, Spearman, and Kendall tests for multivariate independence, Working Papers ECARES, ULB – Université Libre de Bruxelles, 2021.
  • [77] H. Shi, M. Drton, and F. Han, Distribution-free consistent independence tests via center-outward ranks and signs, Journal of the American Statistical Association, 117 (2022), pp. 395–410.
  • [78] C. G. Small, Measures of centrality for multivariate and directional distributions, Canadian Journal of Statistics, 15 (1987), pp. 31–39.
  • [79] S. Sommer, T. Fletcher, and X. Pennec, Introduction to differential and Riemannian geometry, in Riemannian Geometric Statistics in Medical Image Analysis, Elsevier, 2020, pp. 3–37.
  • [80] A. Stromme, Sampling from a Schrödinger bridge, in International Conference on Artificial Intelligence and Statistics, PMLR, 2023, pp. 4058–4067.
  • [81] J. W. Tukey, Mathematics and the picturing of data, Proceedings of the International Congress of Mathematicians (Vancouver, B. C., 1974), 2 (1975), pp. 523–531.
  • [82] C. Villani, Topics in optimal transportation, vol. 58 of Graduate Studies in Mathematics, American Mathematical Society, 2003.
  • [83] G. T. von Nessi, On the regularity of optimal transportation potentials on round spheres, Acta applicandae mathematicae, 123 (2013), pp. 239–259.
  • [84] M. A. Wieczorek and M. Meschede, SHTools: Tools for working with spherical harmonics, Geochemistry, Geophysics, Geosystems, 19 (2018), pp. 2574–2592.
  • [85] Y. Zuo and R. Serfling, General notions of statistical depth function, The Annals of Statistics, (2000), pp. 461–482.