MSC Classification: 58D15, 68T99. Keywords: Manifold Learning, Embedding Spaces, Discretized Gradient Flow

Discretized Gradient Flow for Manifold Learning

Dara Gold, Jenner & Block LLP (contractor), and Steven Rosenberg, Department of Mathematics and Statistics,
Boston University,
Boston, MA 02215, USA
Abstract.

Gradient descent, or negative gradient flow, is a standard technique in optimization to find minima of functions. Many implementations of gradient descent rely on discretized versions, i.e., moving in the gradient direction for a set step size, recomputing the gradient, and continuing. In this paper, we present an approach to manifold learning where gradient descent takes place in the infinite dimensional space $\mathcal{E}={\rm Emb}(M,\mathbb{R}^{N})$ of smooth embeddings $\phi$ of a manifold $M$ into $\mathbb{R}^{N}$. Implementing a discretized version of gradient descent for $P:\mathcal{E}\longrightarrow\mathbb{R}$, a penalty function that scores an embedding $\phi\in\mathcal{E}$, requires estimating how far we can move in a fixed direction (the direction of one gradient step) before leaving the space of smooth embeddings. Our main result is to give an explicit lower bound for this step length in terms of the Riemannian geometry of $\phi(M)$. In particular, we consider the case when the gradient of $P$ is pointwise normal to the embedded manifold $\phi(M)$. We prove this case arises when $P$ is invariant under diffeomorphisms of $M$, a natural condition in manifold learning.

1. Introduction

A common approach in data analysis and machine learning is manifold learning, i.e., determining how to approximate a finite set of points $\{y_{i}\}_{i=1}^{L}$ in Euclidean space $\mathbb{R}^{N}$ by a $k$-dimensional embedded, compact manifold $M$ for some $k\ll N$ [6, 12, 13, 23, 29, 36]. (The definition of embedding is at the end of the Introduction.) While classic approaches to non-linear manifold learning include Isometric Mapping (IsoMap), Locally Linear Embedding (LLE), and Laplacian and Hessian eigenmaps, there is a growing body of work that uses gradient descent of functionals to find manifold representations of high dimensional data. This mathematical setup involves the space of smooth embeddings $\mathcal{E}={\rm Emb}(M,\mathbb{R}^{N})$, considered as an open subset of the infinite dimensional vector space of all maps from $M$ to $\mathbb{R}^{N}$, with the Banach space topology coming from a high Sobolev norm or the $C^{\infty}$ Fréchet space topology. We also have a $C^{1}$ penalty function $P:\mathcal{E}\rightarrow\mathbb{R}$, which typically contains a data fitting term and a regularization term, as explained below. (In keeping with the literature, we assume that $k$ and the diffeomorphism type of $M$ are given.)
In theory, finding a global minimum of $P$ via the negative gradient flow of $P$ on $\mathcal{E}$ gives an optimal embedding, or one that “best fits” the training set $\{y_{i}\}$. The main result of this paper (Theorem 5.7) gives precise bounds on the practical implementation issue of determining how far one can flow in a fixed gradient direction and still have an embedded manifold.

To avoid overfitting, i.e., choosing a $\phi$ such that $\phi(M)$ fits the $\{y_{i}\}$ very closely but performs poorly on new data points, a penalty function $P$ can penalize $\phi(M)$ both for being too far from $\{y_{i}\}$ and for “twisting too much” to fit the data. Thus a typical penalty function $P=P_{1}+P_{2}:\mathcal{E}\longrightarrow\mathbb{R}$ contains two terms: (i) a data fitting term $P_{1}(\phi)=\sum_{i=1}^{r}d^{2}(\phi(M),y_{i})$, where $d(\phi(M),y_{i})$ is the Euclidean distance from $y_{i}$ to the closest point in $\phi(M)$; (ii) a regularization term $P_{2}$ designed to prevent overfitting, e.g., $P_{2}(\phi)=\|\phi\|_{s}$, the $s$-Sobolev norm of $\phi$.
(For overviews of this standard approach, see [5], [39].) Gradient descent, i.e., moving in the direction of $-\nabla P$ in $\mathcal{E}$, can find a local or global minimum of $P$, or an optimal manifold to fit $\{y_{i}\}$.
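As a rough illustration of a two-term penalty of this kind, the following sketch evaluates a discretized $P_{1}+P_{2}$ for round circles in $\mathbb{R}^{2}$. Everything here is an assumption made for illustration: the choice $M=S^{1}$, the sampled approximation of the nearest-point distance, the finite-difference approximation of the $s=1$ Sobolev norm, the function names, and the `weight` parameter (the text uses $P=P_{1}+P_{2}$ with no weight).

```python
import numpy as np

def sample_circle(radius, n=400):
    """Sample a round circle of the given radius at n equally spaced parameters."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.stack([radius * np.cos(theta), radius * np.sin(theta)], axis=1)

def data_term(points, data):
    """Approximate P1: sum over y_i of the squared distance to the nearest sample."""
    diffs = data[:, None, :] - points[None, :, :]   # shape (r, n, N)
    sq_dists = np.sum(diffs ** 2, axis=2)           # shape (r, n)
    return float(np.sum(np.min(sq_dists, axis=1)))

def sobolev1_term(points):
    """Approximate the s = 1 Sobolev norm over the parameter circle [0, 2*pi)."""
    n = points.shape[0]
    h = 2.0 * np.pi / n
    deriv = (np.roll(points, -1, axis=0) - points) / h   # periodic forward differences
    integrand = np.sum(points ** 2, axis=1) + np.sum(deriv ** 2, axis=1)
    return float(np.sqrt(h * np.sum(integrand)))

def penalty(points, data, weight=1.0):
    """P = P1 + weight * P2; weight is a hypothetical tuning parameter."""
    return data_term(points, data) + weight * sobolev1_term(points)

# Four data points lying on the circle of radius 2.
data = np.array([[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0], [0.0, -2.0]])
p_far = penalty(sample_circle(1.0), data)    # circle misses the data by distance 1
p_near = penalty(sample_circle(2.0), data)   # circle passes through the data
```

In this toy setup the circle through the data has the smaller total penalty, although its regularization term is larger; this is the trade-off between the two terms that the paragraph above describes.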

While there are theoretical challenges with this setup, we focus on an implementation problem in this paper. In theory, to find a negative gradient flow line on $\mathcal{E}$, we need to know the gradient of $P$ at each point of $\mathcal{E}$. This is generally intractable for computer calculations. Instead, the gradient flow is often discretized: we move in the negative gradient direction from an initial point $\phi_{0}$ for a fixed step size to a new point $\phi_{1}$, stop and recompute the gradient at $\phi_{1}$, then iterate until the gradient is smaller than a specified amount. Since a gradient vector in the tangent space of $\mathcal{E}$ corresponds to a vector field along $\phi(M)$, we need to estimate a lower bound $t^{*}=t^{*}(\phi)$ for how far we can move from a fixed embedding $\phi$ in the negative gradient direction $-\nabla P_{\phi}$ and still remain in the space of embeddings. In summary, we avoid the usual problem that forward geometric flows tend to develop singularities by first discretizing the flow, and then estimating how big a gradient step avoids singularities.
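The discretization loop just described can be sketched in finite dimensions as follows. This is only a stand-in under stated assumptions: the function name, the quadratic toy penalty, the fixed step `0.1`, and the tolerance are all hypothetical, and in the setting of this paper the step would additionally have to respect the bound $t^{*}(\phi)$ so that each iterate remains an embedding.

```python
import numpy as np

def discretized_gradient_descent(grad, x0, step, tol=1e-8, max_iter=10_000):
    """Move against the gradient by a fixed step, recompute, and repeat
    until the gradient norm drops below tol (or max_iter is reached)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            return x, k
        x = x - step * g
    return x, max_iter

# Toy penalty P(x) = |x - c|^2 on R^2, with gradient 2 (x - c).
c = np.array([1.0, -2.0])

def grad_P(x):
    return 2.0 * (x - c)

x_min, n_steps = discretized_gradient_descent(grad_P, x0=[0.0, 0.0], step=0.1)
```

For this quadratic example each step contracts the distance to the minimizer by the factor $1-2\cdot 0.1=0.8$, so the loop stops well before `max_iter`.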

This practical issue is the main focus of this paper. In the main result, Theorem 5.7, we provide such a lower bound $t^{*}$, which in effect measures how well discretized flow can approximate the smooth flow. Here $t^{*}$ depends on the local and global extrinsic geometry of $\phi(M)$.

We emphasize that our approach to manifold learning directly tackles the infinite dimensional nature of this optimization problem via gradient flow and without making any simplifying choices that reduce the problem to finite dimensions. Typical choices in the literature are parametric methods, which fix a finite dimensional parameter space of embeddings, and RKHS methods, which reduce the optimization to a finite dimensional problem via the Representer Theorem, but only after making a choice of kernel function. In contrast, our approach only assumes that $M$ is compact, possibly with boundary, and so must contend with infinite dimensional analytic issues. Since we are given a finite set of training data, the compactness assumption is reasonable.

We briefly discuss the issues with working directly with the smooth gradient flow on $\mathcal{E}$. It may be difficult to prove that $P$ is differentiable for typical data terms which measure the minimum distance from a data point to the embedded manifold [4]. Even if $P$ is differentiable, in this infinite dimensional case it is not clear that a gradient flow line $\gamma(t)$ stays in $\mathcal{E}$ or converges as $t\longrightarrow\infty$ to a critical point of $P$, as $\mathcal{E}$ is an open dense set in the space of smooth maps from $M$ to $\mathbb{R}^{N}$. Even if we can prove convergence, since neither $P$ nor $\mathcal{E}$ is in general convex, a critical point need not be a global minimum, and a second derivative test for local minima may be difficult to develop and implement. Perhaps most fundamentally, even the short time existence for the gradient flow may be difficult to establish, particularly if we use the most natural $C^{\infty}$ topology on $\mathcal{E}$. These problems are well known in differential geometry, e.g., in the study of minimal submanifolds. In contrast, discretized gradient flow is both a tool for theoretical results on gradient flow [1, Ch. 11] and for computer implementations based on discretized, usually linearized, versions of gradient flow [14], although there may again be convergence issues [11].

The paper is organized as follows. In §2, we give a short overview of manifold learning with references to the literature. §3 gives an outline of the proof of Theorem 5.7. In §4, we argue that the entire penalty term should be invariant under the diffeomorphism group of $M$, just like the data fitting term (i). In particular, regularization terms built from geometric quantities like the volume or total mean curvature of $M$ have this invariance, while more familiar regularization terms like a Sobolev norm of the embedding do not. We prove in Theorem 4.2 that for a diffeomorphism invariant penalty function $P$, the gradient vector field $\nabla P_{\phi}$ is guaranteed to be pointwise normal to $\phi(M)$. §5 gives the proof of Theorem 5.7. §6 is a discussion of potential extensions of this work. Appendix A contains a proof of a quantitative implicit function theorem used in §5.

We recall the technical definition of an embedding of a manifold $M$ into a manifold $W$ ($W=\mathbb{R}^{N}$ for us).

Definition 1.1.

A smooth map $f:M\longrightarrow W$ between smooth manifolds is an immersion if the differential $df_{x}:T_{x}M\longrightarrow T_{f(x)}W$ is injective for all $x\in M$. An immersion is an embedding if $f$ is a homeomorphism from $M$ to $f(M)$ in the induced topology, i.e., a set $V\subset f(M)$ is open iff $V=U\cap f(M)$ for an open set $U\subset W$.

Since $M$ is compact in this paper, the unwieldy topological condition for an embedding simplifies.

Proposition 1.1.

[26, Prop. 4.22] If $M$ is compact, an injective smooth immersion $f:M\longrightarrow W$ is an embedding.
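To make Definition 1.1 concrete, the following sketch checks numerically that the figure-eight curve $\theta\mapsto(\sin\theta,\sin\theta\cos\theta)$ on the compact manifold $S^{1}$ is an immersion (its velocity never vanishes) but is not injective, so it is not an embedding. The curve and the function names are illustrative choices, not taken from the paper; the point is only that injectivity has to be checked separately from the immersion condition.

```python
import numpy as np

theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)

def figure_eight(t):
    """The curve t -> (sin t, sin t cos t) in R^2."""
    return np.stack([np.sin(t), np.sin(t) * np.cos(t)], axis=-1)

def figure_eight_velocity(t):
    """Exact derivative: d/dt (sin t, sin t cos t) = (cos t, cos 2t)."""
    return np.stack([np.cos(t), np.cos(2.0 * t)], axis=-1)

# Immersion check: the speed stays bounded away from zero on the sample grid.
speed = np.linalg.norm(figure_eight_velocity(theta), axis=1)
min_speed = float(speed.min())

# Injectivity check: the parameters 0 and pi are sent to the same point.
p_zero = figure_eight(np.array(0.0))
p_pi = figure_eight(np.array(np.pi))
is_double_point = bool(np.allclose(p_zero, p_pi))
```

Here the squared speed is $\cos^{2}\theta+\cos^{2}2\theta\geq 7/16$, so the differential is injective everywhere, while the origin is hit twice.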

2. Related work

Manifold learning is an approach to dimensionality reduction, the attempt to replace high dimensional data in $\mathbb{R}^{N}$, $N\gg 0$, by a low dimensional subset. Standard techniques in manifold learning, such as Locally Linear Embedding (LLE), IsoMap [40], Laplacian Eigenmaps [5], and Hessian Eigenmaps [13], involve algorithms that reduce to (often nontrivial) minimization problems in finite dimensions. In theory, these minimization problems can be solved by Lagrange multipliers, so gradient descent is not a built-in feature of these approaches. (We note that our discretization method is somewhat the reverse of the successful manifold approximation approach of Laplacian eigenmaps, where a discrete set of data in $\mathbb{R}^{N}$ that apparently lies close to a submanifold is parametrized by a subset of $\mathbb{R}^{k}$ through eigenvectors of a graph Laplacian; this parametrization is our $\phi^{-1}$.)

In contrast, our approach is inherently infinite dimensional and relies on gradient flow, as explained in the Introduction. The use of gradient flow for functionals on infinite dimensional manifolds of maps has a large literature in machine learning, where this comes under the general heading of nonparametric methods. (In the parametric approach, one restricts attention to a finite dimensional submanifold depending on a finite dimensional family of parameters.) Osher and Sethian introduced the Level Set Method [35], which has been applied to machine learning by using gradient descent on energy functionals (which act on the space of level set functions) to find optimal data-classification boundaries. Viewing the decision boundary this way avoids typical problems that arise with cusps and discontinuities in a flow whose speed is curvature dependent. This work has been extended in many directions, including computer vision and image analysis, fluid mechanics, and classification problems [33, 38, 41, 44]. In supervised learning, [3] finds optimal statistical labeling functions by using gradient descent of penalty functionals that include both a data term $P_{1}$ as above and a geometric regularization term $P_{2}$. (It should be noted that this paper has to resort to parametric methods to implement the discretized gradient flow algorithm.) There are intriguing connections between regularization methods and classical physical equations in Lin et al. [27].

Although applied here to manifold learning, the appearance of gradient flow in infinite dimensions of course has its roots in differential geometry. In minimal submanifold theory, the penalty function is the purely geometric volume of the embedded manifold, and the gradient flow is the mean curvature flow. A sampling of results is in [21, 24, 25, 37, 43]. Similar to our approach, Mayer [30] uses a discretized approximation to the gradient flow, which more closely mimics implementation processes. It is worth noting that historically, the modern study of gradient flow in differential geometry was initiated by Morse [32] in the 1930s on the infinite dimensional space of paths on a Riemannian manifold, which was then adapted by Milnor [31] to develop Morse theory on finite dimensional manifolds. In turn, Morse theory has undergone widespread development through Floer theory and its many variants in the past 25 years (see e.g., [2]).

As described in the Introduction, our penalty functions evaluate manifold embeddings with diffeomorphism invariant data and geometric regularization terms. The use of such terms is a developing area at the interface of differential geometry and machine learning. [41] uses the surface area of a decision boundary as the regularization term $P_{2}(\phi)$, while [3] uses the area of the manifold itself for $P_{2}(\phi)$, and [7] uses a discrete version of the total mean curvature of a surface with applications to tomography. This last article contains many references to generalizations of optimization methods to a fixed finite dimensional Riemannian manifold, while our interest is in the infinite dimensional space of embeddings. Finally, the strongest connection to date between manifold learning and differential geometry is in the work of Fefferman et al. [16, 17, 18, 19, 20] on the “manifold hypothesis.”

Although using gradient descent for manifold learning has widespread applications to machine learning, discretizing the flow, which is needed for most implementations, has many unstudied challenges. For example, [3], which is most closely related to our paper, uses a fixed step size in its gradient flow implementation. Our paper is the first to address the maximum step size that ensures a manifold remains an embedding when finding a low dimensional representation of training data.

3. Proof Outline for the Discretized Gradient Flow Estimate

Because of the computational detail in §5, we give an overview of the proof structure and the locations of key results.

3.1. General Overview

In Theorem 4.2 in §4, we give a natural condition on the penalty function $P:\mathcal{E}\longrightarrow\mathbb{R}$ under which $\nabla P$ is pointwise normal to an embedding $\phi(M)$. Throughout the paper, we assume that $P$ satisfies this condition.

Given a pointwise normal vector field $u$ along $\phi(M)$ with the length of each vector in $u$ at most one, §5, which has our main results, gives a lower bound for $t^{*}$ such that

$$\phi_{t}(m)=\phi(m)+tu_{m}$$

remains an embedding for all $|t|<t^{*}$. (Footnote 1: The Euler class of the normal bundle $e\in H^{N-{\rm dim}(M)}(M)$ is the obstruction to the global existence of a unit normal vector field. Since $e$ may be nonzero, we must refer to vector fields whose elements have length at most one. If $N>2\,{\rm dim}(M)$, the obstruction vanishes.) In particular, this applies to $u=k_{\phi}^{-1}\cdot\nabla P_{\phi}$, where $k_{\phi}=\max_{x\in M}\|\nabla P_{\phi(x)}\|$. Since $M$ is compact, it suffices to prove that $\phi_{t}$ is an injective immersion.
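The simplest instance of the displacement $\phi_{t}=\phi+tu$ is the unit circle in $\mathbb{R}^{2}$ pushed along its inward unit normal, sketched below. The example, the function names, and the sampled speed test are assumptions made for illustration; the code does not compute the bound $t^{*}$ of Theorem 5.7, it only shows how the immersion property can fail at a finite step.

```python
import numpy as np

def displaced_circle(t, n=360):
    """Samples of phi_t = phi + t*u for the unit circle phi and inward unit normal u."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    phi = np.stack([np.cos(theta), np.sin(theta)], axis=1)
    u = -phi   # inward unit normal along the unit circle
    return phi + t * u

def min_speed(points):
    """Smallest norm of the parameter derivative, via periodic central differences."""
    n = points.shape[0]
    h = 2.0 * np.pi / n
    deriv = (np.roll(points, -1, axis=0) - np.roll(points, 1, axis=0)) / (2.0 * h)
    return float(np.linalg.norm(deriv, axis=1).min())

# phi_t is the circle of radius 1 - t, so its speed is |1 - t|.
speeds = {t: min_speed(displaced_circle(t)) for t in (0.0, 0.5, 0.9, 1.0)}
```

For $0\leq t<1$ the image is a smaller round circle and $\phi_{t}$ is still an embedding; at $t=1$ the whole circle collapses to the origin and the differential vanishes. In this example the critical step equals the reciprocal of the curvature of the circle, which is consistent with the dependence of $t^{*}$ on extrinsic geometry described above.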

3.2. Note on Computation of Key Values

Proposition 5.1 gives a condition under which $\phi_{t}$ is an immersion, and Theorem 5.2 defines a bound $t^{*}$ such that $\phi_{t}$ is injective for $|t|<t^{*}$. Finally, together with our assumption that $M$ is compact, our main result Theorem 5.7 concludes that the mapping is an embedding. In the proof of Theorem 5.2, $t^{*}$ is initially a function of the quantities $\epsilon,\delta_{H},\delta,K$. Here $\epsilon$ is defined in §5.1(1), and $K$ is explicitly defined in §5.1(7) as the maximal principal eigenvalue of $\phi(M)$. In Lemma 4, $\epsilon$ is computed as a function of $\delta$ and $K$, so $t^{*}=t^{*}(K,\delta,\delta_{H})$. The dependence of $t^{*}$ on $\delta_{H}$ is eliminated after (5.24), so finally $t^{*}=t^{*}(K,\delta)$.

The computation of $\delta$ is significantly more involved. The characterizing property of $\delta$ is in §5.1(8). $\delta$ is defined in (5.2) as the minimum of a quantity $\delta(q_{0},v_{0})$, where $(q_{0},v_{0})$ is in the normal bundle of $\phi(M)$. In turn, $\delta(q_{0},v_{0})$ is computed in the proof of Proposition 5.6 in three steps, each of which builds on the prior: $\delta^{0}(q_{0},v_{0})$ is defined in (5.17), $\delta^{1}(q_{0},v_{0})$ is defined by (5.18), and $\delta^{2}(q_{0},v_{0})$ is defined in (5.22).
Finally $\delta(q_{0},v_{0})$ is defined in (5.23) in terms of $\delta^{0}(q_{0},v_{0}),\delta^{2}(q_{0},v_{0})$. These steps are recapped in Remark 5.2.

4. A Condition for Normal Gradient Vector Fields

As outlined in the introduction, manifold learning involves searching for an embedding $\phi:M\longrightarrow\mathbb{R}^{N}$ with $y_{i}\in{\rm Im}(\phi)$ for training data $\{y_{i}\}$. Of course, $y_{i}\in{\rm Im}(\phi)$ iff $y_{i}\in{\rm Im}(\phi\circ g)$, where $g\in{\rm Diff}(M)$ is a diffeomorphism of $M$. Thus the penalty term $P_{1}:\mathcal{E}\longrightarrow\mathbb{R}$ which measures goodness of fit should not distinguish between $\phi$ and $\phi\circ g$, i.e., this penalty term must be invariant under the action of ${\rm Diff}(M)$: $P_{1}(\phi)=P_{1}(\phi\circ g)$.
The data penalty term $P_{1}(\phi)=\sum_{i=1}^{r}d^{2}(\phi(M),y_{i})$ in the introduction is clearly diffeomorphism-invariant. (Since the quotient space $\mathcal{E}/{\rm Diff}(M)$ may have a non-Hausdorff topology, we consider diffeomorphism-invariant penalty functions on $\mathcal{E}$, rather than penalty functions on the quotient space.) These types of invariant functionals are familiar in gauge theory, where functionals are invariant under gauge group actions, and in Gromov-Witten theory, where maps are defined only up to holomorphic automorphisms.
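The invariance $P_{1}(\phi)=P_{1}(\phi\circ g)$ can be checked numerically on a small example. The sketch below, under assumptions chosen only for illustration (an ellipse as $\phi$, the reparametrization $g(\theta)=\theta+0.5\sin\theta$ of $S^{1}$, three arbitrary data points, and sampled approximations of all quantities), compares the data term, the length, and a finite-difference $s=1$ Sobolev norm before and after composing with $g$.

```python
import numpy as np

n = 4000
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
h = 2.0 * np.pi / n

def ellipse(t):
    """phi: S^1 -> R^2, an ellipse with semi-axes 2 and 1."""
    return np.stack([2.0 * np.cos(t), np.sin(t)], axis=1)

def reparam(t):
    """g in Diff(S^1): g'(t) = 1 + 0.5*cos(t) > 0 and g(t + 2*pi) = g(t) + 2*pi."""
    return t + 0.5 * np.sin(t)

def data_term(points, data):
    """Sampled P1: sum of squared distances from each y_i to the nearest sample."""
    diffs = data[:, None, :] - points[None, :, :]
    return float(np.sum(np.min(np.sum(diffs ** 2, axis=2), axis=1)))

def length(points):
    """Length of the closed polygon through the samples (1-dimensional volume)."""
    return float(np.sum(np.linalg.norm(np.roll(points, -1, axis=0) - points, axis=1)))

def sobolev1(points):
    """Finite-difference s = 1 Sobolev norm in the parameter theta."""
    deriv = (np.roll(points, -1, axis=0) - points) / h
    return float(np.sqrt(h * np.sum(points ** 2 + deriv ** 2)))

data = np.array([[3.0, 0.5], [-1.0, 2.0], [0.2, -0.3]])   # arbitrary illustrative data
phi, phi_g = ellipse(theta), ellipse(reparam(theta))

p1, p1_g = data_term(phi, data), data_term(phi_g, data)
len_phi, len_g = length(phi), length(phi_g)
sob, sob_g = sobolev1(phi), sobolev1(phi_g)
```

Up to discretization error, the data term and the length agree for $\phi$ and $\phi\circ g$, because both depend only on the image $\phi(S^{1})$. The parameter-dependent Sobolev norm changes noticeably, which is the reason for preferring geometric regularization terms when invariance is required.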

Similarly, we can replace the non-diffeomorphism invariant regularization term $\|\phi\|_{s}$, which is computed in a choice of local coordinates, by e.g. $P_{2}^{\prime}(\phi)={\rm vol}(\phi(M))$, which measures a combination of the first derivatives of $\phi=(\phi^{1},\ldots,\phi^{N})$, or by

$$P_{2}^{\prime}(\phi)=\int_{M}\left[\sum_{j=1}^{N}(({\rm Id}+\Delta)^{s}\phi^{j})\cdot\phi^{j}\right]^{1/2}{\rm dvol}_{M},$$

which is equivalent to the $s$-Sobolev norm by the basic elliptic estimate. As a simple example, for $\mathcal{E}={\rm Emb}(S^{2},\mathbb{R}^{3})$, $P_{1}^{\prime}(\phi)=d^{2}(\phi(S^{2}),\vec{0})$, $P_{2}^{\prime}={\rm vol}(\phi(S^{2}))$, and for the standard unit sphere as the initial embedding $\phi_{0}(S^{2})$, gradient flow for $P^{\prime}=P^{\prime}_{1}+P^{\prime}_{2}$ shrinks the unit sphere to the origin in infinite time.
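A heuristic one-dimensional reduction of the sphere example can be sketched as follows. It rests on assumptions that are not derived in the text: the flow is restricted to round spheres of radius $r$ centered at the origin, the radius is used as the coordinate (this is not the gradient with respect to a metric on $\mathcal{E}$), and so $P^{\prime}(r)=r^{2}+4\pi r^{2}$ with flow $dr/dt=-2(1+4\pi)r$. The exact solution $r(t)=e^{-2(1+4\pi)t}$ reaches zero only as $t\to\infty$, while an explicit Euler step that is too large collapses the sphere in a single step.

```python
import math

# dP'/dr = RATE * r for P'(r) = (1 + 4*pi) * r**2 (round spheres only; an assumption).
RATE = 2.0 * (1.0 + 4.0 * math.pi)

def shrink_round_sphere(step, n_steps, r0=1.0):
    """Explicit Euler iterates r_{k+1} = r_k - step * RATE * r_k."""
    radii = [r0]
    for _ in range(n_steps):
        radii.append(radii[-1] - step * RATE * radii[-1])
    return radii

# With step = 0.5 / RATE each iterate halves the radius: always positive, tending to 0.
safe = shrink_round_sphere(step=0.5 / RATE, n_steps=50)

# With step = 1 / RATE a single step sends the radius to (numerically) zero,
# i.e. the discretized flow leaves the space of embeddings.
too_big = shrink_round_sphere(step=1.0 / RATE, n_steps=1)
```

In this toy model the largest admissible fixed step is $1/(2(1+4\pi))$; it plays the role of a step-size bound but is not the quantity $t^{*}$ of Theorem 5.7.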

In this section, we prove that such penalty functions have gradients that are pointwise normal vector fields to $M$, and apply this result to $\mathcal{E}$. We first review a known result about the gradient function on a finite dimensional manifold with a group action. Recall that for a $C^{1}$ function $P:Z\longrightarrow\mathbb{R}$ on an oriented Riemannian manifold $(Z,h)$, the gradient vector field $\nabla P$ is characterized by

$$dP_{m}(v)=\langle\nabla P,v\rangle_{h(m)},$$

for all $m\in Z$, $v\in T_{m}Z$. Here $dP_{m}:T_{m}Z\longrightarrow\mathbb{R}$, the differential of $P$ at $m$, is independent of the Riemannian metric.

Lemma 4.1.

Let $G$ be a connected Lie group acting via isometries on a Riemannian manifold $Z$. A function $P: Z \longrightarrow \mathbb{R}$ is $G$-invariant ($P(g \cdot m) = P(m)$ for all $m \in Z$, $g \in G$) iff $\nabla P_m$ is perpendicular to the orbit $\mathcal{O}_m = \{g \cdot m : g \in G\}$ for all $m \in Z$.

Strictly speaking, we mean $\nabla P(m) \perp_{h(m)} T_m \mathcal{O}_m$.

Proof.

If $P$ is $G$-invariant, then $\mathcal{O}_m$ is contained in a level set of $P$. The gradient is always perpendicular to a level set: for $X \in T_m \mathcal{O}_m$, take a curve $\gamma(t) \in \mathcal{O}_m$ with $\gamma(0) = m$ and $\dot{\gamma}(0) = X$, and compute

\[ 0 = (d/dt)|_{t=0} P(\gamma(t)) = dP_m(X) = \langle \nabla P_m, X \rangle. \]

Conversely, assume that $\nabla P_m \perp T_m\mathcal{O}_m$ for all $m$. Take a smooth path $\eta(t)$, $t \in [0,1]$, from $e \in G$ to a fixed $g \in G$, and for a fixed $m \in Z$ define $\gamma(t) = \eta(t) \cdot m$. Since $\gamma(t)$ stays in the orbit $\mathcal{O}_m = \mathcal{O}_{\gamma(t)}$, we have $\dot{\gamma}(t) \in T_{\gamma(t)}\mathcal{O}_{\gamma(t)}$. Then

\[ 0 = \langle \nabla P_{\gamma(t)}, \dot{\gamma}(t) \rangle = dP_{\gamma(t)}(\dot{\gamma}(t)), \]

so $P$ is constant along $\gamma(t)$. In particular, $P(m) = P(\gamma(0)) = P(\gamma(1)) = P(g \cdot m)$. ∎
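
As a sanity check on Lemma 4.1, the following minimal sketch takes $Z = \mathbb{R}^2$ with $G = SO(2)$ acting by rotations. The particular function `P`, the sample points, and the finite-difference step are illustrative assumptions and do not come from the text. The sketch checks numerically that the gradient of a rotation-invariant function has no component along the orbit tangent $(-y, x)$.

```python
import math

def P(x, y):
    # A rotation-invariant function on R^2: it depends only on x^2 + y^2.
    r2 = x * x + y * y
    return math.sin(r2) + 0.5 * r2

def grad(f, x, y, h=1e-6):
    # Central finite-difference approximation of the Euclidean gradient.
    gx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    gy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return gx, gy

def orbit_tangent(x, y):
    # Velocity at t = 0 of the rotation orbit t -> R(t)(x, y).
    return -y, x

def max_orbit_component(points):
    # Largest |<grad P, orbit tangent>| over the sample points.
    worst = 0.0
    for x, y in points:
        gx, gy = grad(P, x, y)
        tx, ty = orbit_tangent(x, y)
        worst = max(worst, abs(gx * tx + gy * ty))
    return worst

sample = [(0.3, -1.2), (1.0, 0.5), (-0.7, 0.7), (2.0, 0.1)]
print(max_orbit_component(sample))  # zero up to finite-difference error
```

Replacing `P` by a function that is not rotation invariant, such as $x^2 + 2y^2$, makes the printed value visibly nonzero, which is the other direction of the lemma.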

We want to apply this result with $Z$, $G$ given by $\mathcal{E}$, $\mathrm{Diff}(M)$, respectively. (Since $\mathrm{Diff}(M)$ need not be connected, we have to restrict to $\mathrm{Diff}_0(M)$, the connected component of the identity diffeomorphism.) The smooth structure on mapping spaces is well known (see e.g., [15]). Rather than go through the technicalities of the Lie group structure on $\mathrm{Diff}(M)$ [34], we give a direct proof.

The tangent space $T_\phi\mathcal{E}$ at an embedding $\phi$ is given by the infinitesimal variation of a family of embeddings $\phi_t$ with $\phi_0 = \phi$, which for fixed $m \in M$ is given by $(d/dt)|_{t=0}\phi_t(m) \in T_{\phi(m)}\mathbb{R}^N \simeq \mathbb{R}^N$. Thus elements $X$ of $T_\phi\mathcal{E}$ are "$\mathbb{R}^N$-valued vector fields along $\phi(M)$," i.e., smooth functions $X: M \longrightarrow \mathbb{R}^N$.

For $\phi \in \mathcal{E}$, $M$ has a Riemannian metric $g_\phi$ given by the $\phi$-pullback of the standard metric/dot product on $\mathbb{R}^N$ restricted to $\phi(M)$. Specifically, for $v, w \in T_m M$, $\langle v, w \rangle_m = d\phi(v) \cdot d\phi(w)$. Denote the associated volume form on $M$ by $\mathrm{dvol}_\phi$. We take the $L^2$ inner product on $T_\phi\mathcal{E}$ associated to the standard metric/dot product on $\mathbb{R}^N$ and $g_\phi$:

\[ \langle X, Y \rangle_\phi = \int_M X_m \cdot Y_m \ \mathrm{dvol}_\phi(m). \]

Thus the gradient of $P: \mathcal{E} \longrightarrow \mathbb{R}$ is characterized by

\[ dP_\phi(X) = \langle \nabla P_\phi, X \rangle_\phi = \int_M \nabla P_m \cdot X_m \ \mathrm{dvol}_\phi(m). \]

$\mathrm{Diff}(M)$ acts on $\phi \in \mathcal{E}$ by $g \cdot \phi = \phi \circ g^{-1}$. It is standard that $\mathrm{Diff}(M)$ acts via isometries on $\mathcal{E}$ with the $L^2$ metric.
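
The isometry property can be checked numerically in the simplest case $M = S^1$, $N = 2$. The sketch below is only an illustration under stated assumptions: the ellipse `phi`, the fields `X` and `Y`, and the reparametrization `reparam` (standing in for $g^{-1}$) are hypothetical choices, $\mathrm{dvol}_\phi = |\phi'(\theta)|\,d\theta$ is approximated by central differences, and the integral by a periodic Riemann sum. It compares $\langle X, Y \rangle_\phi$ with $\langle X \circ g^{-1}, Y \circ g^{-1} \rangle_{\phi \circ g^{-1}}$.

```python
import math

def phi(theta):
    # Hypothetical embedding of S^1 in R^2: an ellipse.
    return (2.0 * math.cos(theta), math.sin(theta))

def X(theta):
    # Hypothetical R^2-valued vector field along the curve.
    return (math.cos(3 * theta), 1.0)

def Y(theta):
    # A second hypothetical vector field along the curve.
    return (math.sin(theta) ** 2, math.cos(theta))

def reparam(theta):
    # A diffeomorphism of S^1 (derivative 1 + 0.3 cos(theta) > 0); plays the role of g^{-1}.
    return theta + 0.3 * math.sin(theta)

def compose(f, g):
    return lambda t: f(g(t))

def l2_inner(emb, U, V, n=4000, eps=1e-6):
    # Approximates the integral over S^1 of U . V dvol_emb, with dvol_emb = |emb'(theta)| d(theta).
    dtheta = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = i * dtheta
        p_plus, p_minus = emb(t + eps), emb(t - eps)
        speed = math.hypot(p_plus[0] - p_minus[0], p_plus[1] - p_minus[1]) / (2 * eps)
        u, v = U(t), V(t)
        total += (u[0] * v[0] + u[1] * v[1]) * speed * dtheta
    return total

base = l2_inner(phi, X, Y)
moved = l2_inner(compose(phi, reparam), compose(X, reparam), compose(Y, reparam))
print(base, moved)  # the two values agree up to discretization error
```

The agreement is just the change of variables formula: the factor $g'$ picked up by the volume form cancels the change of variable in the integrand.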

In our setting, we can strengthen Lemma 4.1 to the pointwise normal condition
$\nabla P_{\phi(m)} \cdot Q_m = 0$ for all $Q_m \in T_{\phi(m)}\phi(M)$, $m \in M$, for a $\mathrm{Diff}(M)$-invariant $P$.

Theorem 4.2.

For a $C^1$ function $P: \mathcal{E} \longrightarrow \mathbb{R}$, the gradient $\nabla P$ is pointwise normal to $T_{\phi(m)}\phi(M)$ for all $m \in M$ and for all $\phi \in \mathcal{E}$ if and only if $P$ is invariant under diffeomorphisms in $\mathrm{Diff}_0(M)$, the path connected component of the identity in $\mathrm{Diff}(M)$.

We note that this pointwise perpendicularity is measured in the usual dot product on $\mathbb{R}^N$, even though we have implicitly been using $\phi$-pullback metrics on $M$. In particular, the theoretical use of the pullback metric does not affect the practical implementation of discretized gradient flow.

Proof.

Assume $P$ is $\mathrm{Diff}_0(M)$-invariant. As in the Lemma, we conclude that $\nabla P_\phi \perp_{L^2} T_\phi\mathcal{O}_\phi$.

Take a family of diffeomorphisms $g_t$ of $M$ with $g_0 = \mathrm{Id}$ and with tangent vector $X = (d/dt)|_{t=0}\, g_t \in T_{\mathrm{Id}}\mathrm{Diff}(M)$. Then $\phi \circ g_t \in \mathcal{O}_\phi$, and the vector field $(d/dt)|_{t=0}\, \phi \circ g_t = d\phi(X)$ tangent to $\phi(M)$ is in $T_\phi\mathcal{O}_\phi$.
Conversely, any tangent vector field $V$ to $\phi(M)$ integrates to a family of diffeomorphisms in $\mathrm{Diff}_0(M)$, so we conclude that $V \in T_\phi\mathcal{O}_\phi$ and that (up to a choice of topology on $\mathrm{Diff}(M)$) $T_\phi\mathcal{O}_\phi$ is the space of tangent vector fields to $\phi(M)$.

Fix $m_0 \in M$ and a vector $Q_{m_0} \in T_{\phi(m_0)}\phi(M)$. Choose a sequence $\epsilon_k \longrightarrow 0$ and smooth nonnegative functions $f_k: \phi(M) \longrightarrow \mathbb{R}$ such that $\int_M f_k\ \mathrm{dvol}_\phi = 1$, $\mathrm{supp}(f_k) \subset B_{\epsilon_k}(\phi(m_0)) \cap \phi(M)$, with $B_{\epsilon_k}(\phi(m_0))$ the Euclidean ball of radius $\epsilon_k$ centered at $\phi(m_0)$.
Extend $Q_{m_0}$ to a smooth tangent vector field $Q = Q_m$ on $\phi(M)$, and define the vector fields $Y_k$ on $\phi(M)$ by:

\[ Y_k(\phi(m)) = f_k(\phi(m)) \cdot Q_m. \]

Then, since each $Y_k$ is a tangent vector field to $\phi(M)$ and hence lies in $T_\phi\mathcal{O}_\phi$, we have

\begin{align*}
0 &= \lim_{\epsilon_k \longrightarrow 0} \langle \nabla P_\phi, Y_k \rangle = \lim_{\epsilon_k \longrightarrow 0} \langle \nabla P_\phi, f_k \cdot Q \rangle = \lim_{\epsilon_k \longrightarrow 0} \int_M \nabla P_\phi(\phi(m)) \cdot f_k(\phi(m)) Q_m \ \mathrm{dvol}_\phi \\
&= \nabla P_\phi(\phi(m_0)) \cdot Q_{m_0}.
\end{align*}

Therefore $\nabla P_\phi(\phi(m_0)) \perp Q_{m_0}$, and so $\nabla P_\phi(\phi(m_0)) \perp T_{\phi(m_0)}\phi(M)$.

For the converse, assume that $\nabla P_\phi(\phi(m)) \perp T_{\phi(m)}\phi(M)$ for all $m \in M$. Then $\nabla P \perp_{L^2} Q$ for all tangent vector fields $Q$ to $\phi(M)$, and so $\nabla P$ is perpendicular to the orbit of $\mathrm{Diff}_0(M)$. As in Lemma 4.1, we conclude that $P$ is $\mathrm{Diff}_0(M)$-invariant. ∎
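
A discrete analogue of Theorem 4.2 can be seen for closed polygons in $\mathbb{R}^2$. The sketch below is an illustration under our own assumptions and is not the construction used in this paper: the penalty is polygon length, a discretization of the reparametrization-invariant length functional, and the gradient is the Euclidean gradient in the vertex coordinates rather than the $L^2$ gradient. At each vertex this gradient is the difference $u_{\mathrm{in}} - u_{\mathrm{out}}$ of the adjacent unit edge vectors, which is exactly perpendicular to the discrete tangent direction $u_{\mathrm{in}} + u_{\mathrm{out}}$, since $(u_{\mathrm{in}} - u_{\mathrm{out}}) \cdot (u_{\mathrm{in}} + u_{\mathrm{out}}) = |u_{\mathrm{in}}|^2 - |u_{\mathrm{out}}|^2 = 0$. The helper names and the non-uniform sampling of the ellipse are hypothetical.

```python
import math

def ellipse_vertices(n, a=2.0, b=1.0):
    # Non-uniformly sampled ellipse, so the check does not rely on symmetry.
    verts = []
    for i in range(n):
        t = 2 * math.pi * i / n
        t = t + 0.3 * math.sin(t)
        verts.append((a * math.cos(t), b * math.sin(t)))
    return verts

def polygon_length(verts):
    # Length of the closed polygon through the vertices.
    n = len(verts)
    return sum(math.hypot(verts[(i + 1) % n][0] - verts[i][0],
                          verts[(i + 1) % n][1] - verts[i][1]) for i in range(n))

def unit(v):
    norm = math.hypot(v[0], v[1])
    return (v[0] / norm, v[1] / norm)

def length_gradient(verts):
    # Gradient of polygon_length with respect to each vertex, and a discrete tangent.
    n = len(verts)
    grads, tangents = [], []
    for i in range(n):
        prev_pt, cur, nxt = verts[i - 1], verts[i], verts[(i + 1) % n]
        u_in = unit((cur[0] - prev_pt[0], cur[1] - prev_pt[1]))
        u_out = unit((nxt[0] - cur[0], nxt[1] - cur[1]))
        grads.append((u_in[0] - u_out[0], u_in[1] - u_out[1]))
        tangents.append((u_in[0] + u_out[0], u_in[1] + u_out[1]))
    return grads, tangents

verts = ellipse_vertices(200)
grads, tangents = length_gradient(verts)
worst = max(abs(g[0] * t[0] + g[1] * t[1]) for g, t in zip(grads, tangents))
print(worst)  # tangential component of the gradient, zero up to roundoff
```

A penalty that depends on the parametrization, for example the sum of squared edge lengths, would in general produce a nonzero tangential component that redistributes the vertices along the curve.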

5. Estimates for Flows in Normal Gradient Directions

Under the assumption that our penalty function is diffeomorphism invariant, Theorem 4.2 shows that to implement discretized gradient flow we have to know how far $\phi(M)$ can move in a fixed normal gradient direction while remaining in the space of embeddings. The next set of results gives an explicit estimate for the lower bound $t^*$ of this flow, with the main result in Theorem 5.7.

Throughout the paper, we assume that $M$ is compact. By Prop. 1.1, $\phi: M \longrightarrow \mathbb{R}^N$ is an embedding iff it is an injective immersion. Recall that $\phi$ is an immersion if its differential $d\phi$ is pointwise injective, which is the infinitesimal condition for the map $\phi$ to be a local injection. Thus, there are two types of obstructions to a linearly deformed embedding $\phi_t$ of $\phi$ remaining an embedding: (1) a local obstruction, where distinct nearby points in $\phi(M)$ deform to the same point in $\phi_t(M)$; (2) a global obstruction, where points far from each other in the induced Riemannian metric on $\phi(M)$ deform to the same point in $\phi_t(M)$ because they are close in $\mathbb{R}^N$. The local obstruction is controlled by the injectivity of the differential. Specifically, in Theorem 5.7, we conclude that $t^*$ is ultimately a function of $K$ and $\delta$, where $K$ is a bound on the principal curvatures of $\phi(M)$ and thus controls the local obstruction. The global obstruction, which cannot be treated by infinitesimal means, is controlled in Theorem 5.7 by $\delta$, which is constructed by bounds in the Implicit Function Theorem.
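
The local obstruction can be made concrete for a plane curve. The following sketch is an illustration under our own assumptions and does not reproduce the bound of Theorem 5.7: it deforms the ellipse $\phi(\theta) = (a\cos\theta, b\sin\theta)$ along its unit inward normal, $\phi_t = \phi + t\,n$, and tracks the signed speed $\phi_t' \cdot T = |\phi'|(1 - t\kappa)$, where $T$ is the unit tangent and $\kappa$ the curvature. The deformed map stops being an immersion once this quantity reaches zero, which first happens at $t = 1/\kappa_{\max} = b^2/a$. The global obstruction is not tested here, and the function names and parameters are hypothetical.

```python
import math

A, B = 2.0, 1.0  # semi-axes of the ellipse (illustrative values)

def phi(theta):
    return (A * math.cos(theta), B * math.sin(theta))

def unit_tangent(theta):
    dx, dy = -A * math.sin(theta), B * math.cos(theta)
    s = math.hypot(dx, dy)
    return (dx / s, dy / s)

def inward_normal(theta):
    # For a counterclockwise convex curve, rotating T by +90 degrees points inward.
    tx, ty = unit_tangent(theta)
    return (-ty, tx)

def deformed(theta, t):
    # Linear deformation phi_t = phi + t * n in the normal direction.
    p, nrm = phi(theta), inward_normal(theta)
    return (p[0] + t * nrm[0], p[1] + t * nrm[1])

def min_signed_speed(t, n=720, h=1e-5):
    # Minimum over the curve of phi_t'(theta) . T(theta), by central differences.
    worst = float("inf")
    for i in range(n):
        th = 2 * math.pi * i / n
        p_plus, p_minus = deformed(th + h, t), deformed(th - h, t)
        dx = (p_plus[0] - p_minus[0]) / (2 * h)
        dy = (p_plus[1] - p_minus[1]) / (2 * h)
        tx, ty = unit_tangent(th)
        worst = min(worst, dx * tx + dy * ty)
    return worst

for t in (0.25, 0.5, 0.75):
    print(t, min_signed_speed(t))  # positive, then zero at t = B*B/A, then negative
```

With $a = 2$, $b = 1$ the critical step is $b^2/a = 0.5$; beyond it the tangent reverses near the ends of the major axis and $\phi_t$ is no longer an immersion.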

5.1. Notation and Definitions

  1. (1)

    $\epsilon = \epsilon_\phi$ is chosen so that each $s$ in the $\epsilon$-neighborhood $B_\epsilon(\phi(M))$ of $\phi(M)$ has a unique closest point $q = q(s)$ in $\phi(M)$. The existence of this neighborhood is guaranteed by the $\epsilon$-Neighborhood Theorem [26, Thm. 6.24]. $B_\epsilon(\phi(M))$ is diffeomorphic to a neighborhood of the zero section of the normal bundle $\nu = \nu_\phi$ of $\phi(M)$: we have $s - q \in \nu_q = \nu_{\phi,q}$, the fiber of $\nu_\phi$ at $q$, and the map $s \mapsto s - q$ is the diffeomorphism. A lower bound for $\epsilon$ is given in Lemma 5.5 in terms of $\delta$ in (8) below; it will become explicit in Remark 5.2.

  2. (2)

    We use two sets of coordinates on $\mathbb{R}^N$. Standard (global) coordinates are denoted $(x^1, \ldots, x^N)$. We also represent points $s \in B_\epsilon(\phi(M))$ as

    \[ s = (q^1, \ldots, q^k, v^1, \ldots, v^{N-k}) = (q, v), \]

    where the $q^i$ are local manifold coordinates and the $v^j$ are local coordinates for the normal space. These are called normal coordinates. Thus $q \in \phi(M)$ has $q = (q^1, \ldots, q^k, 0, \ldots, 0)$. Here $k = \dim(M)$. Note that normal coordinates are not well defined outside $B_\epsilon(\phi(M))$.

  3. (3)

    A vector in $\nu_\phi$ can be expressed either as $t v_q$, where $v_q$ is a unit length vector at $q$, or as $v^i w_{i,q}$ (summing over $i$), where $\{w_{i,q}\}$ is an orthonormal basis of $\nu_{\phi,q}$. There are $N-k$ vectors $w_{i,q}$, each with $N$ Euclidean coordinates.

  4. (4)

    The endpoint map $E: \nu_\phi \rightarrow \mathbb{R}^N$ is $E(q, v) = q + v$. It is given explicitly by:

    \[ E(q^1, \ldots, q^k, v^1, \ldots, v^{N-k}) = (x^1(q) + v^i w_{i,q}^1, \ldots, x^N(q) + v^i w_{i,q}^N), \]

    where the domain is in normal coordinates and the range is in standard coordinates.

    Definition 5.1.

    [31, §6] $e = q_e + v_e \in B_\epsilon(\phi(M))$ is a focal point if the Jacobian of the map $E$ is not full rank at $(q_e, v_e)$.

  5. (5)

    The inclusion map $\phi(M) \rightarrow \mathbb{R}^N$ is $q = (q^1, \cdots, q^k) \mapsto (x^1(q), \cdots, x^N(q)) = x(q)$ in manifold to Euclidean coordinates, so the first fundamental form is the matrix $(g_{ij}) = \big(\frac{\partial x}{\partial q^i} \cdot \frac{\partial x}{\partial q^j}\big)$, where $\cdot$ is the Euclidean dot product. The second fundamental form at the normal vector $v \in \nu_\phi$ is the matrix $\mathrm{II}_v = \left(v \cdot \frac{\partial^2 x}{\partial q^i \partial q^j}\right)$.

  6. (6)

    At a fixed $q \in \phi(M)$, we may choose manifold coordinates so that the first fundamental form is the identity matrix. The principal curvatures of $v$ at $q$ are by definition the eigenvalues $p_1, \ldots, p_k$ of $\mathrm{II}_v$. Here $p_i = p_i(q, v)$.

  7. (7)

    Let $K$ be the maximal principal eigenvalue of $\phi(M)$. Thus we take the maximum of the $p_i(q, v)$ over all $q \in \phi(M)$ and all unit vectors $v \in \nu_{\phi,q}$.

  8. (8)

    $\delta$ is chosen such that normal lines of length $\epsilon$ based at different, close points of $\phi(M)$ do not intersect: for $m_1 \neq m_2$ with $d_{\mathbb{R}^N}(\phi(m_1), \phi(m_2)) < \delta$, $\phi(m_1) + t_1 v_1 \neq \phi(m_2) + t_2 v_2$ for unit normal vectors $v_i \in \nu_{\phi(m_i)}$, $i = 1, 2$, and $|t_1|, |t_2| < \epsilon$, with $\epsilon$ defined in (1) above. $\delta$ is precisely defined in (5.2), and estimated in Remark 5.2. ($\delta$ is the reach of $\phi(M)$, as in e.g. [20].)
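
The endpoint map, focal points, and $K$ from the list above can be computed by hand for a circle of radius $R$ in $\mathbb{R}^2$ ($k = 1$, $N = 2$). The sketch below is an illustration under our own assumptions: the names, the radius, and the finite-difference steps are hypothetical, and instead of normalizing coordinates so that the first fundamental form is the identity, the single principal curvature is computed as $\mathrm{II}_v / g_{11}$. With outward unit normal $w$, the endpoint map is $E(q, v) = (R + v)(\cos q, \sin q)$ with Jacobian determinant $-(R + v)$, so the only focal point is the center, at normal distance $R = 1/K$.

```python
import math

R = 2.0  # radius of the circle (illustrative value)

def x(q):
    # Inclusion map q -> x(q) in Euclidean coordinates.
    return (R * math.cos(q), R * math.sin(q))

def w(q):
    # Outward unit normal, an orthonormal basis of the normal line at x(q).
    return (math.cos(q), math.sin(q))

def E(q, v):
    # Endpoint map E(q, v) = x(q) + v * w(q).
    p, nrm = x(q), w(q)
    return (p[0] + v * nrm[0], p[1] + v * nrm[1])

def jacobian_det(q, v, h=1e-5):
    # Determinant of the 2x2 Jacobian of E, by central differences.
    eq = [(a - b) / (2 * h) for a, b in zip(E(q + h, v), E(q - h, v))]
    ev = [(a - b) / (2 * h) for a, b in zip(E(q, v + h), E(q, v - h))]
    return eq[0] * ev[1] - eq[1] * ev[0]

def principal_curvature(q, sign, h=1e-3):
    # Eigenvalue of II_v relative to g_11 for the unit normal v = sign * w(q).
    xp, x0, xm = x(q + h), x(q), x(q - h)
    d1 = [(a - b) / (2 * h) for a, b in zip(xp, xm)]
    d2 = [(a - 2 * c + b) / (h * h) for a, c, b in zip(xp, x0, xm)]
    g11 = d1[0] * d1[0] + d1[1] * d1[1]
    nrm = w(q)
    return sign * (nrm[0] * d2[0] + nrm[1] * d2[1]) / g11

# K: maximum over sample points and over both unit normals at each point.
K = max(principal_curvature(2 * math.pi * i / 12, s) for i in range(12) for s in (1, -1))
print(K, jacobian_det(0.7, -R), jacobian_det(0.7, 0.5))
```

The Jacobian degenerates exactly at normal distance $1/K$ along the inward normal, which is the mechanism by which $K$ enters the step-length estimates.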

Remark 5.1.

In the calculations below, estimates for $\epsilon, \delta, K$ are computed explicitly in terms of $\phi$, local coordinates on $M$, and local coordinates on $\nu_\phi$. Specifically, a lower bound for $\epsilon$ in terms of $K$ and $\delta$ is given in Lemma 5.5. $K$ of course depends on $\phi$, but is in fact independent of coordinates on $M$, as it is the maximum eigenvalue of any normal component of the trace of the second fundamental form. The estimate of $\delta$ uses $\phi$, local coordinates on $M$, and local coordinates on $\nu_\phi$ in, e.g., the proof of Proposition 5.6. It is reasonable to assume knowledge of coordinates on $M$, as a manifold is specified by a cover of charts. In fact, local coordinates on $M$ and $\phi$ determine local coordinates on $\nu_\phi$.²
² Take the standard basis $\{e_i\}$ of $\mathbb{R}^N$. For $I = (i_1, \cdots, i_{N-k})$ with $1 \leq i_1 < \cdots < i_{N-k} \leq N$, lexicographically ordered, set $e_I = (e_{i_1}, \ldots, e_{i_{N-k}})$. Let $U_I$ be the open set of $q \in \phi(M)$ such that $I$ is the smallest multi-index such that the projection of $e_I$ into $\nu_{\phi,q}$ is a basis of $\nu_{\phi,q}$. Then $\nu_\phi$ is trivial over $U_I$, and we can form a new, fixed cover of $M$ by taking $\{V_i \cap U_I\}$.
In particular, the local coordinates on $\nu_\phi$ in (2) are not extra data, since the embedding $\phi$ determines which $q$ are in which $U_I$. Thus, in the end our estimates depend only on local coordinates on $M$ and on $\phi$. See Remark 5.2 for more details.

5.2. Calculating the Flow Length to Remain an Embedding

In this section, we compute $t^*$ such that for $t < t^*$ and $u$ a normal vector field along $\phi(M)$ with $|u_{\phi(m)}| \leq 1$, the deformed manifold $\phi_t(M) = \{\phi(m) + t u_{\phi(m)} : m \in M\}$ is an embedding. As above, it suffices to prove that each $\phi_t$ is an immersion.

We start by determining which normal deformations $\phi_t(M)$ of $\phi(M)$ are still immersions.

Proposition 5.1.

Let $u$ be a normal vector field of length at most one along $\phi(M) \subset \mathbb{R}^N$, and let $\epsilon$ be defined in §5.1(1). Then $\phi_t(M) = \{\phi(m) + t u_{\phi(m)} : m \in M\}$ is immersed in $\mathbb{R}^N$ for $|t| < \epsilon$.

Proof.

Because $\phi : M \longrightarrow \mathbb{R}^N$ is an embedding, it suffices to show that the map $F_t : \phi(M) \rightarrow \phi_t(M)$, $F_t(q) = q + t u_q$, is an immersion. In normal coordinates, we have

\[ F_t(q^1, \ldots, q^k) = (q^1, \ldots, q^k, t u^1_q, \ldots, t u^{N-k}_q). \]

The differential $DF_t$, written as an $N \times k$ matrix, is of the form

\[ DF_t = \left(\begin{array}{c} \mathrm{Id}_{k \times k} \\ \hline \star \end{array}\right), \]

where $\star$ is some $(N-k) \times k$ matrix. This has rank $k$, so $F_t$ is an immersion. We note $\epsilon$ is implicitly used as normal coordinates are only defined in $B_\epsilon(\phi(M))$.

Thus $\phi_t$ is an embedding if it is injective. Theorem 5.2 proves injectivity for $|t| \leq t^*$, where $t^*$ is defined in the Theorem statement. The proof of Theorem 5.2 follows after the proofs of Lemmas 5.3-5.5 and Proposition 5.6.

Theorem 5.2.

Let $u$ be a normal vector field of length at most one along $\phi(M) \subset \mathbb{R}^N$. Let $t^* = \min\{K^{-1}, \delta/3\}$. Then $\phi_t : M \rightarrow \mathbb{R}^N$ given by $m \mapsto \phi(m) + t u_{\phi(m)}$ is injective for $|t| \leq t^*$.

Here $\delta$ is given by §5.1(8), and will be estimated explicitly after the proof of Proposition 5.6.
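As a rough numerical illustration of the step bound in Theorem 5.2, the following sketch takes the unit circle in $\mathbb{R}^2$ with inward unit normal field $u = -\phi$. The values $K = 1$ and $\delta = 1$ are assumptions made for this example (the curvature of the unit circle and, following §5.1(8), its reach), and the helper names `safe_step`, `deform_circle`, and `min_pairwise_distance` are hypothetical rather than taken from the paper. A sampled check of this kind can only suggest injectivity; it does not prove it.

```python
import math


def safe_step(K, delta):
    """Step bound t* = min{1/K, delta/3} from Theorem 5.2 (assumes K > 0, delta > 0)."""
    return min(1.0 / K, delta / 3.0)


def deform_circle(t, n=360):
    """Sample phi_t(theta) = phi(theta) + t*u on the unit circle.

    Assumption for this example: u = -phi is the inward unit normal,
    so phi_t(theta) = (1 - t) * (cos(theta), sin(theta)).
    """
    pts = []
    for i in range(n):
        theta = 2.0 * math.pi * i / n
        x, y = math.cos(theta), math.sin(theta)
        pts.append((x - t * x, y - t * y))
    return pts


def min_pairwise_distance(pts):
    """Smallest Euclidean distance between two distinct sample points."""
    return min(
        math.dist(p, q)
        for i, p in enumerate(pts)
        for q in pts[i + 1:]
    )


# Assumed geometric data for the unit circle: K = 1, delta = 1.
t_star = safe_step(1.0, 1.0)
separation_at_t_star = min_pairwise_distance(deform_circle(t_star, n=60))
separation_at_focal = min_pairwise_distance(deform_circle(1.0, n=60))
```

In this example the samples stay separated at $t = t^*$, while at $t = K^{-1} = 1$ every sample lands on the center of the circle, so the deformed map is no longer injective. The bound $t^*$ is therefore safe here, though conservative.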

Proof. As in the previous proof, it suffices to show that $F_t : \phi(M) \longrightarrow \phi_t(M)$ is injective. We extend $F_t$ to a map between open subsets of $\mathbb{R}^N$ by setting

\[ H_t : B_{\epsilon - t}(\phi(M)) \longrightarrow B_\epsilon(\phi(M)), \quad H_t(b) = b + t u_{q(b)}, \]

where $q(b)$ is the closest point in $\phi(M)$ to $b$. Note that $H_t|_{\phi(M)} = F_t$ and that $H_t$ is defined only for $|t| < \epsilon$.

We now proceed with a series of Lemmas.

Lemma 5.3.

For each $q \in \phi(M)$, there exists a ball $B_{\delta^q_{H_t}}(q)$ of radius $\delta^q_{H_t}$ around $q$ on which $H_t$ is a diffeomorphism.

Proof.

In normal coordinates, we have

\[ H_t(b) = H_t(q^1, \ldots, q^k, v^1, \ldots, v^{N-k}) = (q^1, \ldots, q^k, v^1 + t u^1_{q(b)}, \ldots, v^{N-k} + t u^{N-k}_{q(b)}). \]

For $q = (q, 0) \in \phi(M)$, the differential of the $H_t$ map has the matrix

\[ DH_t(q) = \left(\begin{array}{c|c} \mathrm{Id}_{k \times k} & \frac{\partial q^i}{\partial v^j} \\ \hline \frac{\partial (v^i + t u^i_q)}{\partial q^j} & \frac{\partial (v^i + t u^i_q)}{\partial v^j} \end{array}\right) = \left(\begin{array}{c|c} \mathrm{Id}_{k \times k} & 0 \\ \hline \frac{\partial (v^i + t u^i_q)}{\partial q^j} & \mathrm{Id}_{(N-k) \times (N-k)} \end{array}\right). \]

This matrix is block lower triangular with identity blocks on the diagonal, so its determinant is $1$ and it is invertible. The Lemma then follows from the inverse function theorem. ∎

Let $\delta_{H_t} = \min_q \{\delta^q_{H_t}\}$. Set

\[ \delta_H = \min\{\delta_{H_t} : |t| \leq .999\epsilon\}. \tag{5.1} \]

From the proof of the inverse function theorem, we can choose $\delta^q_{H_t} > 0$ to be continuous in $t$. We need $|t| < \epsilon$, and then the further restriction $|t| \leq .999\epsilon$ ensures that $t$ lies in a compact subset of $\mathbb{R}$. Thus $\delta_{H_t}$ and $\delta_H$ are positive. Note that $\delta_H = \delta_H(u)$ depends on the choice of the normal vector field $u$.

Lemma 5.4.

$H_t|_{\phi(M)}$ is injective for $|t| < t^* \stackrel{\rm def}{=} \min\{\epsilon, \delta_H/3\}$.

Proof.

Assume instead that there exist distinct $x, y \in \phi(M)$ such that $x + t u_x = y + t u_y$ for $|t| < t^*$. By Lemma 5.3, $d_{\mathbb{R}^N}(x, y) > \delta_{H_t}$. Then

\begin{align*}
\delta_{H_t} &< d_{\mathbb{R}^N}(x, y) = |x - y| = |x - (x + t u_x) + (x + t u_x) - y| \\
&\leq |x - (x + t u_x)| + |(y + t u_y) - y| = |t u_x| + |t u_y| \leq 2|t| < 2t^* \\
&\leq 2\delta_{H_t}/3,
\end{align*}

since $t^* \leq \delta_{H_t}/3$. This is a contradiction. ∎

We now compute $\epsilon$ in §5.1(1) in terms of $K$ in §5.1(7) and $\delta$ in §5.1(8). As mentioned above, $K$ is computed locally on $\phi(M)$, while $\delta$ is computed globally using the Euclidean distance.

Lemma 5.5.

Set $\epsilon = \min\{K^{-1}, \delta/3\}$, where $K$ is given in §5.1(7) and $\delta$ is given in §5.1(8). Then every point in $B_\epsilon(\phi(M))$ has a unique closest point in $\phi(M)$.

Proof.

By [31, Lem. 6.3], the focal points (Def. 5.1) of $\phi(M)$ along the normal line $l = q + tv$ are precisely the points $q + p_i^{-1} v$, where the $p_i$ are the nonzero principal curvatures. The proof of the $\epsilon$-Neighborhood Theorem in [26, Thm. 6.24] uses the invertibility of the endpoint map, so we must have $\epsilon < K^{-1}$.

Suppose there exists $b \in B_\epsilon(\phi(M))$ with two distinct closest points $x, y \in \phi(M)$. Then $b = x + t v_x = y + t' v_y$ for unit normal vectors $v_x$ at $x$, $v_y$ at $y$, and $|t|, |t'| < \epsilon$. By definition of $\delta$, we have $d_{\mathbb{R}^N}(x, y) > \delta$. As in the previous proof, we have

\begin{align*}
\delta < d_{\mathbb{R}^N}(x, y) &= |x - y| = |x - (x + t v_x) + (y + t' v_y) - y| \\
&\leq |t||v_x| + |t'||v_y| < 2\epsilon \leq 2\delta/3,
\end{align*}

a contradiction. ∎
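For orientation, the statement of Lemma 5.5 can be checked by hand on the unit circle in $\mathbb{R}^2$. This is an illustrative example rather than part of the original argument, and it assumes $K = 1$ and $\delta = 1$ (taking $\delta$ to be the reach, as in §5.1(8)):

```latex
\begin{align*}
\epsilon &= \min\{K^{-1}, \delta/3\} = \min\{1, 1/3\} = 1/3, \\
B_\epsilon(\phi(M)) &= \{ b \in \mathbb{R}^2 : 2/3 < |b| < 4/3 \}, \\
q(b) &= b/|b| \quad \text{for every } b \in B_\epsilon(\phi(M)).
\end{align*}
```

The closest-point map $q(b) = b/|b|$ is well defined whenever $b \neq 0$, and the origin, which is the only point with more than one closest point on the circle, lies outside $B_\epsilon(\phi(M))$.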

We can now define $\delta$ in (5.2) below, after which we explicitly estimate it in the proof of Proposition 5.6. The steps of the estimate are recapped in Remark 5.2. We first restrict the endpoint map $E : \nu_\phi \longrightarrow \mathbb{R}^N$ to the compact set $W = \{v \in \nu_\phi : |v| \leq .999K^{-1}\}$. For fixed $q_0 \in \phi(M)$ and $(q_0, v_0) \in \nu_{\phi, q_0} \cap W = W_{q_0}$, the proof of Lemma 5.3 shows that $DE(q_0, v_0)$ is invertible. Therefore, there is a ball of radius $\delta(q_0, v_0) > 0$ around $(q_0, v_0)$ on which $E$ is a diffeomorphism. Set $\delta_{q_0} = \delta(q_0, 0)$ and

\[ A_{q_0} = \{q \in \phi(M) : d_{\mathbb{R}^N}(q, q_0) < \delta_{q_0}/2\}. \]

We claim that $E$ is a diffeomorphism on the set $B_{q_0} \subset \nu_\phi$ given by

\[ B_{q_0} = \{(q, v) : |v| < \delta_{q_0}/2,\ q \in A_{q_0}\}. \]

Indeed, for $(q_1, v_1) \in B_{q_0}$, we have

\[ |(q_1, v_1) - (q_0, 0)| \leq |(q_1, v_1) - (q_1, 0)| + |(q_1, 0) - (q_0, 0)| \leq \delta_{q_0}/2 + \delta_{q_0}/2 = \delta_{q_0}. \]

Thus for $(q_1, v_1), (q_2, v_2) \in B_{q_0}$ and $(q_1, v_1) \neq (q_2, v_2)$, we conclude $(q_1, 0), (q_2, 0) \in A_{q_0}$ and $E(q_1, v_1) \neq E(q_2, v_2)$. Since $E$ is invertible on $B_{q_0}$, it is a diffeomorphism onto its image.

We set

\[
\delta = \frac{1}{2}\min\{\delta(q_0,v_0) : (q_0,v_0)\in\nu_\phi,\ |v_0|\leq .999K^{-1}\}. \tag{5.2}
\]

Since $M$ is compact and $|v_0|$ lies in a compact interval, $\delta$ is positive. In other words, for $q_1,q_2\in\phi(M)$ with $d_{\mathbb{R}^N}(q_1,q_2)<\delta$, we have $q_1+v_1\neq q_2+v_2$ for $|v_1|,|v_2|<\delta$ and $(q_1,v_1)\in\nu_{\phi,q_1}$, $(q_2,v_2)\in\nu_{\phi,q_2}$.

For a fixed $(q_0,v_0)$, it remains to compute $\delta(q_0,v_0)$ explicitly, after which $\delta$ in (5.2) is explicit. The computation of $\delta(q_0,v_0)$ uses a quantitative version [28] of the Implicit Function Theorem, given in the next Proposition. The proof is in the Appendix.

To set the notation, let the matrix norm $\|A\|$ be the sup norm of the absolute values of the entries. For $G\in C^1(\mathbb{R}^{m+n},\mathbb{R}^m)$, let $(s_0,y_0)\in\mathbb{R}^m\times\mathbb{R}^n$ satisfy $G(s_0,y_0)=0$. For fixed $\delta>0$, set $V_\delta = V_{\delta(s_0,y_0)} = \{(s,y)\in\mathbb{R}^{m+n} : |s-s_0|\leq\delta,\ |y-y_0|\leq\delta\}$. We focus on the case $G(s,y)=E(s)-y$ for $m=n$, the usual method to derive the Inverse Function Theorem from the Implicit Function Theorem.

Proposition 5.6.

Assume that the $m\times m$ matrix $\partial_s G(s_0,y_0)$ of partial derivatives of $G$ in the $s$ directions is invertible. Choose $\delta^0>0$ such that

\[
\sup_{(s,y)\in V_{\delta^0}} \left\|{\rm Id} - [\partial_s G(s_0,y_0)]^{-1}\partial_s G(s,y)\right\| \leq 1/2. \tag{5.3}
\]

Set
(I)      $B_{\delta^0} = \sup_{(s,y)\in V_{\delta^0}}\|\partial_y G(s,y)\|$,
(II)    $P = \|\partial_s G(s_0,y_0)^{-1}\|$,
(III)   $\delta^1 = (2PB_{\delta^0})^{-1}\delta^0$.
Then for the case $n=m$ and $G(s,y)=E(s)-y$, on the set $\{(s,y) : \|s-s_0\|<\delta^0,\ \|y-y_0\|<\delta^1,\ G(s,y)=0\}$, $E$ has a $C^1$ inverse: $E(s)=y$ iff $s=E^{-1}(y)$. Equivalently, $E$ is a $C^1$ diffeomorphism on

\[
E^{-1}(B_{\delta^1}(y_0))\cap B_{\delta^0}(s_0). \tag{5.4}
\]
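The constants in the Proposition can be traced through a one-dimensional toy case. The following is a minimal sketch, assuming the hypothetical map $E(s)=s+s^2/2$ on $\mathbb{R}$ with $s_0=0$, $y_0=0$ (so $m=n=1$); this map, the sampling resolution, and all function names are illustrative choices, not taken from the paper. It samples the left side of (5.3), forms $P$, $B_{\delta^0}$ and $\delta^1$ as in (I)-(III), and uses the closed-form inverse available for this particular map.

```python
import math

def E(s):
    # Hypothetical toy map on the real line (m = n = 1); not the paper's endpoint map.
    return s + 0.5 * s * s

def dE(s):
    # Derivative of the toy map.
    return 1.0 + s

def sampled_defect(s0, delta0, samples=2001):
    # Sampled version of the sup in (5.3): max over |s - s0| <= delta0
    # of |1 - dE(s0)^{-1} dE(s)|. (G = E(s) - y, so the sup does not depend on y.)
    worst = 0.0
    for i in range(samples):
        s = s0 - delta0 + 2.0 * delta0 * i / (samples - 1)
        worst = max(worst, abs(1.0 - dE(s) / dE(s0)))
    return worst

def delta1(delta0, P, B):
    # Item (III): delta^1 = (2 P B)^{-1} delta^0.
    return delta0 / (2.0 * P * B)

def E_inverse(y):
    # Closed-form inverse of the toy map on the branch through s0 = 0.
    return -1.0 + math.sqrt(1.0 + 2.0 * y)

s0, d0 = 0.0, 0.5
defect = sampled_defect(s0, d0)  # should not exceed 1/2 for this choice of delta^0
P = abs(1.0 / dE(s0))            # item (II)
B = 1.0                          # item (I): |d/dy (E(s) - y)| = 1
d1 = delta1(d0, P, B)
```

For this toy map the choice $\delta^0=1/2$ gives $\delta^1=1/4$, and every $y$ with $|y|<\delta^1$ has its preimage `E_inverse(y)` inside $|s|<\delta^0$, which is the content of (5.4) in this special case.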

To apply the Proposition, we set $n=m=N$ and $G((q,v),y)=E(q,v)-y$, where $E$ is the endpoint map. We follow the Proposition's labeling in a series of steps:

Criterion I: Independent of the value of $\delta^0=\delta^0((q_0,v_0),y_0)$, we have

\[
\begin{aligned}
B_{\delta^0} &= \sup_{((q,v),y)\in V_{\delta^0}}\|\partial_y G((q,v),y)\| = \sup_{((q,v),y)\in V_{\delta^0}}\|\partial_y(E(q,v)-y)\| \\
&= \sup_{((q,v),y)\in V_{\delta^0}}\|-{\rm Id}\| = 1.
\end{aligned}
\]

Criterion II: By §5.1(4),(7),

\[
\partial_{(q,v)}G((q_0,v_0),y_0) = DE(q_0,v_0)
\]

is invertible for $|v|<K^{-1}$. In the notation of §5.1(4),

\[
DE(q_0,v_0) =
\left(\begin{array}{cccccc}
\left.\left(\frac{\partial x^1}{\partial q^1}+v^i\frac{\partial w_i^1}{\partial q^1}\right)\right|_{(q_0,v_0)} & \cdots & \left.\left(\frac{\partial x^1}{\partial q^k}+v^i\frac{\partial w_i^1}{\partial q^k}\right)\right|_{(q_0,v_0)} & w^1_{1,q_0} & \cdots & w^1_{N-k,q_0} \\
\vdots & & \vdots & \vdots & & \vdots \\
\left.\left(\frac{\partial x^N}{\partial q^1}+v^i\frac{\partial w_i^N}{\partial q^1}\right)\right|_{(q_0,v_0)} & \cdots & \left.\left(\frac{\partial x^N}{\partial q^k}+v^i\frac{\partial w_i^N}{\partial q^k}\right)\right|_{(q_0,v_0)} & w^N_{1,q_0} & \cdots & w^N_{N-k,q_0}
\end{array}\right) \tag{5.8}
\]

By Cramer’s rule,

\[
P = \|DE(q_0,v_0)^{-1}\| = |\det(DE(q_0,v_0))|^{-1}\,\|DE(q_0,v_0)^*\|, \tag{5.9}
\]

where $DE(q_0,v_0)^*_{(i,j)}$ is the usual minor of $DE(q_0,v_0)$ obtained by deleting the $i^{\rm th}$ row and $j^{\rm th}$ column. (The absolute value of the determinant is needed since $P$ is a norm; the signs and transposition in the cofactor matrix do not affect the sup norm of the entries.) Since $\phi$ and the $w_i$ are given, we obtain an estimate for $P$.
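As an illustration of the Cramer's rule computation, here is a minimal sketch for a toy case that is an assumption of this example, not part of the paper: the unit circle in $\mathbb{R}^2$ ($N=2$, $k=1$, $K=1$) with outward unit normal, so that the endpoint map is $E(\theta,v)=(1+v)(\cos\theta,\sin\theta)$. The function names are hypothetical. The determinant is taken in absolute value so that the result is a norm, and the matrix norm is the maximum absolute entry, as in the text.

```python
import math

def DE_circle(theta, v):
    # Jacobian of the hypothetical endpoint map E(theta, v) = (1 + v)(cos theta, sin theta):
    # first column is the theta-derivative, second column is the normal vector w.
    c, s = math.cos(theta), math.sin(theta)
    return [[-(1.0 + v) * s, c],
            [(1.0 + v) * c, s]]

def max_entry_norm(A):
    # Matrix norm used in the text: sup of the absolute values of the entries.
    return max(abs(x) for row in A for x in row)

def P_via_cramer(A):
    # For a 2x2 matrix [[a, b], [c, d]], the adjugate is [[d, -b], [-c, a]]
    # and the inverse is adjugate / det, so ||A^{-1}|| = ||adjugate|| / |det|.
    (a, b), (c, d) = A
    det = a * d - b * c
    adjugate = [[d, -b], [-c, a]]
    return max_entry_norm(adjugate) / abs(det)

P_outside = P_via_cramer(DE_circle(0.0, 0.5))   # base point pushed outward
P_inside = P_via_cramer(DE_circle(0.0, -0.5))   # base point pushed toward the center
```

In this toy case the determinant is $-(1+v)$, which is negative for $|v|<1$, and $P$ grows as $v\to -1=-K^{-1}$, consistent with $DE$ losing invertibility at the focal point.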

Criterion III: We now compute $\delta^1=\delta^1(q_0,v_0)$, $\delta^0=\delta^0(q_0,v_0)$ such that (5.3) holds for $((q,v),y)$. Since (5.3) is independent of $y$ in our case, we need $\delta^0(q_0,v_0)$ such that

\[
|(q,v)-(q_0,v_0)| < \delta^0(q_0,v_0) \Rightarrow \left\|{\rm Id}-[DE(q_0,v_0)]^{-1}DE(q,v)\right\| \leq 1/2. \tag{5.10}
\]

We consider a first order Taylor expansion of $DE(q,v)$ around $s_0=(q_0,v_0)$. (Note: The summed index $j$ below refers to coordinates in $\mathbb{R}^N$, not an exponent.) For $s=(q,v)$, we have

\[
\begin{aligned}
DE(s) &= DE(s_0) + \left(\begin{array}{ccc}
R^{(1,1)}_j(q,v)(s-s_0)^j & \cdots & R^{(1,N)}_j(q,v)(s-s_0)^j \\
\vdots & & \vdots \\
R^{(N,1)}_j(q,v)(s-s_0)^j & \cdots & R^{(N,N)}_j(q,v)(s-s_0)^j
\end{array}\right) \\
&\stackrel{\rm def}{=} DE(s_0) + \left(R^{(p,r)}_j(q,v)(s-s_0)^j\right).
\end{aligned}
\]

As in Criterion II, set $f^r_p = \frac{\partial x^r(q)}{\partial q^p} + v^i\frac{\partial w_i^r(q)}{\partial q^p}$ for all $1\leq p\leq N$, $1\leq r\leq k$, and $f^r_p = w^r_{p,q}$ for $1\leq p\leq N$, $k+1\leq r\leq N$. A uniform bound on the error term is given by Taylor's theorem with integral remainder:

\[
\begin{aligned}
\left|R^{(p,r)}_j(q,v)(s-s_0)^j\right| &\leq \left|\int_0^1 (1-t)\,\partial_j f^r_p\big((1-t)(q_0,v_0)+t(q,v)\big)\,dt\right| \cdot \left|(s-s_0)^j\right| \\
&\leq \max\left\{\left|\partial_j f^r_p(q,v)\right| : 1\leq j\leq N,\ |v|\leq .999K^{-1},\ q\in\phi(M)\right\}|s-s_0| \\
&\stackrel{\rm def}{=} L^{(p,r)}_j|s-s_0|.
\end{aligned}
\]

Here $\partial_j$ differentiates in the $s$ variable. Set

\[
L = \max_{j,p,r}\{L^{(p,r)}_j\}. \tag{5.15}
\]

Substituting the Taylor expansion of $DE(s)$ around $s_0$ into the right-hand side of (5.10) and canceling the identity matrix, the matrix norm in (5.10) becomes

\[
\begin{aligned}
\left\|[DE(q_0,v_0)]^{-1}\left(R^{(p,r)}_j(q,v)(s-s_0)^j\right)\right\| &= \max_{j,p,r}\left|([DE(q_0,v_0)]^{-1})^p_\ell\left(R^{(\ell,r)}_j(q,v)(s-s_0)^j\right)\right| \\
&\leq N\,\|[DE(q_0,v_0)]^{-1}\|\cdot L\cdot\delta^0(q_0,v_0),
\end{aligned} \tag{5.16}
\]

where the $N$ comes from the sum over $\ell=1,\ldots,N$. Setting

\[
\delta^0(q_0,v_0) = \left[2N\,\|DE(q_0,v_0)^{-1}\|\cdot L\right]^{-1}, \tag{5.17}
\]

we conclude that the estimate (5.10) is satisfied.
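The choice (5.17) can be made concrete in a toy case. The sketch below again assumes the unit circle in $\mathbb{R}^2$ with endpoint map $E(\theta,v)=(1+v)(\cos\theta,\sin\theta)$, $K=1$, and a base point $(\theta_0,v_0)=(0,0)$ at which $P=\|DE^{-1}\|=1$; these are assumptions of the example, and the function names are hypothetical. The constant $L$ of (5.15) is estimated by sampling the first derivatives of the entries of $DE$ over $|v|\leq .999K^{-1}$. Note that a sampled maximum can underestimate a true supremum in general; here the grid happens to contain the maximizer $\theta=0$, $v=.999$.

```python
import math

def sampled_L(K, n_theta=360, n_v=201):
    # Sampled version of (5.15) for the toy circle: the largest absolute value of any
    # first partial derivative of any entry of DE over theta in [0, 2 pi), |v| <= 0.999 / K.
    v_max = 0.999 / K
    worst = 0.0
    for i in range(n_theta):
        theta = 2.0 * math.pi * i / n_theta
        c, s = math.cos(theta), math.sin(theta)
        for j in range(n_v):
            v = -v_max + 2.0 * v_max * j / (n_v - 1)
            # Entries of DE: -(1+v) sin, cos, (1+v) cos, sin.
            d_theta = [-(1.0 + v) * c, -s, -(1.0 + v) * s, c]
            d_v = [-s, 0.0, c, 0.0]
            worst = max(worst, max(abs(x) for x in d_theta + d_v))
    return worst

def delta0(N, P, L):
    # Equation (5.17): delta^0 = [2 N ||DE^{-1}|| L]^{-1}.
    return 1.0 / (2.0 * N * P * L)

L_bound = sampled_L(K=1.0)
d0 = delta0(N=2, P=1.0, L=L_bound)
```

With these inputs the bound $N\,\|DE^{-1}\|\,L\,\delta^0$ in (5.16) equals $1/2$ by construction, which is exactly what (5.10) requires.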

In summary, we now have

\[
\delta^1(q_0,v_0) = (2PB_{\delta^0(q_0,v_0)})^{-1}\delta^0(q_0,v_0) = (2P)^{-1}\delta^0(q_0,v_0), \tag{5.18}
\]

by Criterion I. Thus $\delta^1(q_0,v_0)$ is estimated by Criteria II and III.

By Proposition 5.6, $E$ is a diffeomorphism on $E^{-1}(B_{\delta^{1}(q_0,v_0)}(y_0))\cap B_{\delta^{0}(q_0,v_0)}(q_0,v_0)$. To be explicit, we want to find a radius $\delta(q_0,v_0)$ such that

$$B_{\delta(q_0,v_0)}(q_0,v_0)\subset E^{-1}(B_{\delta^{1}(q_0,v_0)}(y_0))\cap B_{\delta^{0}(q_0,v_0)}(q_0,v_0). \tag{5.19}$$

We first find $\delta^{2}(q_0,v_0)$ such that

$$|(q,v)-(q_0,v_0)|<\delta^{2}(q_0,v_0)\ \Rightarrow\ |E(q,v)-E(q_0,v_0)|=|E(q,v)-y_0|<\delta^{1}(q_0,v_0).$$

In other words, we want

$$|(q,v)-(q_0,v_0)|<\delta^{2}(q_0,v_0)\ \Rightarrow\ E(q,v)\in B_{\delta^{1}(q_0,v_0)}(y_0). \tag{5.20}$$

As above, we compute $\delta^{2}(q_0,v_0)$ by a Taylor series expansion of $E$ around $(q_0,v_0)$:

$$E(q,v)=E(q_0,v_0)+\left(\sum_{j}R^{1}_{j}(q,v)\,((q,v)-(q_0,v_0))^{j},\ \ldots,\ \sum_{j}R^{N}_{j}(q,v)\,((q,v)-(q_0,v_0))^{j}\right),$$

with

$$|R^{p}_{j}(q,v)|\leq\max\left\{\left|\partial_{j}(\phi^{p}+v^{i}w_{i}^{p})(q,v)\right| : 1\leq j\leq N,\ |v|\leq .999\,K^{-1},\ q\in\phi(M)\right\}\stackrel{\rm def}{=}S^{p}. \tag{5.21}$$

For $s_0=(q_0,v_0)$, $s=(q,v)$, we have

$$\begin{aligned}
|E(s)-E(s_0)|^{2} &= \sum_{p=1}^{N}\left(\sum_{j}R^{p}_{j}(s)\,(s-s_0)^{j}\right)^{2}\leq\sum_{p=1}^{N}\left(\sum_{j}|R^{p}_{j}(s)|^{2}\right)|s-s_0|^{2}\\
&\leq N\left(\sum_{p=1}^{N}|S^{p}|^{2}\right)|s-s_0|^{2}\leq\sum_{p=1}^{N}\sum_{j}|S^{p}\,\delta^{2}(q_0,v_0)|^{2}.
\end{aligned}$$

Therefore, for

$$\delta^{2}(q_0,v_0)=\delta^{1}(q_0,v_0)\left(N\sum_{p=1}^{N}|S^{p}|^{2}\right)^{-1/2}, \tag{5.22}$$

estimate (5.20) holds. Finally, setting

$$\delta(q_0,v_0)=\min\{\delta^{2}(q_0,v_0),\delta^{0}(q_0,v_0)\} \tag{5.23}$$

accomplishes (5.19).
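As a numerical sanity check on the estimate leading to (5.22), the following Python sketch draws hypothetical remainder coefficients with $|R^{p}_{j}|\leq S^{p}$ and compares $|Rd|$ with $(N\sum_{p}|S^{p}|^{2})^{1/2}|d|$ for a displacement $d$ standing in for $s-s_0$. The helper names `remainder_bound`, `apply_matrix`, and `norm`, as well as the random data, are illustrative assumptions and do not come from the paper.

```python
import math
import random


def remainder_bound(S):
    """Return sqrt(N * sum_p (S^p)^2), the constant inverted in (5.22)."""
    N = len(S)
    return math.sqrt(N * sum(s * s for s in S))


def apply_matrix(R, d):
    """Multiply the N x N coefficient matrix R by the vector d."""
    return [sum(r * x for r, x in zip(row, d)) for row in R]


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


random.seed(0)
N = 5
# Hypothetical row bounds S^p and coefficients with |R[p][j]| <= S[p].
S = [random.uniform(0.5, 2.0) for _ in range(N)]
R = [[random.uniform(-1.0, 1.0) * S[p] for _ in range(N)] for p in range(N)]
d = [random.gauss(0.0, 1.0) for _ in range(N)]  # stands in for s - s0

lhs = norm(apply_matrix(R, d))        # plays the role of |E(s) - E(s0)|
rhs = remainder_bound(S) * norm(d)    # right-hand side of the estimate
print(lhs <= rhs)
```

Dividing $\delta^{1}(q_0,v_0)$ by `remainder_bound(S)` reproduces the choice of $\delta^{2}(q_0,v_0)$ in (5.22).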

By Lemmas 5.4 and 5.5, and using (5.2) to define $\delta$, we know that Theorem 5.2 holds, i.e., $\phi_t$ is injective, for

$$t^{*}<\min\{K^{-1},\delta_H/3,\delta/3\}. \tag{5.24}$$

If we prove that $\delta_H\geq\delta$, then we get injectivity of $\phi_t$ for $t^{*}<\min\{K^{-1},\delta/3\}$, which is Theorem 5.2.

By the definition of $\delta$ in §3.1(8), $x,y\in\phi(M)$ and $d_{\mathbb{R}^N}(x,y)<\delta$ imply $x+t_1v_x\neq y+t_2v_y$ for $|t_i|<\epsilon$ and for any unit normal vectors $v_x,v_y$ at $x,y$, respectively. By Lemma 5.3, for $d_{\mathbb{R}^N}(x,y)<\delta_{H_t}=\delta_{H_t}(u)$ for a fixed normal vector field $u$ of length at most one, we have $x+tu_x\neq y+tu_y$. (By the remarks above Lemma 5.3, we also have $|t|<\epsilon$ here.) Since $\delta$ does not depend on a choice of vector field $u$, we have $\delta\leq\delta_{H_t}(u)$. This implies $\delta\leq\delta_H$. Thus we can conclude that $\phi_t$ is injective for $t^{*}<\min\{K^{-1},\delta/3\}$, and the proof of Theorem 5.2 is complete.

Remark 5.2.

We review the explicit lower bound for $\delta$. For $L$ defined by (5.15), $\delta^{0}(q_0,v_0)$ is defined by (5.17). For $P$ defined by (5.9), $\delta^{1}(q_0,v_0)$ is defined by (5.18). For $S^{p}$ defined in (5.21), $\delta^{2}(q_0,v_0)$ is defined in (5.22). Then (5.23) defines $\delta(q_0,v_0)$. Finally, (5.2) defines $\delta$.

In particular, upper bounds on $L$, $P$, and $S^{p}$ give a lower bound on $\delta$, since each of these constants enters (5.17), (5.18), and (5.22) through an inverse. These constants depend on $q$-derivatives (i.e., $M$ coordinate derivatives) of the $\mathbb{R}^N$ coordinates of $\phi$ and of vectors in $\nu_\phi$ (see e.g., (5.2)). Since the normal bundle is determined by $M$ and $\phi$, our estimates are explicit in the sense of Remark 5.1.
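The chain (5.17), (5.18), (5.22), (5.23) can be written as a short Python sketch. The function name `local_radii` and its argument names are hypothetical; the inputs stand for $N$, $\|DE(q_0,v_0)^{-1}\|$, $L$, $P$, and the list $(S^{1},\ldots,S^{N})$, all assumed to have been computed elsewhere. The passage from $\delta(q_0,v_0)$ to the global $\delta$ via (5.2) is not reproduced here.

```python
import math


def local_radii(N, norm_DE_inv, L, P, S):
    """Return (delta0, delta1, delta2, delta) at a point (q0, v0).

    N            ambient dimension
    norm_DE_inv  the norm of DE(q0, v0)^{-1}
    L, P         the constants of (5.15) and (5.9), supplied by the caller
    S            list of the N constants S^p of (5.21)
    """
    delta0 = 1.0 / (2.0 * N * norm_DE_inv * L)               # (5.17)
    delta1 = delta0 / (2.0 * P)                              # (5.18)
    delta2 = delta1 / math.sqrt(N * sum(s * s for s in S))   # (5.22)
    delta = min(delta2, delta0)                              # (5.23)
    return delta0, delta1, delta2, delta


# Placeholder constants chosen only to make the arithmetic easy to follow.
print(local_radii(N=2, norm_DE_inv=1.0, L=1.0, P=1.0, S=[1.0, 1.0]))
# -> (0.25, 0.125, 0.0625, 0.0625)
```

Each radius is a decreasing function of $L$, $P$, and the $S^{p}$; for instance, doubling $L$ in the call above halves `delta0`.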

5.3. The Main Theorem

Since $M$ is compact and since $\phi_t$ is an injective immersion for $|t|\leq t^{*}$ by Theorem 5.2, by Prop. 1.1 we obtain the main result that $\phi_t$ is an embedding for $t$ less than an explicit $t^{*}$.

Theorem 5.7.

Let $u$ be a normal vector field of length at most one along $\phi(M)\subset\mathbb{R}^N$. Let $t^{*}=\min\{K^{-1},\delta/3\}$, with $K$ defined in §5.1(7) and $\delta$ estimated in Remark 5.2. Then $\phi_t:M\rightarrow\mathbb{R}^N$ given by $m\mapsto\phi(m)+tu_{\phi(m)}$ is an embedding for $|t|\leq t^{*}$.
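To illustrate the statement on the simplest example, the Python sketch below moves sample points of the unit circle in $\mathbb{R}^2$ along the inward unit normal by the step $t^{*}=\min\{K^{-1},\delta/3\}$. The values `K = 1.0` and `delta = 0.6` are placeholders: we assume, without reproducing §5.1(7) or (5.2), that $K=1$ is admissible for the unit circle, and $\delta$ is not computed from the estimates of Remark 5.2. The function names are hypothetical, and checking finitely many sample points is only a sanity check, not a proof that $\phi_t$ is an embedding.

```python
import math


def max_step(K, delta):
    """Step bound t* = min{K^{-1}, delta/3} from Theorem 5.7."""
    return min(1.0 / K, delta / 3.0)


def normal_step(points, normals, t):
    """Evaluate phi_t(m) = phi(m) + t * u at each sampled point."""
    return [(x + t * ux, y + t * uy)
            for (x, y), (ux, uy) in zip(points, normals)]


n = 64
angles = [2.0 * math.pi * i / n for i in range(n)]
points = [(math.cos(a), math.sin(a)) for a in angles]  # phi(M): unit circle
normals = [(-x, -y) for (x, y) in points]              # inward unit normals u

K, delta = 1.0, 0.6            # placeholder constants (assumptions)
t_star = max_step(K, delta)
moved = normal_step(points, normals, t_star)

# The image is again a circle, now of radius 1 - t*.
moved_radii = [math.hypot(x, y) for (x, y) in moved]
print(round(t_star, 6), round(min(moved_radii), 6), round(max(moved_radii), 6))
```

Here the moved points lie on the circle of radius $1-t^{*}$ and the sampled points stay pairwise distinct, as one expects for an embedding.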

6. Discussion

In this paper, we have proposed treating manifold learning by gradient flow techniques that are standard in much of machine learning. By doing gradient flow in the infinite dimensional space of embeddings of a fixed manifold $M$ into $\mathbb{R}^N$, we avoid parametric and RKHS methods. These methods typically restrict the class of manifolds considered to a finite dimensional space, which speeds up computation time at the cost of perhaps oversimplifying the problem. In our approach, we give both a theoretical reason to move only in normal directions to the embedded manifold and theoretical lower bounds on the step length for which each step of a discretized gradient flow stays in the space of embeddings. However, this paper does not discuss computational issues, which must be addressed in future work. In particular, one has to recompute the estimates for the maximal time $t^{*}$ of travel after each step. This reflects the theoretical issue that the gradient flow may leave the space of embeddings in finite time. It may be possible to add a penalty term to the objective function that forces the gradient flow to stay in the space of embeddings. This new term would involve the bounds we computed on both local quantities like $K$ and global quantities like $\delta$ in §5.1.

There are several practical and theoretical issues raised by this approach. On the practical side, if $M$ flows discretely in $k$ steps to a Riemannian manifold $M_k$ with a thin neck, as typically happens in mean curvature flow, then in Thm. 5.7 $K$ will be very large and $t^{*}=t^{*}_k$ will be small at $M_k$. Thus the discretized gradient flow will essentially stop. It may be reasonable to pick the first $k$ such that $K$ at $M_k$ exceeds a specified threshold. We then backtrack to $M_{k-1}$ (or even further back to some $M_{k-r}$ for some $r>1$) and move to $M'_k$ using the gradient at $M_{k-1}$ and new step size $\bar{t}_k<t^{*}_k$, e.g., $\bar{t}_k=(1/2)t^{*}_k$. Since the gradient vector field at $M'_k$ is different from the gradient vector field at $M_k$, the discretized flow may move $M'_k$ to $M'_{k+1}$ with $K$ at $M'_{k+1}$ still below the threshold. Thus we may be able to extend the flow for an increased number of steps.
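The backtracking strategy just described can be sketched in Python as follows. Everything here is a toy under stated assumptions: the state is the radius $r$ of a shrinking circle, `K_of(r) = 1/r` and `t_star_of(r) = r/2` are hypothetical stand-ins for $K$ and $t^{*}$ of Thm. 5.7, and halving the step that was just tried is one reading of $\bar{t}_k=(1/2)t^{*}_k$. The function `flow_with_backtracking` and its parameters are illustrative names, not an implementation from the paper.

```python
def flow_with_backtracking(state, grad, step, K_of, t_star_of, K_max,
                           max_steps=100, shrink=0.5):
    """Discretized flow that retries a step once with a smaller step size.

    If the candidate M_k has K above K_max, we return to M_{k-1}, keep the
    gradient computed there, and retry with the step scaled by `shrink`.
    The flow stops if the retried candidate still violates the threshold.
    """
    history = [state]
    for _ in range(max_steps):
        prev = history[-1]
        g = grad(prev)               # gradient at M_{k-1}
        t = t_star_of(prev)          # step bound at M_{k-1}
        cand = step(prev, g, t)      # candidate M_k
        if K_of(cand) > K_max:
            t = shrink * t           # backtrack: smaller step, same gradient
            cand = step(prev, g, t)  # candidate M'_k
            if K_of(cand) > K_max:
                break
        history.append(cand)
    return history


# Toy model: a circle of radius r moving inward with normal speed 1/r.
history = flow_with_backtracking(
    state=3.0,
    grad=lambda r: 1.0 / r,
    step=lambda r, g, t: r - t * g,
    K_of=lambda r: 1.0 / r,
    t_star_of=lambda r: 0.5 * r,
    K_max=1.5,
)
print([round(r, 6) for r in history])
```

In this toy run the plain flow would stop at radius $1$, since the next full step gives $K=2$; the halved step reaches radius $0.75$ with $K$ still below the threshold, which is the kind of extension by additional steps suggested above.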

There are two theoretical issues that need further examination. The first is the choice of $M$: how is this manifold specified? Based on Riemannian geometry estimates dating to the 1980s, it is reasonable to assume that we want to consider manifolds of a fixed dimension with an a priori lower bound on volume, an upper bound on diameter, and two-sided bounds on sectional curvature. Cheeger’s finiteness theorem [8] asserts that there are only a finite number of diffeomorphism classes among all such manifolds. (It would be interesting to determine if the class $\mathcal{G}(d,V,\tau)$ in [20] has a similar finiteness theorem. We note that the approach of Fefferman et al. has the strong advantage of not specifying the diffeomorphism type of $M$.) However, while this in theory provides us with a finite list of choices, the proof of the finiteness theorem is nonconstructive. In practice, in many cases we might as well assume that $M$ is the closed unit ball $B^k$ in $\mathbb{R}^k$. For example, in the famous Swiss roll examples, the data set appears to lie on the image of a severely deformed $B^2$. In contrast, if the training data appears to lie on a deformed torus, $B^2$ is a worse choice for $M$ than the standard torus.

Perhaps even more importantly, it is unclear how to specify the dimension of $M$ in advance. This has been discussed in the literature: see e.g. [42] and its references for work done before the last decade, and [22] for more recent work. In these works, issues such as the potentially fractal/Hausdorff dimension of the data set have been discussed. From a more geometric mindset, we could speculatively start with a $k$-manifold, and hope that in the long run, $M$ would collapse in the sense of Cheeger-Gromov [9] to a lower dimensional manifold of “best” dimension. This would address the issue that the initial choice for $M$ has to be modified as more data is considered. Even more speculatively, since all Riemannian manifolds are via cut locus arguments homeomorphic to a closed ball with gluings on the boundary, we could start with the $k$-ball $B^k$, add a regularization term, like the volume of $\partial B^k=S^{k-1}$, that penalizes the existence of a boundary, and hope that long time flow provides both dimension collapse and boundary gluing. We have no evidence that this will work, but a low dimensional computation is potentially feasible.

Acknowledgements

Our thanks to Carlangelo Liverani for allowing us to use his Quantitative Implicit Function Theorem. We are also grateful to Qinxun Bai, Andres Larrain-Hubach, Drew Lohn, and the referee for their helpful suggestions.

Appendix A The Quantitative Implicit Function Theorem

This quantitative version of the Implicit Function Theorem and its proof are from [28] (see also [10, Appendix A]).

For notation, recall that $\|A\|$ is the sup norm of the absolute values of the entries of a matrix $A$. For fixed $(x_0,\lambda_0)\in\mathbb{R}^m\times\mathbb{R}^n$ and fixed $\delta>0$, set $V_\delta=V_{\delta(x_0,\lambda_0)}=\{(x,\lambda)\in\mathbb{R}^{m+n}:|x-x_0|\leq\delta,\ |\lambda-\lambda_0|\leq\delta\}$.

For $F\in C^1(\mathbb{R}^{m+n},\mathbb{R}^m)$, let $(x_0,\lambda_0)\in\mathbb{R}^m\times\mathbb{R}^n$ satisfy $F(x_0,\lambda_0)=0$.

Theorem A.1 (Quantitative Implicit Function Theorem).

Assume that the m×m𝑚𝑚m\times mitalic_m × italic_m matrix xF(x0,λ0)subscript𝑥𝐹subscript𝑥0subscript𝜆0\partial_{x}F(x_{0},\lambda_{0})∂ start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT italic_F ( italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_λ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) is invertible and choose δ>0𝛿0\delta>0italic_δ > 0 such that

sup(x,λ)VδId[xF(x0,λ0)]1xF(x,λ)1/2.subscriptsupremum𝑥𝜆subscript𝑉𝛿normIdsuperscriptdelimited-[]subscript𝑥𝐹subscript𝑥0subscript𝜆01subscript𝑥𝐹𝑥𝜆12\sup_{(x,\lambda)\in V_{\delta}}||{\rm Id}-[\partial_{x}F(x_{0},\lambda_{0})]^% {-1}\partial_{x}F(x,\lambda)||\leq 1/2.roman_sup start_POSTSUBSCRIPT ( italic_x , italic_λ ) ∈ italic_V start_POSTSUBSCRIPT italic_δ end_POSTSUBSCRIPT end_POSTSUBSCRIPT | | roman_Id - [ ∂ start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT italic_F ( italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_λ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ] start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ∂ start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT italic_F ( italic_x , italic_λ ) | | ≤ 1 / 2 .

Let $B_\delta = \sup_{(x,\lambda)\in V_\delta} \|\partial_\lambda F(x,\lambda)\|$ and $M = \|\partial_x F(x_0,\lambda_0)^{-1}\|$. Set $\delta^1 = (2MB_\delta)^{-1}\delta$, and set $\Gamma_{\delta^1} = \{\lambda\in\mathbb{R}^n : |\lambda-\lambda_0| < \delta^1\}$, $V_{\delta,\delta^1} = \{(x,\lambda)\in\mathbb{R}^{m+n} : |x-x_0|\le\delta,\ |\lambda-\lambda_0|\le\delta^1\}$.

Then there exists $g\in C^1(\Gamma_{\delta^1},\mathbb{R}^m)$ such that all solutions of the equation $F(x,\lambda)=0$ in the set $V_{\delta,\delta^1}$ are given by $(g(\lambda),\lambda)$. In addition, $\partial_\lambda g(\lambda) = -(\partial_x F(g(\lambda),\lambda))^{-1}\,\partial_\lambda F(g(\lambda),\lambda)$.

Proof.

Take $\lambda\in\Gamma_{\delta^1} = \{|\lambda-\lambda_0| < \delta^1\}$. Consider $U_\delta = \{x\in\mathbb{R}^m : |x-x_0|\le\delta\}$ and $\Omega_\lambda : U_\delta\rightarrow\mathbb{R}^m$ defined by

\[
\Omega_\lambda(x) = x - \partial_x F(x_0,\lambda_0)^{-1} F(x,\lambda).
\]

For $x\in U_\delta$, $F(x,\lambda)=0$ is equivalent to $x=\Omega_\lambda(x)$. We have

\[
|\Omega_\lambda(x_0)-\Omega_{\lambda_0}(x_0)| \le M\,|F(x_0,\lambda)-F(x_0,\lambda_0)| \le MB_\delta\delta^1.
\]

In addition, $|\partial_x\Omega_\lambda| = |\mathrm{Id}-\partial_x F(x_0,\lambda_0)^{-1}\partial_x F(x,\lambda)| \le 1/2$, so $|\Omega_\lambda(x)-\Omega_\lambda(x_0)| \le \frac{1}{2}|x-x_0|$. Thus

\[
\begin{aligned}
|\Omega_\lambda(x)-x_0| &\le |\Omega_\lambda(x)-\Omega_\lambda(x_0)| + |\Omega_\lambda(x_0)-x_0| \\
&\le \frac{1}{2}|x-x_0| + MB_\delta\delta^1 \le \delta.
\end{aligned}
\]

Thus $\Omega_\lambda$ is a contraction on $U_\delta$, and $\Omega_\lambda(x)=x$ has a unique solution $x=g(\lambda)$ by the Contraction Fixed Point Theorem. We have therefore obtained a function $g:\Gamma_{\delta^1}\rightarrow U_\delta$ such that $F(g(\lambda),\lambda)=0$. All solutions in $V_{\delta,\delta^1}$ are of this form: if $F(x_1,\lambda_1)=0$, then

\[
|x_1-g(\lambda_1)| = |\Omega_{\lambda_1}(x_1)-\Omega_{\lambda_1}(g(\lambda_1))| \le \frac{1}{2}|x_1-g(\lambda_1)|,
\]

so $x_1=g(\lambda_1)$.
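The existence argument above is constructive: $g(\lambda)$ is the limit of the iterates of $\Omega_\lambda$ started at $x_0$. The following Python sketch illustrates this on a hypothetical scalar example, $F(x,\lambda)=x^2+\lambda^2-1$ with $(x_0,\lambda_0)=(1,0)$, which is not taken from the paper. The names `omega` and `solve_g` are illustrative, and the constants are computed under the assumption that $V_\delta$ is the closed box $|x-x_0|\le\delta$, $|\lambda-\lambda_0|\le\delta$.

```python
import math

# Hypothetical test problem (not from the paper): F(x, lam) = x^2 + lam^2 - 1,
# whose zero set near (x0, lam0) = (1, 0) is the graph of g(lam) = sqrt(1 - lam^2).
X0, LAM0 = 1.0, 0.0


def F(x, lam):
    return x * x + lam * lam - 1.0


# Inverse of d/dx F at the base point; here d/dx F(x0, lam0) = 2 * x0 = 2.
DXF0_INV = 1.0 / (2.0 * X0)

# Constants of the theorem for this example, assuming V_delta is the closed box
# |x - x0| <= delta, |lam - lam0| <= delta.
DELTA = 0.5                  # on U_delta, |d/dx Omega| = |1 - x| <= 1/2
M = abs(DXF0_INV)            # M = 1/2
B_DELTA = 2.0 * DELTA        # sup of |d/dlam F| = |2 * lam| over V_delta
DELTA1 = DELTA / (2.0 * M * B_DELTA)


def omega(x, lam):
    """The map Omega_lam(x) = x - (d/dx F(x0, lam0))^{-1} F(x, lam)."""
    return x - DXF0_INV * F(x, lam)


def solve_g(lam, tol=1e-13, max_iter=200):
    """Iterate Omega_lam from x0 until successive iterates differ by less than tol."""
    x = X0
    for _ in range(max_iter):
        x_next = omega(x, lam)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


if __name__ == "__main__":
    print("delta^1 =", DELTA1)
    for lam in (-0.4, 0.0, 0.3):
        x = solve_g(lam)
        # Columns: lam, computed g(lam), closed form, whether g(lam) stays in U_delta.
        print(lam, x, math.sqrt(1.0 - lam * lam), abs(x - X0) <= DELTA)
```

In this example the theorem's radius is $\delta^1 = 1/2$, and for each tested $\lambda$ with $|\lambda|<\delta^1$ the iteration should stay in $U_\delta$ and agree with the closed form $\sqrt{1-\lambda^2}$ up to the tolerance.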

For the final statement in the Theorem, let $\lambda,\lambda'\in\Gamma_{\delta^1}$. As above, we have

\[
|g(\lambda)-g(\lambda')| \le \frac{1}{2}|g(\lambda)-g(\lambda')| + MB_\delta|\lambda-\lambda'|.
\]

This yields the Lipschitz continuity of $g$. To obtain differentiability, by Taylor's theorem for $F\in C^1$ and the Lipschitz continuity of $g$, we obtain, for $h\in\mathbb{R}^n$,

\[
\begin{aligned}
0 &= \lim_{|h|\to 0} |h|^{-1}\,\bigl|F(g(\lambda+h),\lambda+h)-F(g(\lambda),\lambda)\bigr| \\
&= \lim_{|h|\to 0} |h|^{-1}\,\bigl|F(g(\lambda+h),\lambda+h)-F(g(\lambda),\lambda+h)+F(g(\lambda),\lambda+h)-F(g(\lambda),\lambda)\bigr| \\
&= \lim_{|h|\to 0} |h|^{-1}\,\bigl|\partial_x F(g(\lambda),\lambda+h)\,(g(\lambda+h)-g(\lambda))+\partial_\lambda F(g(\lambda),\lambda)\,(\lambda+h-\lambda)\bigr| \\
&= \lim_{|h|\to 0} |h|^{-1}\,\Bigl|\partial_x F(g(\lambda),\lambda)\Bigl[g(\lambda+h)-g(\lambda)+(\partial_x F(g(\lambda),\lambda))^{-1}\,\partial_\lambda F(g(\lambda),\lambda)\,h\Bigr]\Bigr|.
\end{aligned}
\]

Since $\partial_x F(g(\lambda),\lambda)$ is invertible, we get $\partial_\lambda g(\lambda) = -(\partial_x F(g(\lambda),\lambda))^{-1}\,\partial_\lambda F(g(\lambda),\lambda)$. ∎
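As a numerical sanity check of the derivative formula, the following Python sketch compares $-(\partial_x F)^{-1}\partial_\lambda F$ at $(g(\lambda),\lambda)$ with a central finite difference of $g$. It uses a hypothetical scalar example, $F(x,\lambda)=x^3+x-\lambda$ with $(x_0,\lambda_0)=(0,0)$, which is not taken from the paper; the function names are illustrative. For this $F$ one can take $\delta=0.4$, $M=1$ and $B_\delta=1$, giving $\delta^1=0.2$.

```python
# Hypothetical test problem (not from the paper): F(x, lam) = x^3 + x - lam
# near (x0, lam0) = (0, 0), where d/dx F(x0, lam0) = 1.
X0 = 0.0


def F(x, lam):
    return x ** 3 + x - lam


def dF_dx(x, lam):
    return 3.0 * x * x + 1.0


def dF_dlam(x, lam):
    return -1.0


def solve_g(lam, tol=1e-14, max_iter=200):
    """Fixed point of Omega_lam(x) = x - F(x, lam); here (d/dx F(x0, lam0))^{-1} = 1."""
    x = X0
    for _ in range(max_iter):
        x_next = x - F(x, lam)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


def dg_formula(lam):
    """The theorem's expression -(d/dx F)^{-1} d/dlam F at (g(lam), lam)."""
    x = solve_g(lam)
    return -dF_dlam(x, lam) / dF_dx(x, lam)


def dg_finite_difference(lam, h=1e-5):
    """Central difference approximation of g'(lam)."""
    return (solve_g(lam + h) - solve_g(lam - h)) / (2.0 * h)


if __name__ == "__main__":
    lam = 0.15  # inside |lam| < delta^1 = 0.2
    print("formula:          ", dg_formula(lam))
    print("finite difference:", dg_finite_difference(lam))
```

The two printed values should agree to within the finite-difference error, which is consistent with the formula but of course does not replace the proof.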

References

  • [1] Luigi Ambrosia, Nicola Gigli, and Giuseppe Savaré, Gradient Flows in Metric Spaces and in the Space of Probability Measures, Birkhäuser, Basil, 2008.
  • [2] Michèle Audin and Mihai Damian, Morse theory and Floer homology, Universitext, Springer, London; EDP Sciences, Les Ulis, 2014.
  • [3] Qinxun Bai, Steven Rosenberg, Zheng Wu, and Stan Sclaroff, A differential geometric approach to classification, Proceedings of The 33rd International Conference on Machine Learning 48 (2016).
  • [4] Qinxun Bai, Steven Rosenberg, and Wei Xu, A geometric understanding of natural gradient, https://arxiv.longhoe.net/abs/2202.06232.
  • [5] Mihail Belkin and Partha Niyogi, Laplacian eigenmaps for dimensionality reduction and data representation, Neural Computation 15 (2003), 1373–1396.
  • [6] Mikhail Belkin, Partha Niyogi, and Vikas Sindhwani, Manifold regularization: A geometric framework for learning from labeled and unlabeled examples, Journal of Machine Learning Research 7 (2006), 2399–2434.
  • [7] Ronny Bergmann et al., Discrete total variation of the normal vector field as shape prior with applications in geometric inverse problems, Inverse Problems 36 (2020).
  • [8] Jeff Cheeger, Finiteness theorems for Riemannian manifolds, Amer. J. Math. 92 (1970), 61–74.
  • [9] Jeff Cheeger and Mikhael Gromov, Collapsing Riemannian manifolds while kee** their curvature bounded. I, J. Differential Geom. 23 (1986), no. 3, 309–346.
  • [10] Luigi Chierchia, Kolomogorov-Arnold-Moser (KAM) theory, Mathematics of Complexity and Dynamical Systems. Vols. 1–3, Springer, New York (2012), 810–836.
  • [11] Yaim Cooper, Discrete gradient descent differs qualitatively from gradient flow, arXiv:1808.04839 (2018).
  • [12] Antonio Criminisi, Jamie Shotton, and Ender Konukoglu, Decision forests: A unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning, Foundations and Trends in Computer Graphics and Vision 7 (2012), 81–227.
  • [13] David Donoho and Carrie Grimes, Hessian eigenmaps: Locally linear embedding techniques for high-dimensional data, Proceedings of the National Academy of Sciences 100 (2003), no. 10, 5591–5596.
  • [14] Mark Droske and Martin Rumpf, A variational approach to nonrigid morphological image registration, SIAM Journal on Applied Mathematics 2 (2004), 668–687.
  • [15] James Eells, Jr., A setting for global analysis, Bull. Amer. Math. Soc. 72 (1966), 751–807.
  • [16] Charles Fefferman, Sergei Ivanov, Yaroslav Kurylev, Matti Lassas, **peng Lu, and Hariharan Narayanan, Reconstruction and interpolation of manifolds. II: Inverse problems for Riemannian manifolds with partial distance data, https://arxiv.longhoe.net/abs/2111.14528.
  • [17] Charles Fefferman, Sergei Ivanov, Yaroslav Kurylev, Matti Lassas, and Hariharan Narayanan, Reconstruction and interpolation of manifolds. I: The geometric Whitney problem, Found. Comput. Math. 20 (2020), no. 5, 1035–1133.
  • [18] Charles Fefferman, Sergei Ivanov, Matti Lassas, and Hariharan Narayanan, Fitting a manifold of large reach to noisy data, https://arxiv.longhoe.net/abs/1910.05084.
  • [19] by same author, Reconstruction of a Riemannian manifold from noisy intrinsic distances, SIAM J. Math. Data Sci. 2 (2020), no. 3, 770–808.
  • [20] Charles Fefferman, Sanjoy Mitter, and Hariharan Narayanan, Testing the manifold hypothesis, J. Amer. Math. Soc. 29 (2016), no. 4, 983–1049.
  • [21] Claus Gerhardt, Evolutionary surfaces of prescribed mean curvature, Journal of Differential Equations 36 (1980), 139–172.
  • [22] Daniele Granata and Vincenzo Carnevale, Accurate estimation of the intrinsic dimension using graph distances: Unraveling the geometric complexity of datasets, Sci. Rep. 6 (2016), https://www.nature.com/articles/srep31377.
  • [23] Guodong Guo, Yun Fu, Charles R. Dyer, and Thomas S. Huang, Image-based human age estimation by manifold learning and locally adjusted robust regression, IEEE Transactions on Image Processing 17 (2008), 1178–1188.
  • [24] Richard S. Hamilton, Harnack estimate for the mean curvature flow, Journal of Differential Geometry 41 (1995), 215–226.
  • [25] Gerhard Huisken and Carlo Sinestrari, Mean curvature flow singularities for mean convex surfaces, Calculus of Variations and Partial Differential Equations 8 (1999), 1–14.
  • [26] John M. Lee, Introduction to Smooth Manifolds, Graduate Texts in Mathematics, vol. 218, Springer, New York, 2013.
  • [27] Tong Lin, Hanlin Xue, Ling Wang, Bo Huang, and Hongbin Zha, Supervised learning via Euler’s elastica models, Journal of Machine Learning Research 16 (2015), 3637–3686.
  • [28] Calangelo Liverani, Implicit function theorem (a quantitative version), https://www.mat.uniroma2.it/~liverani/Calcolo1-2016/implicit.pdf.
  • [29] Yunqian Ma and Yun Fu (eds.), Manifold Learning and Applications, CRC Press, Boca Raton, 2011.
  • [30] Uwe F. Mayer, Gradient flows on nonpositively curved metric spaces and harmonic maps, Communications in Analysis and Geometry 6 (1998), no. 2, 199–253.
  • [31] John Milnor, Morse Theory, Princeton University Press, Princeton, NJ, 1969.
  • [32] Marston Morse, The foundations of the calculus of variations in m-space. Part I, Trans. Amer. Math. Soc. 31 (1929), 379–404.
  • [33] David Mumford and Jayant Shah, Optimal approximations by piecewise smooth functions and associated variational problems, Communications in Pure and Applied mathematics 42 (1989), no. 5, 577–685.
  • [34] Hideki Omori, Infinite-dimensional Lie groups, Translations of Mathematical Monographs, vol. 158, American Mathematical Society, Providence, RI, 1997.
  • [35] Stanley Osher and James. A Sethian, Fronts propogating with curvature dependant speed: Algorithms based on Hamilton-Jacobi formulations, Journal of Computational Physics 79 (1988), 12–49.
  • [36] Sam Roweis and Lawrence Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290 (2000), no. 5500, 2323–2326.
  • [37] Melanie Rupflin and Peter M. Top**, Flowing maps to minimal surfaces, American Journal of Mathematics 138 (2016), no. 4, 1095–1115.
  • [38] James A. Sethian, Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science, vol. 3, Cambridge University Press, 1999.
  • [39] Alexander Smola, Sebastian Mika, Bernhard Schölkopf, and Robert Williamson, Regularized principal manifolds, JMLR 1 (2001), 179–209.
  • [40] Joshua Tenenbaum, Vin de Silva, and John Langford, A global geometric framework for nonlinear dimensionality reduction, Science 290 (2000), 2319–2323.
  • [41] Kush Varshney and Alan Willsky, Classification using geometric level sets, Journal of Machine Learning Research 11 (2010), 491–516.
  • [42] Xiaohui Wang and J. S. Marron, A scale-based approach to finding effective dimensionality in manifold learning, Electron. J. Stat. 2 (2008), 127–148.
  • [43] Ling Xiao, Gradient estimates and lower bound for the blow-up time of star shaped mean curvature flow, https://arxiv.longhoe.net/pdf/1311.3721.pdf.
  • [44] Ye Yuan and Chuanjiang He, Variational level set methods for image segmentation based on both L2 and Sobolev gradients, Nonlinear Analysis: Real World Applications 13 (2012), 959–966.