Jensen's inequality in geodesic spaces with lower bounded curvature
Quentin Paris*

* HSE University, Faculty of Computer Science, Moscow, Russia. This work has been funded by the Russian Academic Excellence Project '5-100'.

Abstract

Let $(M, d)$ be a separable and complete geodesic space with curvature lower bounded by $\kappa \in \mathbb{R}$ in the sense of Alexandrov. Let $\mu$ be a Borel probability measure on $M$ such that $\mu \in \mathcal{P}_2(M)$ and that has at least one barycenter $x^* \in M$. We show that for any geodesically $\alpha$-convex function $f : M \to \mathbb{R}$, with $\alpha \in \mathbb{R}$, the inequality
\[ f(x^*) \le \int_M \Big( f - \frac{\alpha}{2}\, d^2(x^*, \cdot) \Big) \, \mathrm{d}\mu \]
holds provided $f$ is locally Lipschitz at $x^*$ and either positive or in $L^1(\mu)$. Our proof relies on the properties of tangent cones at barycenters and on the existence of gradients for semiconcave functions in spaces with lower bounded curvature.

In this paper, we prove Jensen's inequality in Polish (i.e., separable and complete) geodesic metric spaces with curvature lower bounded by some $\kappa \in \mathbb{R}$ in the sense of Alexandrov. Our result is formulated in terms of barycenters of probability measures and relies on the notion of geodesic convexity. We first briefly recall these concepts and refer the reader to Section 2 for precise geometric definitions.

Given a metric space $(M, d)$, we denote by $\mathcal{P}_2(M)$ the set of Borel probability measures $\mu$ on $M$ such that, for all $x \in M$,
\[ V_\mu(x) := \frac{1}{2} \int_M d^2(x, y) \, \mathrm{d}\mu(y) < +\infty. \]
Given $\mu \in \mathcal{P}_2(M)$, we call $V_\mu : M \to \mathbb{R}_+$ its variance functional, denote
\[ V^*_\mu := \inf_{x \in M} V_\mu(x), \]
and call barycenter of $\mu$ any $x^* \in M$ such that $V_\mu(x^*) = V^*_\mu$. Barycenters provide a natural generalization of the notion of mean value of a probability measure when $M$ has no linear structure. While alternative notions of mean value in a metric space have been proposed, barycenters are often favored for their simple interpretation and constructive definition as the solution of an optimization problem. In full generality, a barycenter may not exist and, if it does, need not be unique.
The questions of existence and uniqueness of barycenters have been addressed in a number of settings, and we refer the reader to Sturm (2003), Agueh and Carlier (2011), Afsari (2011), Ohta (2012), Yokota (2016), Kim and Pass (2017), Le Gouic and Loubes (2017), Ahidar-Coutrix et al. (2019), Le Gouic et al. (2019) and Huckemann and Eltzner (2020) for discussions of this topic.

Given a geodesic space $(M, d)$ and $\alpha \in \mathbb{R}$, a function $f : M \to \mathbb{R}$ is called geodesically $\alpha$-convex if, for every geodesic $\gamma : [0, 1] \to M$, the function
\[ t \in [0, 1] \mapsto f(\gamma(t)) - \frac{\alpha}{2}\, d^2(\gamma(0), \gamma(t)) \]
is convex. This notion reduces to the classical definition of convexity (resp. strong convexity) in the Euclidean setting when $\alpha = 0$ (resp. $\alpha > 0$).

Theorem 1.1.
Let $(M, d)$ be a Polish geodesic space with curvature lower bounded by some $\kappa \in \mathbb{R}$ in the sense of Alexandrov (Definition 2.2). Let $\mu \in \mathcal{P}_2(M)$ and suppose $\mu$ admits a barycenter $x^* \in M$. Let $\alpha \in \mathbb{R}$ and let $f : M \to \mathbb{R}$ be geodesically $\alpha$-convex and locally Lipschitz at $x^*$. Then, if $f$ is either positive or in $L^1(\mu)$, we have
\[ f(x^*) \le \int_M f \, \mathrm{d}\mu - \alpha V^*_\mu. \]

The generalization of Jensen's inequality to non-standard settings has been studied in a number of contributions. In the context of Riemannian [...]

Note for instance that if $(M, d) = (\mathbb{R}^p, \|\cdot - \cdot\|_2)$ and if $\mu \in \mathcal{P}_2(M)$, then $x^* := \int x \, \mathrm{d}\mu(x)$ is the unique minimizer of
\[ x \in M \mapsto \int \|x - y\|_2^2 \, \mathrm{d}\mu(y). \]

[...] $M = W_2(\Omega)$ over a (Polish) space $\Omega$, known to be a geodesic space of curvature lower bounded by 0 if, and only if, $\Omega$ satisfies the same property (Sturm, 2006, Proposition 2.10). In the case where $\Omega = \mathbb{R}^d$ and $\mu$ is a probability measure with finite support, Agueh and Carlier (2011) prove Jensen's inequality for a number of classical functionals. This result is later generalized by Kim and Pass (2017) to the case where $\Omega$ is a compact Riemannian manifold and $\mu$ is a sufficiently well-behaved Borel probability measure over $W_2(\Omega)$. On the one hand, the result of Kim and Pass (2017) is more general than Theorem 1.1 in the context of the Wasserstein space, since it does not impose restrictions on the sectional curvature of the base space $\Omega$. On the other hand, the generality of our result seems to imply the validity of Jensen's inequality in $W_2(\Omega)$ as soon as $\Omega$ is a Polish geodesic space with non-negative curvature, which need not be a smooth manifold, and without the additional regularity assumptions on $\mu$ imposed in Kim and Pass (2017).

From a technical point of view, our proof is straightforward but relies on important results from metric geometry. The first main result we invoke (Lemma 2.7) concerns the properties of tangent cones at barycenters of probability measures in spaces with lower bounded curvature, as studied in Ohta (2012), Yokota (2012), Le Gouic et al. (2019) and Le Gouic (2020). The second fundamental result we use (Lemma 2.9) is the existence (and uniqueness) of an appropriate notion of gradient for semi-concave functions on spaces with lower bounded curvature (see, e.g., Petrunin, 2007).

The paper is organized as follows. Section 2 introduces the background in metric geometry necessary for our result. Section 3 presents the proof of Theorem 1.1.
Finally, Appendix A gives proofs of some key lemmas presented in the preliminary section for completeness.

In this preliminary section we summarize some definitions and results from metric geometry necessary for our main result. Most of the material gathered below can be found in classical texts such as Burago et al. (1992, 2001), Plaut (2002) or in the book in preparation by Alexander et al. (2019). We provide proofs for some of the key statements in Appendix A to clarify the presentation.
Let $(M, d)$ be a metric space. We call path in $M$ a continuous map $\gamma : I \to M$ defined on an interval $I \subset \mathbb{R}$. The length $L(\gamma) \in [0, +\infty]$ of a path $\gamma : I \to M$ is defined by
\[ L(\gamma) := \sup \sum_{i=0}^{n-1} d(\gamma(t_i), \gamma(t_{i+1})), \tag{2.1} \]
where the supremum is taken over all $n \ge 1$ and all $t_0 \le \dots \le t_n$ in $I$. A path is called rectifiable if it has finite length. Given a path $\gamma : I \to M$ and an interval $J \subset I$, we denote by $\gamma_J : J \to M$ the restriction of $\gamma$ to $J$. Two paths $\gamma_i : I_i \to M$, $i = 1, 2$, are said to be equivalent if there exist continuous, non-decreasing and surjective functions $\varphi_i : J \to I_i$ such that $\gamma_1 \circ \varphi_1 = \gamma_2 \circ \varphi_2$. In this case, $\gamma_1$ is said to be a reparametrization of $\gamma_2$ and one checks that $L(\gamma_1) = L(\gamma_2)$. A path $\gamma : [a, b] \to M$ is said to have constant speed if, for all $a \le s \le t \le b$,
\[ L(\gamma_{[s,t]}) = \frac{t - s}{b - a}\, L(\gamma). \]
If $-\infty < a < b < +\infty$ and $\gamma : [a, b] \to M$ is rectifiable, then for any $\tau > 0$, $\gamma$ admits a constant speed reparametrization $[0, \tau] \to M$. A path $\gamma : [0, \tau] \to M$ is said to issue from $x$ if $\gamma(0) = x$, and is said to connect $x$ to $y$ if in addition $\gamma(\tau) = y$. It follows from the definition of length and the triangle inequality that $d(x, y) \le L(\gamma)$ for any path $\gamma$ connecting $x$ to $y$. In particular,
\[ d(x, y) \le \bar d(x, y) := \inf_\gamma L(\gamma), \tag{2.2} \]
where the infimum is taken over all such paths $\gamma$. The function $\bar d$ defines a $[0, +\infty]$-valued metric on $M$ called the length metric.

Definition 2.1. The space $M$ is called a length space if $d = \bar d$. A length space $(M, d)$ is said to be a geodesic space if, in addition, the infimum in the definition of $\bar d$ is always attained.

In a geodesic space, a path connecting $x$ to $y$ whose length is equal to $d(x, y)$ is called a shortest path connecting $x$ to $y$. A shortest path $\gamma : [0, \tau] \to M$ with constant speed is called a geodesic. A path $\gamma : [0, \tau] \to M$ is a geodesic iff, for all $0 \le s \le t \le \tau$,
\[ d(\gamma(s), \gamma(t)) = \frac{t - s}{\tau}\, d(\gamma(0), \gamma(\tau)). \]
For a geodesic $\gamma : [0, \tau] \to M$, the metric speed
\[ |\gamma'| := \frac{d(\gamma(0), \gamma(t))}{t} \]
is constant by construction, for $t \in (0, \tau]$.

For $\kappa \in \mathbb{R}$, a remarkable geodesic space is the $\kappa$-plane $(M_\kappa, d_\kappa)$, defined as the unique (up to isometry) 2-dimensional complete and simply connected Riemannian manifold with constant sectional curvature $\kappa$, equipped with its Riemannian distance $d_\kappa$. The diameter $D_\kappa$ of $M_\kappa$ is
\[ D_\kappa := \begin{cases} +\infty & \text{if } \kappa \le 0, \\ \pi / \sqrt{\kappa} & \text{if } \kappa > 0. \end{cases} \]
For $\kappa \in \mathbb{R}$, there is a unique geodesic $[0, 1] \to M_\kappa$ connecting $x$ to $y$ in $M_\kappa$ provided $d_\kappa(x, y) < D_\kappa$.

Given a metric space $(M, d)$, we call triangle in $M$ any set of three points $\{p, x, y\} \subset M$. We call it non-degenerate if all three points are distinct. For $\kappa \in \mathbb{R}$, a comparison triangle for $\{p, x, y\} \subset M$ in $M_\kappa$ is an isometric copy $\{\bar p, \bar x, \bar y\} \subset M_\kappa$ of $\{p, x, y\}$ in $M_\kappa$ (i.e., pairwise distances are preserved). Such a comparison triangle always exists and is unique (up to an isometry) provided the perimeter satisfies
\[ \mathrm{peri}\{p, x, y\} := d(p, x) + d(p, y) + d(x, y) < 2 D_\kappa. \]
If $\{p, x, y\}$ is non-degenerate and $\mathrm{peri}\{p, x, y\} < 2 D_\kappa$, the triangle inequality implies that $d(p, x), d(p, y), d(x, y) < D_\kappa$.
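The characterization of geodesics above can be checked numerically in the $\kappa$-plane for $\kappa = 1$, i.e., the unit sphere in $\mathbb{R}^3$ with its great-circle distance, for which $D_1 = \pi$. The following is a minimal sketch, not part of the original text: the helper names `sphere_dist` and `sphere_geodesic` are hypothetical, and the spherical interpolation formula is the standard one, assumed here rather than taken from the paper. It measures how far $d(\gamma(s), \gamma(t))$ is from $(t - s)\, d(\gamma(0), \gamma(1))$ on a grid, with $\tau = 1$.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sphere_dist(x, y):
    """Great-circle distance on the unit sphere (clamped against rounding)."""
    return math.acos(max(-1.0, min(1.0, dot(x, y))))


def sphere_geodesic(x, y):
    """Constant-speed geodesic [0, 1] -> S^2 from x to y (assumed formula)."""
    theta = sphere_dist(x, y)
    if not 0.0 < theta < math.pi:
        # Uniqueness requires distinct points at distance < D_1 = pi.
        raise ValueError("need distinct, non-antipodal points")

    def gamma(t):
        a = math.sin((1.0 - t) * theta) / math.sin(theta)
        b = math.sin(t * theta) / math.sin(theta)
        return tuple(a * xi + b * yi for xi, yi in zip(x, y))

    return gamma


x = (1.0, 0.0, 0.0)
y = (0.0, 1.0, 0.0)
gamma = sphere_geodesic(x, y)
total = sphere_dist(x, y)

# Largest deviation from d(gamma(s), gamma(t)) = (t - s) * d(x, y) on a grid.
grid = [i / 10 for i in range(11)]
max_defect = max(
    abs(sphere_dist(gamma(s), gamma(t)) - (t - s) * total)
    for s in grid
    for t in grid
    if s <= t
)
print(total, max_defect)
```

The defect is only zero up to floating-point error, since `acos` is ill-conditioned near 1; the check should therefore be read with a loose tolerance.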
Given $\kappa \in \mathbb{R}$ and $p, x, y \in M$ with $p \notin \{x, y\}$ and $\mathrm{peri}\{p, x, y\} < 2 D_\kappa$, we define the comparison angle $\measuredangle^\kappa_p(x, y) \in [0, \pi]$ at $p$ by
\[ \cos \measuredangle^\kappa_p(x, y) := \begin{cases} \dfrac{d^2(p, x) + d^2(p, y) - d^2(x, y)}{2\, d(p, x)\, d(p, y)} & \text{if } \kappa = 0, \\[2ex] \dfrac{c_\kappa(d(x, y)) - c_\kappa(d(p, x)) \cdot c_\kappa(d(p, y))}{\kappa \cdot s_\kappa(d(p, x))\, s_\kappa(d(p, y))} & \text{if } \kappa \ne 0, \end{cases} \]
where, for $r \ge 0$,
\[ s_\kappa(r) := \begin{cases} \sin(r \sqrt{\kappa}) / \sqrt{\kappa} & \text{if } \kappa > 0, \\ \sinh(r \sqrt{-\kappa}) / \sqrt{-\kappa} & \text{if } \kappa < 0, \end{cases} \tag{2.3} \]
and $c_\kappa(r) = s'_\kappa(r)$. When $\mathrm{peri}\{p, x, y\} \ge 2 D_\kappa$, we declare the angle $\measuredangle^\kappa_p(x, y)$ undefined. Note that the comparison angle $\measuredangle^\kappa_p(x, y)$ corresponds to the Riemannian angle at $\bar p$ between the two unique geodesics connecting $\bar p$ to $\bar x$ and $\bar y$ respectively in $M_\kappa$, where $\{\bar p, \bar x, \bar y\} \subset M_\kappa$ denotes a comparison triangle for $\{p, x, y\}$.

Definition 2.2. Given $\kappa \in \mathbb{R}$, a metric space $(M, d)$ is said to have curvature lower bounded by $\kappa$, which we denote by $\mathrm{curv}(M) \ge \kappa$, if for all $p, x, y, z \in M$ such that $p \notin \{x, y, z\}$, we have
\[ \measuredangle^\kappa_p(x, y) + \measuredangle^\kappa_p(x, z) + \measuredangle^\kappa_p(y, z) \le 2\pi, \tag{2.4} \]
when all three angles are defined.

Definition 2.2 is of global nature, as it requires comparison (2.4) to hold for all quadruples $p, x, y, z \in M$ for which angles at $p$ are defined. A globalization result due to Burago et al. (1992) states that, when $M$ is a complete length space, it has curvature lower bounded by $\kappa$ in the sense of Definition 2.2 iff, for all $p \in M$, comparison (2.4) holds for all $\{x, y, z\}$ in a neighborhood of $p$. In the case of geodesic spaces, we can give the following equivalent characterization of lower bounded curvature.

Theorem 2.3.
Let $(M, d)$ be a geodesic space and $\kappa \in \mathbb{R}$. Then the following statements are equivalent.

(1) $\mathrm{curv}(M) \ge \kappa$ in the sense of Definition 2.2.

(2) For all $p, x, y \in M$ with $p \notin \{x, y\}$ and $\mathrm{peri}\{p, x, y\} < 2 D_\kappa$, and for any geodesics $\gamma_x, \gamma_y : [0, 1] \to M$ connecting $p$ to $x$ and $p$ to $y$ respectively, we have
\[ \forall s, t \in [0, 1], \quad d(\gamma_x(s), \gamma_y(t)) \ge d_\kappa(\bar\gamma_x(s), \bar\gamma_y(t)), \]
where, given a comparison triangle $\{\bar p, \bar x, \bar y\}$ of $\{p, x, y\}$ in $M_\kappa$, $\bar\gamma_x, \bar\gamma_y : [0, 1] \to M_\kappa$ are geodesics (which are unique if the triangle is non-degenerate) connecting $\bar p$ to $\bar x$ and $\bar p$ to $\bar y$ respectively.

We end the paragraph with a few standard examples of Polish geodesic spaces with lower bounded curvature in the sense of Definition 2.2.

• A complete and connected Riemannian manifold, with its Riemannian distance, is a Polish geodesic space with curvature lower bounded by $\kappa \in \mathbb{R}$ iff its sectional curvatures are all lower bounded by $\kappa$.

• The boundary $\partial K$ of a convex and compact subset $K \subset \mathbb{R}^d$ (with non-empty interior), equipped with its length metric (inherited from the induced Euclidean distance), is a Polish geodesic space with curvature lower bounded by 0.

• Given a Polish geodesic space $(\Omega, d)$ with curvature lower bounded by 0, the 2-Wasserstein space $W_2(\Omega) := (\mathcal{P}_2(\Omega), W_2)$ over $\Omega$ is a Polish geodesic space with curvature lower bounded by 0.

Let $(M, d)$ be a geodesic space with lower bounded curvature in the sense of Definition 2.2.

Given $p \in M$, we denote by $\Gamma_p$ the set of all non-trivial geodesics $\gamma : [0, \tau] \to M$ issuing from $p$. For $\gamma, \sigma \in \Gamma_p$, the angle between $\gamma$ and $\sigma$ is defined by
\[ \measuredangle_p(\gamma, \sigma) := \lim_{s, t \to 0} \measuredangle^0_p(\gamma(s), \sigma(t)). \]
The angle $\measuredangle_p : \Gamma_p^2 \to [0, \pi]$ is well defined for geodesic spaces with lower bounded curvature, as a consequence of Theorem 2.3, point (2), and satisfies, for all $\gamma, \omega, \sigma \in \Gamma_p$,
\[ \measuredangle_p(\gamma, \sigma) \le \measuredangle_p(\gamma, \omega) + \measuredangle_p(\omega, \sigma). \]
The angle $\measuredangle_p$ is therefore a pseudo-metric on $\Gamma_p$ and induces a metric on the quotient space $\Sigma'_p := \Gamma_p / \sim$, where $\gamma \sim \sigma$ iff $\measuredangle_p(\gamma, \sigma) = 0$. We denote by $\vec\gamma \in \Sigma'_p$ the equivalence class of $\gamma \in \Gamma_p$ for $\sim$. The completion $\Sigma_p$ of $\Sigma'_p$ is called the space of directions at $p$. Below we use the same symbol $\measuredangle_p$ to denote the pseudo-metric on $\Gamma_p$, the metric on $\Sigma'_p$ or the metric on $\Sigma_p$.

Given a metric space $(\Omega, d)$ with diameter at most $\pi$, consider the equivalence relation $\approx$ on $\Omega \times \mathbb{R}_+$ defined by $(p, s) \approx (q, t)$ iff ($s = t = 0$ or $(p, s) = (q, t)$). In other words, if $[p, s]$ denotes the class of $(p, s)$ for this relation, then $[p, s] = \{(p, s)\}$ if $s > 0$ and $[p, 0] = \Omega \times \{0\}$. The Euclidean cone over $\Omega$, denoted $\mathrm{cone}(\Omega)$, is the quotient set $(\Omega \times \mathbb{R}_+) / \approx$ equipped with the metric $d_c$ defined by
\[ d_c^2([p, s], [q, t]) := s^2 - 2 s t \cos d(p, q) + t^2. \]
We call $[p, 0]$ the tip of the cone.

The tangent cone $T_p M$ of $M$ at $p$ is defined as the Euclidean cone over the space of directions of $M$ at $p$, i.e.,
\[ T_p M := \mathrm{cone}(\Sigma_p). \]
We denote by $\|\cdot - \cdot\|_p$ the metric on $T_p M$ and by $0_p$ the tip of $T_p M$. For $u = [\xi, s] \in T_p M$ and $\lambda \in \mathbb{R}_+$, we define $\lambda u := [\xi, \lambda s]$. For $u = [\xi, s], v = [\zeta, t] \in T_p M$, we set $\|u\|_p := \|u - 0_p\|_p$ and $\langle u, v \rangle_p := s t \cos \measuredangle_p(\xi, \zeta)$, so that
\[ \|u - v\|_p^2 = \|u\|_p^2 - 2 \langle u, v \rangle_p + \|v\|_p^2. \]

A useful alternative representation of $T_p M$ is obtained as follows. For any two geodesics $\gamma, \sigma \in \Gamma_p$, denote
\[ |\gamma - \sigma|_p := \lim_{t \to 0} \frac{d(\gamma(t), \sigma(t))}{t}. \]
Angles between elements of $\Gamma_p$ being well defined, the limit always exists and $|\cdot - \cdot|_p$ defines a pseudo-metric on $\Gamma_p$. Denoting by $\propto$ the equivalence relation on $\Gamma_p$ defined by $\gamma \propto \sigma$ iff $|\gamma - \sigma|_p = 0$, we define $T'_p M$ as the quotient set $\Gamma_p / \propto$ equipped with the induced metric $|\cdot - \cdot|_p$. For $\gamma \in \Gamma_p$, we denote by $\dot\gamma \in T'_p M$ its class for the relation $\propto$.

Lemma 2.4.
The map
\[ \dot\gamma \in T'_p M \mapsto [\vec\gamma, |\gamma'|] \in \mathrm{cone}(\Sigma'_p) \tag{2.5} \]
is a well-defined isometry, and the completion of $T'_p M$ is isometric to $T_p M$.

We report the proof of Lemma 2.4 in the appendix. From now on, we therefore consider $T'_p M$ as a dense subset of $T_p M$ and identify $\dot\gamma \in T'_p M$ with the element $[\vec\gamma, |\gamma'|] \in T_p M$.

Suppose that $(M, d)$ is a geodesic space with lower bounded curvature. For $p \in M$, we call logarithmic map at $p$ any map $\log_p : M \to T_p M$ such that, for all $x \in M$,
\[ \log_p(x) = \dot\gamma_x, \]
for some geodesic $\gamma_x : [0, 1] \to M$ connecting $p$ to $x$. The next result shows that the choice of a sufficiently well-behaved log map is possible provided $M$ is a Polish geodesic space. It was first cited as a remark in Le Gouic et al. (2019) and proved in Le Gouic (2020).

Lemma 2.5. Let $(M, d)$ be a Polish geodesic space with lower bounded curvature and equipped with its Borel $\sigma$-algebra. Then, for all $p \in M$, there exists a logarithmic map $\log_p : M \to T_p M$ which is measurable when $T_p M$ is equipped with the $\sigma$-algebra generated by open balls.

We report the proof of Lemma 2.5 in the appendix.
Remark 2.6.
The proof of Lemma 2.5 is essentially based on the observation that, under these assumptions, we can first select a collection $(\gamma_x)_{x \in M}$ of geodesics $\gamma_x : [0, 1] \to M$ connecting $p$ to $x$ such that the map $x \in M \mapsto \gamma_x \in G_p$ is Borel measurable when we equip $G_p \subset \Gamma_p$, the set of all geodesics issuing from $p$ and defined on $[0, 1]$, with the uniform metric. Then, we show that the map $\gamma \in G_p \mapsto \dot\gamma \in T_p M$ is measurable when $T_p M$ is equipped with the $\sigma$-algebra generated by open balls, and define $\log_p$ as the composition of these two maps.

It follows from Lemma 2.5 that we can choose a Borel-measurable logarithmic map at any point $p$ whenever the tangent cone $T_p M$ is separable, since in this case the Borel $\sigma$-algebra on $T_p M$ coincides with the $\sigma$-algebra generated by open balls. This occurs for instance in the case where $M$ is a proper metric space, as noted in Ohta (2012). It is also known to be the case for specific examples of non-proper spaces. For instance, if $\Omega$ denotes a Polish geodesic space with curvature lower bounded by 0, the 2-Wasserstein space $W_2(\Omega)$ has a separable tangent cone at any point. This fact follows from Ambrosio et al. (2008, Definition 12.4.3), which characterises $T_p W_2(\Omega)$ as a closed subset of a separable metric space.

However, it should be noted that the measurability of $\log_p$ with respect to the $\sigma$-algebra generated by open balls on $T_p M$ is enough for the results we present next. Indeed, the statements presented below require only the Borel-measurability of maps of the form
\[ x \mapsto \langle \log_p(x), u \rangle_p, \]
for some fixed $p \in M$ and $u \in T_p M$, which follows from this weaker measurability of $\log_p$.

Note finally that the choice of a measurable log is in principle not unique. However, all results we mention can be shown to hold independently of this choice.

Next is the first key result for the proof of Theorem 1.1.

Lemma 2.7.
Let $(M, d)$ be a Polish geodesic space with lower bounded curvature in the sense of Definition 2.2. Let $\mu \in \mathcal{P}_2(M)$ and suppose it admits a barycenter $x^* \in M$. Then, for all $u \in T_{x^*} M$,
\[ \int_M \langle \log_{x^*}(x), u \rangle_{x^*} \, \mathrm{d}\mu(x) = 0. \]

Lemma 2.7 follows by combining Le Gouic et al. (2019, Theorem 7) and Le Gouic (2020, Corollary 2).
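In the Euclidean case $M = \mathbb{R}^p$, one has $\log_{x^*}(x) = x - x^*$ and $x^*$ is the mean of $\mu$, so Lemma 2.7 reduces to $\int \langle x - x^*, u \rangle \, \mathrm{d}\mu(x) = 0$. The following minimal sketch, an illustration under these Euclidean assumptions rather than part of the original argument, checks this identity for an empirical measure. It also evaluates the gap $\int f \, \mathrm{d}\mu - \alpha V^*_\mu - f(x^*)$ of Theorem 1.1 for $f(x) = \|x - a\|^2$, assuming the convention $V_\mu(x) = \frac{1}{2} \int d^2(x, y) \, \mathrm{d}\mu(y)$, under which this $f$ is geodesically 2-convex. All names (`barycenter`, `first_order_term`, `jensen_gap`, `sample`, `anchor`) are hypothetical.

```python
import random


def barycenter(points):
    """Euclidean barycenter (arithmetic mean) of a finite sample."""
    n, dim = len(points), len(points[0])
    return [sum(p[i] for p in points) / n for i in range(dim)]


def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def sq_dist(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))


def first_order_term(points, u):
    """Empirical integral of <log_{x*}(x), u> with log_{x*}(x) = x - x*."""
    x_star = barycenter(points)
    logs = ([a - b for a, b in zip(p, x_star)] for p in points)
    return sum(inner(v, u) for v in logs) / len(points)


def jensen_gap(f, alpha, points):
    """(mean of f) - alpha * V* - f(x*); Theorem 1.1 asserts this is >= 0."""
    x_star = barycenter(points)
    mean_f = sum(f(p) for p in points) / len(points)
    v_star = 0.5 * sum(sq_dist(x_star, p) for p in points) / len(points)
    return mean_f - alpha * v_star - f(x_star)


rng = random.Random(0)
sample = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(500)]
anchor = [1.0, -2.0]


def f(x):
    # Squared distance to a fixed point: 2-convex in the assumed convention.
    return sq_dist(x, anchor)


lemma_term = first_order_term(sample, [0.3, -1.2])  # vanishes up to rounding
gap_alpha_2 = jensen_gap(f, 2.0, sample)            # equality case, up to rounding
gap_alpha_0 = jensen_gap(f, 0.0, sample)            # classical Jensen gap, 2 * V*
print(lemma_term, gap_alpha_2, gap_alpha_0)
```

For this particular $f$ the bound with $\alpha = 2$ is attained, since $\int \|x - a\|^2 \, \mathrm{d}\mu = \|x^* - a\|^2 + \int \|x - x^*\|^2 \, \mathrm{d}\mu$; with $\alpha = 0$ the gap is the full $2 V^*_\mu$.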
Suppose $(M, d)$ is a geodesic space with lower bounded curvature in the sense of Definition 2.2.

A function $f : M \to \mathbb{R}$ is called (geodesically) $\alpha$-concave, for some $\alpha \in \mathbb{R}$, if for any geodesic $\gamma : [0, 1] \to M$, the map
\[ t \in [0, 1] \mapsto f(\gamma(t)) - \frac{\alpha}{2}\, d^2(\gamma(0), \gamma(t)) \]
is concave. Hence, $f$ is $\alpha$-concave iff $(-f)$ is $(-\alpha)$-convex. An $\alpha$-concave function is sometimes called a semi-concave function.

A function $f : M \to \mathbb{R}$ is called locally Lipschitz at $p$ if there exists a constant $\lambda > 0$ such that
\[ |f(x) - f(y)| \le \lambda\, d(x, y) \]
holds for all $x, y$ in some neighborhood of $p$. We denote by $\mathrm{Lip}_p(f)$ the smallest such constant.

Let $f : M \to \mathbb{R}$ be an $\alpha$-concave function, locally Lipschitz at $p \in M$. Let $d_p f : T'_p M \to \mathbb{R}$ be defined by
\[ d_p f(\dot\gamma) := \lim_{t \to 0} \frac{f(\gamma(t)) - f(p)}{t}. \tag{2.6} \]

Lemma 2.8.
Let $\alpha \in \mathbb{R}$ and let $f : M \to \mathbb{R}$ be an $\alpha$-concave function, locally Lipschitz at $p \in M$. Then the limit in (2.6) is well defined. For any geodesic $\gamma : [0, \tau] \to M$ issuing from $p$, this limit can be written
\[ \lim_{t \to 0} \frac{f(\gamma(t)) - f(p)}{t} = \sup_{t \in (0, \tau]} \left\{ \frac{f(\gamma(t)) - f(p)}{t} - \frac{\alpha t}{2} |\gamma'|^2 \right\}, \tag{2.7} \]
and does not depend on the representative $\gamma$ of $\dot\gamma$. Furthermore, the map $d_p f : T'_p M \to \mathbb{R}$ admits a unique $\mathrm{Lip}_p(f)$-Lipschitz extension to $T_p M$, which we also denote by $d_p f$ and call the differential of $f$ at $p$. Finally, $d_p f : T_p M \to \mathbb{R}$ is positively homogeneous, i.e., satisfies, for all $\lambda \ge 0$ and all $v \in T_p M$,
\[ d_p f(\lambda v) = \lambda\, d_p f(v). \]
We prove Lemma 2.8 in the appendix.

Given an $\alpha$-concave function $f : M \to \mathbb{R}$, we call gradient of $f$ at $p$ any element $g \in T_p M$ such that, for all $v \in T_p M$,
\[ d_p f(v) \le \langle g, v \rangle_p \quad \text{and} \quad d_p f(g) = \|g\|_p^2. \]
The existence of gradients for $\alpha$-concave functions in spaces with lower bounded curvature is the second essential result for the proof of Theorem 1.1.

Lemma 2.9 (Alexander et al., 2019, Theorem 11.4.2). Let $(M, d)$ be a geodesic space with lower bounded curvature. Let $\alpha \in \mathbb{R}$ and let $f : M \to \mathbb{R}$ be an $\alpha$-concave function, locally Lipschitz at $p$. Then there exists a unique gradient of $f$ at $p$, denoted $\nabla f(p)$.

We include the proof of Lemma 2.9 in the appendix.
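To see how the gradient differs from the differential, consider the following worked example, which is an illustration added here and not taken from the cited references. Take $M = \mathbb{R}$ with its usual distance, $p = 0$ and $f(x) = -|x|$, which is $0$-concave and $1$-Lipschitz; the tangent cone $T_0 \mathbb{R}$ is identified with $\mathbb{R}$ and $\langle u, v \rangle_0 = uv$.

```latex
\begin{align*}
d_0 f(v) &= \lim_{t \downarrow 0} \frac{f(tv) - f(0)}{t} = -|v|,
  \qquad v \in T_0\mathbb{R} \cong \mathbb{R}, \\
d_0 f(v) &= -|v| \le 0 = \langle 0, v \rangle_0
  \quad \text{for all } v,
  \qquad d_0 f(0) = 0 = \|0\|_0^2 .
\end{align*}
```

Hence $g = 0$ satisfies both defining conditions, so $\nabla f(0) = 0$, even though $d_0 f$ is neither linear nor identically zero. No $g \ne 0$ qualifies, since then $d_0 f(g) = -|g| < 0 < \|g\|_0^2$, in line with the uniqueness asserted in Lemma 2.9.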
We are now in a position to prove the main result of the paper.

For all $x \in M$, let $\gamma_x : [0, 1] \to M$ be a geodesic such that $\gamma_x(0) = x^*$ and $\gamma_x(1) = x$. Suppose $(\gamma_x)_{x \in M}$ is chosen as indicated in Remark 2.6, for $p = x^*$, and denote by
\[ \log_{x^*} : M \to T_{x^*} M \]
the corresponding logarithmic map at $x^*$. By $\alpha$-convexity of $f$, we deduce that, for all $t \in (0, 1]$ and all $x \in M$,
\[ f(\gamma_x(t)) \le (1 - t) f(x^*) + t f(x) - \frac{\alpha}{2}\, t (1 - t)\, d^2(x^*, x). \]
Rearranging terms, we get
\begin{align*}
f(x^*) &\le f(x) - \frac{f(\gamma_x(t)) - f(x^*)}{t} - \frac{\alpha}{2} (1 - t)\, d^2(x^*, x) \\
&= f(x) + \frac{(-f)(\gamma_x(t)) - (-f)(x^*)}{t} - \frac{\alpha}{2} (1 - t)\, d^2(x^*, x).
\end{align*}
Notice that $(-f) : M \to \mathbb{R}$ is $(-\alpha)$-concave. Hence, taking the limit $t \to 0$ and using $\dot\gamma_x = \log_{x^*}(x)$, we deduce that
\[ f(x^*) \le f(x) + d_{x^*}(-f)(\log_{x^*}(x)) - \frac{\alpha}{2}\, d^2(x^*, x). \]
Using the gradient $\nabla(-f)(x^*)$ given by Lemma 2.9, we deduce that
\[ f(x^*) \le f(x) + \langle \log_{x^*}(x), \nabla(-f)(x^*) \rangle_{x^*} - \frac{\alpha}{2}\, d^2(x^*, x). \]
The statement of Theorem 1.1 now follows by integrating both sides with respect to $\mu$ and using Lemma 2.7.

A Appendix
Proof of Lemma 2.4
Direct computations reveal that, for all $\gamma, \sigma \in \Gamma_p$,
\[ \frac{d^2(\gamma(t), \sigma(t))}{t^2} = |\gamma'|^2 + |\sigma'|^2 - 2 |\gamma'| |\sigma'| \cos \measuredangle^0_p(\gamma(t), \sigma(t)), \]
provided $t > 0$ is such that $\gamma(t)$ and $\sigma(t)$ are defined. Taking the limit $t \to 0$, it follows by definition of angles that
\begin{align*}
|\gamma - \sigma|_p^2 &= |\gamma'|^2 + |\sigma'|^2 - 2 |\gamma'| |\sigma'| \cos \measuredangle_p(\gamma, \sigma) \tag{A.1} \\
&= (|\gamma'| - |\sigma'|)^2 + 2 |\gamma'| |\sigma'| (1 - \cos \measuredangle_p(\gamma, \sigma)). \tag{A.2}
\end{align*}
Identity (A.2) shows that $|\gamma - \sigma|_p$ is indeed well defined and that $\dot\gamma = \dot\sigma$ iff $|\gamma'| = |\sigma'|$ and $\vec\gamma = \vec\sigma$. Hence, the map (2.5) is defined without ambiguity, as it does not depend on particular representatives, and is injective. In addition, expression (A.1) translates precisely as
\[ |\dot\gamma - \dot\sigma|_p = d_c([\vec\gamma, |\gamma'|], [\vec\sigma, |\sigma'|]), \]
where $d_c$ denotes the metric in $\mathrm{cone}(\Sigma'_p)$. This proves that the map (2.5) is distance preserving. Finally, this map is surjective since, for any $s > 0$ and any $\gamma : [0, \tau] \to M$ in $\Gamma_p$, $[\vec\gamma, s] \in \mathrm{cone}(\Sigma'_p)$ is the image of $\dot\gamma_\alpha$, where $\gamma_\alpha : [0, \tau/\alpha] \to M$ is defined by $\gamma_\alpha(t) = \gamma(\alpha t)$ with $\alpha := s / |\gamma'|$, so that $|\gamma_\alpha'| = \alpha |\gamma'| = s$. To show that the completion of $T'_p M$ is isometric to $T_p M$, it remains to observe that, more generally, the cone over the completion of a metric space is isometric to the completion of the cone over that space.

To prove this statement, consider a metric space $(\Omega, d)$ and denote by $(\bar\Omega, \bar d)$ its completion. We denote by $\bar x_n$ the equivalence class of a Cauchy sequence $x_n$ for the equivalence relation $\lim_n d(x_n, y_n) = 0$ and understand $\bar d$ as
\[ \bar d(\bar x_n, \bar y_n) = \lim_n d(x_n, y_n). \]
Since
\[ d_c^2([p, s], [q, t]) = (s - t)^2 + 2 s t (1 - \cos d(p, q)), \]
we see that a sequence $[p_n, s_n]$ is a Cauchy sequence in $\mathrm{cone}(\Omega)$ iff ($s_n \to 0$) or ($s_n \to s_\infty > 0$ and $p_n$ is a Cauchy sequence in $\Omega$). As a result, the map $\phi : \mathrm{cone}(\bar\Omega) \to \overline{\mathrm{cone}(\Omega)}$ defined by
\[ \phi([\bar p_n, s]) := \begin{cases} \overline{[p_n, s]} & \text{if } s > 0, \\ \bar 0 & \text{if } s = 0, \end{cases} \]
where $\bar 0$ denotes the class of the tip, is well defined and distance preserving when both $\bar\Omega$ and $\overline{\mathrm{cone}(\Omega)}$ are equipped with the completion metric. One checks finally that it is invertible, with inverse given by $\phi^{-1}(\overline{[p_n, s_n]}) = 0$ if $s_n \to 0$ and $\phi^{-1}(\overline{[p_n, s_n]}) = [\bar p_n, s_\infty]$ if $s_n \to s_\infty > 0$.

Proof of Lemma 2.5
We simply report and detail the proofs of Lemmas 3.3 and 4.2 in Ohta (2012), emphasizing that they actually do not require the metric space $M$ to be proper, an assumption imposed in Ohta (2012) for other reasons.

Suppose that $(M, d)$ is a Polish geodesic space and has lower bounded curvature. Introduce the set $G_p \subset \Gamma_p$ of all geodesics $\gamma : [0, 1] \to M$ issuing from $p$. Equipped with the supremum metric
\[ d_\infty(\gamma, \sigma) := \sup_{t \in [0, 1]} d(\gamma(t), \sigma(t)), \]
$G_p$ is a Polish metric space.

For $t \in [0, 1]$, denote by $e_t : G_p \to M$ the evaluation map defined by $e_t(\gamma) := \gamma(t)$. This evaluation map is (Lipschitz) continuous. Hence, for all $x \in M$, the set $e_1^{-1}(x) \subset G_p$ is closed and non-empty. Furthermore, for any open set $U \subset G_p$, the set
\[ \{x \in M : e_1^{-1}(x) \cap U \ne \emptyset\} = e_1(U) \]
is a Borel set.

Indeed, fix a non-empty open set $U \subset G_p$. For $\delta > 0$, denote
\[ A_\delta := G_p \setminus \{\gamma \in G_p : d_\infty(\gamma, G_p \setminus U) < \delta\}. \]
The set $A_\delta$ is closed in $G_p$, satisfies $A_\delta \subset A_{\delta'}$ whenever $\delta' \le \delta$, and is such that $\cup_{\delta > 0} A_\delta = U$. Given $\varepsilon, \delta > 0$, consider the set $M_{\varepsilon, \delta} \subset M$ of all points $x$ for which there exists some $\sigma : [0, 1] \to M$ satisfying $\sigma(0) = p$, $\sigma(1) = x$ and
\[ d_\infty(\sigma, A_\delta) < \varepsilon. \]
The set $M_{\varepsilon, \delta}$ is open in $M$ and satisfies $M_{\varepsilon, \delta} \subset M_{\varepsilon', \delta}$ whenever $\varepsilon \le \varepsilon'$. Hence, the set $e_1(A_\delta) = \cap_{\varepsilon > 0} M_{\varepsilon, \delta}$ is a Borel set in $M$, since the intersection can be restricted to any countable sequence $\varepsilon_n \downarrow 0$. Finally, $e_1(U) = \cup_{\delta > 0} e_1(A_\delta)$ is a Borel set in $M$ as well since, again, the union can be restricted to any countable sequence $\delta_n \downarrow 0$. Hence, by a measurable selection argument (see Bogachev, 2007), there exists a Borel measurable map $g : M \to G_p$ such that, for all $x \in M$, $g(x) \in e_1^{-1}(x)$. We denote
\[ \gamma_x := g(x). \]

It remains to show that the map $\theta_p : \gamma \in G_p \mapsto \dot\gamma \in T_p M$ is measurable when $T_p M$ is equipped with the $\sigma$-algebra generated by open balls. To prove this fact, fix $v \in T_p M$ and $r > 0$. By density of $T'_p M$ in $T_p M$, there exists $\gamma_n \in G_p$ such that $\dot\gamma_n \to v$. Therefore,
\begin{align*}
\theta_p^{-1}(B(v, r)) &= \{\gamma \in G_p : \|\dot\gamma - v\|_p < r\} \\
&= \bigcup_{n \ge 1} \bigcap_{m \ge n} \{\gamma \in G_p : \|\dot\gamma - \dot\gamma_m\|_p < r\} \\
&= \bigcup_{n \ge 1} \bigcap_{m \ge n} \{\gamma \in G_p : \lim_{t \to 0} t^{-1} d(\gamma(t), \gamma_m(t)) < r\} \\
&= \bigcup_{n \ge 1} \bigcap_{m \ge n} \bigcup_{\ell \ge 1} \bigcap_{k \ge \ell} \{\gamma \in G_p : d(\gamma(1/k), \gamma_m(1/k)) < r/k\}.
\end{align*}
But since, for fixed $k, m \ge 1$, the set $\{\gamma \in G_p : d(\gamma(1/k), \gamma_m(1/k)) < r/k\}$ is open in $(G_p, d_\infty)$, we see that $\theta_p^{-1}(B(v, r))$ is a Borel subset of $(G_p, d_\infty)$, which completes the proof.

Proof of Lemma 2.8
Let $\gamma : [0, \tau] \to M$ be any geodesic issuing from $p$ and let $\sigma : [0, 1] \to M$ be the reparametrization of $\gamma$ defined by $\sigma(t) = \gamma(\tau t)$. Then, for all $t \in (0, 1]$,
\begin{align*}
\frac{f(\gamma(\tau t)) - f(p)}{\tau t} &= \frac{1}{\tau} \left\{ \frac{f(\sigma(t)) - f(p)}{t} - \frac{\alpha t}{2} |\sigma'|^2 \right\} + \frac{\alpha t}{2 \tau} |\sigma'|^2 \\
&= \frac{1}{\tau} \left\{ \frac{g(t) - g(0)}{t} \right\} + \frac{\alpha t}{2 \tau} |\sigma'|^2, \tag{A.3}
\end{align*}
where $g : [0, 1] \to \mathbb{R}$ is defined by
\[ g(t) = f(\sigma(t)) - \frac{\alpha t^2}{2} |\sigma'|^2 = f(\sigma(t)) - \frac{\alpha}{2}\, d^2(p, \sigma(t)). \]
The $\alpha$-concavity of $f$ implies that $g$ is concave on $[0, 1]$. Hence, the right derivative of $g$ at 0 is well defined and satisfies
\[ \lim_{t \downarrow 0} \frac{g(t) - g(0)}{t} = \sup_{t \in (0, 1]} \frac{g(t) - g(0)}{t} = \sup_{t \in (0, 1]} \left\{ \frac{f(\sigma(t)) - f(p)}{t} - \frac{\alpha t}{2} |\sigma'|^2 \right\}. \tag{A.4} \]
Hence, combining (A.3), (A.4) and noticing that $|\sigma'| = \tau |\gamma'|$, we obtain
\begin{align*}
\lim_{t \downarrow 0} \frac{f(\gamma(t)) - f(p)}{t} &= \lim_{t \downarrow 0} \frac{f(\gamma(\tau t)) - f(p)}{\tau t} \\
&= \frac{1}{\tau} \sup_{t \in (0, 1]} \left\{ \frac{f(\sigma(t)) - f(p)}{t} - \frac{\alpha t}{2} |\sigma'|^2 \right\} \\
&= \frac{1}{\tau} \sup_{t \in (0, 1]} \left\{ \frac{f(\gamma(\tau t)) - f(p)}{t} - \frac{\alpha \tau^2 t}{2} |\gamma'|^2 \right\} \\
&= \sup_{t \in (0, \tau]} \left\{ \frac{f(\gamma(t)) - f(p)}{t} - \frac{\alpha t}{2} |\gamma'|^2 \right\},
\end{align*}
which proves (2.7). Now, for any $\dot\gamma, \dot\sigma \in T'_p M$, the local Lipschitz property of $f$ implies that
\begin{align*}
|d_p f(\dot\gamma) - d_p f(\dot\sigma)| &= \lim_{t \to 0} \frac{|f(\gamma(t)) - f(\sigma(t))|}{t} \\
&\le \mathrm{Lip}_p(f) \lim_{t \to 0} \frac{d(\gamma(t), \sigma(t))}{t} \\
&= \mathrm{Lip}_p(f)\, \|\dot\gamma - \dot\sigma\|_p.
\end{align*}
On the one hand, this inequality shows that the limit on the right-hand side of (2.6) is independent of the chosen representative $\gamma$ of $\dot\gamma$. On the other hand, by density of $T'_p M$ in $T_p M$, it implies that $d_p f$ admits a unique $\mathrm{Lip}_p(f)$-Lipschitz extension to $T_p M$.

Finally, to prove that $d_p f$ is positively homogeneous on $T_p M$, it is enough to show it on $T'_p M$ and conclude by continuity. But, for any geodesic $\gamma : [0, \tau] \to M$ and any $\lambda > 0$, the geodesic $\sigma : [0, \tau/\lambda] \to M$ defined by $\sigma(t) = \gamma(\lambda t)$ has metric speed $\lambda |\gamma'|$, hence satisfies $\dot\sigma = \lambda \dot\gamma$, and we see that
\[ d_p f(\dot\sigma) = \lim_{t \to 0} \frac{f(\sigma(t)) - f(p)}{t} = \lambda \lim_{t \to 0} \frac{f(\gamma(\lambda t)) - f(p)}{\lambda t} = \lambda\, d_p f(\dot\gamma), \]
which completes the proof.

Proof of Lemma 2.9
The proof follows the lines devised in Alexander et al. (2019) with minor modifications. In particular, we use the following result, due to Lang and Schroeder (1997).

Lemma A.1 (Lang and Schroeder, 1997, Lemma A.4). Let $(M, d)$ be a geodesic space with lower bounded curvature, in the sense of Definition 2.2. Let $p \in M$ and let $\gamma, \sigma : [0, 1] \to M$ be two geodesics issuing from $p$. For all $t \in (0, 1]$, let $\delta_t : [0, 1] \to M$ be a geodesic connecting $\gamma(t)$ to $\sigma(t)$. Introducing the midpoint $m_t = \delta_t(1/2)$ of $\gamma(t)$ and $\sigma(t)$, we have
\[ 4 \lim_{t \downarrow 0} \frac{d^2(p, m_t)}{t^2} = \|\dot\gamma\|_p^2 + 2 \langle \dot\gamma, \dot\sigma \rangle_p + \|\dot\sigma\|_p^2. \]

We can now establish the following result.
Lemma A.2 (Alexander et al., 2019, Lemma 11.2.3). Let $(M, d)$ be a geodesic space with lower bounded curvature, in the sense of Definition 2.2. Suppose $f : M \to \mathbb{R}$ is locally Lipschitz at $p$ and $\alpha$-concave. Then, for all $u, v \in T_p M$,
\[ \sup_{w \in T_p M : \|w\|_p = 1} d_p f(w) \ge \frac{d_p f(u) + d_p f(v)}{\sqrt{\|u\|_p^2 + 2 \langle u, v \rangle_p + \|v\|_p^2}}. \]

Proof of Lemma A.2.
By density of $T'_p M$ in $T_p M$ and continuity of $d_p f$, it is enough to prove the result for all $u = \dot\gamma \in T'_p M$ and $v = \dot\sigma \in T'_p M$, where $\gamma, \sigma \in \Gamma_p$. For two such geodesics $\gamma : [0, 1] \to M$ and $\sigma : [0, 1] \to M$, and all $t \in (0, 1]$, let $\delta_t : [0, 1] \to M$ be a geodesic connecting $\gamma(t)$ to $\sigma(t)$. Then, by $\alpha$-concavity of $f$,
\[ 2 f(\delta_t(1/2)) \ge f(\gamma(t)) + f(\sigma(t)) - \frac{\alpha}{4}\, d^2(\gamma(t), \sigma(t)). \tag{A.5} \]
Observing that
\begin{align*}
f(\gamma(t)) &= f(p) + t\, d_p f(\dot\gamma) + o(t), \\
f(\sigma(t)) &= f(p) + t\, d_p f(\dot\sigma) + o(t), \\
d(\gamma(t), \sigma(t)) &= t\, \|\dot\gamma - \dot\sigma\|_p + o(t),
\end{align*}
we deduce from (A.5) that
\[ \liminf_{t \downarrow 0} \left( \frac{f(\delta_t(1/2)) - f(p)}{t} \right) \ge \frac{d_p f(\dot\gamma) + d_p f(\dot\sigma)}{2}. \]
Now, for all $t \in (0, 1]$, let $\omega_t : [0, t] \to M$ be a geodesic connecting $p$ to $\delta_t(1/2)$. Then, by (2.7),
\begin{align*}
\frac{f(\delta_t(1/2)) - f(p)}{t} &= \frac{f(\omega_t(t)) - f(p)}{t} \\
&\le \sup_{s \in (0, t]} \left\{ \frac{f(\omega_t(s)) - f(p)}{s} - \frac{\alpha s}{2} |\omega_t'|^2 \right\} + \frac{\alpha t}{2} |\omega_t'|^2 \\
&= d_p f(\dot\omega_t) + \frac{\alpha t}{2} |\omega_t'|^2 \\
&= |\omega_t'| \left( d_p f(|\omega_t'|^{-1} \dot\omega_t) + \frac{\alpha t}{2} |\omega_t'| \right).
\end{align*}
Since $\| |\omega_t'|^{-1} \dot\omega_t \|_p = 1$ for all $t \in (0, 1]$, it remains to observe that
\[ 4 |\omega_t'|^2 = \|\dot\gamma\|_p^2 + 2 \langle \dot\gamma, \dot\sigma \rangle_p + \|\dot\sigma\|_p^2 + \varepsilon(t), \]
where $\varepsilon(t) \to 0$ as $t \to 0$. But this follows by observing that
\[ 4 \lim_{t \downarrow 0} |\omega_t'|^2 = 4 \lim_{t \downarrow 0} \frac{d^2(p, \omega_t(t))}{t^2} = \|\dot\gamma\|_p^2 + 2 \langle \dot\gamma, \dot\sigma \rangle_p + \|\dot\sigma\|_p^2, \]
according to Lemma A.1.

We are now in a position to prove existence and uniqueness of gradients for $\alpha$-concave functions.

Proof of Lemma 2.9.
Let $w_n = [\xi_n, 1] \in T_p M$ be such that
\[ \lim_{n \to +\infty} d_p f(w_n) = d_{\sup} := \sup_{w \in T_p M : \|w\|_p = 1} d_p f(w). \]
Applying Lemma A.2, we get, for all $m, n \ge 1$,
\[ d_{\sup} \ge \frac{d_p f(w_n) + d_p f(w_m)}{\sqrt{2 + 2 \langle w_n, w_m \rangle_p}} = \frac{d_p f(w_n) + d_p f(w_m)}{\sqrt{2 + 2 \cos \measuredangle_p(\xi_n, \xi_m)}}. \]
Letting $n, m \to +\infty$, this implies that $\xi_n$ is a Cauchy sequence in the complete space $\Sigma_p$, and hence converges towards some $\xi_\infty \in \Sigma_p$. Denoting $w_\infty = [\xi_\infty, 1]$, we get $d_p f(w_\infty) = d_{\sup}$ by continuity. Now, denote
\[ g := d_{\sup}\, w_\infty, \]
and select an arbitrary $w \in T_p M$. Then, applying Lemma A.2 to $u = w_\infty$ and $v = \varepsilon w$, we get
\[ d_{\sup} \ge \frac{d_{\sup} + \varepsilon\, d_p f(w)}{\sqrt{1 + 2 \varepsilon \langle w_\infty, w \rangle_p + \varepsilon^2 \|w\|_p^2}} = d_{\sup} + \varepsilon \left( d_p f(w) - \langle g, w \rangle_p \right) + o(\varepsilon). \]
Letting $\varepsilon \downarrow 0$, we obtain $d_p f(w) \le \langle g, w \rangle_p$. Since it is clear by construction that $d_p f(g) = \|g\|_p^2$, this proves the existence of a gradient. To prove uniqueness, consider another $g'$ satisfying the same properties. Then, we have
\[ \|g'\|_p^2 = d_p f(g') \le \langle g, g' \rangle_p \quad \text{and} \quad \|g\|_p^2 = d_p f(g) \le \langle g, g' \rangle_p. \]
As a result,
\[ 0 \le \|g' - g\|_p^2 = \|g'\|_p^2 - 2 \langle g, g' \rangle_p + \|g\|_p^2 \le 0, \]
which imposes $g = g'$ and completes the proof.

Acknowledgments. I would like to thank Thibaut Le Gouic and Philippe Rigollet for discussions on metric geometry and its applications, which sparked my interest in the problem addressed in this paper.