Jensen's inequality in geodesic spaces with lower bounded curvature

Abstract

Let $(M,d)$ be a separable and complete geodesic space with curvature lower bounded, by $\kappa \in \mathbb{R}$, in the sense of Alexandrov. Let $\mu$ be a Borel probability measure on $M$, such that $\mu \in \mathcal{P}_2(M)$, and that has at least one barycenter $x^* \in M$. We show that for any geodesically $\alpha$-convex function $f : M \to \mathbb{R}$, for $\alpha \in \mathbb{R}$, the inequality
$$f(x^*) \le \int_M \Big( f - \frac{\alpha}{2}\, d^2(x^*, \cdot) \Big)\, \mathrm{d}\mu$$
holds provided $f$ is locally Lipschitz at $x^*$ and either positive or in $L^1(\mu)$. Our proof relies on the properties of tangent cones at barycenters and on the existence of gradients for semi-concave functions in spaces with lower bounded curvature.


Quentin Paris ∗

∗ HSE University, Faculty of Computer Science, Moscow, Russia. This work has been funded by the Russian Academic Excellence Project '5-100'. Email: [email protected]

In this paper, we prove Jensen's inequality in Polish (i.e., separable and complete) geodesic metric spaces with curvature lower bounded, by some $\kappa \in \mathbb{R}$, in the sense of Alexandrov. Our result is formulated in terms of barycenters of probability measures and relies on the notion of geodesic convexity. We first briefly recall these concepts and refer the reader to Section 2 for precise geometric definitions.

Given a metric space $(M,d)$, we denote $\mathcal{P}_2(M)$ the set of Borel probability measures $\mu$ on $M$ such that, for all $x \in M$,
$$V_\mu(x) := \int_M d^2(x,y)\, \mathrm{d}\mu(y) < +\infty.$$
Given $\mu \in \mathcal{P}_2(M)$, we call $V_\mu : M \to \mathbb{R}_+$ its variance functional, denote
$$V^*_\mu := \inf_{x \in M} V_\mu(x),$$
and call barycenter of $\mu$ any $x^* \in M$ such that $V_\mu(x^*) = V^*_\mu$. Barycenters provide a natural generalization of the notion of mean value of a probability measure when $M$ has no linear structure. While alternative notions of mean value in a metric space have been proposed, barycenters are often favored for their simple interpretation and constructive definition as the solution of an optimization problem. In full generality, a barycenter may not exist and, if it does, need not be unique.
The questions of existence and uniqueness of barycenters have been addressed in a number of settings and we refer the reader to Sturm (2003), Agueh and Carlier (2011), Afsari (2011), Ohta (2012), Yokota (2016), Kim and Pass (2017), Le Gouic and Loubes (2017), Ahidar-Coutrix et al. (2019), Le Gouic et al. (2019) and Huckemann and Eltzner (2020) for discussions on this topic.

Given a geodesic space $(M,d)$, and $\alpha \in \mathbb{R}$, a function $f : M \to \mathbb{R}$ is called geodesically $\alpha$-convex if, for every geodesic $\gamma : [0,1] \to M$, the function
$$t \in [0,1] \mapsto f(\gamma(t)) - \frac{\alpha}{2}\, d^2(\gamma(0), \gamma(t))$$
is convex. This notion reduces to the classical definition of convexity (resp. strong convexity) in the Euclidean setting when $\alpha = 0$ (resp. $\alpha > 0$).

Theorem 1.1. Let $(M,d)$ be a Polish geodesic space with curvature lower bounded by some $\kappa \in \mathbb{R}$, in the sense of Alexandrov (Definition 2.2). Let $\mu \in \mathcal{P}_2(M)$ and suppose $\mu$ admits a barycenter $x^* \in M$. Let $\alpha \in \mathbb{R}$ and $f : M \to \mathbb{R}$ be geodesically $\alpha$-convex and locally Lipschitz at $x^*$. Then if $f$ is either positive, or in $L^1(\mu)$, we have
$$f(x^*) \le \int_M f\, \mathrm{d}\mu - \frac{\alpha}{2}\, V^*_\mu.$$

Note for instance that if $(M,d) = (\mathbb{R}^p, \|\cdot - \cdot\|)$ and if $\mu \in \mathcal{P}_2(M)$, then $x^* := \int x\, \mathrm{d}\mu(x)$ is the unique minimizer of
$$x \in M \mapsto \int \|x - y\|^2\, \mathrm{d}\mu(y).$$

The generalization of Jensen's inequality to non-standard settings has been studied in a number of contributions. In the context of Riemannian [...] $M = W_2(\Omega)$ over a (Polish) space $\Omega$, known to be a geodesic space of curvature lower bounded by 0 if, and only if, $\Omega$ satisfies the same property (Sturm, 2006, Proposition 2.10). In the case where $\Omega = \mathbb{R}^d$, and $\mu$ is a probability measure with finite support, Agueh and Carlier (2011) prove Jensen's inequality for a number of classical functionals. This result is later generalized by Kim and Pass (2017) to the case where $\Omega$ is a compact Riemannian manifold and $\mu$ is a sufficiently well behaved Borel probability measure over $W_2(\Omega)$. On the one hand, the result of Kim and Pass (2017) is more general than Theorem 1.1, in the context of the Wasserstein space, since it does not impose restrictions on the sectional curvature of the base space $\Omega$. On the other hand, the generality of our result seems to imply the validity of Jensen's inequality in $W_2(\Omega)$ as soon as $\Omega$ is a Polish geodesic space with non-negative curvature, which need not be a smooth manifold, and without the additional regularity assumptions on $\mu$ imposed in Kim and Pass (2017).

From a technical point of view, our proof is straightforward but relies on important results from metric geometry. The first main result we invoke (Lemma 2.7) concerns the properties of tangent cones at barycenters of probability measures in spaces with lower bounded curvature as studied in Ohta (2012), Yokota (2012), Le Gouic et al. (2019) and Le Gouic (2020). The second fundamental result we use (Lemma 2.9) is the existence (and uniqueness) of an appropriate notion of gradient for semi-concave functions on spaces with lower bounded curvature (see, e.g., Petrunin, 2007).

The paper is organized as follows. Section 2 introduces the background in metric geometry necessary for our result. Section 3 presents the proof of Theorem 1.1. Finally, Appendix A gives, for completeness, proofs of some key lemmas presented in the preliminary section.

In this preliminary section we summarize some definitions and results from metric geometry necessary for our main result. Most of the material gathered below can be found in classical texts such as Burago et al. (1992, 2001), Plaut (2002) or in the book in preparation by Alexander et al. (2019). We provide proofs for some of the key statements in Appendix A to clarify the presentation.
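The notions recalled in the introduction can be sanity-checked numerically in the Euclidean case, where the barycenter of an empirical measure is the sample mean. The following is a minimal sketch, not part of the paper: it assumes $V_\mu$ is the mean squared distance and that the inequality of Theorem 1.1 carries the constant $\alpha/2$, as in the abstract. The helper names (`variance`, `jensen_gap`, `quartic`) are illustrative only. The function $x \mapsto \|x\|^2$ is 2-convex and gives equality, while adding the convex term $\sum_k x_k^4$ keeps 2-convexity and gives a strict inequality.

```python
import random

def mean(points):
    """Coordinate-wise mean: the barycenter of the empirical measure in R^p."""
    n, dim = len(points), len(points[0])
    return [sum(p[k] for p in points) / n for k in range(dim)]

def sq_dist(x, y):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def variance(x, points):
    """V_mu(x): average squared distance from x to the sample (assumed normalization)."""
    return sum(sq_dist(x, p) for p in points) / len(points)

def jensen_gap(f, alpha, points):
    """Right-hand side minus left-hand side of f(x*) <= mean(f) - (alpha/2) V*."""
    x_star = mean(points)
    rhs = sum(f(p) for p in points) / len(points) - 0.5 * alpha * variance(x_star, points)
    return rhs - f(x_star)

def sq_norm(x):
    """x -> |x|^2, geodesically 2-convex in Euclidean space."""
    return sum(a * a for a in x)

def quartic(x):
    """|x|^2 plus a convex quartic term, still 2-convex."""
    return sq_norm(x) + sum(a ** 4 for a in x)

random.seed(0)
sample = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
gap_sq_norm = jensen_gap(sq_norm, 2.0, sample)   # equality case, up to rounding
gap_quartic = jensen_gap(quartic, 2.0, sample)   # strict inequality expected
print(gap_sq_norm, gap_quartic)
```

The equality for $\|x\|^2$ is the usual bias-variance identity, which suggests that the constant in front of $V^*_\mu$ cannot be improved in general.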

Let $(M,d)$ be a metric space. We call path in $M$ a continuous map $\gamma : I \to M$ defined on an interval $I \subset \mathbb{R}$. The length $L(\gamma) \in [0,+\infty]$ of a path $\gamma : I \to M$ is defined by
$$L(\gamma) := \sup_{n \ge 1,\ t_0 \le \dots \le t_n \in I}\ \sum_{i=0}^{n-1} d(\gamma(t_i), \gamma(t_{i+1})). \tag{2.1}$$
A path is called rectifiable if it has finite length. Given a path $\gamma : I \to M$ and an interval $J \subset I$, we denote $\gamma_J : J \to M$ the restriction of $\gamma$ to $J$. Two paths $\gamma_i : I_i \to M$, $i = 1, 2$, are said to be equivalent if there exist continuous, non-decreasing and surjective functions $\varphi_i : J \to I_i$ such that $\gamma_1 \circ \varphi_1 = \gamma_2 \circ \varphi_2$. In this case, $\gamma_1$ is said to be a reparametrisation of $\gamma_2$ and one checks that $L(\gamma_1) = L(\gamma_2)$. A path $\gamma : [a,b] \to M$ is said to have constant speed if, for all $a \le s \le t \le b$,
$$L(\gamma_{[s,t]}) = \frac{t-s}{b-a}\, L(\gamma).$$
If $-\infty < a < b < +\infty$ and $\gamma : [a,b] \to M$ is rectifiable, then for any $\tau > 0$, $\gamma$ admits a constant speed reparametrization $[0,\tau] \to M$. A path $\gamma : [0,\tau] \to M$ is said to issue from $x$ if $\gamma(0) = x$ and is said to connect $x$ to $y$ if in addition $\gamma(\tau) = y$. It follows from the definition of length, and the triangle inequality, that $d(x,y) \le L(\gamma)$ for any path $\gamma$ connecting $x$ to $y$. In particular,
$$d(x,y) \le \bar d(x,y) := \inf_\gamma L(\gamma), \tag{2.2}$$
where the infimum is taken over all such paths $\gamma$. The function $\bar d$ defines a $[0,+\infty]$-valued metric on $M$ called the length metric.

Definition 2.1. The space $M$ is called a length space if $d = \bar d$. A length space $(M,d)$ is said to be a geodesic space if, in addition, the infimum in the definition of $\bar d$ is always attained. In a geodesic space, a path connecting $x$ to $y$ whose length is equal to $d(x,y)$ is called a shortest path connecting $x$ to $y$. A shortest path $\gamma : [0,\tau] \to M$, with constant speed, is called a geodesic.

A path $\gamma : [0,\tau] \to M$ is a geodesic iff, for all $0 \le s \le t \le \tau$,
$$d(\gamma(s), \gamma(t)) = \frac{t-s}{\tau}\, d(\gamma(0), \gamma(\tau)).$$
For a geodesic $\gamma : [0,\tau] \to M$, the metric speed
$$|\gamma'| := \frac{d(\gamma(0), \gamma(t))}{t}$$
is constant by construction, for $t \in (0,\tau]$.

For $\kappa \in \mathbb{R}$, a remarkable geodesic space is the $\kappa$-plane $(M_\kappa, d_\kappa)$ defined as the unique (up to isometry) 2-dimensional complete and simply connected Riemannian manifold with constant sectional curvature $\kappa$, equipped with its Riemannian distance $d_\kappa$. The diameter $D_\kappa$ of $M_\kappa$ is
$$D_\kappa := \begin{cases} +\infty & \text{if } \kappa \le 0, \\ \pi/\sqrt{\kappa} & \text{if } \kappa > 0. \end{cases}$$
For $\kappa \in \mathbb{R}$, there is a unique geodesic $[0,1] \to M_\kappa$ connecting $x$ to $y$ in $M_\kappa$ provided $d_\kappa(x,y) < D_\kappa$.

Given a metric space $(M,d)$, we call triangle in $M$ any set of three points $\{p,x,y\} \subset M$. We call it non-degenerate if all three points are distinct. For $\kappa \in \mathbb{R}$, a comparison triangle for $\{p,x,y\} \subset M$ in $M_\kappa$ is an isometric copy $\{\bar p, \bar x, \bar y\} \subset M_\kappa$ of $\{p,x,y\}$ in $M_\kappa$ (i.e., pairwise distances are preserved). Such a comparison triangle always exists and is unique (up to an isometry) provided the perimeter satisfies
$$\mathrm{peri}\{p,x,y\} := d(p,x) + d(p,y) + d(x,y) < 2 D_\kappa.$$
If $\{p,x,y\}$ is non-degenerate and $\mathrm{peri}\{p,x,y\} < 2 D_\kappa$, the triangle inequality implies that $d(p,x), d(p,y), d(x,y) < D_\kappa$.
Given $\kappa \in \mathbb{R}$, $p, x, y \in M$ with $p \notin \{x,y\}$ and $\mathrm{peri}\{p,x,y\} < 2 D_\kappa$, we define the comparison angle $\measuredangle^\kappa_p(x,y) \in [0,\pi]$ at $p$ by
$$\cos \measuredangle^\kappa_p(x,y) := \begin{cases} \dfrac{d^2(p,x) + d^2(p,y) - d^2(x,y)}{2\, d(p,x)\, d(p,y)} & \text{if } \kappa = 0, \\[2ex] \dfrac{c_\kappa(d(x,y)) - c_\kappa(d(p,x)) \cdot c_\kappa(d(p,y))}{\kappa \cdot s_\kappa(d(p,x))\, s_\kappa(d(p,y))} & \text{if } \kappa \ne 0, \end{cases}$$
where, for $r \ge 0$,
$$s_\kappa(r) := \begin{cases} \sin(r\sqrt{\kappa})/\sqrt{\kappa} & \text{if } \kappa > 0, \\ \sinh(r\sqrt{-\kappa})/\sqrt{-\kappa} & \text{if } \kappa < 0, \end{cases} \tag{2.3}$$
and $c_\kappa(r) = s'_\kappa(r)$. When $\mathrm{peri}\{p,x,y\} \ge 2 D_\kappa$, we declare the angle $\measuredangle^\kappa_p(x,y)$ undefined. Note that the comparison angle $\measuredangle^\kappa_p(x,y)$ corresponds to the Riemannian angle at $\bar p$ between the two unique geodesics connecting $\bar p$ to $\bar x$ and $\bar y$ respectively in $M_\kappa$, where $\{\bar p, \bar x, \bar y\} \subset M_\kappa$ denotes a comparison triangle for $\{p,x,y\}$.

Definition 2.2. Given $\kappa \in \mathbb{R}$, a metric space $(M,d)$ is said to have curvature lower bounded by $\kappa$, which we denote by $\mathrm{curv}(M) \ge \kappa$, if for all $p, x, y, z \in M$, such that $p \notin \{x,y,z\}$, we have
$$\measuredangle^\kappa_p(x,y) + \measuredangle^\kappa_p(x,z) + \measuredangle^\kappa_p(y,z) \le 2\pi, \tag{2.4}$$
when all three angles are defined.

Definition 2.2 is of global nature as it requires comparison (2.4) to hold for all quadruples $p, x, y, z \in M$ for which angles at $p$ are defined. A globalization result due to Burago et al. (1992) states that, when $M$ is a complete length space, then it has curvature lower bounded by $\kappa$ in the sense of Definition 2.2 iff, for all $p \in M$, comparison (2.4) holds for all $\{x,y,z\}$ in a neighborhood of $p$. In the case of geodesic spaces, we can give the following equivalent characterization of lower bounded curvature.

Theorem 2.3. Let $(M,d)$ be a geodesic space and $\kappa \in \mathbb{R}$. Then the following statements are equivalent.

(1) $\mathrm{curv}(M) \ge \kappa$ in the sense of Definition 2.2.

(2) For all $p, x, y \in M$ with $p \notin \{x,y\}$ and $\mathrm{peri}\{p,x,y\} < 2 D_\kappa$, and for any geodesics $\gamma_x, \gamma_y : [0,1] \to M$ connecting $p$ to $x$ and $p$ to $y$ respectively, we have
$$\forall s, t \in [0,1], \quad d(\gamma_x(s), \gamma_y(t)) \ge d_\kappa(\bar\gamma_x(s), \bar\gamma_y(t)),$$
where, given a comparison triangle $\{\bar p, \bar x, \bar y\}$ of $\{p,x,y\}$ in $M_\kappa$, $\bar\gamma_x, \bar\gamma_y : [0,1] \to M_\kappa$ are geodesics (which are unique if the triangle is non-degenerate) connecting $\bar p$ to $\bar x$ and $\bar p$ to $\bar y$ respectively.

We end the paragraph with a few standard examples of Polish geodesic spaces with lower bounded curvature in the sense of Definition 2.2.

• A complete and connected Riemannian manifold, with its Riemannian distance, is a Polish geodesic space with curvature lower bounded by $\kappa \in \mathbb{R}$, iff its sectional curvatures are all lower bounded by $\kappa$.

• The boundary $\partial K$ of a convex and compact subset $K \subset \mathbb{R}^d$ (with non-empty interior) equipped with its length metric (inherited from the induced Euclidean distance) is a Polish geodesic space with curvature lower bounded by 0.

• Given a Polish geodesic space $(\Omega, d)$ with curvature lower bounded by 0, the 2-Wasserstein space $W_2(\Omega) := (\mathcal{P}_2(\Omega), W_2)$ over $\Omega$ is a Polish geodesic space with curvature lower bounded by 0.

Let $(M,d)$ be a geodesic space with lower bounded curvature in the sense of Definition 2.2.

Given $p \in M$, we denote $\Gamma_p$ the set of all non-trivial geodesics $\gamma : [0,\tau] \to M$ issuing from $p$. For $\gamma, \sigma \in \Gamma_p$, the angle between $\gamma$ and $\sigma$ is defined by
$$\measuredangle_p(\gamma, \sigma) := \lim_{s,t \to 0} \measuredangle^0_p(\gamma(s), \sigma(t)).$$
The angle $\measuredangle_p : \Gamma_p^2 \to [0,\pi]$ is well defined for geodesic spaces with lower bounded curvature, as a consequence of Theorem 2.3, point (2), and satisfies, for all $\gamma, \omega, \sigma \in \Gamma_p$,
$$\measuredangle_p(\gamma, \sigma) \le \measuredangle_p(\gamma, \omega) + \measuredangle_p(\omega, \sigma).$$
The angle $\measuredangle_p$ is therefore a pseudo-metric on $\Gamma_p$ and induces a metric on the quotient space $\Sigma'_p := \Gamma_p / \sim$ where $\gamma \sim \sigma$ iff $\measuredangle_p(\gamma, \sigma) = 0$. We denote $\vec\gamma \in \Sigma'_p$ the equivalence class of $\gamma \in \Gamma_p$ for $\sim$. The completion $\Sigma_p$ of $\Sigma'_p$ is called the space of directions at $p$. Below we use the same symbol $\measuredangle_p$ to denote the pseudo-metric on $\Gamma_p$, the metric on $\Sigma'_p$ or the metric on $\Sigma_p$.

Given a metric space $(\Omega, d)$, with diameter at most $\pi$, consider the equivalence relation $\approx$ on $\Omega \times \mathbb{R}_+$ defined by $(p,s) \approx (q,t)$ iff ($s = t = 0$ or $(p,s) = (q,t)$). In other words, if $[p,s]$ denotes the class of $(p,s)$ for this relation, then $[p,s] = \{(p,s)\}$ if $s > 0$ and $[p,0] = \Omega \times \{0\}$. The Euclidean cone over $\Omega$, denoted $\mathrm{cone}(\Omega)$, is the quotient set $(\Omega \times \mathbb{R}_+)/\approx$ equipped with the metric $d_c$ defined by
$$d_c^2([p,s],[q,t]) := s^2 - 2st \cos d(p,q) + t^2.$$
We call $[p,0]$ the tip of the cone.

The tangent cone $T_pM$ of $M$ at $p$ is defined as the Euclidean cone over the space of directions of $M$ at $p$, i.e.,
$$T_pM := \mathrm{cone}(\Sigma_p).$$
We denote $\|\cdot - \cdot\|_p$ the metric on $T_pM$ and $0_p$ the tip of $T_pM$. For $u = [\xi, s] \in T_pM$ and $\lambda \in \mathbb{R}_+$, we define $\lambda u := [\xi, \lambda s]$. For $u = [\xi, s], v = [\zeta, t] \in T_pM$, we set $\|u\|_p := \|u - 0_p\|_p$ and $\langle u, v \rangle_p := st \cos \measuredangle_p(\xi, \zeta)$ so that
$$\|u - v\|_p^2 = \|u\|_p^2 - 2\langle u, v \rangle_p + \|v\|_p^2.$$

A useful alternative representation of $T_pM$ is obtained as follows. For any two geodesics $\gamma, \sigma \in \Gamma_p$, denote
$$|\gamma - \sigma|_p := \lim_{t \to 0} \frac{d(\gamma(t), \sigma(t))}{t}.$$
Angles between elements of $\Gamma_p$ being well defined, the limit always exists and $|\cdot - \cdot|_p$ defines a pseudo-metric on $\Gamma_p$. Denoting $\propto$ the equivalence relation on $\Gamma_p$ defined by $\gamma \propto \sigma$ iff $|\gamma - \sigma|_p = 0$, we define $T'_pM$ as the quotient set $\Gamma_p / \propto$ equipped with the induced metric $|\cdot - \cdot|_p$. For $\gamma \in \Gamma_p$, we denote $\dot\gamma \in T'_pM$ its class for relation $\propto$.

Lemma 2.4. The map
$$\dot\gamma \in T'_pM \mapsto [\vec\gamma, |\gamma'|] \in \mathrm{cone}(\Sigma'_p) \tag{2.5}$$
is a well defined isometry and the completion of $T'_pM$ is isometric to $T_pM$.

We report the proof of Lemma 2.4 in the appendix. From now on, we will therefore consider $T'_pM$ as a dense subset of $T_pM$ and identify $\dot\gamma \in T'_pM$ with the element $[\vec\gamma, |\gamma'|] \in T_pM$.

Suppose that $(M,d)$ is a geodesic space with lower bounded curvature. For $p \in M$, we call logarithmic map at $p$ any map $\log_p : M \to T_pM$ such that, for all $x \in M$,
$$\log_p(x) = \dot\gamma_x,$$
for some geodesic $\gamma_x : [0,1] \to M$ connecting $p$ to $x$. The next result shows that the choice of a sufficiently well behaved log map is possible provided $M$ is a Polish geodesic space. It was first cited as a remark in Le Gouic et al. (2019) and proved in Le Gouic (2020).

Lemma 2.5. Let $(M,d)$ be a Polish geodesic space with lower bounded curvature and equipped with its Borel $\sigma$-algebra. Then, for all $p \in M$, there exists a logarithmic map $\log_p : M \to T_pM$ which is measurable when $T_pM$ is equipped with the $\sigma$-algebra generated by open balls.

We report the proof of Lemma 2.5 in the appendix.
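As an illustration of the Euclidean cone construction used to define $T_pM$, the cone over the unit circle (with its angular metric, of diameter $\pi$) should be isometric to the Euclidean plane in polar coordinates. The sketch below checks this numerically. It assumes the cone metric is given by $d_c^2([p,s],[q,t]) = s^2 - 2st\cos d(p,q) + t^2$, and the function names are illustrative only.

```python
import math
import random

def cone_dist(p, s, q, t, base_dist):
    """Cone distance between [p, s] and [q, t] over a base space of diameter <= pi."""
    sq = s * s - 2.0 * s * t * math.cos(base_dist(p, q)) + t * t
    return math.sqrt(max(sq, 0.0))  # clip tiny negative rounding errors

def circle_dist(a, b):
    """Angular distance on the unit circle, for angles given in radians."""
    diff = abs(a - b) % (2.0 * math.pi)
    return min(diff, 2.0 * math.pi - diff)

def polar_to_xy(angle, radius):
    """Planar point with the given polar coordinates."""
    return (radius * math.cos(angle), radius * math.sin(angle))

random.seed(1)
max_err = 0.0
for _ in range(100):
    a, b = random.uniform(0.0, 2.0 * math.pi), random.uniform(0.0, 2.0 * math.pi)
    s, t = random.uniform(0.0, 5.0), random.uniform(0.0, 5.0)
    (x1, y1), (x2, y2) = polar_to_xy(a, s), polar_to_xy(b, t)
    planar = math.hypot(x1 - x2, y1 - y2)
    max_err = max(max_err, abs(cone_dist(a, s, b, t, circle_dist) - planar))
print(max_err)
```

All classes $[p,0]$ collapse to the tip, which is why `cone_dist` returns zero whenever both radii vanish, whatever the base points.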

Remark 2.6. The proof of Lemma 2.5 is essentially based on the observation that, under these assumptions, we can first select a collection $(\gamma_x)_{x \in M}$ of geodesics $\gamma_x : [0,1] \to M$ connecting $p$ to $x$, such that the map $x \in M \mapsto \gamma_x \in G_p$ is Borel measurable when we equip $G_p \subset \Gamma_p$, the set of all geodesics issuing from $p$ and defined on $[0,1]$, with the uniform metric. Then, we show that the map $\gamma \in G_p \mapsto \dot\gamma \in T_pM$ is measurable when $T_pM$ is equipped with the $\sigma$-algebra generated by open balls and define $\log_p$ as the composition of these two maps.

It follows from Lemma 2.5 that we can choose a Borel-measurable logarithmic map at any point $p$ whenever the tangent cone $T_pM$ is separable, since in this case the Borel $\sigma$-algebra on $T_pM$ coincides with the $\sigma$-algebra generated by open balls. This occurs for instance in the case where $M$ is a proper metric space as noted in Ohta (2012). It is also known to be the case for specific examples of non-proper spaces. For instance, if $\Omega$ denotes a Polish geodesic space with curvature lower bounded by 0, the 2-Wasserstein space $W_2(\Omega)$ has a separable tangent cone at any point. This fact follows from Ambrosio et al. (2008, Definition 12.4.3) which characterises $T_pW_2(\Omega)$ as a closed subset of a separable metric space.

However, it should be noted that the measurability of $\log_p$ with respect to the $\sigma$-algebra generated by open balls on $T_pM$ is enough for the results we present next. Indeed, statements presented below require only the Borel-measurability of maps of the form
$$x \mapsto \langle \log_p(x), u \rangle_p,$$
for some fixed $p \in M$ and $u \in T_pM$, which follows from this weaker measurability of $\log_p$.

Note finally that the choice of a measurable log is in principle not unique. However, all results we will mention can be shown to hold independently of its choice.

Next is the first key result for the proof of Theorem 1.1.

Lemma 2.7. Let $(M,d)$ be a Polish geodesic space with lower bounded curvature in the sense of Definition 2.2. Let $\mu \in \mathcal{P}_2(M)$ and suppose it admits a barycenter $x^* \in M$. Then for all $u \in T_{x^*}M$,
$$\int_M \langle \log_{x^*}(x), u \rangle_{x^*}\, \mathrm{d}\mu(x) = 0.$$

Lemma 2.7 follows by combining Le Gouic et al. (2019, Theorem 7) and Le Gouic (2020, Corollary 2).
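The first-order condition of Lemma 2.7 can be illustrated on the unit sphere in $\mathbb{R}^3$, a space with curvature lower bounded by 1, where the logarithmic map has a closed form. The following is a hedged sketch rather than the paper's construction: it assumes the sample is concentrated in a small cap, so that the fixed-point iteration $x \leftarrow \exp_x(\text{mean of } \log_x(x_i))$ is expected to converge to the barycenter of the empirical measure. All helper names are illustrative.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def normalize(u):
    n = norm(u)
    return [a / n for a in u]

def sphere_log(p, x):
    """Tangent vector at p pointing towards x, with length d(p, x) on the unit sphere."""
    c = dot(p, x)
    w = [xi - c * pi for xi, pi in zip(x, p)]  # part of x orthogonal to p
    nw = norm(w)
    if nw < 1e-15:
        return [0.0] * len(p)
    theta = math.atan2(nw, c)  # geodesic distance, accurate for small angles
    return [theta * wi / nw for wi in w]

def sphere_exp(p, v):
    """Point reached from p along the geodesic with initial velocity v."""
    nv = norm(v)
    if nv < 1e-15:
        return list(p)
    return normalize([math.cos(nv) * pi + math.sin(nv) * vi / nv for pi, vi in zip(p, v)])

def mean_log(p, points):
    """Average of log_p(x) over the sample: the integral in Lemma 2.7, as a vector."""
    logs = [sphere_log(p, x) for x in points]
    return [sum(l[k] for l in logs) / len(points) for k in range(len(p))]

def barycenter_candidate(points, iterations=100):
    """Fixed-point iteration; assumed to reach the barycenter for concentrated data."""
    p = normalize([sum(x[k] for x in points) for k in range(3)])
    for _ in range(iterations):
        p = sphere_exp(p, mean_log(p, points))
    return p

random.seed(0)
cloud = [normalize([random.uniform(-0.2, 0.2), random.uniform(-0.2, 0.2), 1.0])
         for _ in range(50)]
x_star = barycenter_candidate(cloud)
residual = mean_log(x_star, cloud)
print(norm(residual))
```

Since the tangent cone at a point of the sphere is the usual tangent plane, $\langle \cdot, \cdot \rangle_{x^*}$ is the ambient inner product, and the vanishing of `residual` is equivalent to the vanishing of the integral for every direction $u$.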

Suppose $(M,d)$ is a geodesic space with lower bounded curvature in the sense of Definition 2.2.

A function $f : M \to \mathbb{R}$ is called (geodesically) $\alpha$-concave, for some $\alpha \in \mathbb{R}$, if for any geodesic $\gamma : [0,1] \to M$, the map
$$t \in [0,1] \mapsto f(\gamma(t)) - \frac{\alpha}{2}\, d^2(\gamma(0), \gamma(t))$$
is concave. Hence, $f$ is $\alpha$-concave iff $(-f)$ is $(-\alpha)$-convex. An $\alpha$-concave function is sometimes called a semi-concave function.

A function $f : M \to \mathbb{R}$ is called locally Lipschitz at $p$ if there exists a constant $\lambda > 0$ such that
$$|f(x) - f(y)| \le \lambda\, d(x,y)$$
holds for all $x, y$ in some neighborhood of $p$. We denote $\mathrm{Lip}_p(f)$ the smallest such constant.

Let $f : M \to \mathbb{R}$ be an $\alpha$-concave function, locally Lipschitz at $p \in M$. Let $d_pf : T'_pM \to \mathbb{R}$ be defined by
$$d_pf(\dot\gamma) := \lim_{t \to 0} \frac{f(\gamma(t)) - f(p)}{t}. \tag{2.6}$$

Lemma 2.8. Let $\alpha \in \mathbb{R}$ and $f : M \to \mathbb{R}$ be an $\alpha$-concave function, locally Lipschitz at $p \in M$. Then the limit in (2.6) is well defined. For any geodesic $\gamma : [0,\tau] \to M$ issuing from $p$, this limit can be written
$$\lim_{t \to 0} \frac{f(\gamma(t)) - f(p)}{t} = \sup_{t \in (0,\tau]} \left\{ \frac{f(\gamma(t)) - f(p)}{t} - \frac{\alpha t}{2}\, |\gamma'|^2 \right\}, \tag{2.7}$$
and does not depend on the representative $\gamma$ of $\dot\gamma$. Furthermore, the map $d_pf : T'_pM \to \mathbb{R}$ admits a unique $\mathrm{Lip}_p(f)$-Lipschitz extension to $T_pM$, which we also denote $d_pf$, and call the differential of $f$ at $p$. Finally, $d_pf : T_pM \to \mathbb{R}$ is positively homogeneous, i.e., satisfies, for all $\lambda \ge 0$ and all $v \in T_pM$,
$$d_pf(\lambda v) = \lambda\, d_pf(v).$$

We prove Lemma 2.8 in the appendix.

Given an $\alpha$-concave function $f : M \to \mathbb{R}$, we call gradient of $f$ at $p$ any element $g \in T_pM$ such that, for all $v \in T_pM$,
$$d_pf(v) \le \langle g, v \rangle_p \quad \text{and} \quad d_pf(g) = \|g\|_p^2.$$
The existence of gradients for $\alpha$-concave functions, in spaces with lower bounded curvature, is the second essential result for the proof of Theorem 1.1.

Lemma 2.9 (Alexander et al., 2019, Theorem 11.4.2). Let $(M,d)$ be a geodesic space with lower bounded curvature. Let $\alpha \in \mathbb{R}$ and $f : M \to \mathbb{R}$ be an $\alpha$-concave function, locally Lipschitz at $p$. Then there exists a unique gradient of $f$ at $p$, denoted $\nabla f(p)$.

We include the proof of Lemma 2.9 in the appendix.
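To make the two defining properties of a gradient concrete, consider the Euclidean plane and the concave, non-smooth function $f(x) = \min(\langle a, x \rangle, \langle b, x \rangle)$ at $p = 0$. In this special case (an illustration, not taken from the paper) the differential is $d_0f(v) = \min(\langle a, v \rangle, \langle b, v \rangle)$ and the element of least norm on the segment $[a,b]$ satisfies both conditions $d_0f(v) \le \langle g, v \rangle$ and $d_0f(g) = \|g\|^2$. The vectors `A`, `B` and the helper names below are arbitrary choices made for the sketch.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = (2.0, 0.5)
B = (-0.5, 1.5)

def f(x):
    """Concave, piecewise-linear function on R^2; non-smooth where <A, x> = <B, x>."""
    return min(dot(A, x), dot(B, x))

def differential_at_origin(v, t=1e-6):
    """One-sided difference quotient approximating d_p f(v) at p = 0."""
    return (f([t * c for c in v]) - f([0.0, 0.0])) / t

def min_norm_on_segment(a, b):
    """Point of least Euclidean norm on the segment [a, b]."""
    diff = [ai - bi for ai, bi in zip(a, b)]
    lam = -dot(b, diff) / dot(diff, diff)  # unconstrained minimizer of |b + lam * diff|^2
    lam = max(0.0, min(1.0, lam))          # clip to stay on the segment
    return [bi + lam * di for bi, di in zip(b, diff)]

g = min_norm_on_segment(A, B)  # candidate gradient of f at the origin

random.seed(0)
directions = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(200)]
worst_violation = max(differential_at_origin(v) - dot(g, v) for v in directions)
print(g, differential_at_origin(g), dot(g, g), worst_violation)
```

Because $f$ is positively homogeneous, the difference quotient is exact up to rounding, so the check does not depend on the step `t`.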

We are now in position to prove the main result of the paper.

For all $x \in M$, let $\gamma_x : [0,1] \to M$ be a geodesic such that $\gamma_x(0) = x^*$ and $\gamma_x(1) = x$. Suppose $(\gamma_x)_{x \in M}$ is chosen as indicated in Remark 2.6, for $p = x^*$, and denote $\log_{x^*} : M \to T_{x^*}M$ the corresponding logarithmic map at $x^*$. By $\alpha$-convexity of $f$, we deduce that, for all $t \in (0,1]$ and all $x \in M$,
$$f(\gamma_x(t)) \le (1-t) f(x^*) + t f(x) - \frac{\alpha}{2}\, t(1-t)\, d^2(x^*, x).$$
Rearranging terms, we get
$$f(x^*) \le f(x) - \frac{f(\gamma_x(t)) - f(x^*)}{t} - \frac{\alpha}{2}(1-t)\, d^2(x^*, x) = f(x) + \frac{(-f)(\gamma_x(t)) - (-f)(x^*)}{t} - \frac{\alpha}{2}(1-t)\, d^2(x^*, x).$$
Notice that $(-f) : M \to \mathbb{R}$ is $(-\alpha)$-concave. Hence, taking the limit $t \to 0$ and using that $\dot\gamma_x = \log_{x^*}(x)$, we obtain that
$$f(x^*) \le f(x) + d_{x^*}(-f)(\log_{x^*}(x)) - \frac{\alpha}{2}\, d^2(x^*, x).$$
By definition of the gradient $\nabla(-f)(x^*)$, which exists by Lemma 2.9, we deduce that
$$f(x^*) \le f(x) + \langle \log_{x^*}(x), \nabla(-f)(x^*) \rangle_{x^*} - \frac{\alpha}{2}\, d^2(x^*, x).$$
The statement of Theorem 1.1 now follows by integrating both sides with respect to $\mu$ and using Lemma 2.7.

A Appendix

Proof of Lemma 2.4

Direct computations reveal that, for all $\gamma, \sigma \in \Gamma_p$,
$$\frac{d^2(\gamma(t), \sigma(t))}{t^2} = |\gamma'|^2 + |\sigma'|^2 - 2|\gamma'||\sigma'| \cos \measuredangle^0_p(\gamma(t), \sigma(t)),$$
provided $t > 0$ is such that $\gamma(t)$ and $\sigma(t)$ are defined. Taking the limit $t \to 0$, it follows by definition of angles that
$$|\gamma - \sigma|_p^2 = |\gamma'|^2 + |\sigma'|^2 - 2|\gamma'||\sigma'| \cos \measuredangle_p(\gamma, \sigma) \tag{A.1}$$
$$= (|\gamma'| - |\sigma'|)^2 + 2|\gamma'||\sigma'| (1 - \cos \measuredangle_p(\gamma, \sigma)). \tag{A.2}$$
Identity (A.2) shows that $|\gamma - \sigma|_p$ is indeed well defined and that $\dot\gamma = \dot\sigma$ iff $|\gamma'| = |\sigma'|$ and $\vec\gamma = \vec\sigma$. Hence, the map (2.5) is defined without ambiguity, as it does not depend on particular representatives, and is injective. In addition, expression (A.1) translates precisely as
$$|\dot\gamma - \dot\sigma|_p = d_c([\vec\gamma, |\gamma'|], [\vec\sigma, |\sigma'|]),$$
where $d_c$ denotes the metric in $\mathrm{cone}(\Sigma'_p)$. This proves that the map (2.5) is distance preserving. Finally, this map is surjective since, for any $s > 0$ and any $\gamma : [0,\tau] \to M$ in $\Gamma_p$, $[\vec\gamma, s] \in \mathrm{cone}(\Sigma'_p)$ is the image of $\dot\gamma_\alpha$ where $\gamma_\alpha : [0, \tau/\alpha] \to M$ is defined by $\gamma_\alpha(t) = \gamma(\alpha t)$ with $\alpha := s/|\gamma'|$, so that $|\gamma'_\alpha| = \alpha |\gamma'| = s$. To show that the completion of $T'_pM$ is isometric to $T_pM$ it remains to observe that, more generally, the cone over the completion of a metric space is isometric to the completion of the cone over that space.

To prove this statement, consider a metric space $(\Omega, d)$ and denote $(\bar\Omega, \bar d)$ its completion. We denote $\bar x_n$ the equivalence class of a Cauchy sequence $x_n$ for the equivalence relation $\lim_n d(x_n, y_n) = 0$ and understand $\bar d$ as
$$\bar d(\bar x_n, \bar y_n) = \lim_n d(x_n, y_n).$$
Since
$$d_c^2([p,s],[q,t]) = (s-t)^2 + 2st(1 - \cos d(p,q)),$$
we see that a sequence $[p_n, s_n]$ is a Cauchy sequence in $\mathrm{cone}(\Omega)$ iff ($s_n \to 0$) or ($s_n \to s_\infty > 0$ and $p_n$ is a Cauchy sequence in $\Omega$). As a result, the map $\phi : \mathrm{cone}(\bar\Omega) \to \overline{\mathrm{cone}(\Omega)}$ defined by
$$\phi([\bar p_n, s]) := \begin{cases} \overline{[p_n, s]} & \text{if } s > 0, \\ 0 & \text{if } s = 0, \end{cases}$$
where $0$ denotes the tip, is well defined and distance preserving when both $\bar\Omega$ and $\overline{\mathrm{cone}(\Omega)}$ are equipped with the completion metric. One checks finally that it is invertible with inverse given by $\phi^{-1}(\overline{[p_n, s_n]}) = 0$ if $s_n \to 0$ and $\phi^{-1}(\overline{[p_n, s_n]}) = [\bar p_n, s_\infty]$ if $s_n \to s_\infty > 0$.

Proof of Lemma 2.5

We simply report and detail the proofs of Lemmas 3.3 and 4.2 in Ohta (2012), emphasizing that they actually do not require the metric space $M$ to be proper, an assumption imposed in Ohta (2012) for other reasons.

Suppose that $(M,d)$ is a Polish geodesic space and has lower bounded curvature. Introduce the set $G_p \subset \Gamma_p$ of all geodesics $\gamma : [0,1] \to M$ issuing from $p$. Equipped with the supremum metric
$$d_\infty(\gamma, \sigma) := \sup_{t \in [0,1]} d(\gamma(t), \sigma(t)),$$
$G_p$ is a Polish metric space.

For $t \in [0,1]$, denote $e_t : G_p \to M$ the evaluation map defined by $e_t(\gamma) := \gamma(t)$. This evaluation map is (Lipschitz) continuous. Hence, for all $x \in M$, the set $e_1^{-1}(x) \subset G_p$ is closed and non-empty. Furthermore, for any open set $U \subset G_p$, the set
$$\{x \in M : e_1^{-1}(x) \cap U \ne \emptyset\} = e_1(U)$$
is a Borel set.

Indeed, fix a non-empty open set $U \subset G_p$. For $\delta > 0$, denote
$$A_\delta := G_p \setminus \{\gamma \in G_p : d_\infty(\gamma, G_p \setminus U) < \delta\}.$$
The set $A_\delta$ is closed in $G_p$, satisfies $A_\delta \subset A_{\delta'}$ iff $\delta' \le \delta$ and is such that $\cup_{\delta > 0} A_\delta = U$. Given $\varepsilon, \delta > 0$, consider the set $M_{\varepsilon,\delta} \subset M$ of all points $x$ for which there exists some $\sigma : [0,1] \to M$ satisfying $\sigma(0) = p$, $\sigma(1) = x$ and
$$d_\infty(\sigma, A_\delta) < \varepsilon.$$
The set $M_{\varepsilon,\delta}$ is open in $M$ and satisfies $M_{\varepsilon,\delta} \subset M_{\varepsilon',\delta}$ iff $\varepsilon \le \varepsilon'$. Hence, the set $e_1(A_\delta) = \cap_{\varepsilon > 0} M_{\varepsilon,\delta}$ is a Borel set in $M$ since the intersection can be restricted to any countable sequence $\varepsilon_n \downarrow 0$. Then $e_1(U) = \cup_{\delta > 0} e_1(A_\delta)$ is a Borel set in $M$ as well since, again, the union can be restricted to any countable sequence $\delta_n \downarrow 0$. A measurable selection theorem (e.g., that of Kuratowski and Ryll-Nardzewski) then provides a measurable map $g : M \to G_p$ such that, for all $x \in M$, $g(x) \in e_1^{-1}(x)$. We denote
$$\gamma_x := g(x).$$

It remains to show that the map $\theta_p : \gamma \in G_p \mapsto \dot\gamma \in T_pM$ is measurable when $T_pM$ is equipped with the $\sigma$-algebra generated by open balls. To prove this fact, fix $v \in T_pM$ and $r > 0$. By density of $T'_pM$ in $T_pM$, there exists $\gamma_n \in G_p$ such that $\dot\gamma_n \to v$. Therefore,
$$\theta_p^{-1}(B(v,r)) = \{\gamma \in G_p : \|\dot\gamma - v\|_p < r\}$$
$$= \bigcup_{n \ge 1} \bigcap_{m \ge n} \{\gamma \in G_p : \|\dot\gamma - \dot\gamma_m\|_p < r\}$$
$$= \bigcup_{n \ge 1} \bigcap_{m \ge n} \{\gamma \in G_p : \lim_{t \to 0} t^{-1} d(\gamma(t), \gamma_m(t)) < r\}$$
$$= \bigcup_{n \ge 1} \bigcap_{m \ge n} \bigcup_{\ell \ge 1} \bigcap_{k \ge \ell} \{\gamma \in G_p : d(\gamma(1/k), \gamma_m(1/k)) < r/k\}.$$
But since, for fixed $k, m \ge 1$, the set $\{\gamma \in G_p : d(\gamma(1/k), \gamma_m(1/k)) < r/k\}$ is open in $(G_p, d_\infty)$, we see that $\theta_p^{-1}(B(v,r))$ is a Borel subset of $(G_p, d_\infty)$, which completes the proof.

Proof of Lemma 2.8

Let $\gamma : [0,\tau] \to M$ be any geodesic issuing from $p$ and let $\sigma : [0,1] \to M$ be the reparametrization of $\gamma$ defined by $\sigma(t) = \gamma(\tau t)$. Then, for all $t \in (0,1]$,
$$\frac{f(\gamma(\tau t)) - f(p)}{\tau t} = \frac{1}{\tau} \left\{ \frac{f(\sigma(t)) - f(p)}{t} - \frac{\alpha t}{2} |\sigma'|^2 \right\} + \frac{\alpha t}{2\tau} |\sigma'|^2 = \frac{1}{\tau} \left\{ \frac{g(t) - g(0)}{t} \right\} + \frac{\alpha t}{2\tau} |\sigma'|^2, \tag{A.3}$$
where $g : [0,1] \to \mathbb{R}$ is defined by
$$g(t) = f(\sigma(t)) - \frac{\alpha t^2}{2} |\sigma'|^2 = f(\sigma(t)) - \frac{\alpha}{2}\, d^2(p, \sigma(t)).$$
The $\alpha$-concavity of $f$ implies that $g$ is concave on $[0,1]$. Hence, the right derivative of $g$ at 0 is well defined and satisfies
$$\lim_{t \downarrow 0} \frac{g(t) - g(0)}{t} = \sup_{t \in (0,1]} \frac{g(t) - g(0)}{t} = \sup_{t \in (0,1]} \left\{ \frac{f(\sigma(t)) - f(p)}{t} - \frac{\alpha t}{2} |\sigma'|^2 \right\}. \tag{A.4}$$
Hence, combining (A.3), (A.4) and noticing that $|\sigma'| = \tau |\gamma'|$, we obtain
$$\lim_{t \downarrow 0} \frac{f(\gamma(t)) - f(p)}{t} = \lim_{t \downarrow 0} \frac{f(\gamma(\tau t)) - f(p)}{\tau t}$$
$$= \frac{1}{\tau} \sup_{t \in (0,1]} \left\{ \frac{f(\sigma(t)) - f(p)}{t} - \frac{\alpha t}{2} |\sigma'|^2 \right\}$$
$$= \frac{1}{\tau} \sup_{t \in (0,1]} \left\{ \frac{f(\gamma(\tau t)) - f(p)}{t} - \frac{\alpha \tau^2 t}{2} |\gamma'|^2 \right\}$$
$$= \sup_{t \in (0,\tau]} \left\{ \frac{f(\gamma(t)) - f(p)}{t} - \frac{\alpha t}{2} |\gamma'|^2 \right\},$$
which proves (2.7). Now, for any $\dot\gamma, \dot\sigma \in T'_pM$, the local Lipschitz property of $f$ implies that
$$|d_pf(\dot\gamma) - d_pf(\dot\sigma)| = \lim_{t \to 0} \frac{|f(\gamma(t)) - f(\sigma(t))|}{t} \le \mathrm{Lip}_p(f) \lim_{t \to 0} \frac{d(\gamma(t), \sigma(t))}{t} = \mathrm{Lip}_p(f)\, \|\dot\gamma - \dot\sigma\|_p.$$
On the one hand, this inequality shows that the limit on the right hand side of (2.6) is independent of the chosen representative $\gamma$ of $\dot\gamma$. On the other hand, by density of $T'_pM$ in $T_pM$, it implies that $d_pf$ admits a unique $\mathrm{Lip}_p(f)$-Lipschitz extension to $T_pM$.

Finally, to prove $d_pf$ is positively homogeneous on $T_pM$ it is enough to show it on $T'_pM$ and conclude by continuity. But, for any geodesic $\gamma : [0,\tau] \to M$ issuing from $p$ and any $\lambda > 0$, the geodesic $\sigma : [0, \tau/\lambda] \to M$ defined by $\sigma(t) = \gamma(\lambda t)$ has speed $|\sigma'| = \lambda |\gamma'|$, hence satisfies $\dot\sigma = \lambda \dot\gamma$, and we see that
$$d_pf(\dot\sigma) = \lim_{t \to 0} \frac{f(\sigma(t)) - f(p)}{t} = \lambda \lim_{t \to 0} \frac{f(\gamma(\lambda t)) - f(p)}{\lambda t} = \lambda\, d_pf(\dot\gamma),$$
which completes the proof.

Proof of Lemma 2.9

The proof follows the lines devised in Alexander et al. (2019) with minor modifications. In particular, we use the following result, due to Lang and Schroeder (1997).

Lemma A.1 (Lang and Schroeder, 1997, Lemma A.4). Let $(M,d)$ be geodesic with lower bounded curvature, in the sense of Definition 2.2. Let $p \in M$ and $\gamma, \sigma : [0,1] \to M$ two geodesics issuing from $p$. For all $t \in (0,1]$, let $\delta_t : [0,1] \to M$ be a geodesic connecting $\gamma(t)$ to $\sigma(t)$. Introducing the mid-point $m_t = \delta_t(1/2)$ of $\gamma(t)$ and $\sigma(t)$, we have
$$4 \lim_{t \downarrow 0} \frac{d^2(p, m_t)}{t^2} = \|\dot\gamma\|_p^2 + 2\langle \dot\gamma, \dot\sigma \rangle_p + \|\dot\sigma\|_p^2.$$

We can now establish the following result.

Lemma A.2 (Alexander et al., 2019, Lemma 11.2.3). Let $(M,d)$ be geodesic with lower bounded curvature, in the sense of Definition 2.2. Suppose $f : M \to \mathbb{R}$ is locally Lipschitz at $p$ and $\alpha$-concave. Then for all $u, v \in T_pM$,
$$\sup_{w \in T_pM : \|w\|_p = 1} d_pf(w) \ge \frac{d_pf(u) + d_pf(v)}{\sqrt{\|u\|_p^2 + 2\langle u, v \rangle_p + \|v\|_p^2}}.$$

Proof of Lemma A.2.

By density of T (cid:48) p M in T p M and continuity of d p f , itis enough to prove the result for all u = ˙ γ ∈ T (cid:48) p M and v = ˙ σ ∈ T (cid:48) p M where γ, σ ∈ Γ p . For two such geodesics γ : [0 , → M and σ : [0 , → M , and all t ∈ (0 , δ t : [0 , → M be a geodesic connecting γ ( t ) to σ ( t ). Then, byconcavity of f ,2 f ( δ t (1 / ≥ f ( γ ( t )) + f ( σ ( t )) − α d ( γ ( t ) , σ ( t )) . (A.5)Observing that f ( γ ( t )) = f ( p ) + td p f ( ˙ γ ) + o( t ) , f ( σ ( t )) = f ( p ) + td p f ( ˙ σ ) + o( t ) , d ( γ ( t ) , σ ( t )) = t (cid:107) ˙ γ − ˙ σ (cid:107) p + o( t ) , we deduce from (A.5) thatlim inf t ↓ (cid:18) f ( δ t (1 / − f ( p ) t (cid:19) ≥ d p f ( ˙ γ ) + d p f ( ˙ σ )2 . Now, for all t ∈ (0 , ω t : [0 , t ] → M be a geodesic connecting p to δ t (1 / f ( δ t (1 / − f ( p ) t = f ( ω t ( t )) − f ( p ) t ≤ sup s ∈ (0 ,t ] (cid:26) f ( ω t ( s )) − f ( p ) s − αs | ω (cid:48) t | (cid:27) + αt | ω (cid:48) t | = d p f ( ˙ ω t ) + αt | ω (cid:48) t | = | ω (cid:48) t | (cid:18) d p f ( | ω (cid:48) t | − ˙ ω t ) + αt | ω (cid:48) t | (cid:19) . Since (cid:107)| ω (cid:48) t | − ˙ ω t (cid:107) p = 1, for all t ∈ (0 , | ω (cid:48) t | = (cid:107) ˙ γ (cid:107) p + 2 (cid:104) ˙ γ, ˙ σ (cid:105) p + (cid:107) ˙ σ (cid:107) p + ε ( t ) , where ε ( t ) → t →

0. But this follows by observing that,4 lim t ↓ | ω (cid:48) t | = 4 lim t ↓ d ( p, ω t ( t )) t = (cid:107) ˙ γ (cid:107) p + 2 (cid:104) ˙ γ, ˙ σ (cid:105) p + (cid:107) ˙ σ (cid:107) p , according to Lemma A.1.We are now in position to prove existence and uniqueness of gradientsfor α -concave functions. Proof of Lemma 2.9.

Let $w_n = [\xi_n, 1]\in T_pM$ be such that

$$\lim_{n\to+\infty} d_pf(w_n) = d_{\sup} := \sup_{w\in T_pM:\ \|w\|_p=1} d_pf(w).$$

Applying Lemma A.2, we get for all $m,n\ge 1$,

$$d_{\sup} \ge \frac{d_pf(w_n) + d_pf(w_m)}{\sqrt{2 + 2\langle w_n,w_m\rangle_p}} = \frac{d_pf(w_n) + d_pf(w_m)}{\sqrt{2 + 2\cos\angle_p(\xi_n,\xi_m)}}.$$

Letting $n,m\to+\infty$, this implies that $(\xi_n)$ is a Cauchy sequence in the complete space $\Sigma_p$, and hence converges towards some $\xi_\infty\in\Sigma_p$. Denoting $w_\infty = [\xi_\infty, 1]$, we have $d_pf(w_\infty) = d_{\sup}$ by continuity. Now, denote $g := d_{\sup}\,w_\infty$, and select an arbitrary $w\in T_pM$. Then, applying Lemma A.2 to $u = w_\infty$ and $v = \varepsilon w$, we get

$$d_{\sup} \ge \frac{d_{\sup} + \varepsilon\,d_pf(w)}{\sqrt{1 + 2\varepsilon\langle w_\infty,w\rangle_p + \varepsilon^2\|w\|_p^2}} = d_{\sup} + \varepsilon\big(d_pf(w) - d_{\sup}\langle w_\infty,w\rangle_p\big) + o(\varepsilon).$$

Subtracting $d_{\sup}$, dividing by $\varepsilon$, letting $\varepsilon\downarrow 0$ and recalling that $g = d_{\sup}\,w_\infty$, we obtain $d_pf(w)\le\langle g,w\rangle_p$. Since it is clear by construction that $d_pf(g) = \|g\|_p^2$, this proves the existence of a gradient. To prove uniqueness, consider another $g'$ satisfying the same properties. Then, we have

$$\|g'\|_p^2 = d_pf(g') \le \langle g,g'\rangle_p \quad\text{and}\quad \|g\|_p^2 = d_pf(g) \le \langle g,g'\rangle_p.$$

As a result,

$$0 \le \|g'-g\|_p^2 = \|g'\|_p^2 - 2\langle g,g'\rangle_p + \|g\|_p^2 \le 0,$$

which imposes $g = g'$ and completes the proof.

Acknowledgments. I would like to thank Thibaut Le Gouic and Philippe Rigollet for discussions on metric geometry and its applications, which sparked my interest in the problem addressed in this paper.
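Remark (Euclidean sanity check). The following worked computation is an illustration only and is not part of the original argument. It specializes the construction in the proof of Lemma 2.9 to $M=\mathbb{R}^n$, under the assumptions that $f$ is differentiable at $p$ with $\nabla f(p)\neq 0$, so that $T_pM=\mathbb{R}^n$ and $d_pf(w)=\langle\nabla f(p),w\rangle$. The form of the Lemma A.2 inequality written below is the one inferred from how the lemma is applied in the proof.

```latex
% Euclidean specialization: M = \mathbb{R}^n, f differentiable at p, \nabla f(p) \neq 0.
% Here T_pM = \mathbb{R}^n and d_pf(w) = \langle \nabla f(p), w \rangle.
\begin{align*}
d_{\sup} &= \sup_{\|w\|=1} \langle \nabla f(p), w \rangle = \|\nabla f(p)\|,
& w_\infty &= \frac{\nabla f(p)}{\|\nabla f(p)\|}, \\
g &= d_{\sup}\, w_\infty = \nabla f(p),
& d_pf(g) &= \|\nabla f(p)\|^2 = \|g\|^2 .
\end{align*}
% The inequality applied in the proof reduces to Cauchy-Schwarz:
\[
d_pf(u) + d_pf(v) = \langle \nabla f(p), u+v \rangle
\le \|\nabla f(p)\|\,\|u+v\|
= d_{\sup}\sqrt{\|u\|^2 + 2\langle u,v\rangle + \|v\|^2}.
\]
% Concrete instance: f(x) = -\tfrac12\|x\|^2 on \mathbb{R}^2 (a concave function), p = (1,0):
% \nabla f(p) = (-1,0), \quad d_{\sup} = 1, \quad w_\infty = g = (-1,0), \quad d_pf(g) = 1 = \|g\|^2.
```

In this setting the two defining properties of the gradient, $d_pf(w)\le\langle g,w\rangle$ for all $w$ and $d_pf(g)=\|g\|^2$, hold with equality in the first one, which is consistent with $g$ being the usual gradient $\nabla f(p)$.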

References

B. Afsari. Riemannian $L^p$ center of mass: existence, uniqueness, and convexity. Proc. Amer. Math. Soc., 139(2):655–673, 2011. ISSN 0002-9939. doi: 10.1090/S0002-9939-2010-10541-5. URL https://doi.org/10.1090/S0002-9939-2010-10541-5.

M. Agueh and G. Carlier. Barycenters in the Wasserstein space. SIAM J. Math. Anal., 43(2):904–924, 2011. ISSN 0036-1410. doi: 10.1137/100805741. URL https://doi.org/10.1137/100805741.

A. Ahidar-Coutrix, T. Le Gouic, and Q. Paris. Convergence rates for empirical barycenters in metric spaces: curvature, convexity and extendable geodesics. Probability Theory and Related Fields, Oct 2019. ISSN 1432-2064. doi: 10.1007/s00440-019-00950-0. URL https://doi.org/10.1007/s00440-019-00950-0.

S. Alexander, V. Kapovitch, and A. Petrunin. Alexandrov geometry: preliminary version no. 1. Book in preparation, Mar. 2019. URL http://arxiv.org/abs/1903.08539. arXiv:1903.08539.

L. Ambrosio, N. Gigli, and G. Savaré. Gradient flows in metric spaces and in the space of probability measures. Lectures in Mathematics ETH Zürich. Birkhäuser Verlag, Basel, second edition, 2008. ISBN 978-3-7643-8721-1.

V. I. Bogachev. Measure theory. Vol. I, II. Springer-Verlag, Berlin, 2007. ISBN 978-3-540-34513-8; 3-540-34513-2. doi: 10.1007/978-3-540-34514-5. URL https://doi.org/10.1007/978-3-540-34514-5.

D. Burago, Y. Burago, and S. Ivanov. A course in metric geometry. American Mathematical Society, 2001.

Y. Burago, M. Gromov, and G. Perel'man. A.D. Alexandrov spaces with curvature bounded below. Russian Mathematical Surveys, 47(2), 1992.

M. Émery and G. Mokobodzki. Sur le barycentre d'une probabilité dans une variété. In Séminaire de Probabilités, XXV, volume 1485 of Lecture Notes in Math., pages 220–233. Springer, Berlin, 1991. doi: 10.1007/BFb0100858. URL https://doi.org/10.1007/BFb0100858.

S. F. Huckemann and B. Eltzner. Data analysis on nonstandard spaces. WIREs Computational Statistics, n/a(n/a):e1526, 2020. doi: 10.1002/wics.1526. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/wics.1526.

W. S. Kendall. Probability, convexity, and harmonic maps with small image. I. Uniqueness and fine existence. Proc. London Math. Soc. (3), 61(2):371–406, 1990. ISSN 0024-6115. doi: 10.1112/plms/s3-61.2.371. URL https://doi.org/10.1112/plms/s3-61.2.371.

Y.-H. Kim and B. Pass. Wasserstein barycenters over Riemannian manifolds. Adv. Math., 307:640–683, 2017. ISSN 0001-8708. doi: 10.1016/j.aim.2016.11.026. URL https://doi.org/10.1016/j.aim.2016.11.026.

K. Kuwae. Jensen's inequality over CAT($\kappa$)-space with small diameter. In Potential theory and stochastics in Albac, volume 11 of Theta Ser. Adv. Math., pages 173–182. Theta, Bucharest, 2009.

K. Kuwae. Jensen's inequality on convex spaces. Calc. Var. Partial Differential Equations, 49(3-4):1359–1378, 2014. ISSN 0944-2669. doi: 10.1007/s00526-013-0625-5. URL https://doi.org/10.1007/s00526-013-0625-5.

U. Lang and V. Schroeder. Kirszbraun's theorem and metric spaces of bounded curvature. Geom. Funct. Anal., 7(3):535–560, 1997. ISSN 1016-443X. doi: 10.1007/s000390050018. URL https://doi.org/10.1007/s000390050018.

T. Le Gouic. A note on flatness of non separable tangent cone at a barycenter. C. R. Math. Acad. Sci. Paris, 358(4):489–495, 2020. ISSN 1631-073X. doi: 10.5802/crmath.66. URL https://doi.org/10.5802/crmath.66.

T. Le Gouic and J.-M. Loubes. Existence and consistency of Wasserstein barycenters. Probab. Theory Related Fields, 168(3-4):901–917, 2017. ISSN 0178-8051. doi: 10.1007/s00440-016-0727-z. URL https://doi.org/10.1007/s00440-016-0727-z.

T. Le Gouic, Q. Paris, P. Rigollet, and A. J. Stromme. Fast convergence of empirical barycenters in Alexandrov spaces and the Wasserstein space, 2019.

S.-I. Ohta. Barycenters in Alexandrov spaces of curvature bounded below. Advances in geometry, 14:571–587, 2012.

A. Petrunin. Semiconcave functions in Alexandrov's geometry. In Surveys in differential geometry. Vol. XI, volume 11 of Surv. Differ. Geom., pages 137–201. Int. Press, Somerville, MA, 2007. doi: 10.4310/SDG.2006.v11.n1.a6. URL https://doi.org/10.4310/SDG.2006.v11.n1.a6.

C. Plaut. Metric spaces of curvature $\ge k$. Handbook of Geometric Topology, pages 819–898, 2002.

K.-T. Sturm. Probability measures on metric spaces of nonpositive curvature. In Heat kernels and analysis on manifolds, graphs, and metric spaces (Paris, 2002), volume 338 of Contemp. Math., pages 357–390. Amer. Math. Soc., Providence, RI, 2003. doi: 10.1090/conm/338/06080. URL https://doi.org/10.1090/conm/338/06080.

K.-T. Sturm. On the geometry of metric measure spaces. I. Acta Math., 196(1):65–131, 2006. ISSN 0001-5962. doi: 10.1007/s11511-006-0002-8. URL https://doi.org/10.1007/s11511-006-0002-8.

T. Yokota. A rigidity theorem in Alexandrov spaces with lower curvature bound. Math. Ann., 353(2):305–331, 2012. ISSN 0025-5831. doi: 10.1007/s00208-011-0686-8. URL https://doi.org/10.1007/s00208-011-0686-8.

T. Yokota. Convex functions and barycenter on CAT(1)-spaces of small radii. J. Math. Soc. Japan, 68(3):1297–1323, 2016. ISSN 0025-5645. doi: 10.2969/jmsj/06831297. URL https://doi.org/10.2969/jmsj/06831297.