Sectional Curvature in terms of the Cometric, with Applications to the Riemannian Manifolds of Landmarks
Mario Micheli, Department of Mathematics, Univ. of California, Los Angeles, 520 Portola Plaza, Los Angeles, CA 90095, USA
Peter W. Michor, Fakultät für Mathematik, Universität Wien, Nordbergstrasse 15, A-1090 Wien, Austria
David Mumford, Div. of Applied Mathematics, Brown University, 182 George Street, Providence, RI 02012, USA
Keywords: shape spaces, landmark points, cometric, sectional curvature.
Acknowledgements: MM was supported by ONR grant N00014-09-1-0256, PWM was supported by FWF-project 21030, DM was supported by NSF grant DMS-0704213, and all authors were supported by NSF grant DMS-0456253 (Focused Research Group: The geometry, mechanics, and statistics of the infinite dimensional shape manifolds). MM would like to thank Andrea Bertozzi of UCLA for her continuous advice and support.
Abstract
This paper deals with the computation of sectional curvature for the manifolds of $N$ landmarks (or feature points) in $D$ dimensions, endowed with the Riemannian metric induced by the group action of diffeomorphisms. The inverse of the metric tensor for these manifolds (i.e. the cometric), when written in coordinates, is such that each of its elements depends on at most $2D$ of the $ND$ coordinates. This makes the matrices of partial derivatives of the cometric very sparse in nature, thus suggesting solving the highly non-trivial problem of developing a formula that expresses sectional curvature in terms of the cometric and its first and second partial derivatives (we call this Mario's formula). We apply this formula to the manifolds of landmarks, and in particular we fully explore the case of geodesics on which only two points have non-zero momenta and compute the sectional curvatures of 2-planes spanned by the tangents to such geodesics. The latter example gives insight into the geometry of the full manifolds of landmarks.

In the past few years there has been a growing interest, in diverse scientific communities, in modeling shape spaces as Riemannian manifolds. The study of shapes and their similarities is in fact central in computer vision and related fields (e.g. for object recognition, target detection and tracking, classification of biometric data, and automated medical diagnostics), in that it allows one to recognize and classify objects from their representation. In particular, a distance function between shapes should express the meaning of similarity between them for the application that one has in mind.
One of the most mathematically sound and tractable methods for defining a distance on a manifold is to measure infinitesimal distance by a Riemannian structure and global distance by the corresponding lengths of geodesics.

Among the several ways of endowing a shape manifold with a Riemannian structure (see, for example, [17, 18, 20, 25, 28, 30]), one of the most natural is inducing it through the action of the infinite-dimensional Lie group of diffeomorphisms of the manifold ambient to the shapes being studied. One starts by putting a right-invariant metric on this diffeomorphism group, as described in [27]. Then, fixing a base point on the shape manifold, one gets a surjective map from the group of diffeomorphisms to the shape manifold. The right-invariance of the metric "upstairs" implies that we get a quotient metric on the shape manifold for which this map is a submersion (see below). This approach can be used to define a metric on many shape spaces, such as the manifolds of curves [12, 26], surfaces [33], scalar images [4], vector fields [6], diffusion tensor images [5], measures [11, 13], and labeled landmarks (or "feature points") [14, 15]. The actual geometry of these Riemannian manifolds has remained almost completely unknown until very recently, when certain fundamental questions about their curvature have started being addressed [25, 26, 32].

Among all shape manifolds, the simplest case of the manifold of landmarks in Euclidean space plays a central role. This is defined as

$$\mathcal{L}^N(\mathbb{R}^D) := \big\{ (P^1, \dots, P^N) \;\big|\; P^a \in \mathbb{R}^D,\ a = 1, \dots, N \big\}$$

(typically we consider landmarks $P^a$, $a = 1, \dots, N$, that do not coincide pairwise). It is finite-dimensional, albeit with high dimension $n = ND$, where $N$ is the number of landmarks and $D$ is the dimension of the ambient space in which they live (e.g. $D = 2$ for the plane).
Therefore its metric tensor may be written, in any set of coordinates, as a finite-dimensional matrix. This space is important in the study of all other shape manifolds because of a simple property of submersions: for any submersive map $f : X \to Y$, all geodesics on $Y$ lift to geodesics on $X$ and give, in fact, all geodesics on $X$ which at one and hence all points are perpendicular to the fiber of $f$ (so-called "horizontal" geodesics). This means that geodesics on the space of landmarks lift to geodesics on the diffeomorphism group and then project down to geodesics on all other shape manifolds associated to the same underlying ambient space $\mathbb{R}^D$. Thus geodesics of curves, surfaces, etc. in $\mathbb{R}^D$ can be derived from geodesics of landmark points. Technically, these are the geodesics on these shape manifolds whose momentum has finite support. This efficient way of constructing geodesics on many shape manifolds has been exploited in much recent work, e.g. [2, 8, 29].

What sort of metrics arise from submersions? Mathematically, the key point is that the inverse of the metric tensor, the inner product on the cotangent space (hence called the co-metric), behaves simply in a submersion. Namely, for a submersion $f : X \to Y$, the co-metric on $Y$ is simply the restriction of the co-metric on $X$ to the pull-back 1-forms. Therefore, for the space of landmarks the cometric has a simple structure. In our case, we will see that each of its elements depends on at most $2D$ of the $ND$ coordinates. Hence the matrices obtained by taking first and second partial derivatives of the cometric have a very sparse structure, that is, most of their entries are zero. This suggests that for the purpose of calculating curvature (rather than following the "classical" path of computing first and second partial derivatives of the metric tensor itself, the Christoffel symbols, et cetera) it would be convenient to write sectional curvature in terms of the inverse of the metric tensor and its derivatives.
We have solved the highly non-trivial problem of developing a formula (that we call "Mario's formula") precisely for this purpose: for a given pair of cotangent vectors this formula expresses the corresponding sectional curvature as a function of the cometric and its first and second partial derivatives, except for one term which requires the metric (but not its derivatives). This formula is closely connected to O'Neill's formula which, for any submersion as above, connects the curvatures of $X$ and $Y$. Subtracting Mario's formula on $X$ and $Y$ gives O'Neill's as a corollary.

This paper deals with the problem of computing geodesics and sectional curvature for landmark spaces, and is based on results from the thesis of the first author [23]. The paper is organized as follows. We first give a few more details about the manifold of landmarks, and describe the metric induced by the action of the Lie group of diffeomorphisms. We then give a proof for the general formula expressing sectional curvature in terms of the cometric. This formula is used in the following section to compute the sectional curvature for the manifold of labeled landmarks. In the last section, we analyze the case of geodesics on which only two points have non-zero momenta and the sectional curvatures of 2-planes made up of the tangents to such geodesics. In this case, both the geodesics and the curvature are much simpler and give insight into the geometry of the full landmark space.

In this section we briefly summarize how the shape space of landmarks can be given the structure of a Riemannian manifold. We refer the reader to [27, 31] for the general framework on how to endow generic shape manifolds with a Riemannian metric via the action of Lie groups of diffeomorphisms.
We will first define a distance function $d : \mathcal{L}^N(\mathbb{R}^D) \times \mathcal{L}^N(\mathbb{R}^D) \to \mathbb{R}^+$ on landmark space which will then turn out to be the geodesic distance with respect to a Riemannian metric. Let $Q$ be the set of differentiable landmark paths, that is:

$$Q := \big\{ q = (q^1, \dots, q^N) : [0,1] \to \mathcal{L}^N(\mathbb{R}^D) \;\big|\; q^a \in C^1\big([0,1], \mathbb{R}^D\big),\ a = 1, \dots, N \big\}.$$

Following [31, Chapters 9, 12, 13], a Hilbert space $\big(V, \langle\,,\rangle_V\big)$ of vector fields on Euclidean space (which we consider as functions $\mathbb{R}^D \to \mathbb{R}^D$) is said to be admissible if (i) $V$ is continuously embedded in the space of $C^1$-mappings $\mathbb{R}^D \to \mathbb{R}^D$ which are bounded together with their derivatives, and (ii) $V$ is large enough: for any positive integer $M$, if $x_1, \dots, x_M \in \mathbb{R}^D$ and $\alpha_1, \dots, \alpha_M \in \mathbb{R}^D$ are such that, for all $u \in V$, $\sum_{a=1}^M \langle \alpha_a, u(x_a) \rangle_{\mathbb{R}^D} = 0$, then $\alpha_1 = \dots = \alpha_M = 0$.

The space $(V, \langle\,,\rangle_V)$ admits a reproducing kernel: that is, for each $\alpha, x \in \mathbb{R}^D$ there exists $K^\alpha_x \in V$ with $\langle K^\alpha_x, f \rangle_V = \langle \alpha, f(x) \rangle_{\mathbb{R}^D}$ for all $f \in V$. Further, $\langle K^\beta_y, K^\alpha_x \rangle_V = \langle \beta, K^\alpha_x(y) \rangle_{\mathbb{R}^D} = \langle \alpha, K^\beta_y(x) \rangle_{\mathbb{R}^D}$, which is a bilinear form in $(\alpha, \beta) \in (\mathbb{R}^D)^2$, thus given by a $D \times D$ matrix $K(x, y)$; the symmetry of the inner product implies that $K(y, x) = K(x, y)^T$ (where $T$ indicates the transpose). In this paper we shall assume that $K(x, y)$ is a multiple of the identity and is translation invariant: we then write $K(x, y)$ simply as $K(x - y)\, I_D$ (where $I_D$ is the $D \times D$ identity matrix); the scalar reproducing kernel $K : \mathbb{R}^D \to \mathbb{R}$ must be symmetric and positive definite (see [31]). There are also norms on vector fields $v$ whose kernels are not multiples of the identity, e.g. one can add a multiple of $\mathrm{div}(v)$ to any norm and then $K$ will intertwine different components of $v$.
The most natural examples of the norms we will consider are given by inner products

$$\langle u, v \rangle_V = \langle u, v \rangle_L := \int_{\mathbb{R}^D} \big\langle Lu(x), v(x) \big\rangle_{\mathbb{R}^D}\, dx, \tag{1}$$

where $L$ is a self-adjoint elliptic scalar differential operator of order greater than $D + 2$ with constant coefficients, which is applied separately to each of the scalar components of the vector field $u = (u^1, \dots, u^D)$. By the Sobolev embedding theorem $V$ then consists of $C^1$-functions on $\mathbb{R}^D$ which are bounded together with their derivatives. If $K$ is a scalar fundamental solution (or Green's function [9]), so that $L(K)(x) = \delta(x)$, then the reproducing kernel is given by $K^\alpha_x = K(\cdot - x)\,\alpha$. A possible choice of the operator is $L = (1 - A^2 \Delta)^k$ (where $A \in \mathbb{R}$ is a scaling factor, $k \in \mathbb{N}$ and $\Delta$ is the Laplacian operator), with $k > \frac{D}{2} + 1$, in which case (1) becomes the Sobolev norm:

$$\|u\|_L^2 = \int_{\mathbb{R}^D} \sum_{\ell=1}^{D} \sum_{m=0}^{k} \binom{k}{m} A^{2m} \sum_{|\alpha| = m} \big| D^\alpha u^\ell \big|^2\, dx. \tag{2}$$

When $L = (1 - A^2 \Delta)^k$ the scalar kernel $K$ has the form $K(x - y) = \gamma\big(\|x - y\|_{\mathbb{R}^D}\big)$, with:

$$\gamma(\varrho) = \frac{1}{2^{k + \frac{D}{2} - 1}\, \pi^{\frac{D}{2}}\, \Gamma(k)\, A^D} \Big(\frac{\varrho}{A}\Big)^{k - \frac{D}{2}} K_{k - \frac{D}{2}}\Big(\frac{\varrho}{A}\Big), \qquad \varrho > 0, \tag{3}$$

where $K_\nu$ (with $\nu = k - \frac{D}{2}$) is a modified Bessel function [1] of order $\nu$ (not to be confused with the symbol $K$ we use for the kernel of $V$).

In summary, the scalar kernels that we consider in this paper will always have the properties:

(K1) $K$ is positive definite;
(K2) $K$ is symmetric, i.e. $K(x) = K(-x)$, $x \in \mathbb{R}^D$.

In addition, in certain sections we will introduce the following simplifying assumptions:

(K3) $K$ is twice continuously differentiable, $K \in C^2(\mathbb{R}^D)$;
(K4) $K$ is rotationally invariant, i.e. $K(x) = \gamma(\|x\|_{\mathbb{R}^D})$, $x \in \mathbb{R}^D$, for some $\gamma \in C^2\big([0, \infty)\big)$.

Note that if (K4) holds then $\gamma(0) \geq |\gamma(\rho)|$ for all $\rho \geq 0$. For instance, all four properties hold for the kernel (3) when $k > \frac{D}{2} + 1$.

Now fix any admissible Hilbert space of vector fields.
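As an illustration of properties (K1), (K2) and (K4), the following minimal sketch evaluates the kernel in one concrete case. It assumes $L = (1 - A^2\Delta)^3$ in $D = 3$ (so $\nu = 3/2$), for which the half-integer Bessel function $K_{3/2}(x) = \sqrt{\pi/(2x)}\, e^{-x}(1 + 1/x)$ reduces the Bessel-type kernel to $\gamma(\varrho) = (1 + \varrho/A)\, e^{-\varrho/A} / (32 \pi A^3)$. The function names and the sample points are hypothetical and chosen only for the example.

```python
import math

def gamma(rho, A=1.0):
    """Radial profile of the kernel for the assumed case D = 3, k = 3 (nu = 3/2)."""
    x = rho / A
    return (1.0 + x) * math.exp(-x) / (32.0 * math.pi * A ** 3)

def kernel(p, q, A=1.0):
    """Scalar kernel K(p - q) = gamma(|p - q|); symmetric by construction."""
    return gamma(math.dist(p, q), A)

def gram(points, A=1.0):
    """Symmetric N x N matrix with entries K(q^a - q^b)."""
    return [[kernel(p, q, A) for q in points] for p in points]

def cholesky(M):
    """Lower-triangular Cholesky factor; raises ValueError if M is not positive definite."""
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

# Four distinct sample landmarks in R^3 (hypothetical values).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 1.0, 1.0)]
factor = cholesky(gram(points))  # succeeds because the Gram matrix is positive definite
```

A successful Cholesky factorization is used here as a cheap numerical test of positive definiteness for one particular set of points; it is not a proof of (K1).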
The space $L^p([0,1], V)$ is the set of functions $v : [0,1] \to V$ such that:

$$\|v\|_{L^p([0,1],V)} := \Big( \int_0^1 \|v(t, \cdot)\|_V^p\, dt \Big)^{\frac{1}{p}} < \infty.$$

The space $L^2([0,1], V)$ is a subset of $L^1([0,1], V)$ and is in fact a Hilbert space with inner product $\langle u, v \rangle_{L^2([0,1],V)} := \int_0^1 \langle u, v \rangle_V\, dt$. It is well known from the theory of ordinary differential equations [7] that for any $v \in L^1([0,1], V)$, the $D$-dimensional non-autonomous dynamical system $\dot z = v_t(z)$, with initial condition $z(t_0) = x$, has a unique solution of the type $z(t) = \psi(t, t_0, x)$. Let $\varphi^v_{st}(x) := \psi(t, s, x)$; fixing $t = 1$ and $s = 0$ we get $\varphi^v := \varphi^v_{01}$, which is the diffeomorphism generated by $v$. For an admissible Hilbert space we will call the set

$$G_V := \big\{ \varphi^v : v \in L^1\big([0,1], V\big) \big\}$$

the group of diffeomorphisms generated by $V$; by [31, Chapter 12] it is a metric space and a topological group. But, in the language of manifolds, $G_V$ is not an infinite-dimensional Lie group [19]. $V$ is not a Lie algebra, but is the completion of the Lie algebra of $C^\infty$-vector fields with compact support with respect to $\|\cdot\|_V$.

For velocity vector fields $v \in L^2([0,1], V)$ and landmark trajectories $q \in Q$ define the energy

$$E_\lambda[v, q] \equiv E[v, q] := \int_0^1 \Big( \big\| v(t, \cdot) \big\|_V^2 + \lambda \sum_{a=1}^{N} \Big\| \frac{dq^a}{dt}(t) - v\big(t, q^a(t)\big) \Big\|_{\mathbb{R}^D}^2 \Big)\, dt, \tag{4}$$

where $\lambda \in (0, \infty]$ is a fixed smoothing parameter (soon to be described). We claim that a distance function $d$ on $\mathcal{L}^N(\mathbb{R}^D)$ between two landmark sets (or shapes) $I = (x^1, x^2, \dots, x^N)$ and $I' = (y^1, y^2, \dots, y^N)$ can be defined as

$$d(I, I') := \inf_{v, q} \Big\{ \sqrt{E[v, q]} \;:\; v \in L^2\big([0,1], V\big),\ q \in Q \text{ with } q(0) = I,\ q(1) = I' \Big\}; \tag{5}$$

in the next subsection we will argue that the above function is in fact a geodesic distance with respect to a Riemannian metric.
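The construction of the diffeomorphism generated by $v$ as the time-one flow of $\dot z = v_t(z)$ can be sketched numerically. The following is a minimal illustration, not the authors' method: it integrates the flow equation with a classical fourth-order Runge-Kutta scheme, and the one-dimensional field `bump_field` (with $v(t, z) = 1/(1 + z^2)$) is a hypothetical example chosen because its flow satisfies the closed-form relation $z + z^3/3 = x + x^3/3 + t$.

```python
def flow(v, x, steps=200):
    """Approximate the time-one flow of dz/dt = v(t, z), z(0) = x, with classical RK4."""
    z = list(x)
    h = 1.0 / steps

    def shifted(base, direction, scale):
        # Componentwise base + scale * direction.
        return [b + scale * d for b, d in zip(base, direction)]

    for n in range(steps):
        t = n * h
        k1 = v(t, z)
        k2 = v(t + h / 2, shifted(z, k1, h / 2))
        k3 = v(t + h / 2, shifted(z, k2, h / 2))
        k4 = v(t + h, shifted(z, k3, h))
        z = [zi + h / 6 * (a + 2 * b + 2 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
    return z

def bump_field(t, z):
    """Hypothetical smooth, bounded, time-independent velocity field on R (D = 1)."""
    return [1.0 / (1.0 + z[0] ** 2)]

z1 = flow(bump_field, [0.0])[0]
# For this field the exact flow started at x = 0 satisfies z + z**3 / 3 = t at time t.
residual = z1 + z1 ** 3 / 3.0 - 1.0
```

In one dimension the flow preserves the ordering of points, so `flow(bump_field, [0.0])[0] < flow(bump_field, [0.5])[0]`, which is the discrete counterpart of the map being a diffeomorphism.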
We treat the minimization of (4) as our starting point; it is the "energy of a metamorphosis" as formulated in [31, Chapter 13].

The above infimum is computed over all differentiable landmark paths $q \in Q$ that satisfy the boundary conditions ($q^a(0) = x^a$ and $q^a(1) = y^a$, $a = 1, \dots, N$), and vector fields $v \in L^2([0,1], V)$. The resulting landmark trajectories $\{ q^a(t),\ t \in [0,1] \}_{a = 1, \dots, N}$ follow the minimizing velocity field more or less exactly, depending on the value of the smoothing parameter $\lambda \in (0, \infty]$; it is a weight between the first term, which measures the smoothness of the vector field that generates the diffeomorphism, and the second term, which measures how closely the landmark trajectories actually follow the vector field.

The exact matching problem is the following: given two sets of landmarks $I = (x^1, x^2, \dots, x^N)$ and $I' = (y^1, y^2, \dots, y^N)$ with $x^a \neq x^b$ and $y^a \neq y^b$ for any $a \neq b$, minimize the energy

$$E_\infty[v] := \int_0^1 \|v(t, \cdot)\|_V^2\, dt$$

among all $v \in L^2([0,1], V)$ such that $\varphi^v(x^a) = y^a$, $a = 1, \dots, N$. In this case the landmark trajectories are defined as the solutions to the ordinary differential equations $\dot q^a = v(t, q^a)$, $a = 1, \dots, N$. Note that this is equivalent to solving (4) for $\lambda = \infty$, since such equations are obtained by setting the integrands of the second term in the right-hand side of (4) equal to zero. When $\lambda < \infty$ in (4) we have regularized matching, i.e. the landmark trajectories "almost" satisfy this set of ordinary differential equations; this allows the time-varying vector field to be smoother. For this reason the second term in the right-hand side of (4) is often referred to as the smoothing term; by allowing smoother vector fields the distance $d$ is made tolerant to small diffeomorphisms and therefore more robust to object variations due to noise in the data.

By manipulating expression (4) we will now show that it is equivalent to the energy of a path $q \in Q$ with respect to a Riemannian metric.
Notation.
Consider a landmark set $q = (q^1, \dots, q^N)$ in $\mathcal{L}^N(\mathbb{R}^D)$. The $D$ scalar components in Euclidean coordinates of the $N$ landmark trajectories $q^a = (q^{a1}, \dots, q^{aD})$, $a = 1, \dots, N$, can be ordered either into an $N \times D$ matrix or in a tall concatenated column vector. We shall always use indices $a, b, c, \ldots \in \{1, \dots, N\}$ as landmark indices, and $i, j, k, \ldots \in \{1, \dots, D\}$ as space coordinates in $\mathbb{R}^D$. We will associate to each of the $N$ landmarks $q^a \in \mathbb{R}^D$ a momentum $p_a \in \mathbb{R}^D$ (defined in the next proposition) which we will write, in coordinates, as $p_a = (p_{a1}, \dots, p_{aD})$, for each $a = 1, \dots, N$. The components of momenta can also be ordered into an $N \times D$ matrix or in a long row vector. We chose superscript indices for landmark coordinates and subscript indices for momenta.

For a given set of landmarks $(q^1, \dots, q^N) \in \mathcal{L}^N(\mathbb{R}^D)$ we will define the symmetric $N \times N$ matrix

$$K(q) := \big( K(q^a - q^b) \big)_{a, b = 1, \dots, N}.$$

The matrix $K(q)$ is positive definite by property (K1) of the kernel.

Proposition 1.
For a fixed landmark path $q = \big\{ q^a : [0,1] \to \mathbb{R}^D \big\}_{a=1}^{N} \in Q$ there exists a unique minimizer with respect to $v \in L^2([0,1], V)$ of the energy $E[v, q]$, namely:

$$v^*(t, x) := \sum_{a=1}^{N} p_a(t)\, K\big(x - q^a(t)\big), \qquad t \in [0,1],\ x \in \mathbb{R}^D, \tag{6}$$

where the components of the momenta are given by:

$$p_{ai}(t) = \sum_{b=1}^{N} \Big( K\big(q(t)\big) + \frac{I_N}{\lambda} \Big)^{-1}_{ab} \cdot \frac{d}{dt} q^{bi}(t), \qquad t \in [0,1], \tag{7}$$

$a = 1, \dots, N$, $i = 1, \dots, D$ (here $I_N$ indicates the $N \times N$ identity matrix).

Remark.
What the above proposition essentially says is that the vector field of minimum energythat transports the N landmarks along fixed trajectories is, at any point of time, the linear combi-nation of N lumps of velocity, each centered at a landmark point. The directions and amplitudes ofthe summands are determined precisely by the momenta. Proof of Proposition 1.
Using property (ii) of the admissible Hilbert space $V$, [31, Lemma 9.5] shows that for given $q = (q^1, \dots, q^N) \in \mathcal{L}^N(\mathbb{R}^D)$ we have the orthogonal decomposition

$$V = \big\{ v \in V : v(q^a) = 0,\ a = 1, \dots, N \big\} \oplus \Big\{ v = \sum_{a=1}^{N} \alpha_a K(\cdot - q^a) : \alpha_a \in \mathbb{R}^D \Big\}. \tag{8}$$

Thus the minimizer must have the form

$$v(t, x) = \sum_{a=1}^{N} \alpha_a(t)\, K\big(x - q^a(t)\big), \qquad t \in [0,1],\ x \in \mathbb{R}^D, \tag{9}$$

for some coefficients $\alpha_a \in C([0,1], \mathbb{R}^D)$, $a = 1, \dots, N$, to be computed. For velocities of the type (9) the energy (4) can be rewritten as

$$E[v, q] = \int_0^1 \sum_{i=1}^{D} \Big\{ \sum_{a,b=1}^{N} \alpha_{ai} K(q^a - q^b)\, \alpha_{bi} + \lambda \sum_{b=1}^{N} \Big| \sum_{a=1}^{N} \alpha_{ai} K(q^a - q^b) - \dot q^{bi} \Big|^2 \Big\}\, dt. \tag{10}$$

Setting the first variation of (10) with respect to the coefficients $\alpha_{ai}$ to zero gives $\big(K(q) + \frac{I_N}{\lambda}\big)\alpha = \dot q$ for each space coordinate, which yields the momenta (7).

It is convenient, at this point, to introduce the $ND \times ND$ block-diagonal matrix

$$g(q) := \begin{pmatrix} \big(K(q) + \frac{I_N}{\lambda}\big)^{-1} & & \\ & \ddots & \\ & & \big(K(q) + \frac{I_N}{\lambda}\big)^{-1} \end{pmatrix}, \tag{11}$$

where the $N \times N$ block $\big(K(q) + \frac{I_N}{\lambda}\big)^{-1}$ is repeated $D$ times; the choice of symbol $g$ is justified by the fact that (11) is, as we shall see soon, precisely the Riemannian metric tensor with which we are endowing the manifold of landmarks, written in coordinates.

Thus for a fixed path $q \in Q$ the minimizer of $E[v, q]$ with respect to $v \in L^2([0,1], V)$ is given by (6); since it depends on $q$ we will write it, with an abuse of notation, as $v^*(q)$. We can define

$$\widetilde{E}[q] := E[v^*(q), q], \tag{12}$$

which depends only on the arbitrary path $q \in Q$. The energy (12) is "equivalent" to the energy $E[v, q]$, in that: (a) if $(\hat v, \hat q)$ minimizes $E[v, q]$ then $\hat q$ minimizes $\widetilde{E}[q]$, and $E[\hat v, \hat q] = \widetilde{E}[\hat q]$; (b) if $\hat q$ minimizes $\widetilde{E}[q]$ then $(v^*(\hat q), \hat q)$ minimizes $E[v, q]$, and $E[v^*(\hat q), \hat q] = \widetilde{E}[\hat q]$.

Proposition 2.
For an arbitrary landmark trajectory $q \in Q$ the energy $\widetilde{E}[q]$ is given by:

$$\widetilde{E}[q] = \int_0^1 \dot q(t)^T g\big(q(t)\big)\, \dot q(t)\, dt = \int_0^1 \sum_{a,b=1}^{N} \sum_{i=1}^{D} \dot q^{ai}(t)\, \dot q^{bi}(t) \Big( K\big(q(t)\big) + \frac{I_N}{\lambda} \Big)^{-1}_{ab}\, dt. \tag{13}$$

In the above equation $\dot q(t)$ is intended as an $ND$-dimensional column vector obtained by stacking the column vectors $(\dot q^{1i}(t), \dots, \dot q^{Ni}(t))^T$, $i = 1, \dots, D$ (again, the superscript $T$ indicates the transpose of a vector).

Proof.
Following definition (12), formulae (7) for the momenta are inserted into the modified expression (10) for the energy $E[v, q]$. Simple matrix manipulations finally yield the right-hand side of (13).

Remarks.
Expression (13) has exactly the form of the energy of a path $q$ with respect to the Riemannian metric tensor (11). Hence, given two landmark configurations $I$ and $I'$ in $\mathcal{L}^N(\mathbb{R}^D)$, if $\hat q$ minimizes (13) among all paths $q \in Q$ such that $q(0) = I$ and $q(1) = I'$, then $(\widetilde{E}[\hat q])^{1/2}$ is the geodesic distance between $I$ and $I'$. By point (b) above we also have that $(v^*(\hat q), \hat q)$ is a minimum of the energy $E[v, q]$, so $d(I, I')$ defined in (5) coincides with $(\widetilde{E}[\hat q])^{1/2}$ and is the geodesic distance between $I$ and $I'$ with respect to the metric tensor $g$.

The Lagrangian function that corresponds to the energy (13) is:

$$\mathcal{L}(q, \dot q) = \frac{1}{2}\, \dot q^T g(q)\, \dot q = \frac{1}{2} \sum_{a,b=1}^{N} \sum_{i=1}^{D} \dot q^{ai} \dot q^{bi} \Big( K(q) + \frac{I_N}{\lambda} \Big)^{-1}_{ab}. \tag{14}$$

In Hamiltonian mechanics [3, p. 60] the "momenta" are defined as $p_{ai} = \partial \mathcal{L} / \partial \dot q^{ai}$, or, in vector notation, $p_{(i)} = \partial \mathcal{L} / \partial \dot q^{(i)}$ (for $i = 1, \dots, D$). Applying this definition to (14) yields precisely equations (7) of Proposition 1, which justifies the use of the term momenta.

Note that for small values of the parameter $\lambda$ the metric tensor $g$, written in coordinates, gets close (up to a multiplicative constant) to the $ND \times ND$ identity matrix; in other words, for $\lambda \to 0$ the metric $g$ converges to a Euclidean metric and the geodesic curves become straight lines. On the other hand, for $\lambda \to \infty$ (exact matching) the metric converges to $[\mathrm{diag}\{K(q), \dots, K(q)\}]^{-1}$ (the block $K(q)$ is repeated $D$ times).
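The statements above can be checked numerically in the smallest case $N = 2$. The following sketch is only an illustration under stated assumptions: the Gaussian scalar kernel and all function names are hypothetical, and $N = 2$ is chosen so that the block $\big(K(q) + I_N/\lambda\big)^{-1}$ can be inverted in closed form. It computes the momenta, evaluates the integrand of the reduced energy $\dot q^T g\, \dot q$, and compares it with the integrand of the original two-term energy evaluated at the minimizing velocity field.

```python
import math

def kernel(x):
    """Hypothetical scalar kernel (Gaussian); symmetric and positive definite."""
    return math.exp(-0.5 * sum(c * c for c in x))

def metric_block(q1, q2, lam):
    """2 x 2 block (K(q) + I/lam)^{-1}; lam = math.inf gives exact matching."""
    eps = 0.0 if math.isinf(lam) else 1.0 / lam
    a = kernel([0.0] * len(q1)) + eps
    b = kernel([x - y for x, y in zip(q1, q2)])
    det = a * a - b * b
    return [[a / det, -b / det], [-b / det, a / det]]

def momenta(q1, q2, v1, v2, lam):
    """p[a][i] = sum_b g[a][b] * velocity[b][i]."""
    g = metric_block(q1, q2, lam)
    vel = [v1, v2]
    return [[sum(g[a][b] * vel[b][i] for b in range(2)) for i in range(len(q1))]
            for a in range(2)]

def reduced_energy_density(q1, q2, v1, v2, lam):
    """Integrand of the reduced energy: velocity^T g velocity."""
    p = momenta(q1, q2, v1, v2, lam)
    vel = [v1, v2]
    return sum(p[a][i] * vel[a][i] for a in range(2) for i in range(len(q1)))

def full_energy_density(q1, q2, v1, v2, lam):
    """Integrand of the two-term energy at the optimal field; requires finite lam."""
    p = momenta(q1, q2, v1, v2, lam)
    vel = [v1, v2]
    k0 = kernel([0.0] * len(q1))
    k12 = kernel([x - y for x, y in zip(q1, q2)])
    K = [[k0, k12], [k12, k0]]
    dims = range(len(q1))
    norm_v = sum(p[a][i] * K[a][b] * p[b][i]
                 for a in range(2) for b in range(2) for i in dims)
    mismatch = sum((vel[b][i] - sum(p[a][i] * K[a][b] for a in range(2))) ** 2
                   for b in range(2) for i in dims)
    return norm_v + lam * mismatch

q1, q2 = [0.0, 0.0], [1.0, 0.5]      # landmark positions (hypothetical values)
v1, v2 = [1.0, -0.5], [0.3, 2.0]     # landmark velocities (hypothetical values)
gap = abs(reduced_energy_density(q1, q2, v1, v2, 10.0)
          - full_energy_density(q1, q2, v1, v2, 10.0))
```

For a very small smoothing parameter, e.g. `metric_block(q1, q2, 1e-6)`, the block is close to `1e-6` times the identity, in line with the remark that the metric becomes Euclidean up to a multiplicative constant as $\lambda \to 0$.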
In general, the block-diagonal form of the metric tensor $g$ given by (11) follows from the fact that the operator $L$ in (2) is applied separately to each of the components of the velocity field; however the dynamics of the $D$ dimensions of $q$ are not decoupled, since all $ND$ components of $q$ appear in each diagonal block of $g$.

In the case of exact matching landmarks "never collide" (their trajectories are precisely defined by diffeomorphisms of $\mathbb{R}^D$): it takes an infinite amount of energy to make any two landmarks coincide. So under the condition $\lambda = \infty$ the manifold of landmarks can actually be taken as the set:

$$\mathcal{L}^N(\mathbb{R}^D) = \big\{ (P^1, \dots, P^N) \;\big|\; P^a \in \mathbb{R}^D,\ P^a \neq P^b \text{ if } a \neq b \big\}. \tag{15}$$

Figure 1 shows the qualitative behavior of geodesics in $\mathcal{L}^2(\mathbb{R}^2)$, with $\lambda = \infty$. In the case illustrated on the left-hand side both landmarks travel in the same direction (from left to right, as indicated by the arrows): the two arcs of the geodesic "attract" each other, or in other words the two landmarks tend to "carpool" by using a velocity field with the smallest possible support so as to minimize the $L^2$ part (i.e. the first term) of the Sobolev norm (2) of the velocity field. On the other hand, when the two landmarks travel in opposite directions (as illustrated on the right-hand side of Figure 1) they try to avoid each other so that the higher order terms of the Sobolev norm are kept small; we shall return to the issue of obstacle avoidance at the end of this paper. A typical geodesic in landmark space (again with $\lambda = \infty$) is shown in Figure 2.

Figure 1: Two trajectories in $\mathcal{L}^2(\mathbb{R}^2)$. Bullets (•) and circles (◦) are the initial and final sets of landmarks, respectively. The grids represent the two corresponding diffeomorphisms $\varphi^v$.

Conclusion.
We have shown that the distance $d(I, I')$, $I, I' \in \mathcal{L}^N(\mathbb{R}^D)$, defined in (5) is in fact the geodesic distance with respect to a Riemannian metric. In coordinates, the corresponding Riemannian metric tensor is given by (11), which is such that each element of its inverse (the cometric) depends on at most $2D$ of the $ND$ coordinates. Hence the first and second partial derivatives of the cometric have a very sparse structure. This gives us motivation for deriving a general formula for computing sectional curvature in terms of the cometric and its derivatives in lieu of the metric and its derivatives, which will be done in the next section.

Let $M$ be an $n$-dimensional Riemannian manifold. If we consider a local chart $(U, \varphi)$ on the manifold with coordinates $(x^1, \dots, x^n)$, we have the induced 1-forms $dx^1, \dots, dx^n$ and coordinate vector fields $\{ \partial_1 := \frac{\partial}{\partial x^1}, \dots, \partial_n := \frac{\partial}{\partial x^n} \}$. The metric tensor $g : TM \times_M TM \to \mathbb{R}$ can be represented as $g|_U = g(\partial_i, \partial_j)\, dx^i \otimes dx^j =: g_{ij}\, dx^i \otimes dx^j$ (here, as in the rest of the current section, we are using Einstein's summation convention). For each $p \in M$ we get a positive definite matrix with elements $g_{ij}(p) = g_p(\partial_i, \partial_j)$. With an abuse of notation we will write $g_{ij}(x)$ instead of $(g_{ij} \circ \varphi^{-1})(x)$, $x \in \varphi(U) \subset \mathbb{R}^n$.

Notation.
We shall denote the partial derivatives of the elements of the metric tensor $g$ as $g_{ij,k}(x) := \frac{\partial}{\partial x^k} g_{ij}(x) = \partial_k g_{ij}$ and $g_{ij,k\ell}(x) := \frac{\partial^2}{\partial x^\ell \partial x^k} g_{ij}(x) = \partial_\ell \partial_k g_{ij}$. Also, we will indicate the cometric as $g^{-1}|_U = g^{ij}\, \partial_i \otimes \partial_j$ (so that $g^{ij} g_{jk} = \delta^i_k$) and its partial derivatives with $g^{ij}{}_{,k}(x) := \frac{\partial}{\partial x^k} g^{ij}(x)$ and $g^{ij}{}_{,k\ell}(x) := \frac{\partial^2}{\partial x^\ell \partial x^k} g^{ij}(x)$.

Figure 2: A typical geodesic trajectory in landmark space. Bullets (•) and circles (◦) are the initial and final sets of landmarks, respectively. The grid represents the corresponding diffeomorphism $\varphi^v$.

For a tangent vector $X = X^i \partial_i$ we consider the 1-form $X^\flat := X^i g_{ij}\, dx^j =: X_j\, dx^j$ (indices lowered), and for a 1-form $\alpha = \alpha_i\, dx^i$ we have the tangent vector $\alpha^\sharp := \alpha_i g^{ij} \partial_j$ (indices lifted).

Indicating with $\mathfrak{X}(M)$ the space of smooth vector fields on the manifold $M$, let $\nabla : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$ be the Levi-Civita connection [16, 21] of the Riemannian manifold. The Christoffel symbols $\Gamma^k_{ij}$ are defined by $\nabla_{\partial_i} \partial_j = \Gamma^k_{ij} \partial_k$, and it is well known that they have the form $\Gamma^k_{ij} = \frac{1}{2} g^{k\ell} (g_{i\ell,j} + g_{j\ell,i} - g_{ij,\ell})$. The Riemannian curvature endomorphism is the map $R : \mathfrak{X}(M) \times \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$ given by $R(X, Y) Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z$. In local coordinates $R(\partial_i, \partial_j) \partial_k = R^\ell_{ijk} \partial_\ell$, and $R_{ijkm} := \langle R(\partial_i, \partial_j) \partial_k, \partial_m \rangle_g = g_{m\ell} R^\ell_{ijk}$. The Riemannian curvature tensor acts on vector fields as follows:

$$R(X, Y, Z, W) := \langle R(X, Y) Z, W \rangle_g \tag{16}$$

and in coordinates it is written as $R = R_{ijkm}\, dx^i \otimes dx^j \otimes dx^k \otimes dx^m$. The Riemannian curvature tensor has a number of symmetries: (i) $R_{ijk\ell} = -R_{jik\ell}$; (ii) $R_{ijk\ell} = -R_{ij\ell k}$; (iii) $R_{ijk\ell} = R_{k\ell ij}$; and (iv) $R_{ijk\ell} + R_{jki\ell} + R_{kij\ell} = 0$ (first Bianchi identity). With such conventions, the sectional curvature associated to a pair of non-parallel tangent vectors $X$ and $Y$ is computed by:

$$K(X, Y) = \frac{R(X, Y, Y, X)}{\|X\|_g^2 \|Y\|_g^2 - \langle X, Y \rangle_g^2} = \frac{R_{ijkm} X^i Y^j Y^k X^m}{\|X\|_g^2 \|Y\|_g^2 - \langle X, Y \rangle_g^2}. \tag{17}$$

In order to express the numerator of sectional curvature (17) in terms of the elements of the cometric and its derivatives (i.e. $g^{ij}$, $g^{ij}{}_{,k}$, and $g^{ij}{}_{,k\ell}$) we consider the covariant expression of the Riemannian curvature tensor:

$$R^{ursv} := R_{ijkm}\, g^{iu} g^{jr} g^{ks} g^{mv}, \tag{18}$$

which we call the dual Riemannian curvature tensor. Similarly we consider the covariant or dual Christoffel symbols

$$\Gamma^{rs}_u := g^{ir} g^{js} g_{ku} \Gamma^k_{ij}, \tag{19}$$

which are symmetric in the indices $r$ and $s$.

To achieve notational compactness we will use the following symbols:

$$g^{ij,k} := g^{ij}{}_{,\xi}\, g^{\xi k} \qquad \text{and} \qquad g^{ij,k\ell} := g^{ij}{}_{,\xi\eta}\, g^{\xi k} g^{\eta \ell}. \tag{20}$$

Using that $g = Q^{-1}$ implies $\partial_k g = -Q^{-1} \cdot \partial_k Q \cdot Q^{-1}$, one immediately sees that

$$\Gamma^{rs}_u = -\frac{1}{2}\, g_{u\varphi} \big( g^{s\varphi,r} + g^{r\varphi,s} - g^{rs,\varphi} \big).$$

Proposition 3.
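The following expression holds for the Riemannian curvature tensor:

$$2 R_{ijkm} = g_{ik,jm} + g_{jm,ik} - g_{jk,im} - g_{im,jk} + 2\, \Gamma^\alpha_{ik} \Gamma^\beta_{jm} g_{\alpha\beta} - 2\, \Gamma^\alpha_{jk} \Gamma^\beta_{im} g_{\alpha\beta}. \tag{21}$$

For a proof see [24].

Proposition 4.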
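The following expression holds for the dual Riemannian curvature tensor:

$$\begin{aligned} 2 R^{ursv} = {} & - g^{us,rv} - g^{rv,us} + g^{rs,uv} + g^{uv,rs} + 2\, \Gamma^{rv}_\rho \Gamma^{us}_\sigma g^{\rho\sigma} - 2\, \Gamma^{rs}_\rho \Gamma^{uv}_\sigma g^{\rho\sigma} \\ & + g^{r\lambda,u} g_{\lambda\mu} g^{\mu v,s} - g^{r\lambda,u} g_{\lambda\mu} g^{\mu s,v} + g^{u\lambda,r} g_{\lambda\mu} g^{\mu s,v} - g^{u\lambda,r} g_{\lambda\mu} g^{\mu v,s} \\ & + g^{r\lambda,s} g_{\lambda\mu} g^{\mu v,u} + g^{u\lambda,v} g_{\lambda\mu} g^{\mu s,r} - g^{r\lambda,v} g_{\lambda\mu} g^{\mu s,u} - g^{u\lambda,s} g_{\lambda\mu} g^{\mu v,r}. \end{aligned} \tag{22}$$

Proof.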
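We will manipulate (21) and write it in the form $R_{ijkm} = g_{iu} g_{jr} g_{ks} g_{mv} R^{ursv}$ by factoring $g_{iu} g_{jr} g_{ks} g_{mv}$ out of each term; what will be left will be precisely the expression for $R^{ursv}$. The terms in (21) involving Christoffel symbols are, by (19):

$$\Gamma^\alpha_{ik} \Gamma^\beta_{jm} g_{\alpha\beta} = g_{iu} g_{ks} g^{\alpha\sigma} \Gamma^{us}_\sigma\; g_{jr} g_{mv} g^{\rho\beta} \Gamma^{rv}_\rho\; g_{\alpha\beta} = g_{iu} g_{jr} g_{ks} g_{mv} \big( \Gamma^{rv}_\rho \Gamma^{us}_\sigma g^{\rho\sigma} \big), \tag{23}$$

and similarly:

$$\Gamma^\alpha_{jk} \Gamma^\beta_{im} g_{\alpha\beta} = g_{iu} g_{jr} g_{ks} g_{mv} \big( \Gamma^{rs}_\rho \Gamma^{uv}_\sigma g^{\rho\sigma} \big). \tag{24}$$

As we noted before, if $g = Q^{-1}$ then $\partial_j g = -Q^{-1} \cdot \partial_j Q \cdot Q^{-1}$, and similarly it is the case that $\partial_m \partial_j g = Q^{-1} \cdot \big( \partial_m Q \cdot Q^{-1} \cdot \partial_j Q + \partial_j Q \cdot Q^{-1} \cdot \partial_m Q - \partial_m \partial_j Q \big) \cdot Q^{-1}$, i.e., in index notation,

$$\begin{aligned} g_{ik,jm} &= g_{iu} \big( g^{u\lambda}{}_{,m}\, g_{\lambda\mu}\, g^{\mu s}{}_{,j} + g^{u\lambda}{}_{,j}\, g_{\lambda\mu}\, g^{\mu s}{}_{,m} - g^{us}{}_{,jm} \big) g_{sk} \\ &= g_{iu} g_{ks}\, \delta^\xi_j \delta^\eta_m \big( g^{u\lambda}{}_{,\eta}\, g_{\lambda\mu}\, g^{\mu s}{}_{,\xi} + g^{u\lambda}{}_{,\xi}\, g_{\lambda\mu}\, g^{\mu s}{}_{,\eta} - g^{us}{}_{,\xi\eta} \big) \\ &= g_{iu} g_{ks} g_{jr} g_{mv} \big[ g^{r\xi} g^{v\eta} \big( g^{u\lambda}{}_{,\eta}\, g_{\lambda\mu}\, g^{\mu s}{}_{,\xi} + g^{u\lambda}{}_{,\xi}\, g_{\lambda\mu}\, g^{\mu s}{}_{,\eta} - g^{us}{}_{,\xi\eta} \big) \big] \\ &= g_{iu} g_{jr} g_{ks} g_{mv} \big( g^{u\lambda,v} g_{\lambda\mu} g^{\mu s,r} + g^{u\lambda,r} g_{\lambda\mu} g^{\mu s,v} - g^{us,rv} \big), \end{aligned} \tag{25}$$

where we have used definitions (20). Similarly, we can achieve the factorizations:

$$g_{jm,ik} = g_{iu} g_{jr} g_{ks} g_{mv} \big( g^{r\lambda,u} g_{\lambda\mu} g^{\mu v,s} + g^{r\lambda,s} g_{\lambda\mu} g^{\mu v,u} - g^{rv,us} \big), \tag{26}$$

$$-g_{jk,im} = g_{iu} g_{jr} g_{ks} g_{mv} \big( - g^{r\lambda,u} g_{\lambda\mu} g^{\mu s,v} - g^{r\lambda,v} g_{\lambda\mu} g^{\mu s,u} + g^{rs,uv} \big), \tag{27}$$

$$-g_{im,jk} = g_{iu} g_{jr} g_{ks} g_{mv} \big( - g^{u\lambda,r} g_{\lambda\mu} g^{\mu v,s} - g^{u\lambda,s} g_{\lambda\mu} g^{\mu v,r} + g^{uv,rs} \big). \tag{28}$$

Inserting (23)–(28) into (21) we can write $R_{ijkm} = g_{iu} g_{jr} g_{ks} g_{mv} R^{ursv}$, with $R^{ursv}$ given by (22).

Proposition 5.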
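The dual Riemannian curvature tensor may also be written as follows:

$$\begin{aligned} 2 R^{ursv} = {} & - g^{us,rv} - g^{rv,us} + g^{rs,uv} + g^{uv,rs} && (\mathrm{T}_1) \\ & - \tfrac{1}{2} \big\{ g^{rs}{}_{,\rho}\, g^{\rho\sigma} g^{uv}{}_{,\sigma} - g^{rs}{}_{,\rho} \big( g^{\rho u,v} + g^{\rho v,u} \big) - g^{uv}{}_{,\sigma} \big( g^{\sigma r,s} + g^{\sigma s,r} \big) \big\} && (\mathrm{T}_2) \\ & + \tfrac{1}{2} \big\{ g^{rv}{}_{,\rho}\, g^{\rho\sigma} g^{us}{}_{,\sigma} - g^{rv}{}_{,\rho} \big( g^{\rho u,s} + g^{\rho s,u} \big) - g^{us}{}_{,\sigma} \big( g^{\sigma r,v} + g^{\sigma v,r} \big) \big\} && (\mathrm{T}_3) \\ & - \tfrac{1}{2} \big( g^{\lambda r,s} - g^{\lambda s,r} \big) g_{\lambda\mu} \big( g^{\mu u,v} - g^{\mu v,u} \big) && (\mathrm{T}_4) \\ & + \tfrac{1}{2} \big( g^{\lambda r,v} - g^{\lambda v,r} \big) g_{\lambda\mu} \big( g^{\mu u,s} - g^{\mu s,u} \big) && (\mathrm{T}_5) \\ & + \big( g^{\lambda r,u} - g^{\lambda u,r} \big) g_{\lambda\mu} \big( g^{\mu v,s} - g^{\mu s,v} \big). && (\mathrm{T}_6) \end{aligned}$$

Proof.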
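We will expand and recombine the terms in expression (22). The terms involving second derivatives need no manipulation and correspond to the term $\mathrm{T}_1$. The terms in the second line of (22) can be written as:

$$g^{r\lambda,u} g_{\lambda\mu} g^{\mu v,s} - g^{r\lambda,u} g_{\lambda\mu} g^{\mu s,v} + g^{u\lambda,r} g_{\lambda\mu} g^{\mu s,v} - g^{u\lambda,r} g_{\lambda\mu} g^{\mu v,s} = ( g^{\lambda r,u} - g^{\lambda u,r} )\, g_{\lambda\mu}\, ( g^{\mu v,s} - g^{\mu s,v} ),$$

which is precisely $\mathrm{T}_6$. It is also the case that:

$$\begin{aligned} & 2\, \Gamma^{rv}_\rho \Gamma^{us}_\sigma g^{\rho\sigma} - g^{r\lambda,v} g_{\lambda\mu} g^{\mu s,u} - g^{u\lambda,s} g_{\lambda\mu} g^{\mu v,r} \\ &\quad = \tfrac{1}{2} \big[ ( g^{\lambda r,v} + g^{\lambda v,r} ) - g^{rv,\lambda} \big] g_{\lambda\rho}\, g^{\rho\sigma} g_{\sigma\mu} \big[ ( g^{\mu u,s} + g^{\mu s,u} ) - g^{us,\mu} \big] - g^{r\lambda,v} g_{\lambda\mu} g^{\mu s,u} - g^{u\lambda,s} g_{\lambda\mu} g^{\mu v,r} \\ &\quad = \tfrac{1}{2} \big\{ g^{rv}{}_{,\rho}\, g^{\rho\sigma} g^{us}{}_{,\sigma} - g^{rv}{}_{,\rho} ( g^{\rho u,s} + g^{\rho s,u} ) - g^{us}{}_{,\sigma} ( g^{\sigma r,v} + g^{\sigma v,r} ) \big\} \\ &\qquad + \tfrac{1}{2} ( g^{\lambda r,v} + g^{\lambda v,r} )\, g_{\lambda\mu}\, ( g^{\mu u,s} + g^{\mu s,u} ) - g^{r\lambda,v} g_{\lambda\mu} g^{\mu s,u} - g^{u\lambda,s} g_{\lambda\mu} g^{\mu v,r} \\ &\quad = \mathrm{T}_3 + \tfrac{1}{2} ( g^{\lambda r,v} - g^{\lambda v,r} )\, g_{\lambda\mu}\, ( g^{\mu u,s} - g^{\mu s,u} ) = \mathrm{T}_3 + \mathrm{T}_5. \end{aligned}$$

Similarly one can prove that $-2\, \Gamma^{rs}_\rho \Gamma^{uv}_\sigma g^{\rho\sigma} + g^{r\lambda,s} g_{\lambda\mu} g^{\mu v,u} + g^{u\lambda,v} g_{\lambda\mu} g^{\mu s,r} = \mathrm{T}_2 + \mathrm{T}_4$.

For any point $p \in M$ and an arbitrary pair of tangent vectors $X = X^i \partial_i$, $Y = Y^i \partial_i$ in $T_p M$ we consider the covectors $X^\flat = X_i\, dx^i$ and $Y^\flat = Y_i\, dx^i$ in $T^*_p M$, with $X_i = g_{ij} X^j$ and $Y_i = g_{ij} Y^j$. The numerator of sectional curvature (17) may be rewritten as $R_{ijkm} X^i Y^j Y^k X^m = R^{ursv} X_u Y_r Y_s X_v$.

Theorem (Mario's formula).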
For an arbitrary pair of vectors $X = X^i \partial_i$ and $Y = Y^i \partial_i$ in $T_p M$ the numerator of sectional curvature (17) at the point $p \in M$ may be written as:

$$\begin{aligned} g\big( R(X, Y) Y, X \big) &= R^{ursv} X_u Y_r Y_s X_v \\ &= \big( X_u Y_r - Y_u X_r \big) \Big( \tfrac{1}{2} g^{su,rv} + \tfrac{1}{2} g^{us}{}_{,\rho}\, g^{\rho r,v} - \tfrac{1}{8} g^{us}{}_{,\sigma}\, g^{rv,\sigma} - \tfrac{3}{4} g^{\lambda u,r} g_{\lambda\mu} g^{\mu s,v} \Big) \big( X_s Y_v - Y_s X_v \big). \end{aligned}$$

Moreover, if we extend $X^\flat$ and $Y^\flat$ locally on $M$ to 1-forms with constant coefficients (i.e. with $X_u$, $Y_r$ constant functions), then the formula becomes:

$$\begin{aligned} g\big( R(X, Y) Y, X \big) = {} & \tfrac{1}{2} \Big\{ X X ( \|Y^\flat\|^2 ) + Y Y ( \|X^\flat\|^2 ) - ( X Y + Y X )\, g^{-1}( X^\flat, Y^\flat ) \Big\} \\ & + \tfrac{1}{4} \Big\{ \big\| d \big( g^{-1}( X^\flat, Y^\flat ) \big) \big\|^2 - g^{-1} \big( d ( \|X^\flat\|^2 ), d ( \|Y^\flat\|^2 ) \big) \Big\} - \tfrac{3}{4}\, g \big( [X, Y], [X, Y] \big), \end{aligned}$$

where the term in the first set of braces equals the sum of the first two terms in the coordinate form, the term in the second set of braces equals the third term in the coordinate form, and finally the last terms are equal. In the above formula, $\|X^\flat\|^2 = X_s X_u g^{su}$ and $\|Y^\flat\|^2 = Y_r Y_v g^{rv}$.

Proof. We will write the six terms provided by Proposition 5 as $\mathrm{T}_i^{ursv}$, $i = 1, \dots, 6$, so that $2 R^{ursv} = \sum_i \mathrm{T}_i^{ursv}$. We have:

$$\begin{aligned} \mathrm{T}_1^{ursv} X_u Y_r Y_s X_v &= - g^{us,rv} X_u Y_r Y_s X_v - g^{rv,us} X_u Y_r Y_s X_v + g^{rs,uv} X_u Y_r Y_s X_v + g^{uv,rs} X_u Y_r Y_s X_v \\ &= g^{us,rv} \big( - X_u Y_r Y_s X_v - X_r Y_u Y_v X_s + X_r Y_u Y_s X_v + X_u Y_r Y_v X_s \big) \\ &= g^{us,rv} ( X_u Y_r - Y_u X_r ) ( X_s Y_v - Y_s X_v ), \end{aligned}$$

where the second step follows from relabeling the indices. As far as $\mathrm{T}_2$ and $\mathrm{T}_3$ are concerned,

$$\begin{aligned} ( \mathrm{T}_2^{ursv} + \mathrm{T}_3^{ursv} ) X_u Y_r Y_s X_v = {} & - \tfrac{1}{2} \big\{ Y_r Y_s\, g^{rs}{}_{,\rho}\, g^{\rho\sigma} g^{uv}{}_{,\sigma}\, X_u X_v - Y_r X_v\, g^{rv}{}_{,\rho}\, g^{\rho\sigma} g^{us}{}_{,\sigma}\, X_u Y_s \big\} \\ & + \tfrac{1}{2} \big\{ Y_r Y_s\, g^{rs}{}_{,\rho} \big( g^{\rho u,v} + g^{\rho v,u} \big) X_u X_v + X_u X_v\, g^{uv}{}_{,\rho} \big( g^{\rho r,s} + g^{\rho s,r} \big) Y_r Y_s \\ & \qquad - Y_r X_v\, g^{rv}{}_{,\rho} \big( g^{\rho u,s} + g^{\rho s,u} \big) X_u Y_s - X_u Y_s\, g^{us}{}_{,\rho} \big( g^{\rho r,v} + g^{\rho v,r} \big) Y_r X_v \big\} \\ = {} & - \tfrac{1}{2} \big\{ Y_r Y_s\, g^{rs}{}_{,\rho}\, g^{\rho\sigma} g^{uv}{}_{,\sigma}\, X_u X_v - Y_r X_v\, g^{rv}{}_{,\rho}\, g^{\rho\sigma} g^{us}{}_{,\sigma}\, X_u Y_s \big\} \\ & + \big\{ Y_r Y_s\, g^{rs}{}_{,\rho}\, g^{\rho u,v} X_u X_v + X_u X_v\, g^{uv}{}_{,\sigma}\, g^{\sigma r,s} Y_r Y_s - X_u Y_s\, g^{us}{}_{,\rho} \big( g^{\rho r,v} + g^{\rho v,r} \big) Y_r X_v \big\} \\ \overset{(*)}{=} {} & - \tfrac{1}{4}\, g^{rv}{}_{,\rho}\, g^{\rho\sigma} g^{us}{}_{,\sigma} \big\{ Y_u Y_s X_r X_v + Y_r Y_v X_u X_s - Y_r X_v X_u Y_s - Y_v X_r X_s Y_u \big\} \\ & + g^{us}{}_{,\rho}\, g^{\rho r,v} \big\{ Y_u Y_s X_r X_v + X_u X_s Y_r Y_v - X_u Y_s Y_r X_v - X_s Y_u Y_v X_r \big\} \\ = {} & \big( - \tfrac{1}{4} g^{us}{}_{,\sigma}\, g^{rv,\sigma} + g^{us}{}_{,\rho}\, g^{\rho r,v} \big) ( X_u Y_r - Y_u X_r ) ( X_s Y_v - Y_s X_v ), \end{aligned}$$

where, once again, step $(*)$ follows from relabeling the indices. Also, one can easily see that

$$\mathrm{T}_4^{ursv} X_u Y_r Y_s X_v = - \tfrac{1}{2}\, Y_r Y_s ( g^{\lambda r,s} - g^{\lambda s,r} )\, g_{\lambda\mu}\, ( g^{\mu u,v} - g^{\mu v,u} )\, X_u X_v = 0.$$
Finally,(T ursv + T ursv ) X u Y r Y s Y v = Y r X v ( g λr,v − g λv,r ) g λµ ( g µu,s − g µs,u ) X u Y s + Y r X u ( g λr,u − g λu,r ) g λµ ( g µv,s − g µs,v ) X v Y s = Y r X u ( g λr,u − g λu,r ) g λµ ( g µv,s − g µs,v ) X v Y s = Y r X u (cid:8) g λr,u g λµ g µv,s − g λr,u g λµ g µs,v − g λu,r g λµ g µv,s + g λu,r g λµ g µs,v (cid:9) X v Y s = − g λu,r g λµ g µs,v (cid:8) − Y u X r X s Y v + Y u X r X v Y s + Y r X u X s Y v − Y r X u X v Y s (cid:9) = − g λu,r g λµ g µs,v ( X u Y r − Y u X r )( X s Y v − Y s X v ) . Divide by 2 to get the coordinate formula. The non-local version of the formula follows easily bybringing the X and Y ’s into the formula. Thus (indicating ∂ i with the subscript ,i ): Y u X r ( g su,rv + g susu,ρ g ρr,v ) Y s X v = X r X v (cid:0) Y s Y u g susu,ρσ g ρξ g ση + Y s Y u g susu,ρ g ρrρr,σ g σv (cid:1) = X r X v (cid:0) ( k Y [ k ) ,ρσ g ρr g σv + ( k Y [ k ) ,ρ g ρrρr,σ g σv (cid:1) (because Y s , Y u are constants)= X v g σv (cid:0) X r g ρr ( k Y [ k ) ,ρ (cid:1) ,σ = X σ (cid:0) X ρ ( k Y [ k ) ,ρ (cid:1) ,σ = XX (cid:0) k Y [ k (cid:1) . A typical term from the third part of Mario’s formula is rewritten like this: Y u X r g usus,σ g rv,σ Y s X v = X r X v (cid:0) k Y [ k (cid:1) ,σ g rv,σ = (cid:0) k Y [ k (cid:1) ,σ (cid:0) k X [ k (cid:1) ,ρ g ρσ = g − (cid:0) d ( k Y [ k ) , d ( k X [ k ) (cid:1) ;12he other terms are similar. Finally, it is the case that:( X u Y r − Y u X r ) g λu,r ∂ λ = ( X u Y r − Y u X r ) g λuλu,η g ηr ∂ λ = ( X u Y η − Y u X η ) g λuλu,η ∂ λ = (cid:0) ( X u g λu ) ,η Y η − ( Y u g λu ) ,η X η (cid:1) ∂ λ = (cid:0) X λ,η Y η − Y λ,η X η ) ∂ λ = − [ X, Y ] , and the proof is easily completed. Remark.
It is convenient to split Mario's formula into four terms:
\[
\begin{aligned}
R_1 &:= \tfrac12\, \big(X_u Y_r - Y_u X_r\big)\, g^{su,rv}\, \big(X_s Y_v - Y_s X_v\big), && (30) \\
R_2 &:= \tfrac12\, \big(X_u Y_r - Y_u X_r\big)\, g^{us}{}_{,\rho}\, g^{\rho r,v}\, \big(X_s Y_v - Y_s X_v\big), && (31) \\
R_3 &:= \big(X_u Y_r - Y_u X_r\big) \big( -\tfrac18\, g^{us}{}_{,\sigma}\, g^{rv,\sigma} \big) \big(X_s Y_v - Y_s X_v\big), && (32) \\
R_4 &:= \big(X_u Y_r - Y_u X_r\big) \big( -\tfrac34\, g^{\lambda u,r} g_{\lambda\mu} g^{\mu s,v} \big) \big(X_s Y_v - Y_s X_v\big); && (33)
\end{aligned}
\]
all the terms with the exception of $R_4$ (where $g$ appears, but not its derivatives) depend only on elements of the cometric and their derivatives.

Remark.
The denominator of sectional curvature (17) can also be expressed in terms of the cometric:
\[
\|X\|_g^2\, \|Y\|_g^2 - \langle X, Y \rangle_g^2 = X_u X_s Y_r Y_v \big( g^{us} g^{rv} - g^{uv} g^{sr} \big). \tag{34}
\]

In this section we will apply Mario's formula to the computation of sectional curvature for the Riemannian manifold of landmarks, introduced in section 2. We first introduce the Hamiltonian formalism, since it will allow us to write the geodesic equations in a simple form and to introduce geometric quantities that will eventually appear in the formula for sectional curvature.
On the $ND$-dimensional manifold $\mathcal{L} = \mathcal{L}^N(\mathbb{R}^D)$ of landmarks we consider the Riemannian metric $g$ given, in coordinates, by the matrix (11); it is in block-diagonal form and we write its generic element as $g_{(ai)(bj)}$, with $a, b = 1, \ldots, N$ (landmark labels) and $i, j = 1, \ldots, D$ (coordinate labels, respectively of landmarks $a$ and $b$). More precisely: the matrix $g(q)$ is made of $D^2$ square ($N \times N$) blocks; indices $i, j = 1, \ldots, D$ indicate the block, whereas indices $a, b = 1, \ldots, N$ locate the element within the $(i,j)$-block. Therefore if we indicate with $h_{ab}(q)$ the generic element of the $N \times N$ matrix $\big( K(q) + \frac{I_N}{\lambda} \big)^{-1}$ we have that
\[
g_{(ai)(bj)} = h_{ab}(q)\, \delta_{ij}, \qquad a, b = 1, \ldots, N, \quad i, j = 1, \ldots, D,
\]
where $\delta_{ij}$ is Kronecker's delta. Similarly, if we indicate as $g^{(ai)(bj)}$ the elements of the cometric tensor $g(q)^{-1}$, they are given by $g^{(ai)(bj)}(q) = h^{ab}(q)\, \delta^{ij}$, where $h^{ab}(q) = K(q^a - q^b) + \frac{\delta^{ab}}{\lambda}$. In analogy with the notation introduced in section 3 we also denote the partial derivatives by
\[
g^{(ai)(bj)}{}_{,(ck)} = \frac{\partial}{\partial q^{ck}}\, g^{(ai)(bj)} \qquad \text{and} \qquad g^{(ai)(bj)}{}_{,(ck)(d\ell)} = \frac{\partial^2}{\partial q^{ck}\, \partial q^{d\ell}}\, g^{(ai)(bj)};
\]
they will be computed later.

For simplicity from now on we shall assume that $\lambda = \infty$, i.e. that we are dealing with exact matching of landmarks so that $\mathcal{L}^N(\mathbb{R}^D)$ has the form (15). The element of the cometric becomes $g^{(ai)(bj)}(q) = K(q^a - q^b)\, \delta^{ij}$ and the Hamiltonian [16, p. 50] for the system can be written as:
\[
H(p, q) = \frac12\, p^T g(q)^{-1} p = \frac12 \sum_{a,b=1}^N \sum_{i,j=1}^D g^{(ai)(bj)}(q)\, p_{ai}\, p_{bj} = \frac12 \sum_{a,b=1}^N \sum_{i,j=1}^D K(q^a - q^b)\, \delta^{ij}\, p_{ai}\, p_{bj},
\]
that is
\[
H(p, q) = \frac12 \sum_{a,b=1}^N K(q^a - q^b)\, \langle p_a, p_b \rangle_{\mathbb{R}^D}.
\]

Proposition 6.
Hamilton's equations for the Riemannian manifold of landmarks are:
\[
\begin{cases}
\dot q^a = \displaystyle\sum_{b=1}^N K(q^a - q^b)\, p_b \\[2mm]
\dot p_a = - \displaystyle\sum_{b=1}^N \nabla K(q^a - q^b)\, \langle p_a, p_b \rangle_{\mathbb{R}^D}
\end{cases}
\qquad a = 1, \ldots, N. \tag{35}
\]

Proof.
Equation (7) can be written as $\dot q^{ai} = \sum_{b=1}^N K(q^a - q^b)\, p_{bi}$, for $a = 1, \ldots, N$, $i = 1, \ldots, D$; alternatively, computing $\dot q^{ai} = \frac{\partial H}{\partial p_{ai}}$ yields the same result. Also:
\[
\begin{aligned}
\frac{\partial}{\partial q^{ai}}\, K\big(q^{b1} - q^{c1}, \ldots, q^{bD} - q^{cD}\big)
&= \sum_{\ell=1}^D \frac{\partial K}{\partial x^\ell}(q^b - q^c)\, \frac{\partial}{\partial q^{ai}} \big(q^{b\ell} - q^{c\ell}\big) \\
&= \sum_{\ell=1}^D \frac{\partial K}{\partial x^\ell}(q^b - q^c)\, (\delta^b_a - \delta^c_a)\, \delta^\ell_i
= \frac{\partial K}{\partial x^i}(q^b - q^c)\, (\delta^b_a - \delta^c_a)
\end{aligned}
\tag{36}
\]
so that
\[
\begin{aligned}
\dot p_{ai} = - \frac{\partial H}{\partial q^{ai}}(p, q)
&= -\frac12 \sum_{c=1}^N \frac{\partial K}{\partial x^i}(q^a - q^c)\, \langle p_a, p_c \rangle_{\mathbb{R}^D} + \frac12 \sum_{b=1}^N \frac{\partial K}{\partial x^i}(q^b - q^a)\, \langle p_b, p_a \rangle_{\mathbb{R}^D} \\
&\overset{(*)}{=} - \sum_{b=1}^N \frac{\partial K}{\partial x^i}(q^a - q^b)\, \langle p_a, p_b \rangle_{\mathbb{R}^D};
\end{aligned}
\]
in $(*)$ we used the skew-symmetry of $\nabla K(q^a - q^b)$ in indices $a$ and $b$, which follows from (K2).

Corollary 7. If $p_a(t_0) = 0$ for some landmark $a = 1, \ldots, N$ and time $t_0 \in \mathbb{R}$, then $p_a(t) \equiv 0$.

From now on we shall also assume that (K3) holds, i.e. that the kernel $K$ is twice continuously differentiable; for the time being we will not assume rotational invariance. We define:
\[
\begin{aligned}
& K^{ab} := K(q^a - q^b) \in \mathbb{R}, \\
\partial_i K(x) := \frac{\partial K}{\partial x^i}(x), \qquad & \partial_i K^{ab} := \partial_i K(q^a - q^b) \in \mathbb{R}, \\
\nabla K := (\partial_1 K, \ldots, \partial_D K)^T, \qquad & \nabla K^{ab} := \nabla K(q^a - q^b) \in \mathbb{R}^D, \\
\partial_{ij} K(x) := \frac{\partial^2 K}{\partial x^i\, \partial x^j}(x), \qquad & \partial_{ij} K^{ab} := \partial_{ij} K(q^a - q^b) \in \mathbb{R}, \\
D^2 K := \mathrm{Hessian}(K), \qquad & D^2 K^{ab} := D^2 K(q^a - q^b) \in \mathbb{R}^{D \times D}.
\end{aligned}
\tag{37}
\]
Note that $\nabla K^{ab} = -\nabla K^{ba}$, $\nabla K^{aa} = 0$ and $D^2 K^{ab} = D^2 K^{ba}$, for all $a, b = 1, \ldots, N$, by (K2).

For a fixed set of landmark points $q$ in $\mathcal{L} = \mathcal{L}^N(\mathbb{R}^D)$ consider any pair of cotangent vectors $\alpha, \beta \in T^*_q\mathcal{L}$: we shall write $\alpha = (\alpha_1, \ldots, \alpha_N)$ and $\beta = (\beta_1, \ldots, \beta_N)$, where each component is $D$-dimensional. We define the vector field $\alpha^{\mathrm{hor}} : \mathbb{R}^D \to \mathbb{R}^D$ and its values at the landmark points by:
\[
\alpha^{\mathrm{hor}}(x) := \sum_{b=1}^N K(x - q^b)\, \alpha_b, \quad x \in \mathbb{R}^D, \qquad (\alpha^\sharp)^a := \alpha^{\mathrm{hor}}(q^a) = \sum_{b=1}^N K^{ab}\, \alpha_b,
\]
which are, by virtue of formula (6), the velocity field $\alpha^{\mathrm{hor}}$ on $\mathbb{R}^D$ induced by the landmark momentum $\alpha = (\alpha_1, \ldots, \alpha_N)$ and the corresponding landmark velocity $\alpha^\sharp \in T_q\mathcal{L}$ (which obviously coincides with the first of Hamilton's equations (35)). Note that $\alpha^\sharp = \big((\alpha^\sharp)^1, \ldots, (\alpha^\sharp)^N\big)$ is the tangent vector in $T_q\mathcal{L}$ with metrically lifted indices. Note also that $\alpha^{\mathrm{hor}}$ is the horizontal lift [10, p. 148] of the tangent vector $\alpha^\sharp$ on the admissible Hilbert space $V$: simply put, of all vector fields $v : \mathbb{R}^D \to \mathbb{R}^D$ in $V$ such that $v(q^a) = (\alpha^\sharp)^a$, $a = 1, \ldots, N$, $\alpha^{\mathrm{hor}}$ is the one of minimum norm.

The curvature of the Riemannian manifold of landmarks will be expressed in terms of three auxiliary quantities which we now introduce. We will call these force, discrete strain and landmark derivative. We start with the force. For a fixed covector $\alpha = (\alpha_1, \ldots, \alpha_N) \in T^*_q\mathcal{L}$, having the dual vector extended to a vector field $\alpha^{\mathrm{hor}}$ on all of $\mathbb{R}^D$ allows us to take its derivatives at the landmark points, a $D \times D$ matrix-valued function on $\mathbb{R}^D$:
\[
(D\alpha^{\mathrm{hor}})^j_i(x) := \partial_i (\alpha^{\mathrm{hor}})^j(x) = \sum_{b=1}^N \alpha_{bj}\, \partial_i K(x - q^b), \qquad (D\alpha^{\mathrm{hor}})^j_i(q^a) = \sum_{b=1}^N \partial_i K^{ab}\, \alpha_{bj}.
\]
For a trajectory $(q(t), p(t))$ of the cotangent flow one has that $(p_1(t), \ldots, p_N(t)) \in T^*_{q(t)}\mathcal{L}$ for all $t$ where the trajectory is defined, so the above notation can be used to rewrite Hamilton's equations in a more compact form. In particular, the following result holds.

Proposition 8.
The second of Hamilton's equations (35) can be written as
\[
\dot p_a = - Dp^{\mathrm{hor}}(q^a) \cdot p_a, \qquad a = 1, \ldots, N. \tag{38}
\]

Proof.
\[
\dot p_{ai} = - \sum_{b=1}^N \partial_i K^{ab}\, \langle p_b, p_a \rangle_{\mathbb{R}^D} = - \sum_{j=1}^D \Big( \sum_{b=1}^N \partial_i K^{ab}\, p_{bj} \Big) p_{aj} = - \sum_{j=1}^D (Dp^{\mathrm{hor}})^j_i(q^a)\, p_{aj} = - \big( Dp^{\mathrm{hor}}(q^a) \cdot p_a \big)_i,
\]
for any $a = 1, \ldots, N$ and $i = 1, \ldots, D$.

For a fixed cotangent vector $\alpha \in T^*_q\mathcal{L}$, this motivates defining the negative right-hand side of (38) to be the force:
\[
F_a(\alpha, \alpha) := D\alpha^{\mathrm{hor}}(q^a) \cdot \alpha_a, \qquad a = 1, \ldots, N.
\]
Polarizing this quadratic expression gives a symmetric bilinear map $F : T^*_q\mathcal{L} \times T^*_q\mathcal{L} \to T^*_q\mathcal{L}$. We call the covectors given by this map the mixed force, with the definition:
\[
\begin{aligned}
F_a(\alpha, \beta) &:= \frac12 \big( D\alpha^{\mathrm{hor}}(q^a) \cdot \beta_a + D\beta^{\mathrm{hor}}(q^a) \cdot \alpha_a \big), \\
F_{ai}(\alpha, \beta) &:= \frac12 \sum_{j=1}^D \sum_{b=1}^N \partial_i K^{ab} \big( \alpha_{bj}\, \beta_{aj} + \beta_{bj}\, \alpha_{aj} \big) = \frac12 \sum_{b=1}^N \partial_i K^{ab} \big( \langle \alpha_a, \beta_b \rangle_{\mathbb{R}^D} + \langle \beta_a, \alpha_b \rangle_{\mathbb{R}^D} \big),
\end{aligned}
\tag{39}
\]
for $a = 1, \ldots, N$ and $i = 1, \ldots, D$. (The angle brackets are inner products in $\mathbb{R}^D$.) Note that the "complete" cotangent vectors $\alpha = (\alpha_1, \ldots, \alpha_N)$ and $\beta = (\beta_1, \ldots, \beta_N)$ (not only their $a$-components) are needed to compute each component $F_a(\alpha, \beta)$ of the mixed force. The mixed force has a simple interpretation. If we extend $\alpha$ and $\beta$ to constant 1-forms on $\mathcal{L}$, then the differential of the map $q \mapsto g_q^{-1}(\alpha, \beta) = \sum_{a,b} K(q^a - q^b)\, \langle \alpha_a, \beta_b \rangle_{\mathbb{R}^D}$ is given by:
\[
\begin{aligned}
d\big( g_q^{-1}(\alpha, \beta) \big) &= \sum_{a,b=1}^N \sum_{i=1}^D \partial_i K(q^a - q^b)\, (dq^{ai} - dq^{bi})\, \langle \alpha_a, \beta_b \rangle_{\mathbb{R}^D} \\
&= \sum_{a,b=1}^N \sum_{i=1}^D \partial_i K(q^a - q^b) \big( \langle \alpha_a, \beta_b \rangle_{\mathbb{R}^D} + \langle \beta_a, \alpha_b \rangle_{\mathbb{R}^D} \big)\, dq^{ai} = 2 F(\alpha, \beta).
\end{aligned}
\tag{40}
\]

For a fixed $\alpha \in T^*_q\mathcal{L}$ we define the discrete vector strain:
\[
S^{ab}(\alpha) := (\alpha^\sharp)^a - (\alpha^\sharp)^b, \qquad \text{or} \qquad S^{ab}(\alpha)^i := \sum_{c=1}^N \sum_{j=1}^D (K^{ac} - K^{bc})\, \delta^{ij}\, \alpha_{cj} = \sum_{c=1}^N (K^{ac} - K^{bc})\, \alpha_{ci}
\]
for all $a, b = 1, \ldots, N$ (we call it this because it measures the infinitesimal change of relative position of the landmarks $a$ and $b$ induced by the cotangent vector $\alpha$).
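The relations (35), (38), (39) and (40) above can be checked numerically. The following is a minimal sketch, not part of the original text: it assumes a Gaussian kernel $K(x) = \exp(-\|x\|^2 / 2\sigma^2)$ purely for illustration, and the function names (`gauss`, `mixed_force`, `hamilton_rhs`, and so on) are hypothetical. It compares $2F(\alpha, \beta)$ with a central finite-difference approximation of $d\big(g_q^{-1}(\alpha, \beta)\big)$ and verifies that $\dot p = -F(p, p)$.

```python
import numpy as np

def gauss(x, sigma=1.0):
    """Illustrative kernel K(x) = exp(-|x|^2 / (2 sigma^2)); an assumption, not prescribed by the text."""
    return np.exp(-np.sum(x**2, axis=-1) / (2.0 * sigma**2))

def grad_gauss(x, sigma=1.0):
    """Gradient of the illustrative kernel with respect to x."""
    return -x / sigma**2 * gauss(x, sigma)[..., None]

def cometric_pairing(q, alpha, beta):
    """g_q^{-1}(alpha, beta) = sum_{a,b} K(q^a - q^b) <alpha_a, beta_b>."""
    K = gauss(q[:, None, :] - q[None, :, :])             # K[a, b] = K^{ab}
    return np.einsum("ab,ai,bi->", K, alpha, beta)

def mixed_force(q, alpha, beta):
    """Mixed force F_a(alpha, beta) of (39), including the factor 1/2."""
    dK = grad_gauss(q[:, None, :] - q[None, :, :])        # dK[a, b] = grad K^{ab}
    ab = alpha @ beta.T                                   # ab[a, b] = <alpha_a, beta_b>
    return 0.5 * np.einsum("abi,ab->ai", dK, ab + ab.T)

def hamilton_rhs(q, p):
    """Right-hand sides of Hamilton's equations (35)."""
    diff = q[:, None, :] - q[None, :, :]
    q_dot = gauss(diff) @ p
    p_dot = -np.einsum("abi,ab->ai", grad_gauss(diff), p @ p.T)
    return q_dot, p_dot

def numerical_differential(q, alpha, beta, eps=1e-6):
    """Central finite differences of q -> g_q^{-1}(alpha, beta), with alpha, beta held constant."""
    grad = np.zeros_like(q)
    for a in range(q.shape[0]):
        for i in range(q.shape[1]):
            dq = np.zeros_like(q)
            dq[a, i] = eps
            grad[a, i] = (cometric_pairing(q + dq, alpha, beta)
                          - cometric_pairing(q - dq, alpha, beta)) / (2.0 * eps)
    return grad

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 2))        # N = 4 landmarks in D = 2 dimensions
alpha = rng.normal(size=(4, 2))
beta = rng.normal(size=(4, 2))

# (40): d(g^{-1}(alpha, beta)) = 2 F(alpha, beta)
err_40 = np.max(np.abs(numerical_differential(q, alpha, beta) - 2.0 * mixed_force(q, alpha, beta)))
# (38): p_dot = -F(p, p), here with p = alpha
_, p_dot = hamilton_rhs(q, alpha)
err_38 = np.max(np.abs(p_dot + mixed_force(q, alpha, alpha)))
print(err_40, err_38)
```

Both discrepancies should be at the level of finite-difference and rounding error; any other smooth symmetric kernel could be substituted for the Gaussian.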
These are vectors and are skew-symmetric in the points $a, b$: $S^{ab}(\alpha) = -S^{ba}(\alpha)$, $S^{aa}(\alpha) = 0$. The scalar quantities:
\[
C^{ab}(\alpha) := \big\langle (\alpha^\sharp)^a - (\alpha^\sharp)^b,\, \nabla K^{ab} \big\rangle_{\mathbb{R}^D} = \sum_{c=1}^N \sum_{i=1}^D (K^{ac} - K^{bc})\, \partial_i K^{ab}\, \alpha_{ci}
\]
we define to be the scalar compressions felt by kernel $K$; they are symmetric (since both factors in the inner product are skew-symmetric), i.e. $C^{ab}(\alpha) = C^{ba}(\alpha)$, with the property $C^{aa}(\alpha) = 0$. We call these compressions because if $K$ is a monotone decreasing function of the distance from the origin (the most common case), then $\nabla K^{ab}$ points from $q^a$ to $q^b$.

Finally, if $v$ and $w$ are any two vector fields on the manifold of landmarks, we may write their Lie bracket as the difference of covariant derivatives:
\[
[v, w]_{\mathcal{L}} = \nabla^{\mathcal{L},\mathrm{flat}}_v (w) - \nabla^{\mathcal{L},\mathrm{flat}}_w (v)
\]
where the flat connection on $\mathcal{L}$ is just the one induced by its embedding in $\mathbb{R}^{ND}$. In other words, $\nabla^{\mathcal{L},\mathrm{flat}}_v (w)$ is the usual derivative of $w$ in the direction $v$ if we use the coordinates $q^{ai}$ on landmark space: that is, $\nabla^{\mathcal{L},\mathrm{flat}}_v (w) := \sum_{ai} v(w^{ai})\, \partial_{ai} = \sum_{ai} \sum_{bj} v^{bj} (\partial_{bj} w^{ai})\, \partial_{ai}$. If $\alpha$, $\beta$ are constant 1-forms everywhere on $\mathcal{L}^N$ we can take $v = \alpha^\sharp$ and $w = \beta^\sharp$, now as vector fields on $\mathcal{L}$, and then we find:
\[
\begin{aligned}
\nabla^{\mathcal{L},\mathrm{flat}}_{\alpha^\sharp} (\beta^\sharp)
&= \sum_{a,i} \sum_{b,j} (\alpha^\sharp)^{bj}\, \frac{\partial}{\partial q^{bj}} (\beta^\sharp)^{ai}\, \partial_{ai}
= \sum_{a,i} \sum_{b,j} (\alpha^\sharp)^{bj} \Big( \frac{\partial}{\partial q^{bj}} \sum_c K(q^a - q^c)\, \beta_{ci} \Big) \partial_{ai} \\
&= \sum_{a,i} \sum_{b,c,j} (\alpha^\sharp)^{bj}\, \partial_j K^{ac}\, (\delta^{ab} - \delta^{cb})\, \beta_{ci}\, \partial_{ai}
= \sum_{a,i} \sum_{b,j} \big( (\alpha^\sharp)^{aj} - (\alpha^\sharp)^{bj} \big)\, \partial_j K^{ab}\, \beta_{bi}\, \partial_{ai} \\
&= \sum_{a,i} \sum_b \big\langle (\alpha^\sharp)^a - (\alpha^\sharp)^b,\, \nabla K^{ab} \big\rangle_{\mathbb{R}^D}\, \beta_{bi}\, \partial_{ai}
= \sum_{a,i} \Big( \sum_b C^{ab}(\alpha)\, \beta_{bi} \Big) \partial_{ai}.
\end{aligned}
\]
This is a vector in $T_q\mathcal{L}$ which we define to be the landmark derivative of $\beta^\sharp$ with respect to $\alpha^\sharp$. The coefficients with respect to $\partial_{a1}, \ldots, \partial_{aD}$ (for fixed $a$) are the elements of the following vector:
\[
D^a(\alpha, \beta) := \sum_{b=1}^N C^{ab}(\alpha)\, \beta_b = \sum_{b,c=1}^N (K^{ac} - K^{bc})\, \langle \alpha_c, \nabla K^{ab} \rangle_{\mathbb{R}^D}\, \beta_b, \qquad a = 1, \ldots, N. \tag{41}
\]
We have that $D(\alpha, \beta) = (D^a(\alpha, \beta))_{a=1}^N$ is the $ND$-dimensional vector of the coefficients of $\nabla^{\mathcal{L},\mathrm{flat}}_{\alpha^\sharp} (\beta^\sharp)$ with respect to the basis $\{\partial_{ai}\}$ of $T_q\mathcal{L}$. In particular, the coefficients of the Lie bracket of $\alpha^\sharp$ and $\beta^\sharp$ as vector fields on $\mathcal{L}$ are given by $D(\alpha, \beta) - D(\beta, \alpha)$.

We can write the sectional curvature of $\mathcal{L}^N(\mathbb{R}^D)$ in the following way, where we have split it into the terms introduced by (30)–(33). Notation: from now on $\langle\,,\rangle$ will indicate the dot product in $\mathbb{R}^D$, while $\langle\,,\rangle_{T\mathcal{L}}$ and $\langle\,,\rangle_{T^*\mathcal{L}}$ will be the inner products in the tangent and cotangent bundles of $\mathcal{L} = \mathcal{L}^N(\mathbb{R}^D)$, respectively.

Theorem 9.
The numerator of sectional curvature of $\mathcal{L}^N(\mathbb{R}^D)$, for an arbitrary pair of cotangent vectors $\alpha$ and $\beta$, is given by $R(\alpha^\sharp, \beta^\sharp, \beta^\sharp, \alpha^\sharp) = \sum_{i=1}^4 R_i$, with:
\[
R_1 = \frac12 \sum_{a \neq b} \big( \alpha_a \otimes S^{ab}(\beta) - \beta_a \otimes S^{ab}(\alpha) \big)^T \big( I_D \otimes D^2 K^{ab} \big) \big( \alpha_b \otimes S^{ab}(\beta) - \beta_b \otimes S^{ab}(\alpha) \big), \tag{42}
\]
\[
R_2 = \sum_a \Big( \big\langle D^a(\alpha, \alpha), F_a(\beta, \beta) \big\rangle + \big\langle D^a(\beta, \beta), F_a(\alpha, \alpha) \big\rangle - \big\langle D^a(\alpha, \beta) + D^a(\beta, \alpha), F_a(\alpha, \beta) \big\rangle \Big), \tag{43}
\]
\[
R_3 = \big\| F(\alpha, \beta) \big\|^2_{T^*\mathcal{L}} - \big\langle F(\alpha, \alpha), F(\beta, \beta) \big\rangle_{T^*\mathcal{L}} = \sum_{a,c} K^{ac} \Big( \big\langle F_a(\alpha, \beta), F_c(\alpha, \beta) \big\rangle - \big\langle F_a(\alpha, \alpha), F_c(\beta, \beta) \big\rangle \Big), \tag{44}
\]
\[
R_4 = -\frac34\, \big\| [\alpha^\sharp, \beta^\sharp]_{\mathcal{L}} \big\|^2_{T\mathcal{L}} = -\frac34\, \big\| D(\alpha, \beta) - D(\beta, \alpha) \big\|^2_{K^{-1}}. \tag{45}
\]
In the formula we have used the definition:
\[
(v_1 \otimes v_2)^T (M_1 \otimes M_2)(w_1 \otimes w_2) := (v_1^T M_1 w_1)(v_2^T M_2 w_2)
\]
for the first term $R_1$, while we have used the norm for $D \times N$ matrices $\|J\|^2_A := \sum_{i=1}^D \sum_{a,b=1}^N J^{ia} J^{ib} A_{ab}$ for the fourth term $R_4$.

The theorem is proven by applying Mario's formula to the cometric of the manifolds of landmarks. One needs to compute the elements of the cometric and its derivatives in terms of the kernel and its derivatives (37). In agreement with notation (20) we will define (note that we will keep using Einstein's summation convention wherever possible):
\[
g^{(ai)(bj),(d\ell)} := g^{(ai)(bj)}{}_{,(ck)}\; g^{(ck)(d\ell)} \qquad \text{and} \qquad g^{(ai)(bj),(ck)(d\ell)} := g^{(ai)(bj)}{}_{,(\mu\rho)(\xi\sigma)}\; g^{(\mu\rho)(ck)}\, g^{(\xi\sigma)(d\ell)}.
\]

Lemma 10.
It is the case that
\[
\begin{aligned}
g^{(ai)(bj)}{}_{,(ck)} &= \partial_k K^{ab}\, (\delta^a_c - \delta^b_c)\, \delta^{ij}, && (46) \\
g^{(ai)(bj)}{}_{,(ck)(d\ell)} &= \partial_{k\ell} K^{ab}\, (\delta^a_c - \delta^b_c)\, (\delta^a_d - \delta^b_d)\, \delta^{ij}, && (47) \\
g^{(ai)(bj),(d\ell)} &= \partial_\ell K^{ab}\, (K^{ad} - K^{bd})\, \delta^{ij}, && (48) \\
g^{(ai)(bj),(ck)(d\ell)} &= \partial_{k\ell} K^{ab}\, (K^{ac} - K^{bc})\, (K^{ad} - K^{bd})\, \delta^{ij}. && (49)
\end{aligned}
\]

Proof.
Since $g^{(ai)(bj)} = K^{ab}\, \delta^{ij}$ and also $\frac{\partial}{\partial q^{ck}} K(q^a - q^b) = \partial_k K^{ab}\, (\delta^a_c - \delta^b_c)$ by (36), equation (46) follows immediately. Similarly to (36) one can prove that $\frac{\partial}{\partial q^{d\ell}}\, \partial_k K(q^a - q^b) = \partial_{\ell k} K^{ab}\, (\delta^a_d - \delta^b_d)$, whence:
\[
g^{(ai)(bj)}{}_{,(ck)(d\ell)} = \frac{\partial}{\partial q^{d\ell}}\, g^{(ai)(bj)}{}_{,(ck)} = \partial_{\ell k} K^{ab}\, (\delta^a_d - \delta^b_d)\, (\delta^a_c - \delta^b_c)\, \delta^{ij},
\]
so (47) holds too. Now, by expression (46):
\[
g^{(ai)(bj),(d\ell)} = g^{(ai)(bj)}{}_{,(ck)}\; g^{(ck)(d\ell)} = \sum_{c,k} \partial_k K^{ab}\, (\delta^a_c - \delta^b_c)\, \delta^{ij}\, K^{cd}\, \delta^{k\ell} = \partial_\ell K^{ab}\, (K^{ad} - K^{bd})\, \delta^{ij},
\]
which is (48). We can use (47) to compute $g^{(ai)(bj),(ck)(d\ell)} = g^{(ai)(bj)}{}_{,(\mu\rho)(\xi\sigma)}\; g^{(\mu\rho)(ck)}\, g^{(\xi\sigma)(d\ell)}$:
\[
g^{(ai)(bj),(ck)(d\ell)} = \sum_{\mu,\rho,\xi,\sigma} \partial_{\rho\sigma} K^{ab}\, (\delta^a_\mu - \delta^b_\mu)\, (\delta^a_\xi - \delta^b_\xi)\, \delta^{ij}\, K^{\mu c}\, \delta^{\rho k}\, K^{\xi d}\, \delta^{\sigma\ell} = \partial_{k\ell} K^{ab}\, (K^{ac} - K^{bc})\, (K^{ad} - K^{bd})\, \delta^{ij},
\]
which completes the proof.

Proof of Theorem 9.
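We will compute terms $R_1, \ldots, R_4$ introduced by formulae (30)–(33). For simplicity, sometimes we will write $D\alpha^{\mathrm{hor}}_a$ instead of $D\alpha^{\mathrm{hor}}(q^a)$.

• Computation of $R_1$. We have $2R_1 = (\alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr})\; g^{(au)(bs),(cr)(dv)}\; (\alpha_{bs} \beta_{dv} - \beta_{bs} \alpha_{dv})$. Inserting expression (49) into such formula yields:
\[
2R_1 = \sum_{\text{all indices}} \big( \alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr} \big)\, \partial_{rv} K^{ab}\, (K^{ac} - K^{bc})\, (K^{ad} - K^{bd})\, \delta^{us}\, \big( \alpha_{bs} \beta_{dv} - \beta_{bs} \alpha_{dv} \big).
\]
Performing the above multiplications gives rise to four terms, which we will now compute one by one. First of all we have:
\[
\begin{aligned}
2R_{1,1} &:= \sum_{\text{all indices}} \alpha_{au}\, \beta_{cr}\, \alpha_{bs}\, \beta_{dv}\, \partial_{rv} K^{ab}\, (K^{ac} - K^{bc})\, (K^{ad} - K^{bd})\, \delta^{us} \\
&= \sum_{a,b,r,v} \Big[ \sum_{u,s} \alpha_{au}\, \delta^{us}\, \alpha_{bs} \Big] \Big[ \sum_c (K^{ac} - K^{bc})\, \beta_{cr} \Big]\, \partial_{rv} K^{ab}\, \Big[ \sum_d (K^{ad} - K^{bd})\, \beta_{dv} \Big] \\
&= \sum_{a,b} \alpha_a^T \alpha_b \sum_{r,v} S^{ab}(\beta)^r\, \partial_{rv} K^{ab}\, S^{ab}(\beta)^v
= \sum_{a,b} \alpha_a^T \alpha_b\, \big( S^{ab}(\beta) \big)^T D^2 K^{ab}\, S^{ab}(\beta) \\
&= \sum_{a,b} \big( \alpha_a \otimes S^{ab}(\beta) \big)^T \big( I_D \otimes D^2 K^{ab} \big) \big( \alpha_b \otimes S^{ab}(\beta) \big),
\end{aligned}
\]
where $T$ indicates the transpose of a vector; similarly,
\[
\begin{aligned}
2R_{1,2} &:= - \sum_{\text{all}} \alpha_{au}\, \beta_{cr}\, \beta_{bs}\, \alpha_{dv}\, \partial_{rv} K^{ab}\, (K^{ac} - K^{bc})\, (K^{ad} - K^{bd})\, \delta^{us}
= - \sum_{a,b} \big( \alpha_a \otimes S^{ab}(\beta) \big)^T \big( I_D \otimes D^2 K^{ab} \big) \big( \beta_b \otimes S^{ab}(\alpha) \big), \\
2R_{1,3} &:= - \sum_{\text{all}} \beta_{au}\, \alpha_{cr}\, \alpha_{bs}\, \beta_{dv}\, \partial_{rv} K^{ab}\, (K^{ac} - K^{bc})\, (K^{ad} - K^{bd})\, \delta^{us}
= - \sum_{a,b} \big( \beta_a \otimes S^{ab}(\alpha) \big)^T \big( I_D \otimes D^2 K^{ab} \big) \big( \alpha_b \otimes S^{ab}(\beta) \big), \\
2R_{1,4} &:= \sum_{\text{all}} \beta_{au}\, \alpha_{cr}\, \beta_{bs}\, \alpha_{dv}\, \partial_{rv} K^{ab}\, (K^{ac} - K^{bc})\, (K^{ad} - K^{bd})\, \delta^{us}
= \sum_{a,b} \big( \beta_a \otimes S^{ab}(\alpha) \big)^T \big( I_D \otimes D^2 K^{ab} \big) \big( \beta_b \otimes S^{ab}(\alpha) \big).
\end{aligned}
\]
Now we can take the summation $R_1 = \sum_{i=1}^4 R_{1,i}$, which yields precisely expression (42) (the terms with $a = b$ vanish because $S^{aa} = 0$).

• Computation of $R_2$.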
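We may combine equations (46) and (48) from Lemma 10 to get:
\[
\begin{aligned}
g^{(au)(bs)}{}_{,(\lambda\rho)}\; g^{(\lambda\rho)(cr),(dv)}
&= \sum_{\lambda,\rho} \partial_\rho K^{ab}\, (\delta^a_\lambda - \delta^b_\lambda)\, \delta^{us}\; \partial_v K^{\lambda c}\, (K^{\lambda d} - K^{cd})\, \delta^{\rho r} \\
&= \partial_r K^{ab} \big[ \partial_v K^{ac}\, (K^{ad} - K^{cd}) - \partial_v K^{bc}\, (K^{bd} - K^{cd}) \big]\, \delta^{us}.
\end{aligned}
\tag{50}
\]
Inserting (50) into $2R_2 = (\alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr})\; g^{(au)(bs)}{}_{,(\lambda\rho)}\; g^{(\lambda\rho)(cr),(dv)}\; (\alpha_{bs} \beta_{dv} - \beta_{bs} \alpha_{dv})$ yields:
\[
\begin{aligned}
2R_2 = \sum_{\text{all indices}} \Big\{
& \alpha_{au}\, \beta_{cr}\, \alpha_{bs}\, \beta_{dv}\; \partial_r K^{ab} \big[ \partial_v K^{ac}\, (K^{ad} - K^{cd}) - \partial_v K^{bc}\, (K^{bd} - K^{cd}) \big]\, \delta^{us} \\
{}- {}& \alpha_{au}\, \beta_{cr}\, \beta_{bs}\, \alpha_{dv}\; \partial_r K^{ab} \big[ \partial_v K^{ac}\, (K^{ad} - K^{cd}) - \partial_v K^{bc}\, (K^{bd} - K^{cd}) \big]\, \delta^{us} \\
{}- {}& \beta_{au}\, \alpha_{cr}\, \alpha_{bs}\, \beta_{dv}\; \partial_r K^{ab} \big[ \partial_v K^{ac}\, (K^{ad} - K^{cd}) - \partial_v K^{bc}\, (K^{bd} - K^{cd}) \big]\, \delta^{us} \\
{}+ {}& \beta_{au}\, \alpha_{cr}\, \beta_{bs}\, \alpha_{dv}\; \partial_r K^{ab} \big[ \partial_v K^{ac}\, (K^{ad} - K^{cd}) - \partial_v K^{bc}\, (K^{bd} - K^{cd}) \big]\, \delta^{us} \Big\},
\end{aligned}
\]
which immediately implies:
\[
\begin{aligned}
R_2 ={}& \tfrac12 \sum_{a,b,c,d} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \nabla K^{ab} \rangle \big[ \langle \beta_d, \nabla K^{ac} \rangle (K^{ad} - K^{cd}) - \langle \beta_d, \nabla K^{bc} \rangle (K^{bd} - K^{cd}) \big] && (=: R_{2,1}) \\
& - \tfrac12 \sum_{a,b,c,d} \langle \alpha_a, \beta_b \rangle \langle \beta_c, \nabla K^{ab} \rangle \big[ \langle \alpha_d, \nabla K^{ac} \rangle (K^{ad} - K^{cd}) - \langle \alpha_d, \nabla K^{bc} \rangle (K^{bd} - K^{cd}) \big] && (=: R_{2,2}) \\
& - \tfrac12 \sum_{a,b,c,d} \langle \beta_a, \alpha_b \rangle \langle \alpha_c, \nabla K^{ab} \rangle \big[ \langle \beta_d, \nabla K^{ac} \rangle (K^{ad} - K^{cd}) - \langle \beta_d, \nabla K^{bc} \rangle (K^{bd} - K^{cd}) \big] && (=: R_{2,3}) \\
& + \tfrac12 \sum_{a,b,c,d} \langle \beta_a, \beta_b \rangle \langle \alpha_c, \nabla K^{ab} \rangle \big[ \langle \alpha_d, \nabla K^{ac} \rangle (K^{ad} - K^{cd}) - \langle \alpha_d, \nabla K^{bc} \rangle (K^{bd} - K^{cd}) \big] && (=: R_{2,4})
\end{aligned}
\]
We will now manipulate terms $R_{2,1}, \ldots, R_{2,4}$ one by one.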
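Since $\nabla K^{ab} = -\nabla K^{ba}$, by relabeling the indices we have
\[
\begin{aligned}
R_{2,1} &= \sum_{a,b,c,d} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \nabla K^{ab} \rangle \langle \beta_d, \nabla K^{ac} \rangle (K^{ad} - K^{cd}) \\
&= \sum_{a,b,c} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \nabla K^{ab} \rangle \Big\langle \sum_d K^{ad} \beta_d - \sum_d K^{cd} \beta_d,\; \nabla K^{ac} \Big\rangle \\
&= \sum_{a,b,c} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \nabla K^{ab} \rangle \langle S^{ac}(\beta), \nabla K^{ac} \rangle
= \sum_{a,b,c} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \nabla K^{ab} \rangle\, C^{ac}(\beta) \\
&= \sum_{a,b} \langle \alpha_a, \alpha_b \rangle \Big\langle \sum_c C^{ac}(\beta)\, \beta_c,\; \nabla K^{ab} \Big\rangle
= \sum_{a,b} \langle \alpha_a, \alpha_b \rangle \langle D^a(\beta, \beta), \nabla K^{ab} \rangle \\
&= \sum_{a,b} D^a(\beta, \beta)^T\, \nabla K^{ab}\, \alpha_b^T \alpha_a
= \sum_a D^a(\beta, \beta)^T\, D\alpha^{\mathrm{hor}}_a \cdot \alpha_a
= \sum_a \langle D^a(\beta, \beta), F_a(\alpha, \alpha) \rangle.
\end{aligned}
\]
Similarly, $R_{2,4} = \sum_a \langle D^a(\alpha, \alpha), F_a(\beta, \beta) \rangle$. It is also the case that
\[
\begin{aligned}
R_{2,2} &= -\tfrac12 \sum_{a,b,c} \langle \alpha_a, \beta_b \rangle \langle \beta_c, \nabla K^{ab} \rangle \Big[ \Big\langle \sum_d (K^{ad} - K^{cd})\, \alpha_d,\; \nabla K^{ac} \Big\rangle - \Big\langle \sum_d (K^{bd} - K^{cd})\, \alpha_d,\; \nabla K^{bc} \Big\rangle \Big] \\
&= -\tfrac12 \sum_{a,b,c} \langle \alpha_a, \beta_b \rangle \langle \beta_c, \nabla K^{ab} \rangle \big[ \langle S^{ac}(\alpha), \nabla K^{ac} \rangle - \langle S^{bc}(\alpha), \nabla K^{bc} \rangle \big] \\
&= -\tfrac12 \sum_{a,b,c} \langle \alpha_a, \beta_b \rangle \langle \beta_c, \nabla K^{ab} \rangle \big[ C^{ac}(\alpha) - C^{bc}(\alpha) \big];
\end{aligned}
\]
relabeling the indices (and using the fact that $\nabla K^{ab} = -\nabla K^{ba}$) yields:
\[
\begin{aligned}
R_{2,2} &= -\tfrac12 \sum_{a,b,c} \big[ \langle \alpha_a, \beta_b \rangle + \langle \alpha_b, \beta_a \rangle \big] \langle \beta_c, \nabla K^{ab} \rangle\, C^{ac}(\alpha)
= -\tfrac12 \sum_{a,b} \big[ \langle \alpha_a, \beta_b \rangle + \langle \alpha_b, \beta_a \rangle \big] \Big\langle \sum_c C^{ac}(\alpha)\, \beta_c,\; \nabla K^{ab} \Big\rangle \\
&= -\tfrac12 \sum_{a,b} \big[ \langle \alpha_a, \beta_b \rangle + \langle \alpha_b, \beta_a \rangle \big] \langle D^a(\alpha, \beta), \nabla K^{ab} \rangle
= -\tfrac12 \sum_{a,b} D^a(\alpha, \beta)^T \big[ \nabla K^{ab}\, \beta_b^T \alpha_a + \nabla K^{ab}\, \alpha_b^T \beta_a \big] \\
&= -\tfrac12 \sum_a D^a(\alpha, \beta)^T \big[ D\beta^{\mathrm{hor}}_a \cdot \alpha_a + D\alpha^{\mathrm{hor}}_a \cdot \beta_a \big]
= - \sum_a \langle D^a(\alpha, \beta), F_a(\alpha, \beta) \rangle.
\end{aligned}
\]
Similarly, $R_{2,3} = - \sum_a \langle D^a(\beta, \alpha), F_a(\beta, \alpha) \rangle$. By the symmetry of $F_a(\cdot, \cdot)$,
\[
R_{2,2} + R_{2,3} = - \sum_a \langle D^a(\alpha, \beta) + D^a(\beta, \alpha),\, F_a(\alpha, \beta) \rangle.
\]
Adding the above sum to the expressions for $R_{2,1}$ and $R_{2,4}$ finally yields (43).

• Computation of $R_3$. We have
\[
R_3 = -\tfrac18\, \big( \alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr} \big)\; g^{(au)(bs)}{}_{,(\eta\sigma)}\; g^{(cr)(dv),(\eta\sigma)}\; \big( \alpha_{bs} \beta_{dv} - \beta_{bs} \alpha_{dv} \big).
\]
But by Lemma 10,
\[
\begin{aligned}
g^{(au)(bs)}{}_{,(\eta\sigma)}\; g^{(cr)(dv),(\eta\sigma)}
&= \sum_{\eta=1}^N \sum_{\sigma=1}^D \partial_\sigma K^{ab}\, (\delta^a_\eta - \delta^b_\eta)\, \delta^{us}\; \partial_\sigma K^{cd}\, (K^{c\eta} - K^{d\eta})\, \delta^{rv} \\
&= \langle \nabla K^{ab}, \nabla K^{cd} \rangle\, \delta^{us}\, \delta^{rv}\, (K^{ac} - K^{ad} - K^{bc} + K^{bd}),
\end{aligned}
\]
whence:
\[
\begin{aligned}
-8 R_3 &= \sum_{\text{all}} \Big\{ \langle \nabla K^{ab}, \nabla K^{cd} \rangle\, (K^{ac} - K^{ad} - K^{bc} + K^{bd}) \\
&\qquad \cdot \big( \alpha_{au} \beta_{cr} \alpha_{bs} \beta_{dv}\, \delta^{us} \delta^{rv} - \alpha_{au} \beta_{cr} \beta_{bs} \alpha_{dv}\, \delta^{us} \delta^{rv} - \beta_{au} \alpha_{cr} \alpha_{bs} \beta_{dv}\, \delta^{us} \delta^{rv} + \beta_{au} \alpha_{cr} \beta_{bs} \alpha_{dv}\, \delta^{us} \delta^{rv} \big) \Big\} \\
&= \sum_{a,b,c,d} \Big\{ \big( \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \beta_d \rangle - \langle \alpha_a, \beta_b \rangle \langle \beta_c, \alpha_d \rangle - \langle \beta_a, \alpha_b \rangle \langle \alpha_c, \beta_d \rangle + \langle \beta_a, \beta_b \rangle \langle \alpha_c, \alpha_d \rangle \big) \\
&\qquad \cdot \langle \nabla K^{ab}, \nabla K^{cd} \rangle\, (K^{ac} - K^{ad} - K^{bc} + K^{bd}) \Big\}.
\end{aligned}
\]
Relabeling the indices in the above expression yields:
\[
\begin{aligned}
-8 R_3 &= 2 \sum_{a,b,c,d} \big[ 4 \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \beta_d \rangle - \langle \alpha_a, \beta_b \rangle \langle \beta_c, \alpha_d \rangle - \langle \beta_a, \alpha_b \rangle \langle \alpha_c, \beta_d \rangle \\
&\qquad\qquad - \langle \alpha_a, \beta_b \rangle \langle \beta_d, \alpha_c \rangle - \langle \beta_a, \alpha_b \rangle \langle \alpha_d, \beta_c \rangle \big]\, \langle \nabla K^{ab}, \nabla K^{cd} \rangle\, K^{ac} \\
&= 2 \sum_{a,b,c,d} K^{ac} \big[ 4\, \alpha_a^T \alpha_b\, (\nabla K^{ab})^T \nabla K^{cd}\, \beta_d^T \beta_c - \alpha_a^T \beta_b\, (\nabla K^{ab})^T \nabla K^{cd}\, \alpha_d^T \beta_c - \beta_a^T \alpha_b\, (\nabla K^{ab})^T \nabla K^{cd}\, \beta_d^T \alpha_c \\
&\qquad\qquad - \alpha_a^T \beta_b\, (\nabla K^{ab})^T \nabla K^{cd}\, \beta_d^T \alpha_c - \beta_a^T \alpha_b\, (\nabla K^{ab})^T \nabla K^{cd}\, \alpha_d^T \beta_c \big] \\
&= 2 \sum_{a,c} K^{ac} \big[ 4\, \alpha_a^T (D\alpha^{\mathrm{hor}}_a)^T (D\beta^{\mathrm{hor}}_c)\, \beta_c - \alpha_a^T (D\beta^{\mathrm{hor}}_a)^T (D\alpha^{\mathrm{hor}}_c)\, \beta_c - \beta_a^T (D\alpha^{\mathrm{hor}}_a)^T (D\beta^{\mathrm{hor}}_c)\, \alpha_c \\
&\qquad\qquad - \alpha_a^T (D\beta^{\mathrm{hor}}_a)^T (D\beta^{\mathrm{hor}}_c)\, \alpha_c - \beta_a^T (D\alpha^{\mathrm{hor}}_a)^T (D\alpha^{\mathrm{hor}}_c)\, \beta_c \big] \\
&= 2 \sum_{a,c} K^{ac} \big[ 4 \langle D\alpha^{\mathrm{hor}}_a \cdot \alpha_a,\; D\beta^{\mathrm{hor}}_c \cdot \beta_c \rangle - \langle D\alpha^{\mathrm{hor}}_a \cdot \beta_a + D\beta^{\mathrm{hor}}_a \cdot \alpha_a,\; D\alpha^{\mathrm{hor}}_c \cdot \beta_c + D\beta^{\mathrm{hor}}_c \cdot \alpha_c \rangle \big] \\
&= 8 \sum_{a,c} K^{ac} \big[ \langle F_a(\alpha, \alpha), F_c(\beta, \beta) \rangle - \langle F_a(\alpha, \beta), F_c(\alpha, \beta) \rangle \big],
\end{aligned}
\]
which is precisely (44). Alternatively, this can be derived from formula (40).

• Computation of $R_4$. It is the case that:
\[
R_4 = -\tfrac34\, \big( \alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr} \big)\; g^{(\xi\lambda)(au),(cr)}\; g_{(\xi\lambda)(\eta\mu)}\; g^{(\eta\mu)(bs),(dv)}\; \big( \alpha_{bs} \beta_{dv} - \beta_{bs} \alpha_{dv} \big).
\]
By Lemma 10:
\[
\begin{aligned}
\sum_{a,u,c,r} \big( \alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr} \big)\, g^{(\xi\lambda)(au),(cr)}
&= \sum_{a,u,c,r} \big( \alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr} \big)\, \partial_r K^{\xi a}\, (K^{\xi c} - K^{ac})\, \delta^{\lambda u} \\
&= \sum_{a,u} \Big\{ \alpha_{au} \Big[ \sum_r \partial_r K^{\xi a} \Big( \sum_c K^{\xi c} \beta_{cr} \Big) \Big] - \alpha_{au} \Big[ \sum_r \partial_r K^{\xi a} \Big( \sum_c K^{ac} \beta_{cr} \Big) \Big] \\
&\qquad\quad - \beta_{au} \Big[ \sum_r \partial_r K^{\xi a} \Big( \sum_c K^{\xi c} \alpha_{cr} \Big) \Big] + \beta_{au} \Big[ \sum_r \partial_r K^{\xi a} \Big( \sum_c K^{ac} \alpha_{cr} \Big) \Big] \Big\}\, \delta^{\lambda u} \\
&= \sum_{a,u} \Big\{ \alpha_{au} \big[ \langle \nabla K^{\xi a}, \beta^{\mathrm{hor}}(q^\xi) \rangle - \langle \nabla K^{\xi a}, \beta^{\mathrm{hor}}(q^a) \rangle \big] - \beta_{au} \big[ \langle \nabla K^{\xi a}, \alpha^{\mathrm{hor}}(q^\xi) \rangle - \langle \nabla K^{\xi a}, \alpha^{\mathrm{hor}}(q^a) \rangle \big] \Big\}\, \delta^{\lambda u} \\
&= \sum_{a,u} \Big\{ \alpha_{au}\, \langle \nabla K^{\xi a}, S^{\xi a}(\beta) \rangle - \beta_{au}\, \langle \nabla K^{\xi a}, S^{\xi a}(\alpha) \rangle \Big\}\, \delta^{\lambda u} \\
&= \sum_{a,u} \Big\{ C^{\xi a}(\beta)\, \alpha_{au} - C^{\xi a}(\alpha)\, \beta_{au} \Big\}\, \delta^{\lambda u}.
\end{aligned}
\]
So if we define the matrix $H^{ia} := \sum_b \big[ C^{ab}(\beta)\, \alpha_{bi} - C^{ab}(\alpha)\, \beta_{bi} \big]$, $i = 1, \ldots, D$, $a = 1, \ldots, N$, we have:
\[
R_4 = -\frac34 \sum_{u,s} \sum_{\xi,\eta} \sum_{\lambda,\mu} H^{u\xi}\, \delta^{\lambda u}\, (K^{-1})_{\xi\eta}\, \delta_{\lambda\mu}\, H^{s\eta}\, \delta^{\mu s} = -\frac34 \sum_{u=1}^D \sum_{\xi,\eta=1}^N H^{u\xi} H^{u\eta}\, (K^{-1})_{\xi\eta} = -\frac34\, \|H\|^2_{K^{-1}}.
\]
Alternatively, this can be derived from formula (41).

The denominator (34) of sectional curvature for $\mathcal{L}^N(\mathbb{R}^D)$ is given by the simple formula:

Proposition 11.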
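For any pair of cotangent vectors $\alpha, \beta \in T^*_q\mathcal{L}$,
\[
\|\alpha\|^2_{T^*\mathcal{L}}\, \|\beta\|^2_{T^*\mathcal{L}} - \langle \alpha, \beta \rangle^2_{T^*\mathcal{L}} = \sum_{a,b,c,d} K^{ab} K^{cd} \big( \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \beta_d \rangle - \langle \alpha_a, \beta_b \rangle \langle \alpha_c, \beta_d \rangle \big). \tag{51}
\]

Proof.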
Using double-index notation we may write equation (34) as follows:
\[
\begin{aligned}
\|\alpha\|^2_{T^*\mathcal{L}}\, \|\beta\|^2_{T^*\mathcal{L}} - \langle \alpha, \beta \rangle^2_{T^*\mathcal{L}}
&= \alpha_{au}\, \alpha_{bs}\, \beta_{cr}\, \beta_{dv} \big( g^{(au)(bs)} g^{(cr)(dv)} - g^{(au)(dv)} g^{(bs)(cr)} \big) \\
&= \sum_{a,b,c,d} \alpha_{au}\, \alpha_{bs}\, \beta_{cr}\, \beta_{dv} \big( K^{ab} \delta^{us}\, K^{cd} \delta^{rv} - K^{ad} \delta^{uv}\, K^{bc} \delta^{sr} \big) \\
&= \sum_{a,b,c,d} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \beta_d \rangle\, K^{ab} K^{cd} - \sum_{a,b,c,d} \langle \alpha_a, \beta_d \rangle \langle \alpha_b, \beta_c \rangle\, K^{ad} K^{bc},
\end{aligned}
\]
and (51) follows by relabeling the indices.

Finally, suppose the Green's function $K$ is rotationally invariant, i.e. that (K4) holds: $K(x) = \gamma(\|x\|)$, $x \in \mathbb{R}^D$, with $\gamma \in C^2\big([0, \infty)\big)$. We will use the convenient notation: $\gamma_0 := \gamma(0)$, $\gamma_{ab} := \gamma(\|q^a - q^b\|)$, $\gamma'_{ab} := \gamma'(\|q^a - q^b\|)$, and $\gamma''_{ab} := \gamma''(\|q^a - q^b\|)$ for $a, b = 1, \ldots, N$. Then we can evaluate the first and second derivatives of $K$:

Lemma 12.
For rotationally invariant kernels, it is the case that:
\[
\nabla K(x) = \gamma'(\|x\|)\, \frac{x}{\|x\|}, \tag{52}
\]
\[
\begin{aligned}
D^2 K(x) &= \Big[ \gamma''(\|x\|) - \frac{\gamma'(\|x\|)}{\|x\|} \Big] \frac{x x^T}{\|x\|^2} + \frac{\gamma'(\|x\|)}{\|x\|}\, I_D \\
&= \gamma''(\|x\|)\, \frac{x x^T}{\|x\|^2} + \frac{\gamma'(\|x\|)}{\|x\|}\, \mathrm{Pr}^\perp(x),
\end{aligned}
\tag{53}
\]
where $I_D$ is the $D \times D$ identity matrix and $\mathrm{Pr}^\perp(x) := I_D - \frac{x x^T}{\|x\|^2}$ is the projection onto the hyperplane of $\mathbb{R}^D$ that is normal to $x$.

Proof. We have that $\partial_i K(x) = \gamma'(\|x\|)\, \frac{x^i}{\|x\|}$ and (52) follows immediately. Also,
\[
\begin{aligned}
\partial_j \partial_i K(x) &= \frac{x^i}{\|x\|}\, \frac{\partial}{\partial x^j} \gamma'(\|x\|) + \frac{\gamma'(\|x\|)}{\|x\|}\, \frac{\partial}{\partial x^j} x^i + \gamma'(\|x\|)\, x^i\, \frac{\partial}{\partial x^j} \frac{1}{\|x\|} \\
&= \gamma''(\|x\|)\, \frac{x^i x^j}{\|x\|^2} + \frac{\gamma'(\|x\|)}{\|x\|}\, \delta_{ij} - \gamma'(\|x\|)\, \frac{x^i x^j}{\|x\|^3}
= \Big[ \gamma''(\|x\|) - \frac{\gamma'(\|x\|)}{\|x\|} \Big] \frac{x^i x^j}{\|x\|^2} + \frac{\gamma'(\|x\|)}{\|x\|}\, \delta_{ij},
\end{aligned}
\]
which implies (53).

Because of (52), in the rotationally invariant case, the "scalar compression" $C^{ab}(\alpha)$ really does measure a multiple of the compression of the flow $\alpha^\sharp$ between $q^a$ and $q^b$. We can decompose the vector strain $S^{ab}(\alpha)$ into the part parallel to the vector $q^a - q^b$ and the part perpendicular to it: let $u^{ab} := \frac{q^a - q^b}{\|q^a - q^b\|}$ and define
\[
S^{ab}(\alpha)^\parallel := \big\langle S^{ab}(\alpha), u^{ab} \big\rangle, \qquad S^{ab}(\alpha)^\perp := S^{ab}(\alpha) - S^{ab}(\alpha)^\parallel\, u^{ab}. \tag{54}
\]
Note that $S^{ab}(\alpha)^\parallel$ is a scalar while $S^{ab}(\alpha)^\perp$ is a vector. In particular it is the case that $C^{ab}(\alpha) = \gamma'_{ab} \cdot S^{ab}(\alpha)^\parallel$. Moreover, formula (53) allows us to simplify the first term $R_1$ in the curvature formula. Substituting (53) into (42), we get the rotationally invariant case for $R_1$:

Proposition 13.
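In the rotationally invariant case (K4) we have that
\[
\begin{aligned}
R_1 = \frac12 \sum_{a \neq b} \Big( & \gamma''_{ab}\, \big\langle S^{ab}(\alpha)^\parallel \beta_a - S^{ab}(\beta)^\parallel \alpha_a,\; S^{ab}(\alpha)^\parallel \beta_b - S^{ab}(\beta)^\parallel \alpha_b \big\rangle \\
& + \frac{\gamma'_{ab}}{\|q^a - q^b\|}\, \big\langle S^{ab}(\alpha)^\perp \otimes \beta_a - S^{ab}(\beta)^\perp \otimes \alpha_a,\; S^{ab}(\alpha)^\perp \otimes \beta_b - S^{ab}(\beta)^\perp \otimes \alpha_b \big\rangle \Big).
\end{aligned}
\tag{55}
\]
In the above we use the inner product of tensor products, $\langle v_1 \otimes w_1, v_2 \otimes w_2 \rangle := \langle v_1, v_2 \rangle \langle w_1, w_2 \rangle$.

Proof.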
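For any pair of covectors $\eta$ and $\mu$ in $T^*_q\mathcal{L}$, by (53) we have that:
\[
\begin{aligned}
S^{ab}(\eta)^T\, D^2 K^{ab}\, S^{ab}(\mu) &= \gamma''_{ab}\, S^{ab}(\eta)^T u^{ab} (u^{ab})^T S^{ab}(\mu) + \frac{\gamma'_{ab}}{\|q^a - q^b\|}\, S^{ab}(\eta)^T\, \mathrm{Pr}^\perp(u^{ab})\, S^{ab}(\mu) \\
&= \gamma''_{ab}\, S^{ab}(\eta)^\parallel\, S^{ab}(\mu)^\parallel + \frac{\gamma'_{ab}}{\|q^a - q^b\|}\, \big\langle S^{ab}(\eta)^\perp, S^{ab}(\mu)^\perp \big\rangle.
\end{aligned}
\]
Inserting this expression into (42) yields the desired result.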
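The Hessian formula (53) and the decomposition used in the proof above lend themselves to a quick numerical check. The sketch below is illustrative only: it assumes the Gaussian profile $\gamma(r) = \exp(-r^2 / 2\sigma^2)$ with a hypothetical value of $\sigma$, and all function names are ours. It compares (53) with a finite-difference Hessian and then verifies $s^T D^2K(x)\, t = \gamma''\, s^\parallel t^\parallel + \frac{\gamma'}{\|x\|} \langle s^\perp, t^\perp \rangle$ for two arbitrary vectors $s$, $t$ standing in for the strains.

```python
import numpy as np

SIGMA = 1.3  # hypothetical kernel width, chosen only for this illustration

def gamma(r):
    """Assumed radial profile: gamma(r) = exp(-r^2 / (2 sigma^2))."""
    return np.exp(-r**2 / (2.0 * SIGMA**2))

def dgamma(r):
    """First derivative gamma'(r)."""
    return -r / SIGMA**2 * gamma(r)

def d2gamma(r):
    """Second derivative gamma''(r)."""
    return (r**2 / SIGMA**4 - 1.0 / SIGMA**2) * gamma(r)

def K(x):
    """Rotationally invariant kernel K(x) = gamma(|x|), as in (K4)."""
    return gamma(np.linalg.norm(x))

def hess_K(x):
    """Hessian of K from (53): gamma'' u u^T + (gamma'/|x|) Pr_perp, with u = x/|x|."""
    r = np.linalg.norm(x)
    u = x / r
    proj_perp = np.eye(x.size) - np.outer(u, u)
    return d2gamma(r) * np.outer(u, u) + dgamma(r) / r * proj_perp

def fd_hessian(f, x, eps=1e-4):
    """Central finite-difference approximation of the Hessian of f at x."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = eps
            ej[j] = eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * eps**2)
    return H

x = np.array([0.7, -0.4, 1.1])     # plays the role of q^a - q^b
s = np.array([0.3, 1.2, -0.5])     # stands in for S^{ab}(eta)
t = np.array([-0.8, 0.1, 0.9])     # stands in for S^{ab}(mu)

hess_err = np.max(np.abs(hess_K(x) - fd_hessian(K, x)))

r = np.linalg.norm(x)
u = x / r
s_par, t_par = s @ u, t @ u
s_perp, t_perp = s - s_par * u, t - t_par * u
lhs = s @ hess_K(x) @ t
rhs = d2gamma(r) * s_par * t_par + dgamma(r) / r * (s_perp @ t_perp)
print(hess_err, abs(lhs - rhs))
```

Only the radial profile and its two derivatives are specific to the Gaussian; replacing `gamma`, `dgamma` and `d2gamma` consistently checks any other twice-differentiable profile.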
A simple special case is when only one landmark carries momentum. We now compute the numerator of sectional curvature when both cotangent vectors are nonzero at only one of the $D$-dimensional landmarks $(q^1, \ldots, q^N)$. We define:
\[
(T^*_q\mathcal{L})_1 := \big\{ \eta \in T^*_q\mathcal{L} \;\big|\; \eta_a = 0 \text{ for } a > 1 \big\}
\]
so that the elements of the above set are cotangent vectors of the type $\eta = (\eta_1, 0, \ldots, 0)$.

Figure 3: Dragging effect of one momentum-carrying landmark $q^1$ (bullet •) on a grid of landmarks (circles ◦), with γ(x) = exp( − x σ ), σ = 1.5. Left: initial configuration, with initial momentum $p_1$ = (2 . , . 8) also shown. Right: configuration after one unit of time, with trajectory of $q^1$ also shown; the grid represents the diffeomorphism $\varphi^v$, obtained by integrating $\alpha^{\mathrm{hor}}$ in time.

Proposition 14. In $\mathcal{L}^N(\mathbb{R}^D)$, for any pair $\alpha, \beta \in (T^*_q\mathcal{L})_1$ the four terms of $R(\alpha^\sharp, \beta^\sharp, \beta^\sharp, \alpha^\sharp)$ are given by $R_1 = R_2 = R_3 = 0$ and
\[
R_4 = -\frac34 \sum_{a,b=2}^N \langle H^a, H^b \rangle_{\mathbb{R}^D}\, (K^{-1})_{ab}, \qquad \text{where} \quad H^a := (\gamma_{1a} - \gamma_0) \big( \langle \alpha_1, \nabla K^{1a} \rangle\, \beta_1 - \langle \beta_1, \nabla K^{1a} \rangle\, \alpha_1 \big), \quad \text{for } a > 1.
\]

Proof.
The vanishing of $R_1$ can be checked directly (note that the sum in (42) is taken over $a \neq b$ since $S^{cc}(\eta) = 0$ for all $c$ and $\eta$). Also, using formula (39) we see that all mixed forces $F_a$ are zero, therefore $R_2 = R_3 = 0$ by formulae (43) and (44). Also, by (41), $D^a(\alpha, \beta) = (\gamma_{1a} - \gamma_0)\, \langle \alpha_1, \nabla K^{a1} \rangle\, \beta_1$ since $\alpha, \beta \in (T^*_q\mathcal{L})_1$; a similar expression holds for $D^a(\beta, \alpha)$, which concludes the proof by (45).

Therefore when $\alpha, \beta \in (T^*_q\mathcal{L})_1$ the sectional curvature is never positive; we can understand this by considering the geodesic flow in this case. It follows immediately from Proposition 6 that if we start with zero momenta $p_a$ at all $q^a$, $a > 1$, then the momenta at these points stay zero, while the momentum at $q^1$ remains constant. Thus the velocity of $q^1$ is just given by $K(0)\, p_1$ and this is constant. The point $q^1$ carrying the momentum moves in a straight line at constant speed, while the other points $q^a$ ($a > 1$) are carried along by the global flow that the motion of $q^1$ causes and move at velocities $\dot q^a = K^{a1} p_1$, which are parallel to $\dot q^1$ (but not constant). As shown in Figure 3 (the central landmark $q^1$ is the only one carrying momentum) what happens is that all other landmark points are dragged along by $q^1$, more strongly when close, less when far away. Points directly in front of the path of $q^1$ pile up and points behind space out.

Negative curvature can be seen by the divergence of geodesics. If you imagine slightly changing the direction of $p_1$ in Figure 3, the final configuration of the landmark points (say, after one unit of time) will differ greatly from the one caused by the original value of $p_1$. Also, if you imagine $q^1$ [...] $\mathcal{L} = \mathcal{L}^2(\mathbb{R}^D)$ (two landmarks only). We shall write:
\[
\alpha_1^\parallel := \langle \alpha_1, u^{12} \rangle, \quad \alpha_1^\perp := \alpha_1 - \alpha_1^\parallel\, u^{12}, \qquad \beta_1^\parallel := \langle \beta_1, u^{12} \rangle, \quad \beta_1^\perp := \beta_1 - \beta_1^\parallel\, u^{12}. \tag{56}
\]

Proposition 15.
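In the case of $\mathcal{L} = \mathcal{L}^2(\mathbb{R}^D)$, when $\alpha, \beta \in (T^*_q\mathcal{L})_1$ the numerator and denominator of sectional curvature are given by, respectively:
\[
R(\alpha^\sharp, \beta^\sharp, \beta^\sharp, \alpha^\sharp) = R_4 = -\frac34\, \gamma_0\, \frac{\gamma_0 - \gamma_{12}}{\gamma_0 + \gamma_{12}}\, \big( \gamma'_{12} \big)^2\, \big\| \beta_1^\parallel \alpha_1^\perp - \alpha_1^\parallel \beta_1^\perp \big\|^2, \tag{57}
\]
\[
\|\alpha\|^2_{T^*\mathcal{L}}\, \|\beta\|^2_{T^*\mathcal{L}} - \langle \alpha, \beta \rangle^2_{T^*\mathcal{L}} = \gamma_0^2 \Big( \big\| \beta_1^\parallel \alpha_1^\perp - \alpha_1^\parallel \beta_1^\perp \big\|^2 + \frac12\, \big\| \beta_1^\perp \otimes \alpha_1^\perp - \alpha_1^\perp \otimes \beta_1^\perp \big\|^2 \Big). \tag{58}
\]
Again we have used the inner product of tensor products $\langle v_1 \otimes w_1, v_2 \otimes w_2 \rangle := \langle v_1, v_2 \rangle \langle w_1, w_2 \rangle$.

Proof.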
It is the case that R = − k H k ( K − ) (matrices H a were defined in Proposition 14).But ( K − ) = ( γ − γ ) − γ , whereas from Proposition 14 we have k H k = ( γ − γ ) (cid:0) h α , ∇ K i k β k + h β , ∇ K i k α k − h α , ∇ K ih β , ∇ K ih α , β i (cid:1) , where ∇ K = γ u by (52). Inserting expressions (56) into the above formula yields (57).From Proposition 11 we have that the denominator is given by γ ( k α k k β k − h α , β i ); again,inserting (56) into such formula yields (58).We will generalize the above results in the next section. The complexity of the formula for curvature reflects a real complexity in the geometry of the land-mark space. But there is one case in which the geometry such space can be analyzed quite completely.This is when there are only two nonzero momenta along a geodesic. To put this in context, we firstintroduce a basic structural relation between landmark spaces.
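The behavior invoked above (momenta that start at zero stay zero, while the corresponding points are dragged along by the flow) can be explored numerically. The following is a minimal sketch, not the authors' code: it assumes the Hamiltonian $H(p,q)=\frac12\sum_{a,b}\gamma(\|q^a-q^b\|)\langle p_a,p_b\rangle$, a Gaussian profile $\gamma(r)=\exp(-r^2/2)$ chosen only for illustration, and a classical fourth-order Runge-Kutta integrator. All function names (`gamma`, `rhs`, `geodesic`, ...) and the initial data are hypothetical.

```python
import math

def gamma(r):
    # Assumed kernel profile gamma(r) = exp(-r^2 / 2); any smooth radial profile could be used.
    return math.exp(-0.5 * r * r)

def dgamma(r):
    # Derivative gamma'(r) of the profile above.
    return -r * math.exp(-0.5 * r * r)

def dot(v, w):
    return sum(x * y for x, y in zip(v, w))

def hamiltonian(q, p):
    # H(p, q) = 1/2 * sum_{a,b} gamma(|q^a - q^b|) <p_a, p_b>
    return 0.5 * sum(gamma(math.dist(qa, qb)) * dot(pa, pb)
                     for qa, pa in zip(q, p) for qb, pb in zip(q, p))

def rhs(q, p):
    # dq^a/dt =  sum_b gamma(|q^a - q^b|) p_b
    # dp_a/dt = -sum_{b != a} gamma'(|q^a - q^b|) <p_a, p_b> (q^a - q^b) / |q^a - q^b|
    # (landmarks are assumed to be pairwise distinct)
    n, d = len(q), len(q[0])
    dq = [[0.0] * d for _ in range(n)]
    dp = [[0.0] * d for _ in range(n)]
    for a in range(n):
        for b in range(n):
            r = math.dist(q[a], q[b])
            for i in range(d):
                dq[a][i] += gamma(r) * p[b][i]
            if a != b:
                c = -dgamma(r) * dot(p[a], p[b]) / r
                for i in range(d):
                    dp[a][i] += c * (q[a][i] - q[b][i])
    return dq, dp

def axpy(x, k, h):
    # Returns x + h * k for nested lists of equal shape.
    return [[xi + h * ki for xi, ki in zip(xr, kr)] for xr, kr in zip(x, k)]

def rk4_step(q, p, h):
    k1 = rhs(q, p)
    k2 = rhs(axpy(q, k1[0], h / 2), axpy(p, k1[1], h / 2))
    k3 = rhs(axpy(q, k2[0], h / 2), axpy(p, k2[1], h / 2))
    k4 = rhs(axpy(q, k3[0], h), axpy(p, k3[1], h))
    out = []
    for x, j in ((q, 0), (p, 1)):
        out.append([[xi + h / 6 * (a + 2 * b + 2 * c + e)
                     for xi, a, b, c, e in zip(xr, ar, br, cr, er)]
                    for xr, ar, br, cr, er in zip(x, k1[j], k2[j], k3[j], k4[j])])
    return out[0], out[1]

def geodesic(q, p, t_end=1.0, steps=400):
    # Integrates the cogeodesic equations from t = 0 to t = t_end.
    h = t_end / steps
    for _ in range(steps):
        q, p = rk4_step(q, p, h)
    return q, p

# Three landmarks in the plane; only the first two carry momentum.
q0 = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0]]
p0 = [[0.5, 1.0], [1.0, -0.5], [0.0, 0.0]]
q1, p1 = geodesic(q0, p0)
print("energy drift:", abs(hamiltonian(q1, p1) - hamiltonian(q0, p0)))
print("momentum at third landmark:", p1[2])
print("displacement of third landmark:", [x - y for x, y in zip(q1[2], q0[2])])
```

In this sketch the third landmark moves although its momentum remains zero, which is the "dragging" effect described in the text; the energy drift gives a rough check on the accuracy of the integration.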
Instead of labeling the landmarks as $1, 2, \cdots, N$, one can use any finite index set $\mathcal{A}$ and label the landmarks as $q^a$ with $a\in\mathcal{A}$. And instead of calling the landmark space $\mathcal{L}^N$, we can call it $\mathcal{L}^{\mathcal{A}}$. Now suppose we have a subset $\mathcal{B}\subset\mathcal{A}$. Then there is a natural projection $\pi:\mathcal{L}^{\mathcal{A}}\to\mathcal{L}^{\mathcal{B}}$ obtained by forgetting the points with labels in $\mathcal{A}-\mathcal{B}$. In the metrics we have been discussing this is a submersion. In fact, the kernel of $d\pi$, the vertical subspace of $T\mathcal{L}^{\mathcal{A}}$, is the space of vectors $v^a$ such that $v^a=0$ if $a\in\mathcal{B}$. Its perpendicular in $T^*$ is:
\[ (T^*\mathcal{L}^{\mathcal{A}})_{\mathcal{B}} := \big\{\,p\in T^*\mathcal{L}^{\mathcal{A}} \;\big|\; p_a=0 \text{ for } a\in\mathcal{A}-\mathcal{B}\,\big\}. \]
So the orthogonal complement of $\ker(d\pi)$ in $T\mathcal{L}^{\mathcal{A}}$ is the space of vectors $p^\sharp$ where $p$ is in $(T^*\mathcal{L}^{\mathcal{A}})_{\mathcal{B}}$. On this subspace, the squared norm is just
\[ \sum_{b,b'\in\mathcal{B}} K(q^b-q^{b'})\,\langle p_b,p_{b'}\rangle \]
whether $p^\sharp$ is taken to be a tangent vector to $\mathcal{L}^{\mathcal{A}}$ or to $\mathcal{L}^{\mathcal{B}}$. In other words, the horizontal subspace for the submersion $\pi$ is the subbundle $(T^*\mathcal{L}^{\mathcal{A}})^\sharp_{\mathcal{B}}\subset T\mathcal{L}^{\mathcal{A}}$ of tangent vectors $p^\sharp$ where $p$ has zero components in $\mathcal{A}-\mathcal{B}$, and this has the same metric as the tangent space to $\mathcal{L}^{\mathcal{B}}$. In particular, from the general theory of submersions, we know that every geodesic in $\mathcal{L}^{\mathcal{B}}$ beginning at some point $\pi(\{q^a\})$ has a unique lift to a horizontal geodesic in $\mathcal{L}^{\mathcal{A}}$ starting at $\{q^a\}$. The picture to have is that all the landmark spaces form a sort of inverse system of spaces whose inverse limit is the group of diffeomorphisms of $\mathbb{R}^D$.

We do not want to pursue this in general, but rather we will study the special case where the cardinality of $\mathcal{B}$ is two. We might as well, then, go back to our former terminology and consider the map $\pi:\mathcal{L}^N\to\mathcal{L}^2$ obtained by mapping an $N$-tuple $(q^1,q^2,\cdots,q^N)$ to the pair $(q^1,q^2)$. Moreover, we want to consider only the case in which the kernel $K$ is rotationally invariant as in (K4). A basic quantity in all that follows is the distance $\rho:=\|q^1-q^2\|$ between the two momentum-bearing points. Remarkably, we can describe, more or less explicitly, all the geodesics which arise as horizontal lifts from this map. These are the geodesics with nonzero momenta only at $q^1$ and $q^2$. Moreover, the formula for sectional curvature for the 2-plane spanned by any two horizontal vectors can be analyzed.
This analysis was started in the PhD thesis of the first author [23] and has been pursued further in [22].

The metric tensor of $\mathcal{L}^2=\mathcal{L}^2(\mathbb{R}^D)$ in coordinates is obtained by inverting the $2\times 2$ matrix $K$:
\[ K=\begin{bmatrix}\gamma(0)&\gamma(\rho)\\ \gamma(\rho)&\gamma(0)\end{bmatrix}\ \Longrightarrow\ \begin{cases}(K^{-1})_{11}=(K^{-1})_{22}=\big(\gamma(0)^2-\gamma(\rho)^2\big)^{-1}\gamma(0),\\[2pt] (K^{-1})_{12}=(K^{-1})_{21}=-\big(\gamma(0)^2-\gamma(\rho)^2\big)^{-1}\gamma(\rho),\end{cases} \tag{59} \]
so that the cometric and metric, for all covectors $\alpha,\beta\in T^*_q\mathcal{L}^2$ and vectors $v,w\in T_q\mathcal{L}^2$, are simply:
\[ g^{-1}(\alpha,\beta)=\gamma(0)\big(\langle\alpha_1,\beta_1\rangle+\langle\alpha_2,\beta_2\rangle\big)+\gamma(\rho)\big(\langle\alpha_1,\beta_2\rangle+\langle\alpha_2,\beta_1\rangle\big), \tag{60} \]
\[ g(v,w)=\frac{1}{\gamma(0)^2-\gamma(\rho)^2}\Big[\gamma(0)\big(\langle v^1,w^1\rangle+\langle v^2,w^2\rangle\big)-\gamma(\rho)\big(\langle v^1,w^2\rangle+\langle v^2,w^1\rangle\big)\Big]. \]
The geometry of the two-point space is best understood by changing variables for the landmark coordinates $(q^1,q^2)$ and the momentum $(p_1,p_2)$ to their means and semi-differences, that is:
\[ \overline{q}:=\frac{q^1+q^2}{2},\quad \delta q:=\frac{q^1-q^2}{2},\quad \overline{p}:=\frac{p_1+p_2}{2},\quad \delta p:=\frac{p_1-p_2}{2}, \]
so that:
\[ q^1=\overline{q}+\delta q,\quad q^2=\overline{q}-\delta q,\quad p_1=\overline{p}+\delta p,\quad p_2=\overline{p}-\delta p. \]
Then the cometric (60) becomes:
\[ g^{-1}\big((\overline{\alpha},\delta\alpha),(\overline{\beta},\delta\beta)\big)=2\big(\gamma(0)+\gamma(\rho)\big)\langle\overline{\alpha},\overline{\beta}\rangle+2\big(\gamma(0)-\gamma(\rho)\big)\langle\delta\alpha,\delta\beta\rangle. \tag{61} \]
With these coordinates, the two-point landmark space becomes a product $\overline{V}\times V_\delta$ in which all fibers $\overline{V}\times\{\delta q\}$ are flat Euclidean spaces though with variable scales, all fibers $\{\overline{q}\}\times V_\delta$ are conformally flat metrics sitting on the manifold $\mathbb{R}^D-\{0\}$, and the tangent spaces of the two factors are orthogonal.

Proposition 16. In terms of means and semi-differences, the geodesic equations for $\mathcal{L}^2(\mathbb{R}^D)$ are:
\[ \dot{\overline{q}}=\big(\gamma(0)+\gamma(\rho)\big)\,\overline{p},\qquad \dot{\overline{p}}=0,\qquad \dot{\delta q}=\big(\gamma(0)-\gamma(\rho)\big)\,\delta p,\qquad \dot{\delta p}=-\frac{2\gamma'(\rho)}{\rho}\big(\|\overline{p}\|^2-\|\delta p\|^2\big)\,\delta q. \tag{62} \]
The above result is proven by direct computation. We can solve these equations in four steps.

First, the linear momentum $\overline{p}$ is a constant, so the "center of mass" $\overline{q}$ moves in a straight line parallel to this constant:
\[ \overline{q}(t)=\overline{q}(0)+\Big(\int_0^t\big(\gamma(0)+\gamma(\rho(\tau))\big)\,d\tau\Big)\,\overline{p}. \tag{63} \]

Secondly, if we treat vectors $\delta q$ and $\delta p$ as 1-forms in $\mathbb{R}^D$, equations (62) also show that:
\[ (\delta q\wedge\delta p)^{\boldsymbol{\cdot}}=\dot{\delta q}\wedge\delta p+\delta q\wedge\dot{\delta p}=[(\text{scalar})\,\delta p]\wedge\delta p+\delta q\wedge[(\text{scalar})\,\delta q]=0, \]
so the angular momentum $\delta q\wedge\delta p\in\bigwedge^2\mathbb{R}^D$ is constant; we write this as $\omega\,e_1\wedge e_2$ where $\omega$ is the nonnegative real magnitude of the angular momentum and $(e_1,e_2)$ is an orthonormal pair. Then it follows that:
\[ \delta q(t)=\frac{\rho(t)}{2}\big[\cos\big(\theta(t)\big)\,e_1+\sin\big(\theta(t)\big)\,e_2\big],\quad\text{for some function }\theta(t). \]

Thirdly, we can express $\theta(t)$ as an integral:
\[ \dot{\delta q}=\frac{\dot\rho}{2}\big[\cos(\theta)\,e_1+\sin(\theta)\,e_2\big]+\frac{\rho\,\dot\theta}{2}\big[-\sin(\theta)\,e_1+\cos(\theta)\,e_2\big],\quad\text{so} \]
\[ \dot{\delta q}\wedge\delta q=-\frac{\rho^2\dot\theta}{4}\,e_1\wedge e_2, \]
as well as (from (62)):
\[ \dot{\delta q}\wedge\delta q=\big(\gamma(0)-\gamma(\rho)\big)\,\delta p\wedge\delta q=-\omega\big(\gamma(0)-\gamma(\rho)\big)\,e_1\wedge e_2; \]
combining the second and third lines, we find:
\[ \theta(t)=\theta(0)+4\omega\int_0^t\frac{\gamma(0)-\gamma(\rho(\tau))}{\rho(\tau)^2}\,d\tau; \tag{64} \]
note that by (K1) and (K2) it is the case that $\gamma(0)\ge\gamma(\rho)$ for any $\rho\ge 0$, so $\theta$ is a monotone increasing function if $\omega\neq 0$, otherwise it is a constant.

The last step is to solve for $\rho(t)$. This can be done using conservation of energy [16, p. 51]. Equations (62) are in fact the cogeodesic equations for the Hamiltonian $H(p,q)$ of section 4.1, which we may rewrite in terms of means and semi-differences as
\[ H=\big(\gamma(0)+\gamma(\rho)\big)\|\overline{p}\|^2+\big(\gamma(0)-\gamma(\rho)\big)\|\delta p\|^2 \]
by (61); hence this function of $\rho$ and $\|\delta p\|$ is a constant ($\overline{p}$ is also a constant). Then we calculate:
\[ (\rho^2)^{\boldsymbol{\cdot}}=4\langle\delta q,\delta q\rangle^{\boldsymbol{\cdot}}=8\langle\dot{\delta q},\delta q\rangle=8\big(\gamma(0)-\gamma(\rho)\big)\langle\delta p,\delta q\rangle\ \Longrightarrow\ \dot\rho=4\,\frac{\gamma(0)-\gamma(\rho)}{\rho}\,\langle\delta p,\delta q\rangle. \]
But:
\[ \langle\delta p,\delta q\rangle^2+\omega^2=\langle\delta p,\delta q\rangle^2+\|\delta p\wedge\delta q\|^2=\|\delta p\|^2\cdot\|\delta q\|^2=\frac{\rho^2}{4}\left(\frac{H-(\gamma(0)+\gamma(\rho))\|\overline{p}\|^2}{\gamma(0)-\gamma(\rho)}\right), \]
\[ \Longrightarrow\ \dot\rho=\pm\frac{2\sqrt{\gamma(0)-\gamma(\rho)}}{\rho}\sqrt{\rho^2\big[H-\big(\gamma(0)+\gamma(\rho)\big)\|\overline{p}\|^2\big]-4\omega^2\big(\gamma(0)-\gamma(\rho)\big)}, \]
where the sign is that of $\langle\delta p,\delta q\rangle$.

Figure 4: Converging and diverging trajectories for two landmarks in two dimensions.

Hence, on any time interval on which $\rho$ is increasing (with the opposite sign when it is decreasing), $\rho(t)$ is the solution of:
\[ t=\int_{\rho(0)}^{\rho(t)}\frac{x\,dx}{2\sqrt{F(x)}},\quad\text{where: } F(x):=H\,x^2\big(\gamma(0)-\gamma(x)\big)-\|\overline{p}\|^2x^2\big(\gamma(0)^2-\gamma(x)^2\big)-4\omega^2\big(\gamma(0)-\gamma(x)\big)^2. \tag{65} \]

Summary.
If we fix constants $H$, $\overline{p}$, $\omega$, $\rho(0)$, $\theta(0)$, $q^a(0)$ (for all $a$), we can first integrate (65) to get $\rho(t)$ (the separation of $q^1$ and $q^2$), then integrate (64) to find their relative angle $\theta(t)$, then integrate (63) to get their center of mass $\overline{q}(t)$. This gives the trajectories of $q^1$ and $q^2$. The remaining points are dragged along as solutions of:
\[ \frac{d}{dt}q^a(t)=\gamma\big(\|q^a(t)-q^1(t)\|\big)\,p_1(t)+\gamma\big(\|q^a(t)-q^2(t)\|\big)\,p_2(t). \]
As worked out in [22], one can classify the global behavior of these geodesics into two types. One is the scattering type, in which $q^1$, $q^2$ diverge from each other as time goes to either $\pm\infty$. This occurs if the linear or angular momentum is large enough compared to the energy. In the other case, where the energy is large enough compared to both momenta, they come together asymptotically at either $t=+\infty$ or $-\infty$, diverging at the other limit. In both cases, they may spiral around each other an arbitrarily large number of times (see Figure 4).

Next we consider $\mathcal{L}^N(\mathbb{R}^D)$: we want to compute the sectional curvature $R(\alpha^\sharp,\beta^\sharp,\beta^\sharp,\alpha^\sharp)$ for cotangent vectors that are nonzero at only $(q^1,q^2)$. Also, we will use the notation $u:=\frac{q^1-q^2}{\|q^1-q^2\|}$ for the unit vector from $q^2$ to $q^1$ as well as $\rho=\|q^1-q^2\|$ for their distance. Similarly to (54), we will also want to decompose any vector $\eta\in\mathbb{R}^D$ into its parts tangent to $u$ and perpendicular to $u$:
\[ \eta^\parallel:=\langle\eta,u\rangle,\quad\text{and}\quad \eta^\perp:=\eta-\eta^\parallel u. \]
Note that $\eta^\parallel$ is a scalar whereas $\eta^\perp$ is a vector. Following the notation used to describe geodesics above, for any $\alpha\in(T^*_q\mathcal{L})_{1,2}:=\big\{\eta\in T^*_q\mathcal{L}\;\big|\;\eta_a=0\text{ for }a>2\big\}$, we write $\overline{\alpha}=\frac12(\alpha_1+\alpha_2)$ and $\delta\alpha=\frac12(\alpha_1-\alpha_2)$.

Proposition 17.
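In $\mathcal{L}^N(\mathbb{R}^D)$, for any pair $\alpha,\beta\in(T^*_q\mathcal{L})_{1,2}$, the terms $R_1$, $R_2$ and $R_3$ in the numerator of sectional curvature can be written as
\[ \begin{aligned} R_1&=4\big(\gamma(0)-\gamma(\rho)\big)^2\gamma''(\rho)\,\big\langle\delta\alpha^\parallel\beta_1-\delta\beta^\parallel\alpha_1,\ \delta\alpha^\parallel\beta_2-\delta\beta^\parallel\alpha_2\big\rangle\\ &\quad+4\big(\gamma(0)-\gamma(\rho)\big)^2\frac{\gamma'(\rho)}{\rho}\,\big\langle\delta\alpha^\perp\otimes\beta_1-\delta\beta^\perp\otimes\alpha_1,\ \delta\alpha^\perp\otimes\beta_2-\delta\beta^\perp\otimes\alpha_2\big\rangle,\\ R_2&=-4\big(\gamma(0)-\gamma(\rho)\big)\gamma'(\rho)^2\,\big\langle\delta\alpha^\parallel\beta_1-\delta\beta^\parallel\alpha_1,\ \delta\alpha^\parallel\beta_2-\delta\beta^\parallel\alpha_2\big\rangle,\\ R_3&=\frac{\gamma(0)-\gamma(\rho)}{2}\,\gamma'(\rho)^2\Big[\big(\langle\alpha_1,\beta_2\rangle+\langle\beta_1,\alpha_2\rangle\big)^2-4\langle\alpha_1,\alpha_2\rangle\langle\beta_1,\beta_2\rangle\Big]. \end{aligned} \]
We need the following result.

Lemma 18. For any $\alpha\in(T^*_q\mathcal{L})_{1,2}$, the discrete strain $S^{12}(\alpha)$ is given by:
\[ S^{12}(\alpha)=2\big(\gamma(0)-\gamma(\rho)\big)\,\delta\alpha. \tag{66} \]
For any pair $\alpha,\beta\in(T^*_q\mathcal{L})_{1,2}$ it is the case that $F_a(\alpha,\beta)=0$ for $a>2$, whereas
\[ F_1(\alpha,\beta)=-F_2(\alpha,\beta)=\frac{\gamma'(\rho)}{2}\big(\langle\alpha_1,\beta_2\rangle+\langle\beta_1,\alpha_2\rangle\big)\,u. \tag{67} \]
Also,
\[ D^1(\alpha,\beta)=2\big(\gamma(0)-\gamma(\rho)\big)\gamma'(\rho)\,\delta\alpha^\parallel\,\beta_2,\qquad D^2(\alpha,\beta)=2\big(\gamma(0)-\gamma(\rho)\big)\gamma'(\rho)\,\delta\alpha^\parallel\,\beta_1. \tag{68} \]

Remark. We are not interested in $D^a(\alpha,\beta)$ for $a>2$ since $F_a(\alpha,\beta)=0$ for $a>2$.

Proof of Lemma 18. The formula for the discrete strain results from:
\[ S^{12}(\alpha)=(\alpha^\sharp)^1-(\alpha^\sharp)^2=\sum_b\big(K^{1b}-K^{2b}\big)\alpha_b=\gamma(0)\alpha_1+\gamma(\rho)\alpha_2-\gamma(\rho)\alpha_1-\gamma(0)\alpha_2=2\big(\gamma(0)-\gamma(\rho)\big)\,\delta\alpha. \]
The values for $F$ follow immediately from formula (39) and $\nabla K^{12}=\gamma'(\rho)\,u$. Note that:
\[ C^{12}(\alpha)=C^{21}(\alpha)=\big\langle S^{12}(\alpha),\nabla K^{12}\big\rangle=\big\langle 2\big(\gamma(0)-\gamma(\rho)\big)\delta\alpha,\ \gamma'(\rho)\,u\big\rangle=2\big(\gamma(0)-\gamma(\rho)\big)\gamma'(\rho)\,\delta\alpha^\parallel, \]
so $D^1(\alpha,\beta)=C^{12}(\alpha)\,\beta_2=2(\gamma(0)-\gamma(\rho))\gamma'(\rho)\,\delta\alpha^\parallel\beta_2$ and $D^2(\alpha,\beta)=C^{21}(\alpha)\,\beta_1=2(\gamma(0)-\gamma(\rho))\gamma'(\rho)\,\delta\alpha^\parallel\beta_1$.

Proof of Proposition 17. The $R_1$ expression follows by substituting the expressions in (66) into formula (55), noting that the only non-zero terms in the latter are for $(a,b)=(1,2)$ and $(a,b)=(2,1)$. Since $F_1=-F_2$ from Lemma 18, $R_2$ is given by
\[ \begin{aligned} R_2&=\big\langle D^1(\alpha,\alpha)-D^2(\alpha,\alpha),\,F_1(\beta,\beta)\big\rangle+\big\langle D^1(\beta,\beta)-D^2(\beta,\beta),\,F_1(\alpha,\alpha)\big\rangle\\ &\quad-\big\langle D^1(\alpha,\beta)-D^2(\alpha,\beta)+D^1(\beta,\alpha)-D^2(\beta,\alpha),\,F_1(\alpha,\beta)\big\rangle. \end{aligned} \tag{69} \]
Again by Lemma 18 we have that $D^1(\eta,\zeta)-D^2(\eta,\zeta)=-4\big(\gamma(0)-\gamma(\rho)\big)\gamma'(\rho)\,\delta\eta^\parallel\,\delta\zeta$ for any pair $\eta,\zeta\in(T^*_q\mathcal{L})_{1,2}$, while $F_1(\eta,\zeta)=\frac{\gamma'(\rho)}{2}\big(\langle\eta_1,\zeta_2\rangle+\langle\eta_2,\zeta_1\rangle\big)u$. Applying this to all the terms we get the expression for $R_2$ in the statement of the proposition.

As far as $R_3$ is concerned, by Theorem 9:
\[ \begin{aligned} R_3&=\gamma(0)\big[\langle F_1(\alpha,\beta),F_1(\alpha,\beta)\rangle-\langle F_1(\alpha,\alpha),F_1(\beta,\beta)\rangle\big]+\gamma(\rho)\big[\langle F_1(\alpha,\beta),F_2(\alpha,\beta)\rangle-\langle F_1(\alpha,\alpha),F_2(\beta,\beta)\rangle\big]\\ &\quad+\gamma(\rho)\big[\langle F_2(\alpha,\beta),F_1(\alpha,\beta)\rangle-\langle F_2(\alpha,\alpha),F_1(\beta,\beta)\rangle\big]+\gamma(0)\big[\langle F_2(\alpha,\beta),F_2(\alpha,\beta)\rangle-\langle F_2(\alpha,\alpha),F_2(\beta,\beta)\rangle\big]\\ &=2\big(\gamma(0)-\gamma(\rho)\big)\big[\langle F_1(\alpha,\beta),F_1(\alpha,\beta)\rangle-\langle F_1(\alpha,\alpha),F_1(\beta,\beta)\rangle\big]\\ &=\frac{\gamma(0)-\gamma(\rho)}{2}\big(\gamma'(\rho)\big)^2\Big[\big(\langle\alpha_1,\beta_2\rangle+\langle\beta_1,\alpha_2\rangle\big)^2-4\langle\alpha_1,\alpha_2\rangle\langle\beta_1,\beta_2\rangle\Big], \end{aligned} \]
where we have used the fact that $F_1=-F_2$, by equation (67). This completes the proof.

The expressions provided by Proposition 17 become much clearer if we go over to means and semi-differences, i.e. if we use the substitutions:
\[ \alpha_1=\overline{\alpha}+\delta\alpha,\quad \alpha_2=\overline{\alpha}-\delta\alpha,\quad \beta_1=\overline{\beta}+\delta\beta,\quad \beta_2=\overline{\beta}-\delta\beta. \tag{70} \]

Corollary 19. For any $\alpha,\beta\in(T^*_q\mathcal{L})_{1,2}$, with $\mathcal{L}=\mathcal{L}^N(\mathbb{R}^D)$, it is the case that:
\[ \begin{aligned} R_1&=4\big(\gamma(0)-\gamma(\rho)\big)^2\gamma''(\rho)\Big(\|\delta\beta^\parallel\overline{\alpha}-\delta\alpha^\parallel\overline{\beta}\|^2-\|\delta\beta^\parallel\delta\alpha-\delta\alpha^\parallel\delta\beta\|^2\Big)\\ &\quad+4\big(\gamma(0)-\gamma(\rho)\big)^2\frac{\gamma'(\rho)}{\rho}\Big(\|\delta\beta^\perp\otimes\overline{\alpha}-\delta\alpha^\perp\otimes\overline{\beta}\|^2-\|\delta\beta^\perp\otimes\delta\alpha-\delta\alpha^\perp\otimes\delta\beta\|^2\Big),\\ R_2&=-4\big(\gamma(0)-\gamma(\rho)\big)\gamma'(\rho)^2\Big(\|\delta\beta^\parallel\overline{\alpha}-\delta\alpha^\parallel\overline{\beta}\|^2-\|\delta\beta^\parallel\delta\alpha-\delta\alpha^\parallel\delta\beta\|^2\Big),\\ R_3&=\big(\gamma(0)-\gamma(\rho)\big)\gamma'(\rho)^2\Big(2\|\delta\beta\otimes\overline{\alpha}-\delta\alpha\otimes\overline{\beta}\|^2-\|\overline{\beta}\otimes\overline{\alpha}-\overline{\alpha}\otimes\overline{\beta}\|^2-\|\delta\beta\otimes\delta\alpha-\delta\alpha\otimes\delta\beta\|^2\Big). \end{aligned} \]

Proof. By insertion of formulae (70) it is easily seen that
\[ \begin{aligned} \big\langle\delta\alpha^\parallel\beta_1-\delta\beta^\parallel\alpha_1,\ \delta\alpha^\parallel\beta_2-\delta\beta^\parallel\alpha_2\big\rangle&=\|\delta\beta^\parallel\overline{\alpha}-\delta\alpha^\parallel\overline{\beta}\|^2-\|\delta\beta^\parallel\delta\alpha-\delta\alpha^\parallel\delta\beta\|^2,\\ \big\langle\delta\alpha^\perp\otimes\beta_1-\delta\beta^\perp\otimes\alpha_1,\ \delta\alpha^\perp\otimes\beta_2-\delta\beta^\perp\otimes\alpha_2\big\rangle&=\|\delta\beta^\perp\otimes\overline{\alpha}-\delta\alpha^\perp\otimes\overline{\beta}\|^2-\|\delta\beta^\perp\otimes\delta\alpha-\delta\alpha^\perp\otimes\delta\beta\|^2, \end{aligned} \]
so the new expressions for $R_1$ and $R_2$ follow immediately. Also:
\[ \begin{aligned} \big[\langle\alpha_1,\beta_2\rangle+\langle\beta_1,\alpha_2\rangle\big]^2-4\langle\alpha_1,\alpha_2\rangle\langle\beta_1,\beta_2\rangle&=4\big[\langle\overline{\alpha},\overline{\beta}\rangle-\langle\delta\alpha,\delta\beta\rangle\big]^2-4\big(\langle\overline{\alpha},\overline{\alpha}\rangle-\langle\delta\alpha,\delta\alpha\rangle\big)\big(\langle\overline{\beta},\overline{\beta}\rangle-\langle\delta\beta,\delta\beta\rangle\big)\\ &=-4\big[\langle\overline{\alpha},\overline{\alpha}\rangle\langle\overline{\beta},\overline{\beta}\rangle-\langle\overline{\alpha},\overline{\beta}\rangle^2\big]-4\big[\langle\delta\alpha,\delta\alpha\rangle\langle\delta\beta,\delta\beta\rangle-\langle\delta\alpha,\delta\beta\rangle^2\big]\\ &\quad+4\big[\langle\overline{\alpha},\overline{\alpha}\rangle\langle\delta\beta,\delta\beta\rangle+\langle\overline{\beta},\overline{\beta}\rangle\langle\delta\alpha,\delta\alpha\rangle-2\langle\overline{\alpha},\overline{\beta}\rangle\langle\delta\alpha,\delta\beta\rangle\big]\\ &=-2\|\overline{\beta}\otimes\overline{\alpha}-\overline{\alpha}\otimes\overline{\beta}\|^2-2\|\delta\beta\otimes\delta\alpha-\delta\alpha\otimes\delta\beta\|^2+4\|\delta\beta\otimes\overline{\alpha}-\delta\alpha\otimes\overline{\beta}\|^2. \end{aligned} \]

The fourth term $R_4$ is the only one which involves the other points $q^a$, $a>2$. But one has an inequality for this term involving the same expressions in $\alpha$ and $\beta$:

Proposition 20.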
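Any pair $\alpha,\beta\in(T^*\mathcal{L}^N)_{1,2}$ are constant 1-forms on $\mathcal{L}^N$ which are pull-backs via the submersion $\mathcal{L}^N\to\mathcal{L}^2$ of constant 1-forms on $\mathcal{L}^2$. We can therefore consider the curvature term $R_4(\mathcal{L}^N)=-\frac34\big\|[\alpha^\sharp,\beta^\sharp]_{\mathcal{L}^N}\big\|^2$ on $\mathcal{L}^N$ and the corresponding term $R_4(\mathcal{L}^2)=-\frac34\big\|[\alpha^\sharp,\beta^\sharp]_{\mathcal{L}^2}\big\|^2$ on $\mathcal{L}^2$. Then we have the inequality:
\[ R_4(\mathcal{L}^N)\le R_4(\mathcal{L}^2)=-6\,\gamma'(\rho)^2\left[\frac{\big(\gamma(0)-\gamma(\rho)\big)^2}{\gamma(0)+\gamma(\rho)}\,\|\delta\beta^\parallel\overline{\alpha}-\delta\alpha^\parallel\overline{\beta}\|^2+\big(\gamma(0)-\gamma(\rho)\big)\,\|\delta\beta^\parallel\delta\alpha-\delta\alpha^\parallel\delta\beta\|^2\right]. \]

Figure 5: Typical covectors $\alpha=(\alpha_1,\alpha_2)$ in spaces $\delta^\parallel T^*_q\mathcal{L}$, $\delta^\perp T^*_q\mathcal{L}$, and $\overline{T^*_q\mathcal{L}}$, for $\mathcal{L}=\mathcal{L}^2(\mathbb{R}^2)$.

Proof. Firstly, note that $[\alpha^\sharp,\beta^\sharp]_{\mathcal{L}^N}$ breaks into perpendicular parts: a vertical part in the kernel of $d\pi$ and a horizontal part which is simply the horizontal lift of $[\alpha^\sharp,\beta^\sharp]_{\mathcal{L}^2}$. This explains the inequality assertion in Proposition 20. To calculate $R_4(\mathcal{L}^2)$, we use the last expression in (45), i.e.
\[ \begin{aligned} R_4(\mathcal{L}^2)&=-\frac34\sum_{a,b=1}^{2}\big\langle D^a(\alpha,\beta)-D^a(\beta,\alpha),\ D^b(\alpha,\beta)-D^b(\beta,\alpha)\big\rangle\,(K^{-1})_{ab}\\ &=-3\,\frac{\gamma(0)-\gamma(\rho)}{\gamma(0)+\gamma(\rho)}\,\gamma'(\rho)^2\Big\{\gamma(0)\Big[\|\delta\alpha^\parallel\beta_1-\delta\beta^\parallel\alpha_1\|^2+\|\delta\alpha^\parallel\beta_2-\delta\beta^\parallel\alpha_2\|^2\Big]\\ &\qquad\qquad-2\gamma(\rho)\big\langle\delta\alpha^\parallel\beta_1-\delta\beta^\parallel\alpha_1,\ \delta\alpha^\parallel\beta_2-\delta\beta^\parallel\alpha_2\big\rangle\Big\}, \end{aligned} \]
where we have used (59) and (68). The final result follows after inserting (70) into the above expression and performing some algebra.

Note that all terms in Corollary 19 and Proposition 20 are very similar. In fact, they are all "components" of the norm $\|\alpha\wedge\beta\|$ of the 2-form whose sectional curvature is being computed. First note that we can decompose $T^*\mathcal{L}^2$ into the direct sum of three pieces, namely:
\[ \begin{aligned} \delta^\parallel T^*\mathcal{L}^2&:=\big\{(au,-au)\;\big|\;a\in\mathbb{R}\big\},&\dim\big(\delta^\parallel T^*\mathcal{L}^2\big)&=1,\\ \delta^\perp T^*\mathcal{L}^2&:=\big\{(p,-p)\;\big|\;p\in\mathbb{R}^D,\ p\perp u\big\},&\dim\big(\delta^\perp T^*\mathcal{L}^2\big)&=D-1,\\ \overline{T^*\mathcal{L}^2}&:=\big\{(p,p)\;\big|\;p\in\mathbb{R}^D\big\},&\dim\big(\overline{T^*\mathcal{L}^2}\big)&=D, \end{aligned} \]
where as usual $u:=\frac{q^1-q^2}{\|q^1-q^2\|}$ (see Figure 5). Note that these three subspaces are orthogonal with respect to the cometric by virtue of (60). An arbitrary covector $\alpha=(\alpha_1,\alpha_2)\in T^*_q\mathcal{L}^2$ can be uniquely decomposed into the summation $\alpha=\alpha^{(1)}+\alpha^{(2)}+\alpha^{(3)}$, with:
\[ \alpha^{(1)}:=(\delta\alpha^\parallel u,-\delta\alpha^\parallel u)\in\delta^\parallel T^*\mathcal{L}^2,\quad \alpha^{(2)}:=(\delta\alpha^\perp,-\delta\alpha^\perp)\in\delta^\perp T^*\mathcal{L}^2,\quad \alpha^{(3)}:=(\overline{\alpha},\overline{\alpha})\in\overline{T^*\mathcal{L}^2}. \tag{71} \]
So it is the case that: (i) $\alpha\in\delta^\parallel T^*\mathcal{L}^2\Leftrightarrow\delta\alpha^\perp=0$ and $\overline{\alpha}=0$; (ii) $\alpha\in\delta^\perp T^*\mathcal{L}^2\Leftrightarrow\delta\alpha^\parallel=0$ and $\overline{\alpha}=0$; (iii) $\alpha\in\overline{T^*\mathcal{L}^2}\Leftrightarrow\delta\alpha^\parallel=0$ and $\delta\alpha^\perp=0$.

Consequently the space of 2-forms $\bigwedge^2T^*\mathcal{L}^2$ decomposes into the direct sum of five pieces:
\[ \bigwedge\nolimits^{2}T^*\mathcal{L}^2=\bigoplus_{i=1}^{5}V_i,\quad\text{with:}\quad \begin{aligned} V_1&:=\delta^\parallel T^*\mathcal{L}^2\wedge\overline{T^*\mathcal{L}^2},& V_2&:=\delta^\perp T^*\mathcal{L}^2\wedge\overline{T^*\mathcal{L}^2},& V_3&:=\delta^\parallel T^*\mathcal{L}^2\wedge\delta^\perp T^*\mathcal{L}^2,\\ V_4&:=\bigwedge\nolimits^{2}\big(\delta^\perp T^*\mathcal{L}^2\big),& V_5&:=\bigwedge\nolimits^{2}\big(\overline{T^*\mathcal{L}^2}\big). \end{aligned} \]
(Since $\delta^\parallel T^*\mathcal{L}^2$ is one-dimensional it creates no 2-forms.) Once again, note that the spaces $V_1,\ldots,V_5$ are pairwise orthogonal with respect to the inner product
\[ \big\langle\alpha\wedge\beta,\ \xi\wedge\eta\big\rangle_{\bigwedge^2T^*\mathcal{L}^2}:=\langle\alpha,\xi\rangle_{T^*\mathcal{L}^2}\langle\beta,\eta\rangle_{T^*\mathcal{L}^2}-\langle\alpha,\eta\rangle_{T^*\mathcal{L}^2}\langle\beta,\xi\rangle_{T^*\mathcal{L}^2},\qquad\alpha,\beta,\xi,\eta\in T^*\mathcal{L}^2 \tag{72} \]
by the orthogonality of $\delta^\parallel T^*_q\mathcal{L}^2$, $\delta^\perp T^*_q\mathcal{L}^2$, and $\overline{T^*_q\mathcal{L}^2}$. Any 2-form $\alpha\wedge\beta$ then decomposes into the sum of its five projections onto these subspaces and its norm squared is the sum of the norms squared of these components. Let us first give the five pieces of its norm names:
\[ \begin{aligned} T_1&:=\|\delta\beta^\parallel u\otimes\overline{\alpha}-\delta\alpha^\parallel u\otimes\overline{\beta}\|^2,& T_2&:=\|\delta\beta^\perp\otimes\overline{\alpha}-\delta\alpha^\perp\otimes\overline{\beta}\|^2,& T_3&:=\|\delta\beta^\parallel u\otimes\delta\alpha^\perp-\delta\alpha^\parallel u\otimes\delta\beta^\perp\|^2,\\ T_4&:=\|\delta\beta^\perp\otimes\delta\alpha^\perp-\delta\alpha^\perp\otimes\delta\beta^\perp\|^2,& T_5&:=\|\overline{\beta}\otimes\overline{\alpha}-\overline{\alpha}\otimes\overline{\beta}\|^2. \end{aligned} \]
In the above definitions $\|\cdot\|$ indicates the Euclidean norm. We have to be careful here: we have been using Euclidean norms in $\mathbb{R}^D$ in all our formulas above and now we are dealing with norms in $T^*\mathcal{L}^2$; these essentially differ only by a factor, by (61). More precisely, the following result holds:

Proposition 21.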
The denominator of the sectional curvature (17) for $\mathcal{L}^2(\mathbb{R}^D)$ can be written as:
\[ \|\alpha\wedge\beta\|^2_{\bigwedge^2T^*\mathcal{L}^2}=4\big(\gamma(0)^2-\gamma(\rho)^2\big)(T_1+T_2)+2\big(\gamma(0)-\gamma(\rho)\big)^2(2T_3+T_4)+2\big(\gamma(0)+\gamma(\rho)\big)^2T_5. \tag{73} \]

Proof. We may apply decomposition (71) to both $\alpha=\sum_{i=1}^{3}\alpha^{(i)}$ and $\beta=\sum_{i=1}^{3}\beta^{(i)}$, and write
\[ \alpha\wedge\beta=\big(\alpha^{(1)}\wedge\beta^{(3)}-\beta^{(1)}\wedge\alpha^{(3)}\big)+\big(\alpha^{(2)}\wedge\beta^{(3)}-\beta^{(2)}\wedge\alpha^{(3)}\big)+\big(\alpha^{(1)}\wedge\beta^{(2)}-\beta^{(1)}\wedge\alpha^{(2)}\big)+\alpha^{(2)}\wedge\beta^{(2)}+\alpha^{(3)}\wedge\beta^{(3)}, \]
where the five summands on the right-hand side belong to $V_1,\ldots,V_5$ respectively. We have
\[ \begin{aligned} \big\|\alpha^{(1)}\wedge\beta^{(3)}-\beta^{(1)}\wedge\alpha^{(3)}\big\|^2&=\big\|\alpha^{(1)}\wedge\beta^{(3)}\big\|^2+\big\|\beta^{(1)}\wedge\alpha^{(3)}\big\|^2-2\big\langle\alpha^{(1)}\wedge\beta^{(3)},\ \beta^{(1)}\wedge\alpha^{(3)}\big\rangle\\ &\overset{(*)}{=}4\big(\gamma(0)^2-\gamma(\rho)^2\big)\Big[(\delta\alpha^\parallel)^2\|\overline{\beta}\|^2+(\delta\beta^\parallel)^2\|\overline{\alpha}\|^2-2\,\delta\alpha^\parallel\delta\beta^\parallel\langle\overline{\alpha},\overline{\beta}\rangle\Big]\\ &=4\big(\gamma(0)^2-\gamma(\rho)^2\big)\,T_1, \end{aligned} \]
where all norms and inner products of 2-forms are taken in $\bigwedge^2T^*\mathcal{L}^2$ and we have used (72) and (61) in step $(*)$. The square norm of the remaining four terms is computed similarly. Orthogonality of $V_1,\ldots,V_5$ finally yields (73).

To express the formulas for the numerator of sectional curvature succinctly, let us also introduce abbreviations for the coefficients involving $\gamma$:
\[ \begin{aligned} k_1(\rho)&:=\big(\gamma(0)-\gamma(\rho)\big)^2\gamma''(\rho),& k_2(\rho)&:=\big(\gamma(0)-\gamma(\rho)\big)^2\frac{\gamma'(\rho)}{\rho},\\ k_3(\rho)&:=\big(\gamma(0)-\gamma(\rho)\big)\gamma'(\rho)^2,& k_4(\rho)&:=\frac{\big(\gamma(0)-\gamma(\rho)\big)^2}{\gamma(0)+\gamma(\rho)}\,\gamma'(\rho)^2. \end{aligned} \tag{74} \]
Note that $k_1$, $k_2$, $k_3$ and $k_4$ are all homogeneous of degree 3 in $\gamma$ and degree $-2$ in $\rho$ or $d\rho$ on $\mathcal{L}^N$. Moreover $k_2$ is negative, $k_3$ and $k_4$ are positive, while $k_1$ may be positive or negative. For all $\gamma$ of interest, $\gamma'$ is everywhere negative, starting at 0, decreasing to a minimum at some $\rho_0$, then increasing back to 0 at $\infty$. Then $k_1$ is negative for $\rho<\rho_0$ and positive for $\rho>\rho_0$.

Figure 6: The coefficients of $T_1$ (top left), $T_2$ (top right), $T_3$ (bottom left) and $T_4$ (bottom right) for the Bessel kernels $\gamma$ (shown with thin lines) and the Gaussian kernel (shown with the thick line). The kernels are scaled to normalize $\gamma(0)$ and $\gamma''(0)$.

The following equalities are proven by direct computation:
\[ \|\delta\beta\otimes\overline{\alpha}-\delta\alpha\otimes\overline{\beta}\|^2=T_1+T_2,\qquad \|\delta\beta^\perp\otimes\delta\alpha-\delta\alpha^\perp\otimes\delta\beta\|^2=T_3+T_4,\qquad \|\delta\beta\otimes\delta\alpha-\delta\alpha\otimes\delta\beta\|^2=2T_3+T_4. \]
Inserting notation (74) and the above equalities into Propositions 17 and 20 immediately yields:

Proposition 22.
We can write the terms in the numerator of sectional curvature for $\mathcal{L}^2(\mathbb{R}^D)$ as:
\[ \begin{aligned} R_1&=4k_1(T_1-T_3)+4k_2(T_2-T_3-T_4),& R_2&=-4k_3(T_1-T_3),\\ R_3&=k_3\big(2(T_1+T_2)-2T_3-T_4-T_5\big),& R_4&=-6\,(k_4T_1+k_3T_3), \end{aligned} \tag{75} \]
hence $R=R(\alpha^\sharp,\beta^\sharp,\beta^\sharp,\alpha^\sharp)=\sum_{i=1}^{4}R_i$ may be expressed as:
\[ R=2\big(2k_1-k_3-3k_4\big)T_1+2\big(2k_2+k_3\big)T_2+4\big(-k_1-k_2-k_3\big)T_3+\big(-4k_2-k_3\big)T_4-k_3T_5. \tag{76} \]
By virtue of Proposition 20 the above proposition still holds in the case of $\mathcal{L}^N(\mathbb{R}^D)$ as long as $\alpha,\beta\in(T^*_q\mathcal{L})_{1,2}$ and the equality signs for $R_4$ in (75) and $R$ in (76) are substituted by "$\le$".

Figure 7: Left: sectional curvature $K$ for $\mathcal{L}^2(\mathbb{R})$ (from Proposition 23), as a function of $\rho=|q^1-q^2|$. Right: two trajectories in $\mathcal{L}^2(\mathbb{R})$ shown in the $(q^1,q^2)$ plane (under the assumption that $q^1<q^2$); both geodesics originate at the same point, with different initial momenta.

The kernels $\gamma$ of interest are the Bessel kernels (3) and the Gaussian kernel, which is their asymptotic limit as their order goes to infinity. The coefficients for these kernels are shown in Figure 6. We see that the coefficients of $T_2$ and $T_3$ are negative while those of $T_4$ are positive. Henceforth, we assume we have a kernel for which this is true.

Finally, we will now explore the important example of two landmarks on the real line, $\mathcal{L}^2(\mathbb{R})$. In this particular case the manifold is two dimensional, so sectional curvature $K$ will turn out to be independent of cotangent vectors $\alpha$ and $\beta$. In fact, given the translation invariance of the metric tensor, it will only depend on the distance $\rho=|q^1-q^2|$ between the two landmarks.

The spaces $\delta^\parallel T^*\mathcal{L}^2(\mathbb{R})$ and $\overline{T^*\mathcal{L}^2(\mathbb{R})}$ are one-dimensional while $\delta^\perp T^*\mathcal{L}^2(\mathbb{R})=\{0\}$. Thus
\[ \bigwedge\nolimits^{2}T^*\mathcal{L}^2(\mathbb{R})=\delta^\parallel T^*\mathcal{L}^2(\mathbb{R})\wedge\overline{T^*\mathcal{L}^2(\mathbb{R})} \]
and the only non-zero term in (76) is $T_1$. Therefore combining formulas (73) and (76) we get:

Proposition 23.
The sectional curvature of $\mathcal{L}^2(\mathbb{R})$ is given by
\[ K=\frac{2k_1-k_3-3k_4}{2\big(\gamma(0)^2-\gamma(\rho)^2\big)}=\frac{\gamma(0)-\gamma(\rho)}{\gamma(0)+\gamma(\rho)}\,\gamma''(\rho)-\frac{2\gamma(0)-\gamma(\rho)}{\big(\gamma(0)+\gamma(\rho)\big)^2}\,\big(\gamma'(\rho)\big)^2. \]

Figure 8: Existence of conjugate points in $\mathcal{L}^2(\mathbb{R}^2)$. Both geodesics originate at the same landmark set; the first one (dashed) has initial momentum in $\overline{T^*\mathcal{L}^2}$ while the second one (continuous) has initial momentum in $\delta^\parallel T^*\mathcal{L}^2\oplus\overline{T^*\mathcal{L}^2}$. The geodesic trajectories exhibit conjugate points.

The above function $K$ is shown on the left-hand side of Figure 7 as a function of $\rho$, for the Gaussian kernel. The coefficient of the term $T_1$ in (76) is negative for $\rho$ small and positive for $\rho$ large. The "cause" of the positive curvature has been analyzed in [23]. Roughly speaking, suppose two points both want to move a fixed distance to the right. Then if they are far enough away, they can just move more or less independently (we shall refer to this as Geodesic 1). Or (i) the one in back can speed up while the one in front slows down, then (ii) when the pair are close, they move in tandem using less energy because they are close, and finally (iii) the back one slows down and the front one speeds up when they near their destinations (Geodesic 2). This gives explicit conjugate points (in the sense that two points are joined by distinct geodesics) and is illustrated on the right-hand side of Figure 7 (where Geodesics 1 and 2 are represented, respectively, by the dashed and thick curves).

There is another source of positive curvature in $\mathcal{L}^2$ in higher dimensions. It is clear from equation (76) and Figure 6 that any positive curvature must come from the term with $T_1$ or the term with $T_4$.
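The sign pattern just described can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the Gaussian profile $\gamma(r)=\exp(-r^2/2)$, reads (74) as $k_1=(\gamma(0)-\gamma)^2\gamma''$, $k_2=(\gamma(0)-\gamma)^2\gamma'/\rho$, $k_3=(\gamma(0)-\gamma)\gamma'^2$, $k_4=(\gamma(0)-\gamma)^2\gamma'^2/(\gamma(0)+\gamma)$, and reads the five coefficients of (76) as $2(2k_1-k_3-3k_4)$, $2(2k_2+k_3)$, $4(-k_1-k_2-k_3)$, $-4k_2-k_3$ and $-k_3$. The function names are hypothetical.

```python
import math

def gamma(r):
    # Assumed Gaussian profile gamma(r) = exp(-r^2 / 2).
    return math.exp(-0.5 * r * r)

def dgamma(r):
    # First derivative gamma'(r).
    return -r * gamma(r)

def d2gamma(r):
    # Second derivative gamma''(r).
    return (r * r - 1.0) * gamma(r)

def k_coefficients(rho):
    # The abbreviations k_1, ..., k_4 as read from (74) (see the assumptions above).
    g0, g = gamma(0.0), gamma(rho)
    d1, d2 = dgamma(rho), d2gamma(rho)
    k1 = (g0 - g) ** 2 * d2
    k2 = (g0 - g) ** 2 * d1 / rho
    k3 = (g0 - g) * d1 ** 2
    k4 = (g0 - g) ** 2 / (g0 + g) * d1 ** 2
    return k1, k2, k3, k4

def numerator_coefficients(rho):
    # Coefficients of T_1, ..., T_5 in the numerator R, as read from (76).
    k1, k2, k3, k4 = k_coefficients(rho)
    return (2 * (2 * k1 - k3 - 3 * k4),
            2 * (2 * k2 + k3),
            4 * (-k1 - k2 - k3),
            -4 * k2 - k3,
            -k3)

def curvature_on_line(rho):
    # Closed form of Proposition 23 for two landmarks on the real line.
    g0, g = gamma(0.0), gamma(rho)
    return ((g0 - g) / (g0 + g)) * d2gamma(rho) \
        - ((2 * g0 - g) / (g0 + g) ** 2) * dgamma(rho) ** 2

for rho in (0.5, 1.0, 2.0, 3.0):
    signs = ["+" if c > 0 else "-" for c in numerator_coefficients(rho)]
    print(f"rho = {rho}: signs of T1..T5 coefficients {signs}, K = {curvature_on_line(rho):+.4f}")
```

For this kernel the coefficients of $T_2$, $T_3$ and $T_5$ come out negative and that of $T_4$ positive at the sampled distances, while the coefficient of $T_1$ (and with it $K$) changes sign from negative to positive as $\rho$ grows. Dividing the $T_1$ coefficient by the $T_1$ coefficient $4(\gamma(0)^2-\gamma(\rho)^2)$ of the denominator (73) reproduces `curvature_on_line`.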
As the five terms are orthogonal, we can make all of them but one zero.

For example, if we choose $\alpha=(\delta\alpha^\parallel u,-\delta\alpha^\parallel u)\in\delta^\parallel T^*\mathcal{L}^2$ and $\beta=(\overline{\beta},\overline{\beta})\in\overline{T^*\mathcal{L}^2}$, then it is the case that $T_1=(\delta\alpha^\parallel)^2\|\overline{\beta}\|^2$ and it is the only non-zero term. Then, if $\rho$ is sufficiently large, the sectional curvature for this 2-plane is positive as discussed in the last section. Figure 8 illustrates an instance of the existence of conjugate points for two geodesics in $\mathcal{L}^2(\mathbb{R}^2)$; the momenta $(p_1,p_2)$ of each of the two trajectories belong at all times to $\delta^\parallel T^*\mathcal{L}^2\oplus\overline{T^*\mathcal{L}^2}$.

The other possibility is that $T_4$ is the non-zero term, which happens when $\alpha=(\delta\alpha^\perp,-\delta\alpha^\perp)\in\delta^\perp T^*\mathcal{L}^2$ and $\beta=(\delta\beta^\perp,-\delta\beta^\perp)\in\delta^\perp T^*\mathcal{L}^2$. We have $T_4=2\big(\|\delta\alpha^\perp\|^2\|\delta\beta^\perp\|^2-\langle\delta\alpha^\perp,\delta\beta^\perp\rangle^2\big)$, and for it to be nonzero it is required that $D\ge 3$: in fact $T_4$ is the norm of a 2-form in $\bigwedge^2\big(\delta^\perp T^*\mathcal{L}^2\big)$, which has dimension $(D-1)(D-2)/2$. The positive curvature of this section is readily seen by considering the geodesics which these vectors generate. The simplest example is the following:

Proposition 24.
The circular periodic orbit of radius $r$:
\[ q^1(t)=(r\cos t,\ r\sin t),\qquad q^2(t)=-q^1(t), \tag{77} \]
$t\in\mathbb{R}$, is a geodesic in $\mathcal{L}^2(\mathbb{R}^2)$ if and only if $r$ is the solution of the equation $\gamma(0)-\gamma(2r)+r\gamma'(2r)=0$.

Proof. For orbit (77) it is the case that $\overline{q}\equiv 0$, $\delta q=q^1$ and $\rho\equiv 2r$; also $p_1=\big(\gamma(0)-\gamma(\rho)\big)^{-1}\dot q^1$ and $p_2=-p_1$, so that $\overline{p}=0$ and $\delta p=p_1$. The first three equations of (62) can easily be checked; the fourth one holds if and only if $\gamma(0)-\gamma(2r)+r\gamma'(2r)=0$.

(The above result was also proven by François-Xavier Vialard of Imperial College, London.)

Orbit (77) has the property that at time $\pi$, $q^1$ and $q^2$ interchange their positions: it is a geodesic from the set of landmark points $\big((r,0),(-r,0)\big)\in\mathcal{L}^2(\mathbb{R}^2)$ to the set $\big((-r,0),(r,0)\big)\in\mathcal{L}^2(\mathbb{R}^2)$. But if these points live in $\mathbb{R}^3$, they can move around each other in any plane containing the points. Thus we have a circle of geodesics in $\mathcal{L}^2(\mathbb{R}^3)$:
\[ q^1(t)=(r\cos t,\ r\cos\theta\sin t,\ r\sin\theta\sin t),\qquad q^2(t)=-q^1(t), \]
all connecting $\big((r,0,0),(-r,0,0)\big)$ to $\big((-r,0,0),(r,0,0)\big)$, for any $\theta\in[0,2\pi)$. This is exactly like all the lines of fixed longitude connecting the north and south pole on the 2-sphere and means that one set of landmark points is a conjugate point of the other in $\mathcal{L}^2(\mathbb{R}^3)$. This is the simplest example of how geodesics between landmark points must avoid collisions and so make a choice between different possible detours, leading to conjugate points and thus positive curvature.

We believe that $\mathcal{L}^N(\mathbb{R}^D)$, the Riemannian manifold of $N$ landmark points in $D$ dimensions, is a fundamental object for differential geometry and that we have only scratched the surface in its study. We started with a basic formula which computes sectional curvature of a Riemannian manifold in terms of the cometric, its partial derivatives, and the metric itself (but not its derivatives). This is particularly adapted to computing curvature for manifolds which arise as submersive quotients of other manifolds and gives O'Neill's formula as a corollary. We then applied this to derive a formula for sectional curvature of the space of landmarks. This formula is not simple but, like Arnold's formula for curvature of Lie groups under left- (or right-) invariant metrics, splits into a sum of four terms. The four terms involve interesting intermediate expressions in the two vectors (or co-vectors) which define the section and which have relatively simple geometric interpretations. We called these the mixed force, the discrete vector strain, the scalar compression and the landmark derivative. The geodesic equation in its Hamiltonian form is quite simple and involves the force as expected. We also gave several concrete examples to illustrate the nature of these geodesics.

Finally, we have examined in detail the case of geodesics in which only one or two landmark points have non-zero momenta, and computed the curvature in sections spanned by such geodesics. We found that in this case there are essentially two sources of positive curvature. One can understand them through the non-uniqueness of geodesics joining two $N$-tuples: the first sort of non-uniqueness is caused by the two points with non-zero momentum choosing between converging in the middle of the geodesic ("car-pooling") or moving independently and not converging; the second occurs only when $D\ge 3$ and is caused by the two points choosing among the different planes in which they can move around each other (in the case $D=2$, this sort of non-uniqueness also occurs but comes from non-trivial topology, not curvature).

One of the most important questions left open is to explore how prevalent positive curvature is in general, i.e. for geodesics in which all points carry momentum. Answering this question is central to applications of landmark space in which geodesics are actually computed. One might hope that the picture for two momenta is true in general but this is far from clear.
It seems interesting to explore whether there is some sort of "index" for curvature forms, that is, a numerical measure of how much positive vs. negative curvature is present. Another important question is to explore the shape of the coefficients in (76) for different kernels. More generally, what is the impact of different kernel types (Bessel, Gaussian, Cauchy) on the corresponding geodesics? Finally, note that all kernels have a length constant built into their definition, so the geometry of the space of landmarks is far from scale invariant. Thus one should analyze what happens asymptotically when the points are very close relative to this constant or are very far from each other.

References

[1] M. Abramowitz and I. A. Stegun.
Handbook of Mathematical Functions. Dover Publications, New York, 1964.
[2] S. Allassonnière, Y. Amit, and A. Trouvé. Toward a coherent statistical framework for dense deformable template estimation. Journal of the Royal Statistical Society: Series B, 69(1):3–29, 2007.
[3] V. I. Arnold. Mathematical Methods of Classical Mechanics, volume 60 of Graduate Texts in Mathematics. Springer, New York, second edition, 1989.
[4] M. F. Beg, M. I. Miller, A. Trouvé, and L. Younes. Computing large deformation metric mappings via geodesic flows of diffeomorphisms. International Journal on Computer Vision, 61(2):139–157, 2005.
[5] Y. Cao, M. I. Miller, S. Mori, R. L. Winslow, and L. Younes. Diffeomorphic matching of diffusion tensor images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '06), New York, June 2006.
[6] Y. Cao, M. I. Miller, R. L. Winslow, and L. Younes. Large deformation diffeomorphic metric mapping of vector fields. IEEE Transactions on Medical Imaging, 24(9):1216–1230, 2005.
[7] C. Chicone. Ordinary Differential Equations and Applications, volume 34 of Texts in Applied Mathematics. Springer, 1999.
[8] S. Durrleman, M. Prastawa, G. Gerig, and S. Joshi. Optimal data-driven sparse parameterization of diffeomorphisms for population analysis. In Proc. of the 22nd Conf. on Information Processing in Medical Imaging (IPMI), Bavaria, Germany, July 2011.
[9] L. C. Evans. Partial Differential Equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, Providence, Rhode Island, 1998.
[10] S. Gallot, D. Hulin, and J. Jacques Lafontaine. Riemannian Geometry. Springer, 3rd edition, 2004.
[11] J. Glaunès. Transport par difféomorphismes de points, de mesures et de courants pour la comparaison de formes et l'anatomie numérique. PhD thesis, Université Paris 13, France, Sept. 2005.
[12] J. Glaunès, A. Qiu, M. I. Miller, and L. Younes. Large deformation diffeomorphic metric curve mapping. International Journal of Computer Vision, 80(3):317–336, Dec. 2008.
[13] J. Glaunès, A. Trouvé, and L. Younes. Diffeomorphic matching of distributions: a new approach for unlabeled point-sets and sub-manifolds matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '04), volume 2, pages 712–718, Washington, DC, June 2004.
[14] J. Glaunès, M. Vaillant, and M. I. Miller. Landmark matching via large deformation diffeomorphisms on the sphere. Journal of Mathematical Imaging and Vision, 20:170–200, 2004.
[15] S. C. Joshi and M. I. Miller. Landmark matching via large deformation diffeomorphisms. IEEE Transactions on Image Processing, 9(8):1357–1370, Aug. 2000.
[16] J. Jost. Riemannian Geometry and Geometric Analysis. Springer-Verlag, New York, 5th edition, 2008.
[17] D. G. Kendall. Shape manifolds, Procrustean metrics, and complex projective spaces. Bulletin of the London Mathematical Society, 16(2):81–121, 1984.
[18] E. Klassen, A. Srivastava, W. Mio, and S. Joshi. Analysis of planar shapes using geodesic paths on shape spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(3):372–383, Mar. 2004.
[19] A. Kriegl and P. W. Michor. The Convenient Setting of Global Analysis, volume 53 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, Rhode Island, 1997.
[20] S. Kushnarev. Teichons: Soliton-like geodesics on universal Teichmüller space. Experimental Mathematics, 18(3), 2009.
[21] J. M. Lee. Riemannian Manifolds: an Introduction to Curvature, volume 176 of Graduate Texts in Mathematics. Springer, New York, 1997.
[22] R. L. McLachlan and S. Marsland. N-particle dynamics of the Euler equations for planar diffeomorphisms. Dynamical Systems, 22:269–290, 2007.
[23] M. Micheli. The Geometry of Landmark Shape Spaces: Metrics, Geodesics, and Curvature. PhD thesis, Brown University, Providence, Rhode Island, 2008.
[24] P. W. Michor. Topics in differential geometry, volume 93 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2008.
[25] P. W. Michor and D. B. Mumford. Riemannian geometries on spaces of plane curves. Journal of the European Mathematical Society, 8:1–48, 2006.
[26] P. W. Michor and D. B. Mumford. An overview of the Riemannian metrics on spaces of curves using the Hamiltonian approach. Applied and Computational Harmonic Analysis, 23:74–113, 2007.
[27] M. I. Miller and L. Younes. Group actions, homeomorphisms, and matching: A general framework. International Journal of Computer Vision, 41(1/2):61–84, 2001.
[28] E. Sharon and D. B. Mumford. 2D-shape analysis using conformal mapping.
InternationalJournal of Computer Vision , 70(1):55–75, Oct. 2006.[29] S. Sommer, F. Lauze, M. Nielsen, and X. Pennec. Kernel Bundle EPDiff: Evolution equationsfor multi-scale diffeomorphic image registration. In
Proc. of the 3rd Conf. on Scale Space andVariational Methods in Computer Vision (SSVM) , Ein-Gedi, Israel, May-June 2011.[30] G. Sundaramoorthi, A. Yezzi, and A. Mennucci. Sobolev active contours.
International Journalof Computer Vision , 73(3):345–366, July 2007.[31] L. Younes.
Shapes and Diffeomorphisms , volume 171 of
Applied Mathematical Sciences .Springer, 2010.[32] L. Younes, P. W. Michor, J. Shah, and D. B. Mumford. A metric on shape space with explicitgeodesics.
Rendiconti Lincei – Matematica e Applicazioni , 9:25–57, 2008.[33] S. Zhang, L. Younes, J. Zweck, and J. T. Ratnanather. Diffeomorphic surface flows: A novelmethod of surface evolution.
SIAM Journal on Applied Mathematics , 68(3):806–824, Jan. 2008.38 ectional Curvature in terms of the Cometric, withApplications to the Riemannian Manifolds of Landmarks
Mario Micheli, Department of Mathematics, Univ. of California, Los Angeles, 520 Portola Plaza, Los Angeles, CA 90095, USA
Peter W. Michor, Fakultät für Mathematik, Universität Wien, Nordbergstrasse 15, A-1090 Wien, Austria
David Mumford, Div. of Applied Mathematics, Brown University, 182 George Street, Providence, RI 02012, USA
Keywords: shape spaces, landmark points, cometric, sectional curvature.
Acknowledgements: MM was supported by ONR grant N00014-09-1-0256, PWM was supported by FWF-project 21030, DM was supported by NSF grant DMS-0704213, and all authors were supported by NSF grant DMS-0456253 (Focused Research Group: The geometry, mechanics, and statistics of the infinite dimensional shape manifolds). MM would like to thank Andrea Bertozzi of UCLA for her continuous advice and support.
Abstract
This paper deals with the computation of sectional curvature for the manifolds of $N$ landmarks (or feature points) in $D$ dimensions, endowed with the Riemannian metric induced by the group action of diffeomorphisms. The inverse of the metric tensor for these manifolds (i.e. the cometric), when written in coordinates, is such that each of its elements depends on at most $2D$ of the $ND$ coordinates. This makes the matrices of partial derivatives of the cometric very sparse in nature, thus suggesting solving the highly non-trivial problem of developing a formula that expresses sectional curvature in terms of the cometric and its first and second partial derivatives (we call this Mario's formula). We apply this formula to the manifolds of landmarks and in particular we fully explore the case of geodesics on which only two points have non-zero momenta and compute the sectional curvatures of 2-planes spanned by the tangents to such geodesics. The latter example gives insight into the geometry of the full manifolds of landmarks.

In the past few years there has been a growing interest, in diverse scientific communities, in modeling shape spaces as Riemannian manifolds. The study of shapes and their similarities is in fact central in computer vision and related fields (e.g. for object recognition, target detection and tracking, classification of biometric data, and automated medical diagnostics), in that it allows one to recognize and classify objects from their representation. In particular, a distance function between shapes should express the meaning of similarity between them for the application that one has in mind, and also be mathematically sound and treatable.

Among the several ways of endowing a shape manifold with a Riemannian structure (see, for example, [13, 14, 16, 21, 24, 25]), one that has recently gained popularity is inducing it through the action on the manifold itself of an infinite-dimensional Lie group of diffeomorphisms with a given metric, as described in [23, 26].
This approach can be used to provide a metric to several deformation-related shape spaces, such as the manifolds of curves [8, 22], surfaces [28], scalar images [2], vector fields [4], diffusion tensor images [3], measures [7, 9], and labeled landmarks (or "feature points") [10, 11]. The actual geometry of these Riemannian manifolds has remained almost completely unknown until very recently, when certain fundamental questions about their curvature have finally started being addressed [21, 22, 27].

This paper deals with the problem of computing sectional curvature for landmark points, and is based on results from the thesis of the first author [19]. The manifold of landmarks in Euclidean space is defined as

$$\mathcal{L}^N(\mathbb{R}^D) := \big\{ (P^1,\dots,P^N) \;\big|\; P^a \in \mathbb{R}^D,\ P^a \neq P^b \text{ if } a \neq b \big\}.$$

This is among the simplest shape manifolds in that it is finite-dimensional, albeit with high dimension $n = ND$, where $N$ is the number of landmarks and $D$ is the dimension of the ambient space in which they live (e.g. $D = 2$ for the plane or the sphere). Therefore its metric tensor may be written, in any set of coordinates, as a finite-dimensional matrix. In fact it turns out that the inverse of the matrix defining the metric induced by the group action of diffeomorphisms (the cometric) has a relatively simple structure, since each of its elements depends on at most $2D$ of the $ND$ coordinates. Hence the matrices obtained by taking first and second partial derivatives of the cometric have a very sparse structure, that is, most of their entries are zero. This suggests that for the purpose of calculating curvature (rather than following the "classical" path of computing first and second partial derivatives of the metric tensor itself, the Christoffel symbols, et cetera) it would be convenient to write sectional curvature in terms of the inverse of the metric tensor and its derivatives.
So we have solved the highly non-trivial problem of developing a formula (that we call "Mario's formula") precisely for this purpose: for a given pair of cotangent vectors this formula expresses the corresponding sectional curvature as a function of the cometric and its first and second partial derivatives, except for one term which requires the metric (but not its derivatives).

The paper is organized as follows. We first give a few more details about the motivational example provided by the manifold of landmarks, and describe the metric induced by the action of the Lie group of diffeomorphisms. We then give a proof for the general formula expressing sectional curvature in terms of the cometric. This formula is used in the following section to compute the sectional curvature for the manifold of labeled landmarks. In the last section, we analyze the case of geodesics on which only two points have non-zero momenta and the sectional curvatures of 2-planes made up of the tangents to such geodesics. In this case, both the geodesics and the curvature are much simpler and give insight into the geometry of the full landmark space.

In this section we briefly summarize how the shape space of landmarks can be given the structure of a Riemannian manifold. We refer the reader to [23, 26] for the general framework on how to endow generic shape manifolds with a Riemannian metric via the action of Lie groups of diffeomorphisms.
We will first define a distance function $d : \mathcal{L}^N(\mathbb{R}^D) \times \mathcal{L}^N(\mathbb{R}^D) \to \mathbb{R}^+$ on landmark space which will then turn out to be the geodesic distance with respect to a Riemannian metric. Let $\mathcal{Q}$ be the set of differentiable landmark paths, that is:

$$\mathcal{Q} := \big\{ q = (q^1,\dots,q^N) : [0,1] \to \mathcal{L}^N(\mathbb{R}^D) \;\big|\; q^a \in C^1\big([0,1],\mathbb{R}^D\big),\ a = 1,\dots,N \big\}.$$

Following [26, Chapters 8 and 9], a Hilbert space $\big(V, \langle\,,\rangle_V\big)$ of vector fields on Euclidean space (which we consider as functions $\mathbb{R}^D \to \mathbb{R}^D$) is said to be admissible if (i) $V$ is continuously embedded in the space of $C^1$-mappings $\mathbb{R}^D \to \mathbb{R}^D$ which are bounded together with their derivatives, and (ii) $V$ is large enough: for any positive integer $M$, if $x_1,\dots,x_M \in \mathbb{R}^D$ and $\alpha_1,\dots,\alpha_M \in \mathbb{R}^D$ are such that, for all $u \in V$, $\sum_{a=1}^M \big\langle \alpha_a, u(x_a) \big\rangle_{\mathbb{R}^D} = 0$, then $\alpha_1 = \dots = \alpha_M = 0$.

Thus $(V, \langle\,,\rangle_V)$ admits a reproducing kernel: for each $\alpha, x \in \mathbb{R}^D$ there exists $K^\alpha_x \in V$ with $\langle \alpha, f(x) \rangle_{\mathbb{R}^D} = \langle K^\alpha_x, f \rangle_V$ for all $f \in V$. Further, $\langle \beta, K^\alpha_x(y) \rangle_{\mathbb{R}^D} = \langle K^\beta_y, K^\alpha_x \rangle_V$, which is a bilinear form in $(\alpha,\beta) \in (\mathbb{R}^D)^2$, thus given by a $D \times D$ matrix $K(x,y)$.

In this paper we shall assume that $K(x,y)$ is a multiple of the identity and is translation invariant: we then write $K(x,y)$ simply as $K(x-y)\, I_D$ (where $I_D$ is the $D \times D$ identity matrix). There are other very natural admissible norms on vector fields $v$ whose kernels are not multiples of the identity, e.g. one can add a multiple of $\operatorname{div}(v)$ to any norm and then $K$ will intertwine different components of $v$. The most natural examples of the norms we will consider are given by inner products

$$\langle u, v \rangle_V = \langle u, v \rangle_L := \int_{\mathbb{R}^D} \big\langle Lu(x), v(x) \big\rangle_{\mathbb{R}^D}\, dx, \tag{1}$$

where $L$ is a self-adjoint elliptic scalar differential operator of order greater than $D + 2$ with constant coefficients which is applied separately to each of the scalar components of the vector field $u = (u^1,\dots,u^D)$.
By the Sobolev embedding theorem, $V$ then consists of $C^1$-functions on $\mathbb{R}^D$ which are bounded together with their derivatives. If $K$ is a scalar fundamental solution (or Green's function [6]) so that $L(K)(x) = \delta(x)$, then the reproducing kernel is given by $K^\alpha_x = K(\cdot - x)\,\alpha$. A possible choice of the operator is $L = (1 - A^2\Delta)^k$ (where $A \in \mathbb{R}$ is a scaling factor, $k \in \mathbb{N}$ and $\Delta$ is the Laplacian operator), with $k > \frac{D}{2} + 1$, in which case (1) becomes the Sobolev norm:

$$\|u\|_L^2 = \int_{\mathbb{R}^D} \sum_{\ell=1}^{D} \sum_{m=0}^{k} \binom{k}{m} A^{2m} \sum_{|\alpha| = m} \big| D^\alpha u^\ell \big|^2\, dx. \tag{2}$$

When $L = (1 - A^2\Delta)^k$ the scalar kernel $K$ has the form $K(x - y) = \gamma\big(\|x - y\|_{\mathbb{R}^D}\big)$, with:

$$\gamma(\varrho) = \frac{1}{2^{k + \frac{D}{2} - 1}\, \pi^{\frac{D}{2}}\, \Gamma(k)\, A^D} \Big(\frac{\varrho}{A}\Big)^{k - \frac{D}{2}} K_{k - \frac{D}{2}}\Big(\frac{\varrho}{A}\Big), \qquad \varrho > 0, \tag{3}$$

where $K_\nu$ (with $\nu = k - \frac{D}{2}$) is a modified Bessel function [1] of order $\nu$.

Now fix any admissible Hilbert space of vector fields. The space $L^p([0,1],V)$ is the set of functions $v : [0,1] \to V$ such that:

$$\|v\|_{L^p([0,1],V)} := \Big( \int_0^1 \|v(t,\cdot)\|_V^p\, dt \Big)^{\frac{1}{p}} < \infty.$$

The space $L^2([0,1],V)$ is a subset of $L^1([0,1],V)$ and is in fact a Hilbert space with inner product $\langle u, v \rangle_{L^2([0,1],V)} := \int_0^1 \langle u, v \rangle_V\, dt$. It is well known [5] from the theory of ordinary differential equations that for any $v \in L^1([0,1],V)$, the $D$-dimensional non-autonomous dynamical system $\dot z = v_t(z)$, with initial condition $z(t_0) = x$, has a unique solution of the type $z(t) = \psi(t, t_0, x)$. Let $\varphi^v_{st}(x) := \psi(t, s, x)$; fixing $t = 1$ and $s = 0$ we get $\varphi^v := \varphi^v_{01}$, which is the diffeomorphism generated by $v$. For an admissible Hilbert space we will call the set

$$G_V := \big\{ \varphi^v : v \in L^1\big([0,1],V\big) \big\}$$

the group of diffeomorphisms generated by $V$; by [26, Chapter 8] it is a metric space and a topological group. But, in the language of manifolds, $G_V$ is not an infinite-dimensional Lie group [15].
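As a small numerical illustration of the scalar kernel (3), the following sketch evaluates $\gamma(\varrho)$ and assembles the matrix of kernel values at a set of points. It is only a sketch under stated assumptions: the function names `bessel_k`, `gamma_kernel` and `kernel_matrix` are hypothetical, $K_\nu$ is taken to be the modified Bessel function of the second kind and is computed from the standard integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$ with a plain trapezoidal rule, and the value at $\varrho = 0$ uses the limit $z^\nu K_\nu(z) \to 2^{\nu-1}\Gamma(\nu)$ for $\nu > 0$.

```python
import math


def bessel_k(nu, x, steps=4000):
    """Modified Bessel function of the second kind K_nu(x) for x > 0.

    Uses K_nu(x) = int_0^inf exp(-x cosh t) cosh(nu t) dt, truncated where
    exp(-x cosh t) is negligible, with the trapezoidal rule.
    """
    t_max = math.acosh(max(2.0, 700.0 / x))  # integrand is ~0 beyond t_max
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # endpoint t = 0 carries weight 1/2
    for j in range(1, steps + 1):
        t = j * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h


def gamma_kernel(rho, k, D, A=1.0):
    """Scalar kernel gamma(rho) of (3) for L = (1 - A^2 Laplacian)^k on R^D."""
    nu = k - D / 2.0
    const = 1.0 / (2.0 ** (k + D / 2.0 - 1.0) * math.pi ** (D / 2.0)
                   * math.gamma(k) * A ** D)
    if rho == 0.0:
        # limit of z^nu K_nu(z) as z -> 0+, valid for nu > 0
        return const * 2.0 ** (nu - 1.0) * math.gamma(nu)
    z = rho / A
    return const * z ** nu * bessel_k(nu, z)


def kernel_matrix(points, k, D, A=1.0):
    """Symmetric matrix of kernel values gamma(|x_a - x_b|) for a list of points."""
    return [[gamma_kernel(math.dist(p, q), k, D, A) for q in points]
            for p in points]


# Three points in the plane, k = 4 (so that k > D/2 + 1 holds for D = 2).
K = kernel_matrix([(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)], k=4, D=2)
```

A convenient check of the normalisation is the case $D = 1$, $k = 1$ (which does not satisfy $k > \frac{D}{2} + 1$ and is used here only because the Green's function is elementary): there (3) reduces to $\gamma(\varrho) = e^{-\varrho/A}/(2A)$.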
$V$ is not a Lie algebra, but is the completion of the Lie algebra of $C^\infty$-vector fields with compact support with respect to $\|\cdot\|_V$.

Definition of the distance function

For velocity vector fields $v \in L^2([0,1],V)$ and landmark trajectories $q \in \mathcal{Q}$ define the energy

$$E[v,q] := \int_0^1 \Big( \big\| v(t,\cdot) \big\|_V^2 + \lambda \sum_{a=1}^{N} \Big\| \frac{dq^a}{dt}(t) - v\big(t, q^a(t)\big) \Big\|_{\mathbb{R}^D}^2 \Big)\, dt. \tag{4}$$

We claim that a distance function $d$ on $\mathcal{L}^N(\mathbb{R}^D)$ between two landmark sets (or shapes) $I = (x^1, x^2, \dots, x^N)$ and $I' = (y^1, y^2, \dots, y^N)$ can be defined as

$$d(I, I') := \inf_{v,q} \Big\{ \sqrt{E[v,q]} \;:\; v \in L^2([0,1],V),\ q \in \mathcal{Q} \text{ with } q(0) = I,\ q(1) = I' \Big\}; \tag{5}$$

in the next subsection we will argue that the above function is in fact a geodesic distance with respect to a Riemannian metric. We treat the minimization of (4) as our starting point; it is the "energy of a metamorphosis" as formulated in [26, Chapter 13].

The above infimum is computed over all differentiable landmark paths $q \in \mathcal{Q}$ that satisfy the boundary conditions ($q^a(0) = x^a$ and $q^a(1) = y^a$, $a = 1,\dots,N$), and vector fields $v \in L^2([0,1],V)$. The resulting landmark trajectories $\{q^a(t),\ t \in [0,1]\}_{a=1,\dots,N}$ follow the minimizing velocity field more or less exactly, depending on the value of the smoothing parameter $\lambda \in (0,\infty]$; it is a weight between the first term, which measures the smoothness of the vector field that generates the diffeomorphism, and the second term, which measures how closely the landmark trajectories actually follow the vector field. When $\lambda = \infty$ we have exact matching, i.e. the landmark trajectories exactly satisfy the ordinary differential equations $\dot q^a = v(t, q^a)$, $a = 1,\dots,N$, that are obtained by setting the integrands of the second term in the right-hand side of (4) equal to zero.
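The exact-matching condition, that every landmark solves $\dot q^a = v(t, q^a)$ and is therefore carried by the time-one flow $\varphi^v$, can be made concrete by integrating the flow numerically. The following is a minimal sketch rather than the authors' implementation: the name `flow`, the classical fourth-order Runge–Kutta scheme and the step count are assumptions, and the test field `rotation` is a rigid rotation, which is unbounded and hence not an element of an admissible $V$; it is chosen only because its time-one map is known in closed form.

```python
import math


def flow(v, points, steps=200):
    """Approximate phi^v(x) = z(1), where dz/dt = v(t, z) and z(0) = x.

    v      : callable (t, z) -> list of velocity components
    points : iterable of starting points x
    Uses the classical RK4 scheme with `steps` uniform steps on [0, 1].
    """
    h = 1.0 / steps
    images = []
    for x in points:
        z = list(x)
        for n in range(steps):
            t = n * h
            k1 = v(t, z)
            k2 = v(t + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k1)])
            k3 = v(t + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k2)])
            k4 = v(t + h, [zi + h * ki for zi, ki in zip(z, k3)])
            z = [zi + h / 6 * (a + 2 * b + 2 * c + d)
                 for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
        images.append(tuple(z))
    return images


def rotation(t, z):
    """Hypothetical test field: rotation of the plane by 90 degrees per unit time."""
    w = math.pi / 2.0
    return [-w * z[1], w * z[0]]


# Two landmarks transported by the flow of the test field.
landmarks_end = flow(rotation, [(1.0, 0.0), (0.0, 2.0)])
```

With this test field the landmark starting at $(1,0)$ should end near $(0,1)$ and the one starting at $(0,2)$ near $(-2,0)$, up to the discretization error of the integrator.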
In fact, the exact matching problem can be equivalently expressed as the minimization of

$$E[v] := \int_0^1 \|v(t,\cdot)\|_V^2\, dt$$

among all $v \in L^2([0,1],V)$ such that $\varphi^v(x^a) = y^a$, $a = 1,\dots,N$. When $\lambda < \infty$ in (4) we have regularized matching, i.e. the landmark trajectories "almost" satisfy this set of ordinary differential equations; this allows for the time varying vector field to be smoother. For this reason the second term in (4) is often referred to as the smoothing term; by allowing smoother vector fields the distance $d$ is made tolerant to small diffeomorphisms and therefore more robust to object variations due to noise in the data.

By manipulating expression (4) we will now show that it is equivalent to the energy of a path $q \in \mathcal{Q}$ with respect to a Riemannian metric tensor.

Notation.
Consider a landmark configuration $q = (q^1,\dots,q^N)$ in $\mathcal{L}^N(\mathbb{R}^D)$. The $D$ scalar components in Euclidean coordinates of the $N$ landmark trajectories $q^a = (q^{a1},\dots,q^{aD})$, $a = 1,\dots,N$, can be ordered either into an $N \times D$ matrix or in a tall concatenated column vector. We shall always use indices $a, b, c, \dots \in \{1,\dots,N\}$ as landmark indices, and $i, j, k, \dots \in \{1,\dots,D\}$ as space coordinates in $\mathbb{R}^D$. We will associate to each of the $N$ landmarks $q^a \in \mathbb{R}^{1 \times D}$ a momentum $p_a \in \mathbb{R}^{1 \times D}$ (defined in the next Proposition) which we will write, in coordinates, as $p_a = (p_{a1},\dots,p_{aD})$, for each $a = 1,\dots,N$. The components of momenta can also be ordered into an $N \times D$ matrix or in a long row vector. We use superscript indices for landmark coordinates and subscript indices for momenta.

For a given set of landmarks $(q^1,\dots,q^N) \in \mathcal{L}^N(\mathbb{R}^D)$ we will define the symmetric $N \times N$ matrix

$$K(q) := \big( K(q^a - q^b) \big)_{a,b = 1,\dots,N}.$$

The matrix $K(q)$ is positive definite, whence invertible.

Proposition 1.
For a fixed landmark path $q = \big\{ q^a : [0,1] \to \mathbb{R}^D \big\}_{a=1}^{N} \in \mathcal{Q}$ there exists a unique minimizer with respect to $v \in L^2([0,1],V)$ of the energy $E[v,q]$, namely:

$$v^*(t,x) := \sum_{a=1}^{N} p_a(t)\, K\big(x - q^a(t)\big), \qquad t \in [0,1],\ x \in \mathbb{R}^D, \tag{6}$$

where the components of the momenta are given by:

$$p_{ai}(t) = \sum_{b=1}^{N} \Big( K\big(q(t)\big) + \frac{I_N}{\lambda} \Big)^{-1}_{ab} \cdot \frac{d}{dt} q^{bi}(t), \qquad t \in [0,1], \tag{7}$$

$a = 1,\dots,N$, $i = 1,\dots,D$ (here $I_N$ indicates the $N \times N$ identity matrix).

Remark.
What the above proposition essentially says is that the vector field of minimum energy that transports the $N$ landmarks along fixed trajectories is, at any point in time, the linear combination of $N$ lumps of velocity, each centered at a landmark point. The directions and amplitudes of the summands are determined precisely by the momenta.

Proof of Proposition 1.
Using property (ii) of the admissible Hilbert space $V$, [26, Lemma 9.5] shows that for given $q = (q^1,\dots,q^N) \in \mathcal{L}^N(\mathbb{R}^D)$ we have the orthogonal decomposition

$$V = \big\{ v \in V : v(q^a) = 0,\ a = 1,\dots,N \big\} \oplus \Big\{ v = \sum_{a=1}^{N} \alpha_a K(\cdot - q^a) : \alpha_a \in \mathbb{R}^D \Big\}. \tag{8}$$

Thus the minimizer must have the form

$$v(t,x) = \sum_{a=1}^{N} \alpha_a(t)\, K\big(x - q^a(t)\big), \qquad t \in [0,1],\ x \in \mathbb{R}^D, \tag{9}$$

for some coefficients $\alpha_a \in C([0,1],\mathbb{R}^D)$, $a = 1,\dots,N$, to be computed. For velocities of the type (9) the energy (4) can be rewritten as

$$E[v,q] = \int_0^1 \sum_{i=1}^{D} \Big\{ \sum_{a,b=1}^{N} \alpha_{ai}\, K(q^a - q^b)\, \alpha_{bi} + \lambda \sum_{b=1}^{N} \Big| \sum_{a=1}^{N} \alpha_{ai}\, K(q^a - q^b) - \dot q^{bi} \Big|^2 \Big\}\, dt. \tag{10}$$

Setting the first variation of (10) with respect to the coefficients $\alpha_{ai}$ to zero gives $\big(K(q) + \frac{I_N}{\lambda}\big)\alpha = \dot q$ for each space coordinate, which yields the momenta (7).

It is convenient, at this point, to introduce the $ND \times ND$ block-diagonal matrix

$$g(q) := \begin{pmatrix} \big(K(q) + \frac{I_N}{\lambda}\big)^{-1} & 0 & \cdots & 0 \\ 0 & \big(K(q) + \frac{I_N}{\lambda}\big)^{-1} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \big(K(q) + \frac{I_N}{\lambda}\big)^{-1} \end{pmatrix}, \tag{11}$$

where the $N \times N$ block $\big(K(q) + \frac{I_N}{\lambda}\big)^{-1}$ is repeated $D$ times; the choice of symbol $g$ is justified by the fact that (11) is, as we shall see soon, precisely the Riemannian metric tensor with which we are endowing the manifold of landmarks, written in coordinates.

Thus for a fixed path $q \in \mathcal{Q}$ the minimizer of $E[v,q]$ with respect to $v \in L^2([0,1],V)$ is given by (6); since it depends on $q$ we will write it, with an abuse of notation, as $v^*(q)$. We can define

$$\widetilde{E}[q] := E[v^*(q), q], \tag{12}$$

which depends only on the arbitrary path $q \in \mathcal{Q}$. The energy (12) is "equivalent" to the energy $E[v,q]$, in that: (a) if $(\hat v, \hat q)$ minimizes $E[v,q]$ then $\hat q$ minimizes $\widetilde{E}[q]$, and $E[\hat v, \hat q] = \widetilde{E}[\hat q]$; (b) if $\hat q$ minimizes $\widetilde{E}[q]$ then $(v^*(\hat q), \hat q)$ minimizes $E[v,q]$, and $E[v^*(\hat q), \hat q] = \widetilde{E}[\hat q]$.

Proposition 2.
For an arbitrary landmark trajectory $q \in \mathcal{Q}$ the energy $\widetilde{E}[q]$ is given by:

$$\widetilde{E}[q] = \int_0^1 \dot q(t)^T g\big(q(t)\big)\, \dot q(t)\, dt = \int_0^1 \sum_{a,b=1}^{N} \sum_{i=1}^{D} \dot q^{ai}(t)\, \dot q^{bi}(t) \Big( K\big(q(t)\big) + \frac{I_N}{\lambda} \Big)^{-1}_{ab}\, dt. \tag{13}$$

Proof.
Following definition (12), formulae (7) for the momenta are inserted into the modified expression (10) for the energy $E[v,q]$. Simple matrix manipulations finally yield the right-hand side of (13).

Remarks.
Expression (13) has exactly the form of the energy of a path $q$ with respect to the Riemannian metric tensor (11). Whence given two landmark configurations $I$ and $I'$ in $\mathcal{L}^N(\mathbb{R}^D)$ we have that if $\hat q$ minimizes (13) among all paths $q \in \mathcal{Q}$ such that $q(0) = I$ and $q(1) = I'$ then $(\widetilde{E}[\hat q])^{1/2}$ is the geodesic distance between $I$ and $I'$. By point (b) above we also have that $(v^*(\hat q), \hat q)$ is a minimum of the energy $E[v,q]$, so $d(I,I')$ defined in (5) coincides with $(\widetilde{E}[\hat q])^{1/2}$ and is the geodesic distance between $I$ and $I'$ with respect to the metric tensor $g$.

The Lagrangian function that corresponds to energy (13) is:

$$\mathcal{L}(q, \dot q) = \frac{1}{2}\, \dot q^T g(q)\, \dot q = \frac{1}{2} \sum_{a,b=1}^{N} \sum_{i=1}^{D} \dot q^{ai}\, \dot q^{bi} \Big( K(q) + \frac{I_N}{\lambda} \Big)^{-1}_{ab}. \tag{14}$$

In Hamiltonian mechanics [12] the "momenta" are defined as $p_{ai} = \partial \mathcal{L} / \partial \dot q^{ai}$, or, in vector notation, $p_{(i)} = \partial \mathcal{L} / \partial \dot q^{(i)}$ (for $i = 1,\dots,D$). Applying this definition to (14) yields precisely equations (7) of Proposition 1. Whence the use of the term momenta is justified.

Note that for small values of the parameter $\lambda$ the metric tensor $g$, written in coordinates, gets close (up to a multiplicative constant) to the $ND \times ND$ identity matrix; in other words, for $\lambda \to 0$ the metric $g$ converges to a Euclidean metric and the geodesic curves become straight lines. On the other hand, for $\lambda \to \infty$ (exact matching) the metric converges to $[\operatorname{diag}\{K(q),\dots,K(q)\}]^{-1}$ (the block $K(q)$ is repeated $D$ times). In general, the block-diagonal form of the metric tensor $g$ given by (11) follows from the operator $L$ in (2) being separately applied to each of the components of the velocity field; however the dynamics of the $D$ dimensions of $q$ are not decoupled since all $ND$ components of $q$ appear in each diagonal block of $g$. In the case of exact matching one can prove that landmarks "never collide": in other words, it takes an infinite energy to make any two landmarks coincide.

Figure 1 shows the qualitative behavior of geodesics in $\mathcal{L}^2(\mathbb{R}^2)$, with $\lambda = \infty$.
In the case illustrated on the left-hand side both landmarks travel in the same direction (from left to right, as indicated by the arrows): the two arcs of the geodesic "attract" each other, or in other words the two landmarks tend to "carpool" by using a velocity field with the smallest possible support so as to minimize the $L^2$ part (i.e. the first term) of the Sobolev norm (2) of the velocity field. On the other hand when the two landmarks travel in opposite directions (as illustrated on the right-hand side of Figure 1) they try to avoid each other so that the higher order terms of the Sobolev norm are kept small; we shall return to the issue of obstacle avoidance at the end of this paper. A typical geodesic in a landmark manifold $\mathcal{L}^N(\mathbb{R}^D)$ (again with $\lambda = \infty$) is shown in Figure 2.

Figure 1: Two trajectories in $\mathcal{L}^2(\mathbb{R}^2)$. Bullets (•) and circles (◦) are the initial and final sets of landmarks, respectively. The grids represent the two corresponding diffeomorphisms $\varphi^v$.

Conclusion.
We have shown that the distance $d(I,I')$, $I, I' \in \mathcal{L}^N(\mathbb{R}^D)$, defined in (5) is in fact the geodesic distance with respect to a Riemannian metric. In coordinates, the corresponding Riemannian metric tensor is given by (11), which is such that each element of its inverse (the cometric) depends on at most $2D$ of the $ND$ coordinates. Whence the first and second partial derivatives of the cometric have a very sparse structure. This gives us motivation for deriving a general formula for computing sectional curvature in terms of the cometric and its derivatives in lieu of the metric and its derivatives, which will be done in the next section.

Let $M$ be an $n$-dimensional Riemannian manifold. If we consider a local chart $(U, \varphi)$ on the manifold with coordinates $(x^1,\dots,x^n)$, we have the induced 1-forms $dx^1,\dots,dx^n$ and coordinate vector fields $\{\partial_1 := \frac{\partial}{\partial x^1},\dots,\partial_n := \frac{\partial}{\partial x^n}\}$. The metric tensor $g : TM \times_M TM \to \mathbb{R}$ can be represented as $g|_U = g(\partial_i, \partial_j)\, dx^i \otimes dx^j =: g_{ij}\, dx^i \otimes dx^j$. We get a positive definite matrix with elements $g_{ij}(p) = g_p(\partial_i, \partial_j)$. With an abuse of notation we will write $g_{ij}(x)$ instead of $(g_{ij} \circ \varphi^{-1})(x)$, $x \in \varphi(U)$.

Figure 2: A typical geodesic trajectory in landmark space. Bullets (•) and circles (◦) are the initial and final sets of landmarks, respectively. The grid represents the corresponding diffeomorphism $\varphi^v$.

Notation.
We shall denote the partial derivatives of the elements of the metric tensor $g$ as $g_{ij,k}(x) := \frac{\partial}{\partial x^k} g_{ij}(x) = \partial_k g_{ij}$ and $g_{ij,k\ell}(x) := \frac{\partial^2}{\partial x^\ell \partial x^k} g_{ij}(x) = \partial_\ell \partial_k g_{ij}$. Also, we will indicate the cometric as $g^{-1}|_U = g^{ij}\, \partial_i \otimes \partial_j$ (so that $g^{ij} g_{jk} = \delta^i_k$) and its partial derivatives with $g^{ij}{}_{,k}(x) := \frac{\partial}{\partial x^k} g^{ij}(x)$ and $g^{ij}{}_{,k\ell}(x) := \frac{\partial^2}{\partial x^\ell \partial x^k} g^{ij}(x)$.

For a tangent vector $X = X^i \partial_i$ we consider the 1-form $X^\flat := X^i g_{ij}\, dx^j =: X_j\, dx^j$ (indices lowered), and for a 1-form $\alpha = \alpha_i\, dx^i$ we have the tangent vector $\alpha^\sharp := \alpha_i g^{ij} \partial_j$ (indices lifted).

Indicating with $\mathfrak{X}(M)$ the space of smooth vector fields on the manifold $M$, let $\nabla : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$ be the Levi-Civita connection [12, 17] of the Riemannian manifold. The Christoffel symbols $\Gamma^k_{ij}$ are defined by $\nabla_{\partial_i} \partial_j = \Gamma^k_{ij} \partial_k$, and it is well known that they have the form $\Gamma^k_{ij} = \frac{1}{2} g^{k\ell} (g_{i\ell,j} + g_{j\ell,i} - g_{ij,\ell})$. The Riemannian curvature endomorphism is the map $R : \mathfrak{X}(M) \times \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$ given by

$$R(X,Y)Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X,Y]} Z.$$

In local coordinates $R(\partial_i, \partial_j)\partial_k = R^\ell{}_{ijk}\, \partial_\ell$, and $R_{ijkm} := \langle R(\partial_i, \partial_j)\partial_k, \partial_m \rangle = g_{m\ell} R^\ell{}_{ijk}$. The Riemannian curvature tensor acts on vector fields as follows:

$$R(X,Y,Z,W) = \langle R(X,Y)Z, W \rangle \tag{15}$$

and in coordinates it is written as $R = R_{ijkm}\, dx^i \otimes dx^j \otimes dx^k \otimes dx^m$. The Riemannian curvature tensor has a number of symmetries: (i) $R_{ijk\ell} = -R_{jik\ell}$; (ii) $R_{ijk\ell} = -R_{ij\ell k}$; (iii) $R_{ijk\ell} = R_{k\ell ij}$; and (iv) $R_{ijk\ell} + R_{jki\ell} + R_{kij\ell} = 0$ (first Bianchi identity). With such conventions, the sectional curvature associated to a pair of non-parallel tangent vectors $X$ and $Y$ is computed by:

$$K(X,Y) = \frac{R(X,Y,Y,X)}{\|X\|^2 \|Y\|^2 - \langle X, Y \rangle^2} = \frac{R_{ijkm} X^i Y^j Y^k X^m}{\|X\|^2 \|Y\|^2 - \langle X, Y \rangle^2}. \tag{16}$$

In order to express the numerator of sectional curvature (16) in terms of the elements of the cometric and its derivatives (i.e. $g^{ij}$, $g^{ij}{}_{,k}$, and $g^{ij}{}_{,k\ell}$) we consider the covariant expression of the Riemannian curvature tensor:

$$R^{ursv} := R_{ijkm}\, g^{iu} g^{jr} g^{ks} g^{mv}, \tag{17}$$

which we call the dual Riemannian curvature tensor. Similarly we consider the covariant or dual Christoffel symbols

$$\Gamma^{rs}_{u} := g^{ir} g^{js} g_{ku} \Gamma^k_{ij}, \tag{18}$$

which are symmetric in the indices $r$ and $s$.

To achieve notational compactness we will use the following symbols:

$$g^{ij,k} := g^{ij}{}_{,\xi}\, g^{\xi k} \qquad \text{and} \qquad g^{ij,k\ell} := g^{ij}{}_{,\xi\eta}\, g^{\xi k} g^{\eta \ell}. \tag{19}$$

Using that $g = Q^{-1}$ implies $\partial_k g = -Q^{-1} \cdot \partial_k Q \cdot Q^{-1}$ one immediately sees that

$$\Gamma^{rs}_{u} = -\frac{1}{2}\, g_{u\varphi} \big( g^{s\varphi,r} + g^{r\varphi,s} - g^{rs,\varphi} \big).$$

Proposition 3.
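The following expression holds for the Riemannian curvature tensor:

$$2 R_{ijkm} = g_{ik,jm} + g_{jm,ik} - g_{jk,im} - g_{im,jk} + 2\,\Gamma^\alpha_{ik} \Gamma^\beta_{jm}\, g_{\alpha\beta} - 2\,\Gamma^\alpha_{jk} \Gamma^\beta_{im}\, g_{\alpha\beta}. \tag{20}$$

For a proof see [20].

Proposition 4.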
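The following expression holds for the dual Riemannian curvature tensor:

$$\begin{aligned} 2 R^{ursv} = {} & -g^{us,rv} - g^{rv,us} + g^{rs,uv} + g^{uv,rs} + 2\,\Gamma^{rv}_{\rho} \Gamma^{us}_{\sigma}\, g^{\rho\sigma} - 2\,\Gamma^{rs}_{\rho} \Gamma^{uv}_{\sigma}\, g^{\rho\sigma} \\ & + g^{r\lambda,u} g_{\lambda\mu} g^{\mu v,s} - g^{r\lambda,u} g_{\lambda\mu} g^{\mu s,v} + g^{u\lambda,r} g_{\lambda\mu} g^{\mu s,v} - g^{u\lambda,r} g_{\lambda\mu} g^{\mu v,s} \\ & + g^{r\lambda,s} g_{\lambda\mu} g^{\mu v,u} + g^{u\lambda,v} g_{\lambda\mu} g^{\mu s,r} - g^{r\lambda,v} g_{\lambda\mu} g^{\mu s,u} - g^{u\lambda,s} g_{\lambda\mu} g^{\mu v,r}. \end{aligned} \tag{21}$$

Proof.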
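We will manipulate (20) and write it in the form $R_{ijkm} = g_{iu} g_{jr} g_{ks} g_{mv} R^{ursv}$ by factoring $g_{iu} g_{jr} g_{ks} g_{mv}$ out of each term; what will be left will be precisely the expression for $R^{ursv}$.

The terms in (20) involving Christoffel symbols are, by (18):

$$\Gamma^\alpha_{ik} \Gamma^\beta_{jm}\, g_{\alpha\beta} = g_{iu} g_{ks} g^{\alpha\sigma} \Gamma^{us}_{\sigma}\; g_{jr} g_{mv} g^{\rho\beta} \Gamma^{rv}_{\rho}\; g_{\alpha\beta} = g_{iu} g_{jr} g_{ks} g_{mv} \big( \Gamma^{rv}_{\rho} \Gamma^{us}_{\sigma}\, g^{\rho\sigma} \big), \tag{22}$$

and similarly:

$$\Gamma^\alpha_{jk} \Gamma^\beta_{im}\, g_{\alpha\beta} = g_{iu} g_{jr} g_{ks} g_{mv} \big( \Gamma^{rs}_{\rho} \Gamma^{uv}_{\sigma}\, g^{\rho\sigma} \big). \tag{23}$$

As we did in the proof of Lemma 18, if $g = Q^{-1}$ then $\partial_j g = -Q^{-1} \cdot \partial_j Q \cdot Q^{-1}$ and

$$\partial_m \partial_j g = Q^{-1} \cdot \big( \partial_m Q \cdot Q^{-1} \cdot \partial_j Q + \partial_j Q \cdot Q^{-1} \cdot \partial_m Q - \partial_m \partial_j Q \big) \cdot Q^{-1},$$

i.e., in index notation,

$$\begin{aligned} g_{ik,jm} &= g_{iu} \big( g^{u\lambda}{}_{,m}\, g_{\lambda\mu}\, g^{\mu s}{}_{,j} + g^{u\lambda}{}_{,j}\, g_{\lambda\mu}\, g^{\mu s}{}_{,m} - g^{us}{}_{,jm} \big) g_{sk} \\ &= g_{iu} g_{ks}\, \delta^\xi_j \delta^\eta_m \big( g^{u\lambda}{}_{,\eta}\, g_{\lambda\mu}\, g^{\mu s}{}_{,\xi} + g^{u\lambda}{}_{,\xi}\, g_{\lambda\mu}\, g^{\mu s}{}_{,\eta} - g^{us}{}_{,\xi\eta} \big) \\ &= g_{iu} g_{ks} g_{jr} g_{mv} \Big[ g^{r\xi} g^{v\eta} \big( g^{u\lambda}{}_{,\eta}\, g_{\lambda\mu}\, g^{\mu s}{}_{,\xi} + g^{u\lambda}{}_{,\xi}\, g_{\lambda\mu}\, g^{\mu s}{}_{,\eta} - g^{us}{}_{,\xi\eta} \big) \Big] \\ &= g_{iu} g_{jr} g_{ks} g_{mv} \big( g^{u\lambda,v} g_{\lambda\mu} g^{\mu s,r} + g^{u\lambda,r} g_{\lambda\mu} g^{\mu s,v} - g^{us,rv} \big), \end{aligned} \tag{24}$$

where we have used definitions (19). Similarly, we can achieve the factorizations:

$$g_{jm,ik} = g_{iu} g_{jr} g_{ks} g_{mv} \big( g^{r\lambda,u} g_{\lambda\mu} g^{\mu v,s} + g^{r\lambda,s} g_{\lambda\mu} g^{\mu v,u} - g^{rv,us} \big), \tag{25}$$

$$-g_{jk,im} = g_{iu} g_{jr} g_{ks} g_{mv} \big( -g^{r\lambda,u} g_{\lambda\mu} g^{\mu s,v} - g^{r\lambda,v} g_{\lambda\mu} g^{\mu s,u} + g^{rs,uv} \big), \tag{26}$$

$$-g_{im,jk} = g_{iu} g_{jr} g_{ks} g_{mv} \big( -g^{u\lambda,r} g_{\lambda\mu} g^{\mu v,s} - g^{u\lambda,s} g_{\lambda\mu} g^{\mu v,r} + g^{uv,rs} \big). \tag{27}$$

Inserting (22)–(27) into (20) we can write $R_{ijkm} = g_{iu} g_{jr} g_{ks} g_{mv} R^{ursv}$, with $R^{ursv}$ given by (21).

Proposition 5.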
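The dual Riemannian curvature tensor may also be written as follows:

$$\begin{aligned} 2 R^{ursv} = {} & -g^{us,rv} - g^{rv,us} + g^{rs,uv} + g^{uv,rs} && (\mathrm{T}_1) \\ & - \tfrac{1}{2} \big\{ g^{rs}{}_{,\rho}\, g^{\rho\sigma} g^{uv}{}_{,\sigma} - g^{rs}{}_{,\rho} \big( g^{\rho u,v} + g^{\rho v,u} \big) - g^{uv}{}_{,\sigma} \big( g^{\sigma r,s} + g^{\sigma s,r} \big) \big\} && (\mathrm{T}_2) \\ & + \tfrac{1}{2} \big\{ g^{rv}{}_{,\rho}\, g^{\rho\sigma} g^{us}{}_{,\sigma} - g^{rv}{}_{,\rho} \big( g^{\rho u,s} + g^{\rho s,u} \big) - g^{us}{}_{,\sigma} \big( g^{\sigma r,v} + g^{\sigma v,r} \big) \big\} && (\mathrm{T}_3) \\ & - \tfrac{1}{2} \big( g^{\lambda r,s} - g^{\lambda s,r} \big) g_{\lambda\mu} \big( g^{\mu u,v} - g^{\mu v,u} \big) && (\mathrm{T}_4) \\ & + \tfrac{1}{2} \big( g^{\lambda r,v} - g^{\lambda v,r} \big) g_{\lambda\mu} \big( g^{\mu u,s} - g^{\mu s,u} \big) && (\mathrm{T}_5) \\ & + \big( g^{\lambda r,u} - g^{\lambda u,r} \big) g_{\lambda\mu} \big( g^{\mu v,s} - g^{\mu s,v} \big). && (\mathrm{T}_6) \end{aligned}$$

Proof.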
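We will expand and recombine the terms in expression (21). The terms involving second derivatives need no manipulation and correspond to term $\mathrm{T}_1$. The terms in the second line of (21) can be written as:

$$g^{r\lambda,u} g_{\lambda\mu} g^{\mu v,s} - g^{r\lambda,u} g_{\lambda\mu} g^{\mu s,v} + g^{u\lambda,r} g_{\lambda\mu} g^{\mu s,v} - g^{u\lambda,r} g_{\lambda\mu} g^{\mu v,s} = \big( g^{\lambda r,u} - g^{\lambda u,r} \big) g_{\lambda\mu} \big( g^{\mu v,s} - g^{\mu s,v} \big),$$

which is precisely $\mathrm{T}_6$. It is also the case that:

$$\begin{aligned} & 2\,\Gamma^{rv}_{\rho} \Gamma^{us}_{\sigma}\, g^{\rho\sigma} - g^{r\lambda,v} g_{\lambda\mu} g^{\mu s,u} - g^{u\lambda,s} g_{\lambda\mu} g^{\mu v,r} \\ &\quad = \tfrac{1}{2} \big[ ( g^{\lambda r,v} + g^{\lambda v,r} ) - g^{rv,\lambda} \big]\, g_{\lambda\rho}\, g^{\rho\sigma} g_{\sigma\mu}\, \big[ ( g^{\mu u,s} + g^{\mu s,u} ) - g^{us,\mu} \big] - g^{r\lambda,v} g_{\lambda\mu} g^{\mu s,u} - g^{u\lambda,s} g_{\lambda\mu} g^{\mu v,r} \\ &\quad = \tfrac{1}{2} \big\{ g^{rv}{}_{,\rho}\, g^{\rho\sigma} g^{us}{}_{,\sigma} - g^{rv}{}_{,\rho} ( g^{\rho u,s} + g^{\rho s,u} ) - g^{us}{}_{,\sigma} ( g^{\sigma r,v} + g^{\sigma v,r} ) \big\} \\ &\qquad + \tfrac{1}{2} ( g^{\lambda r,v} + g^{\lambda v,r} )\, g_{\lambda\mu}\, ( g^{\mu u,s} + g^{\mu s,u} ) - g^{r\lambda,v} g_{\lambda\mu} g^{\mu s,u} - g^{u\lambda,s} g_{\lambda\mu} g^{\mu v,r} \\ &\quad = \mathrm{T}_3 + \tfrac{1}{2} ( g^{\lambda r,v} - g^{\lambda v,r} )\, g_{\lambda\mu}\, ( g^{\mu u,s} - g^{\mu s,u} ) = \mathrm{T}_3 + \mathrm{T}_5. \end{aligned}$$

Similarly one can prove that: $-2\,\Gamma^{rs}_{\rho} \Gamma^{uv}_{\sigma}\, g^{\rho\sigma} + g^{r\lambda,s} g_{\lambda\mu} g^{\mu v,u} + g^{u\lambda,v} g_{\lambda\mu} g^{\mu s,r} = \mathrm{T}_2 + \mathrm{T}_4$.

For an arbitrary pair of tangent vectors $X = X^i \partial_i$ and $Y = Y^i \partial_i$ in $T_p M$ we consider the covectors $X^\flat = X_i\, dx^i$ and $Y^\flat = Y_i\, dx^i$ in $T^*_p M$, with $X_i = g_{ij} X^j$ and $Y_i = g_{ij} Y^j$. The numerator of sectional curvature (16) may be rewritten as $R_{ijkm} X^i Y^j Y^k X^m = R^{ursv} X_u Y_r Y_s X_v$.

Theorem (Mario's formula).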
For an arbitrary pair of vectors $X = X^i \partial_i$ and $Y = Y^i \partial_i$ in $T_p M$ the numerator of sectional curvature (16) at point $p \in M$ may be written as:

$$\begin{aligned} g\big( R(X,Y)Y, X \big) &= R^{ursv} X_u Y_r Y_s X_v \\ &= \tfrac{1}{2} \big( X_u Y_r - Y_u X_r \big) \Big( g^{su,rv} + g^{us}{}_{,\rho}\, g^{\rho r,v} - \tfrac{1}{4}\, g^{us}{}_{,\sigma}\, g^{rv,\sigma} - \tfrac{3}{2}\, g^{\lambda u,r} g_{\lambda\mu} g^{\mu s,v} \Big) \big( X_s Y_v - Y_s X_v \big). \end{aligned}$$

Moreover, if we extend $X^\flat$ and $Y^\flat$ locally on $M$ to constant 1-forms in terms of local coordinates (i.e. make their coefficients $X_u$, $Y_r$ constant functions), then the formula becomes:

$$\begin{aligned} g\big( R(X,Y)Y, X \big) = {} & \tfrac{1}{2} \Big\{ XX\big( \|Y^\flat\|^2 \big) + YY\big( \|X^\flat\|^2 \big) - (XY + YX)\, g^{-1}(X^\flat, Y^\flat) \Big\} \\ & + \tfrac{1}{4} \Big\{ \big\| d\big( g^{-1}(X^\flat, Y^\flat) \big) \big\|^2 - g^{-1}\big( d(\|X^\flat\|^2),\, d(\|Y^\flat\|^2) \big) \Big\} - \tfrac{3}{4}\, g\big( [X,Y], [X,Y] \big), \end{aligned}$$

where the term in the first set of braces equals the sum of the first two terms in the coordinate form, the term in the second set of braces equals the third term in the coordinate form and finally the last terms are equal.

Proof. We will write the six terms provided by Proposition 5 as $\mathrm{T}_i^{ursv}$, $i = 1,\dots,6$. We have:

$$\begin{aligned} \mathrm{T}_1^{ursv} X_u Y_r Y_s X_v &= -g^{us,rv} X_u Y_r Y_s X_v - g^{rv,us} X_u Y_r Y_s X_v + g^{rs,uv} X_u Y_r Y_s X_v + g^{uv,rs} X_u Y_r Y_s X_v \\ &= g^{us,rv} \big( -X_u Y_r Y_s X_v - X_r Y_u Y_v X_s + X_r Y_u Y_s X_v + X_u Y_r Y_v X_s \big) \\ &= g^{us,rv} ( X_u Y_r - Y_u X_r )( X_s Y_v - Y_s X_v ), \end{aligned}$$

where the second step follows from relabeling the indices. As far as $\mathrm{T}_2$ and $\mathrm{T}_3$ are concerned,

$$\begin{aligned} & ( \mathrm{T}_2^{ursv} + \mathrm{T}_3^{ursv} ) X_u Y_r Y_s X_v \\ &\quad = -\tfrac{1}{2} \big\{ Y_r Y_s\, g^{rs}{}_{,\rho}\, g^{\rho\sigma} g^{uv}{}_{,\sigma}\, X_u X_v - Y_r X_v\, g^{rv}{}_{,\rho}\, g^{\rho\sigma} g^{us}{}_{,\sigma}\, X_u Y_s \big\} \\ &\qquad + \tfrac{1}{2} \big\{ Y_r Y_s\, g^{rs}{}_{,\rho} ( g^{\rho u,v} + g^{\rho v,u} ) X_u X_v + X_u X_v\, g^{uv}{}_{,\rho} ( g^{\rho r,s} + g^{\rho s,r} ) Y_r Y_s \\ &\qquad\qquad - Y_r X_v\, g^{rv}{}_{,\rho} ( g^{\rho u,s} + g^{\rho s,u} ) X_u Y_s - X_u Y_s\, g^{us}{}_{,\rho} ( g^{\rho r,v} + g^{\rho v,r} ) Y_r X_v \big\} \\ &\quad = -\tfrac{1}{2} \big\{ Y_r Y_s\, g^{rs}{}_{,\rho}\, g^{\rho\sigma} g^{uv}{}_{,\sigma}\, X_u X_v - Y_r X_v\, g^{rv}{}_{,\rho}\, g^{\rho\sigma} g^{us}{}_{,\sigma}\, X_u Y_s \big\} \\ &\qquad + \big\{ Y_r Y_s\, g^{rs}{}_{,\rho}\, g^{\rho u,v} X_u X_v + X_u X_v\, g^{uv}{}_{,\sigma}\, g^{\sigma r,s} Y_r Y_s - X_u Y_s\, g^{us}{}_{,\rho} ( g^{\rho r,v} + g^{\rho v,r} ) Y_r X_v \big\} \\ &\quad \overset{(*)}{=} -\tfrac{1}{4}\, g^{rv}{}_{,\rho}\, g^{\rho\sigma} g^{us}{}_{,\sigma} \big\{ Y_u Y_s X_r X_v + Y_r Y_v X_u X_s - Y_r X_v X_u Y_s - Y_v X_r X_s Y_u \big\} \\ &\qquad + g^{us}{}_{,\rho}\, g^{\rho r,v} \big\{ Y_u Y_s X_r X_v + X_u X_s Y_r Y_v - X_u Y_s Y_r X_v - X_s Y_u Y_v X_r \big\} \\ &\quad = \big( -\tfrac{1}{4}\, g^{us}{}_{,\sigma}\, g^{rv,\sigma} + g^{us}{}_{,\rho}\, g^{\rho r,v} \big) ( X_u Y_r - Y_u X_r )( X_s Y_v - Y_s X_v ), \end{aligned}$$

where, once again, step $(*)$ follows from relabeling the indices. Also, one can easily see that

$$\mathrm{T}_4^{ursv} X_u Y_r Y_s X_v = -\tfrac{1}{2}\, Y_r Y_s \big( g^{\lambda r,s} - g^{\lambda s,r} \big) g_{\lambda\mu} \big( g^{\mu u,v} - g^{\mu v,u} \big) X_u X_v = 0,$$

since a factor antisymmetric in $(r,s)$ is contracted with the symmetric product $Y_r Y_s$.
Finally,(T ursv + T ursv ) X u Y r Y s Y v = Y r X v ( g λr,v − g λv,r ) g λµ ( g µu,s − g µs,u ) X u Y s + Y r X u ( g λr,u − g λu,r ) g λµ ( g µv,s − g µs,v ) X v Y s = Y r X u ( g λr,u − g λu,r ) g λµ ( g µv,s − g µs,v ) X v Y s = Y r X u (cid:8) g λr,u g λµ g µv,s − g λr,u g λµ g µs,v − g λu,r g λµ g µv,s + g λu,r g λµ g µs,v (cid:9) X v Y s = − g λu,r g λµ g µs,v (cid:8) − Y u X r X s Y v + Y u X r X v Y s + Y r X u X s Y v − Y r X u X v Y s (cid:9) = − g λu,r g λµ g µs,v ( X u Y r − Y u X r )( X s Y v − Y s X v ) . Divide by 2 to get the coordinate formula. The non-local version of the formula follows easily by11ringing the X and Y ’s into the formula; thus: Y u X r ( g su,rv + g susu,ρ g ρr,v ) Y s X v = X r X v (cid:0) ( Y u Y s g su ) ,rv + ( Y u Y s g su ) ,ρ g ρr,v (cid:1) = X r X v (cid:0) ( k Y [ k ) ,ρσ g ρr g σv + ( k Y [ k ) ,ρ g ρrρr,σ g σv (cid:1) = X v g σv (cid:0) X r g ρr k Y [ k ) ,ρ (cid:1) ,σ = X σ (cid:0) X ρ k Y [ k ,ρ (cid:1) ,σ = XX (cid:0) k Y [ k (cid:1) . A typical term from the third part of Mario’s formula is rewritten like this: Y u X r g usus,σ g rv,σ Y s X v = X r X v (cid:0) k Y [ k (cid:1) ,σ g rv,σ = k Y [ k (cid:0) k X [ k (cid:1) ,ρ g ρσ = g − (cid:0) d ( k Y [ k ) , d ( k X [ k ) (cid:1) ;the other terms are similar. Finally, it is the case that:( X u Y r − Y u X r ) g λu,r ∂ λ = ( X u Y r − Y u X r ) g λuλu,η g ηr ∂ λ = ( X u Y η − Y u X η ) g λuλu,η ∂ λ = (cid:0) ( X u g λu ) ,η Y η − ( Y u g λu ) ,η X η (cid:1) ∂ λ = (cid:0) X λ,η Y η − Y λ,η X η ) ∂ λ = − [ X, Y ] , and the proof is easily completed. Remark.
It is convenient to split Mario's formula in four terms:
\begin{align}
R_1 &:= \tfrac12 \big(X_u Y_r - Y_u X_r\big)\, g^{su,rv}\, \big(X_s Y_v - Y_s X_v\big), \tag{29} \\
R_2 &:= \tfrac12 \big(X_u Y_r - Y_u X_r\big)\, g^{us}{}_{,\rho}\, g^{\rho r,v}\, \big(X_s Y_v - Y_s X_v\big), \tag{30} \\
R_3 &:= \tfrac18 \big(X_u Y_r - Y_u X_r\big) \big( - g^{us}{}_{,\sigma}\, g^{rv,\sigma} \big) \big(X_s Y_v - Y_s X_v\big), \tag{31} \\
R_4 &:= \tfrac34 \big(X_u Y_r - Y_u X_r\big) \big( - g^{\lambda u,r} g_{\lambda\mu}\, g^{\mu s,v} \big) \big(X_s Y_v - Y_s X_v\big); \tag{32}
\end{align}
all the terms with the exception of $R_4$ (where $g$ appears, but not its derivatives) depend only on elements of the cometric and their derivatives.

Remark.
The denominator of sectional curvature (16) can also be expressed in terms of the cometric:
\[
\|X\|^2 \|Y\|^2 - \langle X, Y \rangle^2 = X_u X_s Y_r Y_v \big( g^{us} g^{rv} - g^{uv} g^{sr} \big). \tag{33}
\]

In this section we will apply Mario's formula to the computation of sectional curvature for the Riemannian manifold of landmarks, introduced in section 2.
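As a quick sanity check of the denominator formula (33), the following minimal Python sketch evaluates both sides for a small cometric matrix. The matrix `g_inv` and the covector components `Xf`, `Yf` are made-up illustrative values, not data from this paper.

```python
def pairing(g_inv, a, b):
    """Cometric pairing g^{-1}(a, b) = a_i g^{ij} b_j of two covectors."""
    n = len(a)
    return sum(a[i] * g_inv[i][j] * b[j] for i in range(n) for j in range(n))

def denominator_from_cometric(g_inv, Xf, Yf):
    """Right-hand side of (33): X_u X_s Y_r Y_v (g^{us} g^{rv} - g^{uv} g^{sr})."""
    n = len(Xf)
    total = 0.0
    for u in range(n):
        for s in range(n):
            for r in range(n):
                for v in range(n):
                    total += (Xf[u] * Xf[s] * Yf[r] * Yf[v]
                              * (g_inv[u][s] * g_inv[r][v] - g_inv[u][v] * g_inv[s][r]))
    return total

# Hypothetical symmetric positive definite cometric and covector components.
g_inv = [[2.0, 0.5, 0.0],
         [0.5, 1.0, 0.3],
         [0.0, 0.3, 1.5]]
Xf = [1.0, -2.0, 0.5]
Yf = [0.3, 0.7, -1.1]

# Left-hand side of (33): |X|^2 |Y|^2 - <X, Y>^2, computed with the cometric.
lhs = pairing(g_inv, Xf, Xf) * pairing(g_inv, Yf, Yf) - pairing(g_inv, Xf, Yf) ** 2
rhs = denominator_from_cometric(g_inv, Xf, Yf)
```

The norms and inner product of $X$ and $Y$ are computed here through the covectors $X^\flat$, $Y^\flat$, which is legitimate because lowering indices is an isometry.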
On the $ND$-dimensional manifold $\mathcal{L} = \mathcal{L}^N(\mathbb{R}^D)$ of landmarks we consider the Riemannian metric $g$ given, in coordinates, by the matrix (11); it is in block-diagonal form and we write its generic element as $g_{(ai)(bj)}$, with $a, b = 1, \dots, N$ (landmark labels) and $i, j = 1, \dots, D$ (coordinate labels, respectively of landmarks $a$ and $b$). More precisely: the matrix $g(q)$ is made of $D^2$ square ($N \times N$) blocks; indices $i, j = 1, \dots, D$ indicate the block, whereas indices $a, b = 1, \dots, N$ locate the element within the $(i,j)$-block. Therefore if we indicate with $h_{ab}(q)$ the generic element of the $N \times N$ matrix $\big( K(q) + \tfrac{I_N}{\lambda} \big)^{-1}$ we have that
\[
g_{(ai)(bj)} = h_{ab}(q)\, \delta_{ij}, \qquad a, b = 1, \dots, N, \quad i, j = 1, \dots, D,
\]
where $\delta_{ij}$ is Kronecker's delta. Similarly, if we indicate as $g^{(ai)(bj)}$ the elements of the cometric tensor $g(q)^{-1}$, they are given by $g^{(ai)(bj)}(q) = h^{ab}(q)\, \delta^{ij}$, where $h^{ab}(q) = K(q_a - q_b) + \tfrac{\delta^{ab}}{\lambda}$. In analogy with the notation introduced in section 3 we also denote the partial derivatives by
\[
g^{(ai)(bj)}{}_{,(ck)} = \frac{\partial}{\partial q_{ck}}\, g^{(ai)(bj)} \qquad \text{and} \qquad g^{(ai)(bj)}{}_{,(ck)(d\ell)} = \frac{\partial^2}{\partial q_{ck}\, \partial q_{d\ell}}\, g^{(ai)(bj)};
\]
they will be computed later.

For simplicity from now on we shall assume that $\lambda = \infty$ (i.e. that we are dealing with exact matching of landmarks); so the element of the cometric becomes $g^{(ai)(bj)}(q) = K(q_a - q_b)\, \delta^{ij}$. The Hamiltonian function [12] for the system can be written as:
\[
H(p, q) = \tfrac12\, p^T g(q)^{-1} p = \tfrac12 \sum_{a,b=1}^{N} \sum_{i,j=1}^{D} g^{(ai)(bj)}(q)\, p_{ai}\, p_{bj} = \tfrac12 \sum_{a,b=1}^{N} \sum_{i,j=1}^{D} K(q_a - q_b)\, \delta^{ij}\, p_{ai}\, p_{bj},
\]
that is $H(p, q) = \tfrac12 \sum_{a,b=1}^{N} K(q_a - q_b)\, \langle p_a, p_b \rangle_{\mathbb{R}^D}$.

Proposition 6.
Hamilton's equations for the Riemannian manifold of landmarks are:
\[
\dot q_a = \sum_{b=1}^{N} K(q_a - q_b)\, p_b, \qquad
\dot p_a = - \sum_{b=1}^{N} \nabla K(q_a - q_b)\, \langle p_a, p_b \rangle_{\mathbb{R}^D}, \qquad a = 1, \dots, N. \tag{34}
\]

Proof.
Equation (7) can be written as $\dot q_{ai} = \sum_{b=1}^{N} K(q_a - q_b)\, p_{bi}$, for $a = 1, \dots, N$, $i = 1, \dots, D$; alternatively, computing $\dot q_{ai} = \frac{\partial H}{\partial p_{ai}}$ yields the same result. Also:
\begin{align}
\frac{\partial}{\partial q_{ai}} K(q_{b1} - q_{c1}, \dots, q_{bD} - q_{cD})
&= \sum_{\ell=1}^{D} \frac{\partial K}{\partial x_\ell}(q_b - q_c)\, \frac{\partial}{\partial q_{ai}} (q_{b\ell} - q_{c\ell})
= \sum_{\ell=1}^{D} \frac{\partial K}{\partial x_\ell}(q_b - q_c)\, (\delta_{ba} - \delta_{ca})\, \delta_{\ell i} \notag \\
&= \frac{\partial K}{\partial x_i}(q_b - q_c)\, (\delta_{ba} - \delta_{ca}) \tag{35}
\end{align}
so that
\[
\dot p_{ai} = - \frac{\partial H}{\partial q_{ai}}(p, q)
= - \tfrac12 \sum_{c=1}^{N} \frac{\partial K}{\partial x_i}(q_a - q_c)\, \langle p_a, p_c \rangle_{\mathbb{R}^D} + \tfrac12 \sum_{b=1}^{N} \frac{\partial K}{\partial x_i}(q_b - q_a)\, \langle p_b, p_a \rangle_{\mathbb{R}^D}
= - \sum_{b=1}^{N} \frac{\partial K}{\partial x_i}(q_a - q_b)\, \langle p_a, p_b \rangle_{\mathbb{R}^D},
\]
where the last step follows from the skew-symmetry of $\nabla K_{ab}$ in indices $a, b$.

Corollary 7. If $p_a(t_0) = 0$ for some landmark $a = 1, \dots, N$ and time $t_0 \in \mathbb{R}$, then $p_a(t) \equiv 0$.

Let $K(x)$, $x \in \mathbb{R}^D$, be the scalar kernel that defines the metric; we assume that it is twice continuously differentiable and symmetric, $K(x) = K(-x)$; for now we shall not assume rotational invariance. We define:
\begin{align}
K_{ab} &:= K(q_a - q_b) \in \mathbb{R}, \notag \\
\partial_i K(x) &:= \frac{\partial K}{\partial x_i}(x), & \partial_i K_{ab} &:= \partial_i K(q_a - q_b) \in \mathbb{R}, \notag \\
\nabla K &:= (\partial_1 K, \dots, \partial_D K)^T, & \nabla K_{ab} &:= \nabla K(q_a - q_b) \in \mathbb{R}^D, \notag \\
\partial_{ij} K(x) &:= \frac{\partial^2 K}{\partial x_i\, \partial x_j}(x), & \partial_{ij} K_{ab} &:= \partial_{ij} K(q_a - q_b) \in \mathbb{R}, \notag \\
D^2 K &:= \text{Hessian } (\partial_{ij} K), & D^2 K_{ab} &:= D^2 K(q_a - q_b) \in \mathbb{R}^{D \times D}. \tag{36}
\end{align}
Note that $\nabla K_{ab} = - \nabla K_{ba}$, $\nabla K_{aa} = 0$ and $D^2 K_{ab} = D^2 K_{ba}$, for all $a, b = 1, \dots, N$.

For a fixed set of landmark points $q$ in $\mathcal{L} = \mathcal{L}^N(\mathbb{R}^D)$ consider any pair of cotangent vectors $\alpha, \beta \in T^*_q\mathcal{L}$: we shall write $\alpha = (\alpha_1, \dots, \alpha_N)$ and $\beta = (\beta_1, \dots, \beta_N)$, where each component is $D$-dimensional. We define the vector field $\alpha^{\mathrm{hor}} : \mathbb{R}^D \to \mathbb{R}^D$ and its values at the landmark points by:
\begin{align*}
\alpha^{\mathrm{hor}}(x) &:= \sum_{b=1}^{N} K(x - q_b)\, \alpha_b = \sum_{b=1}^{N} \sum_{j=1}^{D} K(x - q_b)\, \alpha_{bj}\, \partial_j = \sum_{b=1}^{N} \sum_{i,j=1}^{D} K(x - q_b)\, \alpha_{bi}\, \delta^{ij}\, \partial_j, \qquad x \in \mathbb{R}^D, \\
(\alpha^\sharp)_a &:= \alpha^{\mathrm{hor}}(q_a) = \sum_{b=1}^{N} K_{ab}\, \alpha_b,
\end{align*}
which are, by virtue of formula (6), the velocity field $\alpha^{\mathrm{hor}}$ on $\mathbb{R}^D$ induced by the landmark momentum $\alpha = (\alpha_1, \dots, \alpha_N)$ and the corresponding landmark velocity $\alpha^\sharp \in T_q\mathcal{L}$ (which coincides with the right-hand side of the first of Hamilton's equations (34)). Note that $\alpha^\sharp = (\alpha^\sharp_1, \dots, \alpha^\sharp_N)$ is the tangent vector in $T_q\mathcal{L}$ with metrically lifted indices.

The curvature of the Riemannian manifold of landmarks will be expressed in terms of three auxiliary quantities which we now introduce. We will call these force, discrete strain and landmark derivative. We start with the force. For a fixed covector $\alpha = (\alpha_1, \dots, \alpha_N) \in T^*_q\mathcal{L}$, having the dual vector extended to a vector field $\alpha^{\mathrm{hor}}$ on all of $\mathbb{R}^D$ allows us to take its derivatives at the landmark points, a $D \times D$ matrix-valued function on $\mathbb{R}^D$:
\[
(D\alpha^{\mathrm{hor}})^j_i(x) := \partial_i (\alpha^{\mathrm{hor}})^j(x) = \sum_{b=1}^{N} \alpha_{bj}\, \partial_i K(x - q_b), \qquad (D\alpha^{\mathrm{hor}})^j_i(q_a) = \sum_{b=1}^{N} \partial_i K_{ab}\, \alpha_{bj}.
\]
For a trajectory $(q(t), p(t))$ of the cotangent flow one has that $(p_1(t), \dots, p_N(t)) \in T^*_{q(t)}\mathcal{L}$ for all $t$ where the trajectory is defined, so the above notation can be used to rewrite Hamilton's equations in a more compact form. In particular, the following result holds.

Proposition 8.
The second of Hamilton's equations (34) can be written as
\[
\dot p_a = - Dp^{\mathrm{hor}}(q_a) \cdot p_a, \qquad a = 1, \dots, N. \tag{37}
\]

Proof. $\dot p_{ai} = - \sum_{b=1}^{N} \partial_i K_{ab}\, \langle p_b, p_a \rangle_{\mathbb{R}^D} = - \sum_{j=1}^{D} \big( \sum_{b=1}^{N} \partial_i K_{ab}\, p_{bj} \big)\, p_{aj} = - \sum_{j=1}^{D} (Dp^{\mathrm{hor}})^j_i\, p_{aj} = - \big( Dp^{\mathrm{hor}}(q_a) \cdot p_a \big)_i$, for any $a = 1, \dots, N$ and $i = 1, \dots, D$.

For a fixed cotangent vector $\alpha \in T^*_q\mathcal{L}$, this motivates defining, up to sign, the right-hand side of (37) to be the force:
\[
F_a(\alpha, \alpha) := D\alpha^{\mathrm{hor}}(q_a) \cdot \alpha_a, \qquad a = 1, \dots, N.
\]
The full bilinear, symmetrized force may be thought of as a map $F : T^*_q\mathcal{L} \times T^*_q\mathcal{L} \to T^*_q\mathcal{L}$. We call the covectors given by this the mixed force, with the definition:
\begin{align}
F_a(\alpha, \beta) &:= \tfrac12 \big( D\alpha^{\mathrm{hor}}(q_a) \cdot \beta_a + D\beta^{\mathrm{hor}}(q_a) \cdot \alpha_a \big), \notag \\
F_{ai}(\alpha, \beta) &:= \tfrac12 \sum_{j=1}^{D} \sum_{b=1}^{N} \partial_i K_{ab} \big( \alpha_{bj} \beta_{aj} + \beta_{bj} \alpha_{aj} \big) = \tfrac12 \sum_{b=1}^{N} \partial_i K_{ab} \big( \langle \alpha_a, \beta_b \rangle + \langle \beta_a, \alpha_b \rangle \big), \tag{38}
\end{align}
for $a = 1, \dots, N$ and $i = 1, \dots, D$. (The angle brackets are inner products in $\mathbb{R}^D$.) Note that the "complete" cotangent vectors $\alpha = (\alpha_1, \dots, \alpha_N)$ and $\beta = (\beta_1, \dots, \beta_N)$ (not only their $a$-components) are needed to compute each component $F_a(\alpha, \beta)$ of the mixed force. The mixed force has a simple interpretation. If we extend $\alpha$ and $\beta$ to constant 1-forms on $\mathcal{L}$, then the differential of the map $q \mapsto g^{-1}_q(\alpha, \beta) = \sum_{a,b} K(q_a - q_b)\, \langle \alpha_a, \beta_b \rangle$ is given by:
\begin{align}
d\, g^{-1}_q(\alpha, \beta) &= \sum_{a,b=1}^{N} \sum_{i=1}^{D} \partial_i K(q_a - q_b)\, (dq_{ai} - dq_{bi})\, \langle \alpha_a, \beta_b \rangle \notag \\
&= \sum_{a,b=1}^{N} \sum_{i=1}^{D} \partial_i K(q_a - q_b) \big( \langle \alpha_a, \beta_b \rangle + \langle \beta_a, \alpha_b \rangle \big)\, dq_{ai} = 2 F(\alpha, \beta). \tag{39}
\end{align}
For a fixed $\alpha \in T^*_q\mathcal{L}$ the second component of the curvature formula, called the discrete vector strain, is defined by:
\[
S_{ab}(\alpha) := (\alpha^\sharp)_a - (\alpha^\sharp)_b, \qquad \text{or} \qquad S_{ab}(\alpha)_i := \sum_{c=1}^{N} (K_{ac} - K_{bc})\, \alpha_{ci},
\]
for all $a, b = 1, \dots, N$. These are vectors and are skew-symmetric in the points $a, b$: $S_{ab}(\alpha) = - S_{ba}(\alpha)$, $S_{aa}(\alpha) = 0$.
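The mixed force (38), the strain, and the identity $d\,g^{-1}_q(\alpha,\beta) = 2F(\alpha,\beta)$ of (39) can be checked numerically. The sketch below is only an illustration: it assumes a Gaussian kernel $K(x) = \exp(-\|x\|^2 / 2\sigma^2)$ with $\sigma = 1$ (the text does not fix a kernel at this point), and the landmark positions and covectors are arbitrary made-up values.

```python
import math

SIGMA = 1.0  # assumed kernel width, chosen only for illustration

def kernel(x):
    """Assumed Gaussian kernel K(x) = exp(-|x|^2 / (2 sigma^2))."""
    return math.exp(-sum(c * c for c in x) / (2.0 * SIGMA ** 2))

def grad_kernel(x):
    """Gradient of the assumed Gaussian kernel."""
    k = kernel(x)
    return [-c / SIGMA ** 2 * k for c in x]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def cometric(q, alpha, beta):
    """g^{-1}_q(alpha, beta) = sum_{a,b} K(q_a - q_b) <alpha_a, beta_b>."""
    return sum(kernel(sub(qa, qb)) * dot(al, be)
               for qa, al in zip(q, alpha) for qb, be in zip(q, beta))

def sharp(q, alpha):
    """Landmark velocities (alpha^sharp)_a = sum_b K_ab alpha_b."""
    dim = len(q[0])
    return [[sum(kernel(sub(qa, qb)) * ab[i] for qb, ab in zip(q, alpha))
             for i in range(dim)] for qa in q]

def mixed_force(q, alpha, beta):
    """Mixed force F_a(alpha, beta) of (38), including the factor 1/2."""
    forces = []
    for a, qa in enumerate(q):
        fa = [0.0] * len(qa)
        for b, qb in enumerate(q):
            gk = grad_kernel(sub(qa, qb))
            w = 0.5 * (dot(alpha[a], beta[b]) + dot(beta[a], alpha[b]))
            fa = [f + g * w for f, g in zip(fa, gk)]
        forces.append(fa)
    return forces

def strain(q, alpha):
    """Discrete vector strain S_ab(alpha) = (alpha^sharp)_a - (alpha^sharp)_b."""
    v = sharp(q, alpha)
    return [[sub(va, vb) for vb in v] for va in v]

def d_cometric_fd(q, alpha, beta, h=1e-6):
    """Central finite-difference gradient of q -> g^{-1}_q(alpha, beta)."""
    grad = []
    for a in range(len(q)):
        row = []
        for i in range(len(q[0])):
            qp = [list(x) for x in q]
            qm = [list(x) for x in q]
            qp[a][i] += h
            qm[a][i] -= h
            row.append((cometric(qp, alpha, beta) - cometric(qm, alpha, beta)) / (2 * h))
        grad.append(row)
    return grad

# Made-up configuration: three landmarks in the plane and two covectors.
q = [[0.0, 0.0], [1.0, 0.2], [0.4, 1.1]]
alpha = [[1.0, -0.5], [0.3, 0.8], [-0.7, 0.2]]
beta = [[0.2, 0.9], [-1.0, 0.1], [0.5, 0.5]]

F = mixed_force(q, alpha, beta)
S = strain(q, alpha)
dG = d_cometric_fd(q, alpha, beta)  # should agree with 2 F, cf. (39)
```

The finite-difference gradient treats $\alpha$ and $\beta$ as constant 1-forms, exactly as in the derivation of (39).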
The scalar quantities:
\[
C_{ab}(\alpha) := \big\langle (\alpha^\sharp)_a - (\alpha^\sharp)_b,\, \nabla K_{ab} \big\rangle_{\mathbb{R}^D} = \sum_{c=1}^{N} \sum_{i=1}^{D} (K_{ac} - K_{bc})\, \partial_i K_{ab}\, \alpha_{ci}
\]
we define to be the scalar compressions felt by kernel $K$; they are symmetric (since both factors in the inner product are skew-symmetric), i.e. $C_{ab}(\alpha) = C_{ba}(\alpha)$, with the property $C_{aa}(\alpha) = 0$. We call these compressions because if $K$ is a monotone decreasing function of the distance from the origin (the most common case), then $\nabla K_{ab}$ points from $q_a$ to $q_b$.

Finally, if $v$ and $w$ are any two vector fields on landmark space, we may write their Lie derivative as the difference of covariant derivatives:
\[
[v, w]_{\mathcal{L}} = \nabla^{\mathcal{L},\mathrm{flat}}_v(w) - \nabla^{\mathcal{L},\mathrm{flat}}_w(v),
\]
where the flat connection on $\mathcal{L}$ is just the one induced by its embedding in $\mathbb{R}^{ND}$. In other words, $\nabla^{\mathcal{L},\mathrm{flat}}_v(w)$ is the usual derivative of $w$ in the direction $v$ if we use the coordinates $q_{ai}$ on landmark space: that is, $\nabla^{\mathcal{L},\mathrm{flat}}_v(w) := \sum_{ai} v(w_{ai})\, \partial_{ai} = \sum_{ai} \sum_{bj} v_{bj}\, (\partial_{bj} w_{ai})\, \partial_{ai}$. If $\alpha$, $\beta$ are constant 1-forms everywhere on $\mathcal{L}^N$ we can take $v = \alpha^\sharp$ and $w = \beta^\sharp$, now as vector fields on $\mathcal{L}$, and then we find:
\begin{align*}
\nabla^{\mathcal{L},\mathrm{flat}}_{\alpha^\sharp}(\beta^\sharp)
&= \sum_{a,i} \sum_{b,j} (\alpha^\sharp)_{bj}\, \frac{\partial}{\partial q_{bj}} (\beta^\sharp)_{ai}\, \partial_{ai}
= \sum_{a,i} \sum_{b,j} (\alpha^\sharp)_{bj} \Big( \frac{\partial}{\partial q_{bj}} \sum_{c} K(q_a - q_c)\, \beta_{ci} \Big) \partial_{ai} \\
&= \sum_{a,i} \sum_{b,c,j} (\alpha^\sharp)_{bj}\, \partial_j K_{ac}\, (\delta_{ab} - \delta_{cb})\, \beta_{ci}\, \partial_{ai}
= \sum_{a,i} \sum_{b,j} \big( (\alpha^\sharp)_{aj} - (\alpha^\sharp)_{bj} \big)\, \partial_j K_{ab}\, \beta_{bi}\, \partial_{ai} \\
&= \sum_{a,i} \sum_{b} \big\langle (\alpha^\sharp)_a - (\alpha^\sharp)_b,\, \nabla K_{ab} \big\rangle\, \beta_{bi}\, \partial_{ai}
= \sum_{a,i} \Big( \sum_{b} C_{ab}(\alpha)\, \beta_{bi} \Big) \partial_{ai}.
\end{align*}
This is a vector in $T_q\mathcal{L}$ which we define to be the landmark derivative of $\beta^\sharp$ with respect to $\alpha^\sharp$. The coefficients with respect to $\partial_{a1}, \dots, \partial_{aD}$ (for fixed $a$) are the elements of the following vector:
\[
D_a(\alpha, \beta) := \sum_{b=1}^{N} C_{ab}(\alpha)\, \beta_b = \sum_{b,c=1}^{N} (K_{ac} - K_{bc})\, \langle \alpha_c, \nabla K_{ab} \rangle\, \beta_b, \qquad a = 1, \dots, N. \tag{40}
\]
We have that $D(\alpha, \beta) = (D_a(\alpha, \beta))_{a=1}^{N}$ is the $ND$-dimensional vector of the coefficients of $\nabla^{\mathcal{L},\mathrm{flat}}_{\alpha^\sharp}(\beta^\sharp)$ with respect to the basis $\{\partial_{ai}\}$ of $T_q\mathcal{L}$. In particular, the coefficients of the Lie bracket of $\alpha^\sharp$ and $\beta^\sharp$ as vector fields on $\mathcal{L}$ are given by $D(\alpha, \beta) - D(\beta, \alpha)$.

We can write sectional curvature of $\mathcal{L}^N(\mathbb{R}^D)$ in the following way, where we have split it in the terms introduced by (29)-(32). From now on $\langle\, ,\, \rangle$ will indicate the dot product in $\mathbb{R}^D$, while $\langle\, ,\, \rangle_{T\mathcal{L}}$ and $\langle\, ,\, \rangle_{T^*\mathcal{L}}$ will be the inner products in the tangent and cotangent bundles of $\mathcal{L} = \mathcal{L}^N(\mathbb{R}^D)$, respectively.

Theorem 9.
The numerator of sectional curvature of $\mathcal{L}^N(\mathbb{R}^D)$, for an arbitrary pair of cotangent vectors $\alpha$ and $\beta$, is given by $R(\alpha^\sharp, \beta^\sharp, \beta^\sharp, \alpha^\sharp) = \sum_{i=1}^{4} R_i$, with:
\begin{align}
R_1 &= \tfrac12 \sum_{a \neq b} \big( \alpha_a \otimes S_{ab}(\beta) - \beta_a \otimes S_{ab}(\alpha) \big)^T \big( I_D \otimes D^2 K_{ab} \big) \big( \alpha_b \otimes S_{ab}(\beta) - \beta_b \otimes S_{ab}(\alpha) \big), \tag{41} \\
R_2 &= \sum_{a} \Big( \big\langle D_a(\alpha, \alpha), F_a(\beta, \beta) \big\rangle + \big\langle D_a(\beta, \beta), F_a(\alpha, \alpha) \big\rangle - \big\langle D_a(\alpha, \beta) + D_a(\beta, \alpha), F_a(\alpha, \beta) \big\rangle \Big), \tag{42} \\
R_3 &= \big\| F(\alpha, \beta) \big\|^2_{T^*\mathcal{L}} - \big\langle F(\alpha, \alpha), F(\beta, \beta) \big\rangle_{T^*\mathcal{L}}
= \sum_{a,c} K_{ac} \Big( \big\langle F_a(\alpha, \beta), F_c(\alpha, \beta) \big\rangle - \big\langle F_a(\alpha, \alpha), F_c(\beta, \beta) \big\rangle \Big), \tag{43} \\
R_4 &= - \tfrac34 \big\| [\alpha^\sharp, \beta^\sharp]_{\mathcal{L}} \big\|^2_{T\mathcal{L}} = - \tfrac34 \big\| D(\alpha, \beta) - D(\beta, \alpha) \big\|^2_{K^{-1}}. \tag{44}
\end{align}
In the formula for the first term $R_1$ we have used the well known definition of tensor product: $(v_1 \otimes v_2)^T (M_1 \otimes M_2)(w_1 \otimes w_2) := (v_1^T M_1 w_1)(v_2^T M_2 w_2)$, and in the formula for the fourth term $R_4$ the norm for $D \times N$ matrices $\| J \|^2_A := \sum_{i=1}^{D} \sum_{a,b=1}^{N} J_{ia} J_{ib} A_{ab}$.

The theorem is proven by applying Mario's formula to the cometric of the manifolds of landmarks. One needs to compute the elements of the cometric and its derivatives in terms of the kernel and its derivatives (36). In agreement with notation (19) we will define (note that we will keep using Einstein's summation convention wherever possible):
\[
g^{(ai)(bj),(d\ell)} := g^{(ai)(bj)}{}_{,(ck)}\, g^{(ck)(d\ell)} \qquad \text{and} \qquad g^{(ai)(bj),(ck)(d\ell)} := g^{(ai)(bj)}{}_{,(\mu\rho)(\xi\sigma)}\, g^{(\mu\rho)(ck)}\, g^{(\xi\sigma)(d\ell)}.
\]

Lemma 10.
It is the case that
\begin{align}
g^{(ai)(bj)}{}_{,(ck)} &= \partial_k K_{ab}\, (\delta_{ac} - \delta_{bc})\, \delta^{ij}, \tag{45} \\
g^{(ai)(bj)}{}_{,(ck)(d\ell)} &= \partial_{k\ell} K_{ab}\, (\delta_{ac} - \delta_{bc})\, (\delta_{ad} - \delta_{bd})\, \delta^{ij}, \tag{46} \\
g^{(ai)(bj),(d\ell)} &= \partial_\ell K_{ab}\, (K_{ad} - K_{bd})\, \delta^{ij}, \tag{47} \\
g^{(ai)(bj),(ck)(d\ell)} &= \partial_{k\ell} K_{ab}\, (K_{ac} - K_{bc})\, (K_{ad} - K_{bd})\, \delta^{ij}. \tag{48}
\end{align}

Proof.
Since $g^{(ai)(bj)} = K_{ab}\, \delta^{ij}$ and also $\frac{\partial}{\partial q_{ck}} K(q_a - q_b) = \partial_k K_{ab}\, (\delta_{ac} - \delta_{bc})$ by (35), equation (45) follows immediately. Similarly to (35) one can prove that $\frac{\partial}{\partial q_{d\ell}} \partial_k K(q_a - q_b) = \partial_{\ell k} K_{ab}\, (\delta_{ad} - \delta_{bd})$, whence:
\[
g^{(ai)(bj)}{}_{,(ck)(d\ell)} = \frac{\partial}{\partial q_{d\ell}}\, g^{(ai)(bj)}{}_{,(ck)} = \partial_{\ell k} K_{ab}\, (\delta_{ad} - \delta_{bd})\, (\delta_{ac} - \delta_{bc})\, \delta^{ij},
\]
so (46) holds too. Now, by expression (45):
\[
g^{(ai)(bj),(d\ell)} = g^{(ai)(bj)}{}_{,(ck)}\, g^{(ck)(d\ell)} = \sum_{c,k} \partial_k K_{ab}\, (\delta_{ac} - \delta_{bc})\, \delta^{ij}\, K_{cd}\, \delta^{k\ell} = \partial_\ell K_{ab}\, (K_{ad} - K_{bd})\, \delta^{ij},
\]
which is (47). We can use (46) to compute $g^{(ai)(bj),(ck)(d\ell)} = g^{(ai)(bj)}{}_{,(\mu\rho)(\xi\sigma)}\, g^{(\mu\rho)(ck)}\, g^{(\xi\sigma)(d\ell)}$:
\begin{align*}
g^{(ai)(bj),(ck)(d\ell)} &= \sum_{\mu,\rho,\xi,\sigma} \partial_{\rho\sigma} K_{ab}\, (\delta_{a\mu} - \delta_{b\mu})\, (\delta_{a\xi} - \delta_{b\xi})\, \delta^{ij}\, K_{\mu c}\, \delta^{\rho k}\, K_{\xi d}\, \delta^{\sigma\ell} \\
&= \partial_{k\ell} K_{ab}\, (K_{ac} - K_{bc})\, (K_{ad} - K_{bd})\, \delta^{ij},
\end{align*}
which completes the proof.

Proof of Theorem 9.
We will compute terms $R_1, \dots, R_4$ introduced by formulae (29)-(32). For simplicity, sometimes we will write $D\alpha^{\mathrm{hor}}_a$ instead of $D\alpha^{\mathrm{hor}}(q_a)$.

• Computation of $R_1$. We have $2 R_1 = (\alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr})\, g^{(au)(bs),(cr)(dv)}\, (\alpha_{bs} \beta_{dv} - \beta_{bs} \alpha_{dv})$. Inserting expression (48) into such formula yields:
\[
2 R_1 = \sum_{\text{all indices}} \big( \alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr} \big)\, \partial_{rv} K_{ab}\, (K_{ac} - K_{bc})\, (K_{ad} - K_{bd})\, \delta^{us}\, \big( \alpha_{bs} \beta_{dv} - \beta_{bs} \alpha_{dv} \big).
\]
Performing the above multiplications gives rise to four terms, which we will now compute one by one. First of all we have:
\begin{align*}
2 R_{1,1} &:= \sum_{\text{all indices}} \alpha_{au} \beta_{cr} \alpha_{bs} \beta_{dv}\, \partial_{rv} K_{ab}\, (K_{ac} - K_{bc})\, (K_{ad} - K_{bd})\, \delta^{us} \\
&= \sum_{a,b,r,v} \Big[ \sum_{u,s} \alpha_{au}\, \delta^{us}\, \alpha_{bs} \Big]\, \partial_{rv} K_{ab}\, \Big[ \sum_{c} (K_{ac} - K_{bc})\, \beta_{cr} \Big] \Big[ \sum_{d} (K_{ad} - K_{bd})\, \beta_{dv} \Big] \\
&= \sum_{a,b} \alpha_a^T \alpha_b \sum_{r,v} \big[ S_{ab}(\beta) \big]_r\, \partial_{rv} K_{ab}\, \big[ S_{ab}(\beta) \big]_v
= \sum_{a,b} \alpha_a^T \alpha_b\, \big( S_{ab}(\beta) \big)^T D^2 K_{ab}\, S_{ab}(\beta) \\
&= \sum_{a,b} \big( \alpha_a \otimes S_{ab}(\beta) \big)^T \big( I_D \otimes D^2 K_{ab} \big) \big( \alpha_b \otimes S_{ab}(\beta) \big);
\end{align*}
similarly,
\begin{align*}
2 R_{1,2} &:= - \sum_{\text{all}} \alpha_{au} \beta_{cr} \beta_{bs} \alpha_{dv}\, \partial_{rv} K_{ab}\, (K_{ac} - K_{bc})\, (K_{ad} - K_{bd})\, \delta^{us}
= - \sum_{a,b} \big( \alpha_a \otimes S_{ab}(\beta) \big)^T \big( I_D \otimes D^2 K_{ab} \big) \big( \beta_b \otimes S_{ab}(\alpha) \big), \\
2 R_{1,3} &:= - \sum_{\text{all}} \beta_{au} \alpha_{cr} \alpha_{bs} \beta_{dv}\, \partial_{rv} K_{ab}\, (K_{ac} - K_{bc})\, (K_{ad} - K_{bd})\, \delta^{us}
= - \sum_{a,b} \big( \beta_a \otimes S_{ab}(\alpha) \big)^T \big( I_D \otimes D^2 K_{ab} \big) \big( \alpha_b \otimes S_{ab}(\beta) \big), \\
2 R_{1,4} &:= \sum_{\text{all}} \beta_{au} \alpha_{cr} \beta_{bs} \alpha_{dv}\, \partial_{rv} K_{ab}\, (K_{ac} - K_{bc})\, (K_{ad} - K_{bd})\, \delta^{us}
= \sum_{a,b} \big( \beta_a \otimes S_{ab}(\alpha) \big)^T \big( I_D \otimes D^2 K_{ab} \big) \big( \beta_b \otimes S_{ab}(\alpha) \big).
\end{align*}
Now we can take the summation $R_1 = \sum_{i=1}^{4} R_{1,i}$, which yields precisely expression (41).

• Computation of $R_2$.
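We may combine equations (45) and (47) from Lemma 10 to get:
\begin{align}
g^{(au)(bs)}{}_{,(\lambda\rho)}\, g^{(\lambda\rho)(cr),(dv)}
&= \sum_{\lambda,\rho} \partial_\rho K_{ab}\, (\delta_{a\lambda} - \delta_{b\lambda})\, \delta^{us}\, \partial_v K_{\lambda c}\, (K_{\lambda d} - K_{cd})\, \delta^{\rho r} \notag \\
&= \partial_r K_{ab} \big[ \partial_v K_{ac}\, (K_{ad} - K_{cd}) - \partial_v K_{bc}\, (K_{bd} - K_{cd}) \big]\, \delta^{us}. \tag{49}
\end{align}
Inserting (49) into $2 R_2 = (\alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr})\, g^{(au)(bs)}{}_{,(\lambda\rho)}\, g^{(\lambda\rho)(cr),(dv)}\, (\alpha_{bs} \beta_{dv} - \beta_{bs} \alpha_{dv})$ yields:
\begin{align*}
2 R_2 = \sum_{\text{all indices}} \Big\{
& \alpha_{au} \beta_{cr} \alpha_{bs} \beta_{dv}\, \partial_r K_{ab} \big[ \partial_v K_{ac}\, (K_{ad} - K_{cd}) - \partial_v K_{bc}\, (K_{bd} - K_{cd}) \big]\, \delta^{us} \\
- & \alpha_{au} \beta_{cr} \beta_{bs} \alpha_{dv}\, \partial_r K_{ab} \big[ \partial_v K_{ac}\, (K_{ad} - K_{cd}) - \partial_v K_{bc}\, (K_{bd} - K_{cd}) \big]\, \delta^{us} \\
- & \beta_{au} \alpha_{cr} \alpha_{bs} \beta_{dv}\, \partial_r K_{ab} \big[ \partial_v K_{ac}\, (K_{ad} - K_{cd}) - \partial_v K_{bc}\, (K_{bd} - K_{cd}) \big]\, \delta^{us} \\
+ & \beta_{au} \alpha_{cr} \beta_{bs} \alpha_{dv}\, \partial_r K_{ab} \big[ \partial_v K_{ac}\, (K_{ad} - K_{cd}) - \partial_v K_{bc}\, (K_{bd} - K_{cd}) \big]\, \delta^{us} \Big\},
\end{align*}
which immediately implies:
\begin{align*}
R_2 &= \tfrac12 \sum_{a,b,c,d} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \nabla K_{ab} \rangle \big[ \langle \beta_d, \nabla K_{ac} \rangle (K_{ad} - K_{cd}) - \langle \beta_d, \nabla K_{bc} \rangle (K_{bd} - K_{cd}) \big] && (=: R_{2,1}) \\
&\quad - \tfrac12 \sum_{a,b,c,d} \langle \alpha_a, \beta_b \rangle \langle \beta_c, \nabla K_{ab} \rangle \big[ \langle \alpha_d, \nabla K_{ac} \rangle (K_{ad} - K_{cd}) - \langle \alpha_d, \nabla K_{bc} \rangle (K_{bd} - K_{cd}) \big] && (=: R_{2,2}) \\
&\quad - \tfrac12 \sum_{a,b,c,d} \langle \beta_a, \alpha_b \rangle \langle \alpha_c, \nabla K_{ab} \rangle \big[ \langle \beta_d, \nabla K_{ac} \rangle (K_{ad} - K_{cd}) - \langle \beta_d, \nabla K_{bc} \rangle (K_{bd} - K_{cd}) \big] && (=: R_{2,3}) \\
&\quad + \tfrac12 \sum_{a,b,c,d} \langle \beta_a, \beta_b \rangle \langle \alpha_c, \nabla K_{ab} \rangle \big[ \langle \alpha_d, \nabla K_{ac} \rangle (K_{ad} - K_{cd}) - \langle \alpha_d, \nabla K_{bc} \rangle (K_{bd} - K_{cd}) \big] && (=: R_{2,4})
\end{align*}
We will now manipulate terms $R_{2,1}, \dots, R_{2,4}$ one by one.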
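Since $\nabla K_{ab} = - \nabla K_{ba}$, by relabeling the indices we have
\begin{align*}
R_{2,1} &= \sum_{a,b,c,d} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \nabla K_{ab} \rangle \langle \beta_d, \nabla K_{ac} \rangle (K_{ad} - K_{cd}) \\
&= \sum_{a,b,c} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \nabla K_{ab} \rangle \Big\langle \sum_d K_{ad} \beta_d - \sum_d K_{cd} \beta_d,\, \nabla K_{ac} \Big\rangle \\
&= \sum_{a,b,c} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \nabla K_{ab} \rangle \langle S_{ac}(\beta), \nabla K_{ac} \rangle
= \sum_{a,b,c} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \nabla K_{ab} \rangle\, C_{ac}(\beta) \\
&= \sum_{a,b} \langle \alpha_a, \alpha_b \rangle \Big\langle \sum_c C_{ac}(\beta)\, \beta_c,\, \nabla K_{ab} \Big\rangle
= \sum_{a,b} \langle \alpha_a, \alpha_b \rangle \langle D_a(\beta, \beta), \nabla K_{ab} \rangle \\
&= \sum_{a,b} D_a(\beta, \beta)^T\, \nabla K_{ab}\, \alpha_b^T \alpha_a
= \sum_{a} D_a(\beta, \beta)^T\, D\alpha^{\mathrm{hor}}_a \cdot \alpha_a
= \sum_{a} \langle D_a(\beta, \beta), F_a(\alpha, \alpha) \rangle.
\end{align*}
Similarly, $R_{2,4} = \sum_a \langle D_a(\alpha, \alpha), F_a(\beta, \beta) \rangle$. It is also the case that
\begin{align*}
R_{2,2} &= - \tfrac12 \sum_{a,b,c} \langle \alpha_a, \beta_b \rangle \langle \beta_c, \nabla K_{ab} \rangle \Big[ \Big\langle \sum_d (K_{ad} - K_{cd})\, \alpha_d,\, \nabla K_{ac} \Big\rangle - \Big\langle \sum_d (K_{bd} - K_{cd})\, \alpha_d,\, \nabla K_{bc} \Big\rangle \Big] \\
&= - \tfrac12 \sum_{a,b,c} \langle \alpha_a, \beta_b \rangle \langle \beta_c, \nabla K_{ab} \rangle \big[ \langle S_{ac}(\alpha), \nabla K_{ac} \rangle - \langle S_{bc}(\alpha), \nabla K_{bc} \rangle \big] \\
&= - \tfrac12 \sum_{a,b,c} \langle \alpha_a, \beta_b \rangle \langle \beta_c, \nabla K_{ab} \rangle \big[ C_{ac}(\alpha) - C_{bc}(\alpha) \big];
\end{align*}
relabeling the indices (and using the fact that $\nabla K_{ab} = - \nabla K_{ba}$) yields:
\begin{align*}
R_{2,2} &= - \tfrac12 \sum_{a,b,c} \big[ \langle \alpha_a, \beta_b \rangle + \langle \alpha_b, \beta_a \rangle \big] \langle \beta_c, \nabla K_{ab} \rangle\, C_{ac}(\alpha)
= - \tfrac12 \sum_{a,b} \big[ \langle \alpha_a, \beta_b \rangle + \langle \alpha_b, \beta_a \rangle \big] \Big\langle \sum_c C_{ac}(\alpha)\, \beta_c,\, \nabla K_{ab} \Big\rangle \\
&= - \tfrac12 \sum_{a,b} \big[ \langle \alpha_a, \beta_b \rangle + \langle \alpha_b, \beta_a \rangle \big] \langle D_a(\alpha, \beta), \nabla K_{ab} \rangle
= - \tfrac12 \sum_{a,b} D_a(\alpha, \beta)^T \big[ \nabla K_{ab}\, \beta_b^T \alpha_a + \nabla K_{ab}\, \alpha_b^T \beta_a \big] \\
&= - \tfrac12 \sum_{a} D_a(\alpha, \beta)^T \big[ D\beta^{\mathrm{hor}}_a \cdot \alpha_a + D\alpha^{\mathrm{hor}}_a \cdot \beta_a \big]
= - \sum_{a} \langle D_a(\alpha, \beta), F_a(\alpha, \beta) \rangle.
\end{align*}
Similarly, $R_{2,3} = - \sum_a \langle D_a(\beta, \alpha), F_a(\beta, \alpha) \rangle$. By the symmetry of $F_a(\cdot, \cdot)$,
\[
R_{2,2} + R_{2,3} = - \sum_{a} \langle D_a(\alpha, \beta) + D_a(\beta, \alpha),\, F_a(\alpha, \beta) \rangle.
\]
Adding the above sum to the expressions for $R_{2,1}$ and $R_{2,4}$ finally yields (42).

• Computation of $R_3$. We have $R_3 = - \tfrac18 (\alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr})\, g^{(au)(bs)}{}_{,(\eta\sigma)}\, g^{(cr)(dv),(\eta\sigma)}\, (\alpha_{bs} \beta_{dv} - \beta_{bs} \alpha_{dv})$. But by Lemma 10,
\begin{align*}
g^{(au)(bs)}{}_{,(\eta\sigma)}\, g^{(cr)(dv),(\eta\sigma)}
&= \sum_{\eta=1}^{N} \sum_{\sigma=1}^{D} \partial_\sigma K_{ab}\, (\delta_{a\eta} - \delta_{b\eta})\, \delta^{us}\, \partial_\sigma K_{cd}\, (K_{c\eta} - K_{d\eta})\, \delta^{rv} \\
&= \langle \nabla K_{ab}, \nabla K_{cd} \rangle\, \delta^{us} \delta^{rv}\, (K_{ac} - K_{ad} - K_{bc} + K_{bd}),
\end{align*}
whence:
\begin{align*}
- 8 R_3 &= \sum_{\text{all}} \Big\{ \langle \nabla K_{ab}, \nabla K_{cd} \rangle\, (K_{ac} - K_{ad} - K_{bc} + K_{bd}) \\
&\qquad \cdot \big( \alpha_{au} \beta_{cr} \alpha_{bs} \beta_{dv} - \alpha_{au} \beta_{cr} \beta_{bs} \alpha_{dv} - \beta_{au} \alpha_{cr} \alpha_{bs} \beta_{dv} + \beta_{au} \alpha_{cr} \beta_{bs} \alpha_{dv} \big)\, \delta^{us} \delta^{rv} \Big\} \\
&= \sum_{a,b,c,d} \Big\{ \big( \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \beta_d \rangle - \langle \alpha_a, \beta_b \rangle \langle \beta_c, \alpha_d \rangle - \langle \beta_a, \alpha_b \rangle \langle \alpha_c, \beta_d \rangle + \langle \beta_a, \beta_b \rangle \langle \alpha_c, \alpha_d \rangle \big) \\
&\qquad \cdot \langle \nabla K_{ab}, \nabla K_{cd} \rangle\, (K_{ac} - K_{ad} - K_{bc} + K_{bd}) \Big\}.
\end{align*}
Relabeling the indices in the above expression yields:
\begin{align*}
- R_3 &= \sum_{a,b,c,d} \big[ \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \beta_d \rangle - \tfrac14 \langle \alpha_a, \beta_b \rangle \langle \beta_c, \alpha_d \rangle - \tfrac14 \langle \beta_a, \alpha_b \rangle \langle \alpha_c, \beta_d \rangle \\
&\qquad\quad - \tfrac14 \langle \alpha_a, \beta_b \rangle \langle \beta_d, \alpha_c \rangle - \tfrac14 \langle \beta_a, \alpha_b \rangle \langle \alpha_d, \beta_c \rangle \big]\, \langle \nabla K_{ab}, \nabla K_{cd} \rangle\, K_{ac} \\
&= \sum_{a,c} K_{ac} \big[ \langle D\alpha^{\mathrm{hor}}_a \cdot \alpha_a,\, D\beta^{\mathrm{hor}}_c \cdot \beta_c \rangle - \tfrac14 \langle D\alpha^{\mathrm{hor}}_a \cdot \beta_a + D\beta^{\mathrm{hor}}_a \cdot \alpha_a,\, D\alpha^{\mathrm{hor}}_c \cdot \beta_c + D\beta^{\mathrm{hor}}_c \cdot \alpha_c \rangle \big] \\
&= \sum_{a,c} K_{ac} \big[ \langle F_a(\alpha, \alpha), F_c(\beta, \beta) \rangle - \langle F_a(\alpha, \beta), F_c(\alpha, \beta) \rangle \big],
\end{align*}
where in the second step we used $\sum_b \nabla K_{ab}\, \langle \alpha_a, \beta_b \rangle = D\beta^{\mathrm{hor}}_a \cdot \alpha_a$ (and its analogues) to carry out the sums over $b$ and $d$. This is precisely (43). Alternatively, this can be derived from formula (39).

• Computation of $R_4$. It is the case that:
\[
R_4 = - \tfrac34 \big( \alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr} \big)\, g^{(\xi\lambda)(au),(cr)}\, g_{(\xi\lambda)(\eta\mu)}\, g^{(\eta\mu)(bs),(dv)}\, \big( \alpha_{bs} \beta_{dv} - \beta_{bs} \alpha_{dv} \big). \tag{50}
\]
By Lemma 10:
\begin{align*}
\sum_{a,u,c,r} \big( \alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr} \big)\, g^{(\xi\lambda)(au),(cr)}
&= \sum_{a,u,c,r} \big( \alpha_{au} \beta_{cr} - \beta_{au} \alpha_{cr} \big)\, \partial_r K_{\xi a}\, (K_{\xi c} - K_{ac})\, \delta^{\lambda u} \\
&= \sum_{a,u} \Big\{ \alpha_{au} \Big[ \sum_r \partial_r K_{\xi a} \Big( \sum_c K_{\xi c}\, \beta_{cr} \Big) \Big] - \alpha_{au} \Big[ \sum_r \partial_r K_{\xi a} \Big( \sum_c K_{ac}\, \beta_{cr} \Big) \Big] \\
&\qquad\quad - \beta_{au} \Big[ \sum_r \partial_r K_{\xi a} \Big( \sum_c K_{\xi c}\, \alpha_{cr} \Big) \Big] + \beta_{au} \Big[ \sum_r \partial_r K_{\xi a} \Big( \sum_c K_{ac}\, \alpha_{cr} \Big) \Big] \Big\}\, \delta^{\lambda u} \\
&= \sum_{a,u} \Big\{ \alpha_{au} \big[ \langle \nabla K_{\xi a}, \beta^{\mathrm{hor}}(q_\xi) \rangle - \langle \nabla K_{\xi a}, \beta^{\mathrm{hor}}(q_a) \rangle \big] - \beta_{au} \big[ \langle \nabla K_{\xi a}, \alpha^{\mathrm{hor}}(q_\xi) \rangle - \langle \nabla K_{\xi a}, \alpha^{\mathrm{hor}}(q_a) \rangle \big] \Big\}\, \delta^{\lambda u} \\
&= \sum_{a,u} \Big\{ \alpha_{au}\, \langle \nabla K_{\xi a}, S_{\xi a}(\beta) \rangle - \beta_{au}\, \langle \nabla K_{\xi a}, S_{\xi a}(\alpha) \rangle \Big\}\, \delta^{\lambda u}
= \sum_{a,u} \Big\{ C_{\xi a}(\beta)\, \alpha_{au} - C_{\xi a}(\alpha)\, \beta_{au} \Big\}\, \delta^{\lambda u}.
\end{align*}
So if we define the matrix $H_{ia} := \sum_b \big[ C_{ab}(\beta)\, \alpha_{bi} - C_{ab}(\alpha)\, \beta_{bi} \big]$, $i = 1, \dots, D$, $a = 1, \dots, N$, we have:
\[
R_4 = - \tfrac34 \sum_{u,s} \sum_{\xi,\eta} \sum_{\lambda,\mu} H_{u\xi}\, \delta^{\lambda u}\, (K^{-1})_{\xi\eta}\, \delta_{\lambda\mu}\, H_{s\eta}\, \delta^{\mu s}
= - \tfrac34 \sum_{u=1}^{D} \sum_{\xi,\eta=1}^{N} H_{u\xi}\, H_{u\eta}\, (K^{-1})_{\xi\eta} = - \tfrac34 \| H \|^2_{K^{-1}}.
\]
Alternatively, this can be derived from formula (40).

The denominator (33) of sectional curvature for $\mathcal{L}^N(\mathbb{R}^D)$ is given by the simple formula:

Proposition 11.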
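For any pair of cotangent vectors $\alpha, \beta \in T^*_q\mathcal{L}$,
\[
\| \alpha \|^2_{T^*\mathcal{L}}\, \| \beta \|^2_{T^*\mathcal{L}} - \langle \alpha, \beta \rangle^2_{T^*\mathcal{L}} = \sum_{a,b,c,d} K_{ab} K_{cd} \big( \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \beta_d \rangle - \langle \alpha_a, \beta_b \rangle \langle \alpha_c, \beta_d \rangle \big). \tag{51}
\]

Proof.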
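Using double-index notation we may write equation (33) as follows:
\begin{align*}
\| \alpha \|^2_{T^*\mathcal{L}}\, \| \beta \|^2_{T^*\mathcal{L}} - \langle \alpha, \beta \rangle^2_{T^*\mathcal{L}}
&= \alpha_{au} \alpha_{bs} \beta_{cr} \beta_{dv} \big( g^{(au)(bs)} g^{(cr)(dv)} - g^{(au)(dv)} g^{(bs)(cr)} \big) \\
&= \sum_{a,b,c,d} \alpha_{au} \alpha_{bs} \beta_{cr} \beta_{dv} \big( K_{ab}\, \delta^{us}\, K_{cd}\, \delta^{rv} - K_{ad}\, \delta^{uv}\, K_{bc}\, \delta^{sr} \big) \\
&= \sum_{a,b,c,d} \langle \alpha_a, \alpha_b \rangle \langle \beta_c, \beta_d \rangle\, K_{ab} K_{cd} - \sum_{a,b,c,d} \langle \alpha_a, \beta_d \rangle \langle \alpha_b, \beta_c \rangle\, K_{ad} K_{bc},
\end{align*}
and (51) follows by relabeling the indices.

Finally, suppose the Green's function $K$ is rotationally invariant:
\[
K(x) = \gamma(\| x \|), \qquad x \in \mathbb{R}^D, \qquad \text{with } \gamma \in C^2\big( [0, \infty) \big). \tag{52}
\]
Whenever this holds, we will use the convenient notation: $\gamma_0 := \gamma(0)$, $\gamma_{ab} := \gamma(\| q_a - q_b \|)$, $\gamma'_{ab} := \gamma'(\| q_a - q_b \|)$, and $\gamma''_{ab} := \gamma''(\| q_a - q_b \|)$ for $a, b = 1, \dots, N$. Then we can evaluate the first and second derivatives of $K$:

Lemma 12.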
For rotationally invariant kernels, it is the case that:
\begin{align}
\nabla K(x) &= \gamma'(\| x \|)\, \frac{x}{\| x \|}, \tag{53} \\
D^2 K(x) &= \Big[ \gamma''(\| x \|) - \frac{\gamma'(\| x \|)}{\| x \|} \Big] \frac{x x^T}{\| x \|^2} + \frac{\gamma'(\| x \|)}{\| x \|}\, I_D \tag{54} \\
&= \gamma''(\| x \|)\, \frac{x x^T}{\| x \|^2} + \frac{\gamma'(\| x \|)}{\| x \|}\, \mathrm{Pr}^\perp(x), \notag
\end{align}
where $I_D$ is the $D \times D$ identity matrix and $\mathrm{Pr}^\perp(x) := I_D - \frac{x x^T}{\| x \|^2}$ is projection to the hyperplane of $\mathbb{R}^D$ normal to $x$.

Proof. We have that $\partial_i K(x) = \gamma'(\| x \|)\, \frac{x_i}{\| x \|}$ and (53) follows immediately. Also,
\begin{align*}
\partial_j \partial_i K(x) &= \frac{x_i}{\| x \|}\, \frac{\partial}{\partial x_j} \gamma'(\| x \|) + \frac{\gamma'(\| x \|)}{\| x \|}\, \frac{\partial}{\partial x_j} x_i + \gamma'(\| x \|)\, x_i\, \frac{\partial}{\partial x_j} \frac{1}{\| x \|} \\
&= \gamma''(\| x \|)\, \frac{x_i x_j}{\| x \|^2} + \frac{\gamma'(\| x \|)}{\| x \|}\, \delta_{ij} - \gamma'(\| x \|)\, \frac{x_i x_j}{\| x \|^3}
= \Big[ \gamma''(\| x \|) - \frac{\gamma'(\| x \|)}{\| x \|} \Big] \frac{x_i x_j}{\| x \|^2} + \frac{\gamma'(\| x \|)}{\| x \|}\, \delta_{ij},
\end{align*}
which implies (54).

Because of (53), in the rotationally invariant case, the "scalar compression" $C_{ab}(\alpha)$ really does measure a multiple of the compression of the flow $\alpha^\sharp$ between $q_a$ and $q_b$. We can decompose the vector strain $S_{ab}(\alpha)$ into the part parallel to the vector $q_a - q_b$ and the part perpendicular to this: let $u_{ab} := \frac{q_a - q_b}{\| q_a - q_b \|}$ and define
\[
S^{\parallel}_{ab}(\alpha) := \big\langle S_{ab}(\alpha), u_{ab} \big\rangle, \qquad S^{\perp}_{ab}(\alpha) := S_{ab}(\alpha) - S^{\parallel}_{ab}(\alpha)\, u_{ab}. \tag{55}
\]
Note that $S^{\parallel}_{ab}(\alpha)$ is a scalar while $S^{\perp}_{ab}(\alpha)$ is a vector. In particular we have that $C_{ab}(\alpha) = \gamma'_{ab} \cdot S^{\parallel}_{ab}(\alpha)$. Moreover, formula (54) allows us to simplify the first term $R_1$ in the curvature formula. Substituting (54) into (41), we get the rotationally invariant case for $R_1$:

Proposition 13.
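In the rotationally invariant case (52), we have that
\begin{align}
R_1 = \tfrac12 \sum_{a \neq b} \Big( & \gamma''_{ab}\, \big\langle S^{\parallel}_{ab}(\alpha)\, \beta_a - S^{\parallel}_{ab}(\beta)\, \alpha_a,\; S^{\parallel}_{ab}(\alpha)\, \beta_b - S^{\parallel}_{ab}(\beta)\, \alpha_b \big\rangle \tag{56} \\
& + \frac{\gamma'_{ab}}{\| q_a - q_b \|}\, \big\langle S^{\perp}_{ab}(\alpha) \otimes \beta_a - S^{\perp}_{ab}(\beta) \otimes \alpha_a,\; S^{\perp}_{ab}(\alpha) \otimes \beta_b - S^{\perp}_{ab}(\beta) \otimes \alpha_b \big\rangle \Big). \notag
\end{align}

Proof.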
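For any pair of covectors $\eta$ and $\mu$ in $T^*_q\mathcal{L}$, by (54) we have that:
\begin{align*}
S_{ab}(\eta)^T\, D^2 K_{ab}\, S_{ab}(\mu) &= \gamma''_{ab}\, S_{ab}(\eta)^T u_{ab} u_{ab}^T S_{ab}(\mu) + \frac{\gamma'_{ab}}{\| q_a - q_b \|}\, S_{ab}(\eta)^T\, \mathrm{Pr}^\perp(u_{ab})\, S_{ab}(\mu) \\
&= \gamma''_{ab}\, S^{\parallel}_{ab}(\eta)\, S^{\parallel}_{ab}(\mu) + \frac{\gamma'_{ab}}{\| q_a - q_b \|}\, \big\langle S^{\perp}_{ab}(\eta), S^{\perp}_{ab}(\mu) \big\rangle.
\end{align*}
Inserting this expression into (41) yields the desired result.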
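Formula (54) of Lemma 12 is easy to test numerically. The sketch below compares the closed-form Hessian with central finite differences; the Gaussian profile $\gamma(r) = \exp(-r^2/2\sigma^2)$ with $\sigma = 1$ and the sample point `x0` are assumptions made only for this illustration.

```python
import math

SIGMA = 1.0  # assumed width of the illustrative Gaussian profile

def gamma(r):
    """Assumed radial profile gamma(r) = exp(-r^2 / (2 sigma^2))."""
    return math.exp(-r * r / (2.0 * SIGMA ** 2))

def dgamma(r):
    """First derivative gamma'(r) of the assumed profile."""
    return -r / SIGMA ** 2 * gamma(r)

def d2gamma(r):
    """Second derivative gamma''(r) of the assumed profile."""
    return (r * r / SIGMA ** 4 - 1.0 / SIGMA ** 2) * gamma(r)

def K(x):
    """Rotationally invariant kernel K(x) = gamma(|x|), cf. (52)."""
    return gamma(math.sqrt(sum(c * c for c in x)))

def hessian_formula(x):
    """Hessian from (54): gamma'' x x^T / |x|^2 + (gamma' / |x|) Pr_perp(x)."""
    r = math.sqrt(sum(c * c for c in x))
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            uu = x[i] * x[j] / (r * r)
            delta = 1.0 if i == j else 0.0
            H[i][j] = d2gamma(r) * uu + dgamma(r) / r * (delta - uu)
    return H

def hessian_fd(x, h=1e-4):
    """Central finite-difference Hessian of K, for comparison only."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return K(y)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * h * h)
    return H

x0 = [0.7, -0.4, 1.2]  # arbitrary nonzero sample point
H_exact = hessian_formula(x0)
H_num = hessian_fd(x0)
```

The formula is only evaluated away from the origin, where $x / \|x\|$ is defined.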
A simple special case is when only one landmark carries momentum. We now compute the numerator of sectional curvature when both cotangent vectors are nonzero at only one of the $D$-dimensional landmarks $(q_1, \dots, q_N)$. We define:
\[
(T^*_q\mathcal{L})_1 := \big\{ \eta \in T^*_q\mathcal{L} \;\big|\; \eta_a = 0 \text{ for } a > 1 \big\}
\]
so that the elements of the above set are cotangent vectors of the type $\eta = (\eta_1, 0, \dots, 0)$.

Figure 3: Dragging effect of one momentum-carrying landmark $q_1$ (bullet •) on a grid of landmarks (circles ◦), with γ(x) = exp(− x σ), σ = 1.5. Left: initial configuration, with initial momentum p = (2 . , . 8) also shown. Right: configuration after one unit of time, with trajectory of $q_1$ also shown; the red grid represents the diffeomorphism $\varphi^v$, obtained by integrating $\alpha^{\mathrm{hor}}$ in time.

Proposition 14. In $\mathcal{L}^N(\mathbb{R}^D)$, for any pair $\alpha, \beta \in (T^*_q\mathcal{L})_1$ the four terms of $R(\alpha^\sharp, \beta^\sharp, \beta^\sharp, \alpha^\sharp)$ are given by $R_1 = R_2 = R_3 = 0$ and $R_4 = - \tfrac34 \sum_{a,b=2}^{N} \langle H_a, H_b \rangle_{\mathbb{R}^D}\, (K^{-1})_{ab}$, where
\[
H_a := (\gamma_{a1} - \gamma_0) \big( \langle \alpha_1, \nabla K_{a1} \rangle\, \beta_1 - \langle \beta_1, \nabla K_{a1} \rangle\, \alpha_1 \big), \qquad \text{for } a > 1.
\]

Proof.
Using formula (38), we see that all mixed forces $F_a$ are zero. Therefore, the result for the first three terms follows from Theorem 9. Also, by (40), $D_a(\alpha, \beta) = (\gamma_{a1} - \gamma_0)\, \langle \alpha_1, \nabla K_{a1} \rangle\, \beta_1$ since $\alpha, \beta \in (T^*_q\mathcal{L})_1$; a similar expression holds for $D_a(\beta, \alpha)$, which concludes the proof.

Therefore when $\alpha, \beta \in (T^*_q\mathcal{L})_1$ the sectional curvature is always negative; we can understand this by considering the geodesic flow in this case. It follows immediately from Proposition 6 that if we start with zero momenta $p_a$ at all $q_a$, $a > 1$, then the momenta at these points stay zero, while the momentum at $q_1$ remains constant. Thus the velocity of $q_1$ is just given by $K(0)\, p_1$ and this is constant. The point $q_1$ carrying the momentum moves in a straight line at constant speed, while the other points $q_a$ ($a > 1$) are carried along by the global flow that the motion of $q_1$ causes and move at speeds $\dot q_a = K_{a1}\, p_1$, which are parallel to $\dot q_1$ (but not constant). As shown in Figure 3 (the central landmark $q_1$ is the only one carrying momentum) what happens is that all other landmark points are dragged along by $q_1$, more strongly when close, less when far away. Points directly in front of the path of $q_1$ pile up and points behind space out.

Negative curvature can be seen by the divergence of geodesics. If you imagine slightly changing the direction of $p_1$ in Figure 3, the final configuration of the landmark points (say, after one unit of time) will differ greatly from the one caused by the original value of $p_1$. Also, if you imagine $q_1$ moving along two nearby parallel straight lines, the differential effect on the cloud of other points accumulates so that the final configurations will differ everywhere; thus, even though the initial landmark configurations are close, the final configurations will be far away. In general, the last negative term in the curvature expresses the same effect: the global drag effect of each point results in a kind of turbulent mixing of all the other points (think of a kitchen mixer, the motion of whose blades mixes the whole bowl).

Proposition 14 simplifies in the case of $\mathcal{L} = \mathcal{L}^2(\mathbb{R}^D)$ (two landmarks only). We shall write:
\[
\alpha_1^{\parallel} := \langle \alpha_1, u_{12} \rangle, \quad \alpha_1^{\perp} := \alpha_1 - \alpha_1^{\parallel}\, u_{12}, \quad \beta_1^{\parallel} := \langle \beta_1, u_{12} \rangle, \quad \beta_1^{\perp} := \beta_1 - \beta_1^{\parallel}\, u_{12}. \tag{57}
\]

Proposition 15.
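In the case of $\mathcal{L} = \mathcal{L}^2(\mathbb{R}^D)$, when $\alpha, \beta \in (T^*_q\mathcal{L})_1$ the numerator and denominator of sectional curvature are given by, respectively:
\begin{align}
R(\alpha^\sharp, \beta^\sharp, \beta^\sharp, \alpha^\sharp) = R_4 &= - \tfrac34\, \gamma_0\, \frac{\gamma_0 - \gamma_{12}}{\gamma_0 + \gamma_{12}}\, \big( \gamma'_{12} \big)^2\, \big\| \beta_1^{\parallel} \alpha_1^{\perp} - \alpha_1^{\parallel} \beta_1^{\perp} \big\|^2, \tag{58} \\
\| \alpha \|^2_{T^*\mathcal{L}}\, \| \beta \|^2_{T^*\mathcal{L}} - \langle \alpha, \beta \rangle^2_{T^*\mathcal{L}} &= \gamma_0^2 \Big( \big\| \beta_1^{\parallel} \alpha_1^{\perp} - \alpha_1^{\parallel} \beta_1^{\perp} \big\|^2 + \tfrac12 \big\| \beta_1^{\perp} \otimes \alpha_1^{\perp} - \alpha_1^{\perp} \otimes \beta_1^{\perp} \big\|^2 \Big). \tag{59}
\end{align}

Proof.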
It is the case that $R_4 = - \tfrac34\, \| H_2 \|^2\, (K^{-1})_{22}$. But $(K^{-1})_{22} = (\gamma_0^2 - \gamma_{12}^2)^{-1}\, \gamma_0$, whereas from Proposition 14 we have
\[
\| H_2 \|^2 = (\gamma_0 - \gamma_{12})^2 \big( \langle \alpha_1, \nabla K_{21} \rangle^2\, \| \beta_1 \|^2 + \langle \beta_1, \nabla K_{21} \rangle^2\, \| \alpha_1 \|^2 - 2\, \langle \alpha_1, \nabla K_{21} \rangle \langle \beta_1, \nabla K_{21} \rangle \langle \alpha_1, \beta_1 \rangle \big),
\]
where $\nabla K_{21} = - \gamma'_{12}\, u_{12}$ by (53). Inserting expressions (57) into the above formula yields (58).

From Proposition 11 we have that the denominator is given by $\gamma_0^2 \big( \| \alpha_1 \|^2 \| \beta_1 \|^2 - \langle \alpha_1, \beta_1 \rangle^2 \big)$; again, inserting (57) into such formula yields (59).

We will generalize the above results in the next section.

The complexity of the formula for curvature reflects a real complexity in the geometry of the landmark space. But there is one case in which the geometry of landmark space can be analyzed quite completely. This is when there are only two nonzero momenta along a geodesic. To put this in context, we first introduce a basic structural relation between landmark spaces.
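Returning to Proposition 15, the two-landmark formulas (58) and (59) can be evaluated directly. The sketch below is a hedged illustration: it assumes a Gaussian profile $\gamma(r) = \exp(-r^2/2)$, takes the norm of the tensor $\beta_1^\perp \otimes \alpha_1^\perp - \alpha_1^\perp \otimes \beta_1^\perp$ to be the Frobenius norm (so that it enters (59) with the factor $\tfrac12$), and uses made-up positions and covectors in $\mathbb{R}^3$.

```python
import math

def gamma(r):
    """Assumed radial profile gamma(r) = exp(-r^2 / 2), for illustration only."""
    return math.exp(-r * r / 2.0)

def dgamma(r):
    """Derivative gamma'(r) of the assumed profile."""
    return -r * gamma(r)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def two_landmark_curvature_terms(alpha1, beta1, q1, q2):
    """Numerator (58) and denominator (59) when only landmark 1 carries momentum."""
    diff = [a - b for a, b in zip(q1, q2)]
    rho = math.sqrt(dot(diff, diff))
    u = [c / rho for c in diff]                      # unit vector u_12
    g0, g12 = gamma(0.0), gamma(rho)
    a_par, b_par = dot(alpha1, u), dot(beta1, u)     # parallel components, cf. (57)
    a_perp = [a - a_par * c for a, c in zip(alpha1, u)]
    b_perp = [b - b_par * c for b, c in zip(beta1, u)]
    w = [b_par * a - a_par * b for a, b in zip(a_perp, b_perp)]
    w2 = dot(w, w)
    # Squared Frobenius norm of beta_perp (x) alpha_perp - alpha_perp (x) beta_perp.
    wedge2 = sum((bi * aj - ai * bj) ** 2
                 for ai, bi in zip(a_perp, b_perp)
                 for aj, bj in zip(a_perp, b_perp))
    numerator = -0.75 * g0 * (g0 - g12) / (g0 + g12) * dgamma(rho) ** 2 * w2
    denominator = g0 ** 2 * (w2 + 0.5 * wedge2)
    return numerator, denominator

# Made-up data: two landmarks in R^3, covectors supported on the first landmark.
q1, q2 = [0.0, 0.0, 0.0], [1.0, 0.5, 0.25]
alpha1 = [1.0, 0.5, -0.2]
beta1 = [0.3, -1.0, 0.7]

num, den = two_landmark_curvature_terms(alpha1, beta1, q1, q2)
sectional_curvature = num / den

# Independent evaluation of the denominator via Proposition 11.
den_direct = gamma(0.0) ** 2 * (dot(alpha1, alpha1) * dot(beta1, beta1)
                                - dot(alpha1, beta1) ** 2)
```

In dimension $D = 2$ the two perpendicular parts are collinear, so the tensor term vanishes; the example uses $D = 3$ so that both terms of (59) contribute.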
Instead of labeling the landmarks as $1, 2, \dots, N$, one can use any finite index set $\mathcal{A}$ and label the landmarks as $q^a$ with $a \in \mathcal{A}$. And instead of calling the landmark space $\mathcal{L}^N$, we can call it $\mathcal{L}^{\mathcal{A}}$. Now suppose we have a subset $\mathcal{B} \subset \mathcal{A}$. Then there is a natural projection $\pi : \mathcal{L}^{\mathcal{A}} \to \mathcal{L}^{\mathcal{B}}$ obtained by forgetting the points with labels in $\mathcal{A} - \mathcal{B}$. In the metrics we have been discussing this is a submersion. In fact, the kernel of $d\pi$, the vertical subspace of $T\mathcal{L}^{\mathcal{A}}$, is the space of vectors $v^a$ such that $v^a = 0$ if $a \in \mathcal{B}$. Its perpendicular in $T^*$ is:
\[
(T^*\mathcal{L}^{\mathcal{A}})_{\mathcal{B}} := \big\{ p \in T^*\mathcal{L}^{\mathcal{A}} \,\big|\, p_a = 0 \text{ for } a \in \mathcal{A} - \mathcal{B} \big\},
\]
so the orthogonal complement of $\ker(d\pi)$ in $T\mathcal{L}^{\mathcal{A}}$ is the space of vectors $p^\sharp$ where $p$ is in $(T^*\mathcal{L}^{\mathcal{A}})_{\mathcal{B}}$. On this subspace, the norm is just
\[
\sum_{b, b' \in \mathcal{B}} K(q^b - q^{b'}) \langle p_b, p_{b'} \rangle
\]
whether $p^\sharp$ is taken to be a tangent vector to $\mathcal{L}^{\mathcal{A}}$ or to $\mathcal{L}^{\mathcal{B}}$. In other words, the horizontal subspace for the submersion $\pi$ is the subbundle $(T^*\mathcal{L}^{\mathcal{A}})^\sharp_{\mathcal{B}} \subset T\mathcal{L}^{\mathcal{A}}$ of tangent vectors $p^\sharp$ where $p$ has zero components in $\mathcal{A} - \mathcal{B}$, and this has the same metric as the tangent space to $\mathcal{L}^{\mathcal{B}}$. In particular, from the general theory of submersions, we know that every geodesic in $\mathcal{L}^{\mathcal{B}}$ beginning at some point $\pi(\{q^a\})$ has a unique lift to a horizontal geodesic in $\mathcal{L}^{\mathcal{A}}$ starting at $\{q^a\}$. The picture to have is that all the landmark spaces form a sort of inverse system of spaces whose inverse limit is the group of diffeomorphisms of $\mathbb{R}^D$.

We don't want to pursue this in general, but rather we will study the special case where the cardinality of $\mathcal{B}$ is two. We might as well, then, go back to our former terminology and consider the map $\pi : \mathcal{L}^N \to \mathcal{L}^2$ obtained by mapping an $N$-tuple $(q^1, q^2, \dots, q^N)$ to the pair $(q^1, q^2)$. Moreover, we want to consider only the case in which the kernel $K$ is rotationally invariant as in (52). A basic quantity in all that follows is the distance $\rho := \|q^1 - q^2\|$ between the two momentum-bearing points.
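The claim that the norm of a covector supported on $\mathcal{B}$ is the same whether it is computed in the larger or in the smaller landmark space can be checked numerically. The following is a minimal sketch, assuming a Gaussian kernel $K(x) = \exp(-\|x\|^2/2)$; the helper names `kernel` and `cometric_norm2` and the sample landmarks and momenta are hypothetical and do not come from the text.

```python
import math

def kernel(x, y):
    """Assumed Gaussian kernel K(x - y) = exp(-|x - y|^2 / 2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * d2)

def cometric_norm2(q, p):
    """Squared cometric norm: sum over a, b of K(q^a - q^b) <p_a, p_b>."""
    total = 0.0
    for qa, pa in zip(q, p):
        for qb, pb in zip(q, p):
            total += kernel(qa, qb) * sum(u * v for u, v in zip(pa, pb))
    return total

# Hypothetical configuration of four landmarks in the plane (index set A).
q_A = [(0.0, 0.0), (1.0, 0.5), (-0.7, 1.2), (2.0, -1.0)]
# Momentum supported on B = {first two landmarks}; zero on A - B.
p_A = [(0.3, -0.2), (-0.1, 0.4), (0.0, 0.0), (0.0, 0.0)]

norm_in_A = cometric_norm2(q_A, p_A)          # norm computed in the larger space
norm_in_B = cometric_norm2(q_A[:2], p_A[:2])  # norm computed in the smaller space
print(norm_in_A, norm_in_B)
```

Only the landmarks carrying momentum contribute to the double sum, which is why forgetting the other points leaves the norm unchanged.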
Remarkably, we can describe, more or less explicitly, all the geodesics which arise as horizontal lifts from this map. These are the geodesics with nonzero momenta only at $q^1$ and $q^2$. Moreover, the formula for sectional curvature for the 2-plane spanned by any two horizontal vectors can be analyzed. This analysis was started in the PhD thesis of the first author [19] and has been pursued further in [18].

The metric tensor of $\mathcal{L} = \mathcal{L}^2(\mathbb{R}^D)$ in coordinates is obtained by inverting the $2 \times 2$ matrix $K$:
\[
K = \begin{bmatrix} \gamma(0) & \gamma(\rho) \\ \gamma(\rho) & \gamma(0) \end{bmatrix}
\;\Longrightarrow\;
\begin{cases}
(K^{-1})_{11} = (K^{-1})_{22} = \big(\gamma(0)^2 - \gamma(\rho)^2\big)^{-1} \gamma(0) \\
(K^{-1})_{12} = (K^{-1})_{21} = -\big(\gamma(0)^2 - \gamma(\rho)^2\big)^{-1} \gamma(\rho),
\end{cases} \tag{60}
\]
so that the cometric and metric, for all covectors $\alpha, \beta \in T^*_q\mathcal{L}$ and vectors $v, w \in T_q\mathcal{L}$, are simply:
\[
g^{-1}(\alpha, \beta) = \gamma(0) \big( \langle \alpha_1, \beta_1 \rangle + \langle \alpha_2, \beta_2 \rangle \big) + \gamma(\rho) \big( \langle \alpha_1, \beta_2 \rangle + \langle \alpha_2, \beta_1 \rangle \big), \tag{61}
\]
\[
g(v, w) = \frac{1}{\gamma(0)^2 - \gamma(\rho)^2} \Big[ \gamma(0) \big( \langle v^1, w^1 \rangle + \langle v^2, w^2 \rangle \big) - \gamma(\rho) \big( \langle v^1, w^2 \rangle + \langle v^2, w^1 \rangle \big) \Big].
\]

The geometry of the two-point space is best understood by changing variables for the landmark coordinates $(q^1, q^2)$ and the momentum $(p_1, p_2)$ to their means and semi-differences, that is:
\[
\bar q := \frac{q^1 + q^2}{2}, \quad \delta q := \frac{q^1 - q^2}{2}, \quad \bar p := \frac{p_1 + p_2}{2}, \quad \delta p := \frac{p_1 - p_2}{2},
\]
so that:
\[
q^1 = \bar q + \delta q, \quad q^2 = \bar q - \delta q, \quad p_1 = \bar p + \delta p, \quad p_2 = \bar p - \delta p.
\]
Then the cometric (61) becomes:
\[
g^{-1}\big( (\bar\alpha, \delta\alpha), (\bar\beta, \delta\beta) \big) = 2 \big( \gamma(0) + \gamma(\rho) \big) \langle \bar\alpha, \bar\beta \rangle + 2 \big( \gamma(0) - \gamma(\rho) \big) \langle \delta\alpha, \delta\beta \rangle. \tag{62}
\]
With these coordinates, the two-point landmark space becomes a product $V \times V_\delta$ in which all fibres $V \times \{\delta q\}$ are flat Euclidean spaces, though with variable scales, all fibres $\{\bar q\} \times V_\delta$ carry conformally flat metrics on the manifold $\mathbb{R}^D - \{0\}$, and the tangent spaces of the two factors are orthogonal.

Proposition 16.
In terms of means and semi-differences, the geodesic equations for $\mathcal{L}^2(\mathbb{R}^D)$ are:
\[
\dot{\bar q} = \big( \gamma(0) + \gamma(\rho) \big) \bar p, \quad
\dot{\bar p} = 0, \quad
\dot{\delta q} = \big( \gamma(0) - \gamma(\rho) \big) \delta p, \quad
\dot{\delta p} = -2 \frac{\gamma'(\rho)}{\rho} \big( \|\bar p\|^2 - \|\delta p\|^2 \big) \delta q. \tag{63}
\]

The above result is proven by direct computation (recall that $q^1 - q^2 = 2\,\delta q$, so that $\rho = 2\|\delta q\|$). We can solve these equations in four steps.

First, the linear momentum $\bar p$ is a constant, so the "center of mass" $\bar q$ moves in a straight line parallel to this constant:
\[
\bar q(t) = \bar q(0) + \Big( \int_0^t \big( \gamma(0) + \gamma(\rho(\tau)) \big)\, d\tau \Big)\, \bar p. \tag{64}
\]

Secondly, if we treat the vectors $\delta q$ and $\delta p$ as 1-forms in $\mathbb{R}^D$, equations (63) also show that:
\[
(\delta q \wedge \delta p)^{\cdot} = \dot{\delta q} \wedge \delta p + \delta q \wedge \dot{\delta p} = [(\text{scalar}) \cdot \delta p] \wedge \delta p + \delta q \wedge [(\text{scalar}) \cdot \delta q] = 0,
\]
so the angular momentum $\delta q \wedge \delta p \in \Lambda^2 \mathbb{R}^D$ is constant; we write this as $\omega\, e_1 \wedge e_2$ where $\omega$ is the nonnegative real magnitude of the angular momentum and $(e_1, e_2)$ is an orthonormal pair. Then it follows that:
\[
\delta q(t) = \frac{\rho(t)}{2} \big[ \cos(\theta(t))\, e_1 + \sin(\theta(t))\, e_2 \big], \quad \text{for some function } \theta(t).
\]

Thirdly, we can express $\theta(t)$ as an integral:
\[
\dot{\delta q} = \frac{\dot\rho}{2} \big[ \cos(\theta)\, e_1 + \sin(\theta)\, e_2 \big] + \frac{\rho \dot\theta}{2} \big[ -\sin(\theta)\, e_1 + \cos(\theta)\, e_2 \big], \quad \text{so}
\]
\[
\dot{\delta q} \wedge \delta q = -\frac{\rho^2 \dot\theta}{4}\, e_1 \wedge e_2, \quad \text{as well as (from (63)):}
\]
\[
\dot{\delta q} \wedge \delta q = \big( \gamma(0) - \gamma(\rho) \big)\, \delta p \wedge \delta q = -\omega \big( \gamma(0) - \gamma(\rho) \big)\, e_1 \wedge e_2;
\]
combining the second and third lines, we find:
\[
\theta(t) = \theta(0) + 4 \omega \int_0^t \frac{\gamma(0) - \gamma(\rho(\tau))}{\rho(\tau)^2}\, d\tau; \tag{65}
\]
note that $\theta(t)$ is a monotone increasing function if $\omega \neq 0$, otherwise it is a constant.

The last step is to solve for $\rho(t)$. This can be done using conservation of energy. Equations (63) are in fact the cogeodesic equations for the Hamiltonian $H(p, q)$ of section 4.1, which we may rewrite in terms of means and semi-differences as
\[
H = \big( \gamma(0) + \gamma(\rho) \big) \|\bar p\|^2 + \big( \gamma(0) - \gamma(\rho) \big) \|\delta p\|^2
\]
by (62); hence this function of $\rho$ and $\|\delta p\|$ is a constant ($\bar p$ is also a constant).
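The two conserved quantities used above (the energy $H$ and the angular momentum $\omega$) can be checked by integrating the $(\delta q, \delta p)$ part of (63) numerically. The following is a minimal sketch in the plane, assuming a Gaussian profile $\gamma(x) = \exp(-x^2/2)$, a classical Runge-Kutta integrator, and hypothetical initial data; none of these choices come from the text, and the function names are illustrative only.

```python
import math

def gamma(x):
    """Assumed Gaussian profile gamma(x) = exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)

def dgamma(x):
    """Derivative gamma'(x) of the assumed profile."""
    return -x * math.exp(-0.5 * x * x)

def rhs(state, pbar2):
    """(delta q, delta p) part of (63) in the plane; pbar2 = |pbar|^2 is constant."""
    dqx, dqy, dpx, dpy = state
    rho = 2.0 * math.hypot(dqx, dqy)          # rho = |q^1 - q^2| = 2 |delta q|
    diff = gamma(0.0) - gamma(rho)
    dp2 = dpx * dpx + dpy * dpy
    c = -2.0 * dgamma(rho) / rho * (pbar2 - dp2)
    return (diff * dpx, diff * dpy, c * dqx, c * dqy)

def rk4_step(state, pbar2, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, a):
        return tuple(s + a * ki for s, ki in zip(state, k))
    k1 = rhs(state, pbar2)
    k2 = rhs(shifted(k1, h / 2), pbar2)
    k3 = rhs(shifted(k2, h / 2), pbar2)
    k4 = rhs(shifted(k3, h), pbar2)
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def energy(state, pbar2):
    """H = (gamma(0) + gamma(rho)) |pbar|^2 + (gamma(0) - gamma(rho)) |delta p|^2."""
    dqx, dqy, dpx, dpy = state
    rho = 2.0 * math.hypot(dqx, dqy)
    return ((gamma(0.0) + gamma(rho)) * pbar2
            + (gamma(0.0) - gamma(rho)) * (dpx * dpx + dpy * dpy))

def angular_momentum(state):
    """Coefficient of e_1 ^ e_2 in delta q ^ delta p (planar case)."""
    dqx, dqy, dpx, dpy = state
    return dqx * dpy - dqy * dpx

# Hypothetical initial data: |pbar| = 0.2, delta q = (1, 0), delta p = (0.3, 0.5).
pbar2 = 0.2 ** 2
state = (1.0, 0.0, 0.3, 0.5)
H0, w0 = energy(state, pbar2), angular_momentum(state)
for _ in range(500):                          # integrate up to t = 5
    state = rk4_step(state, pbar2, 0.01)
H1, w1 = energy(state, pbar2), angular_momentum(state)
print(abs(H1 - H0), abs(w1 - w0))
```

Both printed drifts should be at the level of the integrator's truncation error. The mean $\bar q$ is not integrated here because, by (64), it follows from $\rho(t)$ by a single quadrature.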
Then we calculate:
\[
(\rho^2)^{\cdot} = 4 \langle \delta q, \delta q \rangle^{\cdot} = 8 \langle \dot{\delta q}, \delta q \rangle = 8 \big( \gamma(0) - \gamma(\rho) \big) \langle \delta p, \delta q \rangle
\;\Longrightarrow\;
\dot\rho = 4\, \frac{\gamma(0) - \gamma(\rho)}{\rho}\, \langle \delta p, \delta q \rangle.
\]
But:
\[
\langle \delta p, \delta q \rangle^2 + \omega^2 = \langle \delta p, \delta q \rangle^2 + \|\delta p \wedge \delta q\|^2 = \|\delta p\|^2 \cdot \|\delta q\|^2 = \frac{\rho^2}{4} \left( \frac{H - (\gamma(0) + \gamma(\rho)) \|\bar p\|^2}{\gamma(0) - \gamma(\rho)} \right),
\]
\[
\Longrightarrow\; \dot\rho = 2\, \frac{\sqrt{\gamma(0) - \gamma(\rho)}}{\rho} \sqrt{ \rho^2 \big[ H - \big( \gamma(0) + \gamma(\rho) \big) \|\bar p\|^2 \big] - 4 \omega^2 \big( \gamma(0) - \gamma(\rho) \big) }.
\]
This means that the function $\rho(t)$ is the solution of:
\[
t = \int_{\rho(0)}^{\rho(t)} \frac{x\, dx}{2\sqrt{F(x)}}, \quad \text{where:} \quad
F(x) := H x^2 \big( \gamma(0) - \gamma(x) \big) - \|\bar p\|^2 x^2 \big( \gamma(0)^2 - \gamma(x)^2 \big) - 4 \omega^2 \big( \gamma(0) - \gamma(x) \big)^2. \tag{66}
\]

Figure 4: Converging and diverging trajectories for two landmarks in two dimensions.

Summary.
If we fix constants $H$, $\bar p$, $\omega$, $\rho(0)$, $\theta(0)$, $q^a(0)$ (for all $a$), we can first integrate (66) to get $\rho(t)$ (the separation of $q^1$ and $q^2$), then integrate (65) to find their relative angle $\theta(t)$, then integrate (64) to get their center of mass $\bar q(t)$. This gives the trajectories of $q^1$ and $q^2$. The remaining points are dragged along as solutions of:
\[
\frac{d}{dt} q^a(t) = \gamma\big( \|q^a(t) - q^1(t)\| \big)\, p_1(t) + \gamma\big( \|q^a(t) - q^2(t)\| \big)\, p_2(t).
\]
As worked out in [18, 19], one can classify the global behavior of these geodesics into two types. One is the scattering type in which $q^1$, $q^2$ diverge from each other as time goes to either $\pm\infty$. This occurs if the linear or angular momentum is large enough compared to the energy. In the other case, where the energy is large enough compared to both momenta, they come together asymptotically at either $t = +\infty$ or $-\infty$, diverging at the other limit. In both cases, they may spiral around each other an arbitrarily large number of times (see Figure 4).

Next we consider $\mathcal{L}^N(\mathbb{R}^D)$: we want to compute the sectional curvature $R(\alpha^\sharp, \beta^\sharp, \beta^\sharp, \alpha^\sharp)$ for cotangent vectors that are nonzero at only $(q^1, q^2)$. Also, we will use the notation $u := \frac{q^1 - q^2}{\|q^1 - q^2\|}$ for the unit vector from $q^2$ to $q^1$ as well as $\rho = \|q^1 - q^2\|$ for their distance. Similarly to (55), we will also want to decompose any vector $\eta \in \mathbb{R}^D$ into its parts tangent to $u$ and perpendicular to $u$:
\[
\eta^{\parallel} := \langle \eta, u \rangle, \quad \text{and} \quad \eta^{\perp} := \eta - \eta^{\parallel} u.
\]
Once again note that $\eta^{\parallel}$ is a scalar whereas $\eta^{\perp}$ is a vector. Following the notation used to describe geodesics above, for any $\alpha \in (T^*_q\mathcal{L})_{1,2} := \big\{ \eta \in T^*_q\mathcal{L} \,\big|\, \eta_a = 0 \text{ for } a > 2 \big\}$, we write $\bar\alpha = \frac{1}{2}(\alpha_1 + \alpha_2)$ and $\delta\alpha = \frac{1}{2}(\alpha_1 - \alpha_2)$.

Proposition 17.
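In $\mathcal{L}^N(\mathbb{R}^D)$, for any pair $\alpha, \beta \in (T^*_q\mathcal{L})_{1,2}$, the terms $R_1$, $R_2$ and $R_3$ in the numerator of sectional curvature can be written as
\[
\begin{aligned}
R_1 &= 4 \big( \gamma(0) - \gamma(\rho) \big)^2 \gamma''(\rho) \big\langle \delta\alpha^{\parallel} \beta_1 - \delta\beta^{\parallel} \alpha_1,\; \delta\alpha^{\parallel} \beta_2 - \delta\beta^{\parallel} \alpha_2 \big\rangle \\
&\quad + 4 \big( \gamma(0) - \gamma(\rho) \big)^2 \frac{\gamma'(\rho)}{\rho} \big\langle \delta\alpha^{\perp} \otimes \beta_1 - \delta\beta^{\perp} \otimes \alpha_1,\; \delta\alpha^{\perp} \otimes \beta_2 - \delta\beta^{\perp} \otimes \alpha_2 \big\rangle, \\
R_2 &= -4 \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)^2 \big\langle \delta\alpha^{\parallel} \beta_1 - \delta\beta^{\parallel} \alpha_1,\; \delta\alpha^{\parallel} \beta_2 - \delta\beta^{\parallel} \alpha_2 \big\rangle, \\
R_3 &= \frac{\gamma(0) - \gamma(\rho)}{2}\, \gamma'(\rho)^2 \Big[ \big( \langle \alpha_1, \beta_2 \rangle + \langle \beta_1, \alpha_2 \rangle \big)^2 - 4 \langle \alpha_1, \alpha_2 \rangle \langle \beta_1, \beta_2 \rangle \Big].
\end{aligned}
\]
Once again we have used: $\langle v_1 \otimes w_1, v_2 \otimes w_2 \rangle := \langle v_1, v_2 \rangle \langle w_1, w_2 \rangle$. We need the following result.

Lemma 18.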
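For any $\alpha \in (T^*_q\mathcal{L})_{1,2}$, the discrete strain $S^{12}(\alpha)$ is given by:
\[
S^{12}(\alpha) = 2 \big( \gamma(0) - \gamma(\rho) \big) \delta\alpha. \tag{67}
\]
For any pair $\alpha, \beta \in (T^*_q\mathcal{L})_{1,2}$ it is the case that $F_a(\alpha, \beta) = 0$ for $a > 2$, whereas
\[
F_1(\alpha, \beta) = -F_2(\alpha, \beta) = \frac{\gamma'(\rho)}{2} \big( \langle \alpha_1, \beta_2 \rangle + \langle \beta_1, \alpha_2 \rangle \big)\, u. \tag{68}
\]
Also,
\[
D^1(\alpha, \beta) = 2 \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)\, \delta\alpha^{\parallel} \beta_2, \quad
D^2(\alpha, \beta) = 2 \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)\, \delta\alpha^{\parallel} \beta_1. \tag{69}
\]

Remark. We are not interested in $D^a(\alpha, \beta)$ for $a > 2$ because $F_a(\alpha, \beta) = 0$ for $a > 2$.

Proof of Lemma 18. The formula for the discrete strain results from:
\[
S^{12}(\alpha) = (\alpha^\sharp)^1 - (\alpha^\sharp)^2 = \sum_b (K^{1b} - K^{2b}) \alpha_b = \gamma(0) \alpha_1 + \gamma(\rho) \alpha_2 - \gamma(\rho) \alpha_1 - \gamma(0) \alpha_2 = 2 \big( \gamma(0) - \gamma(\rho) \big) \delta\alpha.
\]
The values for $F$ follow immediately from formula (38) and $\nabla K^{12} = \gamma'(\rho)\, u$. Note that:
\[
C^{12}(\alpha) = C^{21}(\alpha) = \big\langle S^{12}(\alpha), \nabla K^{12} \big\rangle = \big\langle 2 \big( \gamma(0) - \gamma(\rho) \big) \delta\alpha,\; \gamma'(\rho)\, u \big\rangle = 2 \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)\, \delta\alpha^{\parallel},
\]
and similarly we have that $C^{12}(\beta) = C^{21}(\beta) = 2 \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)\, \delta\beta^{\parallel}$. So
\[
D^1(\alpha, \beta) = C^{12}(\alpha) \beta_2 = 2 \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)\, \delta\alpha^{\parallel} \beta_2, \quad
D^2(\alpha, \beta) = C^{21}(\alpha) \beta_1 = 2 \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)\, \delta\alpha^{\parallel} \beta_1.
\]

Proof of Proposition 17.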
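The $R_1$ expression follows by substituting the expressions in (67) into formula (56), noting that the only non-zero terms in the latter are for $(a, b) = (1, 2)$ and $(a, b) = (2, 1)$. Since $F_1 = -F_2$ from Lemma 18, $R_2$ is given by
\[
\begin{aligned}
R_2 &= \big\langle D^1(\alpha, \alpha) - D^2(\alpha, \alpha),\; F_1(\beta, \beta) \big\rangle + \big\langle D^1(\beta, \beta) - D^2(\beta, \beta),\; F_1(\alpha, \alpha) \big\rangle \\
&\quad - \big\langle D^1(\alpha, \beta) - D^2(\alpha, \beta) + D^1(\beta, \alpha) - D^2(\beta, \alpha),\; F_1(\alpha, \beta) \big\rangle.
\end{aligned} \tag{70}
\]
Again by Lemma 18 we have that $D^1(\eta, \zeta) - D^2(\eta, \zeta) = -4 \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)\, \delta\eta^{\parallel}\, \delta\zeta$ for any pair $\eta, \zeta \in (T^*_q\mathcal{L})_{1,2}$, while $F_1(\eta, \zeta) = \frac{\gamma'(\rho)}{2} \big( \langle \eta_1, \zeta_2 \rangle + \langle \eta_2, \zeta_1 \rangle \big)\, u$. Applying this to all the terms we get the expression for $R_2$ in the statement of the proposition.

As far as $R_3$ is concerned, by Theorem 9:
\[
\begin{aligned}
R_3 &= \gamma(0) \big[ \langle F_1(\alpha, \beta), F_1(\alpha, \beta) \rangle - \langle F_1(\alpha, \alpha), F_1(\beta, \beta) \rangle \big]
+ \gamma(\rho) \big[ \langle F_1(\alpha, \beta), F_2(\alpha, \beta) \rangle - \langle F_1(\alpha, \alpha), F_2(\beta, \beta) \rangle \big] \\
&\quad + \gamma(\rho) \big[ \langle F_2(\alpha, \beta), F_1(\alpha, \beta) \rangle - \langle F_2(\alpha, \alpha), F_1(\beta, \beta) \rangle \big]
+ \gamma(0) \big[ \langle F_2(\alpha, \beta), F_2(\alpha, \beta) \rangle - \langle F_2(\alpha, \alpha), F_2(\beta, \beta) \rangle \big] \\
&= 2 \big( \gamma(0) - \gamma(\rho) \big) \big[ \langle F_1(\alpha, \beta), F_1(\alpha, \beta) \rangle - \langle F_1(\alpha, \alpha), F_1(\beta, \beta) \rangle \big] \\
&= \frac{\gamma(0) - \gamma(\rho)}{2} \big( \gamma'(\rho) \big)^2 \Big[ \big( \langle \alpha_1, \beta_2 \rangle + \langle \beta_1, \alpha_2 \rangle \big)^2 - 4 \langle \alpha_1, \alpha_2 \rangle \langle \beta_1, \beta_2 \rangle \Big],
\end{aligned}
\]
where we have used the fact that $F_1 = -F_2$, by equation (68). This completes the proof.

The expressions provided by Proposition 17 become much clearer if we go over to means and semi-differences, i.e. if we use the substitutions:
\[
\alpha_1 = \bar\alpha + \delta\alpha, \quad \alpha_2 = \bar\alpha - \delta\alpha, \quad \beta_1 = \bar\beta + \delta\beta, \quad \beta_2 = \bar\beta - \delta\beta. \tag{71}
\]

Corollary 19.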
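For any $\alpha, \beta \in (T^*_q\mathcal{L})_{1,2}$, with $\mathcal{L} = \mathcal{L}^N(\mathbb{R}^D)$, it is the case that:
\[
\begin{aligned}
R_1 &= 4 \big( \gamma(0) - \gamma(\rho) \big)^2 \gamma''(\rho) \Big( \big\| \delta\beta^{\parallel} \bar\alpha - \delta\alpha^{\parallel} \bar\beta \big\|^2 - \big\| \delta\beta^{\parallel} \delta\alpha - \delta\alpha^{\parallel} \delta\beta \big\|^2 \Big) \\
&\quad + 4 \big( \gamma(0) - \gamma(\rho) \big)^2 \frac{\gamma'(\rho)}{\rho} \Big( \big\| \delta\beta^{\perp} \otimes \bar\alpha - \delta\alpha^{\perp} \otimes \bar\beta \big\|^2 - \big\| \delta\beta^{\perp} \otimes \delta\alpha - \delta\alpha^{\perp} \otimes \delta\beta \big\|^2 \Big), \\
R_2 &= -4 \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)^2 \Big( \big\| \delta\beta^{\parallel} \bar\alpha - \delta\alpha^{\parallel} \bar\beta \big\|^2 - \big\| \delta\beta^{\parallel} \delta\alpha - \delta\alpha^{\parallel} \delta\beta \big\|^2 \Big), \\
R_3 &= \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)^2 \Big( 2 \big\| \delta\beta \otimes \bar\alpha - \delta\alpha \otimes \bar\beta \big\|^2 - \big\| \bar\beta \otimes \bar\alpha - \bar\alpha \otimes \bar\beta \big\|^2 - \big\| \delta\beta \otimes \delta\alpha - \delta\alpha \otimes \delta\beta \big\|^2 \Big).
\end{aligned}
\]

Proof.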
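By insertion of formulae (71) it is easily seen that
\[
\begin{aligned}
\big\langle \delta\alpha^{\parallel} \beta_1 - \delta\beta^{\parallel} \alpha_1,\; \delta\alpha^{\parallel} \beta_2 - \delta\beta^{\parallel} \alpha_2 \big\rangle
&= \big\| \delta\beta^{\parallel} \bar\alpha - \delta\alpha^{\parallel} \bar\beta \big\|^2 - \big\| \delta\beta^{\parallel} \delta\alpha - \delta\alpha^{\parallel} \delta\beta \big\|^2, \\
\big\langle \delta\alpha^{\perp} \otimes \beta_1 - \delta\beta^{\perp} \otimes \alpha_1,\; \delta\alpha^{\perp} \otimes \beta_2 - \delta\beta^{\perp} \otimes \alpha_2 \big\rangle
&= \big\| \delta\beta^{\perp} \otimes \bar\alpha - \delta\alpha^{\perp} \otimes \bar\beta \big\|^2 - \big\| \delta\beta^{\perp} \otimes \delta\alpha - \delta\alpha^{\perp} \otimes \delta\beta \big\|^2,
\end{aligned}
\]
so the new expressions for $R_1$ and $R_2$ follow immediately. Also:
\[
\begin{aligned}
\big[ \langle \alpha_1, \beta_2 \rangle + \langle \beta_1, \alpha_2 \rangle \big]^2 - 4 \langle \alpha_1, \alpha_2 \rangle \langle \beta_1, \beta_2 \rangle
&= 4 \big[ \langle \bar\alpha, \bar\beta \rangle - \langle \delta\alpha, \delta\beta \rangle \big]^2 - 4 \big( \langle \bar\alpha, \bar\alpha \rangle - \langle \delta\alpha, \delta\alpha \rangle \big) \big( \langle \bar\beta, \bar\beta \rangle - \langle \delta\beta, \delta\beta \rangle \big) \\
&= -4 \big[ \langle \bar\alpha, \bar\alpha \rangle \langle \bar\beta, \bar\beta \rangle - \langle \bar\alpha, \bar\beta \rangle^2 \big] - 4 \big[ \langle \delta\alpha, \delta\alpha \rangle \langle \delta\beta, \delta\beta \rangle - \langle \delta\alpha, \delta\beta \rangle^2 \big] \\
&\quad + 4 \big[ \langle \bar\alpha, \bar\alpha \rangle \langle \delta\beta, \delta\beta \rangle + \langle \bar\beta, \bar\beta \rangle \langle \delta\alpha, \delta\alpha \rangle - 2 \langle \bar\alpha, \bar\beta \rangle \langle \delta\alpha, \delta\beta \rangle \big] \\
&= -2 \big\| \bar\beta \otimes \bar\alpha - \bar\alpha \otimes \bar\beta \big\|^2 - 2 \big\| \delta\beta \otimes \delta\alpha - \delta\alpha \otimes \delta\beta \big\|^2 + 4 \big\| \delta\beta \otimes \bar\alpha - \delta\alpha \otimes \bar\beta \big\|^2.
\end{aligned}
\]

The fourth term $R_4$ is the only one which involves the other points $q^a$, $a > 2$. But one has an inequality for this term involving the same expressions in $\alpha$ and $\beta$:

Proposition 20.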
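Any pair $\alpha, \beta \in (T^*\mathcal{L}^N)_{1,2}$ are constant 1-forms on $\mathcal{L}^N$ which are pull-backs via the submersion $\mathcal{L}^N \to \mathcal{L}^2$ of constant 1-forms on $\mathcal{L}^2$. We can therefore consider the curvature term $R_4(\mathcal{L}^N) = -\frac{3}{4} \big\| [\alpha^\sharp, \beta^\sharp]_{\mathcal{L}^N} \big\|^2$ on $\mathcal{L}^N$ and the corresponding term $R_4(\mathcal{L}^2) = -\frac{3}{4} \big\| [\alpha^\sharp, \beta^\sharp]_{\mathcal{L}^2} \big\|^2$ on $\mathcal{L}^2$. Then we have the inequality:
\[
R_4(\mathcal{L}^N) \le R_4(\mathcal{L}^2) = -6\, \gamma'(\rho)^2 \left[ \frac{\big( \gamma(0) - \gamma(\rho) \big)^2}{\gamma(0) + \gamma(\rho)} \big\| \delta\beta^{\parallel} \bar\alpha - \delta\alpha^{\parallel} \bar\beta \big\|^2 + \big( \gamma(0) - \gamma(\rho) \big) \big\| \delta\beta^{\parallel} \delta\alpha - \delta\alpha^{\parallel} \delta\beta \big\|^2 \right].
\]

Figure 5: Typical covectors $\alpha = (\alpha_1, \alpha_2)$ in spaces $\delta^{\parallel} T^*_q\mathcal{L}$, $\delta^{\perp} T^*_q\mathcal{L}$, and $\overline{T^*_q}\mathcal{L}$, for $\mathcal{L} = \mathcal{L}^2(\mathbb{R}^2)$.

Proof.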
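Firstly, note that $[\alpha^\sharp, \beta^\sharp]_{\mathcal{L}^N}$ breaks into perpendicular parts: a vertical part in the kernel of $d\pi$ and a horizontal part which is simply the horizontal lift of $[\alpha^\sharp, \beta^\sharp]_{\mathcal{L}^2}$. This explains the inequality assertion in Proposition 20. To calculate $R_4(\mathcal{L}^2)$, we use the last expression in (44), i.e.
\[
\begin{aligned}
R_4(\mathcal{L}^2) &= -\frac{3}{4} \sum_{a,b=1}^{2} \big\langle D^a(\alpha, \beta) - D^a(\beta, \alpha),\; D^b(\alpha, \beta) - D^b(\beta, \alpha) \big\rangle (K^{-1})_{ab} \\
&= -3\, \frac{\gamma(0) - \gamma(\rho)}{\gamma(0) + \gamma(\rho)}\, \gamma'(\rho)^2 \Big\{ \gamma(0) \Big[ \big\| \delta\alpha^{\parallel} \beta_1 - \delta\beta^{\parallel} \alpha_1 \big\|^2 + \big\| \delta\alpha^{\parallel} \beta_2 - \delta\beta^{\parallel} \alpha_2 \big\|^2 \Big] \\
&\qquad\qquad - 2 \gamma(\rho) \big\langle \delta\alpha^{\parallel} \beta_1 - \delta\beta^{\parallel} \alpha_1,\; \delta\alpha^{\parallel} \beta_2 - \delta\beta^{\parallel} \alpha_2 \big\rangle \Big\},
\end{aligned}
\]
where we have used (60) and (69). The final result follows after inserting (71) into the above expression and performing some algebra.

Note that all terms in Corollary 19 and Proposition 20 are very similar. In fact, they are all "components" of the norm $\|\alpha \wedge \beta\|^2$ of the 2-form whose sectional curvature is being computed. First note that we can decompose $T^*\mathcal{L}^2$ into the direct sum of three pieces, namely:
\[
\begin{aligned}
\delta^{\parallel} T^*\mathcal{L} &:= \big\{ (au, -au) \,\big|\, a \in \mathbb{R} \big\}, & \dim\big( \delta^{\parallel} T^*\mathcal{L} \big) &= 1, \\
\delta^{\perp} T^*\mathcal{L} &:= \big\{ (p, -p) \,\big|\, p \in \mathbb{R}^D,\ p \perp u \big\}, & \dim\big( \delta^{\perp} T^*\mathcal{L} \big) &= D - 1, \\
\overline{T^*}\mathcal{L} &:= \big\{ (p, p) \,\big|\, p \in \mathbb{R}^D \big\}, & \dim\big( \overline{T^*}\mathcal{L} \big) &= D,
\end{aligned}
\]
where as usual $u := \frac{q^1 - q^2}{\|q^1 - q^2\|}$ (see Figure 5). Note that these three subspaces are orthogonal with respect to the cometric by virtue of (61). An arbitrary covector $\alpha = (\alpha_1, \alpha_2) \in T^*_q\mathcal{L}$ can be uniquely decomposed into the sum $\alpha = \alpha^{(1)} + \alpha^{(2)} + \alpha^{(3)}$, with:
\[
\alpha^{(1)} := (\delta\alpha^{\parallel} u, -\delta\alpha^{\parallel} u) \in \delta^{\parallel} T^*\mathcal{L}, \quad
\alpha^{(2)} := (\delta\alpha^{\perp}, -\delta\alpha^{\perp}) \in \delta^{\perp} T^*\mathcal{L}, \quad
\alpha^{(3)} := (\bar\alpha, \bar\alpha) \in \overline{T^*}\mathcal{L}. \tag{72}
\]
So it is the case that: (i) $\alpha \in \delta^{\parallel} T^*\mathcal{L} \Leftrightarrow \delta\alpha^{\perp} = 0$ and $\bar\alpha = 0$; (ii) $\alpha \in \delta^{\perp} T^*\mathcal{L} \Leftrightarrow \delta\alpha^{\parallel} = 0$ and $\bar\alpha = 0$; (iii) $\alpha \in \overline{T^*}\mathcal{L} \Leftrightarrow \delta\alpha^{\parallel} = 0$ and $\delta\alpha^{\perp} = 0$.

Consequently the space of 2-forms $\Lambda^2 T^*\mathcal{L}$ decomposes into the direct sum of five pieces:
\[
\Lambda^2 T^*\mathcal{L} = \bigoplus_{i=1}^{5} V_i, \quad \text{with:}
\]
\[
V_1 := \delta^{\parallel} T^*\mathcal{L} \wedge \overline{T^*}\mathcal{L}, \quad
V_2 := \delta^{\perp} T^*\mathcal{L} \wedge \overline{T^*}\mathcal{L}, \quad
V_3 := \delta^{\parallel} T^*\mathcal{L} \wedge \delta^{\perp} T^*\mathcal{L}, \quad
V_4 := \Lambda^2 \big( \delta^{\perp} T^*\mathcal{L} \big), \quad
V_5 := \Lambda^2 \big( \overline{T^*}\mathcal{L} \big).
\]
(Since $\delta^{\parallel} T^*\mathcal{L}$ is one-dimensional, it creates no 2-forms.) Once again, note that the spaces $V_1, \dots, V_5$ are pairwise orthogonal with respect to the inner product
\[
\langle \alpha \wedge \beta, \xi \wedge \eta \rangle_{\Lambda^2 T^*\mathcal{L}} := \langle \alpha, \xi \rangle_{T^*\mathcal{L}} \langle \beta, \eta \rangle_{T^*\mathcal{L}} - \langle \alpha, \eta \rangle_{T^*\mathcal{L}} \langle \beta, \xi \rangle_{T^*\mathcal{L}}, \quad \alpha, \beta, \xi, \eta \in T^*\mathcal{L} \tag{73}
\]
by the orthogonality of $\delta^{\parallel} T^*_q\mathcal{L}$, $\delta^{\perp} T^*_q\mathcal{L}$, and $\overline{T^*_q}\mathcal{L}$. Any 2-form $\alpha \wedge \beta$ then decomposes into the sum of its five projections onto these subspaces and its norm squared is the sum of the norms squared of these components. Let us first give the five pieces of its norm names:
\[
\begin{aligned}
T_1 &:= \big\| \delta\beta^{\parallel} u \otimes \bar\alpha - \delta\alpha^{\parallel} u \otimes \bar\beta \big\|^2, &
T_2 &:= \big\| \delta\beta^{\perp} \otimes \bar\alpha - \delta\alpha^{\perp} \otimes \bar\beta \big\|^2, &
T_3 &:= \big\| \delta\beta^{\parallel} u \otimes \delta\alpha^{\perp} - \delta\alpha^{\parallel} u \otimes \delta\beta^{\perp} \big\|^2, \\
T_4 &:= \big\| \delta\beta^{\perp} \otimes \delta\alpha^{\perp} - \delta\alpha^{\perp} \otimes \delta\beta^{\perp} \big\|^2, &
T_5 &:= \big\| \bar\beta \otimes \bar\alpha - \bar\alpha \otimes \bar\beta \big\|^2.
\end{aligned}
\]
In the above definitions $\|\cdot\|$ indicates the Euclidean norm. We have to be careful here: we have been using Euclidean norms in $\mathbb{R}^D$ in all our formulas above and now we are dealing with norms in $T^*\mathcal{L}$; these essentially differ only by a factor, by (62). More precisely, the following result holds:

Proposition 21.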
The denominator of the sectional curvature (16) for $\mathcal{L}^2(\mathbb{R}^D)$ can be written as:
\[
\|\alpha \wedge \beta\|^2_{\Lambda^2 T^*\mathcal{L}} = 4 \big( \gamma(0)^2 - \gamma(\rho)^2 \big) (T_1 + T_2) + 2 \big( \gamma(0) - \gamma(\rho) \big)^2 (2 T_3 + T_4) + 2 \big( \gamma(0) + \gamma(\rho) \big)^2 T_5. \tag{74}
\]

Proof.
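We may apply decomposition (72) to both $\alpha = \sum_{i=1}^{3} \alpha^{(i)}$ and $\beta = \sum_{i=1}^{3} \beta^{(i)}$, and write
\[
\alpha \wedge \beta = \big( \alpha^{(1)} \wedge \beta^{(3)} - \beta^{(1)} \wedge \alpha^{(3)} \big) + \big( \alpha^{(2)} \wedge \beta^{(3)} - \beta^{(2)} \wedge \alpha^{(3)} \big) + \big( \alpha^{(1)} \wedge \beta^{(2)} - \beta^{(1)} \wedge \alpha^{(2)} \big) + \alpha^{(2)} \wedge \beta^{(2)} + \alpha^{(3)} \wedge \beta^{(3)},
\]
where the five summands on the right-hand side belong to $V_1, \dots, V_5$ respectively. We have
\[
\begin{aligned}
\big\| \alpha^{(1)} \wedge \beta^{(3)} - \beta^{(1)} \wedge \alpha^{(3)} \big\|^2_{\Lambda^2 T^*\mathcal{L}}
&= \big\| \alpha^{(1)} \wedge \beta^{(3)} \big\|^2_{\Lambda^2 T^*\mathcal{L}} + \big\| \beta^{(1)} \wedge \alpha^{(3)} \big\|^2_{\Lambda^2 T^*\mathcal{L}} - 2 \big\langle \alpha^{(1)} \wedge \beta^{(3)},\; \beta^{(1)} \wedge \alpha^{(3)} \big\rangle_{\Lambda^2 T^*\mathcal{L}} \\
&\overset{(*)}{=} 4 \big( \gamma(0)^2 - \gamma(\rho)^2 \big) \big[ (\delta\alpha^{\parallel})^2 \|\bar\beta\|^2 + (\delta\beta^{\parallel})^2 \|\bar\alpha\|^2 - 2\, \delta\alpha^{\parallel} \delta\beta^{\parallel} \langle \bar\alpha, \bar\beta \rangle \big] \\
&= 4 \big( \gamma(0)^2 - \gamma(\rho)^2 \big)\, T_1,
\end{aligned}
\]
where we have used (73) and (62) in step $(*)$. The square norm of the remaining four terms is computed similarly. Orthogonality of $V_1, \dots, V_5$ finally yields (74).

To express the formulas for the numerator of sectional curvature succinctly, let us also introduce abbreviations for the coefficients involving $\gamma$:
\[
\begin{aligned}
k_1(\rho) &:= \big( \gamma(0) - \gamma(\rho) \big)^2 \gamma''(\rho), &
k_2(\rho) &:= \big( \gamma(0) - \gamma(\rho) \big)^2 \frac{\gamma'(\rho)}{\rho}, \\
k_3(\rho) &:= \big( \gamma(0) - \gamma(\rho) \big) \gamma'(\rho)^2, &
k_4(\rho) &:= \frac{\big( \gamma(0) - \gamma(\rho) \big)^2}{\gamma(0) + \gamma(\rho)}\, \gamma'(\rho)^2.
\end{aligned} \tag{75}
\]
Note that $k_1$, $k_2$, $k_3$ and $k_4$ are all homogeneous of degree 3 in $\gamma$ and degree $-2$ in $\rho$ or $d\rho$ on $\mathcal{L}^N$. Moreover $k_2$ is negative, $k_3$ and $k_4$ are positive, while $k_1$ may be positive or negative. For all $\gamma$ of interest, $\gamma'$ is everywhere negative, starting at 0, decreasing to a minimum at some $\rho_0$, then increasing back to 0 at $\infty$. Then $k_1$ is negative for $\rho < \rho_0$ and positive for $\rho > \rho_0$.

Figure 6: The coefficients of $T_1$ (top left), $T_2$ (top right), $T_3$ (bottom left) and $T_4$ (bottom right) for the Bessel kernels $\gamma$ (shown with thin lines) and the Gaussian kernel (shown with the thick line). The kernels are scaled to normalize $\gamma(0)$ and $\gamma''(0)$.

The following equalities are proven by direct computation:
\[
\big\| \delta\beta \otimes \bar\alpha - \delta\alpha \otimes \bar\beta \big\|^2 = T_1 + T_2, \quad
\big\| \delta\beta^{\perp} \otimes \delta\alpha - \delta\alpha^{\perp} \otimes \delta\beta \big\|^2 = T_3 + T_4, \quad
\big\| \delta\beta \otimes \delta\alpha - \delta\alpha \otimes \delta\beta \big\|^2 = 2 T_3 + T_4.
\]
Inserting notation (75) and the above equalities into Propositions 17 and 20 immediately yields: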
Proposition 22. We can write the terms in the numerator of sectional curvature for $\mathcal{L}^2(\mathbb{R}^D)$ as:
\[
\begin{aligned}
R_1 &= 4 k_1 (T_1 - T_3) + 4 k_2 (T_2 - T_3 - T_4), & R_2 &= -4 k_3 (T_1 - T_3), \\
R_3 &= k_3 \big( 2 (T_1 + T_2) - T_5 - 2 T_3 - T_4 \big), & R_4 &= -6 (k_4 T_1 + k_3 T_3),
\end{aligned} \tag{76}
\]
hence $R = R(\alpha^\sharp, \beta^\sharp, \beta^\sharp, \alpha^\sharp) = \sum_{i=1}^{4} R_i$ may be expressed as:
\[
R = 2 \big( 2 k_1 - k_3 - 3 k_4 \big) T_1 + 2 \big( 2 k_2 + k_3 \big) T_2 + 4 \big( -k_1 - k_2 - k_3 \big) T_3 + \big( -4 k_2 - k_3 \big) T_4 - k_3 T_5. \tag{77}
\]

By virtue of Proposition 20 the above proposition still holds in the case of $\mathcal{L}^N(\mathbb{R}^D)$ as long as $\alpha, \beta \in (T^*_q\mathcal{L})_{1,2}$ and the equality signs for $R_4$ in (76) and $R$ in (77) are substituted by "$\le$". The $\gamma$ of interest are the Bessel kernels (3) and the Gaussian kernel, which is their asymptotic limit as their order goes to infinity. The coefficients for these kernels are shown in Figure 6. We see that the coefficients of $T_2$ and $T_3$ are negative while those of $T_4$ are positive. Henceforth, we assume we have a kernel for which this is true.

Figure 7: Left: sectional curvature $K$ for $\mathcal{L}^2(\mathbb{R})$ (from Proposition 23), as a function of $\rho = |q^1 - q^2|$, for the Gaussian kernel. Right: two trajectories (Geodesic 1 and Geodesic 2) in $\mathcal{L}^2(\mathbb{R})$ shown in the $(q^1, q^2)$ plane (under the assumption that $q^1 < q^2$). Both geodesics originate at the same landmark configuration.

$\mathcal{L}^2(\mathbb{R})$

Finally, we will now explore the important example of two landmarks on the real line. In this particular case the manifold is two dimensional, so sectional curvature $K$ will turn out to be independent of cotangent vectors $\alpha$ and $\beta$. In fact, given the translation invariance of the metric tensor, it will only depend on the distance $\rho = |q^1 - q^2|$ between the two landmarks.

The spaces $\delta^{\parallel} T^*\mathcal{L}^2(\mathbb{R})$ and $\overline{T^*}\mathcal{L}^2(\mathbb{R})$ are one-dimensional while $\delta^{\perp} T^*\mathcal{L}^2(\mathbb{R}) = \{0\}$. Thus
\[
\Lambda^2 T^*\mathcal{L}^2(\mathbb{R}) = \delta^{\parallel} T^*\mathcal{L}^2(\mathbb{R}) \wedge \overline{T^*}\mathcal{L}^2(\mathbb{R})
\]
and the only non-zero term in (77) is $T_1$. Therefore combining formulas (74) and (77) we get:

Proposition 23.
The sectional curvature of $\mathcal{L}^2(\mathbb{R})$ is given by
\[
K = \frac{2 k_1 - k_3 - 3 k_4}{2 \big( \gamma(0)^2 - \gamma(\rho)^2 \big)} = \frac{\gamma(0) - \gamma(\rho)}{\gamma(0) + \gamma(\rho)}\, \gamma''(\rho) - \frac{2 \gamma(0) - \gamma(\rho)}{\big( \gamma(0) + \gamma(\rho) \big)^2} \big( \gamma'(\rho) \big)^2.
\]

Figure 8: Existence of conjugate points in $\mathcal{L}^2(\mathbb{R}^2)$. Both geodesics originate at the same landmark set; the first one (dashed) has initial momentum in $\overline{T^*}\mathcal{L}$ while the second one (continuous) has initial momentum in $\delta^{\parallel} T^*\mathcal{L} \oplus \overline{T^*}\mathcal{L}$. The geodesic trajectories exhibit conjugate points.

The above function $K$ is shown on the left-hand side of Figure 7 as a function of $\rho$, for the Gaussian kernel. The coefficient of the term $T_1$ in (77) is negative for $\rho$ small and positive for $\rho$ large. The "cause" of the positive curvature has been analyzed in [19]. Roughly speaking, suppose two points both want to move a fixed distance to the right. Then if they are far enough apart, they can just move more or less independently (we shall refer to this as Geodesic 1). Or (i) the one in back can speed up while the one in front slows down, then (ii) when the pair are close, they move in tandem using less energy because they are close, and finally (iii) the back one slows down and the front one speeds up as they near their destinations (Geodesic 2). This gives explicit conjugate points and is illustrated on the right-hand side of Figure 7 (where Geodesics 1 and 2 are represented, respectively, by the dashed and thick curves).

There is another source of positive curvature in $\mathcal{L}^2$ in higher dimensions. It is clear from equation (77) and Figure 6 that any positive curvature must come from the term with $T_1$ or the term with $T_4$. As the five terms are orthogonal, we can make all of them but one zero.

For example, if we choose $\alpha = (\delta\alpha^{\parallel} u, -\delta\alpha^{\parallel} u) \in \delta^{\parallel} T^*\mathcal{L}$ and $\beta = (\bar\beta, \bar\beta) \in \overline{T^*}\mathcal{L}$, then it is the case that $T_1 = (\delta\alpha^{\parallel})^2 \|\bar\beta\|^2$ and it is the only non-zero term.
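The sign change of the coefficient of $T_1$ (equivalently, of the curvature $K$ of Proposition 23) can be checked numerically. The following is a minimal sketch, assuming the Gaussian profile $\gamma(x) = \exp(-x^2/2)$ and reading $k_1$, $k_3$, $k_4$ as $(\gamma(0)-\gamma)^2\gamma''$, $(\gamma(0)-\gamma)\gamma'^2$ and $(\gamma(0)-\gamma)^2\gamma'^2/(\gamma(0)+\gamma)$; the normalization of the kernel, the sample values of $\rho$ and the function names are assumptions made for illustration.

```python
import math

def gamma(x):
    """Assumed Gaussian profile gamma(x) = exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)

def dgamma(x):
    """First derivative of the assumed profile."""
    return -x * gamma(x)

def d2gamma(x):
    """Second derivative of the assumed profile."""
    return (x * x - 1.0) * gamma(x)

def curvature_from_k(rho):
    """K = (2 k1 - k3 - 3 k4) / (2 (gamma(0)^2 - gamma(rho)^2))."""
    g0, g, g1, g2 = gamma(0.0), gamma(rho), dgamma(rho), d2gamma(rho)
    k1 = (g0 - g) ** 2 * g2
    k3 = (g0 - g) * g1 ** 2
    k4 = (g0 - g) ** 2 / (g0 + g) * g1 ** 2
    return (2 * k1 - k3 - 3 * k4) / (2 * (g0 ** 2 - g ** 2))

def curvature_closed_form(rho):
    """Second expression for K, written directly in terms of gamma."""
    g0, g, g1, g2 = gamma(0.0), gamma(rho), dgamma(rho), d2gamma(rho)
    return (g0 - g) / (g0 + g) * g2 - (2 * g0 - g) / (g0 + g) ** 2 * g1 ** 2

for rho in (0.5, 1.0, 3.0):
    print(rho, curvature_from_k(rho), curvature_closed_form(rho))
```

For this kernel the two expressions agree, and the curvature is negative at $\rho = 0.5$ and positive at $\rho = 3$, in line with the behavior described for small and large separations.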
Then, if $\rho$ is sufficiently large, the sectional curvature for this 2-plane is positive as discussed in the last section. Figure 8 illustrates an instance of the existence of conjugate points for two geodesics in $\mathcal{L}^2(\mathbb{R}^2)$; the momenta $(p_1, p_2)$ of each of the two trajectories belong at all times to $\delta^{\parallel} T^*\mathcal{L} \oplus \overline{T^*}\mathcal{L}$.

The other possibility is that $T_4$ is the non-zero term, which happens when $\alpha = (\delta\alpha^{\perp}, -\delta\alpha^{\perp}) \in \delta^{\perp} T^*\mathcal{L}$ and $\beta = (\delta\beta^{\perp}, -\delta\beta^{\perp}) \in \delta^{\perp} T^*\mathcal{L}$. We have $T_4 = 2 \big( \|\delta\alpha^{\perp}\|^2 \|\delta\beta^{\perp}\|^2 - \langle \delta\alpha^{\perp}, \delta\beta^{\perp} \rangle^2 \big)$, and for it to be nonzero it is required that $D \ge 3$: $T_4$ is the norm of a 2-form in $\Lambda^2 \big( \delta^{\perp} T^*\mathcal{L} \big)$, which has dimension $(D-1)(D-2)/2$. The positive curvature of this section is readily seen by considering the geodesics which these vectors generate. The simplest example is the following:
Proposition 24.
The circular periodic orbit of radius r : q ( t ) = ( r cos t, r sin t ) , q ( t ) = − q ( t ) , (78) t ∈ R , is a geodesic in L ( R ) if and only if r is the solution of the equation γ − γ ( x ) + 2 xγ ( x ) = 0 .Proof. For orbit (78) it is the case that q ≡ δq ≡ ρ ≡ r , p ≡ δp ( t ) = (cid:0) γ − γ ( ρ ) (cid:1) − ˙ q ( t );therefore equations (63) are satisfied if and only if γ − γ ( r ) + 2 rγ ( r ) = 0.(The above result was also proved by Fran¸cois-Xavier Vialard of Imperial College, London.)Orbit (78) has the property that at time π , q and q interchange their positions: it is a geodesicfrom the set of landmark points (cid:0) ( r, , ( − r, (cid:1) ∈ L ( R ) to the set (cid:0) ( − r, , ( r, (cid:1) ∈ L ( R ). Butif these points live in R , they can move around each other in any plane containing the points. Thuswe have a circle of geodesics in L ( R ): q ( t ) = ( r cos t, r cos θ sin t, r sin θ sin t ) , q ( t ) = − q ( t )all connecting (cid:0) ( r, , , ( − r, , (cid:1) to (cid:0) ( − r, , , ( r, , (cid:1) , for any θ ∈ [0 , π ). This is exactly like allthe lines of fixed longitude connecting the north and south pole on the 2-sphere and means that oneset of landmark points is a conjugate point of the other in L ( R ). This is the simplest example ofhow geodesics between landmark points must avoid collisions and so make a choice between differentpossible detours, leading to conjugate points and thus positive curvature. We have computed a formula for sectional curvature for L N ( R D ), the Riemannian manifold of N landmark points in D dimensions. To do so we have developed a formula to compute sectionalcurvature of a Riemannian manifold in terms of the cometric, its partial derivatives, and the metric(but not its derivatives). Finally, we have fully examined the case of geodesics in which only twopoints have non-zero momenta, and found that there are essentially two sources of positive curvature;one only occurs when D ≥
3. Future work may include: exploring the shape of the coefficients in (77)for different kernels; finding new sources of positive curvature when momenta are non-zero at morethan two points; analyzing what happens asymptotically when the points are very close or very farfrom each other; further relating the dynamics of landmarks to the geometry of the space.
References

[1] M. Abramowitz and I. A. Stegun. Handbook of Mathematical Functions. Dover Publications, New York, 1964.
[2] M. F. Beg, M. I. Miller, A. Trouvé, and L. Younes. Computing large deformation metric mappings via geodesic flows of diffeomorphisms. International Journal of Computer Vision, 61(2):139–157, 2005.
[3] Y. Cao, M. I. Miller, S. Mori, R. L. Winslow, and L. Younes. Diffeomorphic matching of diffusion tensor images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '06), New York, June 2006.
[4] Y. Cao, M. I. Miller, R. L. Winslow, and L. Younes. Large deformation diffeomorphic metric mapping of vector fields. IEEE Transactions on Medical Imaging, 24(9):1216–1230, 2005.
[5] C. Chicone. Ordinary Differential Equations and Applications, volume 34 of Texts in Applied Mathematics. Springer, 1999.
[6] L. C. Evans. Partial Differential Equations, volume 19 of Graduate Studies in Mathematics. American Mathematical Society, Providence, Rhode Island, 1998.
[7] J. Glaunès. Transport par difféomorphismes de points, de mesures et de courants pour la comparaison de formes et l'anatomie numérique. PhD thesis, Université Paris 13, France, Sept. 2005.
[8] J. Glaunès, A. Qiu, M. I. Miller, and L. Younes. Large deformation diffeomorphic metric curve mapping. International Journal of Computer Vision, 80(3):317–336, Dec. 2008.
[9] J. Glaunès, A. Trouvé, and L. Younes. Diffeomorphic matching of distributions: a new approach for unlabeled point-sets and sub-manifolds matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR '04), volume 2, pages 712–718, Washington, DC, June 2004.
[10] J. Glaunès, M. Vaillant, and M. I. Miller. Landmark matching via large deformation diffeomorphisms on the sphere. Journal of Mathematical Imaging and Vision, 20:170–200, 2004.
[11] S. C. Joshi and M. I. Miller. Landmark matching via large deformation diffeomorphisms. IEEE Transactions on Image Processing, 9(8):1357–1370, Aug. 2000.
[12] J. Jost. Riemannian Geometry and Geometric Analysis. Springer-Verlag, New York, third edition, 2002.
[13] D. G. Kendall. Shape manifolds, Procrustean metrics, and complex projective spaces. Bulletin of the London Mathematical Society, 16(2):81–121, 1984.
[14] E. Klassen, A. Srivastava, W. Mio, and S. Joshi. Analysis of planar shapes using geodesic paths on shape spaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(3):372–383, Mar. 2004.
[15] A. Kriegl and P. W. Michor. The Convenient Setting of Global Analysis, volume 53 of Mathematical Surveys and Monographs. American Mathematical Society, Providence, Rhode Island, 1997.
[16] S. Kushnarev. Teichons: Soliton-like geodesics on universal Teichmüller space. Experimental Mathematics, 18(3), 2009.
[17] J. M. Lee. Riemannian Manifolds: an Introduction to Curvature, volume 176 of Graduate Texts in Mathematics. Springer, New York, 1997.
[18] R. L. McLachlan and S. Marsland. N-particle dynamics of the Euler equations for planar diffeomorphisms. Dynamical Systems, 22:269–290, 2007.
[19] M. Micheli. The Geometry of Landmark Shape Spaces: Metrics, Geodesics, and Curvature. PhD thesis, Brown University, Providence, Rhode Island, 2008.
[20] P. W. Michor. Topics in differential geometry, volume 93 of Graduate Studies in Mathematics. American Mathematical Society, Providence, RI, 2008.
[21] P. W. Michor and D. B. Mumford. Riemannian geometries on spaces of plane curves. Journal of the European Mathematical Society, 8:1–48, 2006.
[22] P. W. Michor and D. B. Mumford. An overview of the Riemannian metrics on spaces of curves using the Hamiltonian approach. Applied and Computational Harmonic Analysis, 23:74–113, 2007.
[23] M. I. Miller and L. Younes. Group actions, homeomorphisms, and matching: A general framework. International Journal of Computer Vision, 41(1/2):61–84, 2001.
[24] E. Sharon and D. B. Mumford. 2D-shape analysis using conformal mapping. International Journal of Computer Vision, 70(1):55–75, Oct. 2006.
[25] G. Sundaramoorthi, A. Yezzi, and A. Mennucci. Sobolev active contours. International Journal of Computer Vision, 73(3):345–366, July 2007.
[26] L. Younes. Shapes and Diffeomorphisms, volume 171 of Applied Mathematical Sciences. Springer, 2010.
[27] L. Younes, P. W. Michor, J. Shah, and D. B. Mumford. A metric on shape space with explicit geodesics. Rendiconti Lincei – Matematica e Applicazioni, 9:25–57, 2008.
[28] S. Zhang, L. Younes, J. Zweck, and J. T. Ratnanather. Diffeomorphic surface flows: A novel method of surface evolution.