[PDF] Higher-Order Momentum Distributions and Locally Affine LDDMM Registration

Abstract

To achieve sparse parametrizations that allows intuitive analysis, we aim to represent deformation with a basis containing interpretable elements, and we wish to use elements that have the description capacity to represent the deformation compactly. To accomplish this, we introduce in this paper higher-order momentum distributions in the LDDMM registration framework. While the zeroth order moments previously used in LDDMM only describe local displacement, the first-order momenta that are proposed here represent a basis that allows local description of affine transformations and subsequent compact description of non-translational movement in a globally non-rigid deformation. The resulting representation contains directly interpretable information from both mathematical and modeling perspectives. We develop the mathematical construction of the registration framework with higher-order momenta, we show the implications for sparse image registration and deformation description, and we provide examples of how the parametrization enables registration with a very low number of parameters. The capacity and interpretability of the parametrization using higher-order momenta lead to natural modeling of articulated movement, and the method promises to be useful for quantifying ventricle expansion and progressing atrophy during Alzheimer's disease.

Full PDF

aa r X i v : . [ c s . C V ] N ov HIGHER-ORDER MOMENTUM DISTRIBUTIONS AND LOCALLYAFFINE LDDMM REGISTRATION

STEFAN SOMMER ∗ , MADS NIELSEN ∗ , † , SUNE DARKNER ∗ , AND

XAVIER PENNEC ‡ Abstract.

To achieve sparse parametrizations that allows intuitive analysis, we aim to representdeformation with a basis containing interpretable elements, and we wish to use elements that havethe description capacity to represent the deformation compactly . To accomplish this, we introduce inthis paper higher-order momentum distributions in the LDDMM registration framework. While thezeroth order moments previously used in LDDMM only describe local displacement, the ﬁrst-ordermomenta that are proposed here represent a basis that allows local description of aﬃne transfor-mations and subsequent compact description of non-translational movement in a globally non-rigiddeformation. The resulting representation contains directly interpretable information from bothmathematical and modeling perspectives. We develop the mathematical construction of the registra-tion framework with higher-order momenta, we show the implications for sparse image registrationand deformation description, and we provide examples of how the parametrization enables registra-tion with a very low number of parameters. The capacity and interpretability of the parametrizationusing higher-order momenta lead to natural modeling of articulated movement, and the methodpromises to be useful for quantifying ventricle expansion and progressing atrophy during Alzheimer’sdisease.

Key words.

LDDMM, diﬀeomorphic registration, RHKS, kernels, momentum, computationalanatomy

AMS subject classiﬁcations.

1. Introduction.

In many image registration applications, we wish to describethe deformation using as few parameters as possible and with a representation thatallows intuitive analysis: we search for parametrizations with basis elements that havethe capacity to describe deformation sparsely while being directly interpretable . Forinstance, we wish to use such a representation to compactly describe the progressiveatrophy that occurs in the human brain suﬀering from Alzheimer’s disease and thatcan be detected by the expansion of the ventricles [19, 13].Image registration algorithms often represent translational movement in a densesampling of the image domain. Such approaches fail to satisfy the above goals: lowdimensional deformations such as expansion of the ventricles will not be representedsparsely; the registration algorithm must optimize a large number of parameters; andthe expansion cannot easily be interpreted from the registration result.In this paper, we use higher-order momentum distributions in the LDDMM regis-tration framework to obtain a deformation parametrization that increases the capacity of sparse approaches with a basis consisting of interpretable elements. We show howthe higher-order representation model locally aﬃne transformations, and we use thecompact deformation description to register points and images using very few param-eters. We illustrate how the deformation coded by the higher-order momenta can bedirectly interpreted and that it represents information directly useful in applications:with low numbers of control points, we can detect the expanding ventricles of thepatient shown in Figure 1.1.

Most of the methods for non-rigid registration in medicalimaging model the displacement of each spatial position by either a combination of ∗ Dept. of Computer Science, Univ. of Copenhagen, Denmark ( [email protected] ) † BiomedIQ, Copenhagen, Denmark ‡ Asclepios Project-Team, INRIA Sophia-Antipolis, France1

50 100 150 200 25050100150200250 (a) Baseline with control points.

50 100 150 200 25050100150200250 1020304050607080 (b) Follow up (box marking zoom area, ﬁgure(c) and (d)).

90 100 110 120 130 140 150 160 170708090100110120130140 −0.04−0.0200.020.040.060.08 (c) log-Jacobian in ventricle area.

80 90 100 110 120 130 140 150 160 170 180708090100110120130140 (d) Initial deformation ﬁeld in ventricle area.

Fig. 1.1 . Progressing Alzheimer’s disease cause atrophy and expansion of the ventricles. Byplacing ﬁve deformation atoms in the ventricle area of the baseline MRI scan [20] and by using higher-order momenta , we can detect the expansion. (a) The position of the deformation atomsshown in the baseline scan; (b) the follow up scan; (c) the log -Jacobian determinant of the generateddeformation in the ventricle area (red box in (b)); (d) the vector ﬁeld at t = 0 of the generateddeformation. The logarithm of the Jacobian determinant and the divergence at the deformationatoms are positive which is in line with the expected ventricle expansion, confer also Figure 7.5. suitable basis functions for the displacement itself or for the velocity of the voxels.The number of control points vary between one for each voxel [2, 17, 7] and fewer withlarger basis functions [25, 5, 11]. For all methods, the inﬁnite-dimensional space ofdeformations is approximated by the ﬁnite- but high-dimensional subspace spannedby the parametrization of the individual method. The approximation will be goodif the underlying deformation is close to this subspace, and the representation willbe compact, if few basis functions describe the deformation well. The choice of ba-sis functions play a crucial role, and we will in the rest of the paper denote them deformation atoms . Two main observations constitute the motivation for the workpresented in this paper: Observation 1: Order of the Deformation Model.

In the majority of registrationmethods, the deformation atoms model the local translation of each point. We wisha richer representation which is in particular able to model locally linear componentsin addition to local translations. The Polyaﬃne and Log-Euclidean Polyaﬃne [3,1] frameworks pursue this by representing the velocity of a path of deformationslocally by matrix logarithms. Ideas from the Polyaﬃne methods have recently beenincorporated in e.g. the Demons algorithm [32] but, to the best of our knowledge, notin the LDDMM registration framework. We wish to extend the set of deformationatoms used in LDDMM to allow representation of ﬁrst- and higher-order structure andhence incorporate the beneﬁts of the Polyaﬃne methods in the LDDMM framework.

Observation 2: Order of the Similarity Measure.

When registering DT images,the reorientation is a function of the derivative of the warp; curve normals also containdirectional information which is dependent on the warp derivative and airway treescontain directional information in the tree structure which can be used for measuringsimilarity. These are examples of similarity measures containing higher-order infor-mation . For the case of image registration, the warp derivative may also enter theequation either directly in the similarity measure [24, 22] or to allow use of more imageinformation than provided by a sampling of the warp. Consider an image similaritymeasure on the form U ( ϕ ) = R Ω F ( I m ( ϕ − ( x )) , I f ( x )) dx . A ﬁnite sampling of thedomain Ω can approximate this with˜ U ( ϕ ) = 1 N N X k =1 F ( I m ( ϕ − ( x k )) , I f ( x k )) . Letting { p , . . . , p P } be uniformly distributed points around 0, we can increase theamount of image information used in ˜ U ( ϕ ) without additional sampling of the warpby using a ﬁrst-order approximation of ϕ − :˜ U ( ϕ ) = 1 N P N X k =1 N X l =1 F ( I m ( Dϕ − p l + ϕ − ( x k )) , I f ( p l + x k )) . This can be considered an increase from zeroth to ﬁrst-order in the approximation of U . Besides including more image information than provided by the initial samplingof the warp, the increase in order allows capture of non-translational information -e.g. rotation and dilation - in the similarity measure. The approach can be seen as aspeciﬁc case of similarity smoothing; more examples of smoothing in intensity basedimage registration can be found in [9].We focus on deformation modeling with the Large Deformation DiﬀeomorphicMetric Mapping (LDDMM) registration framework which has the beneﬁt of bothproviding good registrations and drawing strong theoretical links with Lie group the-ory and evolution equations in physical modeling [8, 34]. Most often, high-dimensionalvoxel-wise representations are used for LDDMM although recent interest in compact representations [11, 28] show that the number of parameters can be much reduced.These methods use interpolation of the velocity ﬁeld by deformation atoms to repre-sent translational movement but deformation by other parts of the aﬃne group cannotbe compactly represented.The deformation atoms are called kernels in LDDMM. The kernels are centered atdiﬀerent spatial positions and parameters determine the contribution of each kernel.In this paper, we use the partial derivative reproducing property [35] to show thatpartial derivatives of kernels ﬁt naturally in the LDDMM framework and constitutedeformation atoms along with the original kernels. In particular, these deformationatoms have a singular higher-order momentum and the momentum stays singularwhen transported by the EPDiﬀ evolution equations. We show how the higher-ordermomenta allow modeling locally aﬃne deformations, and they hence extend the capac-ity of sparsely discretized LDDMM methods. In addition, they comprise the naturalvehicle for incorporating ﬁrst-order similarity measures in the framework. A number of methods for non-rigid registration have beendeveloped during the last decades including non-linear elastic methods [21], parametriza-tions using static velocity ﬁelds [2, 17], the demons algorithm [29, 32], and spline-basedmethods [25, 5]. For the particular case of LDDMM, the groundbreaking work ap-peared with the deformable template model by Grenander [16] and the ﬂow approachby Christensen et al. [7] together with the theoretical contributions of Dupuis etal. and Trouv´e [10, 30]. Algorithms for computing optimal diﬀeomorphisms havebeen developed in [4], and [31] uses the momentum representation for statistics anddevelops a momentum based algorithm for the landmark matching problem.Locally aﬃne deformations can be modeled using the Polyaﬃne and Log-EuclideanPolyaﬃne [3, 1] frameworks. The velocity of a path of deformations is here computedusing matrix logarithms, and the resulting diﬀeomorphism ﬂowed forward by integrat-ing the velocity. Ideas from the Polyaﬃne methods have recently been incorporated ine.g. the Demons algorithm [32, 26]. In LDDMM, the deformation atoms, the kernels,represent translational movement and the non-translational part of aﬃne transforma-tions cannot directly be represented. We will show how partial derivatives of kernelsconstitute deformation atoms which allow representing the linear parts of aﬃne trans-formations. From a mathematical point of view, this is possible due to the partialderivative reproducing property (Zhou [35]). The partial derivative reproducing prop-erty, partial derivatives of kernels, and ﬁrst-order momenta have previously been usedin [6] to derive variations of ﬂow equations for LDDMM DTI registration, in [14] tomatch landmarks with vector features, and in [15] to match surfaces with currents.Confer the monograph [34] for information on RKHSs and their role in LDDMM.In order to reduce the dimensionality of the parametrization used in LDDMM,Durrleman et al. [11] introduced a control point formulation of the registration prob-lem by choosing a ﬁnite set of control points and constraining the momentum to beconcentrated as Dirac measures at the point trajectories. As we will see, higher-ordermomenta make a ﬁnite control point formulation possible which is diﬀerent in impor-tant aspects. Younes [33] in addition considers evolution in constrained subspaces.Higher-order momenta increase the capacity of the deformation parametrization,a goal which is also treated in sparse multi-scale methods such as the kernel bundleframework [28]. This method concerns the size of the kernel in contrast to the orderwhich we deal with here. As we will discuss in the experiments section, the sizeof the kernel is important when using the higher-order representation as well, andrepresentations using higher-order momenta will likely complement the kernel bundlemethod if applied together.

We start the paper with an overview of LDDMMregistration and the mathematical constructs behind the method. In the followingsection, we describe registration using higher-order image information and parame-terization using higher-order momentum distributions. We then turn to the mathe-matical background of the method and describe the evolution of the momentum andvelocity ﬁelds governed by the EPDiﬀ evolution equations in the ﬁrst-order case. Thenext sections describes the relation to polyaﬃne approaches, the eﬀect of varyingthe initial conditions, and the backwards gradient transport. We then provide ex-amples and illustrate how the deformation represented by ﬁrst-order atoms can beinterpreted when registering human brains with progressing atrophy. The paper endswith concluding remarks and outlook.

2. LDDMM Registration, Kernels, and Evolution Equations.

In the LD-DMM framework, registration is performed through the action of diﬀeomorphisms ongeometric objects. This approach is very general and allows the framework to be ap-plied to both landmarks, curves, surfaces, images, and tensors. In the case of images,the action of a diﬀeomorphism ϕ on the image I : Ω → R takes the form ϕ.I = I ◦ ϕ − ,and given a ﬁxed image I f and moving image I m , the registration amounts to a searchfor ϕ such that ϕ.I m ∼ I f . In exact matching, we wish ϕ.I m be exactly equal to I f but, more frequently, we allow some amount of inexactness to account for noise in theimages and allow for smoother diﬀeomorphisms. This is done by deﬁning a similaritymeasure U ( ϕ ) = U ( ϕ.I m , I f ) on images and a regularization measure E to give acombined energy E ( ϕ ) = E ( ϕ ) + λU ( ϕ.I m , I f ) . (2.1)Here λ is a positive real representing the trade-oﬀ between regularity and goodnessof ﬁt. The similarity measure U is in the simplest form the L -error R Ω | ϕ.I m ( x ) − I f ( x ) | dx but more advanced measures can be used (e.g. [23, 18, 9]).In order to deﬁne the regularization term E , we introduce some notations in thefollowing: Let the domain Ω be a subset of R d with d = 2 ,

3, and let V denote aHilbert space of vector ﬁelds v : Ω → R d such that V with associated norm k · k V isincluded in L (Ω , R d ) and admissible [34, Chap. 9], i.e. suﬃciently smooth. Given atime-dependent vector ﬁeld t v t with Z k v t k V dt < ∞ (2.2)the associated diﬀerential equation ∂ t ϕ t = v t ◦ ϕ t has with initial condition ϕ s adiﬀeomorphism ϕ vst as unique solution at time t . The set G V of diﬀeomorphismsbuilt from V by such diﬀerential equations is a Lie group, and V is its tangent spaceat the identity. Using the group structure, V is isomorphic to the tangent space ateach point ϕ ∈ G V . The inner product on V associated to a norm k · k V makes G V a Riemannian manifold with right-invariant metric. Setting ϕ v = Id Ω , the map t ϕ v t is a path from Id Ω to ϕ with energy given by (2.2) and generated by v t . Wewill use this notation extensively in the following. A critical path for the energy (2.2)is a geodesic on G V , and the regularization term E is deﬁned using the energy by E ( ϕ ) = min v t ∈ V,ϕ v = ϕ Z k v s k V ds , (2.3)i.e. it measures the minimal energy of diﬀeomorphism paths from Id Ω to ϕ . Sincethe energy is high for paths with great variation, the term penalizes highly varyingpaths, and a low value of E ( ϕ ) thus implies that ϕ is regular. As a consequence of the assumed admissibilityof V , the evaluation functionals δ x : v v ( x ) ∈ R d is well-deﬁned and continuousfor any x ∈ Ω. Thus, for any z ∈ R d the map z ⊗ δ x : v z T v ( x ) belongs to thetopological dual V ∗ , i.e. the continuous linear maps on V . This in turn implies theexistence of spatially dependent matrices K : Ω × Ω → R d × d , the kernel , such that,for any constant vector z ∈ R d , the vector ﬁeld K ( · , x ) z ∈ V represents z ⊗ δ x and h K ( · , x ) z, v i V = z ⊗ δ x ( v ) for any v ∈ V , point x ∈ Ω and vector z ∈ R d . Thislatter property is denoted the reproducing property and gives V the structure of areproducing kernel Hilbert space (RKHS). Tightly connected to the norm and kernelsis the notion of momentum given by the linear momentum operator L : V → V ∗ ⊂ L (Ω , R d ) which satisﬁes h Lv, w i L (Ω , R d ) = Z Ω (cid:0) Lv ( x ) (cid:1) T w ( x ) dx = h v, w i V for all v, w ∈ V . The momentum operator connects the inner product on V with theinner product in L (Ω , R d ), and the image Lv of an element v ∈ V is denoted themomentum of v . The momentum Lv might be singular and in fact L (cid:0) K ( · , y ) z (cid:1) ( x ) isthe Dirac measure δ y ( x ) z . Considering K as the map z R Ω K ( · , x ) z ( x ) dx , L canbe viewed as the inverse of K . We will use the symbol ρ for the momentum whenconsidered as a functional in V ∗ while we switch to the symbol z when the momentumis realized as a vector ﬁeld on Ω or for the parameters when the momentum consistsof a ﬁnite number of singular point measures.Instead of deriving the kernel from V , the opposite approach can be used: build V from a kernel, and hence impose the regularization in the framework from the kernel.With this approach, the kernel is often chosen to ensure rotational and translationalinvariance [34] and the scalar Gaussian kernel K ( x, y ) = exp( − k x − y k σ )Id d is an oftenused choice. Confer [12] for details on the construction of V from Gaussian kernels. The relation be-tween norm and momentum leads to convenient equations for minimizers of the energy(2.1). In particular, the EPDiﬀ equations for the evolution of the momentum z t foroptimal paths assert that if ϕ t is a path minimizing E ( ϕ ) with ϕ = ϕ minimizing E ( ϕ ) and v t is the derivative of ϕ t then v t satisﬁes the system v t = Z Ω K ( · , x ) z t ( x ) dx ,ddt z t = − Dz t v t − z t ∇ · v t − ( Dv t ) T z t with Dz t and Dv t denoting spatial diﬀerentiation of the momentum and velocityﬁelds, respectively. The ﬁrst equation connects the momentum z t with the velocity v t , and the second equation describes the time evolution of the momentum. In themost general form, the EPDiﬀ equations describe the evolution of the momentumusing the adjoint map. Following [34], deﬁne the adjoint Ad ϕ v ( x ) = ( Dϕ v ) ◦ ϕ − ( x )for v ∈ V . The dual of the adjoint is the functional Ad ∗ ϕ on the dual V ∗ of V deﬁnedby (Ad ∗ ϕ ρ | v ) = ( ρ | Ad ϕ ( v )). Deﬁne in addition Ad Tϕ v = K (Ad ∗ ϕ ( Lv )) which thensatisﬁes D Ad Tϕ v, w E = (Ad ∗ ϕ ( Lv ) | w ), and let ∇ ϕ U denote the gradient of the similaritymeasure U with respect to the inner product on V so that h∇ ϕ U, v i V = ∂ ǫ U ( ψ v ǫ ◦ ϕ ) forany variation v ∈ V and diﬀeomorphism path ψ v ǫ with derivative v . For optimal paths v t , the EPDiﬀ equations assert that v t = Ad Tϕ vt v with v = − ∇ ϕ v U which leadsto the conservation of momentum property for optimal paths. Conversely, the EPDiﬀequations reduce to simpler forms for certain objects. For landmarks x , . . . , x N , the Here and in the following, we will use the notation ( p | v ) := p ( v ) for evaluation of the functional p ∈ V ∗ on the vector ﬁeld v ∈ V . momentum will be concentrated at point trajectories x t,k := ϕ t ( x k ) as Dirac measures z t,k ⊗ δ x t,k leading to the ﬁnite dimensional system of ODE’s v t = N X l =1 K ( · , x t,l )) z t,l , ddt ϕ t ( x k ) = v t ( x t,k ) ,ddt z t,k = − N X l =1 ∇ K ( x t,l , x t,k ) z Tt,k z t,l . (2.4)

3. Registration with Higher-Order Information.

We here introduce higher-order momentum distributions for registration using higher-order information withthe LDDMM framework. We start by motivating the construction by considering theapproximation used when computing the similarity measure. We then describe thedeformation encoded by higher-order momenta and the evolution equations in theﬁnite case, and we use this to derive a registration algorithm using ﬁrst-order infor-mation. The mathematical background behind the method will be presented in thefollowing sections.We will motivate the introduction of higher-order momenta by considering a spe-ciﬁc case of image registration: we take on the goal of using a control point formulation[11] when solving the registration problem (2.1) and hence aim for using a relativelysparse sampling of the velocity or momentum ﬁeld. To achieve this, we will considerthe coupling between the transported control points { ϕ − ( x ) , . . . , ϕ − ( x N ) } and thesimilarity measure in order to ensure the momentum stays singular and localized atthe point trajectories while removing the need for warping the entire image at everyiteration of the optimization process.Considering a similarity measure U ( ϕ ) = R Ω F ( I m ( ϕ − ( x )) , I f ( x )) dx as discussedin the introduction, and a ﬁnite discretization ˜ U ( ϕ ) = 1 /N P Nk =1 F ( ϕ.I m ( x k ) , I f ( x k ))with a sparse set of control points { x k } . While using ˜ U ( ϕ ) to drive registration ofthe images will be very eﬃcient in evaluating the warp in few points, it will suﬀercorrespondingly from only using image information present in those points. Apartfrom not being robust under the presence of noise in the images, the discretizationimplies that local dilation or rotation around the points ϕ − ( x k ) cannot be detected:any variation v ∈ V of ϕ keeping ϕ − ( x k ) constant for all k = 1 , . . . , N will not change˜ U ( ϕ ). Formally, if ψ ǫ is a diﬀeomorphism path that is equal to ϕ at t = 0 and hasderivative v at t = 0, i.e. ∂ ǫ ψ ǫ = v and ψ = ϕ , then ∂ ǫ F ( ψ ǫ .I m ( x k ) , I f ( x k )) = ∂ F ( ϕ.I m ( x k ) , I f ( x k )) · (cid:0) ∇ ϕ − ( x k ) I m (cid:1) T v ( ϕ − ( x k ))which vanishes if v ( ϕ − ( x k )) = 0. Here ∂ F denotes the derivative of F : R → R with respect to the ﬁrst variable.A simple way to include more image information in the similarity measure is toconvolve with a kernel K s , and thus extend ˜ U to U ( ϕ ) = 1 N N X k =1 c K s Z Ω K s ( p + x k , x k ) F ( ϕ.I m ( p + x k ) , I f ( p + x k )) dp with c K s a normalization constant. If K s is a box kernel, this amounts to a ﬁnersampling of both the image and warp, and hence a ﬁner discretization of the Riemannintegral. The kernel K s should not be confused with the RKHS kernel connected tothe norm on V that is used when generating the V -gradient. A Gaussian kernel maybe used for K s , and more information on using smoothing kernels for intensity basedimage registration can be found in [9, 36].The measure U ( ϕ ) is problematic since a variation of ϕ would aﬀect not onlythe point ϕ − ( x k ) but also ϕ.I m ( p + x k ), and U ( ϕ ) will therefore be dependent on ϕ.I m ( p + x k ) for any p where K s ( p, x k ) is non-zero. In this situation, the momentum isno longer concentrated in Dirac measures located at ϕ − t ( x k ), and it will be necessaryto increase the sampling of the warp. However, a ﬁrst-order expansion of ϕ − yieldsthe approximation˜ U ( ϕ ) = 1 N N X k =1 c K s Z Ω K s ( p + x k , x k ) F ( I m ( D x k ϕ − p + ϕ − ( x k )) , I f ( p + x k )) dp . (3.1)The measure ˜ U ( ϕ ) is now again local depending only on ϕ − ( x k ) and the ﬁrst-orderderivatives D x k ϕ − . It oﬀers the stability provided by the convolution with K s ,and, importantly, variations v of ϕ keeping ϕ − ( x k ) constant but changing D x k ϕ − do indeed aﬀect the similarity measure. This implies that ˜ U ( ϕ ) is able to catchrotations and dilations and drive the search for optimal ϕ accordingly. Please notethe diﬀerences with the approach of Durrleman et al. [11]: when using ˜ U ( ϕ ) asoutlined here, the need for ﬂowing the entire moving image forward is removed andthe momentum ﬁeld will stay singular directly thus removing the need for constrainingthe form of the velocity ﬁeld. Thedependence on Dϕ in the similarity measure ˜ U ( ϕ ) raises the question of how torepresent variations of Dϕ in the LDDMM framework. As we will outline here, higher-order momenta appear as the natural choice for such a representation that keeps thebeneﬁts of the ﬁnite control point formulation. Mathematical details will follow inthe next sections.Recall the reproducing property of the RKHS structure, i.e. h K ( · , x ) z, v i V = z ⊗ δ x ( v ) for v ∈ V , x ∈ Ω and z ∈ R d . Let us deﬁne the maps z ⊗ D αx : V → R that extend the Diracs z ⊗ δ x ( v ) by measuring the derivative of v at x . These will bedenoted higher-order Diracs , and we say that the momentum distribution is of higher-order if it is a sum of higher order Diracs. When applying the momentum operator L to the higher-order Diracs, we will get partial derivatives D αx K of the RHKS kernel K . In particular, we will see that when using similarity measures such as ˜ U ( ϕ ), themomentum ﬁeld will be a linear combination of higher-order Diracs and the velocityﬁeld will, correspondingly, be a linear combination of partial derivatives of K . Thiswill imply that the ﬁnite dimensional system of ODE’s (2.4) describing the EPDiﬀequations in the landmark case will be extended so that the velocity v t will containpartial derivatives D αx K . In the ﬁrst-order case, we will get the velocity v ( · ) = N X l =1 (cid:0) K ( · , x l ) z l + d X j =1 D j K ( · , x l ) z jl (cid:1) (3.2)where z i denotes the coeﬃcients of the Dirac measures as in (2.4) but now the ad-ditional vectors z ji denote the coeﬃcients of the ﬁrst-order Diracs z ji ⊗ D jx i for eachof the d dimensions j = 1 , . . . , d . We will later show how these coeﬃcients evolve. (a) The RHKS kernel encodeslocal translation. (b) Ensembles of kernels canapproximate locally aﬃne de-formation. (c) Derivatives of the kerneldirectly encode locally aﬃnedeformation. Fig. 3.1 . The deformation encoded by the kernel: (a) the RHKS kernel, here a Gaussian ofscale 8 in grid units, encodes local translation; (b) locally aﬃne deformation, here expansion, can beapproximated by placing kernels close together. When moving these kernels inﬁnitesimally close, thederivative of the kernel arises in the limit, and (c) the derivative encode locally aﬃne deformationdirectly. With higher-order momenta, we will use derivatives of the kernel as deformation atoms.

Combined with knowledge of how variations of z i and z ji aﬀect the system, we cantransport variational information along the optimal paths speciﬁed by the EPDiﬀequations and thus provide the necessary building blocks for a ﬁrst-order registrationalgorithm.Figure 3.1 illustrates how the local translation encoded by the kernel is com-plemented by locally aﬃne deformation when incorporating ﬁrst-order momenta andcorresponding partial derivatives of the kernel. Using the language of deformationatoms, the ﬁrst-order constructions adds partial derivatives of kernels to the usual setof atoms, and the deformation atoms are thus able to compactly encode expansion,contraction, rotation etc. We can directly interpret the coeﬃcients of the ﬁrst-ordermomenta as controlling the magnitude of these ﬁrst-order deformations. In Figure 7.1in the experiments section, we give additional illustrations of the deformation encodedby the new atoms. In this section, we will derivea registration algorithm for similarity measures incorporating ﬁrst-order informationsuch as ˜ U ( ϕ ). Since the algorithm works for general ﬁrst-order measures, we willagain let U denote the similarity measure with ˜ U ( ϕ ) being just a particular example.There exists various choices of optimization algorithms for LDDMM registration.Roughly, they can be divided into two groups based on whether they represent theinitial momentum/velocity or the entire path ϕ t . Here, we take the approach ofincorporating ﬁrst-order momenta with the shooting method of e.g. Vaillant et al.[31]. The algorithm will take a guess for the initial momentum, integrate the EPDiﬀequations forward, compute the similarity measure gradient ∇ U ( ϕ ), and ﬂow thegradient backwards to provide an improved guess.The registration problem (2.1) consists of both the similarity measure and theminimal path energy E . For e.g. landmark based registration, the similarity U ( ϕ ) ismost often expressed in terms of ϕ directly whether as the similarity measure is usuallydependent on the inverse ϕ − for image registration. In the ﬁrst case, the gradient ∇ ϕ U is known, and, given the initial momentum z , we can obtain the gradient ∇ z U for a gradient descent based optimisation procedure from the backwards transportequations that we derive in Section 6. For the energy part, it is a fundamental propertyof critical paths in the LDDMM framework that the energy stays constant along the0path. Thus, we can easily compute the gradient from the expressions provided inSection 4. Given this, the zeroth order matching algorithm in the initial momentumis generalized to zeroth and ﬁrst-order momenta in Algorithm 1. Algorithm 1

Matching with Zeroth and First-Order Momenta. z ← initial guess for initial momentum repeat Solve EPDiﬀ equations forwardCompute similarity U Solve backwards the transpose equationsCompute the energy gradient ∇k v k Update z from ∇k v k + ∇ z U until convergenceTraditionally, the similarity measure U ( ϕ ) is in image matching formulated usingthe inverse of ϕ , and this approach was taken when formulating the approximation˜ U ( ϕ ) in (3.1). For this reason, at ﬁnite control point formulation is naturally ex-pressed using a sampling { x , . . . , x N } in the target image with the algorithm opti-mizing for the momentum z at time t = 1. The evaluation points ϕ − ( x k ) are thengenerated by ﬂowing backwards from t = 1 to t = 0, and the gradient of U ( ϕ ) canbe computed in ϕ − ( x k ) before being ﬂowed forwards to update z . Algorithm 1 willaccommodate this situation by just reversing the integration directions. The controlpoints can be chosen either at e.g. anatomically important locations, at random, or ona regular grid. In the experiments, we will register expanding ventricles using controlpoints placed in the ventricles. The integration of the ﬂow equations can beperformed with standard Runge-Kutta integrators such as Matlabs ode45 procedure.With zeroth order momenta only and N points, the forward and backwards systemconsist of 2 dN equations. With zeroth and ﬁrst-order momenta, the forward systemis extended to N (2 d + d ) and the backwards system to 2 N ( d + d ). For d = 3, thisimplies an 2 . d = 3 require at least four zeroth-order atoms for each ﬁrst-order atom makingthe size of the approximating system equivalent to the ﬁrst-order system. Due to thenon-linearity of the systems, the eﬀect of the approximation introduced with such asan approach is not presently established.In addition to the increase in the size of the systems, the extra ﬂoating point op-erations necessary for computing the more complicated evolution equations should beconsidered. The additional computational eﬀort should, however, be viewed againstthe fact that the ﬁnite dimensional system can contain orders of magnitude fewercontrol points, and the added capacity of deformation description included in thederivative information. In addition and in contrast to previous approaches, we trans-port the similarity gradient only at the control point trajectories, again an order ofmagnitude reduction of transported information.

4. Higher-Order Momentum Distributions.

We now link partial derivativesof kernels to higher-order momenta using the derivative reproducing property, and weprovide details on the EPDiﬀ evolution equations that we outlined in the previous1section. We underline that the analytical of structure of LDDMM is not changedwhen incorporating higher-order momenta, and the evolution equations will thus beparticular instances of the general EPDiﬀ equations. These equations in Hamiltonianform constitutes the explicit expressions that allows implementation of the registrationalgorithm.We will restrict to scalar kernels when appropriate for simplifying the notation.Scalar kernels are diagonal matrices where all diagonal elements are equal. Thus, wecan consider K ( x, y ) both a matrix and a scalar so that the entries K ji ( x, y ) of thekernel in matrix form equals the scalar K ( x, y ) if and only if i = j and 0 otherwise. Recall the reproducing property ofthe RKHS structure, i.e. h K ( · , x ) z, v i V = z ⊗ δ x ( v ) for v ∈ V , x ∈ Ω and z ∈ R d .Zhou [35] shows that this property holds not only for the kernel but also for its partialderivatives. Letting D αx v denote the derivative of v ∈ V at x ∈ Ω with respect to themulti-index α , D αx v = ∂ | α | ∂ α x . . . ∂ α d x d v ( x )and deﬁning ( D αx Kz )( y ) = D αx ( K ( · , y ) z ) for z ∈ R d , Zhou proves that D αx Kz ∈ V and that the partial derivative reproducing property h D αx Kz, v i V = z T D αx ( v ) (4.1)holds when the maps in V are suﬃciently smooth for the derivatives to exist. Wedenote the maps z ⊗ D αx : V → R deﬁned by z ⊗ D αx ( v ) := z T D αx v higher-orderDiracs , and we say that the momentum distribution is of higher order if it is a sumof higher-order Diracs. It follows that z ⊗ D αx = (cid:0) v

7→ h D αx Kz, v i V (cid:1) ∈ V ∗ . As a consequence, we can connect higher-order momenta and partial derivatives D αx K of the kernel. Recall that the momentum map L : V → V ∗ satisﬁes h Lv, w i L = h v, w i V . With the higher-order momenta, h LD αx Kz, v i L = h D αx Kz, v i V = z ⊗ D αx ( v ) = h z ⊗ D αx , v i L . Thus LD αx Kz = z ⊗ D αx or, shorter, LD αx K = D αx . That is, partial derivatives of thekernel and higher-order momenta corresponds just as the kernels and Diracs measuresin the usual RKHS sense.Consider a map on diﬀeomorphisms U : G V → R e.g. an image similarity measuredependent on ϕ . In a ﬁnite dimensional setting with N evaluation points x k , U woulddecompose as U ( ϕ ) = P ◦ Q ( ϕ ) with Q ( ϕ ) = ( ϕ ( x ) , . . . , ϕ ( x N )) and P : R dN → R . Introducing higher-order momenta, we let Q ( ϕ ) = ( D α x ( ϕ ) , . . . , D α J x N ( ϕ )) with J multi-indices α j , and decompose U as U ( ϕ ) = P ◦ Q ( ϕ ) with P : R dNJ → R . Weallow α j to be empty and hence incorporate the standard zeroth order case. Thepartial derivative reproducing property now lets us compute the V -gradient of U asa sum of partial derivatives of the kernel. Proposition 4.1.

Let ∇ kj P denote the gradient with respect to the variableindexed by D α j x k ( ϕ ) in the expression for Q . Then the gradient ∇ ϕ U ∈ V of U withrespect to the inner product in V is given by ∇ ϕ U = P Nk =1 P Jj =1 D α j x k K ∇ kjQ ( ϕ ) P . Proof . The gradient ∇ ϕ U at ϕ is deﬁned by h∇ ϕ U, v i = ∂ ǫ U ( ǫv + ϕ ) for allvariations v ∈ V . For such v , we get using (4.1) that ∂ ǫ U ( ǫv + ϕ ) = ∂ ǫ P ◦ Q ( ǫv + ϕ ) = ∂ ǫ P ( D α j x k ( ǫv + ϕ )) = ∂ ǫ P ( ǫD α j x k v + D α j x k ϕ )= N X k =1 J X j =1 ( ∇ kjQ ( ϕ ) P ) T D α j x k v = * N X k =1 J X j =1 D α j x k ∇ kjQ ( ϕ ) P, v + V . As a result of Proposition 4.1, the momentumof the gradient of U is L ∇ ϕ U = P Nk =1 P Jj =1 ∇ kjQ ( ϕ ) P ⊗ D α j x k . In general, if v ∈ V isrepresented by a sum of higher-order momenta, the energy k v k V can be computedusing (4.1) as a sum of partial derivatives of the kernel evaluated at the points x k .To keep the notation brief, we restrict to sums of zeroth and ﬁrst-order momenta inthe following. If v ( · ) = P Nk =1 (cid:0) K ( x k , · ) z k + P dj =1 D j K ( x k , · ) z jk (cid:1) , we get the energy k v k V = * N X k =1 (cid:0) K ( x k , · ) z k + d X j =1 D j K ( x k , · ) z jk (cid:1) , N X k =1 (cid:0) K ( x k , · ) z k + d X j =1 D j K ( x k , · ) z jk (cid:1)+ V = N X k,l =1 h K ( x l , · ) z l , K ( x k , · ) z k i V + N X k,l =1 d X j,i =1 D D j K ( x l , · ) z jl , D i K ( x k , · ) z ik E V + 2 N X k,l =1 d X j =1 D D j K ( x l , · ) z jl , K ( x k , · ) z k E V = N X k,l =1 (cid:0) z Tl K ( x l , x k ) z k + d X j,i =1 z i,Tk D i D j K ( x l , x k ) z jl + 2 d X j =1 z Tk D j K ( x l , x k ) z jl (cid:1) (4.2)with D jq K ( · , · ) denoting diﬀerentiation with the respect to the q th variable, q = 1 , j th coordinate, j = 1 , . . . , d . For scalar symmetric kernels, this expression reducesto k v k V = N X k,l =1 (cid:0) z Tl K ( x l , x k ) z k + d X j,i =1 (cid:0) D ∇ K ( x l , x k ) (cid:1) ij z i,Tk z jl + 2 d X j =1 ( ∇ K ( x l , x l )) j z Tk z jl (cid:1) . It is important to note that higher-order momentaoﬀer a convenient representation for the gradients of maps U incorporating derivativeinformation but since the partial derivatives of kernels are members of V and thehigher order momentum in the dual V ∗ , the analytical of structure of LDDMM is notchanged. In particular, the adjoint form of the EPDiﬀ equations, i.e. that optimalpaths v t satisfy v t = Ad Tϕ vt v with v = − ∇ ϕ v U , is still valid. The momentum ρ = Lv is transported to the momentum ρ t by Ad ∗ ϕ vt p . Because( ρ t | w ) = ( ρ | Ad ϕ vt ( w )) = ( ρ | ( Dϕ vt w ) ◦ ( ϕ vt ) − ) , ρ is a sum of higher-order Diracs, ρ t will be sum of higher-order Diracs for all t .However, since the time evolution of ( ρ t | w ) with the above relation involves derivativesof Dϕ vt , this form is inconvenient for computing ρ t . Instead, we make use of theHamiltonian form of the EPDiﬀ equations [34, P. 265]. Here, the momentum ρ t is pulled back to ρ but with a coordinate change of the evaluation vector ﬁeld:the Hamiltonian form µ t is deﬁned by (cid:0) µ t (cid:12)(cid:12) w (cid:1) := (cid:0) ρ (cid:12)(cid:12) ( Dϕ v t ) − ( y ) w ( y ) (cid:1) y where thesubscript stresses that ( Dϕ v t ) − ( y ) w ( y ) is evaluated as a y -dependent vector ﬁeld.To simplify the notation, we write just ϕ t instead of ϕ v t . Using this notation, theevolution equations become ∂ t ϕ t ( y ) = d X j =1 (cid:0) µ t (cid:12)(cid:12) K j ( ϕ t ( x ) , ϕ t ( y )) (cid:1) x e j (cid:0) ∂ t µ t (cid:12)(cid:12) w (cid:1) = − d X j =1 (cid:0) µ t (cid:12)(cid:12)(cid:0) µ t (cid:12)(cid:12) D K j ( ϕ t ( x ) , ϕ t ( y )) w ( y ) (cid:1) x e j (cid:1) y . (4.3)The system forms an ordinary diﬀerential equation describing the evolution of thepath and momentum [34] when ( ρ | w ) does not involve derivatives of w , e.g. when ρ and hence ρ t is a vector ﬁeld z t and the ﬁrst equation therefore is an integral ∂ t ϕ t ( y ) = Z Ω K ( ϕ t ( y ) , ϕ t ( x )) z t ( x ) dx . For the higher-order case, we will need to incorporate additional information in thesystem.Again we restrict to ﬁnite sums of zeroth and ﬁrst-order point measures, and wetherefore work with initial momenta on the form ρ = N X k =1 z ,k ⊗ δ x ,k + N X k =1 d X j =1 z j ,k ⊗ D j δ x ,k (4.4)with x t,r as usual denoting the point positions ϕ t ( x i ) at time t . Then (cid:0) µ t (cid:12)(cid:12) w (cid:1) = (cid:0) ρ (cid:12)(cid:12) Dϕ t ( y ) − w ( y ) (cid:1) y = Z Ω (cid:16) N X k =1 z ,k ⊗ δ x ,k + N X k =1 d X j =1 z j ,k ⊗ D j δ x ,k (cid:17) Dϕ t ( y ) − w ( y ) dy = N X k =1 (cid:0)(cid:0) Dϕ t ( x ,k ) − ,T z ,k + d X j =1 (cid:0) D j Dϕ t ( x ,k ) − (cid:1) T z j ,k (cid:1) ⊗ δ x ,k (cid:12)(cid:12) w (cid:1) + N X k =1 d X j =1 (cid:0) Dϕ t ( x ,k ) − ,T z j ,k ⊗ D j δ x ,k (cid:12)(cid:12) w (cid:1) showing that µ t = P Nk =1 µ t,k ⊗ δ x ,k + P Nk =1 P dj =1 µ jt,k ⊗ D j δ x ,k with µ t,k = Dϕ t ( x ,k ) − ,T z ,k + d X j =1 (cid:0) D j Dϕ t ( x ,k ) − (cid:1) T z j ,k µ jt,k = Dϕ t ( x ,k ) − ,T z j ,k . (4.5)4The momentum ρ t can the be recovered as (cid:0) ρ t (cid:12)(cid:12) w (cid:1) = (cid:0) µ t (cid:12)(cid:12) w ◦ ϕ t (cid:1) = (cid:0) N X k =1 µ t,k ⊗ δ x ,k + N X k =1 d X j =1 µ jt,k ⊗ D j δ x ,k (cid:1) w ◦ ϕ t = N X k =1 µ t,k ⊗ δ x t,k w + N X k =1 d X j =1 µ j,Tt,k Dw ( D j ϕ t )( x ,k )= N X k =1 µ t,k ⊗ δ x t,k w + N X k =1 d X j =1 (cid:0) d X i =1 ( D i ϕ t )( x ,k ) j µ it,k (cid:1) ⊗ D j δ x t,k w and hence the coeﬃcients of the momentum z t,k and z jt,k (confer (4.4)) are given by z t,k = µ t,k and z jt,k = P di =1 ( D i ϕ t )( x ,k ) j µ it,k . We note that both z jt,k and µ jt,k arecoordinate vectors of the ﬁrst-order parts of the momentum in ordinary and Hamil-tonian form respectively. For each point k and time t , these coordinate vectors thusrepresent two d × d tensors. Even though µ t,k in (4.5) depend on the second orderderivative of ϕ , we will show that the complete evolution in the zeroth and ﬁrst-ordercase can be determined by solving for the points ϕ t ( x k, ), the matrices Dϕ t ( x k, ),and the vectors µ t,k . This will provide the computational representation we will usewhen implementing the systems.Using (4.3), ϕ t evolves according to ∂ t ϕ t ( y ) = d X i =1 Z Ω N X k =1 (cid:0) µ Tt,k ⊗ δ x ,k + d X j =1 µ jt,k ⊗ D j δ x ,k (cid:1) K i ( ϕ t ( x ) , ϕ t ( y )) dx e i = d X i =1 N X k =1 (cid:0) µ Tt,k K i ( ϕ t ( x ,k ) , ϕ t ( y )) + d X j =1 µ j,Tt,k D K i ( ϕ t ( x ,k ) , ϕ t ( y )) D j ϕ t ( x ,k ) (cid:1) e i . With scalar kernels, the trajectories x t,k are given by ∂ t ϕ t ( x ,k ) = N X l =1 (cid:0) K ( ϕ t ( x ,l ) , ϕ t ( x ,k )) µ t,l + d X j =1 ∇ K ( ϕ t ( x ,l ) , ϕ t ( x ,k )) T D j ϕ t ( x ,l ) µ jt,l (cid:1) . It is shown in [34] that the evolution of the matrix Dϕ t ( x k, ) is governed by ∂ t Dϕ t ( y ) a = d X i =1 (cid:0) µ t (cid:12)(cid:12) D K i ( ϕ t ( x ) , ϕ t ( y )) Dϕ t ( y ) a (cid:1) x e i . Inserting the Hamiltonian form of the higher-order momentum, each component ( r, c )5( r ow/ c olumn) of the matrix Dϕ t ( y ) thus evolves according to ∂ t Dϕ t ( y ) cr = (cid:0) µ t (cid:12)(cid:12) D K r ( ϕ t ( x ) , ϕ t ( y )) Dϕ t ( y ) e c (cid:1) x = Z Ω N X k =1 (cid:0) µ t,k ⊗ δ x ,k + d X j =1 µ jt,k ⊗ D j δ x ,k (cid:1) D K r ( ϕ t ( x ) , ϕ t ( y )) Dϕ t ( y ) e c dx = N X k =1 µ Tt,k D K r ( ϕ t ( x ,k ) , ϕ t ( y )) Dϕ t ( y ) e c + N X k =1 d X j =1 µ j,Tt,k (cid:0) d X i =1 (cid:0) D i D K r ( ϕ t ( x ,k ) , ϕ t ( y )) (cid:1)(cid:0) D j ϕ t ( x ,k ) (cid:1) i (cid:1) Dϕ t ( y ) e c . With scalar kernels, the evolution at the trajectories is then ∂ t Dϕ t ( x ,k ) c = N X l =1 (cid:16) ∇ K ( ϕ t ( x ,l ) , ϕ t ( x ,k )) T D c ϕ t ( x ,k ) µ t,l + d X j =1 (cid:0) D ∇ K ( ϕ t ( x ,l ) , ϕ t ( x ,k )) D j ϕ t ( x ,l ) (cid:1) T D c ϕ t ( x ,k ) µ jt,l (cid:17) . The complete derivation of the evolution of µ t is notationally heavy and can befound in the supplementary material for the paper. Combining the evolution of µ t with the expressions above, we arrive at the following result: Proposition 4.2.

The EPDiﬀ equations in the scalar case with zeroth and ﬁrst-order momenta are given in Hamiltonian form by the system ∂ t ϕ t ( x ,k ) = N X l =1 (cid:0) K ( x t,l , x t,k ) µ t,l + d X j =1 ∇ K ( x t,l , x t,k ) T D j ϕ t ( x ,l ) µ jt,l (cid:1) ∂ t Dϕ t ( x ,k ) c = N X l =1 (cid:16) ∇ K ( x t,l , x t,k ) T D c ϕ t ( x ,k ) µ t,l + d X j =1 (cid:0) D ∇ K ( x t,l , x t,k ) D j ϕ t ( x ,l ) (cid:1) T D c ϕ t ( x ,k ) µ jt,l (cid:17) ∂ t µ t,k = − N X l =1 (cid:16)(cid:0) µ Tt,k µ t,l (cid:1) ∇ K ( x t,l , x t,k )+ d X j =1 (cid:0) µ j,Tt,k µ t,l (cid:1) D ∇ K ( x t,l , x t,k ) D j ϕ t ( x ,k )+ d X j =1 (cid:0) µ Tt,k µ jt,l (cid:1) D ∇ K ( x t,l , x t,k ) D j ϕ t ( x ,l )+ d X j,j ′ =1 (cid:0) µ j ′ ,Tt,k µ jt,l (cid:1) D (cid:0) D ∇ K ( x t,l , x t,k ) D j ϕ t ( x ,l ) (cid:1) D j ′ ϕ t ( x ,k ) (cid:17) µ jt,k = Dϕ t ( x ,k ) − ,T z j ,k . (4.6)Note that both x ,k = ϕ v ( x ,k ) and Dϕ v ( x ,k ) are provided by the systemand hence can be used to evaluate a similarity measure that incorporates ﬁrst-order6information. As in the zeroth order case, the entire evolution can be recovered by theinitial conditions for the momentum.

5. Locally Aﬃne Transformations.

The Polyaﬃne and Log-Euclidean Poly-aﬃne [3, 1] frameworks model locally aﬃne transformations using matrix logarithms.The higher-order momenta and partial derivatives of kernels can be seen as the LD-DMM sibling of the Polyaﬃne methods, and diﬀeomorphism paths generated byhigher-order momenta, in particular, momenta of zeroth and ﬁrst-order, can locallyapproximate all aﬃne transformations with linear component having positive deter-minant. The approximation will depend only on how fast the kernel approaches zerotowards inﬁnity. The manifold structure of G V provides this result immediately. In-deed, let ϕ ( x ) = Ax + b be an aﬃne transformation with det( A ) >

0. We deﬁne apath ϕ t of ﬁnite energy such that ϕ ≈ ϕ which shows that ϕ ∈ G V and can bereached in the framework. The matrices of positive determinant is path connectedso we can let ψ t be a path from Id d to A and deﬁne ˜ ψ t ( x ) = ψ t x + bt . Then with˜ v t ( x ) = ( ∂ t ψ t ) ˜ ψ − t ( x ) + b , we have ∂ t ˜ ψ t ( x ) = ( ∂ t ψ t ) x + b = ˜ v t ◦ ˜ ψ t ( x ) and x + Z ˜ v t ◦ ˜ ψ t ( x ) dt = x + Z ( ∂ t ψ t ) x + bdt = ϕ ( x ) . Now use that ( ∂ t ψ t ) ˜ ψ − t ( x ) = ( ∂ t ψ t )( ψ t ) − ( x − bt ) and let the M t = ( m ,t . . . m d,t ) bethe t -dependent matrix ( ∂ t ψ t )( ψ t ) − so that the ﬁrst term of ˜ v t ( x ) equals M t ( x − bt ).Then choose a radial kernel, e.g. a Gaussian K σ , and deﬁne the approximation v t of˜ v t by v t ( x ) = d X j =1 D j ˜ ψ t (0) K σ ( x ) m j,t + K σ ( ˜ ψ t (0) , x ) b . (5.1)The path ϕ v generated by v t then has ﬁnite energy, and ϕ v ( x ) = x + Z v t ◦ ϕ v t ( x ) dt ≈ ϕ ( x )with the approximation depending only on the kernel scale σ . Note that the aﬃnetransformations with linear components having negative determinant can in a similarway be reached by starting the integration at a diﬀeomorphism with negative Jacobiandeterminant.In the experiments section, we will illustrate the locally aﬃne transformationsencoded by zeroth and ﬁrst-order momenta, and, therefore, it will be useful to intro-duce a notation for these momenta. We encode the translational part of either themomentum or velocity using the notationTsl x ( b ) = K σ ( x, · ) b and the linear part by Lin x ( M ) = d X j =1 D jx K σ ( · ) m j with m , m j being the columns of the matrix M . Equation (5.1) can then be written v t ( x ) = Lin ˜ ψ t (0) ( M t ) + Tsl ˜ ψ t (0) ( b ) . (5.2)7We emphasize that though we mainly focus on zeroth and ﬁrst-order momenta, themathematical construction allows any order momenta permitted by the smoothnessof the kernel at order zero.

6. Variations of the Initial Conditions.

In Algorithm 1, we used the varia-tion of the EPDiﬀ equations when varying the initial conditions and in particular thebackwards gradient transport. We discuss both issues here.A variation δρ of the initial momentum will induce a variation of the system(4.6). By diﬀerentiating that system, we get the time evolution of the variation. Toease notation, we assume the kernel is scalar on the form K ( x, y ) = γ ( | x − y | ) andwrite γ t,lk = K ( x t,l , x t,k ). Variations of the kernel and kernel derivatives such as theentity δ ∇ K ( x t,l , x t,k ) below depend only on the variation of point trajectories δx t,l and δx t,k . The full expressions for these parts are provided in supplementary materialfor the paper. The variation of the point trajectories in the derived system then takesthe form ∂ t δϕ t ( x ,k ) = N X l =1 (cid:0) δK ( x t,l , x t,k ) µ t,l + γ t,lk δµ t,l (cid:1) + N X l =1 d X j =1 (cid:0) δ ∇ K ( x t,l , x t,k ) T D j ϕ t ( x ,l ) µ jt,l + ∇ K ( x t,l , x t,k ) T δD j ϕ t ( x ,l ) µ jt,l + ∇ K ( x t,l , x t,k ) T D j ϕ t ( x ,l ) δµ jt,l (cid:1) The similar expressions for the evolution of δµ t,k and δDϕ t ( x ,k ) are provided in thesupplementary material. The variation of µ jt,k is available as δµ jt,k = − (cid:0) Dϕ t ( x ,k ) − δDϕ t ( x ,k ) Dϕ t ( x ,k ) − (cid:1) T z j ,k + Dϕ t ( x ,k ) − ,T δz j ,k . However, when computing the backwards transport, we will need to remove the de-pendency on δz j ,k which is only available for forward integration. Instead, by writingthe evolution of µ jt,k in the form ∂ t µ jt,k = ∂ t Dϕ t ( x ,k ) − ,T z j ,k = − (cid:0) Dϕ t ( x ,k ) − ∂ t Dϕ t ( x ,k ) Dϕ t ( x ,k ) − (cid:1) T z j ,k = − Dϕ t ( x ,k ) − ,T ∂ t Dϕ t ( x ,k ) T µ jt,k , we get the variation ∂ t δµ jt,k = − δDϕ t ( x ,k ) − ,T ∂ t Dϕ t ( x ,k ) T µ jt,k − Dϕ t ( x ,k ) − ,T ∂ t δDϕ t ( x ,k ) T µ jt,k − Dϕ t ( x ,k ) − ,T ∂ t Dϕ t ( x ,k ) T δµ jt,k . The correspondence between initial momentum ρ and end diﬀeomorphism ϕ v asserted by the EPDiﬀ equations allows us to view thesimilarity measure U ( ϕ v ) as a function of ρ . Let A denote the result of integratingthe system for the variation of the initial conditions from t = 0 to t = 1 such that w = Aδρ ∈ V for a variation δρ . We then get a corresponding variation δU in thesimilarity measure. To compute the gradient of U as a function of ρ , we have δU ( ϕ v ) = (cid:10) ∇ ϕ v U, w (cid:11) V = (cid:10) ∇ ϕ v U, Aδρ (cid:11) V = (cid:10) A T ∇ ϕ v U, δρ (cid:11) V ∗ . The subscript notation is used in accordance with [34]. Please note that γ t,lk contains three separate indices, i.e. the time t and the point indices l and k . V ∗ -gradient of ∇ ρ U is given by A T ∇ ϕ v U . The gradient can equivalentlybe computed in momentum space at both endpoints of the diﬀeomorphism path usingthe map P deﬁned in Proposition 4.1.The complete system for the variation of the initial conditions is a linear ODE,and, therefore, there exists a time-dependent matrix M t such that the ODE ∂ t y t = M t y t has the variation as a solution y t . It is shown in [34] that, in such cases, solving thebackwards transpose system ∂ t w t = − M Tt w t (6.1)from t = 1 to t = 0 provides the value of A T w . Therefore, we can obtain ∇ ρ U bysolving the transpose system backwards. The components of M t can be identiﬁedby writing the evolution equations for the variation in matrix form. This provides M Tt and allows the backwards integration of the system 6.1. The components of thetranspose matrix M t are provided in the supplementary material for the paper.

7. Experiments.

In order to demonstrate the eﬃciency, compactness, and in-terpretability of representations using higher-order momenta, we perform four sets ofexperiments. First, we provide four examples illustrating the type of deformationsproduced by zeroth and ﬁrst-order momenta and the relation to the Polyaﬃne frame-work. We then use point based matching using ﬁrst-order information to show howcomplicated warps that would require many parameters with zeroth order deforma-tion atoms can be generated with very compact representations using higher-ordermomenta. We underline the point that higher-order momenta allow low-dimensionaltransformations to be registered using correspondingly low-dimensional representa-tions: we show how synthetic test images generated by a low-dimensional transfor-mation can be registered using only one deformation atom when representing usingﬁrst-order momenta and using the ﬁrst-order similarity measure approximation (3.1).We further emphasize this point by registering articulated movement using only onedeformation atom per rigid part, and thus exemplify a natural representation thatreduces the number of deformation atoms and the ambiguity in the placement of theatoms while also reducing the degrees of freedom in the representation. Finally, weillustrate how higher-order momenta in a natural way allow registration of humanbrains with progressing atrophy. We describe the deformation ﬁeld throughout theventricles using few deformation atoms, and we thereby suggest a method for detectinganatomical change using few degrees of freedom. In addition, the volume expansioncan be directly interpreted from the parameters of the deformation atoms. We startby brieﬂy describing the similarity measures used throughout the experiments.For the point examples below, we register moving points x , . . . , x N against ﬁxedpoints y , . . . , y N . In addition, we match ﬁrst-order information by specifying valuesof D jx k ϕ . This is done compactly by providing matrices Y k so that we seek D x k ϕ = Y k for all k = 1 , . . . , N . The similarity measure is simple sum of squares, i.e. U ( ϕ ) = N X i =1 k ϕ ( x k ) − y k k + k D x k ϕ − Y k k using the matrix 2-norm. This amounts to ﬁtting ϕ against a locally aﬃne map withtranslational components y k and linear components Y k . For the image cases, we use L -similarity to build the ﬁrst-order approximation (3.1) with the smoothing kernel K s being Gaussian of the same scale as the LDDMM kernel.9 (a) Expansion (b) Contraction(c) Rotation ( − π/

2) (d) Two rotations ( π/ Fig. 7.1 . The eﬀect of the generated deformation on an initially square grid for several initialﬁrst-order momenta: Using the notation of section 5, (a) expansion ρ = Lin (Id ) ; (b) contraction ρ = Lin ( − Id ) ; (c) rotation ρ = Lin (Rot( v )) , v = − π/ ; (d) two rotations v = π/ . The kernelis Gaussian with σ = 8 in grid units, and the grids are colored with the trace of Cauchy-Green straintensor (log-scale). Notice the locality of the deformation caused by the ﬁnite scale of the kernel, andthat the deformation stays diﬀeomorphic even when two rotations force conﬂicting movements. To visually illustrate the deformation gener-ated by higher-order momenta, we show in Figure 7.1 the generated deformationson an initially square grid with four diﬀerent ﬁrst-order initial momenta. The de-formation locally model the linear part of aﬃne transformations and the the localityis determined by the Gaussian kernel that in the examples has scale σ = 8 in gridunits. Notice for the rotations that the deformation stays diﬀeomorphic in the pres-ence of conﬂicting forces. The similarity between the examples and the deformationsgenerated in the Polyaﬃne framework [1] underlines the viewpoint that the registra-tion using higher-order momenta constitutes the LDDMM sibling of the Polyaﬃneframework. Figure 7.2 presents simple point basedmatching results with ﬁrst-order information. The lower points (red) are matchedagainst the upper points (black) with match against expansion D ϕ ( x k ) = 2Id androtation D ϕ ( x k ) = Rot( v ) = (cid:18) cos( v ) , sin( v ) − sin( v ) , cos( v ) (cid:19) for v = ∓ π/

2. The optimal dif-feomorphisms exhibit the expected expanding and turning eﬀect, respectively. Westress that the deformations are generated using only two deformations atoms with0 (a) Match with dilations (expansion) (b) Match with rotations ( − π/ π/ Fig. 7.2 . Two moving points (red) are matched against two ﬁxed points (black) with results(green) and with match against (a) expansion D ϕ ( x k ) = 2Id , i = 1 , ; and (b) rotation D ϕ ( x k ) =Rot( v ) , v = ∓ π/ , i = 1 , . The kernel is Gaussian with σ = 8 in grid units, and the grids arecolored with the trace of Cauchy-Green strain tensor (log-scale). combined 12 parameters. Representing equivalent deformation using zeroth ordermomenta would require a signiﬁcantly increased number of atoms and a correspondincrease in the number of parameters. We now exemplify how higher-order momenta allow low-dimensional transformations to be registered using corre-spondingly low-dimensional representations. We generate two test images by applyingtwo linear transformations, an dilation and a rotation, to a binary image of a square,confer the moving images (a) and (e) in Figure 7.3. By placing one deformation atomin the center of each ﬁxed image and by using the similarity measure approxima-tion (3.1), we can successfully register the moving and ﬁxed images. The result anddiﬀerence plots are shown in Figure 7.3. The dimensionality of the linear transfor-mations generating the moving images is equal to the number of parameters for thedeformation atom. A registration using zeroth order momenta would need more thanone deformation atom which would result in a number of parameters larger than thedimensionality. The scale of the Gaussian kernel used for the registration is 50 pixels.

The articulated motion of the ﬁnger in Figure 7.4(a) and (b) can be described by three locally linear transformations. With higher-order momenta, we can place deformation atoms at the center of the bones in themoving and ﬁxed images, and use the point positions together with the direction ofthe bones to drive a registration. This natural and low dimensional representationallows a fairly good match of the images resembling the use of the Polyaﬃne aﬃneframework for articulated registration [26]. A similar registration using zeroth ordermomenta would need two deformation atoms per bone and lacking a natural wayto place such atoms, the positions would need to be optimized. With higher-ordermomenta, the deformation atoms can be placed in a natural and consistent way, and X-ray frames from

10 20 30 40 50 60 70 801020304050607080 (a) Moving image

10 20 30 40 50 60 70 801020304050607080 (b) Fixed image

10 20 30 40 50 60 70 801020304050607080 (c) Registration result

10 20 30 40 50 60 70 801020304050607080 (d) Diﬀerence

10 20 30 40 50 60 70 801020304050607080 (e) Moving image

10 20 30 40 50 60 70 801020304050607080 (f) Fixed image

10 20 30 40 50 60 70 801020304050607080 (g) Registration result

10 20 30 40 50 60 70 801020304050607080 (h) Diﬀerence

Fig. 7.3 . With linear transformations, the dimensionality of the higher-order representationmatches the dimensionality of the transformation. A dilation (e) and rotation (d) is applied to theﬁxed binary images (b) and (f), respectively. The registration results (c) and (g) subtracted fromthe ﬁxed images are shown in the diﬀerence pictures (d) and (h). The registration is performedwith a single ﬁrst-order momenta in the center of the pictures, and the number of parameters forthe registration thus matches the dimensionality of the linear representations. The slight diﬀerencesbetween results and ﬁxed images are caused by the ﬁrst-order approximation in (3.1) . Increasing thekernel size, adding more control points, or using second order momenta would imply less diﬀerence.

20 40 60 80 100 120102030405060708090100 (a) Moving

20 40 60 80 100 120102030405060708090100 (b) Fixed

20 40 60 80 100 120102030405060708090100 (c) Result, ﬁrst-order

Fig. 7.4 . Registering articulated movement using directional information of the bones: thelandmarks and bone orientations (red points and arrows) in the moving image (a) are matchedagainst the landmarks and bone orientations (green points and arrows) in the ﬁxed image (b). Theresult using ﬁrst-order momenta (c) can be obtained with a low number of deformation atoms thatcan be consistently placed at the center of the bones. A corresponding zeroth order representationwould use a higher number of atoms with a corresponding increase in the number of parameters. the total number of free parameters is lower than a zeroth order representation usingtwo atoms per bone.

Atrophy occurs in the human brain among patientssuﬀering from Alzheimer’s disease, and the progressing atrophy can be detected bythe expansion of the ventricles [19, 13]. Since ﬁrst-order momenta oﬀer compactdescription of expansion, this makes a parametrization of the registration based onhigher-order momenta suited for describing the expansion of the ventricles, and, in2addition, the deformation represented by the momenta will be easily interpretable. Inthis experiment, we therefore suggest a registration method that using few degrees offreedom describes the expansion of the ventricles, and does so in a way that can beinterpreted when doing further analysis of e.g. the volume change.We use the publicly available Oasis dataset [20], and, in order to illustrate theuse of higher-order momenta, we select a small number of patients from which twobaseline scans are acquired at the same day together with a later follow up scan. Thepatients are in various stages of dementia, and, for each patient, we rigidly registerthe two baseline and one follow up scan [9].The expanding ventricles can be registered by placing deformation atoms in thecenter of the ventricles of the ﬁxed image as shown in Figure 1.1. For each patient,we manually place ﬁve deformation atoms in the ventricle area of the ﬁrst baseline 3Dvolume. It is important to note that though we localize the description of the defor-mation to the deformation atoms, the atoms control the deformation ﬁeld throughoutthe ventricle area. Based on the size of the ventricles, we use 3D Gaussian kernelswith a scale of 15 voxels, and we let the regularization weight in (2.1) be λ = 16.The eﬀect of these choices is discussed below. Each deformation atom consists of azeroth and ﬁrst-order momenta. We use L similarity to drive the registration [9] and, for each patient, we perform two registrations: we register the two baseline scansacquired at the same day, and we register one baseline scan against the follow up scan.Thus, the baseline-baseline registration should indicate no ventricle expansion, and weexpect the baseline-follow up registration to indicate ventricle expansion. Figure 1.1shows for one patient the placement of the control points in the baseline image, thefollow up image, the log-Jacobian determinant in the ventricle area of the generateddeformation, and the initial vector ﬁeld driving the registration.The use of ﬁrst-order momenta allows us to interpret the result of the registrationsand to relate the results to possible expansion of the ventricles. The volume change isindicated by the Jacobian determinant of the generated deformation at the deforma-tion atoms as well as by the divergence of the ﬁrst-order momenta. The latter is avail-able directly from the registration parameters. We plot in Figure 7.5 the logarithm ofthe Jacobian determinant and the divergence for both the same day baseline-baselineregistrations and for the baseline-follow up registrations. Patient 1 − See also http://image.diku.dk/darkner/LOI . a v e r age l og − J a c ob i an de t e r m i nan t non−dementeddementedbaseline−baselinebaseline−follow up (a) The average log-Jacobian of the ﬁnal de-formation at the 5 deformation atoms for thebaseline-baseline and baseline-follow up regis-trations a v e r age de f o r m a t i on a t o m d i v e r gen c e non−dementeddementedbaseline−baselinebaseline−follow up (b) The average divergence at the deformationatoms for the baseline-baseline and baseline-follow up registrations Fig. 7.5 . Indicated volume change: (a) The average log -Jacobian determinant of the generateddeformation at the 5 deformation atoms for six patients (1-4 demented, 5-6 non-demented); (b)divergence of the 5 higher-order momenta representing the deformation. The divergence can beextracted directly from the parameters of the ﬁrst-order momenta, and the correlation between the log -Jacobian and the divergence as seen by the similarity between (a) and (b) therefore shows theinterpretability of the deformation atoms. The time-span between baseline and follow up scans are1.5-2 years with the exception of 3 years for patient four (arrows). term in (2.1) will aﬀect the amount of expansion captured in the registration. Becauseof the low number of control points, we can in practice set the contribution of theregularization term to zero without experiencing non-diﬀeomorphic results. It will beinteresting in the future to estimate the actual volume expansion directly using theparameters of the deformation atoms with this less biased model.

8. Conclusion and Outlook.

We have introduced higher-order momenta in theLDDMM registration framework. The momenta allow compact representation of lo-cally aﬃne transformations by increasing the capacity of the deformation description.Coupled with similarity measures incorporating ﬁrst-order information, the higher-order momenta improve the range of deformations reached by sparsely discretizedLDDMM methods, and they allow direct capture of ﬁrst-order information such asexpansion and contraction. In addition, the constitute deformation atoms for whichthe generated deformation is directly interpretable.We have shown how the partial derivative reproducing property implies singularmomentum for the higher-order momenta, and we used this to derive the EPDiﬀevolution equations. By computing the forward and backward variational equations,we are able to transport gradient information and derive a matching algorithm. Weprovide examples showing typical deformation coded by ﬁrst-order momenta and howimages can be registered using a very few parameters, and we have applied the methodto register human brains with progressing atrophy.The experiments included here show only a ﬁrst step in the application of higher-order momenta: the representation may be applied to register entire images; mergingthe method with multi-scale approaches will increase the description capacity and maylead to further reduction in the dimensionality of the representation. Combined witheﬃcient implementations, higher-order momenta promise to provide a step forward incompact deformation description for image registration.4

REFERENCES[1]

Vincent Arsigny, Olivier Commowick, Nicholas Ayache, and Xavier Pennec , A fast andLog-Euclidean polyaﬃne framework for locally linear registration , J. Math. Imaging Vis.,33 (2009), pp. 222–238.[2]

Vincent Arsigny, Olivier Commowick, Xavier Pennec, and Nicholas Ayache , A Log-Euclidean framework for statistics on diﬀeomorphisms , in MICCAI 2006, 2006, pp. 924–931.[3]

Vincent Arsigny, Xavier Pennec, and Nicholas Ayache , Polyrigid and polyaﬃne trans-formations: A novel geometrical tool to deal with non-rigid deformations – application tothe registration of histological slices , Medical Image Analysis, 9 (2005), pp. 507–523.[4]

M. Faisal Beg, Michael I. Miller, Alain Trouv´e, and Laurent Younes , Computing largedeformation metric mappings via geodesic ﬂows of diﬀeomorphisms , IJCV, 61 (2005),pp. 139–157.[5]

F. L. Bookstein , Linear methods for nonlinear maps: Procrustes ﬁts, thin-plate splines,and the biometric analysis of shape variability , in Brain warping, Academic Press, 1999,pp. 157–181.[6]

Yan Cao, M. I Miller, R. L Winslow, and L. Younes , Large deformation diﬀeomorphic met-ric mapping of vector ﬁelds , IEEE Transactions on Medical Imaging, 24 (2005), pp. 1216–1230.[7]

GE Christensen, RD Rabbitt, and MI Miller , Deformable templates using large deforma-tion kinematics , Image Processing, IEEE Transactions on, 5 (2002).[8]

Colin J Cotter and Darryl D Holm , Singular solutions, momentum maps and computa-tional anatomy , nlin/0605020, (2006).[9]

Sune Darkner and Jon Sporring , Generalized partial volume: An inferior density estima-tor to parzen windows for normalized mutual information , in Information Processing inMedical Imaging, vol. 6801, Springer, 2011, pp. 436–447.[10]

Paul Dupuis, Ulf Grenander, and Michael I Miller , Variational problems on ﬂows ofdiﬀeomorphisms for image matching , (1998).[11]

Stanley Durrleman, Marcel Prastawa, Guido Gerig, and Sarang Joshi , Optimal data-driven sparse parameterization of diﬀeomorphisms for population analysis , Proc. of theInformation Processing in Medical Imaging Conference, IPMI, 22 (2011), pp. 123–134.[12]

Gregory E. Fasshauer and Qi Ye , Reproducing kernels of generalized sobolev spaces via agreen function approach with distributional operators , Numerische Mathematik, (2011).[13]

N. C. Fox, E. K. Warrington, P. A. Freeborough, P. Hartikainen, A. M. Kennedy,J. M. Stevens, and M. N. Rossor , Presymptomatic hippocampal atrophy in alzheimer’sdisease , Brain, 119 (1996), pp. 2001 –2007.[14]

Laurent Garcin , Thechniques de Mise en Correspondance et D´etection de Changements , PhDthesis, 2005.[15]

Joan Glaun`es , Transport par diﬀ´eomorphismes de points, de mesures et de courants pour lacomparaison de formes et l’anatomie num´erique , PhD thesis, Universit´e Paris 13, Villeta-neuse, France, 2005.[16]

Ulf Grenander , General Pattern Theory: A Mathematical Study of Regular Structures , Ox-ford University Press, USA, Feb. 1994.[17]

Monica Hernandez, Matias Bossa, and Salvador Olmos , Registration of anatomical imagesusing paths of diﬀeomorphisms parameterized with stationary vector ﬁeld ﬂows , Interna-tional Journal of Computer Vision, 85 (2009), pp. 291–306.[18]

William M. Wells III, Paul Viola, Hideki Atsumi, Shin Nakajima, and Ron Kikinis , Multi-Modal volume registration by maximization of mutual information , Medical ImageAnalysis, 1 (1996), pp. 35 – 51.[19]

Clifford R. Jack, Ronald C. Petersen, Yue Cheng Xu, Stephen C. Waring, Peter C.O’Brien, Eric G. Tangalos, Glenn E. Smith, Robert J. Ivnik, and Emre Kokmen , Medial temporal atrophy on MRI in normal aging and very mild alzheimer’s disease , Neu-rology, 49 (1997), pp. 786 –794.[20]

Daniel S Marcus, Anthony F Fotenos, John G Csernansky, John C Morris, andRandy L Buckner , Open access series of imaging studies: longitudinal MRI data innondemented and demented older adults , Journal of Cognitive Neuroscience, 22 (2010),pp. 2677–2684. PMID: 19929323.[21]

X. Pennec, R. Stefanescu, V. Arsigny, P. Fillard, and N. Ayache , Riemannian elasticity:A statistical regularization framework for non-linear registration , in MICCAI 2005, 2005,pp. 943–950.[22]

J. P.W Pluim, J. B.A Maintz, and M. A Viergever , Image registration by maximization of combined mutual information and gradient information , IEEE Transactions on MedicalImaging, 19 (2000), pp. 809–814.[23] Alexis Roche, Gr´egoire Malandain, Xavier Pennec, and Nicholas Ayache , The correla-tion ratio as a new similarity measure for multimodal image registration , in Proceedings ofthe First International Conference on Medical Image Computing and Computer-AssistedIntervention, MICCAI ’98, Springer-Verlag, 1998, p. 1115–1124. ACM ID: 709612.[24]

Alexis Roche, Xavier Pennec, Gr´egoire Malandain, and Nicholas Ayache , Rigid reg-istration of 3D ultrasound with MR images: a new approach combining intensity andgradient information , IEEE Transactions on Medical Imaging, 20 (2001), pp. 1038—1049.[25]

D Rueckert, L I Sonoda, C Hayes, D L Hill, M O Leach, and D J Hawkes , Nonrigid regis-tration using free-form deformations: application to breast MR images , IEEE Transactionson Medical Imaging, 18 (1999), pp. 712–721. PMID: 10534053.[26]

Christof Seiler, Xavier Pennec, and Mauricio Reyes , Geometry-Aware multiscale imageregistration via OBBTree-Based polyaﬃne Log-Demons , in Medical Image Computing andComputer-Assisted Intervention - MICCAI, Toronto, Canada, 2011.[27]

Stefan Sommer, Mads Nielsen, Francois Lauze, and Xavier Pennec , A Multi-Scale kernelbundle for LDDMM: towards sparse deformation description across space and scales , inIPMI 2011, Springer, 2011.[28]

Stefan Sommer, M. Nielsen, and X. Pennec , Sparse multi-scale diﬀeomorphic registration:the kernel bundle framework , To appear in Journal of Mathematical Imaging and Vision.[29]

J.-P. Thirion , Image matching as a diﬀusion process: an analogy with maxwell’s demons ,Medical Image Analysis, 2 (1998), pp. 243–260.[30]

Alain Trouv´e , An inﬁnite dimensional group approach for physics based models in patternsrecognition , 1995.[31]

M. Vaillant, M.I. Miller, L. Younes, and A. Trouv´e , Statistics on diﬀeomorphisms viatangent space representations , NeuroImage, 23 (2004), pp. S161–S169.[32]

Tom Vercauteren, Xavier Pennec, Aymeric Perchant, and Nicholas Ayache , Diﬀeo-morphic demons: eﬃcient non-parametric image registration , NeuroImage, 45 (2009),pp. 61–72.[33]

Younes , Constrained diﬀeomorphic shape evolution , submitted to foundations Comp Math,(2011).[34]

Laurent Younes , Shapes and Diﬀeomorphisms , Springer, 2010.[35]

Ding-Xuan Zhou , Derivative reproducing properties for kernel methods in learning theory ,Journal of Computational and Applied Mathematics, 220 (2008), pp. 456–463.[36]