A second order equation for Schrödinger bridges with applications to the hot gas experiment and entropic transportation cost
GIOVANNI CONFORTI

ABSTRACT. The Schrödinger problem is obtained by replacing the mean square distance with the relative entropy in the Monge-Kantorovich problem. It was first addressed by Schrödinger as the problem of describing the most likely evolution of a large number of Brownian particles conditioned to reach an "unexpected configuration". Its optimal value, the entropic transportation cost, and its optimal solution, the Schrödinger bridge, stand as the natural probabilistic counterparts to the transportation cost and displacement interpolation. Moreover, they provide a natural way of lifting the concept of Brownian bridge from the point setting to the measure setting. In this article, we prove that the Schrödinger bridge solves a second order equation in the Riemannian structure of optimal transport. Roughly speaking, the equation says that its acceleration is the gradient of the Fisher information. Using this result, we obtain a fine quantitative description of the dynamics, and a new functional inequality for the entropic transportation cost that generalizes Talagrand's transportation inequality. Finally, we study the convexity of the Fisher information along Schrödinger bridges, under the hypothesis that the associated reciprocal characteristic is convex. The techniques developed in this article are also well suited to study the Feynman-Kac penalisations of Brownian motion.
1. INTRODUCTION AND STATEMENT OF THE MAIN RESULTS
The Schrödinger problem (SP) is the problem of finding the best approximation, in the relative entropy sense, of a stationary dynamics $P$ under constraints on the marginal laws. It originates from the early works [49, 48] and is now an active field of research with ties to many others. Without being exhaustive, we mention [20, 21, 6, 17, 18] for connections with large deviations and [16, 40, 5, 31, 23, 7, 57, 26] for the relations with optimal transport and stochastic control. In [29, 30, 37, 53, 15, 58] Schrödinger bridges are used to develop Euclidean Quantum Mechanics (EQM) and second order calculus for diffusions, while in the article [36] they are employed to construct interpolations between probability measures on discrete spaces. We refer to [8, 9, 22, 50, 38] for applications in engineering, economics and graphics. In this article we prove that the Schrödinger bridge $\hat{Q}$, i.e. the solution to SP, solves a second order differential equation in the Riemannian structure of optimal transport and obtain quantitative results for its dynamics. In particular, since the evolution of the empirical measure of $N$ Brownian particles whose initial and final configurations are known is described in the limit for $N \to +\infty$ by a Schrödinger bridge, our results bring answers to the basic question (see also section 1.1):

(Q) Suppose you observe the configuration at times $t = 0, 1$ of $N \gg 1$ independent Brownian particles. How well does their configuration at time $t \in (0,1)$ resemble the equilibrium configuration?

Moreover, we shall establish a new connection between the so called reciprocal characteristics associated with a potential, and the convexity of the Fisher information along Schrödinger bridges.
Although the main objective of this paper is to understand the dynamics of Schrödinger bridges from a probabilistic viewpoint, our results can be seen as "stochastic" generalizations of well known results of optimal transport, such as Talagrand's inequality and displacement convexity of the entropy. Indeed, these classical results can be recovered through a small noise argument.

Organization of the paper. The paper is organized as follows: in section 1 we state the results; in section 2 we recall some notions of second order calculus in Wasserstein space, and in section 3 we provide proofs. Some technical lemmas are in the Appendix.

1.1. The Schrödinger problem.
In this subsection we give an introduction to SP and collect some useful facts. To formulate SP, let $\Omega$ be the space of continuous paths over the time interval $[0,1]$ with values in a smooth, complete, connected Riemannian manifold $(M, g)$ without boundary. We consider the law $P$ on $\Omega$ of the only stationary Markov measure for the generator
\[ \mathcal{L} = \sigma \Big( \frac{1}{2}\Delta - \nabla U \cdot \nabla \Big), \]
where $\sigma$ is a positive constant, $\Delta$ the Laplace-Beltrami operator and $U$ a smooth Lipschitz potential. Note that $P$ may fail to be a probability measure and be an infinite measure instead; this is not a problem as long as $Q$ is a probability measure, and the note [34] treats this issue in detail. The Schrödinger problem is the problem of finding the best approximation of $P$ in the relative entropy sense in the set of probabilities with prescribed marginals $\mu, \nu$ at times $t = 0, 1$:
\[ \min \mathcal{H}(Q \,|\, P), \qquad Q \in \mathcal{P}(\Omega), \quad (X_0)_{\#} Q = \mu, \quad (X_1)_{\#} Q = \nu, \tag{SPd} \]
where $\mathcal{P}(\Omega)$ is the space of probabilities over $\Omega$, $X_t(\omega) = \omega_t$ is the canonical projection map, $\#$ the push-forward and $\mathcal{H}(\cdot \,|\, P)$ the relative entropy functional. We refer to this formulation as the dynamic formulation. The optimal measure $\hat{Q}$ is the Schrödinger bridge $\mathrm{SB}(\mathcal{L}, \mu, \nu)$ between $\mu$ and $\nu$. The static formulation is obtained by projecting onto the endpoint marginals:
\[ \min \mathcal{H}\big(\pi \,|\, (X_0, X_1)_{\#} P\big), \qquad \pi \in \Pi(\mu, \nu), \tag{SP} \]
where $\Pi(\mu, \nu)$ is the set of couplings of $\mu$ and $\nu$. We call the optimal value $\mathcal{T}^{\sigma}_U(\mu, \nu)$ of SPd the entropic transportation cost. It is known [32, Prop. 2.4] that an optimal solution $\hat{Q}$ of SPd exists whenever $\mathcal{T}^{\sigma}_U(\mu, \nu) < +\infty$ and that the two formulations are equivalent, see [20]: the Schrödinger bridge $\hat{Q}$ is obtained by lifting the optimal coupling $\hat{\pi}$ of SP to path space by means of bridges. We have
\[ \hat{Q}(\cdot) = \int_{M \times M} P^{xy}(\cdot)\, \hat{\pi}(dx\, dy), \tag{1} \]
where $P^{xy}$ is the $xy$ bridge of $P$ (see [32] for details). Using this decomposition, one can show that $\mathcal{T}^{\sigma}_U(\mu, \nu)$ is also the optimal value of SP, and therefore that the optimal values of SPd and SP coincide.

The fg representation of the optimal solution. It has been shown in [32, Th. 2.8] that, under some mild assumptions on $\mu, \nu$ and $P$, there exist two measurable functions $f, g : M \to \mathbb{R}_{\ge 0}$ such that $E_P(f(X_0) g(X_1)) = 1$ and the SB takes the form
\[ \hat{Q} = f(X_0)\, g(X_1)\, P. \tag{2} \]
The functions $f, g$ are found by solving the Schrödinger system, see [32], [47]. In the rest of the article we will often deal with the functions $f_t, g_t$ defined for $t \in [0,1]$ by
\[ f_t(x) := E_P[f(X_0) \,|\, X_t = x], \qquad g_t(x) := E_P[g(X_1) \,|\, X_t = x]. \tag{3} \]
The following "hot gas experiment" provides a heuristic motivation for the formulation of SP, and helps build intuition.
Hot gas experiment. At time $t = 0$ we are given $N$ independent particles $(X^i_t)_{t \le 1,\, i \le N}$ whose configuration $\mu$ is
\[ \mu := \frac{1}{N} \sum_{i=1}^{N} \delta_{X^i_0}. \]
We then let the particles travel independently following the Langevin dynamics for $\mathcal{L}$ and look at their configuration $\nu$ at $t = 1$,
\[ \nu := \frac{1}{N} \sum_{i=1}^{N} \delta_{X^i_1}. \]
If the Langevin dynamics has good ergodic properties and $N$ is very large, one expects $\nu$ to be very close to the solution at time $1$ of the Fokker-Planck equation associated with the generator $\mathcal{L}$ and the initial datum $\mu$. However, although very unlikely, it is still possible to observe an unexpected configuration. Schrödinger's question is to find the most likely evolution of the particle system conditionally on the fact that such a rare event happened. Letting $N \to +\infty$ and using Sanov's Theorem, this question can rigorously be formulated as SPd (see also [32, sec. 6]). The Schrödinger problem can be viewed as a stochastic counterpart to the "lazy gas experiment" of optimal transport [54, Ch. 16]. Indeed:
• Particles choose their final destination minimizing the relative entropy instead of the mean square distance.
• Particles travel along Brownian bridges instead of geodesics, see (1).
We refer to [35, sec. 6] for an extensive treatment of this analogy. Quite remarkably, the description of the hot gas experiment we gave is very similar to the original formulation of the problem that Schrödinger proposed back in 1932 (see [49]), when the modern tools of probability theory were not available. For the unexpected configuration, Schrödinger writes in [49] "un écart spontané et considérable par rapport à cette uniformité" (a spontaneous and considerable deviation from this uniformity).
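The static formulation (SP) and the fg representation (2) have a simple finite analogue that may help fix ideas. The following is a minimal sketch and not the construction used in [32] or [47]: it assumes a finite state space and a strictly positive reference coupling `R` standing in for $(X_0, X_1)_{\#} P$, and it solves the discrete Schrödinger system by iterative proportional fitting (the Sinkhorn iteration), a standard technique swapped in here purely for illustration. The names `solve_schrodinger_system` and `relative_entropy`, the three-point grid and the Gaussian-type kernel are all hypothetical choices.

```python
import math

def solve_schrodinger_system(R, mu, nu, n_iter=500):
    """Iterative proportional fitting for pi_ij = f_i * R_ij * g_j.

    R is a strictly positive matrix (list of rows); mu and nu are
    probability vectors. Returns (f, g, pi).
    """
    n, m = len(mu), len(nu)
    f = [1.0] * n
    g = [1.0] * m
    for _ in range(n_iter):
        # Enforce the first marginal: f_i * sum_j R_ij g_j = mu_i.
        f = [mu[i] / sum(R[i][j] * g[j] for j in range(m)) for i in range(n)]
        # Enforce the second marginal: g_j * sum_i f_i R_ij = nu_j.
        g = [nu[j] / sum(f[i] * R[i][j] for i in range(n)) for j in range(m)]
    pi = [[f[i] * R[i][j] * g[j] for j in range(m)] for i in range(n)]
    return f, g, pi

def relative_entropy(pi, R):
    """H(pi | R) = sum pi log(pi / R), for strictly positive matrices."""
    return sum(p * math.log(p / r)
               for row_p, row_r in zip(pi, R)
               for p, r in zip(row_p, row_r))

# Toy reference coupling on three points (assumption: Gaussian-type kernel).
points = [0.0, 1.0, 2.0]
kernel = [[math.exp(-0.5 * (x - y) ** 2) for y in points] for x in points]
total = sum(sum(row) for row in kernel)
R = [[k / total for k in row] for row in kernel]

mu = [0.7, 0.2, 0.1]  # prescribed initial marginal
nu = [0.1, 0.2, 0.7]  # prescribed "unexpected" final marginal

f, g, pi = solve_schrodinger_system(R, mu, nu)
cost = relative_entropy(pi, R)  # discrete analogue of the entropic cost

row_sums = [sum(row) for row in pi]
col_sums = [sum(pi[i][j] for i in range(3)) for j in range(3)]
# The independent coupling mu x nu is admissible, so it cannot beat pi.
product = [[mu[i] * nu[j] for j in range(3)] for i in range(3)]
print(row_sums, col_sums, cost, relative_entropy(product, R))
```

In this toy setting the pair `(f, g)` plays the role of the functions $f, g$ in (2), and `cost` is the analogue of the optimal value of SP; the comparison with the product coupling only illustrates optimality and is not a statement from the paper.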
1.2. An equation for the Schrödinger bridge.

The first result of this article is that the marginal flow $(\mu_t)$ of $\mathrm{SB}(\mathcal{L}, \mu, \nu)$ solves a second order differential equation, when viewed as a curve in the space $\mathcal{P}_2(M)$ of probability measures with finite second moment endowed with the Riemannian-like structure of optimal transport [42, 51, 39]. Let us, for the moment, provide the minimal notions needed to state the result; we postpone to Section 2 a summary, based on [2, 25], of second order calculus in $\mathcal{P}_2(M)$. As said, we view $(\mu_t)$ as a curve in a kind of Riemannian manifold. Therefore, provided it is regular enough, we can speak of its velocity and acceleration. If $(\mu_t)$ is a regular curve (Def. 2.1), starting from the continuity equation
\[ \partial_t \mu_t + \nabla \cdot (\mu_t v_t) = 0, \]
one can define a Borel family of vector fields $(v_t)$, which is the velocity field of $(\mu_t)$. Furthermore, if $v_t$ is absolutely continuous along $(\mu_t)$ (Def. 2.5) we can compute its covariant derivative (Def. 2.7) $\frac{D}{dt} v_t$, which plays the role of the acceleration of $(\mu_t)$. We consider the Fisher information functional $\mathcal{I}_U$ on $\mathcal{P}_2(M)$,
\[ \mathcal{I}_U(\mu) = \begin{cases} \int_M |\nabla(\log \mu + 2U)|^2 \, d\mu & \text{if } \mu \ll \mathrm{vol}, \\ +\infty & \text{otherwise.} \end{cases} \tag{4} \]
In the definition above, and in the rest of the paper, we make no distinction between a measure $\mu$ and its Radon-Nikodym density $\frac{d\mu}{d\mathrm{vol}}$ w.r.t. the volume measure $\mathrm{vol}$ on $M$. Moreover, if $v$ is a vector field on $M$, we abbreviate $|v(z)|_{T_z M}$ with $|v|$. In Definition 2.8 we provide the definition of gradient of a functional over $\mathcal{P}_2(M)$. According to this notion, the gradient of $\mathcal{I}_U$ is the vector field
\[ \nabla^W \mathcal{I}_U(\mu) := \nabla \big( -|\nabla \log \mu|^2 - 2\Delta \log \mu + 8\,\mathcal{U} \big), \tag{5} \]
where $\mathcal{U} = \frac{1}{2}(|\nabla U|^2 - \Delta U)$. The content of the next Theorem is that, under some suitable regularity assumptions, the acceleration of $(\mu_t)$ is $\frac{\sigma^2}{8} \nabla^W \mathcal{I}_U(\mu_t)$.

Assumptions 1.1.
$M, U, \mu, \nu$ satisfy one of the following:
(A) $M$ is compact, $U \in C^{\infty}(M)$, $\mathcal{H}_U(\mu), \mathcal{H}_U(\nu) < +\infty$.
(B) $M = \mathbb{R}^d$, $U(z) = \alpha |z|^2$ for some $\alpha > 0$, and $\mu, \nu \ll \mathrm{vol}$ have compact support and bounded density against $\mathrm{vol}$.

Theorem 1.2.
Let $M, U, \mu, \nu$ be such that Assumption 1.1 holds. Then the marginal flow $(\mu_t)$ of $\mathrm{SB}(\mathcal{L}, \mu, \nu)$ is a regular curve and its velocity field $(v_t)$ is absolutely continuous. Moreover, $(\mu_t)$ satisfies the equation
\[ \forall\, 0 < t < 1, \qquad \frac{D}{dt} v_t = \frac{\sigma^2}{8} \nabla^W \mathcal{I}_U(\mu_t), \tag{6} \]
where $\frac{D}{dt} v_t$ is the covariant derivative of $(v_t)$ along $(\mu_t)$.

Assumption 1.1 essentially says that either $M$ is compact or $P$ is a stationary Ornstein-Uhlenbeck process on $\mathbb{R}^d$. We impose these strong assumptions because we do not want to assume any regularity on the solution to SPd, but only on the data of the problem. If we are willing to assume regularity of the solution, Assumption 1.1 can be dropped.
Theorem 1.3. Assume that the dual representation (2) of $\mathrm{SB}(\mathcal{L}, \mu, \nu)$ holds, that the marginal flow $(\mu_t)$ is such that
\[ \forall\, \varepsilon \in (0,1), \qquad \sup_{t \in [\varepsilon, 1-\varepsilon]} |\nabla^W \mathcal{I}_U(\mu_t)|_{L^2_{\mu_t}} < +\infty, \tag{7} \]
and that $f_t, g_t$ are such that for all $\varepsilon \in (0,1)$
\[ \sup_{t \in [\varepsilon, 1-\varepsilon]} |\nabla(\log g_t - \log f_t)|_{L^2_{\mu_t}} < +\infty, \qquad \sup_{t \in [\varepsilon, 1-\varepsilon],\, x \in M} |\mathrm{Hess}_x(\log g_t - \log f_t)|_{\mathrm{op}} < +\infty. \tag{8} \]
Then the marginal flow $(\mu_t)$ is a regular curve and its velocity field $(v_t)$ is absolutely continuous. Moreover, $(\mu_t)$ satisfies the equation
\[ \forall\, 0 < t < 1, \qquad \frac{D}{dt} v_t = \frac{\sigma^2}{8} \nabla^W \mathcal{I}_U(\mu_t), \tag{9} \]
where $\frac{D}{dt} v_t$ is the covariant derivative of $(v_t)$ along $(\mu_t)$.

After some manipulations, it is possible to reinterpret the fluid dynamic formulations of SPd obtained in [23] and [7] as variational problems in the Riemannian manifold of optimal transport: equation (9) is then the associated Euler-Lagrange equation. Thus, in principle we could link (9) with the theory of Hamiltonian systems in $\mathcal{P}_2(M)$, see [1]. However, our proof is based on probabilistic arguments, and does not use that framework. By changing the sign in the right hand side of (6), one gets a nice connection with the Schrödinger equation, see [55, 10]. The gradient flow of the Fisher information has been studied in [24] as a model for the quantum drift diffusion. Using the hot gas experiment, we can give a heuristic explanation of why equation (6) holds. The marginal flow $(\mu_t)$ of SB models the empirical measure of a particle system which minimizes the relative entropy on path space while going from $\mu$ to $\nu$.
Therefore, if there were no constraint on the final endpoint, $(\mu_t)$ would be the gradient flow of the marginal entropy started at $\mu$,
\[ v_t = -\frac{\sigma}{2} \nabla^W \mathcal{H}_U(\mu_t), \qquad \mu_0 = \mu, \tag{10} \]
where $v_t$ is the velocity field of $\mu_t$ and $\mathcal{H}_U$ the relative entropy on $\mathcal{P}_2(M)$:
\[ \mathcal{H}_U(\mu) = \begin{cases} \int_M (\log \mu + 2U) \, d\mu & \text{if } \mu \ll \mathrm{vol}, \\ +\infty & \text{otherwise.} \end{cases} \tag{11} \]
If we differentiate this relation once more, so as to be allowed to impose the terminal condition, we formally obtain
\begin{align*}
\frac{D}{dt} v_t &\overset{(10)}{=} -\frac{\sigma}{2} \frac{D}{dt} \nabla^W \mathcal{H}_U(\mu_t) \\
&= -\frac{\sigma}{2}\, \mathrm{Hess}^W_{\mu_t} \mathcal{H}_U (v_t) \\
&\overset{(10)}{=} \frac{\sigma^2}{4}\, \mathrm{Hess}^W_{\mu_t} \mathcal{H}_U \big( \nabla^W \mathcal{H}_U(\mu_t) \big) \\
&= \frac{\sigma^2}{8} \nabla^W \mathcal{I}_U(\mu_t),
\end{align*}
where to get the last equation we used the fact that the Fisher information is the squared norm of the gradient of the entropy, $\mathcal{I}_U(\mu) = |\nabla^W \mathcal{H}_U(\mu)|^2_{L^2_{\mu}}$, so that formally $\nabla^W \mathcal{I}_U = 2\, \mathrm{Hess}^W \mathcal{H}_U (\nabla^W \mathcal{H}_U)$.

Theorem 1.2 gives an answer to the problem of determining which second order equation the bridge of a diffusion should satisfy. Other authors (see e.g. [29, 30, 15, 53, 58]) have proven results in this direction. In this respect, equation (6) has some nice features. The first one is that the acceleration we consider here is a "true" acceleration, in the sense that it can be constructed as the covariant derivative associated with a Riemannian structure. Moreover, as we shall see in Theorem 1.4, using the Riemannian formalism we can obtain quantitative results for the dynamics of Schrödinger bridges. The following kind of conservation law is also immediately derived: it can be compared with similar results obtained in the above mentioned articles.
Corollary 1.1. If either Theorem 1.2 or Theorem 1.3 holds, there exists a finite constant $c(\mu, \nu)$ such that
\[ \forall\, 0 < t < 1, \qquad \frac{4}{\sigma^2} |v_t|^2_{T_{\mu_t}} - \mathcal{I}_U(\mu_t) = c(\mu, \nu), \tag{12} \]
where $|\cdot|_{T_{\mu_t}} = |\cdot|_{L^2_{\mu_t}}$ is the norm taken in the tangent space at $\mu_t$ (see section 2.3 for details).

1.3. Quantitative results for Schrödinger bridges.
Going back once more to the hot gas experiment is useful to give a qualitative description of the dynamics of SB. Indeed, the marginal flow $(\mu_t)$ is the empirical measure of a particle system that, at the same time,
• has to minimize entropy: this means that particles tend to arrange according to the equilibrium configuration for the Langevin dynamics, which we denote $\mathbf{m}$ (recall that $d\mathbf{m} = \exp(-2U)\, d\mathrm{vol}$);
• has to reach an unexpected final configuration $\nu$, which looks very different from $\mathbf{m}$.
Thus, we expect the dynamics to be divided into two phases. In the first one, entropy minimization dominates and $\mu_t$ relaxes towards $\mathbf{m}$. In the second phase, the necessity to attain the configuration $\nu$ at $t = 1$ prevails: particles start arranging according to $\nu$ and $(\mu_t)$ drifts away from $\mathbf{m}$. From this sketch, it is clear that (Q) is the crucial question to address when studying the dynamics of Schrödinger bridges. Our answer is in the following Theorem, where we show that the Bakry-Émery condition is equivalent to an upper bound for the relative entropy of $\mu_t$ w.r.t. $\mathbf{m}$, under the additional assumption that $M$ is compact. The proofs we present are also formally correct in the non compact case, and we expect the compactness assumption not to be necessary. We make use of it to justify some integration by parts under $\mathbf{m}$.

Theorem 1.4.
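Let $M$ be compact and $\mathcal{L} = \frac{1}{2}\Delta - \nabla U \cdot \nabla$. The following are equivalent:
(i) The Bakry-Émery condition
\[ \forall\, x \in M, \qquad \mathrm{Ric}_x + 2\, \mathrm{Hess}_x U \ge \lambda\, \mathrm{id}. \tag{13} \]
(ii) For any $\mu, \nu$ such that $\mathcal{H}_U(\mu), \mathcal{H}_U(\nu) < +\infty$ and any $\sigma > 0$, the marginal flow of $\mathrm{SB}(\mu, \nu, \sigma\mathcal{L})$ is such that for all $t \in [0,1]$
\begin{align*}
\mathcal{H}_U(\mu_t) \le{} & \frac{1 - \exp(-\sigma\lambda(1-t))}{1 - \exp(-\sigma\lambda)}\, \mathcal{H}_U(\mu) + \frac{1 - \exp(-\sigma\lambda t)}{1 - \exp(-\sigma\lambda)}\, \mathcal{H}_U(\nu) \\
& - \frac{\cosh(\frac{\sigma\lambda}{2}) - \cosh(-\sigma\lambda(t - \frac{1}{2}))}{\sinh(\frac{\sigma\lambda}{2})}\, \mathcal{T}^{\sigma}_U(\mu, \nu). \tag{14}
\end{align*}

Remark 1.1.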
A simple case when our proof works beyond compactness is when Assumption 1.1 (B) holds, i.e. when $P$ is a stationary Ornstein-Uhlenbeck process.

In the article [35] the author proves a representation of the first and second derivative of the entropy along SBs using the operators $\Gamma$ and $\Gamma_2$. From this, he deduces that the entropy is convex under the condition (13). However, no quantitative estimate such as (14) is proven there. Having equation (6) at hand, we can give a geometric interpretation to the results of [35], turn them into quantitative estimates and give some answers to the questions raised there. We shall give a first "geometric" proof of (i) $\Rightarrow$ (ii) in Theorem 1.4, and then a second proof based on $\Gamma$-calculus, which follows the blueprint of the first proof.

Remark 1.2.
Since the functions $\frac{1 - \exp(-\sigma\lambda(1-t))}{1 - \exp(-\sigma\lambda)}$ and $\frac{1 - \exp(-\sigma\lambda t)}{1 - \exp(-\sigma\lambda)}$ are increasing in $\lambda$ and only the term involving $\mathcal{T}^{\sigma}_U(\mu, \nu)$ is decreasing, one might think that the bound (14) could get worse by increasing $\lambda$. This is false because of how $\mathcal{T}^{\sigma}_U(\mu, \nu)$ depends on the marginal entropies. It can be read off the proof of Theorem 1.4 that for any fixed $\mu, \nu$ the right hand side of (14) is strictly decreasing in $\lambda$.

1.4. A functional inequality for the entropic transportation cost.
Setting $t = \frac{1}{2}$ in (14), we obtain

Corollary 1.2. Let $M$ be compact and (13) hold for some $\lambda > 0$. Then, for any $\mu, \nu \in \mathcal{P}_2(M)$ and $\sigma > 0$,
\[ \mathcal{T}^{\sigma}_U(\mu, \nu) \le \frac{1}{1 - \exp(-\frac{\sigma\lambda}{2})} \big( \mathcal{H}_U(\mu) + \mathcal{H}_U(\nu) \big). \tag{15} \]

To interpret this inequality, recall that if $\hat{Q}$ is the Schrödinger bridge between $\mu$ and $\nu$ then
\[ \mathcal{H}(\hat{Q} \,|\, P) = \mathcal{T}^{\sigma}_U(\mu, \nu). \]
The bound (15) tells us that the entropic cost is at most linear in the marginal entropies, with a constant that improves with curvature. Since $\mathcal{T}^{\sigma}_U(\mu, \nu)$ is a relative entropy on path space, (15) may as well be seen as a partial converse to the fact that joint entropies dominate marginal entropies. Such a bound may be of particular interest in practice. Indeed, it is a very hard task to compute $\mathcal{T}^{\sigma}_U(\mu, \nu)$, whereas $\mathcal{H}_U(\mu), \mathcal{H}_U(\nu)$ can be computed directly from the data of the problem. With some simple computation we also derive from Theorem 1.4 the following Corollary.

Corollary 1.3.
Let $M$ be compact and (13) hold for some $\lambda > 0$. Then, for any $\mu \in \mathcal{P}_2(M)$ and $\sigma > 0$,
\[ \mathcal{H}_U(\mu) \le \mathcal{T}^{\sigma}_U(\mu, \mathbf{m}) \le \frac{1}{1 - \exp(-\sigma\lambda)}\, \mathcal{H}_U(\mu). \tag{16} \]

FIGURE 1. If $M = \mathbb{R}$ and $\mathcal{L} = \frac{1}{2}\Delta - \frac{1}{2} x \cdot \nabla$, then $P$ is a stationary OU process and the condition (13) holds with $\lambda = 1$. In blue, we plot the corresponding bound (14) for $\sigma = 1$. We chose $\mu = \nu$ to be a Gaussian law with mean zero and variance [illegible]. The entropic cost turns out to be around 0.6841. In green, the same bound for $\mathcal{L} = \frac{1}{2}\Delta - x \cdot \nabla$. In this case $\lambda = 2$, and the bound is thus stronger.

We shall see in the next section how (16) is connected with Talagrand's inequality. Let us remark here that the entropic transportation cost falls into the class of generalized costs considered in [27].

1.5. Small noise limits and relations with classical optimal transport.
In this section we "slow down" Brownian motion, i.e. consider the generator
\[ \mathcal{L}^{\varepsilon} = \frac{\varepsilon}{2} \Delta, \]
assuming for simplicity that $U = 0$. Because of this choice, $\mathcal{H}_U$ is equal to the relative entropy against the volume measure, which we abbreviate with $\mathcal{H}$. Our aim is to provide a heuristic showing that the bounds obtained above are consistent with the classical results of optimal transport. We will make this argument rigorous in the proof of Theorem 1.4. In particular, we shall see how the convexity of the entropy along displacement interpolations is the small noise limit of the estimate (14), and Talagrand's transportation inequality (see [52], [43]) is the small noise limit of (16). Let (13) hold and $(\mu^{\varepsilon}_t)$ be the marginal flow of $\mathrm{SB}(\mu, \nu, \mathcal{L}^{\varepsilon})$; Theorem 1.4 says that
\begin{align*}
\mathcal{H}(\mu^{\varepsilon}_t) \le{} & \frac{1 - \exp(-\varepsilon\lambda(1-t))}{1 - \exp(-\varepsilon\lambda)}\, \mathcal{H}(\mu) + \frac{1 - \exp(-\varepsilon\lambda t)}{1 - \exp(-\varepsilon\lambda)}\, \mathcal{H}(\nu) \\
& - \frac{\cosh(\frac{\varepsilon\lambda}{2}) - \cosh(-\varepsilon\lambda(t - \frac{1}{2}))}{\sinh(\frac{\varepsilon\lambda}{2})}\, \mathcal{T}^{\varepsilon}(\mu, \nu). \tag{17}
\end{align*}
Moreover, Corollary 1.3 reads
\[ \forall\, \mu \in \mathcal{P}_2(M), \qquad \mathcal{T}^{\varepsilon}(\mu, \mathbf{m}) \le \frac{1}{1 - \exp(-\lambda\varepsilon)}\, \mathcal{H}(\mu). \tag{18} \]
In the articles [40, 31, 26] connections between the Schrödinger problem and the Monge-Kantorovich problem were established. In particular, in [31, Th. 3.6] a $\Gamma$-convergence result for the "small noise" Schrödinger problem towards the Benamou-Brenier formulation of the Monge-Kantorovich problem is proven. Roughly speaking, this implies that
\[ (\mu^{\varepsilon}_t) \to (\mu_t), \qquad \varepsilon\, \mathcal{T}^{\varepsilon}(\mu, \nu) \to \frac{1}{2} W_2^2(\mu, \nu), \]
where $(\mu_t)$ is the displacement interpolation between $\mu$ and $\nu$. Using these results, and letting $\varepsilon \to 0$ in (17), one recovers the convexity of the entropy along displacement interpolations, whereas in (18), after multiplying by $\varepsilon$, one recovers Talagrand's inequality.

1.6. Reciprocal characteristics and Fisher information.
Reciprocal characteristics. In this section we take $M = \mathbb{R}^d$, $\sigma = 1$. $P$ is then the stationary Langevin dynamics for the generator
\[ \mathcal{L} = \frac{1}{2} \Delta - \nabla U \cdot \nabla. \]
The aim of this section is to point out some connections between convexity of the reciprocal characteristic and convexity of the Fisher information along SBs. Let us recall the definition of the reciprocal characteristic associated with a smooth potential $U$: it is the vector field $\nabla \mathcal{U}$, where we recall that
\[ \mathcal{U}(x) := \frac{1}{2} |\nabla U|^2(x) - \frac{1}{2} \Delta U(x). \]
In [29, 11] it is shown how the reciprocal characteristic is an invariant for the family of bridges of the Langevin dynamics, and, more generally, for the associated reciprocal processes. We refer to the survey [33] and the articles [30, 44, 45, 13] for more information about reciprocal processes and reciprocal characteristics. In Theorem 1.3 the reciprocal characteristic appears as one of the terms expressing the acceleration of a Schrödinger bridge. In the recent work [14], a number of quantitative results about the bridges of the Langevin dynamics were obtained, under the hypothesis that $\mathcal{U}$ is uniformly convex. In particular, in Theorem 1.1 there, an equivalence between $\alpha$-convexity of $\mathcal{U}$, the possibility of constructing certain couplings for bridges with different endpoints, and a gradient estimate along the bridge semigroup is proven. We want to add one more motive of interest for the case when $\mathcal{U}$ is convex, showing that it implies lower bounds for the second derivative of the modified Fisher information along SBs.

Convexity of the Fisher information. If $\mathcal{U}$ is convex, we are able to show that $t \mapsto \mathcal{I}_U(\mu_t)$ is convex, provided some technical assumptions hold and the optimal coupling $\hat{\pi}$ of SP is a log-concave measure. In the next Theorem, since we are in the flat case, we prefer to write $Dv \cdot w$ instead of $\nabla_w v$ for the derivatives of vector fields.

Theorem 1.5.
Let $U, \mu, \nu$ be such that:
(i) The hypotheses of Theorem 1.3 are satisfied.
(ii) The vector field $\partial_t \nabla^W \mathcal{I}_U(\mu_t) + D \nabla^W \mathcal{I}_U(\mu_t) \cdot v_t$ is bounded in $L^2_{\mu_t}$ uniformly in $t \in [0,1]$.
(iii) If we define $A_t, B_t$ by
\[ A_t(x) := \sup_{k,\; i_1, \dots, i_k} |\partial_{x_{i_k}} \cdots \partial_{x_{i_1}} \log \mu_t(x)|, \qquad B_t(x) := \sup_{k,\; 1 \le j \le d,\; i_1, \dots, i_k} |\partial_{x_{i_k}} \cdots \partial_{x_{i_1}} v^j_t(x)|, \]
where $k$ ranges over a finite set of orders (the bounds on $k$ are illegible in the source), then
\[ \sup_{t \in [0,1]} \int_{\mathbb{R}^d} (A_t B_t) \Big( \max_{1 \le i \le d} |\partial_i \mu_t| + \mu_t \Big) \, d\mathrm{vol} < +\infty. \]
(iv) The optimal coupling $\hat{\pi}$ in SP is log-concave.
(v) $\mathcal{U}$ is $\alpha$-convex.
Then $t \mapsto \mathcal{I}_U(\mu_t)$ is convex. Moreover, we have the following bound:
\[ \partial_{tt} \mathcal{I}_U(\mu_t) \ge \alpha |v_t|^2_{T_{\mu_t}} + \frac{1}{8} |\nabla^W \mathcal{I}_U(\mu_t)|^2_{T_{\mu_t}}. \tag{19} \]

A simple example to which the Theorem can be applied is when $P$ is a stationary Ornstein-Uhlenbeck process and $\mu, \nu$ are Gaussian laws. Concerning the assumptions of the Theorem, we believe that (iii) could be largely weakened, and that the assumption that $\hat{\pi}$ is log-concave cannot be significantly weakened. Indeed, the key argument in the proof of Theorem 1.5 is to identify the log-concave measures as "convexity points" for the Fisher information functional $\mathcal{I}$ (corresponding to the case $U = 0$). Let us explain this. In Lemma 3.10 we prove that
\[ \Big\langle \frac{D}{dt} \nabla^W \mathcal{I}(\mu_t), v_t \Big\rangle_{T_{\mu_t}} = 2 \int |Dv_t \cdot \nabla \log \mu_t + \nabla \mathrm{div}(v_t)|^2 \, d\mu_t - \int \sum_{k,j=1}^{d} (Dv_t \cdot Dv_t)_{kj}\, \partial_{kj} \log \mu_t \, d\mu_t. \tag{20} \]
Schur's product theorem [28, Th. 7.5.3] makes sure that if $-\mathrm{Hess} \log \mu_t$ is positive definite then $-\sum_{k,j=1}^{d} (Dv_t \cdot Dv_t)_{kj}\, \partial_{kj} \log \mu_t \ge 0$, so that the right hand side of (20) is nonnegative. Since $\langle \frac{D}{dt} \nabla^W \mathcal{I}(\mu_t), v_t \rangle_{T_{\mu_t}}$ morally stands for $\langle \mathrm{Hess}^W_{\mu_t} \mathcal{I}(v_t), v_t \rangle_{T_{\mu_t}}$, equation (20) indicates that log-concave measures are convexity points for $\mathcal{I}$. The same formula suggests that the Fisher information is not a convex function in general.
For the modified Fisher information $\mathcal{I}_U$ we get
\begin{align*}
\Big\langle \frac{D}{dt} \nabla^W \mathcal{I}_U(\mu_t), v_t \Big\rangle_{T_{\mu_t}} ={} & 2 \int |Dv_t \cdot \nabla \log \mu_t + \nabla \mathrm{div}(v_t)|^2 \, d\mu_t - \int \sum_{k,j=1}^{d} (Dv_t \cdot Dv_t)_{kj}\, \partial_{kj} \log \mu_t \, d\mu_t \tag{21} \\
& + 8 \int \langle v_t, \mathrm{Hess}\, \mathcal{U} \cdot v_t \rangle \, d\mu_t. \tag{22}
\end{align*}
This formula clearly indicates that the convexity of $\mathcal{U}$ contributes to the convexity of $\mathcal{I}_U$.

In the formulas above, we recall that $(v_t)$ is the velocity field of $(\mu_t)$, $\langle \cdot, \cdot \rangle_{T_{\mu_t}}$ is the inner product in $L^2_{\mu_t}$ and $\frac{D}{dt}$ the covariant derivative. We also denote by $Dv_t$ the Jacobian matrix of $v_t$. Finally, we abbreviate $\partial_{x_i}$ with $\partial_i$, and adopt the same convention for higher-order derivatives.

Remark 1.3.
If we reintroduce the dependence on $\sigma$ in Theorem 1.5, the analog of (19) is
\[ \partial_{tt} \mathcal{I}_U(\mu_t) \ge \alpha |v_t|^2_{T_{\mu_t}} + \frac{\sigma^2}{8} |\nabla^W \mathcal{I}_U(\mu_t)|^2_{T_{\mu_t}}. \]
Letting $\sigma \to 0$ and using the convergence of the SB towards the displacement interpolation, it is natural to guess that the modified Fisher information is convex along a geodesic $(\mu_t)$ provided $\mu_t$ is log-concave for all $t$. At the moment of writing, we have no rigorous proof of this.

1.7. Feynman-Kac penalisation of Brownian motion.
In this section, the setting is $M = \mathbb{R}^d$, $\sigma = 1$, $U = 0$. We look at a family of stochastic processes called Feynman-Kac penalisations of Brownian motion, see the monograph [46]. If $P$ is the stationary Brownian motion on $\mathbb{R}^d$ (which is not a probability measure), a Feynman-Kac penalisation $Q$ [46, Ch. 2] is a probability measure on $\Omega$ which can be written as
\[ dQ = \frac{1}{Z} f(X_0) \exp\Big( -\int_0^1 K(X_s) \, ds \Big) dP, \]
where $K$ is a smooth lower bounded potential, $f$ a positive integrable function and $Z$ a normalization constant. As before, for any $t \in [0,1]$ we define the functions
\begin{align*}
f_t(x) &= E_P\Big[ f(X_0) \exp\Big( -\int_0^t K(X_s) \, ds \Big) \,\Big|\, X_t = x \Big], \tag{23} \\
g_t(x) &= E_P\Big[ \exp\Big( -\int_t^1 K(X_s) \, ds \Big) \,\Big|\, X_t = x \Big].
\end{align*}
We will also consider the energy functional $\mathcal{E}_K$ defined as
\[ \mathcal{E}_K(\mu) = \int_M K \, d\mu. \tag{24} \]
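Theorem 1.6. Assume that the marginal flow $(\mu_t)$ of $Q$ is such that
\[ \forall\, \varepsilon \in (0,1), \qquad \sup_{t \in [\varepsilon, 1-\varepsilon]} |\nabla^W \mathcal{I}(\mu_t) + \nabla^W \mathcal{E}_K(\mu_t)|_{L^2(\mu_t)} < +\infty \tag{25} \]
and $f_t, g_t$ are such that for all $\varepsilon \in (0,1)$
\[ \sup_{t \in [\varepsilon, 1-\varepsilon]} |\nabla(\log g_t - \log f_t)|_{L^2_{\mu_t}} < +\infty, \qquad \sup_{t \in [\varepsilon, 1-\varepsilon],\, x \in M} |\mathrm{Hess}_x(\log g_t - \log f_t)|_{\mathrm{op}} < +\infty. \tag{26} \]
Then the marginal flow $(\mu_t)$ is a regular curve and its velocity field $(v_t)$ is absolutely continuous. Moreover, $(\mu_t)$ satisfies the equation
\[ \frac{D}{dt} v_t = \nabla^W \Big( \frac{1}{8} \mathcal{I} + \mathcal{E}_K \Big)(\mu_t), \]
where $\frac{D}{dt} v_t$ is the covariant derivative of $(v_t)$ along $(\mu_t)$.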
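To make the structure of the equation in Theorem 1.6 more explicit, here is a short formal computation. It relies on one assumption that is not stated in this section: that, as is usual in Otto calculus and consistent with the form of (5), the gradient of a functional is the spatial gradient of its first variation. Under this assumption the energy $\mathcal{E}_K$, being linear in $\mu$, has first variation $K$, and the equation reads as a Newton-type law with an extra force term.

```latex
% Formal computation only; assumes \nabla^W F(\mu) = \nabla (\delta F / \delta \mu).
\begin{align*}
\nabla^W \mathcal{E}_K(\mu) &= \nabla K, \\
\nabla^W \mathcal{I}(\mu) &= \nabla \big( -|\nabla \log \mu|^2 - 2 \Delta \log \mu \big)
  \qquad \text{(this is (5) with } U = 0 \text{)}, \\
\frac{D}{dt} v_t &= \frac{1}{8} \nabla \big( -|\nabla \log \mu_t|^2 - 2 \Delta \log \mu_t \big) + \nabla K .
\end{align*}
```

In particular, for $K = 0$ only the Fisher information term remains on the right hand side.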
2. THE RIEMANNIAN STRUCTURE OF OPTIMAL TRANSPORT

2.1. Preliminaries and notation.
We consider a smooth Riemannian manifold $(M, g)$ which is complete, connected, closed, without boundary, and of finite dimension. We denote the Levi-Civita connection of $(M, g)$ by $\nabla$. For any $\varepsilon \in (0,1)$, we define $D_{\varepsilon} := [\varepsilon, 1-\varepsilon] \times M$ and let $C^{\infty}$ be the space of smooth functions on $M$. $C^{\infty}_{+}, C^{\infty}_{b} \subseteq C^{\infty}$ are, respectively, the subset of positive functions and the subset of bounded functions whose derivatives are all bounded. We also set $C^{\infty}_{b,+} := C^{\infty}_{b} \cap C^{\infty}_{+}$. Finally, $C^{\infty}_{c} \subseteq C^{\infty}$ is the subset of compactly supported functions and $C^{\infty}_{c,+} := C^{\infty}_{c} \cap C^{\infty}_{+}$. The definition of $C^{\infty}, C^{\infty}_{b}, C^{\infty}_{c}$ naturally extends to vector fields over $M$, to functions whose domain is $[0,1] \times M$ instead of $M$, and to families of vector fields over $M$ indexed by time. We call $\mathcal{P}_2(M)$ the space of probability measures on $M$ admitting second moment. The Wasserstein distance $W_2(\mu, \nu)$ between $\mu, \nu \in \mathcal{P}_2(M)$ is given by
\[ W_2^2(\mu, \nu) := \inf_{\pi \in \Pi(\mu, \nu)} \int_{M^2} d_M^2(x, y)\, \pi(dx\, dy), \]
where $d_M$ is the Riemannian distance on $M$. In the rest of the paper, curves are always defined on the time interval $[0,1]$, unless otherwise stated. In this section, we shall describe a kind of Riemannian structure on $\mathcal{P}_2(M)$ associated with $W_2(\cdot, \cdot)$. We do not prove new results and simply extract definitions and results from [2] and [25], to which we refer for the proofs.

2.2. Geodesics, Velocity fields.
Recall that if $(x_t)$ is a curve in a generic metric space $(\mathcal{X}, d)$, we say that it is absolutely continuous over $[\varepsilon, 1-\varepsilon]$ provided that for some integrable function $f$
\[ \forall\, \varepsilon \le r < s \le 1-\varepsilon, \qquad d(x_r, x_s) \le \int_r^s f(t) \, dt. \]
In what follows, we will write "absolutely continuous curve" and mean "absolutely continuous over $[\varepsilon, 1-\varepsilon]$ for all $\varepsilon \in (0,1)$". A curve is a constant speed geodesic if and only if
\[ \forall\, s, t \in [0,1], \qquad d(x_s, x_t) = |t - s|\, d(x_0, x_1). \]
$(\mathcal{X}, d)$ is said to be a geodesic space provided that for any pair of points there exists a constant speed geodesic connecting them. It turns out that ([2, Th. 2.10]) $(\mathcal{P}_2(M), W_2(\cdot, \cdot))$ is a geodesic space. If $(\mu_t)$ is an absolutely continuous curve on $(\mathcal{P}_2(M), W_2(\cdot, \cdot))$ then one can show ([2, Th. 2.29]) that there exists a Borel family of vector fields $(v_t)$ such that the continuity equation
\[ \partial_t \mu_t + \nabla \cdot (\mu_t v_t) = 0 \tag{27} \]
holds in the sense of distributions. Moreover, it can be shown that there exists a family of vector fields $(v_t)$, unique up to a negligible set of times, satisfying (27) and such that
\[ v_t \in \overline{\{\nabla \varphi,\ \varphi \in C^{\infty}_{c}\}}^{L^2_{\mu_t}} \quad \text{for a.e. } t. \]
We call this the velocity field of $(\mu_t)$. Conversely, if $(v_t)$ is a Borel family of vector fields satisfying (27) such that $\int_{\varepsilon}^{1-\varepsilon} |v_t|_{L^2_{\mu_t}} \, dt < +\infty$ for all $\varepsilon \in (0,1)$ and $v_t \in \overline{\{\nabla \varphi,\ \varphi \in C^{\infty}_{c}\}}^{L^2_{\mu_t}}$ for a.e. $t$, then $(\mu_t)$ is absolutely continuous and $v_t$ is its velocity field.
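As a concrete illustration of the continuity equation (27) and of velocity fields, the following sketch checks (27) numerically on a hypothetical one-dimensional example that does not appear in the paper: the Gaussian curve $\mu_t = N(m(t), s(t)^2)$ with $m(t) = t$ and $s(t) = 1 + t$. Its candidate velocity field $v_t(x) = m'(t) + \frac{s'(t)}{s(t)}(x - m(t))$ is the gradient of a quadratic function. Function names, the grid and the step size are illustrative assumptions.

```python
import math

def mean(t):
    # m(t): the Gaussian mean moves at unit speed.
    return t

def std(t):
    # s(t): the standard deviation grows linearly.
    return 1.0 + t

def density(t, x):
    # Density of N(m(t), s(t)^2) with respect to Lebesgue measure.
    s = std(t)
    return math.exp(-0.5 * ((x - mean(t)) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def velocity(t, x):
    # v_t(x) = m'(t) + (s'(t) / s(t)) * (x - m(t)), with m' = s' = 1 here.
    return 1.0 + (x - mean(t)) / std(t)

def flux(t, x):
    return density(t, x) * velocity(t, x)

def continuity_residual(t, x, h=1e-4):
    # Central finite differences for d/dt mu_t + d/dx (mu_t v_t).
    dt_mu = (density(t + h, x) - density(t - h, x)) / (2.0 * h)
    dx_flux = (flux(t, x + h) - flux(t, x - h)) / (2.0 * h)
    return dt_mu + dx_flux

# Evaluate the residual on a grid of times in (0, 1) and points in [-3, 3].
grid = [(0.1 * i, -3.0 + 0.5 * j) for i in range(1, 10) for j in range(13)]
max_residual = max(abs(continuity_residual(t, x)) for t, x in grid)
print(max_residual)
```

The residual is zero only up to finite-difference error, so the check is a sanity test of the formula for $v_t$ rather than a proof.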
2.3. Tangent space and Riemannian metric.

The tangent space at $\mu \in \mathcal{P}_2(M)$ is generated by the "space of directions"
\[ \mathrm{Geod}_{\mu} = \{\text{const. speed geodesics starting from } \mu \text{ defined on some interval } [0, T]\} / \approx, \]
where $(\mu_t) \approx (\mu'_t)$ provided they coincide on some right neighborhood of $0$. We equip $\mathrm{Geod}_{\mu}$ with the distance
\[ D((\mu_t), (\mu'_t)) = \lim_{t \downarrow 0} \frac{1}{t} W_2(\mu_t, \mu'_t). \]
The tangent space $T_{\mu}$ is defined as the completion of $\mathrm{Geod}_{\mu}$ w.r.t. $D$. The tangent space can be nicely described using the completion in $L^2_{\mu}$ of the "space of gradients" by considering the map
\[ \iota_{\mu} : \Big( \overline{\{\nabla \varphi,\ \varphi \in C^{\infty}_{c}\}}^{L^2_{\mu}},\ d_{L^2_{\mu}} \Big) \to (T_{\mu}, D), \qquad \iota_{\mu}(\nabla \varphi) := (\mu^{\varphi}_t), \]
where $(\mu^{\varphi}_t)$ is the unique (modulo $\approx$) constant speed geodesic originating from $\mu$ and such that $v_0 = \nabla \varphi$. If $\mu$ is a regular measure in the sense of [2, Def. 1.25], then $\iota_{\mu}$ can be extended to a bijective isometry. Therefore, $T_{\mu}$ inherits the inner product $\langle \cdot, \cdot \rangle_{T_{\mu}}$ from that of $L^2_{\mu}$:
\[ \langle \iota_{\mu} \nabla \varphi, \iota_{\mu} \nabla \psi \rangle_{T_{\mu}} := \langle \nabla \varphi, \nabla \psi \rangle_{L^2_{\mu}}. \]
For this reason, in what follows we make no distinction between the elements of the two different spaces, and write $\langle \nabla \varphi, \nabla \psi \rangle_{T_{\mu}}$ instead of $\langle \iota_{\mu} \nabla \varphi, \iota_{\mu} \nabla \psi \rangle_{T_{\mu}}$. Similarly, we write $|\nabla \varphi|_{T_{\mu}}$ instead of $|\iota_{\mu} \nabla \varphi|_{T_{\mu}}$.

The Benamou-Brenier formula [4] shows how the metric we have just introduced is morally the Riemannian metric for $(\mathcal{P}_2(M), W_2(\cdot, \cdot))$. Indeed it says that
\[ W_2^2(\mu, \nu) = \inf_{(\mu_t, v_t)} \int_0^1 |v_t|^2_{T_{\mu_t}} \, dt, \]
where $(\mu_t)$ varies in the set of absolutely continuous curves joining $\mu$ and $\nu$ and $(v_t)$ is the velocity field of $(\mu_t)$.

2.4. Regular curves and flow maps.
We give the definition of regular curve following closely [25, Def. 2.8], the only difference being that we define regularity over $[\varepsilon, 1-\varepsilon]$ for $\varepsilon \in (0,1)$, instead of looking at $[0,1]$.

Definition 2.1. For $\varepsilon \in (0,1)$, an absolutely continuous curve $(\mu_t)$ is regular over $[\varepsilon, 1-\varepsilon]$ provided

$$\int_\varepsilon^{1-\varepsilon} |v_t|^2_{T_{\mu_t}}\,dt < +\infty \tag{28}$$

and

$$\int_\varepsilon^{1-\varepsilon} L(v_t)\,dt < +\infty, \tag{29}$$

where, for a smooth vector field $\xi$, $L(\xi)$ is defined as

$$L(\xi) = \sup_{x \in M,\ w:\,|w|=1} |\nabla_w \xi(x)|.$$

Footnote (to the Benamou-Brenier formula): in the original result of Benamou and Brenier, $v_t$ is not the velocity field of $(\mu_t)$, but just an arbitrary weak solution to the continuity equation. However, it is easy to see that the representation formula for the Wasserstein distance remains true if we restrict the minimization to the couples $(\mu_t, v_t)$ such that $(v_t)$ is the velocity field of $(\mu_t)$.

For non smooth vector fields, the general definition of $L$ can be found in [25, Def. 2.1]; in this article we are only concerned with the smooth case. In all that follows, by regular curve we mean "regular over $[\varepsilon, 1-\varepsilon]$ for all $\varepsilon \in (0,1)$". If $(\mu_t)$ is a regular curve and $(v_t)$ its velocity field, there exists a unique family of maps, called flow maps, $T(t,s,\cdot) : \mathrm{supp}\,\mu_t \to \mathrm{supp}\,\mu_s$, such that for any $\varepsilon \in (0,1)$, $t \in (0,1)$, $x \in \mathrm{supp}\,\mu_t$, the curve $s \mapsto T(t,s,x)$ is absolutely continuous over $(\varepsilon, 1-\varepsilon)$ and satisfies

$$\begin{cases} \dfrac{d}{ds} T(t,s,x) = v_s(T(t,s,x)) & \text{for a.e. } s \in (\varepsilon, 1-\varepsilon), \\ T(t,t,x) = x. \end{cases} \tag{30}$$

The flow maps enjoy the following properties:

$$T(r,s,\cdot) \circ T(t,r,\cdot) = T(t,s,\cdot), \qquad T(t,s,\cdot)_{\#}\mu_t = \mu_s. \tag{31}$$

2.5. The maps $(\tau_x)^t_s$ and $\tau^t_s(u)$. These maps are needed to define the covariant derivative. The following are Definitions 2.9 and 2.12 in [25].
Definition 2.2. Let $(\mu_t)$ be a regular curve and $T(t,s,\cdot)$ its flow maps. Given $t, s \in [0,1]$ and $x \in \mathrm{supp}\,\mu_t$, we let $(\tau_x)^t_s : T_{T(t,s,x)}M \to T_x M$ be the map which associates to $v \in T_{T(t,s,x)}M$ its parallel transport along the absolutely continuous curve $r \mapsto T(t,r,x)$ from $r = s$ to $r = t$.

The map $(\tau_x)^t_s$ is used to translate vectors along regular curves in $M$. In the next definition we shall see how the maps $\tau^t_s$ do the same for vector fields along regular curves in $\mathcal{P}_2(M)$. It should be stressed that the next definition is not the parallel transport. Indeed, in general, $\tau^t_s(u)$ may not be in $T_{\mu_t}$.

Definition 2.3. Let $(\mu_t)$ be a regular curve, $T(t,s,\cdot)$ its flow maps and $u$ a vector field in $L^2_{\mu_s}$. Then $\tau^t_s(u)$ is the vector field in $L^2_{\mu_t}$ defined by

$$\tau^t_s(u)(x) = (\tau_x)^t_s\big(u \circ T(t,s,x)\big).$$

The maps $\tau^t_s$ have properties similar to those of the flow maps: they enjoy the group property

$$\tau^t_s = \tau^t_r \circ \tau^r_s. \tag{32}$$

Moreover, $\tau^t_s$ is an isometry from $L^2_{\mu_s}$ to $L^2_{\mu_t}$, i.e.

$$\forall\, u \in L^2_{\mu_s}, \qquad \int_M |u|^2\,d\mu_s = \int_M |\tau^t_s(u)|^2\,d\mu_t, \tag{33}$$

or equivalently $|u|_{T_{\mu_s}} = |\tau^t_s(u)|_{T_{\mu_t}}$.
2.6. Vector fields along a curve.

Definition 2.4. A vector field along a curve $(\mu_t)$ is a Borel map $(t,x) \mapsto u_t(x) \in T_x M$ such that $u_t \in L^2_{\mu_t}$ for a.e. $t$. It will be denoted by $(u_t)$.

Observe that non tangent vector fields are also considered in this definition, i.e. $u_t$ may not be a gradient. Here is the definition of absolutely continuous vector field along a curve, see [25, Def. 3.2].

Definition 2.5. Let $(u_t)$ be a vector field along the regular curve $(\mu_t)$ and $\tau^t_s(u)$ be given by Definition 2.3. We say that $(u_t)$ is absolutely continuous over $[\varepsilon, 1-\varepsilon]$ provided the map $t \mapsto \tau^{t_0}_t(u_t) \in L^2_{\mu_{t_0}}$ is absolutely continuous in $L^2_{\mu_{t_0}}$ for all $t_0 \in [\varepsilon, 1-\varepsilon]$.

It can be seen that the choice of $t_0$ is irrelevant. One could then set $t_0 = \varepsilon$ in the definition above. As before, by absolutely continuous vector field we mean "absolutely continuous over $[\varepsilon, 1-\varepsilon]$ for all $\varepsilon \in (0,1)$".

2.7. Total derivative and covariant derivative.
We are ready to define the total derivative of an absolutely continuous vector field as in [25, Def. 3.6]. Note that this is not yet the covariant derivative.

Definition 2.6. Let $(u_t)$ be an absolutely continuous vector field along the regular curve $(\mu_t)$. Its total derivative is defined as

$$\frac{\mathbf{d}}{dt} u_t := \lim_{h \to 0} \frac{\tau^t_{t+h}(u_{t+h}) - u_t}{h} \qquad \text{for a.e. } t, \tag{34}$$

where the limit is understood in $L^2_{\mu_t}$.

To define the covariant derivative, we consider the orthogonal projection

$$P_\mu : L^2_\mu \to \overline{\{\nabla\varphi,\ \varphi \in C_c^\infty\}}^{L^2_\mu}.$$

The following is [2, Def. 6.8] for the flat case. For the general case, we refer to Definition 5.1 and the discussion thereafter in [25].

Definition 2.7.
Let $(u_t)$ be an absolutely continuous and tangent vector field along the regular curve $(\mu_t)$. Its covariant derivative is defined as

$$\frac{\mathbf{D}}{dt} u_t := P_{\mu_t}\Big(\frac{\mathbf{d}}{dt} u_t\Big) \qquad \text{for a.e. } t.$$

2.8. Levi-Civita connection.

The covariant derivative just defined is the Levi-Civita connection, in the sense that it satisfies the compatibility with the metric and the torsion free identity. Compatibility with the metric says that if $(u^1_t)$, $(u^2_t)$ are tangent absolutely continuous vector fields along the regular curve $(\mu_t)$, then

$$\partial_t \langle u^1_t, u^2_t \rangle_{T_{\mu_t}} = \Big\langle \frac{\mathbf{D}}{dt} u^1_t, u^2_t \Big\rangle_{T_{\mu_t}} + \Big\langle u^1_t, \frac{\mathbf{D}}{dt} u^2_t \Big\rangle_{T_{\mu_t}}. \tag{35}$$

For the torsion free identity, we refer to [2, Eq. (6.10)] and [25, Sec. 5.1].
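As a small illustration of how (35) is typically used in the proofs of Section 3, one may take both vector fields equal. The display below is only a direct specialisation of (35); $u_t$ stands for any tangent absolutely continuous vector field along a regular curve, and nothing beyond (35) is assumed.

```latex
% Specialisation of (35) to u^1_t = u^2_t = u_t (a sketch, not a new result).
\[
  \partial_t\, |u_t|^2_{T_{\mu_t}}
    = \partial_t \langle u_t, u_t \rangle_{T_{\mu_t}}
    = 2\,\Big\langle \frac{\mathbf{D}}{dt} u_t,\; u_t \Big\rangle_{T_{\mu_t}} .
\]
% In particular, wherever \frac{\mathbf{D}}{dt} u_t vanishes, the squared norm
% t \mapsto |u_t|^2_{T_{\mu_t}} has zero derivative.
```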
2.9. Gradient of a function.

Here we do not seek a very general definition, but rather give a customary one, well adapted to our purposes, following [2, Eq. 3.50]. In this definition and in the rest of the paper, by $\mu \in C_+^\infty$ we mean that $\mu \ll \mathrm{vol}$ and $\frac{d\mu}{d\mathrm{vol}} \in C_+^\infty$.

Definition 2.8. Let $\mu \in C_+^\infty$. We say that $F : \mathcal{P}_2(M) \to \mathbb{R} \cup \{\pm\infty\}$ is differentiable at $\mu$ if there exists a vector field $w \in T_\mu$ such that for all $\varphi \in C_c^\infty$

$$\lim_{h \to 0} \frac{F(\mu^\varphi_h) - F(\mu)}{h} = \langle w, \nabla\varphi \rangle_{T_\mu}, \tag{36}$$

where $(\mu^\varphi_h)$ is the constant speed geodesic (modulo $\approx$) such that $\mu^\varphi_0 = \mu$ and $v_0 = \nabla\varphi$.

It follows from our definition that if $F$ is differentiable at $\mu$, then there exists a unique $w$ fulfilling (36). We then call $w$ the gradient of $F$ at $\mu$, and denote it by $\nabla^W F(\mu)$. Let us consider some functionals of interest. The gradient of the relative entropy $\mathcal{H}_U$ is known to be

$$\nabla^W \mathcal{H}_U(\mu) = \nabla \log\mu + 2\nabla U, \tag{37}$$

provided $\nabla\log\mu + 2\nabla U \in L^2_\mu$ and $\mu$ is regular enough. Thus, using (4) we obtain that

$$\mathcal{I}_U(\mu) = |\nabla^W \mathcal{H}_U|^2_{T_\mu}. \tag{38}$$

The gradient of the modified Fisher information at $\mu$ is the vector field

$$\nabla^W \mathcal{I}_U(\mu) = \nabla\big( -|\nabla\log\mu|^2 - 2\Delta\log\mu + 8\,\mathscr{U} \big), \tag{39}$$

provided the right hand side is in $L^2_\mu$ and $\mu$ is regular enough (recall the definition of $\mathscr{U}$ given at (5)). All these computations can be found in [55]. We shall derive the expression (39) in the Appendix.

2.10. Convexity of the entropy.
Recall that on a smooth finite dimensional Riemannian manifold whose Levi-Civita connection is $\nabla$, the Hessian of $f$ at $x$ applied to $v \in T_x M$ is defined through (see e.g. [19, Ex. 11, pg. 141])

$$\mathrm{Hess}_x f(v) = \nabla_v \nabla f(x) = \frac{D}{dt} \nabla f(x_t)\Big|_{t=0},$$

where $(x_t)$ is any curve such that $x_0 = x$, $\dot{x}_0 = v$. In this article we are only interested in defining a kind of Hessian for the entropy functional $\mathcal{H}_U$. Therefore, as we did before, to simplify the definitions, we restrict to a very special setting.

Definition 2.9. Let $M$ be compact. Consider a measure $\mu \in C_+^\infty$ and a vector field $v \in C_b^\infty \cap T_\mu$. We define

$$\mathrm{Hess}^W_\mu \mathcal{H}_U(v) = \frac{\mathbf{D}}{dt} \nabla^W \mathcal{H}_U(\mu_t)\Big|_{t=0},$$

where $(\mu_t)$ is any regular curve such that $\mu_0 = \mu$, $v_0 = v$ and $\mu_t \in C_+^\infty([0,\varepsilon] \times M)$ for some $\varepsilon > 0$.

It is easy to see that this is a good definition, in the sense that there is always a curve fulfilling the requirements and the value of the Hessian does not depend on the specific choice of the curve. The well known convexity of the entropy [56] implies, in particular, that under the assumption (13) we have

$$\forall\, \mu \in C_+^\infty,\ v \in C_b^\infty \cap T_\mu, \qquad \langle \mathrm{Hess}^W_\mu \mathcal{H}_U(v), v \rangle_{T_\mu} \ge \lambda |v|^2_{T_\mu}. \tag{40}$$

We shall give in the Appendix a formal proof of this fact, following [43].

3. PROOFS OF THE RESULTS
Proof of Theorems 1.2 and 1.3.

This section is structured as follows: we first prove some preparatory Lemmas (Lemmas 3.1, 3.2, 3.3, 3.4). To prove these Lemmas, we rely on the technical Lemmas 4.2 and 4.3, which we put in the Appendix. Finally, we prove the two Theorems. In the proofs of the Lemmas and of the Theorems, we assume for simplicity that $\sigma = 1$. There is no difficulty in extending the proofs to the general case. In Lemmas 4.2 and 4.3 it is proven that if Assumption 1.1 holds, then the dual representation (2) holds as well. Thus, we will take it for granted in the next Lemmas and in the proofs of the Theorems. In what follows, it is sometimes useful to use the functions $\tilde{f}_t$ and $\tilde{g}_t$ defined by (recall the definition of $f_t$, $g_t$ at (3))

$$\tilde{f}_t(x) := \exp(-U(x))\, f_t(x), \qquad \tilde{g}_t(x) := \exp(-U(x))\, g_t(x). \tag{41}$$

The first Lemma is useful to identify the velocity field of $(\mu_t)$.

Lemma 3.1. Let Assumption 1.1 hold. If $f, g$ are given by (2) and $f_t, g_t$ by (3), then

$$\forall\, \varepsilon \in (0,1), \qquad \sup_{t \in [\varepsilon, 1-\varepsilon]} \int_M |\nabla\log g_t - \nabla\log f_t|^2\, f_t g_t\, d\mathbf{m} < +\infty.$$

Proof. Assume that Assumption 1.1 (B) holds. Then, using point (ii) of Lemma 4.2, we see that there exist constants $A, B$ such that

$$\forall\, x \in \mathbb{R}^d, \qquad \sup_{t \in [\varepsilon, 1-\varepsilon]} |\nabla\log g_t - \nabla\log f_t|^2(x) \le A + B|x|^2.$$

The conclusion then follows from the boundedness of $f_t, g_t$ (see (i) in Lemma 4.2) and the fact that $\mathbf{m}$ has Gaussian tails. If (A) holds, the conclusion follows directly from Lemma 4.3. □

The next result is a representation for $(\mu_t)$ and its velocity field. It is a minor modification of the analogous results [23, Sec. 5] and [7, Sec. VI], which build on the notion of current velocity of a Markov diffusion process, as introduced by Nelson in [41]. Thus, we postpone its proof to the Appendix.

Lemma 3.2.
Let $f, g$ be given by (2) and $f_t, g_t, \tilde{f}_t, \tilde{g}_t$ by (3), (41). Then

$$\forall\, t \in [0,1], \qquad d\mu_t = f_t g_t\, d\mathbf{m} = \tilde{f}_t \tilde{g}_t\, d\mathrm{vol}. \tag{42}$$

Moreover, if the conclusion of Lemma 3.1 holds, $(\mu_t)$ is an absolutely continuous curve and its velocity field is

$$(t,x) \mapsto \frac{\sigma}{2} \nabla(\log g_t - \log f_t). \tag{43}$$

The following Lemma gives a sufficient condition for the existence of the covariant derivative, together with its explicit form. It is a slight rearrangement of Example 6.7 in [2] and Equation (3.3) in [25], the only difference being that we do not assume that $\xi_t$ has compact support. Its proof is in the Appendix.

Lemma 3.3. Let $(\mu_t)$ be a regular curve, $(v_t)$ its velocity field and $(\xi_t)$ a $C^\infty$ vector field along $(\mu_t)$. If

$$\forall\, \varepsilon \in (0,1), \qquad \sup_{t \in [\varepsilon, 1-\varepsilon]} |\partial_t \xi_t + \nabla_{v_t} \xi_t|_{T_{\mu_t}} < +\infty, \tag{44}$$

then $(\xi_t)$ is absolutely continuous along $(\mu_t)$ in the sense of Definition 2.5 and

$$\forall\, t \in (0,1), \qquad \frac{\mathbf{d}}{dt} \xi_t = \partial_t \xi_t + \nabla_{v_t} \xi_t. \tag{45}$$

If $\xi_t = v_t$, then we also have

$$\forall\, t \in (0,1), \qquad \frac{\mathbf{D}}{dt} v_t = \frac{\mathbf{d}}{dt} v_t = \partial_t v_t + \frac{1}{2} \nabla |v_t|^2. \tag{46}$$

In the next Lemma, we establish that $(\mu_t)$ has the desired regularity properties.

Lemma 3.4.
Let Assumption 1.1 hold and $(\mu_t)$ be the marginal flow of $\mathrm{SB}(\mathscr{L}, \mu, \nu)$. Then we have:
(i) $(\mu_t)$ is a regular curve.
(ii) The velocity field $(v_t)$ is such that

$$\forall\, \varepsilon \in (0,1), \qquad \sup_{t \in [\varepsilon, 1-\varepsilon]} \Big| \partial_t v_t + \frac{1}{2} \nabla |v_t|^2 \Big|_{T_{\mu_t}} < +\infty. \tag{47}$$

Proof. First, observe that, thanks to Lemma 3.2, we know that $v_t = \frac{1}{2}\nabla(\log g_t - \log f_t)$ and that $d\mu_t = f_t g_t\, d\mathbf{m}$. Using this, if Assumption 1.1 (A) holds, the conclusion follows from Lemma 4.3. Therefore, let us assume that (B) holds. To prove that $(\mu_t)$ is regular we have to show (28) and (29). We have

$$\forall\, t \in (0,1), \qquad |v_t|^2_{T_{\mu_t}} = \frac{\sigma^2}{4} \int_M |\nabla\log g_t - \nabla\log f_t|^2\, f_t g_t\, d\mathbf{m}.$$

Using this identity, (28) follows from Lemma 3.1. To prove (29) observe that, since $v_t$ is of class $C^\infty$, we have

$$L(v_t) = \sup_{x \in \mathbb{R}^d,\ w:\,|w|=1} |\nabla_w v_t(x)|.$$

Using (43), we have for any $\varepsilon \in (0,1)$:

$$\sup_{\substack{t \in [\varepsilon, 1-\varepsilon] \\ x \in \mathbb{R}^d}}\ \sup_{w:\,|w|=1} |\nabla_w v_t(x)| \le \frac{1}{2} \sup_{\substack{t \in [\varepsilon, 1-\varepsilon] \\ x \in \mathbb{R}^d}}\ \sup_{\substack{w_1, w_2 \\ |w_1|, |w_2| \le 1}} |\partial_{w_1} \partial_{w_2} \log f_t| + \frac{1}{2} \sup_{\substack{t \in [\varepsilon, 1-\varepsilon] \\ x \in \mathbb{R}^d}}\ \sup_{\substack{w_1, w_2 \\ |w_1|, |w_2| \le 1}} |\partial_{w_1} \partial_{w_2} \log g_t|. \tag{48}$$

Point (iii) of Lemma 4.2 then immediately yields

$$\sup_{t \in [\varepsilon, 1-\varepsilon]} L(v_t) < +\infty,$$

which proves (29) and that $(\mu_t)$ is a regular curve. Let us now turn to the proof of (47). First, we observe that $\log f_t$, $\log g_t$ are classical solutions on $D_\varepsilon$ of the HJB equations

$$\partial_t \log f_t = \mathscr{L} \log f_t + \frac{1}{2} |\nabla\log f_t|^2, \qquad \partial_t \log g_t = -\mathscr{L} \log g_t - \frac{1}{2} |\nabla\log g_t|^2. \tag{49}$$

Then, combining (43) with point (ii) of Lemma 4.2 and the HJB equations, we obtain that there exist constants $A, B$ such that, uniformly on $D_\varepsilon$,

$$\Big| \partial_t v_t + \frac{1}{2} \nabla |v_t|^2 \Big|^2(x) \le A + B|x|^2.$$

Moreover, point (i) in Lemma 4.2 makes sure that $\sup_{D_\varepsilon} f_t g_t \le C$ for some $C < +\infty$. Thus,

$$\sup_{t \in [\varepsilon, 1-\varepsilon]} \int_M \Big| \partial_t v_t + \frac{1}{2} \nabla |v_t|^2 \Big|^2 d\mu_t \overset{(42)}{=} \sup_{t \in [\varepsilon, 1-\varepsilon]} \int_M \Big| \partial_t v_t + \frac{1}{2} \nabla |v_t|^2 \Big|^2 f_t g_t\, d\mathbf{m} \le C \int_M \big(A + B|x|^2\big)\, d\mathbf{m} < +\infty,$$

where to obtain the last inequality we used that $\mathbf{m}$ has Gaussian tails. The desired conclusion follows. □

Let us now prove Theorem 1.2.
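Proof. Let $\varepsilon \in (0,1)$ and let Assumption 1.1 hold. This entitles us to apply Lemmas 3.1 and 3.2 to conclude that the velocity field of $(\mu_t)$ is given by (43). Lemma 3.4 makes sure that $(\mu_t)$ is regular and that (44) holds for $\xi_t = v_t$. Therefore we can apply Lemma 3.3 to conclude that $(v_t)$ is absolutely continuous and its covariant derivative is

$$\frac{\mathbf{D}}{dt} v_t = \partial_t v_t + \frac{1}{2} \nabla |v_t|^2.$$

Next, we shall prove that

$$\partial_t v_t + \frac{1}{2} \nabla |v_t|^2 = \frac{1}{8} \nabla^W \mathcal{I}_U(\mu_t),$$

which concludes the proof. To do this, recall that from Lemma 3.2

$$v_t = \frac{1}{2} \nabla(\log g_t - \log f_t) = \frac{1}{2} \nabla(\log \tilde{g}_t - \log \tilde{f}_t),$$

and that $\tilde{f}_t, \tilde{g}_t$ are classical solutions of (76) over $D_\varepsilon$. Therefore, by taking logarithms and using the positivity of $\tilde{f}_t, \tilde{g}_t$, we get

$$\partial_t \log \tilde{f}_t = \frac{1}{2} \Delta \log \tilde{f}_t + \frac{1}{2} |\nabla\log \tilde{f}_t|^2 - \mathscr{U}, \qquad \partial_t \log \tilde{g}_t = -\frac{1}{2} \Delta \log \tilde{g}_t - \frac{1}{2} |\nabla\log \tilde{g}_t|^2 + \mathscr{U}, \tag{50}$$

where $\mathscr{U}$ is defined at (5). Combining these facts, we have

$$\begin{aligned}
\partial_t v_t + \frac{1}{2} \nabla |v_t|^2
&= -\frac{1}{2} \nabla \partial_t \log \tilde{f}_t + \frac{1}{2} \nabla \partial_t \log \tilde{g}_t + \frac{1}{2} \nabla \Big| \frac{1}{2} \nabla\log \tilde{g}_t - \frac{1}{2} \nabla\log \tilde{f}_t \Big|^2 \\
&\overset{(50)}{=} -\frac{1}{2} \nabla \Big( \frac{1}{2} \Delta \log \tilde{f}_t + \frac{1}{2} |\nabla\log \tilde{f}_t|^2 - \mathscr{U} \Big) + \frac{1}{2} \nabla \Big( -\frac{1}{2} \Delta \log \tilde{g}_t - \frac{1}{2} |\nabla\log \tilde{g}_t|^2 + \mathscr{U} \Big) \\
&\quad + \frac{1}{2} \nabla \Big( \frac{1}{4} |\nabla\log \tilde{f}_t|^2 - \frac{1}{2} \langle \nabla\log \tilde{f}_t, \nabla\log \tilde{g}_t \rangle + \frac{1}{4} |\nabla\log \tilde{g}_t|^2 \Big) \\
&= -\frac{1}{4} \nabla\big( \Delta \log \tilde{f}_t + \Delta \log \tilde{g}_t \big) - \frac{1}{8} \nabla \big( |\nabla(\log \tilde{f}_t + \log \tilde{g}_t)|^2 \big) + \nabla \mathscr{U} \\
&\overset{(41)}{=} -\frac{1}{4} \nabla \Delta \log \mu_t - \frac{1}{8} \nabla |\nabla\log \mu_t|^2 + \nabla \mathscr{U} \\
&\overset{(5)}{=} \frac{1}{8} \nabla^W \mathcal{I}_U(\mu_t).
\end{aligned}$$

□

Finally, we prove Theorem 1.3. Given the proof of Theorem 1.2 and the Lemmas proven above, this proof is quite straightforward.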
Proof. One can check, as in the proof of Theorem 1.2, that $\frac{\sigma}{2}\nabla(\log g_t - \log f_t)$ solves the continuity equation in the classical sense. Assumption (6) ensures enough integrability to conclude that this vector field is indeed the velocity field of $(\mu_t)$. Using this representation of the velocity field, assumption (8) also ensures that $(\mu_t)$ is a regular curve. Repeating the same calculation as in the proof of Theorem 1.2 we can prove that

$$\partial_t v_t + \frac{1}{2} \nabla |v_t|^2 = \frac{1}{8} \nabla^W \mathcal{I}_U.$$

Then (6) allows us to apply Lemma 3.3 to conclude that the covariant derivative $\frac{\mathbf{D}}{dt} v_t$ exists and has the desired form. □

Proof of Theorem 1.4.
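First proof of (i) ⇒ (ii). Let us note that the hypotheses of the Theorem imply Assumption 1.1(B). Thus, we can use all the Lemmas proven in the previous section, as well as Theorem 1.2 and Lemma 4.3 from the Appendix. The proof relies on the preparatory Lemmas 3.5, 3.6 and 3.7. In the first Lemma we compute an expression for $c(\mu, \nu)$ in terms of $\tilde{f}_t, \tilde{g}_t$.

Lemma 3.5. Let $c(\mu, \nu)$ be defined via (12). Then

$$\forall\, 0 < t < 1, \qquad c(\mu, \nu) = -\int_M \big( \nabla\log \tilde{f}_t \cdot \nabla\log \tilde{g}_t + 2\,\mathscr{U} \big)\, d\mu_t.$$

Proof. By Corollary 1.1,

$$c(\mu, \nu) = \frac{1}{\sigma^2} |v_t|^2_{T_{\mu_t}} - \frac{1}{4} \mathcal{I}_U(\mu_t).$$

We can exploit Lemma 3.2 and an integration by parts under $\mathrm{vol}$ (justified by Lemma 4.3) to obtain

$$\begin{aligned}
\frac{1}{\sigma^2} |v_t|^2_{T_{\mu_t}} - \frac{1}{4} \mathcal{I}_U(\mu_t)
&= \int_M \Big| \frac{1}{2} \nabla\log \tilde{g}_t - \frac{1}{2} \nabla\log \tilde{f}_t \Big|^2 - \frac{1}{4} |\nabla\log \mu_t + 2\nabla U|^2\, d\mu_t \\
&= \int_M \frac{1}{4} |\nabla\log \tilde{g}_t - \nabla\log \tilde{f}_t|^2 - \frac{1}{4} |\nabla\log \tilde{f}_t + \nabla\log \tilde{g}_t|^2 - \nabla\log \mu_t \cdot \nabla U - |\nabla U|^2\, d\mu_t \\
&= \int_M -\nabla\log \tilde{f}_t \cdot \nabla\log \tilde{g}_t - |\nabla U|^2\, d\mu_t - \int_M \nabla \mu_t \cdot \nabla U\, d\mathrm{vol} \\
&= \int_M -\nabla\log \tilde{f}_t \cdot \nabla\log \tilde{g}_t + \Delta U - |\nabla U|^2\, d\mu_t \\
&= -\int_M \big( \nabla\log \tilde{f}_t \cdot \nabla\log \tilde{g}_t + 2\,\mathscr{U} \big)\, d\mu_t,
\end{aligned}$$

which is the desired conclusion. □

In the second Lemma we compute the first derivative of the entropy along $\mathrm{SB}(\mathscr{L}, \mu, \nu)$.

Lemma 3.6. There exist functions $h_f$ and $h_b$ such that $\mathcal{H}_U(\mu_t) = h_f(t) + h_b(t)$ and, for all $0 < t < 1$,

$$\partial_t h_f(t) = -\frac{1}{2\sigma} \Big| v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U(\mu_t) \Big|^2_{T_{\mu_t}}, \qquad \partial_t h_b(t) = \frac{1}{2\sigma} \Big| v_t + \frac{\sigma}{2} \nabla^W \mathcal{H}_U(\mu_t) \Big|^2_{T_{\mu_t}}.$$

Proof. First, observe that equation (42) combined with Lemma 4.3 makes sure that all the exchanges of integrals and derivatives which follow are justified and that all the expressions make sense (i.e. all integrals are finite). We have

$$\mathcal{H}_U(\mu_t) \overset{(11),(42)}{=} \int_M \tilde{f}_t \tilde{g}_t \big( \log \tilde{f}_t \tilde{g}_t + 2U \big)\, d\mathrm{vol} = \underbrace{\int_M \tilde{f}_t \tilde{g}_t \big( \log \tilde{f}_t + U \big)\, d\mathrm{vol}}_{:=\, h_f(t)} + \underbrace{\int_M \tilde{f}_t \tilde{g}_t \big( \log \tilde{g}_t + U \big)\, d\mathrm{vol}}_{:=\, h_b(t)}. \tag{51}$$

We can write

$$h_f(t) = \frac{1}{2} \mathcal{H}_U(\mu_t) - \mathcal{E}_{\frac{1}{2}(\log \tilde{g}_t - \log \tilde{f}_t)}(\mu_t),$$

where the definition of $\mathcal{E}_{\frac{1}{2}(\log \tilde{g}_t - \log \tilde{f}_t)}$ is given at (24). Therefore

$$\partial_t h_f(t) = \frac{1}{2} \langle \nabla^W \mathcal{H}_U(\mu_t), v_t \rangle_{T_{\mu_t}} - \Big\langle \frac{1}{2} \nabla(\log \tilde{g}_t - \log \tilde{f}_t), v_t \Big\rangle_{T_{\mu_t}} - \mathcal{E}_{\frac{1}{2} \partial_t (\log \tilde{g}_t - \log \tilde{f}_t)}(\mu_t).$$

By Lemma 3.2 we have that $v_t = \frac{\sigma}{2} \nabla(\log \tilde{g}_t - \log \tilde{f}_t)$, so that

$$\Big\langle \nabla\Big( \frac{1}{2} \log \tilde{g}_t - \frac{1}{2} \log \tilde{f}_t \Big), v_t \Big\rangle_{T_{\mu_t}} = \frac{1}{\sigma} |v_t|^2_{T_{\mu_t}}.$$

Moreover,

$$\begin{aligned}
\mathcal{E}_{\frac{1}{2} \partial_t (\log \tilde{g}_t - \log \tilde{f}_t)}(\mu_t)
&= \frac{1}{2} \int_M \big( \partial_t \log \tilde{g}_t - \partial_t \log \tilde{f}_t \big)\, d\mu_t
\overset{(42)}{=} \frac{1}{2} \int_M \partial_t \tilde{g}_t\, \tilde{f}_t - \partial_t \tilde{f}_t\, \tilde{g}_t\, d\mathrm{vol} \\
&\overset{\sigma \times (76)}{=} -\frac{\sigma}{4} \int_M \Delta \tilde{g}_t\, \tilde{f}_t + \Delta \tilde{f}_t\, \tilde{g}_t\, d\mathrm{vol} + \sigma \int_M \mathscr{U}\, \tilde{f}_t \tilde{g}_t\, d\mathrm{vol} \\
&\overset{\text{int. by parts}}{=} \frac{\sigma}{2} \int_M \big( \nabla\log \tilde{g}_t \cdot \nabla\log \tilde{f}_t + 2\,\mathscr{U} \big)\, d\mu_t
\overset{\text{Lemma 3.5}}{=} -\frac{\sigma}{2}\, c(\mu, \nu).
\end{aligned}$$

Therefore, we have

$$\begin{aligned}
\partial_t h_f(t)
&= \frac{1}{2} \langle \nabla^W \mathcal{H}_U(\mu_t), v_t \rangle_{T_{\mu_t}} - \frac{1}{\sigma} |v_t|^2_{T_{\mu_t}} + \frac{\sigma}{2}\, c(\mu, \nu) \\
&\overset{\text{Cor. 1.1}}{=} \frac{1}{2} \langle \nabla^W \mathcal{H}_U(\mu_t), v_t \rangle_{T_{\mu_t}} - \frac{1}{2\sigma} |v_t|^2_{T_{\mu_t}} - \frac{\sigma}{8} \mathcal{I}_U(\mu_t) \\
&\overset{(38)}{=} \frac{1}{2} \langle \nabla^W \mathcal{H}_U(\mu_t), v_t \rangle_{T_{\mu_t}} - \frac{1}{2\sigma} |v_t|^2_{T_{\mu_t}} - \frac{\sigma}{8} |\nabla^W \mathcal{H}_U(\mu_t)|^2_{T_{\mu_t}} \\
&= -\frac{1}{2\sigma} \Big| v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U(\mu_t) \Big|^2_{T_{\mu_t}},
\end{aligned}$$

which is the desired conclusion. The proof of the other identity is analogous. □

In the last Lemma we compute the second derivative of the entropy.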
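Lemma 3.7. We have for all $0 < t < 1$

$$\partial_{tt} h_f(t) = \frac{1}{2} \Big\langle \mathrm{Hess}^W_{\mu_t} \mathcal{H}_U \Big( v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U(\mu_t) \Big),\ v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U(\mu_t) \Big\rangle_{T_{\mu_t}},$$

and

$$\partial_{tt} h_b(t) = \frac{1}{2} \Big\langle \mathrm{Hess}^W_{\mu_t} \mathcal{H}_U \Big( v_t + \frac{\sigma}{2} \nabla^W \mathcal{H}_U(\mu_t) \Big),\ v_t + \frac{\sigma}{2} \nabla^W \mathcal{H}_U(\mu_t) \Big\rangle_{T_{\mu_t}}.$$

Proof. Lemma 3.4 ensures that $(\mu_t)$ is regular. On the other hand, Lemma 4.3 combined with (42) and (37) shows that $\nabla^W \mathcal{H}_U(\mu_t) \in C_b^\infty$ on any $D_\varepsilon$. Thus, we can use Lemma 3.3 to conclude that $\frac{\mathbf{D}}{dt} \nabla^W \mathcal{H}_U(\mu_t)$ is well defined. Using Definition 2.9, we also obtain that

$$\frac{\mathbf{D}}{dt} \nabla^W \mathcal{H}_U(\mu_t) \Big|_{t=s} = \mathrm{Hess}^W_{\mu_s} \mathcal{H}_U(v_s), \tag{52}$$

whereas from the compatibility with the metric we get

$$\nabla^W \mathcal{I}_U(\mu_t) \overset{(38)}{=} \nabla^W |\nabla^W \mathcal{H}_U|^2_{T_{\mu_t}} = 2\, \mathrm{Hess}^W_{\mu_t} \mathcal{H}_U(\nabla^W \mathcal{H}_U), \tag{53}$$

where we wrote $\nabla^W \mathcal{H}_U$ in place of $\nabla^W \mathcal{H}_U(\mu_t)$ to simplify notation. We shall retain this convention in the next calculations. We thus have, using the compatibility with the metric,

$$\begin{aligned}
\partial_{tt} h_f(t)
&\overset{\text{Lemma 3.6}}{=} -\frac{1}{2\sigma}\, \partial_t \Big| v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U \Big|^2_{T_{\mu_t}} \\
&\overset{(35)}{=} -\frac{1}{\sigma} \Big\langle \frac{\mathbf{D}}{dt} \Big( v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U \Big),\ v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U \Big\rangle_{T_{\mu_t}} \\
&\overset{\text{Th. 1.2}}{=} -\Big\langle \frac{\sigma}{8} \nabla^W \mathcal{I}_U(\mu_t) - \frac{1}{2} \frac{\mathbf{D}}{dt} \nabla^W \mathcal{H}_U,\ v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U \Big\rangle_{T_{\mu_t}} \\
&\overset{(52)+(53)}{=} -\frac{1}{2} \Big\langle \mathrm{Hess}^W_{\mu_t} \mathcal{H}_U \Big( \frac{\sigma}{2} \nabla^W \mathcal{H}_U - v_t \Big),\ v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U \Big\rangle_{T_{\mu_t}} \\
&= \frac{1}{2} \Big\langle \mathrm{Hess}^W_{\mu_t} \mathcal{H}_U \Big( v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U \Big),\ v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U \Big\rangle_{T_{\mu_t}}.
\end{aligned}$$

The other identity is proven analogously. □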
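The central step in the proof of Lemma 3.7 combines Theorem 1.2 with (52) and (53). The sketch below isolates it. The shorthand $w_t$ is introduced only for this display, and the computation assumes that Theorem 1.2 reads $\frac{\mathbf{D}}{dt} v_t = \frac{\sigma^2}{8} \nabla^W \mathcal{I}_U(\mu_t)$ for general $\sigma$ and that $v \mapsto \mathrm{Hess}^W_{\mu_t} \mathcal{H}_U(v)$ is linear.

```latex
% Shorthand (not in the original text): w_t := v_t - (\sigma/2) \nabla^W H_U(\mu_t).
% Assumptions: Theorem 1.2 in the form D/dt v_t = (\sigma^2/8) \nabla^W I_U(\mu_t),
% identities (52) and (53), and linearity of v -> Hess^W_{\mu_t} H_U (v).
\begin{align*}
\frac{\mathbf{D}}{dt} w_t
  &= \frac{\sigma^2}{8}\,\nabla^W \mathcal{I}_U(\mu_t)
     - \frac{\sigma}{2}\,\frac{\mathbf{D}}{dt}\nabla^W \mathcal{H}_U(\mu_t) \\
  &= \frac{\sigma^2}{4}\,\mathrm{Hess}^W_{\mu_t}\mathcal{H}_U\big(\nabla^W \mathcal{H}_U\big)
     - \frac{\sigma}{2}\,\mathrm{Hess}^W_{\mu_t}\mathcal{H}_U(v_t)
     && \text{by (53) and (52)} \\
  &= -\frac{\sigma}{2}\,\mathrm{Hess}^W_{\mu_t}\mathcal{H}_U(w_t), \\[4pt]
-\frac{1}{\sigma}\Big\langle \frac{\mathbf{D}}{dt} w_t,\, w_t \Big\rangle_{T_{\mu_t}}
  &= \frac{1}{2}\,\big\langle \mathrm{Hess}^W_{\mu_t}\mathcal{H}_U(w_t),\, w_t \big\rangle_{T_{\mu_t}} .
\end{align*}
```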
Let us now prove Theorem 1.4.

Proof. We first assume that $f$ and $g$ are continuous. The Bakry-Émery condition (13) grants $\lambda$-convexity of the entropy. Therefore we get

$$\forall\, t \in (0,1), \qquad \partial_{tt} h_f(t) \overset{(40)}{\ge} \frac{\lambda}{2} \Big| v_t - \frac{\sigma}{2} \nabla^W \mathcal{H}_U \Big|^2_{T_{\mu_t}} \overset{\text{Lemma 3.6}}{=} -\lambda\sigma\, \partial_t h_f(t),$$

where the use of (40) is justified by the fact that $\mu_t \in C_+^\infty$ and $\nabla^W \mathcal{H}_U(\mu_t) \in C_b^\infty$. In the same way,

$$\forall\, t \in (0,1), \qquad \partial_{tt} h_b(t) \ge \lambda\sigma\, \partial_t h_b(t).$$

Since $f, g$ are continuous, it is easy to see that both $h_f$ and $h_b$ are continuous over the whole $[0,1]$. Moreover, Lemma 3.6 and Lemma 3.7 make sure that they are $C^2$ over $(0,1)$. Thus, we can apply Lemma 4.1 (see the Appendix) to obtain

$$h_f(t) \le h_f(0) + \big( h_f(1) - h_f(0) \big) \frac{1 - \exp(-\lambda\sigma t)}{1 - \exp(-\lambda\sigma)}$$

and

$$h_b(t) \le h_b(1) + \big( h_b(0) - h_b(1) \big) \frac{1 - \exp(-\lambda\sigma(1-t))}{1 - \exp(-\lambda\sigma)}.$$

Summing the two inequalities we obtain

$$\begin{aligned}
\big(1 - \exp(-\lambda\sigma)\big)\, \mathcal{H}_U(\mu_t)
&\le h_f(0) \big( \exp(-\lambda\sigma t) - \exp(-\lambda\sigma) \big) + h_b(0) \big( 1 - \exp(-\lambda\sigma(1-t)) \big) \\
&\quad + h_f(1) \big( 1 - \exp(-\lambda\sigma t) \big) + h_b(1) \big( \exp(-\lambda\sigma(1-t)) - \exp(-\lambda\sigma) \big) \\
&= \big( h_f(0) + h_b(0) \big) \big( 1 - \exp(-\lambda\sigma(1-t)) \big) + \big( h_f(1) + h_b(1) \big) \big( 1 - \exp(-\lambda\sigma t) \big) \\
&\quad - \big( h_f(0) + h_b(1) \big) \big( \exp(-\lambda\sigma t) - 1 \big) \big( \exp(-\lambda\sigma(1-t)) - 1 \big).
\end{aligned}$$

Dividing by $1 - \exp(-\lambda\sigma)$ we arrive, after some simple calculations, at

$$\begin{aligned}
\mathcal{H}_U(\mu_t)
&\le \big( h_f(0) + h_b(0) \big) \frac{1 - \exp(-\lambda\sigma(1-t))}{1 - \exp(-\lambda\sigma)} + \big( h_f(1) + h_b(1) \big) \frac{1 - \exp(-\lambda\sigma t)}{1 - \exp(-\lambda\sigma)} \\
&\quad - \big( h_f(0) + h_b(1) \big) \frac{\cosh(\frac{\lambda\sigma}{2}) - \cosh(\lambda\sigma(t - \frac{1}{2}))}{\sinh(\frac{\lambda\sigma}{2})}.
\end{aligned} \tag{54}$$

Observe that, by definition,

$$h_f(0) + h_b(0) = \mathcal{H}_U(\mu_0) = \mathcal{H}_U(\mu), \qquad h_f(1) + h_b(1) = \mathcal{H}_U(\mu_1) = \mathcal{H}_U(\nu). \tag{55}$$

From the definition of $\tilde{f}, \tilde{g}$, we obtain, using the standard properties of conditional expectation:

$$\begin{aligned}
h_f(0) &\overset{(51)}{=} \int_M \tilde{f}_0 \tilde{g}_0 \big( \log \tilde{f}_0 + U \big)\, d\mathrm{vol} = \int_M f_0 g_0 \log f_0\, d\mathbf{m} \\
&= E_{\mathbf{P}} \Big( f(X_0)\, E_{\mathbf{P}}[ g(X_1) \mid X_0 ]\, \log f(X_0) \Big) = E_{\mathbf{P}} \Big( f(X_0)\, g(X_1)\, \log f(X_0) \Big).
\end{aligned}$$

Using the same argument we can show that $h_b(1) = E_{\mathbf{P}} \big( f(X_0)\, g(X_1)\, \log g(X_1) \big)$, meaning that

$$h_f(0) + h_b(1) = E_{\mathbf{P}} \Big( f(X_0)\, g(X_1)\, \log[ f(X_0)\, g(X_1) ] \Big) = E_{\hat{Q}} \Big( \log \frac{d\hat{Q}}{d\mathbf{P}} \Big) = \mathcal{T}^\sigma_U(\mu, \nu). \tag{56}$$

Plugging (55) and (56) into (54) yields the conclusion. The general case, when $f, g$ are not continuous, is obtained with a standard approximation argument. □
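The "simple calculations" leading from the summed inequality to (54) reduce to one elementary identity for the coefficient of $h_f(0) + h_b(1)$. It is sketched below for convenience; the shorthand $a := \lambda\sigma$ is introduced only for this display, and $a \neq 0$ is assumed.

```latex
% Shorthand (not in the original text): a := \lambda\sigma, assumed nonzero.
\begin{align*}
\frac{\big(e^{-at}-1\big)\big(e^{-a(1-t)}-1\big)}{1-e^{-a}}
  &= \frac{1 + e^{-a} - e^{-at} - e^{-a(1-t)}}{1-e^{-a}} \\
  &= \frac{e^{a/2} + e^{-a/2} - e^{-a(t-\frac12)} - e^{a(t-\frac12)}}{e^{a/2}-e^{-a/2}}
     && \text{(multiply numerator and denominator by } e^{a/2}\text{)} \\
  &= \frac{\cosh\!\big(\tfrac{a}{2}\big) - \cosh\!\big(a\,(t-\tfrac12)\big)}{\sinh\!\big(\tfrac{a}{2}\big)} .
\end{align*}
```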
Second proof of (i) ⇒ (ii). In this proof we assume for simplicity $\sigma = 1$. There is no difficulty in extending it to the general case. We recall the definitions of the $\Gamma$ and $\Gamma_2$ operators on $C^\infty \times C^\infty$:

$$\Gamma(f, g) := \frac{1}{2} \big[ \mathscr{L}(fg) - f(\mathscr{L}g) - g(\mathscr{L}f) \big], \tag{57}$$

$$\Gamma_2(f, g) := \frac{1}{2} \big[ \mathscr{L}\Gamma(f, g) - \Gamma(\mathscr{L}f, g) - \Gamma(f, \mathscr{L}g) \big]. \tag{58}$$

Since $M$ is compact, the integration by parts formula for the invariant measure $\mathbf{m}$ tells that for any $f, g \in C^\infty$ $(= C_b^\infty = C_c^\infty)$

$$\int_M f\, \mathscr{L}g\, d\mathbf{m} = \int_M g\, \mathscr{L}f\, d\mathbf{m} = -\int_M \Gamma(f, g)\, d\mathbf{m}, \tag{59}$$

see the monograph [3] for details. The next two Lemmas are the analogues of Lemmas 3.6 and 3.7. In their proofs, we will often exchange integrals and time derivatives, and use the integration by parts formula (59). All these operations are justified by Lemma 4.3, where it is proven that $f_t, g_t, \log f_t, \log g_t$ are of class $C_b^\infty$ over $D_\varepsilon$, for any $\varepsilon \in (0,1)$.

Lemma 3.8. Let $f, g$ be given by (2) and $f_t, g_t$ be given by (3). Assume that $f$ and $g$ are of class $C_{b,+}^\infty$. Then for any $t \in [0,1]$ we can write $\mathcal{H}_U(\mu_t) = h_f(t) + h_b(t)$, where

$$h_f(t) = \int_M \log f_t\; f_t g_t\, d\mathbf{m}, \qquad h_b(t) = \int_M \log g_t\; f_t g_t\, d\mathbf{m}. \tag{60}$$

Moreover, for all $0 < t < 1$:

$$\partial_t h_f(t) = -\int_M \Gamma(\log f_t, \log f_t)\, f_t g_t\, d\mathbf{m}, \qquad \partial_t h_b(t) = \int_M \Gamma(\log g_t, \log g_t)\, f_t g_t\, d\mathbf{m}. \tag{61}$$

Proof. From (42) we get

$$\mathcal{H}_U(\mu_t) = \int_M f_t g_t \log(f_t g_t)\, d\mathbf{m},$$

which yields (60). Recall that $f_t, g_t$ and their logarithms are classical solutions over $[0,1] \times M$ of

$$\begin{aligned}
\partial_t f_t &= \mathscr{L} f_t, & \partial_t \log f_t &= \mathscr{L} \log f_t + \Gamma(\log f_t, \log f_t), \\
\partial_t g_t &= -\mathscr{L} g_t, & \partial_t \log g_t &= -\mathscr{L} \log g_t - \Gamma(\log g_t, \log g_t),
\end{aligned} \tag{62}$$

where we used the fact that $\Gamma(f, f) = \frac{1}{2} |\nabla f|^2$. Hence

$$\partial_t h_f(t) \overset{(62)}{=} \int_M (\mathscr{L} f_t)\, g_t \log f_t - (\mathscr{L} g_t)\, f_t \log f_t + \mathscr{L}(\log f_t)\, f_t g_t + \Gamma(\log f_t, \log f_t)\, f_t g_t\, d\mathbf{m}.$$

Using integration by parts, we get

$$\begin{aligned}
&\int_M (\mathscr{L} f_t)\, g_t \log f_t - (\mathscr{L} g_t)\, f_t \log f_t + \mathscr{L}(\log f_t)\, f_t g_t + \Gamma(\log f_t, \log f_t)\, f_t g_t\, d\mathbf{m} \\
&\quad = \int_M (\mathscr{L} f_t)\, g_t \log f_t - g_t\, \mathscr{L}(f_t \log f_t) + \mathscr{L}(\log f_t)\, f_t g_t + \Gamma(\log f_t, \log f_t)\, f_t g_t\, d\mathbf{m} \\
&\quad = \int_M \Gamma(\log f_t, \log f_t)\, f_t g_t - 2\, \Gamma(\log f_t, f_t)\, g_t\, d\mathbf{m} \\
&\quad = -\int_M \Gamma(\log f_t, \log f_t)\, f_t g_t\, d\mathbf{m}.
\end{aligned}$$

The other identity is derived analogously. □
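The last two equalities in the proof of Lemma 3.8 rest on the product rule encoded in (57) and on the chain rule for $\Gamma$. The sketch below spells them out, under the assumption (standard for diffusion generators such as $\mathscr{L}$, but not restated in the text) that $\Gamma(\phi(f), g) = \phi'(f)\, \Gamma(f, g)$ for smooth $\phi$.

```latex
% Assumption: \Gamma is the carre du champ (57) of a diffusion generator \mathscr{L},
% so that \Gamma(\phi(f), g) = \phi'(f)\,\Gamma(f, g) for smooth \phi.
\begin{align*}
\mathscr{L}(f_t \log f_t)
  &= (\mathscr{L} f_t)\,\log f_t + f_t\,\mathscr{L}(\log f_t) + 2\,\Gamma(f_t, \log f_t)
     && \text{(definition (57))} \\
\Gamma(f_t, \log f_t)
  &= f_t\,\Gamma(\log f_t, \log f_t)
     && \text{(chain rule, since } f_t = e^{\log f_t}\text{)} \\
(\mathscr{L} f_t)\,\log f_t - \mathscr{L}(f_t \log f_t) + f_t\,\mathscr{L}(\log f_t)
  &= -2\, f_t\,\Gamma(\log f_t, \log f_t) .
\end{align*}
% Multiplying the last line by g_t and adding \Gamma(\log f_t, \log f_t) f_t g_t
% gives the integrand -\Gamma(\log f_t, \log f_t) f_t g_t appearing in (61).
```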
Lemma 3.9. Under the same hypotheses as in Lemma 3.8 we have, for all $0 < t < 1$,

$$\partial_{tt} h_f(t) = 2 \int_M \Gamma_2(\log f_t, \log f_t)\, f_t g_t\, d\mathbf{m}, \qquad \partial_{tt} h_b(t) = 2 \int_M \Gamma_2(\log g_t, \log g_t)\, f_t g_t\, d\mathbf{m}. \tag{63}$$

Proof. Using Lemma 3.8 we get

$$\begin{aligned}
\partial_{tt} h_f(t) \overset{(62)}{=} -\int_M \Big[ &(\mathscr{L} f_t)\, \Gamma(\log f_t, \log f_t)\, g_t - (\mathscr{L} g_t)\, \Gamma(\log f_t, \log f_t)\, f_t \\
&+ 2\, \Gamma(\mathscr{L} \log f_t, \log f_t)\, f_t g_t + 2\, \Gamma\big( \Gamma(\log f_t, \log f_t), \log f_t \big)\, f_t g_t \Big]\, d\mathbf{m}.
\end{aligned} \tag{64}$$

We can now use (59) and the fact that $\Gamma$ is a derivation to rewrite the last term in the above expression:

$$\begin{aligned}
2 \int_M \Gamma\big( \Gamma(\log f_t, \log f_t), \log f_t \big)\, f_t g_t\, d\mathbf{m}
&= 2 \int_M \Gamma\big( \Gamma(\log f_t, \log f_t), f_t \big)\, g_t\, d\mathbf{m} \\
&\overset{(57)}{=} \int_M \mathscr{L}\big( f_t\, \Gamma(\log f_t, \log f_t) \big)\, g_t - \mathscr{L}\Gamma(\log f_t, \log f_t)\, f_t g_t - (\mathscr{L} f_t)\, \Gamma(\log f_t, \log f_t)\, g_t\, d\mathbf{m} \\
&\overset{(59)}{=} \int_M (\mathscr{L} g_t)\, \Gamma(\log f_t, \log f_t)\, f_t - \mathscr{L}\Gamma(\log f_t, \log f_t)\, f_t g_t - (\mathscr{L} f_t)\, \Gamma(\log f_t, \log f_t)\, g_t\, d\mathbf{m}.
\end{aligned}$$

Plugging this expression into (64), and using the definition of $\Gamma_2$, we arrive at

$$\partial_{tt} h_f(t) = \int_M \Big( \mathscr{L}\Gamma(\log f_t, \log f_t) - 2\, \Gamma(\mathscr{L} \log f_t, \log f_t) \Big)\, g_t f_t\, d\mathbf{m} = 2 \int_M \Gamma_2(\log f_t, \log f_t)\, f_t g_t\, d\mathbf{m},$$

which is the desired conclusion. The other identity is proven analogously. □

Let us now complete the proof of Theorem 1.4.

Proof. Let $\mu, \nu$ be such that $f, g \in C_{b,+}^\infty$. It is well known that under (13)

$$\forall\, f \in C^\infty, \qquad \Gamma_2(f, f) \ge \frac{\lambda}{2}\, \Gamma(f, f).$$

In view of Lemmas 3.8 and 3.9, this implies that

$$\partial_{tt} h_f(t) \ge -\lambda\, \partial_t h_f(t), \qquad \partial_{tt} h_b(t) \ge \lambda\, \partial_t h_b(t).$$

From this point on, the proof goes as in the previous case, and we do not repeat it. The case when $f, g$ are not both in $C_{b,+}^\infty$ follows with a standard approximation argument. □

Footnote: the constant $\frac{\lambda}{2}$ appears instead of $\lambda$ because the generator $\mathscr{L}$ has $\frac{1}{2}\Delta$ as second order part instead of $\Delta$.
Proof of (ii) ⇒ (i).Proof. Let us choose µ and ν such that µ, ν (cid:28) m and with bounded density, and let ( µ εt ) be the marginal flow of SB ( µ, ν, ε L ) . Then, equation (14) tells that ∀ t ∈ [0 , H U ( µ εt ) ≤ − exp( − ελ (1 − t ))1 − exp( − ελ ) H U ( µ ) + 1 − exp( − ελt )1 − exp( − ελ ) H U ( ν ) − cosh( ελ ) − cosh( − ελ ( t − ))sinh( ελ ) T ε H U ( µ, ν ) . Next, as it easy to check, our hypothesis allows us to apply the results of [26, sec 6].They tell that, • For t ∈ [0 , , lim ε → µ εt → µ t , where the limit is intented in the weak sense and ( µ t ) is the unique constantspeed geodesic between µ and ν . • lim ε → ε T ε H U ( µ, ν ) = 12 W ( µ, ν ) . In view of this, taking the lim inf at both sides in the equation above and using the lowersemicontinuity of H U yields(65) H U ( µ t ) ≤ (1 − t ) H U ( µ ) + t H U ( ν ) − λ t (1 − t ) W ( µ, ν ) Therefore we obtained that the entropy is λ convex along displacement interpolations,provided the initial and final measure are absolutely continuous with bounded density.It is well known that λ -convexity along all geodesics implies the condition (13). How-ever, the proof of this fact given in [56] uses only geodesics between uniform measureson balls, which are clearly among those for which we can prove (65). The conclusionfollows. (cid:3) Proof of Corollary 1.3.Proof. If H U ( µ ) = + ∞ , there is nothing to prove since, T σU ( µ, ν ) = + ∞ as well. Let H U ( µ ) < + ∞ and ν = m . Then we can apply Theorem 1.4. After dividing by (1 − t ) the bound (14) becomes, observing that entropy is non negative and cosh symmetricaround t = 0 : ≤ − exp( − λ (1 − t ))1 − t − exp( − λ ) H U ( µ ) − cosh( λ ) − cosh( λ − λ (1 − t ))(1 − t ) sinh( λ ) T σU ( µ, ν ) . Letting t → the conclusion follows. (cid:3) Proof of Theorem 1.5.
We prove two preparatory Lemmas, and then the Theorem. The first Lemma is a rigorous proof of equation (21).

Lemma 3.10.
The vector field ∇ W I U ( µ t ) is absolutely continuous along ( µ t ) . Moreover (cid:104) D dt ∇ W I U ( µ t ) , v t (cid:105) T µt = 2 (cid:90) R d | Dv t · ∇ log µ t + ∇ div ( v t ) | dµ t (66) − (cid:90) R d d (cid:88) k,j =1 ( Dv t · Dv t ) kj ∂ kj log µ t dµ t + 8 (cid:90) R d (cid:104) v t , Hess U · v t (cid:105) dµ t . Proof.
Point (i) of the assumption make sure that ( µ t ) is regular. (ii) combined withLemma 3.3 grant the desired absolute continuity and that d dt ∇ W I U ( µ t ) = ∂ t ∇ W I U ( µ t ) + D ∇ W I U ( µ t ) t · v t . Since v t ∈ T µ t we also have, by definition of covariant derivative (cid:104) D dt ∇ W I U ( µ t ) , v t (cid:105) T µt = (cid:104) d dt ∇ W I U ( µ t ) , v t (cid:105) T µt Let us now prove (66). First, we do it for the case when U = 0 . In this case, from (5) wehave ∇ W I U ( µ t ) = α t + β t with α t = −∇|∇ log µ t | , β t = − ∇ ∆ log µ t From now on, we drop the dependence on t both in µ t and v t and adopt Einstein’sconvention for indexes. Moreover, we abbreviate ∂ x k with ∂ k and we are going to use,without mentioning it, the fact that v is a gradient vector field, i.e. ∂ k v j = ∂ j v k . Since µ t ∈ C + ∞ , we can use the continuity equation in the form ∂ t log µ = − div ( v ) + ∇ log µ · v to get ( ∂ t α t + Dα t · v t ) i = − ∂ i ( ∂ k log µ ∂ k ∂ t log µ ) − ∂ i ( ∂ kj log µ ∂ k log µ ) v j = 2 ∂ i (cid:0) ∂ k log µ ∂ k ( ∂ j v j + ∂ j log µ v j ) (cid:1) − ∂ i ( ∂ jk log µ ∂ k log µ ) v j = 2 ∂ i (cid:0) ∂ k log µ ∂ jk v j + ∂ k log µ ∂ j log µ ∂ k v j (cid:1) + 2 ∂ i (cid:0) ∂ k log µ ∂ kj log µ∂ k v j (cid:1) − ∂ i ( ∂ jk log µ ∂ k log µ ) v j = 2 ∂ i (cid:0) ∂ k log µ ∂ jk v j + ∂ k log µ ∂ j log µ ∂ k v j (cid:1) + 2 ∂ kj log µ∂ k log µ ∂ i v j . Therefore (cid:104) ∂ t α t + Dα t , v t (cid:105) T µt = 2 (cid:90) ∂ i (cid:0) ∂ k log µ ∂ jk v j ) v i dµ (cid:124) (cid:123)(cid:122) (cid:125) := A + 2 (cid:90) ∂ i ( ∂ k log µ ∂ j log µ ∂ k v j (cid:1) v i dµ (cid:124) (cid:123)(cid:122) (cid:125) := A. + 2 (cid:90) ∂ kj log µ ∂ k log µ∂ i v j v i dµ (cid:124) (cid:123)(cid:122) (cid:125) := A. . CHR ¨ODINGER BRIDGES, HOT GAS EXPERIMENT AND ENTROPIC TRANSPORTATION COST 29
Let us now compute $\partial_t\beta_t + D\beta_t\cdot v_t$. Using the continuity equation, we have
\[ \begin{aligned}
(\partial_t\beta_t + D\beta_t\cdot v_t)_i &= 2\,\partial_{ikk}(\partial_j v_j + v_j\,\partial_j\log\mu) - 2\,\partial_{ijkk}\log\mu\,v_j \\
&= 2\,\partial_{ijkk}v_j + 2\,\partial_i\big(\partial_j\log\mu\,\partial_{kk}v_j + 2\,\partial_{kj}\log\mu\,\partial_k v_j + \partial_{jkk}\log\mu\,v_j\big) - 2\,\partial_{ijkk}\log\mu\,v_j \\
&= 2\,\partial_{ijkk}v_j + 2\,\partial_j\log\mu\,\partial_{ikk}v_j + 2\,\partial_{ij}\log\mu\,\partial_{kk}v_j + 4\,\partial_{ikj}\log\mu\,\partial_k v_j + 4\,\partial_{kj}\log\mu\,\partial_{ik}v_j \\
&\quad + 2\,\partial_{ijkk}\log\mu\,v_j + 2\,\partial_{jkk}\log\mu\,\partial_i v_j - 2\,\partial_{ijkk}\log\mu\,v_j \\
&= 2\,\partial_{ijkk}v_j + 2\,\partial_j\log\mu\,\partial_{ikk}v_j + 2\,\partial_{ij}\log\mu\,\partial_{kk}v_j + 4\,\partial_{ikj}\log\mu\,\partial_k v_j + 4\,\partial_{kj}\log\mu\,\partial_{ik}v_j + 2\,\partial_{jkk}\log\mu\,\partial_i v_j .
\end{aligned} \]
Thus,
\[ \begin{aligned}
\langle \partial_t\beta_t + D\beta_t\cdot v_t, v_t\rangle_{T_{\mu_t}} &= \underbrace{2\int \partial_{ijkk}v_j\,v_i\,d\mu}_{:=B.1} + \underbrace{2\int \partial_j\log\mu\,\partial_{ikk}v_j\,v_i\,d\mu}_{:=B.2} + \underbrace{2\int \partial_{ij}\log\mu\,\partial_{kk}v_j\,v_i\,d\mu}_{:=B.3} \\
&\quad + \underbrace{4\int \partial_{ijk}\log\mu\,\partial_k v_j\,v_i\,d\mu}_{:=B.4} + \underbrace{4\int \partial_{kj}\log\mu\,\partial_{ik}v_j\,v_i\,d\mu}_{:=B.5} + \underbrace{2\int \partial_{jkk}\log\mu\,\partial_i v_j\,v_i\,d\mu}_{:=B.6} .
\end{aligned} \]
Finally we define
\[ C.1 := \int (Dv\cdot Dv)_{ij}\,\partial_{ij}\log\mu\,d\mu, \quad C.2 := \int |\nabla\mathrm{div}(v)|^2\,d\mu, \quad C.3 := \int |Dv\cdot\nabla\log\mu|^2\,d\mu, \quad C.4 := \int \langle\nabla\mathrm{div}(v), Dv\cdot\nabla\log\mu\rangle\,d\mu, \]
where we denote by $Dv\cdot Dv$ the usual matrix product. In the following lines we perform a series of integrations by parts, which are all justified by point (iii) of the hypothesis. We first integrate B.1 twice by parts:
\[ B.1 = 2\int \partial_{jk}v_j\,\partial_{ik}v_i\,d\mu + 2\int \partial_{jk}v_j\,\partial_i v_i\,\partial_k\mu\,d\mathrm{vol} + 2\int \partial_{jk}v_j\,\partial_k v_i\,\partial_i\mu\,d\mathrm{vol} + 2\int \partial_{jk}v_j\,v_i\,\partial_{ik}\mu\,d\mathrm{vol} . \]
Next, we integrate the second term once again by parts and obtain
\[ \int \partial_{jk}v_j\,\partial_i v_i\,\partial_k\mu\,d\mathrm{vol} = -\int \partial_{jk}v_j\,v_i\,\partial_{ik}\mu\,d\mathrm{vol} - \int \partial_{ijk}v_j\,v_i\,\partial_k\mu\,d\mathrm{vol} . \]
Plugging this back, we get
\[ \begin{aligned}
B.1 &= 2\int \partial_{jk}v_j\,\partial_{ik}v_i\,d\mu + 2\int \partial_{jk}v_j\,\partial_k v_i\,\partial_i\mu\,d\mathrm{vol} - 2\int \partial_{ijk}v_j\,v_i\,\partial_k\mu\,d\mathrm{vol} \\
&= \underbrace{2\int \partial_{jk}v_j\,\partial_{ik}v_i\,d\mu}_{=2C.2} + \underbrace{2\int \partial_{jk}v_j\,\partial_k v_i\,\partial_i\log\mu\,d\mu}_{=2C.4} - 2\int \partial_{ijk}v_j\,v_i\,\partial_k\log\mu\,d\mu \\
&= 2C.2 + 2C.4\ \underbrace{-\,2\int \partial_{ijj}v_k\,v_i\,\partial_k\log\mu\,d\mu}_{=-B.2},
\end{aligned} \]
where the last identity is obtained by relabeling $j$ with $k$ and vice versa. Thus, $\sum_{i=1}^{6} B.i = 2C.2 + 2C.4 + \sum_{i=3}^{6} B.i$.
Now, let us integrate A.3 once by parts:
\[ A.3 = 2\int \partial_{kj}\log\mu\,\partial_k\mu\,\partial_i v_j\,v_i\,d\mathrm{vol} = \underbrace{-2\int \partial_{jkk}\log\mu\,\partial_i v_j\,v_i\,d\mu}_{=-B.6}\ \underbrace{-\,2\int \partial_{kj}\log\mu\,\partial_i v_j\,\partial_k v_i\,d\mu}_{=-2C.1}\ \underbrace{-\,2\int \partial_{jk}\log\mu\,\partial_{ik}v_j\,v_i\,d\mu}_{=-\frac12 B.5} . \]
Thus,
\[ (67)\qquad A.3 + \sum_{i=1}^{6} B.i = -2C.1 + 2C.2 + 2C.4 + B.3 + B.4 + \tfrac12 B.5 . \]
Next, we observe that, using the product rule and exchanging $j$ and $k$, we have $A.1 = A.1.1 + B.3$, where
\[ A.1.1 := 2\int \partial_j\log\mu\,\partial_{ijk}v_k\,v_i\,d\mu . \]
Let us now turn to A.2. Using the product rule, and exchanging $j$ with $k$:
\[ A.2 = \underbrace{4\int \partial_{ij}\log\mu\,\partial_k\log\mu\,\partial_j v_k\,v_i\,d\mu}_{:=A.2.1} + \underbrace{2\int \partial_j\log\mu\,\partial_k\log\mu\,\partial_{ij}v_k\,v_i\,d\mu}_{:=A.2.2} . \]
Using that $\partial_k\log\mu\,d\mu = \partial_k\mu\,d\mathrm{vol}$, we integrate A.2.2 by parts:
\[ A.2.2 = \underbrace{-2\int \partial_{jk}\log\mu\,\partial_{ij}v_k\,v_i\,d\mu}_{=-\frac12 B.5}\ \underbrace{-\,2\int \partial_{ijk}v_k\,\partial_j\log\mu\,v_i\,d\mu}_{=-A.1.1}\ \underbrace{-\,2\int \partial_j\log\mu\,\partial_{ij}v_k\,\partial_k v_i\,d\mu}_{:=A.2.2.1} . \]
Therefore $A.1 + A.2 = B.3 - \tfrac12 B.5 + A.2.1 + A.2.2.1$. Combining this with (67) we get
\[ \sum_{i=1}^{3} A.i + \sum_{i=1}^{6} B.i = -2C.1 + 2C.2 + 2C.4 + 2B.3 + B.4 + A.2.1 + A.2.2.1 . \]
Let us now integrate A.2.2.1 by parts:
\[ A.2.2.1 = \underbrace{2\int \partial_{ij}\log\mu\,\partial_j v_k\,\partial_k v_i\,d\mu}_{=2C.1} + \underbrace{2\int \partial_j\log\mu\,\partial_j v_k\,\partial_{ik}v_i\,d\mu}_{=2C.4} + \underbrace{2\int \partial_j\log\mu\,\partial_j v_k\,\partial_k v_i\,\partial_i\log\mu\,d\mu}_{=2C.3} . \]
Thus, $\sum_{i=1}^{3} A.i + \sum_{i=1}^{6} B.i = 2C.2 + 2C.3 + 4C.4 + 2B.3 + B.4 + A.2.1$. Lastly, we integrate B.4 by parts. We get
\[ B.4 = \underbrace{-4\int \partial_{ij}\log\mu\,\partial_{kk}v_j\,v_i\,d\mu}_{=-2B.3}\ \underbrace{-\,4\int \partial_{ij}\log\mu\,\partial_k v_j\,\partial_k v_i\,d\mu}_{=-4C.1}\ \underbrace{-\,4\int \partial_{ij}\log\mu\,\partial_k v_j\,v_i\,\partial_k\log\mu\,d\mu}_{=-A.2.1} . \]
Hence, we can conclude that $\sum_{i=1}^{3} A.i + \sum_{i=1}^{6} B.i = -4C.1 + 2C.2 + 2C.3 + 4C.4$. It is an easy calculation to see that this is indeed the right hand side of (66). Since
\[ \sum_{i=1}^{3} A.i + \sum_{i=1}^{6} B.i = \langle \partial_t\alpha_t + D\alpha_t\cdot v_t, v_t\rangle_{T_{\mu_t}} + \langle \partial_t\beta_t + D\beta_t\cdot v_t, v_t\rangle_{T_{\mu_t}} = \Big\langle \frac{\mathbf{D}}{dt}\nabla^W I(\mu_t), v_t\Big\rangle_{T_{\mu_t}}, \]
the Lemma is proven for the case when $U = 0$. The general case follows observing that $\nabla^W I_U(\mu_t) = \nabla^W I(\mu_t) + 8\nabla U$. The conclusion then follows with an easy calculation. $\square$

In the next Lemma we establish log-concavity of $\mu_t$.

Lemma 3.11.
For all $0 \le t \le 1$, $\mu_t$ is a log-concave measure.

Proof. From (1) we have
\[ \mu_t(\cdot) = \int_{\mathbb{R}^d\times\mathbb{R}^d} P^{xy}(X_t \in \cdot)\,\hat\pi(dx\,dy), \]
where $P^{xy}$ is the bridge of $P$ between $x$ and $y$. It has been shown in [12, Thm 2.1] that if $U$ is convex, $P^{xy}(X_t \in \cdot)$ is a log-concave distribution for all $x, y$. Thus, because of point (iv) of the hypothesis, $\mu_t$ is a log-concave mixture of log-concave measures, and is therefore log-concave itself. $\square$

We can now proceed to the proof of the Theorem.
Proof.
Since the hypotheses of Theorem 1.3 hold, we can differentiate the Fisher information once in time to get
\[ \partial_t I_U(\mu_t) = \langle \nabla^W I_U(\mu_t), v_t\rangle_{T_{\mu_t}} . \]
Using Lemma 3.10 in combination with Theorem 1.3 and the compatibility with the metric we get
\[ (68)\qquad \partial_{tt} I_U(\mu_t) = \Big\langle \frac{\mathbf{D}}{dt}\nabla^W I_U(\mu_t), v_t\Big\rangle_{T_{\mu_t}} + \frac18\,|\nabla^W I_U(\mu_t)|^2_{T_{\mu_t}} . \]
We know from Lemma 3.11 that $\mu_t$ is a log-concave measure. Therefore, using Schur's product Theorem [28, Th 7.5.3] in the second term of the rhs of (66) we get
\[ \Big\langle \frac{\mathbf{D}}{dt}\nabla^W I_U(\mu_t), v_t\Big\rangle_{T_{\mu_t}} \ge 8\,\langle v_t, \mathrm{Hess}\,U\cdot v_t\rangle_{T_{\mu_t}} \ge 8\alpha\,|v_t|^2_{T_{\mu_t}}, \]
since $U$ is $\alpha$-convex. Plugging this back into (68) we get
\[ \partial_{tt} I_U(\mu_t) \ge \frac18\,|\nabla^W I_U(\mu_t)|^2_{T_{\mu_t}} + 8\alpha\,|v_t|^2_{T_{\mu_t}}, \]
which proves the desired convexity and (19). $\square$

Proof of Theorem 1.6.
Proof.
From the Feynman-Kac formula we have that $f_t, g_t$ solve
\[ \partial_t f_t(x) = \tfrac12\Delta f_t(x) - V f_t(x), \qquad \partial_t g_t = -\tfrac12\Delta g_t + V g_t \]
on any $D_\varepsilon$. By taking logarithms, and using the positivity of $f_t, g_t$, we get
\[ (69)\qquad \begin{aligned}
\partial_t \log f_t(x) &= \tfrac12\Delta\log f_t(x) + \tfrac12|\nabla\log f_t|^2 - V, \\
\partial_t \log g_t(x) &= -\tfrac12\Delta\log g_t(x) - \tfrac12|\nabla\log g_t|^2 + V .
\end{aligned} \]
One can check directly, as in the proof of Theorem 1.2, that $\tfrac12\nabla(\log g_t - \log f_t)$ solves the continuity equation in the classical sense. Assumption (26) ensures enough integrability to conclude that this vector field is actually the velocity field of $(\mu_t)$. Moreover, (26) ensures that $(\mu_t)$ is a regular curve. If we replace (50) with (69) and use the same argument as in the proof of Theorem 1.2 we get
\[ \partial_t v_t + \tfrac12\nabla|v_t|^2 = \nabla^W\big(\tfrac18 I + E_K\big) . \]
Then, using (25) and Lemma 3.3, we get that the covariant derivative exists and has the desired form. $\square$

ACKNOWLEDGMENTS
The author acknowledges support from CEMPI Lille and the University of Lille 1. He also wishes to thank Christian Léonard for having introduced him to the Schrödinger problem, and for many fruitful discussions.
4. APPENDIX

The following Lemma has been used in the proof of Theorem 1.4. Here, we denote by $\dot f, \ddot f$ the first and second derivatives of a function on the real line.

Lemma 4.1.
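Let $\varphi : [0,1] \to \mathbb{R}$ be twice differentiable on $(0,1)$ and continuous on $[0,1]$.
(i) If $\ddot\varphi_t \ge \lambda\,\dot\varphi_t$ for all $t \in (0,1)$, then
\[ (70)\qquad \forall t \in [0,1], \quad \varphi_t \le \varphi_1 + (\varphi_0 - \varphi_1)\,\frac{1 - \exp(-\lambda(1-t))}{1 - \exp(-\lambda)} . \]
(ii) If $\ddot\varphi_t \ge -\lambda\,\dot\varphi_t$ for all $t \in (0,1)$, then
\[ \forall t \in [0,1], \quad \varphi_t \le \varphi_0 + (\varphi_1 - \varphi_0)\,\frac{1 - \exp(-\lambda t)}{1 - \exp(-\lambda)} . \]
Note that the rhs of (70) rewrites nicely as
\[ \frac{\exp(\lambda) - \exp(\lambda t)}{\exp(\lambda) - 1}\,\varphi_0 + \frac{\exp(\lambda t) - 1}{\exp(\lambda) - 1}\,\varphi_1 . \]
Proof.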
We prove only (i), as (ii) follows from (i) with a simple time-reversal argument. Let $g$ be the unique solution of the differential equation
\[ (71)\qquad \ddot g_t = \lambda\,\dot g_t, \quad 0 < t < 1, \qquad g_0 = \varphi_0, \quad g_1 = \varphi_1 . \]
All we have to show is $h := \varphi - g \le 0$, because a direct calculation shows that the solution of (71) is
\[ g_t = \varphi_1 + (\varphi_0 - \varphi_1)\,\frac{1 - \exp(-\lambda(1-t))}{1 - \exp(-\lambda)} . \]
We see that $\ddot h_t \ge \lambda\,\dot h_t$ for $0 < t < 1$, with $h_0 = h_1 = 0$. Considering the function $u_t := e^{-\lambda t}\dot h_t$, $0 \le t \le 1$, we have $\dot u_t = e^{-\lambda t}[\ddot h_t - \lambda\dot h_t] \ge 0$, which implies that $u$ is increasing, that is:
\[ (72)\qquad \dot h_t \ge \dot h_{t^*}\,e^{\lambda(t - t^*)}, \qquad 0 \le t^* \le t \le 1 . \]
Suppose ad absurdum that $h_{t_o} > 0$ for some $0 < t_o < 1$. As $h_0 = 0$, there exists some $0 < t^* \le t_o$ such that $h_{t^*} > 0$ and $\dot h_{t^*} > 0$. In view of (72), this implies that $h$ is increasing on $[t^*, 1]$. In particular, $h_1 \ge h_{t^*} > 0$, contradicting $h_1 = 0$. Hence $h \le 0$. $\square$

4.1. Hessian of the Entropy and gradient of the Fisher information.
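As an aside on Lemma 4.1(i), the comparison bound (70) can be sanity-checked numerically. The sketch below is only an illustration under stated assumptions: the test function $\varphi_t = t^2$ and the value $\lambda = 1$ are hypothetical choices made here (they satisfy $\ddot\varphi_t = 2 \ge 2t = \lambda\dot\varphi_t$ on $(0,1)$), and the function names are not taken from the article. It evaluates the right-hand side of (70) in both of the forms given in the statement and checks on a grid that $\varphi$ stays below it.

```python
import math


def bound_70(t, phi0, phi1, lam):
    """Right-hand side of (70), i.e. the solution g of g'' = lam * g'
    with g(0) = phi0 and g(1) = phi1."""
    weight = (1.0 - math.exp(-lam * (1.0 - t))) / (1.0 - math.exp(-lam))
    return phi1 + (phi0 - phi1) * weight


def bound_70_convex_form(t, phi0, phi1, lam):
    """Same quantity, written as the combination of phi0 and phi1
    given in the remark after the statement of the Lemma."""
    denom = math.exp(lam) - 1.0
    w0 = (math.exp(lam) - math.exp(lam * t)) / denom
    w1 = (math.exp(lam * t) - 1.0) / denom
    return w0 * phi0 + w1 * phi1


def phi(t):
    # Illustrative test function (an assumption of this sketch):
    # phi'' = 2 >= 1 * phi' = 2t on (0, 1), so hypothesis (i) holds with lam = 1.
    return t * t


lam = 1.0
grid = [k / 200.0 for k in range(201)]

# Largest value of phi_t minus the bound over the grid; Lemma 4.1(i) predicts <= 0.
max_excess = max(phi(t) - bound_70(t, phi(0.0), phi(1.0), lam) for t in grid)

# Largest discrepancy between the two ways of writing the rhs of (70),
# evaluated for arbitrary (hypothetical) boundary values 0.3 and -1.2.
max_mismatch = max(
    abs(bound_70(t, 0.3, -1.2, lam) - bound_70_convex_form(t, 0.3, -1.2, lam))
    for t in grid
)

print(max_excess, max_mismatch)
```

A grid check of this kind does not replace the proof; it only guards against sign or index slips when the bound is transcribed.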
Hessian of the entropy.
In this paragraph we make some formal computations, whose aim is to give an explanation for equation (40). We assume $U = 0$ for simplicity. Let $\mu \in C^\infty_{b,+}$ and $\nabla\varphi \in C^\infty_c$ be fixed. We consider the constant speed geodesic $(\mu_t)$ such that $\mu_0 = \mu$ and $v_0 = \nabla\varphi$. Then, by definition,
\[ \mathrm{Hess}^W H_U(\nabla\varphi) = \frac{\mathbf{D}}{dt}\nabla^W H_U(\mu_t)\Big|_{t=0} . \]
Using the identification of the covariant derivative at Lemma 3.3 and (37) we have that
\[ \mathrm{Hess}^W H_U(\nabla\varphi) = \partial_t\nabla\log\mu_t + \nabla_{v_t}\nabla\log\mu_t\,\Big|_{t=0} . \]
Using the continuity equation,
\[ \partial_t\nabla\log\mu_t = -\nabla\Big(\frac{1}{\mu_t}\nabla\cdot(\mu_t v_t)\Big) = -\nabla(\mathrm{div}(v_t)) - \nabla\langle\nabla\log\mu_t, v_t\rangle . \]
Evaluating at $t = 0$ and using $v_0 = \nabla\varphi$, we can rewrite the latter as
\[ -\nabla\Delta\varphi - \mathrm{Hess}\log\mu(\nabla\varphi) - \mathrm{Hess}\,\varphi(\nabla\log\mu) . \]
Therefore, observing that $\nabla_{v_t}\nabla\log\mu_t\big|_{t=0} = \mathrm{Hess}\log\mu(\nabla\varphi)$, we arrive at
\[ \mathrm{Hess}^W H_U(\nabla\varphi) = -\nabla\Delta\varphi - \mathrm{Hess}\,\varphi(\nabla\log\mu) . \]
Hence, using an integration by parts:
\[ \begin{aligned}
\langle \mathrm{Hess}^W H_U(\nabla\varphi), \nabla\varphi\rangle_{T_\mu} &= -\int_M \langle\nabla\varphi, \nabla\Delta\varphi\rangle\,d\mu - \int_M \langle\nabla\varphi, \mathrm{Hess}\,\varphi(\nabla\log\mu)\rangle\,d\mu \\
&= -\int_M \langle\nabla\varphi, \nabla\Delta\varphi\rangle\,d\mu - \int_M \langle\nabla\mu, \mathrm{Hess}\,\varphi(\nabla\varphi)\rangle\,d\mathrm{vol} \\
&= -\int_M \langle\nabla\varphi, \nabla\Delta\varphi\rangle\,d\mu - \frac12\int_M \langle\nabla\mu, \nabla|\nabla\varphi|^2\rangle\,d\mathrm{vol} \\
&= \int_M \Big(\frac12\Delta|\nabla\varphi|^2 - \langle\nabla\varphi, \nabla\Delta\varphi\rangle\Big)\,d\mu .
\end{aligned} \]
At this point one can use the Bochner-Weitzenböck formula
\[ \frac12\Delta|\nabla\varphi|^2 = \langle\nabla\varphi, \nabla\Delta\varphi\rangle + |\mathrm{Hess}\,\varphi|^2_{HS} + \mathrm{Ric}(\nabla\varphi, \nabla\varphi) \]
and the hypothesis (13) to obtain the conclusion.

Gradient of the Fisher information.
In this section, we shall make some formal computations to justify (5). As we did before, we assume $U = 0$ for simplicity. Differentiating the relation (38) and using the definition of Hessian we get
\[ \nabla^W I(\mu) = 2\,\mathrm{Hess}^W_\mu H(\nabla^W H(\mu)) . \]
By the definition of Hessian,
\[ \mathrm{Hess}^W_\mu H(\nabla^W H(\mu)) = \frac{\mathbf{D}}{dt}\nabla^W H(\mu_t)\Big|_{t=0}, \]
where $(\mu_t)$ is any regular enough curve such that $\mu_0 = \mu$, $v_0 = \nabla^W H(\mu)$. From Lemma 3.3 such covariant derivative is the projection on the space of gradient vector fields of
\[ \partial_t\nabla^W H(\mu_t) + \nabla_{v_t}\nabla^W H(\mu_t) . \]
Using the continuity equation in the form $\partial_t\log\mu_t = -\nabla\cdot v_t - v_t\cdot\nabla\log\mu_t$, and recalling that $\nabla^W H(\mu_t) = \nabla\log(\mu_t)$, we arrive at
\[ \begin{aligned}
\partial_t\nabla^W H(\mu_t)\Big|_{t=0} &= -\nabla(\nabla\cdot v_t + v_t\cdot\nabla\log\mu_t)\Big|_{t=0} \\
&= -\nabla\big(\nabla\cdot(\nabla\log\mu)\big) - \nabla(|\nabla\log\mu|^2) \\
&= -\nabla\Delta\log\mu - \nabla|\nabla\log\mu|^2 .
\end{aligned} \]
On the other hand,
\[ \nabla_{v_t}\nabla^W H(\mu_t)\Big|_{t=0} = \mathrm{Hess}\log(\mu_t)(v_t)\Big|_{t=0} = \mathrm{Hess}\log(\mu)(\nabla\log\mu) = \frac12\nabla|\nabla\log\mu|^2 . \]
Therefore
\[ \partial_t\nabla^W H(\mu_t) + \nabla_{v_t}\nabla^W H(\mu_t)\Big|_{t=0} = -\nabla\Delta\log\mu - \frac12\nabla|\nabla\log\mu|^2, \]
and since the rhs is a vector field of gradient type,
\[ \nabla^W I(\mu) = 2\,\frac{\mathbf{D}}{dt}\nabla^W H(\mu_t)\Big|_{t=0} = -2\nabla\Delta\log\mu - \nabla|\nabla\log\mu|^2, \]
which is (5).

4.2. Lemmas 4.2 and 4.3.
These Lemmas are needed in the proofs of Theorems 1.2 and 1.3.
Lemma 4.2.
Let Assumption 1.1 (B) hold. Then $\mathcal{T}^\sigma_U(\mu,\nu) < +\infty$ and the dual representation (2) holds. Moreover:
(i) $f$ and $g$ are compactly supported and $f_t, g_t$ are globally bounded on $[0,1]\times\mathbb{R}^d$.
(ii) For any $\,\cdot \le l \le \cdot\,$ and $\varepsilon \in (0,1)$, there exist constants $A_{l,\varepsilon}, B_{l,\varepsilon}$ such that
\[ \forall x \in M, \qquad \sup_{t\in[\varepsilon,1-\varepsilon]}\ \sup_{\substack{v_1,\dots,v_l\in\mathbb{R}^d \\ |v_1|,\dots,|v_l|\le 1}} |\partial_{v_l}\cdots\partial_{v_1}\log f_t(x)| \le A_{l,\varepsilon} + B_{l,\varepsilon}|x|^l, \]
and the same conclusion holds replacing $f_t$ by $g_t$.
(iii) For any $\varepsilon \in (0,1)$ there exists a constant $A_{\cdot,\varepsilon}$ such that
\[ \sup_{x\in\mathbb{R}^d,\ t\in[\varepsilon,1-\varepsilon]}\ \sup_{\substack{v_1,v_2\in\mathbb{R}^d \\ |v_1|,|v_2|\le 1}} |\partial_{v_2}\partial_{v_1}\log f_t(x)| \le A_{\cdot,\varepsilon}, \]
and the same conclusion holds replacing $f_t$ by $g_t$.

Proof. Since all statements concerning $g$ are proven in the same way as those for $f$, we limit ourselves to proving the latter ones. In the proof, we assume that $\sigma = 1$, the proof for the general case being almost identical. The fact that $\mathcal{T}^\sigma_U(\mu,\nu) < +\infty$ can be easily settled using point (b) in [32, Prop. 2.5], whereas the dual representation is obtained from [31, Th 2.8]. Let us show that $f$ is compactly supported. Observe that
\[ \frac{d\mu}{d\mathbf{m}}(x) = f(x)\,g_0(x) . \]
Since $g_0 \in C^\infty_+$ and $\frac{d\mu}{d\mathbf{m}}$ is compactly supported, $f$ must have the same support as $\frac{d\mu}{d\mathbf{m}}$. Moreover, since $g_0$ is bounded from below on the support of $f$, the fact that $\frac{d\mu}{d\mathbf{m}}$ is bounded from above implies that $f$ is bounded from above. It follows from the very definition of the $f_t$ that they must be bounded as well. The proof of (i) is complete. We only do the proof of (ii) and (iii) in the case $d = 1$. This proof can be extended with no difficulty to the general case. We first make some preliminary observations. For $\alpha$ fixed, the transition density of the Ornstein-Uhlenbeck semigroup is
\[ p_t(x,z) = \frac{\gamma(\alpha,t)}{(2\pi)^{1/2}}\,\phi\big(\gamma(\alpha,t)(z - \exp(-\alpha t)x)\big), \]
where
\[ \phi(z) = \exp\Big(-\frac{z^2}{2}\Big), \qquad \gamma(\alpha,t) = \Big(\frac{2\alpha}{1 - \exp(-2\alpha t)}\Big)^{1/2} . \]
The derivatives of $\phi$ can be computed using the Hermite polynomials $(H_m)_{m\ge 0}$. We have
\[ \forall m \in \mathbb{N}, \qquad \phi^{(m)}(z) = (-1)^m H_m(z)\,\phi(z) . \]
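The Hermite identity just stated can be checked with exact integer polynomial arithmetic. The following is a minimal sketch under an explicit assumption: it takes $(H_m)$ to be the probabilists' Hermite polynomials, generated by $H_{m+1}(z) = zH_m(z) - mH_{m-1}(z)$, which is the convention matching $\phi(z) = \exp(-z^2/2)$. All function names are illustrative. It uses the fact that $\frac{d}{dz}\big(p(z)\phi(z)\big) = (p'(z) - zp(z))\phi(z)$, so the polynomial factor of $\phi^{(m)}$ can be built by iterating $p \mapsto p' - zp$.

```python
def poly_deriv(p):
    """Derivative of a polynomial given as a coefficient list (p[k] multiplies z**k)."""
    return [k * p[k] for k in range(1, len(p))] or [0]


def poly_mul_z(p):
    """Multiply a polynomial by z."""
    return [0] + p


def poly_sub(p, q):
    """Difference p - q of two coefficient lists, padded to a common length."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]


def hermite(m):
    """Probabilists' Hermite polynomial H_m via H_{n+1} = z*H_n - n*H_{n-1}."""
    h_prev, h = [1], [0, 1]
    if m == 0:
        return h_prev
    for n in range(1, m):
        h_prev, h = h, poly_sub(poly_mul_z(h), [n * c for c in h_prev])
    return h


def derivative_poly(m):
    """Polynomial P_m such that the m-th derivative of phi equals P_m * phi,
    for phi(z) = exp(-z**2 / 2)."""
    p = [1]
    for _ in range(m):
        p = poly_sub(poly_deriv(p), poly_mul_z(p))
    return p


# Compare P_m with (-1)**m * H_m for the first few orders.
for m in range(6):
    print(m, derivative_poly(m), [(-1) ** m * c for c in hermite(m)])
```

Only the degree of $H_m$ matters for the estimates that follow, so the check is a guard on conventions rather than a step of the proof.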
Thus, by the chain rule, we obtain the following formula for the $m$-th derivative of the transition density w.r.t. $x$:
\[ (73)\qquad \partial^m_x p_t(x,z) = \gamma(\alpha,t)^m \exp(-m\alpha t)\,H_m\big(\gamma(\alpha,t)(z - \exp(-\alpha t)x)\big)\,p_t(x,z) . \]
Finally, observe that we can rewrite $f_t$ equivalently in the form
\[ (74)\qquad f_t(x) = \int_{\mathbb{R}} p_t(x,z)\,f(z)\,dz . \]
Let us now prove (ii). Fix $\,\cdot \le l \le \cdot\,$. Using (74), we can write $\partial^l_x\log f_t(x)$ as a sum of finitely many terms of the form
\[ f_t(x)^{-k}\prod_{j=1}^{k}\int f(z)\,\partial^{i_j}_x p_t(x,z)\,dz, \]
where $k \le l$ and $i_1,\dots,i_k$ are integers summing up to $l$. Plugging (73) in this expression, the desired conclusion follows using the fact that $H_m$ is a polynomial of degree $m$, $f$ is compactly supported, and $\gamma(\alpha,t)$ is uniformly bounded from above and below for $t \in [\varepsilon, 1-\varepsilon]$. To prove (iii), we compute explicitly $\partial^2_x\log f_t(x)$, using (73). Writing $u = \gamma(\alpha,t)(z - \exp(-\alpha t)x)$, it equals
\[ \exp(-2\alpha t)\,\gamma(\alpha,t)^2\,f_t(x)^{-2}\times\Big(\int f(z)H_2(u)\,p_t(x,z)\,dz\int f(z)\,p_t(x,z)\,dz - \Big[\int f(z)H_1(u)\,p_t(x,z)\,dz\Big]^2\Big) . \]
Using the explicit form of the first two Hermite polynomials, $H_1(u) = u$ and $H_2(u) = u^2 - 1$, and some standard calculations, the latter expression is seen to be equal to
\[ \exp(-2\alpha t)\,\gamma(\alpha,t)^4\,f_t(x)^{-2}\times\Big(\int f(z)z^2\,p_t(x,z)\,dz\int f(z)\,p_t(x,z)\,dz - \Big[\int f(z)z\,p_t(x,z)\,dz\Big]^2\Big) - \exp(-2\alpha t)\,\gamma(\alpha,t)^2 . \]
The first term is $\exp(-2\alpha t)\gamma(\alpha,t)^4$ times the variance of the probability density proportional to $f(z)p_t(x,z)$, which is bounded by the squared diameter of the support of $f$. The conclusion then follows using the fact that $f$ is compactly supported and that $\gamma(\alpha,t)$ is uniformly bounded from above and below for $t \in [\varepsilon, 1-\varepsilon]$. $\square$

Lemma 4.3.
Let Assumption 1.1(A) hold. Then the dual representation (2) holds and
(i) $\mathcal{T}^\sigma_U(\mu,\nu) < \infty$.
(ii) For any $\varepsilon \in (0,1)$, $f_t, g_t, \tilde f_t, \tilde g_t$ are $C^\infty_{b,+}$ over $D_\varepsilon$ and $\log f_t, \log g_t, \log\tilde f_t, \log\tilde g_t$ are $C^\infty_b$ over $D_\varepsilon$.

Proof. Let $\varphi_\mu = \frac{d\mu}{d\mathbf{m}}$, $\varphi_\nu = \frac{d\nu}{d\mathbf{m}}$ and define $\pi \in \Pi(\mu,\nu)$ by $\pi(dx\,dy) = \varphi_\mu(x)\varphi_\nu(y)\,\mathbf{m}\otimes\mathbf{m}(dx\,dy)$. The theory of Malliavin calculus ensures that $(X_0,X_1)_{\#}P$ is an absolutely continuous measure on $M\times M$ with positive smooth density. Since $M$ is compact, $(X_0,X_1)_{\#}P$ is equivalent to $\mathbf{m}\otimes\mathbf{m}$, with a density bounded from above and below. Therefore, for some constant $C < +\infty$,
\[ H(\pi\,|\,(X_0,X_1)_{\#}P) \le C + H(\pi\,|\,\mathbf{m}\otimes\mathbf{m}) = C + H(\mu\,|\,\mathbf{m}) + H(\nu\,|\,\mathbf{m}) < +\infty . \]
Thus $\mathcal{T}^\sigma_U(\mu,\nu) < +\infty$. It is also a result of Malliavin calculus that $f_t, g_t$ are of class $C^\infty_+$ on any $D_\varepsilon$. But then, since $M$ is compact, they are also in $C^\infty_{b,+}$ and uniformly bounded from below, which gives that $\log f_t, \log g_t$ are in $C^\infty_b$. The statement about $\tilde f_t, \tilde g_t$ and their logarithms follows from the one for $f_t, g_t$ and the compactness of $M$. $\square$

Proof of Lemma 3.2.
Proof.
We can rewrite (2) as
\[ \frac{d\hat Q}{dP} = f(X_0)\,g(X_1) . \]
Moreover, since $P$ is stationary, we have for any $t$ that $(X_t)_{\#}P = \mathbf{m}$. Therefore
\[ \frac{d\mu_t}{d\mathbf{m}}(x) = \frac{d((X_t)_{\#}\hat Q)}{d((X_t)_{\#}P)}(x) = E_P[f(X_0)g(X_1)\,|\,X_t = x] \overset{\text{Markov property}}{=} E_P[f(X_0)\,|\,X_t = x]\,E_P[g(X_1)\,|\,X_t = x] = f_t(x)\,g_t(x) . \]
Observing that $d\mathbf{m} = \exp(-2U)\,d\mathrm{vol}$, (42) follows from the definition of $\tilde f_t$ and $\tilde g_t$. To prove the second statement, fix $\varepsilon \in (0,1)$. We observe that, since $U$ is taken to be smooth, well known results of Malliavin calculus grant that the functions $f_t$ and $g_t$ are of class $C^\infty_+$, and thus classical solutions on $D_\varepsilon$ of the forward and backward Kolmogorov equations
\[ (75)\qquad \partial_t f_t = L f_t, \qquad \partial_t g_t = -L g_t . \]
Using some standard algebraic manipulations and the positivity of $f_t, g_t$ one finds that $\tilde f_t$ and $\tilde g_t$ are classical solutions on $D_\varepsilon$ of
\[ (76)\qquad \partial_t\tilde f_t = \tfrac12\Delta\tilde f_t - \mathcal{U}\tilde f_t, \qquad \partial_t\tilde g_t = -\tfrac12\Delta\tilde g_t + \mathcal{U}\tilde g_t, \]
where $\mathcal{U}$ was defined at (5). Using this, we prove that $\tfrac12\nabla(\log g_t - \log f_t) = \tfrac12\nabla(\log\tilde g_t - \log\tilde f_t)$ is a classical solution to the continuity equation on $D_\varepsilon$. Indeed
\[ \begin{aligned}
\partial_t\mu_t &\overset{(42)}{=} (\partial_t\tilde f_t)\tilde g_t + \tilde f_t(\partial_t\tilde g_t) \overset{(76)}{=} \tfrac12\tilde g_t\Delta\tilde f_t - \tfrac12\tilde f_t\Delta\tilde g_t \\
&= \tfrac12\nabla\cdot(\tilde g_t\nabla\tilde f_t) - \tfrac12\nabla\cdot(\tilde f_t\nabla\tilde g_t) \\
&= \tfrac12\nabla\cdot\big(\tilde f_t\tilde g_t(\nabla\log\tilde f_t - \nabla\log\tilde g_t)\big) \overset{(42)}{=} \tfrac12\nabla\cdot\big(\mu_t(\nabla\log\tilde f_t - \nabla\log\tilde g_t)\big) .
\end{aligned} \]
Thus, $(t,x) \mapsto \tfrac12\nabla(\log g_t - \log f_t)$ solves the continuity equation, it is of gradient type and, thanks to Lemma 3.1 and (42), $\sup_{t\in[\varepsilon,1-\varepsilon]} \tfrac12|\nabla(\log g_t - \log f_t)|_{T_{\mu_t}} < +\infty$ also holds. The conclusion then follows. $\square$

Proof of Lemma 3.3.
Proof.
Fix $\varepsilon \in (0,1)$. As a preliminary step, we compute $\partial_t\tau^\varepsilon_t(\xi_t)$. Using the group property we get
\[ \tau^\varepsilon_{t+h}(\xi_{t+h}) - \tau^\varepsilon_t(\xi_t) = \tau^\varepsilon_t\big(\tau^t_{t+h}(\xi_{t+h}) - \xi_t\big) = h\,\tau^\varepsilon_t(\partial_t\xi_t) + \tau^\varepsilon_t\big(\tau^t_{t+h}(\xi_t) - \xi_t\big) + o(h), \]
where $o(h)/h \to 0$ as $h \to 0$. Recalling Definition 2.3 and the definition of flow map we get
\[ \tau^t_{t+h}(\xi_t)(x) - \xi_t(x) = (\tau_x)^t_{t+h}\big(\xi_t\circ T(t,t+h,x)\big) - \xi_t(x) = h\,\nabla_{\partial_t T(t,t,x)}\xi_t(x) + o(h) = h\,\nabla_{v_t(x)}\xi_t(x) + o(h) . \]
Therefore, we have shown that, as a pointwise limit,
\[ (77)\qquad \lim_{h\to 0}\frac{\tau^t_{t+h}\xi_{t+h} - \xi_t}{h} = \partial_t\xi_t + \nabla_{v_t}\xi_t, \]
which implies that
\[ \partial_t\tau^\varepsilon_t(\xi_t) = \tau^\varepsilon_t(\partial_t\xi_t + \nabla_{v_t}\xi_t) . \]
Let us now prove the absolute continuity of $(\xi_t)$ along $(\mu_t)$ using what we have just shown. We have
\[ \begin{aligned}
|\tau^\varepsilon_s(\xi_s) - \tau^\varepsilon_t(\xi_t)|_{L^2_{\mu_\varepsilon}} &= \Big(\int_M\Big|\int_t^s\tau^\varepsilon_r\big(\partial_r\xi_r + \nabla_{v_r}\xi_r\big)\,dr\Big|^2 d\mu_\varepsilon\Big)^{1/2} \\
&\overset{\text{Jensen}}{\le} (s-t)^{1/2}\Big(\int_t^s\int_M\big|\tau^\varepsilon_r\big(\partial_r\xi_r + \nabla_{v_r}\xi_r\big)\big|^2 d\mu_\varepsilon\,dr\Big)^{1/2} \\
&\overset{(33)}{=} (s-t)^{1/2}\Big(\int_t^s\int_M\big|\partial_r\xi_r + \nabla_{v_r}\xi_r\big|^2 d\mu_r\,dr\Big)^{1/2} \\
&\le (s-t)\sup_{r\in[\varepsilon,1-\varepsilon]}|\partial_r\xi_r + \nabla_{v_r}\xi_r|_{T_{\mu_r}} .
\end{aligned} \]
Using (44), the desired absolute continuity follows. Let us now turn to the proof of (45). By definition,
\[ \frac{\mathbf{d}}{dt}\xi_t = \lim_{h\downarrow 0}\frac{\tau^t_{t+h}\xi_{t+h}(\cdot) - \xi_t(\cdot)}{h}, \]
where the limit is in $L^2_{\mu_t}$. But then, it is also the pointwise limit along a subsequence. Such computation has been done at (77), and yields the desired result. The identity (46) is a direct consequence of (45) and the fact that $v_t$ is a gradient vector field. $\square$

REFERENCES

[1] Luigi Ambrosio and Wilfrid Gangbo. Hamiltonian ODEs in the Wasserstein space of probability measures.
Communications on Pure and Applied Mathematics, 61(1):18–53, 2008.
[2] Luigi Ambrosio and Nicola Gigli. A user's guide to optimal transport. In Modelling and optimisation of flows on networks, pages 1–155. Springer, 2013.
[3] Dominique Bakry, Ivan Gentil, and Michel Ledoux. Analysis and geometry of Markov diffusion operators, volume 348. Springer Science & Business Media, 2013.
[4] Jean-David Benamou and Yann Brenier. A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numerische Mathematik, 84(3):375–393, 2000.
[5] Jean-David Benamou, Guillaume Carlier, Marco Cuturi, Luca Nenna, and Gabriel Peyré. Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2):A1111–A1138, 2015.
[6] Patrick Cattiaux and Christian Léonard. Minimization of the Kullback information of diffusion processes. Annales de l'IHP Probabilités et statistiques, 30(1):83–132, 1994.
[7] Yongxin Chen, Tryphon T Georgiou, and Michele Pavon. On the relation between optimal transport and Schrödinger bridges: A stochastic control viewpoint. Journal of Optimization Theory and Applications, 169(2):671–691, 2016.
[8] Yongxin Chen, Tryphon T Georgiou, and Michele Pavon. Optimal steering of a linear stochastic system to a final probability distribution, part I. IEEE Transactions on Automatic Control, 61(5):1158–1169, 2016.
[9] Yongxin Chen, Tryphon T Georgiou, and Michele Pavon. Optimal steering of a linear stochastic system to a final probability distribution, part II. IEEE Transactions on Automatic Control, 61(5):1170–1180, 2016.
[10] Shui-Nee Chow, Wuchen Li, and Haomin Zhou. A discrete Schrodinger equation via optimal transport on graphs. arXiv preprint arXiv:1705.07583, 2017.
[11] J.M.C. Clark. A local characterization of reciprocal diffusions. Applied Stochastic Analysis, 5:45–59, 1991.
[12] G. Conforti. Fluctuations of bridges, reciprocal characteristics, and concentration of measure. Preprint arXiv:1602.07231, to appear in Annales de l'Institut Henri Poincaré, 2016.
[13] Giovanni Conforti and Christian Léonard. Reciprocal classes of random walks on graphs. Stochastic Processes and their Applications, 127(6):1870–1896, 2017.
[14] Giovanni Conforti and Max Von Renesse. Couplings, gradient estimates and logarithmic Sobolev inequality for Langevin bridges. Probability Theory and Related Fields, Nov 2017. Available online.
[15] A.B. Cruzeiro and J.C. Zambrini. Malliavin calculus and Euclidean quantum mechanics. I. Functional calculus. Journal of Functional Analysis, 96(1):62–95, 1991.
[16] P. Dai Pra. A stochastic control approach to reciprocal diffusion processes. Applied Mathematics and Optimization, 23(1):313–329, 1991.
[17] D. Dawson, L. Gorostiza, and A. Wakolbinger. Schrödinger processes and large deviations. Journal of Mathematical Physics, 31(10):2385–2388, 1990.
[18] Donald A Dawson and Jürgen Gärtner. Multilevel large deviations and interacting diffusions. Probability Theory and Related Fields, 98(4):423–487, 1994.
[19] Manfredo Perdigao Do Carmo and J Flaherty Francis. Riemannian geometry, volume 115. Birkhäuser Boston, 1992.
[20] H. Föllmer. Random fields and diffusion processes. In École d'Été de Probabilités de Saint-Flour XV–XVII, 1985–87, pages 101–203. Springer, 1988.
[21] Hans Föllmer, Nina Gantert, et al. Entropy minimization and Schrödinger processes in infinite dimensions. The Annals of Probability, 25(2):901–926, 1997.
[22] Alfred Galichon, Scott Duke Kominers, and Simon Weber. The nonlinear Bernstein-Schrödinger equation in economics. In Frank Nielsen and Frédéric Barbaresco, editors, Geometric Science of Information, pages 51–59, Cham, 2015. Springer International Publishing.
[23] Ivan Gentil, Christian Léonard, and Luigia Ripani. About the analogy between optimal transport and minimal entropy. Annales de la faculté des sciences de Toulouse Sér. 6, 26(3):569–700, 2017.
[24] Ugo Gianazza, Giuseppe Savaré, and Giuseppe Toscani. The Wasserstein gradient flow of the Fisher information and the quantum drift-diffusion equation. Archive for Rational Mechanics and Analysis, 194(1):133–220, 2009.
[25] N. Gigli. Second Order Analysis on (P2(M),W2). Memoirs of the American Mathematical Society. 2012.
[26] Nicola Gigli and Luca Tamanini. Second order differentiation formula on compact RCD*(K, N) spaces. arXiv preprint arXiv:1701.03932, 2017.
[27] Nathael Gozlan, Cyril Roberto, Paul-Marie Samson, and Prasad Tetali. Kantorovich duality for general transport costs and applications. Journal of Functional Analysis, 273(11):3327–3405, 2017.
[28] Roger A Horn and Charles R Johnson. Matrix analysis. Cambridge University Press, 2012.
[29] A. J. Krener. Reciprocal diffusions and stochastic differential equations of second order. Stochastics, 107(4):393–422, 1988.
[30] A.J. Krener. Reciprocal diffusions in flat space. Probability Theory and Related Fields, 107(2):243–281, 1997.
[31] C. Léonard. From the Schrödinger problem to the Monge–Kantorovich problem. Journal of Functional Analysis, 262(4):1879–1920, 2012.
[32] C. Léonard. A survey of the Schrödinger problem and some of its connections with optimal transport. Discrete and Continuous Dynamical Systems, 34(4):1533–1574, 2014.
[33] C. Léonard, S. Rœlly, and J.C. Zambrini. Reciprocal processes. A measure-theoretical point of view. Probability Surveys, 11:237–269, 2014.
[34] Christian Léonard. Some properties of path measures. In Séminaire de Probabilités XLVI, pages 207–230. Springer, 2014.
[35] Christian Léonard. On the convexity of the entropy along entropic interpolations. In N. Gigli, editor, Measure Theory in Non-Smooth Spaces, Partial Differential Equations and Measure Theory. De Gruyter Open, 2017.
[36] Christian Léonard et al. Lazy random walks and optimal transport on graphs. The Annals of Probability, 44(3):1864–1915, 2016.
[37] B.C. Levy and A.J. Krener. Dynamics and kinematics of reciprocal diffusions. Journal of Mathematical Physics, 34(5):1846–1875, 1993.
[38] Wuchen Li, Penghang Yin, and Stanley Osher. Computations of optimal transport distance with Fisher information regularization. Journal of Scientific Computing, pages 1–15, 2017.
[39] John Lott and Cédric Villani. Ricci curvature for metric-measure spaces via optimal transport. Annals of Mathematics, pages 903–991, 2009.
[40] T. Mikami. Monge's problem with a quadratic cost by the zero-noise limit of h-path processes. Probability Theory and Related Fields, 129(2):245–260, 2004.
[41] E. Nelson. Dynamical theories of Brownian motion, volume 2. Princeton University Press, Princeton, 1967.
[42] Felix Otto. The geometry of dissipative evolution equations: the porous medium equation. Communications in Partial Differential Equations, 26(1-2):101–174, 2001.
[43] Felix Otto and Cédric Villani. Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality. Journal of Functional Analysis, 173(2):361–400, 2000.
[44] S. Rœlly and M. Thieullen. A characterization of reciprocal processes via an integration by parts formula on the path space. Probability Theory and Related Fields, 123(1):97–120, 2002.
[45] S. Rœlly and M. Thieullen. Duality formula for the bridges of a Brownian diffusion: Application to gradient drifts. Stochastic Processes and their Applications, 115(10):1677–1700, 2005.
[46] Bernard Roynette and Marc Yor. Penalising Brownian paths, volume 1969. Springer Science & Business Media, 2009.
[47] L. Rüschendorf and W. Thomsen. Note on the Schrödinger equation and I-projections. Statistics & Probability Letters, 17(5):369–375, 1993.
[48] E. Schrödinger. Über die Umkehrung der Naturgesetze. Sitzungsberichte Preuss. Akad. Wiss. Berlin. Phys. Math., 144:144–153, 1931.
[49] E. Schrödinger. La théorie relativiste de l'électron et l'interprétation de la mécanique quantique. Ann. Inst Henri Poincaré, (2):269–310, 1932.
[50] Justin Solomon, Fernando De Goes, Gabriel Peyré, Marco Cuturi, Adrian Butscher, Andy Nguyen, Tao Du, and Leonidas Guibas. Convolutional Wasserstein distances: Efficient optimal transportation on geometric domains. ACM Transactions on Graphics (TOG), 34(4):66, 2015.
[51] Karl-Theodor Sturm. On the geometry of metric measure spaces. Acta Mathematica, 196(1):65–131, 2006.
[52] Michel Talagrand. Transportation cost for Gaussian and other product measures. Geometric & Functional Analysis GAFA, 6(3):587–600, 1996.
[53] M. Thieullen. Second order stochastic differential equations and non-Gaussian reciprocal diffusions. Probability Theory and Related Fields, 97(1-2):231–257, 1993.
[54] Cédric Villani. Optimal transport: old and new, volume 338. Springer Science & Business Media, 2008.
[55] Max-K von Renesse. An Optimal Transport view of Schrödinger's equation. Canadian Mathematical Bulletin, 55(4):858–869, 2012.
[56] Max-K von Renesse and Karl-Theodor Sturm. Transport inequalities, gradient estimates, entropy and Ricci curvature. Communications on Pure and Applied Mathematics, 58(7):923–940, 2005.
[57] A. Wakolbinger. A simplified variational characterization of Schrödinger processes. Journal of Mathematical Physics, 30(12):2943–2946, 1989.
[58] J.C. Zambrini. Variational processes and stochastic versions of mechanics. Journal of Mathematical Physics, 27(9):2307–2330, 1986.

DÉPARTEMENT DE MATHÉMATIQUES APPLIQUÉES, ÉCOLE POLYTECHNIQUE, ROUTE DE SACLAY, 91128, PALAISEAU CEDEX, FRANCE
E-mail address: