Towards Optimal Transport for Quantum Densities

Abstract

An analogue of the quadratic Wasserstein (or Monge-Kantorovich) distance between Borel probability measures on \mathbf{R}^d has been defined in [F. Golse, C. Mouhot, T. Paul: Commun. Math. Phys. 343 (2015), 165-205] for density operators on L^2(\mathbf{R}^d), and used to estimate the convergence rate of various asymptotic theories in the context of quantum mechanics. The present work proves a Kantorovich type duality theorem for this quantum variant of the Monge-Kantorovich or Wasserstein distance, and discusses the structure of optimal quantum couplings. Specifically, we prove that optimal quantum couplings involve a gradient type structure similar to the Brenier transport map (which is the gradient of a convex function), or more generally, to the subdifferential of a l.s.c. convex function as in the Knott-Smith optimality criterion (see Theorem 2.12 in [C. Villani: "Topics in Optimal Transportation", Amer. Math. Soc. 2003]).


TOWARDS OPTIMAL TRANSPORT FOR QUANTUM DENSITIES

EMANUELE CAGLIOTI, FRANÇOIS GOLSE, AND THIERRY PAUL

Abstract.

An analogue of the quadratic Wasserstein (or Monge-Kantorovich) distance between Borel probability measures on $\mathbf{R}^d$ has been defined in [F. Golse, C. Mouhot, T. Paul: Commun. Math. Phys. 343 (2015), 165–205] for density operators on $L^2(\mathbf{R}^d)$, and used to estimate the convergence rate of various asymptotic theories in the context of quantum mechanics. The present work proves a Kantorovich type duality theorem for this quantum variant of the Monge-Kantorovich or Wasserstein distance, and discusses the structure of optimal quantum couplings. Specifically, we prove that, under some boundedness and constraint hypothesis on the Kantorovich potentials, optimal quantum couplings involve a gradient type structure similar in the quantum paradigm to the Brenier transport map. On the contrary, when the two quantum densities have finite rank, the structure involved by the optimal coupling has, in general, no classical counterpart.

Key words and phrases. Wasserstein distance, Kantorovich duality, Quantum density operators, Quantum couplings, Optimal transport.

1. Introduction

Let $\mu, \nu \in \mathcal{P}(\mathbf{R}^d)$ (the set of Borel probability measures on $\mathbf{R}^d$). Given a l.s.c. function $C : \mathbf{R}^d \times \mathbf{R}^d \to [0, +\infty]$, the Monge problem in optimal transport is to minimize the functional

$$I_C[T] = \int_{\mathbf{R}^d} C(x, T(x))\, \mu(dx) \in [0, +\infty]$$

over the set of Borel maps $T : \mathbf{R}^d \to \mathbf{R}^d$ such that $\nu = T\#\mu$ (the push-forward measure of $\mu$ by $T$). Here $C(x, y)$ represents the cost of transporting the point $x$ to the point $y$, so that $I_C[T]$ represents the total cost of transporting the probability $\mu$ onto $\nu$ by the map $T$. An optimal transport map $T$ may fail to exist in full generality, so that one considers instead the following relaxed variant of the Monge problem, known as the Kantorovich problem:

$$\inf_{\pi \in \Pi(\mu,\nu)} \iint_{\mathbf{R}^d \times \mathbf{R}^d} C(x, y)\, \pi(dxdy).$$

Here, $\Pi(\mu, \nu)$ is the set of couplings of $\mu$ and $\nu$, i.e. the set of Borel probability measures $\pi$ on $\mathbf{R}^d \times \mathbf{R}^d$ such that

$$\iint_{\mathbf{R}^d \times \mathbf{R}^d} (\varphi(x) + \psi(y))\, \pi(dxdy) = \int_{\mathbf{R}^d} \varphi(x)\, \mu(dx) + \int_{\mathbf{R}^d} \psi(x)\, \nu(dx)$$

for all $\varphi, \psi \in C_b(\mathbf{R}^d)$ (where $C_b(\mathbf{R}^d)$ designates the set of bounded and continuous real-valued functions defined on $\mathbf{R}^d$). An optimal coupling $\pi_{opt}$ always exists, so that the inf is always attained in the Kantorovich problem (see Theorem 1.3 in [24] or Theorem 4.1 in [25]). Of course, if an optimal map $T$ exists for the Monge problem, the push-forward of the measure $\mu$ by the map $x \mapsto (x, T(x))$, which can be (informally) written as

$$\pi(dxdy) := \mu(dx)\, \delta_{T(x)}(dy), \tag{1}$$

is an optimal coupling for the Kantorovich problem.

In the special case where $C(x, y) = |x - y|^2$ (the square Euclidean distance between $x$ and $y$),

$$\operatorname{dist}_{MK,2}(\mu, \nu) := \inf_{\pi \in \Pi(\mu,\nu)} \sqrt{\iint_{\mathbf{R}^d \times \mathbf{R}^d} |x - y|^2\, \pi(dxdy)}$$

is a distance on

$$\mathcal{P}_2(\mathbf{R}^d) := \Big\{ \mu \in \mathcal{P}(\mathbf{R}^d) \text{ s.t. } \int_{\mathbf{R}^d} |x|^2\, \mu(dx) < \infty \Big\},$$

referred to as the Monge-Kantorovich, or the Wasserstein distance of exponent 2 (see chapter 7 in [24], or chapter 6 in [25], or chapter 7 in [1]). In that case, there is "almost" an optimal transport map, in the following sense: $\pi \in \Pi(\mu, \nu)$ is an optimal coupling for the Kantorovich problem if and only if there exists a proper convex l.s.c. function $a : \mathbf{R}^d \to \mathbf{R} \cup \{+\infty\}$ such that

$$\operatorname{supp}(\pi) \subset \operatorname{graph}(\partial a)$$

(where $\partial a$ denotes the subdifferential of $a$). This is the Knott-Smith optimality criterion [19] (Theorem 2.12 (i) in [24]). If $\mu$ satisfies the condition

$$B \text{ is Borel measurable and } \mathcal{H}^{d-1}(B) < \infty \implies \mu(B) = 0, \tag{2}$$

there exists a unique optimal coupling $\pi$ of the form (1) for the Kantorovich problem, with $T = \nabla a$, where $a$ is a proper convex l.s.c. function on $\mathbf{R}^d$. (In condition (2), the notation $\mathcal{H}^{d-1}(B)$ designates the $(d-1)$-dimensional Hausdorff measure of $B$.) This is the Brenier optimal transport theorem [4] (stated as Theorem 2.12 (ii) in [24]). It allows recasting (1) as

$$(y - \nabla a(x))\, \pi(dxdy) = 0, \tag{3}$$

and the a.e. defined map $\nabla a$ is referred to as the "Brenier optimal transport map". Integrating $\pi$ against a test function depending only on $x$ shows that $\nabla a$ transports the $x$-marginal $\mu$ of $\pi$ to its $y$-marginal $\nu$, i.e.

$$\nu = \nabla a \# \mu. \tag{4}$$
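The Monge and Kantorovich problems recalled above can be illustrated on a toy case. The following is a minimal sketch, not part of the original paper, under these assumptions: $d = 1$, the measures $\mu$ and $\nu$ are uniform on $n$ atoms each, and the cost is quadratic. In that setting transport maps are permutations of the atoms, and the monotone (sorted) matching, which is the one-dimensional counterpart of the gradient of a convex function, is expected to be optimal. All function names and sample points are hypothetical.

```python
from itertools import permutations


def quadratic_cost(xs, ys, perm):
    """Average squared distance when xs[i] is transported to ys[perm[i]]."""
    n = len(xs)
    return sum((xs[i] - ys[perm[i]]) ** 2 for i in range(n)) / n


def brute_force_w2_squared(xs, ys):
    """Minimize the quadratic cost over all one-to-one maps between the atoms."""
    return min(quadratic_cost(xs, ys, p) for p in permutations(range(len(xs))))


def monotone_w2_squared(xs, ys):
    """Cost of the monotone map: k-th smallest x goes to k-th smallest y."""
    n = len(xs)
    return sum((x - y) ** 2 for x, y in zip(sorted(xs), sorted(ys))) / n


# Hypothetical sample atoms, chosen only for illustration.
xs = [0.0, 2.0, 5.0, 1.0]
ys = [3.0, -1.0, 4.0, 6.0]

brute = brute_force_w2_squared(xs, ys)
monotone = monotone_w2_squared(xs, ys)
print(brute, monotone)
```

On this sample the exhaustive search and the sorted matching give the same squared distance, in line with the Brenier picture of an increasing transport map in one dimension.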
(Equality (4) can obviously be deduced from (1) as well.) Here, a function is called proper if it is not identically equal to $+\infty$; in particular, $\nabla a$ is defined a.e. on $\mathbf{R}^d$.

Recently, an analogue of the Monge-Kantorovich-Wasserstein distance $\operatorname{dist}_{MK,2}$ has been defined in [13] on the set $\mathcal{D}(\mathfrak{H})$ of density operators on the Hilbert space $\mathfrak{H} := L^2(\mathbf{R}^d)$. (Recall that a density operator on $\mathfrak{H}$ is a linear operator $R$ on $\mathfrak{H}$ such that $R = R^* \ge 0$ and $\operatorname{trace}(R) = 1$.) The definition rests on the following correspondence principles between the classical and the quantum paradigms.

(a) Functions $f \equiv f(q, p)$ on the phase space $\mathbf{R}^d_q \times \mathbf{R}^d_p$ should be put in correspondence with bounded operators on the Hilbert space $\mathfrak{H} = L^2(\mathbf{R}^d_q)$ of square-integrable functions defined on the configuration space $\mathbf{R}^d_q$.

(b) The (Lebesgue) integral of (integrable) functions on $\mathbf{R}^d_q \times \mathbf{R}^d_p$ should be replaced by the trace of (trace-class) operators on $\mathfrak{H}$.

(c) The coordinates $q_j$ (for $j = 1, \ldots, d$) on the null section of the phase space $\mathbf{R}^d_q \times \mathbf{R}^d_p$ should be put in correspondence with the (unbounded) self-adjoint operators $Q_j$ on $\mathfrak{H}$ defined by

$$\operatorname{Dom}(Q_j) := \Big\{ \psi \in \mathfrak{H} \text{ s.t. } \int_{\mathbf{R}^d} q_j^2 |\psi(q)|^2\, dq < \infty \Big\}, \qquad (Q_j \psi)(q) := q_j \psi(q)$$

for all $j = 1, \ldots, d$.

(d) The coordinates $p_j$ (for $j = 1, \ldots, d$) on the fibers of the phase space $\mathbf{R}^d_q \times \mathbf{R}^d_p$ should be put in correspondence with the (unbounded) self-adjoint operators $P_j$ on $\mathfrak{H}$ defined by

$$\operatorname{Dom}(P_j) := \Big\{ \psi \in \mathfrak{H} \text{ s.t. } \int_{\mathbf{R}^d} |\partial_{q_j} \psi(q)|^2\, dq < \infty \Big\}, \qquad (P_j \psi)(q) := -i\hbar\, \partial_{q_j} \psi(q)$$

for all $j = 1, \ldots, d$.

(e) The first order differential operators $f \mapsto \{q_j, f\}$ and $f \mapsto \{p_j, f\}$, where $\{\cdot, \cdot\}$ is the Poisson bracket on $\mathbf{R}^d_q \times \mathbf{R}^d_p$ such that

$$\{p_j, p_k\} = \{q_j, q_k\} = 0, \qquad \{p_j, q_k\} = \delta_{jk} \qquad \text{for } j, k = 1, \ldots, d,$$

should be replaced with the derivations on $\mathcal{L}(\mathfrak{H})$ (the algebra of bounded linear operators on $\mathfrak{H}$) defined by

$$A \mapsto \frac{i}{\hbar}[Q_j, A] \quad \text{and} \quad A \mapsto \frac{i}{\hbar}[P_j, A] \qquad \text{for } j = 1, \ldots, d.$$

Following these principles, the quadratic transport cost from $(x, \xi)$ to $(y, \eta)$ in $\mathbf{R}^d \times \mathbf{R}^d$ should be replaced with the differential operator on $\mathbf{R}^d_x \times \mathbf{R}^d_y$

$$C := \sum_{j=1}^d \big( (x_j - y_j)^2 - \hbar^2 (\partial_{x_j} - \partial_{y_j})^2 \big). \tag{5}$$

Henceforth we denote by $H$ the Hamiltonian

$$H := \sum_{j=1}^d (Q_j^2 + P_j^2) = |x|^2 - \hbar^2 \Delta_x \tag{6}$$

of the quantum harmonic oscillator. Given $R, S \in \mathcal{D}_2(\mathfrak{H})$, the set of density operators $\rho$ on $\mathfrak{H}$ such that $\operatorname{trace}(\rho^{1/2} H \rho^{1/2}) < \infty$, the quantum analogue of the Monge-Kantorovich-Wasserstein distance $\operatorname{dist}_{MK,2}$ is defined by the quantum Kantorovich problem (see Definition 2.2 in [13])

$$MK_\hbar(R, S) := \inf_{F \in \mathcal{C}(R,S)} \sqrt{\operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(F^{1/2} C F^{1/2})}, \tag{7}$$

where $\mathcal{C}(R, S)$ is the set of quantum couplings of $R$ and $S$, i.e.

$$\mathcal{C}(R, S) := \{ F \in \mathcal{D}(\mathfrak{H} \otimes \mathfrak{H}) \text{ s.t. } \operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}((A \otimes I + I \otimes B) F) = \operatorname{trace}_{\mathfrak{H}}(AR + BS) \text{ for all } A, B \in \mathcal{L}(\mathfrak{H}) \}. \tag{8}$$

(See Definition 2.1 in [13].) The functional $MK_\hbar$ is a particularly convenient tool to obtain a convergence rate for the mean-field limit in quantum mechanics that is uniform in the Planck constant $\hbar$ (see Theorem 2.4 in [13], and Theorem 3.1 in [16] for precise statements of these results).

The striking analogy between the Wasserstein distance $\operatorname{dist}_{MK,2}$ and the quantum functional $MK_\hbar$ suggests the following questions concerning a possible Brenier type theorem in quantum mechanics, motivated in a heuristic way by the following considerations.

As mentioned before, the classical underlying paradigm for quantum mechanics is the classical phase space $\mathbf{R}^{2d} = T^*\mathbf{R}^d$ equipped with the standard symplectic structure leading to the Poisson bracket defined in item (e) above.
Therefore, in this setting and under assumption (2), equation (3) reads

$$(z' - \nabla a(z))\, \pi(dz, dz') = 0, \tag{9}$$

where $z := (q, p)$ and $z' := (q', p')$ are the coordinates on the phase space $T^*\mathbf{R}^d$ and $dz := dqdp$, $dz' = dq'dp'$.

Defining the mapping $J : T^*\mathbf{R}^d \to T^*\mathbf{R}^d$ entering the definition of the symplectic form $\sigma$ of $T^*\mathbf{R}^d$ as $\sigma(dz, dz') = dz \wedge dJz'$ (in the $z = (q, p)$ coordinates, $J = \begin{pmatrix} 0 & I_{\mathbf{R}^d} \\ -I_{\mathbf{R}^d} & 0 \end{pmatrix}$), equation (9) can be put in the form

$$(z' - \{Jz, a(z)\})\, \pi(dzdz') = 0. \tag{10}$$

This symplectic formulation of the Brenier theorem is more likely to have an analogue in quantum mechanics. Indeed, according to the items (c), (d) and (e) above, the factor $(z' - \{Jz, a(z)\})$ should be put in correspondence with the (vector-valued) operator on $\mathfrak{H} \otimes \mathfrak{H}$

$$I_{\mathfrak{H}} \otimes Z - \frac{i}{\hbar}[JZ, A] \otimes I_{\mathfrak{H}} = I_{\mathfrak{H}} \otimes Z - \nabla^Q A \otimes I_{\mathfrak{H}}, \tag{11}$$

for some operator $A$ on $\mathfrak{H}$. In (11), $Z$ designates the vector of operator-valued coordinates $(Q_1, \ldots, Q_d, P_1, \ldots, P_d)$, and we use the notation $\nabla^Q := \frac{i}{\hbar}[JZ, \cdot]$.

Having in mind that the optimal classical coupling $\pi$ should be put in correspondence with an optimal element $F_{op}$ of $\mathcal{C}(R, S)$ defined in (8), the only ambiguity which remains in giving a quantum version of (10) is the choice of an ordering for the product of the operators $I_{\mathfrak{H}} \otimes Z - \nabla^Q A \otimes I_{\mathfrak{H}}$ and $F_{op}$.

It happens that this ambiguity will be resolved by distributing the square-root of $F_{op}$ on both sides of the expression (11), which leads us to the very symmetric equality (see Theorem 2.6 in the next section):

$$F_{op}^{1/2} \big( I_{\mathfrak{H}} \otimes Z - \nabla^Q A \otimes I_{\mathfrak{H}} \big) F_{op}^{1/2} = 0. \tag{12}$$

(Notice that one cannot define a square-root of the optimal coupling in the classical case, since such a coupling is a Dirac measure, as shown by (1).)

Clearly (12) gives a hint on the structure of optimal quantum couplings in the definition (7) of $MK_\hbar(R, S)$ and on an analogue of the notion of Brenier optimal transport map.
Notice that we are missing a quantum analogue of the original variational problem considered by Monge, or, equivalently, of the coupling (1), so that defining a notion of quantum optimal transport seems far from obvious.

Nevertheless, (12) says that, once projected on the orthogonal of the kernel of an optimal coupling, the operators $I_{\mathfrak{H}} \otimes Z$ and $\nabla^Q A \otimes I_{\mathfrak{H}}$ are equal, in agreement with Brenier's theorem put in the form: "the support of the optimal coupling is the graph of the gradient of a convex function".

The presence of $F_{op}^{1/2}$ on both sides of the expression between parentheses in the left hand side of (12) forbids getting a quantum equivalent to (4), whose formulation is not clear anyway. Indeed, changes of variables in quantum mechanics are ill defined, except for linear symplectic mappings through the metaplectic representation. However, denoting by $Z'$ the (operator-valued) vector

$$Z' := \nabla^Q A$$

(with the same operator $A$ as in (11)-(12)) and writing the trace of the left hand side of (12) in terms of the marginals of $F_{op}$ shows that

$$\operatorname{trace}(ZR) = \operatorname{trace}(Z'S). \tag{13}$$

Formula (13) can be interpreted in the framework of the so-called Ehrenfest correspondence principle (abusively called Ehrenfest's Theorem sometimes) [12, 17]: in quantum mechanics, $\operatorname{trace}(ZR)$ is known as the expected value of the variable $Z$ in the state $R$ (in the case where $R = |\phi\rangle\langle\phi|$, then $\operatorname{trace}(ZR) = (\phi \,|\, Z\phi)_{\mathfrak{H}}$). It is the only deterministic quantity that one can associate to a particle in a given state, by taking the average of the (non-deterministic) result of an (in principle infinite) number of measurements. It is interpreted in the (statistical) Ehrenfest picture as the classical value of the coordinate $Z$ of the state $R$. Thus the Ehrenfest interpretation of (13) is clear: the deterministic information we have on the state $R$ is transported to the corresponding one on the state $S$ by the change of variables $Z \mapsto Z'$.

In the present article, we first state a Kantorovich duality theorem (Theorem 2.2) for $MK_\hbar$, i.e., for all density operators $R, S$ on $\mathfrak{H}$,

$$MK_\hbar(R, S)^2 = \sup_{\substack{A = A^*,\, B = B^* \in \mathcal{L}(\mathfrak{H}) \\ A \otimes I + I \otimes B \le C}} \operatorname{trace}_{\mathfrak{H}}(RA + SB),$$

where $C$ is defined in (5). In Theorem 2.4, we prove that the sup in the right hand side of the equality above is attained for some possibly unbounded operators $\mathbf{a}$ and $\mathbf{b}$ defined on appropriate Gelfand triples with $\operatorname{Ker}(R)^\perp$ and $\operatorname{Ker}(S)^\perp$ as pivot spaces. Theorem 2.5 provides a criterion for the sup on the right hand side of the equality above to be attained on bounded operators $A, B$ satisfying the inequality constraint on the form domain of $C$. It also provides a family of density operators $R$ and $S$ for which this criterion is satisfied.

Theorem 2.6 is devoted to an analogue of Brenier's theorem for quantum optimal couplings.

When the sup in the equality above is attained by two operators $A$ and $B$ bounded on $\mathfrak{H}$ such that the constraint $A \otimes I + I \otimes B \le C$ is satisfied on the form-domain of $C$, we show in Theorem 2.6 (1) our quantum result "à la Brenier", namely the formula (12) already mentioned,

$$F_{op}^{1/2} \big( I_{\mathfrak{H}} \otimes Z - \nabla^Q A \otimes I_{\mathfrak{H}} \big) F_{op}^{1/2} = 0, \qquad \text{where } A = \tfrac{1}{2}(H - \mathbf{a}).$$
Here $H$ is the harmonic oscillator defined by (6) and $\nabla^Q$ is defined by (11).

On the other hand, when $R$ and $S$ are of finite rank, formula (12) has to be replaced by the following one (Theorem 2.6 (2)):

$$F_{op}^{1/2} \big( \mathbf{F} - \nabla^Q A' \otimes I_{\mathfrak{H}} \big) F_{op}^{1/2} = 0, \qquad \text{where } A' = \tfrac{1}{2}(H' - \mathbf{a}). \tag{14}$$

Here $H'$ is the harmonic oscillator $H$ projected on $\operatorname{Ker}(R)^\perp$ and $\mathbf{F}$ is the following vector operator valued on $\operatorname{Ker}(R)^\perp \otimes \operatorname{Ker}(S)^\perp$:

$$\mathbf{F}_j = \sum_{k=1}^{2d} \frac{i}{\hbar} \big[ (JZ^R)_j, Z^R_k \big] \otimes Z^S_k, \qquad j = 1, \ldots, 2d, \tag{15}$$

where $Z^R$ (resp. $Z^S$) is the vector $Z$ projected, component by component, on $\operatorname{Ker}(R)^\perp$ (resp. $\operatorname{Ker}(S)^\perp$) (see Theorem 2.6 (b) for explicit expressions).

There is no chance that the term $\frac{i}{\hbar}[(JZ^R)_j, Z^R_k]$ in (15) reduces to $\delta_{j,k} I$, leading to $\mathbf{F}_j = I \otimes Z_j$ so that (14) would reduce to (12). Indeed, it is well known that there is no representation of the canonical commutation relations in finite dimension. On the contrary, nothing prevents $F_{op}^{1/2} \mathbf{F}_k F_{op}^{1/2}$ from being equal to (a multiple of) $F_{op}^{1/2} (I \otimes Z^R_k) F_{op}^{1/2}$. We will show, in Lemma 7.2 of Section 7.2, that this is indeed the case for the quantum bipartite matching problem for two one-dimensional particles with equal masses, studied extensively in [7].

Natural examples of classical analogues to the finite rank (independent of the Planck constant) quantum situation are the cases where $\mu, \nu$ are singular. Therefore these cases are not covered by the Knott-Smith-Brenier result. This is the case for the bipartite problem we just mentioned, for which

$$\mu = \tfrac{1+\eta}{2}\, \delta_a + \tfrac{1-\eta}{2}\, \delta_{-a}, \qquad \nu = \tfrac{1}{2}\, \delta_b + \tfrac{1}{2}\, \delta_{-b}, \qquad -1 < \eta < 1$$

in the classical situation, and

$$R = \tfrac{1+\eta}{2}\, |a\rangle\langle a| + \tfrac{1-\eta}{2}\, |{-a}\rangle\langle {-a}|, \qquad S = \tfrac{1}{2}\, |b\rangle\langle b| + \tfrac{1}{2}\, |{-b}\rangle\langle {-b}|$$

in the quantum one. When $\eta = 0$, $\mu$ is optimally transported to $\nu$ by any flow which sends $\pm a$ to $\pm b$, and in this case (14) takes the form of (12), see Proposition 7.3. But when $\eta > 0$, the mass of $a$ has to be split in two parts, an amount $\tfrac{1}{2}$ to be sent to $b$ and an amount $\tfrac{\eta}{2}$ which goes to $-b$, and $\mu$ is optimally transported to $\nu$ by a multivalued map.

Therefore, besides the fact that formula (12) represents a quantum analogue of the Knott-Smith-Brenier result, formulas (14)-(15) have in general no analogue in terms of classical (monovalued) flow.

The main results, Theorems 2.2, 2.4, 2.5 and 2.6, are stated in Section 2 and proved in Sections 3, 4, 5 and 6 respectively. Section 7 is devoted to some examples, including the finite rank and Töplitz situations, and the three Appendices contain some technical material, including a result on monotone convergence for trace-class operators in Appendix B.

To conclude this introduction, we mention other attempts at defining analogues of the Wasserstein, or Monge-Kantorovich distances in the quantum setting. For instance Życzkowski and Słomczyński [26] (see also section 7.7 in chapter 7 of [3]) proposed to consider the original Monge distance (also called the Kantorovich-Rubinstein distance, or the Wasserstein distance of exponent 1) between the Husimi transforms of the density operators (see (64) for a definition of this transform).

Besides the quantity $MK_\hbar$, which appeared in [13], other analogues of the Wasserstein distance of exponent 2 for quantum densities have been proposed by several other authors.
For instance Carlen and Maas have defined a quantum analogue of the Benamou-Brenier formula (see [2] or Theorem 8.1 in chapter 8 of [25]) for the classical Wasserstein distance of exponent 2, and their idea has been used to obtain a quantum equivalent of the so-called HWI inequality: see [8, 9, 22].

Other proposals for generalizing Wasserstein distances to the quantum setting have emerged more recently, such as [18] (mainly focused on pure states) or [10], very close to our definition of $MK_\hbar$, except that the set of couplings used in the minimization is different and based instead on the notion of "quantum channels" (see also [11] for a definition of a quantum Wasserstein distance of order 1).

2. Main Results

The key argument in deriving the structure (1) of optimal couplings for the Kantorovich problem involves a min-max type result known as "Kantorovich duality". For each $\mu, \nu \in \mathcal{P}_2(\mathbf{R}^d)$, one has

$$\operatorname{dist}_{MK,2}(\mu, \nu)^2 = \sup_{\substack{\varphi, \psi \in C_b(\mathbf{R}^d) \\ \varphi(x) + \psi(y) \le |x - y|^2}} \Big( \int_{\mathbf{R}^d} \varphi(x)\, \mu(dx) + \int_{\mathbf{R}^d} \psi(y)\, \nu(dy) \Big). \tag{16}$$

When $\mu, \nu$ do not charge small sets, in the sense that they satisfy (2), one can prove that the supremum in the r.h.s. of (16) is actually attained and

$$\begin{aligned}
\operatorname{dist}_{MK,2}(\mu, \nu)^2 &= \min_{\pi \in \Pi(\mu,\nu)} \iint_{\mathbf{R}^d \times \mathbf{R}^d} |x - y|^2\, \pi(dxdy) = \iint_{\mathbf{R}^d \times \mathbf{R}^d} |x - y|^2\, \pi_{op}(dxdy) \\
&= \max_{\substack{\varphi \in L^1(\mu),\ \psi \in L^1(\nu) \\ \varphi(x) + \psi(y) \le |x - y|^2 \ \mu \otimes \nu\text{-a.e.}}} \Big( \int_{\mathbf{R}^d} \varphi(x)\, \mu(dx) + \int_{\mathbf{R}^d} \psi(y)\, \nu(dy) \Big) \\
&= \int_{\mathbf{R}^d} \varphi_{op}(x)\, \mu(dx) + \int_{\mathbf{R}^d} \psi_{op}(y)\, \nu(dy)
\end{aligned}$$

for two proper convex l.s.c. functions $\varphi_{op}$ and $\psi_{op}$ on $\mathbf{R}^d$.

Moreover, $a(x) := \tfrac{1}{2}(|x|^2 - \varphi_{op}(x))$ is precisely the function appearing in (3), the gradient of which defines a.e. the Brenier optimal transport map of the previous section. (See Theorem 1.3, Proposition 2.1 and Theorem 2.9 in [24].)

Likewise, the operator $A$ in (12) will be similarly related to an optimal operator appearing in a dual formulation of definition (7), to be presented below.

Before we state the quantum analogue of the Kantorovich duality, we need some technical preliminaries.

The quantum transport cost is the operator

$$C := \sum_{j=1}^d \big( (x_j - y_j)^2 - \hbar^2 (\partial_{x_j} - \partial_{y_j})^2 \big), \tag{17}$$

viewed as an unbounded self-adjoint operator on $L^2(\mathbf{R}^d \times \mathbf{R}^d)$ with domain

$$\operatorname{Dom}(C) := \{ \psi \in L^2(\mathbf{R}^d \times \mathbf{R}^d) \text{ s.t. } |x - y|^2 \psi \text{ and } |D_x - D_y|^2 \psi \in L^2(\mathbf{R}^d \times \mathbf{R}^d) \}. \tag{18}$$

Henceforth we denote by $H$ the Hamiltonian of the quantum harmonic oscillator, i.e.

$$H := |x|^2 - \hbar^2 \Delta_x, \tag{19}$$

which is a self-adjoint operator on $L^2(\mathbf{R}^d)$ with domain

$$\operatorname{Dom}(H) := \{ \varphi \in H^2(\mathbf{R}^d) \text{ s.t. } |x|^2 \varphi \in L^2(\mathbf{R}^d) \}. \tag{20}$$
If $u \in C_c^\infty(\mathbf{R}^d)$, one has

$$\begin{aligned}
\int_{\mathbf{R}^d} |x|^2 u(x)\, Hu(x)\, dx &= \int_{\mathbf{R}^d} \big( |x|^4 u(x)^2 + \hbar^2 |x|^2 |\nabla u(x)|^2 \big)\, dx + \hbar^2 \int_{\mathbf{R}^d} x \cdot \nabla (u(x)^2)\, dx \\
&= \int_{\mathbf{R}^d} \big( (|x|^4 - d\hbar^2)\, u(x)^2 + \hbar^2 |x|^2 |\nabla u(x)|^2 \big)\, dx,
\end{aligned}$$

so that

$$\| |x|^2 u \|_{L^2(\mathbf{R}^d)}^2 \le \| Hu \|_{L^2(\mathbf{R}^d)} \, \| |x|^2 u \|_{L^2(\mathbf{R}^d)} + d\hbar^2 \| u \|_{L^2(\mathbf{R}^d)}^2.$$

Thus, if $u \in L^2(\mathbf{R}^d)$ and $Hu \in L^2(\mathbf{R}^d)$, then one has $|x|^2 u \in L^2(\mathbf{R}^d)$, which implies in turn that $-\Delta u \in L^2(\mathbf{R}^d)$.

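In the sequel, we shall also need the form-domains of the operators $H$ and $C$:

$$\begin{aligned}
\operatorname{Form-Dom}(H) &:= \{ \varphi \in H^1(\mathbf{R}^d) \text{ s.t. } |x| \varphi \in L^2(\mathbf{R}^d) \}, \\
\operatorname{Form-Dom}(C) &:= \{ \psi \in \mathfrak{H} \otimes \mathfrak{H} \text{ s.t. } (x_j - y_j)\psi \text{ and } (\partial_{x_j} - \partial_{y_j})\psi \in \mathfrak{H} \otimes \mathfrak{H},\ 1 \le j \le d \}.
\end{aligned} \tag{21}$$

The definition of the form-domain of a self-adjoint operator can be found for instance in section VIII.6, Example 2 of [20]. Observe that

$$\operatorname{Form-Dom}(H \otimes I + I \otimes H) = \{ \psi \in H^1(\mathbf{R}^d \times \mathbf{R}^d) \text{ s.t. } (|x| + |y|)\psi \in L^2(\mathbf{R}^d \times \mathbf{R}^d) \} \subset \operatorname{Form-Dom}(C). \tag{22}$$

Lemma 2.1. Let $R, S \in \mathcal{D}_2(\mathfrak{H})$, and let $Q \in \mathcal{C}(R, S)$. Each eigenfunction $\Phi$ of $Q$ such that $Q\Phi \ne 0$ belongs to $\operatorname{Form-Dom}(H \otimes I + I \otimes H)$ and

$$0 \le \langle \Phi | H \otimes I + I \otimes H | \Phi \rangle \le \operatorname{trace}_{\mathfrak{H}}(R^{1/2} H R^{1/2} + S^{1/2} H S^{1/2}) < \infty.$$

In particular $\Phi \in \operatorname{Dom}(C)$ with

$$\langle \Phi | C | \Phi \rangle \le 2 \operatorname{trace}_{\mathfrak{H}}(R^{1/2} H R^{1/2} + S^{1/2} H S^{1/2}) < \infty. \tag{23}$$

Proof. Since $R, S \in \mathcal{D}_2(\mathfrak{H})$, one has

$$\begin{aligned}
\operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(Q^{1/2} (H \otimes I) Q^{1/2}) &= \operatorname{trace}_{\mathfrak{H}}(R^{1/2} H R^{1/2}) < \infty, \\
\operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(Q^{1/2} (I \otimes H) Q^{1/2}) &= \operatorname{trace}_{\mathfrak{H}}(S^{1/2} H S^{1/2}) < \infty
\end{aligned}$$

for each $Q \in \mathcal{C}(R, S)$ by Lemma C.3. In particular

$$\operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(Q^{1/2} (H \otimes I + I \otimes H) Q^{1/2}) < \infty.$$

Let $(\Phi_k)_{k \ge 0}$ be a complete orthonormal system of eigenvectors of $Q$, and let $(\lambda_k)_{k \ge 0}$ be the sequence of eigenvalues of $Q$ such that $Q\Phi_k = \lambda_k \Phi_k$ for each $k \ge 0$. Thus

$$\lambda_k > 0 \implies \Phi_k \in \operatorname{Form-Dom}(H \otimes I + I \otimes H)$$

and

$$\sum_{k \ge 0} \lambda_k \langle \Phi_k | H \otimes I + I \otimes H | \Phi_k \rangle = \operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(Q^{1/2} (H \otimes I + I \otimes H) Q^{1/2}) = \operatorname{trace}_{\mathfrak{H}}(R^{1/2} H R^{1/2} + S^{1/2} H S^{1/2}) < \infty,$$

and this implies the desired inequality. Using (22) shows that

$$\Psi \in \operatorname{Form-Dom}(H \otimes I + I \otimes H) \implies \Psi \in \operatorname{Form-Dom}(C) \text{ and } 0 \le \langle \Psi | C | \Psi \rangle \le 2 \langle \Psi | H \otimes I + I \otimes H | \Psi \rangle. \qquad \square$$

2.1. A Quantum Analogue to the Kantorovich Duality.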

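The statement below is an analogue of the Kantorovich Duality Theorem (Theorem 1.3 in [24], or Theorem 6.1.1 in [1]) for the quantum transport cost operator $C$ defined by (17).

Theorem 2.2 (Quantum duality). Let $R, S \in \mathcal{D}_2(\mathfrak{H})$. Then

$$\min_{F \in \mathcal{C}(R,S)} \operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(F^{1/2} C F^{1/2}) = \sup_{(A,B) \in \mathcal{K}} \operatorname{trace}_{\mathfrak{H}}(RA + SB), \tag{24}$$

where

$$\mathcal{K} := \{ (A, B) \in \mathcal{L}(\mathfrak{H}) \times \mathcal{L}(\mathfrak{H}) \text{ s.t. } A = A^*,\ B = B^* \text{ and } A \otimes I + I \otimes B \le C \}.$$

Regarding the domains (18) and (20): the argument used for $H$ shows likewise that $U \in L^2(\mathbf{R}^d \times \mathbf{R}^d)$ and $CU \in L^2(\mathbf{R}^d \times \mathbf{R}^d)$ imply that $|x - y|^2 U \in L^2(\mathbf{R}^d \times \mathbf{R}^d)$. These observations imply that the domains of $H$ and $C$ are the spaces given above.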
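The coupling constraint (8) underlying the minimum in (24) can be checked by hand in finite dimension. The sketch below is an illustration only and makes several assumptions that are not in the paper: the Hilbert space is replaced by $\mathbf{C}^2$, operators are real $2 \times 2$ matrices stored as nested lists, and the coupling is the tensor product $F = R \otimes S$, which always belongs to $\mathcal{C}(R, S)$ but is not claimed to be optimal. The matrices `R`, `S`, `A`, `B` and all helper names are hypothetical.

```python
def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(a), len(b)
    return [[a[i // m][j // m] * b[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def add(a, b):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def trace_product(a, b):
    """trace(a b) for two square matrices of the same size."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))


def partial_trace_second(f, n, m):
    """Trace out the second (m-dimensional) factor; returns an n x n matrix."""
    return [[sum(f[i * m + k][j * m + k] for k in range(m)) for j in range(n)]
            for i in range(n)]


def partial_trace_first(f, n, m):
    """Trace out the first (n-dimensional) factor; returns an m x m matrix."""
    return [[sum(f[k * m + i][k * m + j] for k in range(n)) for j in range(m)]
            for i in range(m)]


I2 = [[1.0, 0.0], [0.0, 1.0]]
R = [[0.75, 0.0], [0.0, 0.25]]   # a mixed state (assumed example)
S = [[0.5, 0.5], [0.5, 0.5]]     # a pure state (assumed example)
F = kron(R, S)                   # product coupling of R and S

A = [[1.0, 2.0], [2.0, 3.0]]     # arbitrary self-adjoint test operators
B = [[0.0, 1.0], [1.0, -1.0]]

lhs = trace_product(add(kron(A, I2), kron(I2, B)), F)
rhs = trace_product(A, R) + trace_product(B, S)
print(partial_trace_second(F, 2, 2), partial_trace_first(F, 2, 2), lhs, rhs)
```

The two partial traces return `R` and `S`, and both sides of the identity in (8) agree for the chosen test operators. Minimizing the cost over all such couplings is a separate optimization problem that this sketch does not attempt.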

In the definition of $\mathcal{K}$, the inequality $A \otimes I + I \otimes B \le C$ means that

$$\langle \psi | A \otimes I + I \otimes B | \psi \rangle \le \langle \psi | C | \psi \rangle \qquad \text{for all } \psi \in \operatorname{Form-Dom}(C).$$

Notice that the inf on the left hand side of the duality formula is attained: in other words, there always exists an optimal coupling $F \in \mathcal{C}(R, S)$. On the contrary, the sup in the right hand side of the duality formula is in general not attained, at least not in the class $\mathcal{K}$.

2.2. Existence of Optimal Operators $A, B$.

In this section, we explain how the sup in the right hand side of the duality formula is attained in a class of operators $(A, B)$ larger than $\mathcal{K}$.

2.2.1. Gelfand triple associated to a nonnegative trace-class operator.

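We shall use repeatedly the following construction. Given a separable Hilbert space $\mathfrak{H}$ and $T \in \mathcal{L}^1(\mathfrak{H})$ such that $T = T^* \ge 0$, let $(\xi_n)_{n \ge 1}$ be a complete orthonormal basis of $\mathfrak{H}$ of eigenvectors of $T$. Set

$$\mathcal{J}_0[T] := \operatorname{span}\{ \xi_n \text{ s.t. } \langle \xi_n | T | \xi_n \rangle > 0 \}, \tag{25}$$

and

$$(\varphi | \psi)_T := \langle \varphi | T^{-1} | \psi \rangle, \qquad \varphi, \psi \in \mathcal{J}_0[T]. \tag{26}$$

Let $\mathcal{J}[T]$ designate the completion of $\mathcal{J}_0[T]$ for the inner product $(\cdot | \cdot)_T$. Obviously

$$\mathcal{J}[T] \subset \overline{\mathcal{J}_0[T]} = \operatorname{Ker}(T)^\perp \subset \mathcal{J}[T]' \tag{27}$$

(where $\overline{\mathcal{J}_0[T]}$ is the closure of $\mathcal{J}_0[T]$ in $\mathfrak{H}$). The first inclusion is continuous since, for each $\varphi \in \mathcal{J}[T]$, one has

$$\| \varphi \|_{\mathfrak{H}}^2 \le \| T \| (\varphi | \varphi)_T = \| T \| \langle \varphi | T^{-1} | \varphi \rangle.$$

The operator $T^{-1/2}$, which is a priori defined on $\mathcal{J}_0[T]$ only, has a unique continuous extension which is the unitary transformation

$$T^{-1/2} : \mathcal{J}[T] \to \operatorname{Ker}(T)^\perp \quad \text{with adjoint} \quad T^{-1/2} : \operatorname{Ker}(T)^\perp \to \mathcal{J}[T]'. \tag{28}$$

In other words, one has a Gelfand triple

$$\mathcal{J}[T] \subset_c \operatorname{Ker}(T)^\perp \subset \mathcal{J}[T]'. \tag{29}$$

(Notice that the embedding $\mathcal{J}[T] \subset \operatorname{Ker}(T)^\perp$ is compact since $T^{1/2}$ is a Hilbert-Schmidt, and therefore compact, operator on $\mathfrak{H}$.) With the unitary transformation (28), one defines the isometric isomorphism

$$\mathcal{L}(\mathcal{J}[T], \mathcal{J}[T]') \ni \mathbf{Z} \mapsto T^{1/2} \mathbf{Z} T^{1/2} = Z \in \mathcal{L}(\operatorname{Ker}(T)^\perp). \tag{30}$$

Under this isomorphism, $\mathbf{Z}^*$ is obviously mapped to $Z^*$.

2.2.2. The optimality class $\tilde{\mathcal{K}}(R, S)$.

While the original class $\mathcal{K}$ is independent of the quantum density operators $R$ and $S$, the optimality class $\tilde{\mathcal{K}}(R, S)$ significantly depends on $R, S$.

Definition 2.3.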

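For each $R, S \in \mathcal{D}_2(\mathfrak{H})$, let $\tilde{\mathcal{K}}(R, S)$ be the set of $(\mathbf{v}, \mathbf{w})$ with $\mathbf{v} \in \mathcal{L}(\mathcal{J}[R], \mathcal{J}[R]')$ and $\mathbf{w} \in \mathcal{L}(\mathcal{J}[S], \mathcal{J}[S]')$ such that

(a) the operators $V = R^{1/2} \mathbf{v} R^{1/2}$ and $W = S^{1/2} \mathbf{w} S^{1/2}$ satisfy

$$R^{1/2} H R^{1/2} \ge V = V^* \in \mathcal{L}(\operatorname{Ker}(R)^\perp), \qquad S^{1/2} H S^{1/2} \ge W = W^* \in \mathcal{L}(\operatorname{Ker}(S)^\perp);$$

(b) for each $\Phi \in \mathcal{J}_0[R] \otimes \mathcal{J}_0[S]$, one has

$$\langle \Phi | \mathbf{v} \otimes I + I \otimes \mathbf{w} | \Phi \rangle \le \langle \Phi | C | \Phi \rangle.$$

Notice that the left hand side of the inequality in condition (b) is well defined, since $\mathcal{J}_0[R] \subset \mathcal{J}[R]$, so that $\mathbf{v} \mathcal{J}_0[R] \subset \mathcal{J}[R]'$. Hence any element of $\mathbf{v} \mathcal{J}_0[R]$ is a linear functional which can be evaluated on any element of $\mathcal{J}_0[R] \subset \mathcal{J}[R]$, and likewise any element of $\mathbf{w} \mathcal{J}_0[S]$ is a linear functional which can be evaluated on any element of $\mathcal{J}_0[S] \subset \mathcal{J}[S]$.

As for the right hand side, let $(e_j)_{j \ge 1}$ and $(f_k)_{k \ge 1}$ be complete orthonormal systems of eigenvectors of $R$ and $S$ respectively in $\mathfrak{H}$. By the implication in (69) (see Lemma C.1 in the Appendix)

$$e_j \in \operatorname{Ker}(R)^\perp \implies e_j \in \operatorname{Form-Dom}(H), \qquad f_k \in \operatorname{Ker}(S)^\perp \implies f_k \in \operatorname{Form-Dom}(H).$$

In particular

$$\mathcal{J}_0[R] \otimes \mathcal{J}_0[S] \subset \operatorname{Form-Dom}(H \otimes I + I \otimes H) \subset \operatorname{Form-Dom}(C) \tag{31}$$

so that the right hand side of the inequality in (b) is finite.

2.2.3. The sup is attained in $\tilde{\mathcal{K}}(R, S)$.

Passing from $\mathcal{K}$ to $\tilde{\mathcal{K}}(R, S)$ is equivalent to seeking the optimal Kantorovich potential in $L^1(\mathbf{R}^d, \mu)$ as in Theorems 1.3 or 2.9 of [24], instead of $C_b(\mathbf{R}^d)$; see the last sentence in Theorem 1.3 of [24], together with Remark 1.6 in that same reference.

Theorem 2.4 (Existence of optimal duality potentials). For all $R, S \in \mathcal{D}_2(\mathfrak{H})$,

$$\min_{F \in \mathcal{C}(R,S)} \operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(F^{1/2} C F^{1/2}) = \max_{(\mathbf{a}, \mathbf{b}) \in \tilde{\mathcal{K}}(R,S)} \operatorname{trace}_{\mathfrak{H}}(R^{1/2} \mathbf{a} R^{1/2} + S^{1/2} \mathbf{b} S^{1/2}).$$

If $R$ and $S$ are of finite rank, $\tilde{\mathcal{K}}(R, S) \subset \mathcal{L}(\operatorname{Ker}(R)^\perp) \times \mathcal{L}(\operatorname{Ker}(S)^\perp)$, so that any optimal pair $(\mathbf{a}, \mathbf{b})$ for the max in the right hand side of the equality above consists of operators $\mathbf{a}$ and $\mathbf{b}$ defined on the finite-dimensional linear spaces $\operatorname{Ker}(R)^\perp$ and $\operatorname{Ker}(S)^\perp$.

2.3. Structure of optimal couplings.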

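In the classical setting, pick a proper convex l.s.c. function $\varphi : \mathbf{R}^d \to \mathbf{R} \cup \{+\infty\}$, and let $\mu \in \mathcal{P}(\mathbf{R}^d)$ satisfy condition (2) and

$$\int_{\mathbf{R}^d} \big( |x|^2 + |\nabla \varphi(x)|^2 + |\varphi(x)| + |\varphi^*(\nabla \varphi(x))| \big)\, \mu(dx) < \infty. \tag{32}$$

Then

$$\pi(dxdy) := \mu(dx)\, \delta(y - \nabla \varphi(x)) \tag{33}$$

is an optimal coupling of the measures $\mu$ and $\nu := \nabla \varphi \# \mu$ for the Kantorovich problem with the cost $C(x, y) = |x - y|^2$.

We begin with a necessary and sufficient condition on density operators $R, S \in \mathcal{D}_2(\mathfrak{H})$ to have the sup in (24) attained in $\mathcal{K}$, along with an optimality criterion for the couplings of such density operators. This is the quantum analogue of the sufficient condition in Theorem 6.1.4 of [1].

Theorem 2.5 (Optimality criterion). Let $(A, B) \in \mathcal{K}$ be such that

$$\operatorname{Ker}(C - A \otimes I - I \otimes B) \ne \{0\}.$$

Let $(\Phi_j)$ be a complete orthonormal system in $\operatorname{Ker}(C - A \otimes I - I \otimes B)$, and let

$$F := \sum_j \lambda_j |\Phi_j\rangle\langle\Phi_j|, \qquad \text{with } \lambda_j \ge 0 \text{ and } \sum_j \lambda_j = 1. \tag{34}$$

Call $F_1 := \operatorname{trace}_2(F)$ and $F_2 := \operatorname{trace}_1(F)$ the partial traces of $F$ on the second and first factor in $\mathfrak{H} \otimes \mathfrak{H}$ respectively. Then $F$ is an optimal coupling of $F_1$ and $F_2$:

$$\operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(F^{1/2} C F^{1/2}) = \min_{Q \in \mathcal{C}(F_1, F_2)} \operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(Q^{1/2} C Q^{1/2}) = \sup_{(a,b) \in \mathcal{K}} \operatorname{trace}_{\mathfrak{H}}(F_1 a + F_2 b) = \operatorname{trace}_{\mathfrak{H}}(F_1 A + F_2 B).$$

Conversely, if $(A, B) \in \mathcal{K}$ is an optimal pair for $R, S \in \mathcal{D}_2(\mathfrak{H})$, i.e. if

$$MK_\hbar(R, S)^2 = \operatorname{trace}_{\mathfrak{H}}(RA + SB), \tag{35}$$

then $\operatorname{Ker}(C - A \otimes I - I \otimes B) \ne \{0\}$ and any optimal coupling of $R$ and $S$, i.e. any $F \in \mathcal{C}(R, S)$ such that

$$MK_\hbar(R, S)^2 = \operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(F^{1/2} C F^{1/2}),$$

is of the form (34).

In the previous theorem (Theorem 2.5), we have obtained a complete description of the densities $R, S \in \mathcal{D}_2(\mathfrak{H})$ such that the sup in (24) is attained in $\mathcal{K}$. Next, we give necessary conditions on the structure of the optimal couplings $F \in \mathcal{C}(R, S)$ for such density operators $R$ and $S$.

In the classical setting, the structure (33) of optimal couplings is a straightforward consequence of (32). Indeed, the set of points where the Young inequality

$$\varphi(x) + \varphi^*(y) \ge x \cdot y$$

becomes an equality is included in $\operatorname{graph}(\partial \varphi)$. This suggests the idea of looking for a quantum analogue of the Brenier optimal transport map in the optimality criterion in Theorem 2.5.

We shall need the following basic functional analytic considerations. The linear space $\operatorname{Dom}(C)$ endowed with the inner product

$$(\Phi, \Psi) \mapsto (\Phi | \Psi)_{\operatorname{Dom}(C)} = (\Phi | \Psi)_{\mathfrak{H} \otimes \mathfrak{H}} + (C\Phi | C\Psi)_{\mathfrak{H} \otimes \mathfrak{H}}$$

is a Hilbert space. Hence $C \in \mathcal{L}(\operatorname{Dom}(C), \mathfrak{H} \otimes \mathfrak{H})$ (with norm at most 1). Since $C$ is symmetric on $\operatorname{Dom}(C)$, it has a unique extension as an element of $\mathcal{L}(\mathfrak{H} \otimes \mathfrak{H}, \operatorname{Dom}(C)')$ (where $\operatorname{Dom}(C)'$ designates the topological dual of $\operatorname{Dom}(C)$), which is defined by the formula

$$\langle C\Phi, \Psi \rangle_{\operatorname{Dom}(C)', \operatorname{Dom}(C)} := (\Phi | C\Psi)_{\mathfrak{H} \otimes \mathfrak{H}} \tag{36}$$

for all $\Phi \in \mathfrak{H} \otimes \mathfrak{H}$ and $\Psi \in \operatorname{Dom}(C)$.

On the other hand, the linear space $\operatorname{Form-Dom}(H \otimes I + I \otimes H)$ endowed with the inner product

$$(\Phi, \Psi) \mapsto (\Phi | \Psi)_{\operatorname{Form-Dom}(H \otimes I + I \otimes H)} = (\Phi | \Psi)_{\mathfrak{H} \otimes \mathfrak{H}} + \langle \Phi | H \otimes I + I \otimes H | \Psi \rangle$$

is a Hilbert space. If $T \in \mathcal{L}(\operatorname{Form-Dom}(H \otimes I + I \otimes H), \mathfrak{H} \otimes \mathfrak{H})$ is a symmetric operator, it has a unique extension as an element of $\mathcal{L}(\mathfrak{H} \otimes \mathfrak{H}, \operatorname{Form-Dom}(H \otimes I + I \otimes H)')$ (where $\operatorname{Form-Dom}(H \otimes I + I \otimes H)'$ is the topological dual of $\operatorname{Form-Dom}(H \otimes I + I \otimes H)$). This extension is defined by the formula

$$\langle T\Phi, \Psi \rangle_{\operatorname{Form-Dom}(H \otimes I + I \otimes H)', \operatorname{Form-Dom}(H \otimes I + I \otimes H)} = (\Phi | T\Psi)_{\mathfrak{H} \otimes \mathfrak{H}} \tag{37}$$

for all $\Phi \in \mathfrak{H} \otimes \mathfrak{H}$ and $\Psi \in \operatorname{Form-Dom}(H \otimes I + I \otimes H)$.

In particular

$$[T, C] : \operatorname{Dom}(C) \cap \operatorname{Form-Dom}(H \otimes I + I \otimes H) \to \operatorname{Dom}(C)' + \operatorname{Form-Dom}(H \otimes I + I \otimes H)' \tag{38}$$

is a continuous linear map.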
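Since

$$\operatorname{Dom}(C)' + \operatorname{Form-Dom}(H \otimes I + I \otimes H)' \subset \big( \operatorname{Dom}(C) \cap \operatorname{Form-Dom}(H \otimes I + I \otimes H) \big)',$$

the bilinear functional

$$(\Phi, \Psi) \mapsto \langle \Phi | [T, C] | \Psi \rangle := (T\Phi | C\Psi)_{\mathfrak{H} \otimes \mathfrak{H}} - (C\Phi | T\Psi)_{\mathfrak{H} \otimes \mathfrak{H}}$$

is continuous on $\operatorname{Dom}(C) \cap \operatorname{Form-Dom}(H \otimes I + I \otimes H)$.

Henceforth we use the following notation:

$$q_j \psi(x_1, \ldots, x_d) := x_j \psi(x_1, \ldots, x_d), \qquad p_j \psi(x_1, \ldots, x_d) := -i\hbar\, \partial_{x_j} \psi(x_1, \ldots, x_d)$$

for all $\psi \in \operatorname{Form-Dom}(H)$ and all $j = 1, \ldots, d$, and

$$D_{q_j} S := \frac{i}{\hbar}[p_j, S], \qquad D_{p_j} S := -\frac{i}{\hbar}[q_j, S].$$

Theorem 2.6.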

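Let $R, S \in \mathcal{D}_2(\mathfrak{H})$, let $F \in \mathcal{C}(R, S)$ be an optimal coupling, i.e.

$$\operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(F^{1/2} C F^{1/2}) = \min_{Q \in \mathcal{C}(R,S)} \operatorname{trace}_{\mathfrak{H} \otimes \mathfrak{H}}(Q^{1/2} C Q^{1/2}),$$

and let $(A, B) \in \tilde{\mathcal{K}}(R, S)$ be a pair of optimal operators such that

$$\operatorname{trace}_{\mathfrak{H}}(RA + SB) = \sup_{(a,b) \in \mathcal{K}} \operatorname{trace}_{\mathfrak{H}}(Ra + Sb).$$

Then:

(1) If $A \in \mathcal{L}(\mathcal{J}[R], \mathfrak{H})$ (resp. $B \in \mathcal{L}(\mathcal{J}[S], \mathfrak{H})$), let us denote by the same letters $A, B$ two extensions of $A, B$ to $\mathcal{L}(\mathfrak{H})$ such that $A \otimes I + I \otimes B \le C$ on $\operatorname{Form-Dom}(C)$ (in other words, $(A, B) \in \mathcal{K}$ defined in Theorem 2.2). (Note that even when $\operatorname{Ker}(R) = \operatorname{Ker}(S) = \{0\}$, so that $\mathcal{J}[R] \otimes \mathcal{J}[S]$ is dense in $\mathfrak{H} \otimes \mathfrak{H}$, Theorem 2.4 provides optimal operators $A, B$ satisfying the constraint inequality only on $\mathcal{J}[R] \otimes \mathcal{J}[S]$ and not automatically on $\operatorname{Form-Dom}(C)$. This is why the preceding constraint inequality has to be supposed to hold true.) Then

(a) any eigenvector $\Phi$ of $F$ such that $F\Phi \ne 0$ satisfies $\Phi \in \operatorname{Dom}(C)$ and

$$C\Phi = (A \otimes I + I \otimes B)\Phi;$$

(b) let us denote

$$\mathcal{A} := \tfrac{1}{2}(H - A) \quad \text{and} \quad \mathcal{B} := \tfrac{1}{2}(H - B),$$

where $H$ is the harmonic oscillator in (19). Then, for each $j = 1, \ldots, d$, one has

$$\begin{aligned}
F^{1/2} (I \otimes q_j - D_{q_j} \mathcal{A} \otimes I) F^{1/2} &= F^{1/2} (I \otimes p_j - D_{p_j} \mathcal{A} \otimes I) F^{1/2} = 0, \\
F^{1/2} (q_j \otimes I - I \otimes D_{q_j} \mathcal{B}) F^{1/2} &= F^{1/2} (p_j \otimes I - I \otimes D_{p_j} \mathcal{B}) F^{1/2} = 0.
\end{aligned}$$

(2) If $R$ and $S$ have finite rank, one knows by Theorem 2.4 that $A \in \mathcal{L}(\operatorname{Ker}(R)^\perp)$ and $B \in \mathcal{L}(\operatorname{Ker}(S)^\perp)$. Let us denote by $\mathbf{P}$ and $\mathbf{Q}$ the orthogonal projections on $\operatorname{Ker}(R)^\perp$ and $\operatorname{Ker}(S)^\perp$ respectively. Then, by identifying $F$ with its projection on $\operatorname{Ker}(R)^\perp \otimes \operatorname{Ker}(S)^\perp$ thanks to Lemma 4.1, one has (on $\operatorname{Ker}(R)^\perp \otimes \operatorname{Ker}(S)^\perp$):

(c)

$$\big( (\mathbf{P} \otimes \mathbf{Q})\, C\, (\mathbf{P} \otimes \mathbf{Q}) - (A \otimes I_{\operatorname{Ker}(S)^\perp} + I_{\operatorname{Ker}(R)^\perp} \otimes B) \big) F = 0;$$

(d) let us denote

$$\mathcal{A}' := \tfrac{1}{2}(\mathbf{P} H \mathbf{P} - A) \quad \text{and} \quad \mathcal{B}' := \tfrac{1}{2}(\mathbf{Q} H \mathbf{Q} - B),$$

and, for each $j = 1, \ldots, d$,

$$Q^R_j = \mathbf{P} Q_j \mathbf{P}, \qquad P^R_j = \mathbf{P} P_j \mathbf{P}, \qquad Q^S_j = \mathbf{Q} Q_j \mathbf{Q}, \qquad P^S_j = \mathbf{Q} P_j \mathbf{Q}.$$

One has

$$\begin{aligned}
F^{1/2} \Big( \sum_{k=1}^d \big( \tfrac{i}{\hbar}[P^R_j, Q^R_k] \otimes Q^S_k + \tfrac{i}{\hbar}[P^R_j, P^R_k] \otimes P^S_k \big) - D_{Q^R_j} \mathcal{A}' \otimes I \Big) F^{1/2} &= 0, \\
F^{1/2} \Big( \sum_{k=1}^d \big( \tfrac{i}{\hbar}[Q^R_j, Q^R_k] \otimes Q^S_k + \tfrac{i}{\hbar}[Q^R_j, P^R_k] \otimes P^S_k \big) - D_{P^R_j} \mathcal{A}' \otimes I \Big) F^{1/2} &= 0, \\
F^{1/2} \Big( \sum_{k=1}^d \big( Q^R_k \otimes \tfrac{i}{\hbar}[P^S_j, Q^S_k] + P^R_k \otimes \tfrac{i}{\hbar}[P^S_j, P^S_k] \big) - I \otimes D_{Q^S_j} \mathcal{B}' \Big) F^{1/2} &= 0, \\
F^{1/2} \Big( \sum_{k=1}^d \big( Q^R_k \otimes \tfrac{i}{\hbar}[Q^S_j, Q^S_k] + P^R_k \otimes \tfrac{i}{\hbar}[Q^S_j, P^S_k] \big) - I \otimes D_{P^S_j} \mathcal{B}' \Big) F^{1/2} &= 0.
\end{aligned}$$

The identities in statement (b) are analogous to the condition (9),

$$(z' - \nabla a(z))\, \pi(dz, dz') = 0,$$

in the case where $a$ is smooth, so that $\partial a(z) = \{\nabla a(z)\}$ (see the Brenier or the Knott-Smith theorems, stated as Theorem 2.12 (i)-(ii) in [24]). Indeed, using the (vector-valued) operator

$$\nabla^Q = (D_{q_1}, \ldots, D_{q_d}, D_{p_1}, \ldots, D_{p_d})$$

together with the vector of operators $Z$ defined right after (11), statement (b) of Theorem 2.6 reads

$$F^{1/2} (Z \otimes I - I \otimes \nabla^Q \mathcal{A}) F^{1/2} = 0. \tag{39}$$

Notice that the quantum analogue of the function $a$ is the operator $\tfrac{1}{2}(H - A)$ (equivalently, the classical analogue of $A$ is $(|q|^2 + |p|^2) - 2\varphi(q, p)$): see Remark 2.13 (iii) following Theorem 2.12 in [24], where the relation between $\varphi$ and the optimal pair in the Kantorovich duality theorem is described in detail.

Concerning (d), it is a straightforward computation to show that the first two equalities can be synthesized as formulas (14)-(15) in the introduction.

3. Proof of Theorem 2.2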

Set $E:=\mathcal L(\mathfrak H\otimes\mathfrak H)$. Define $f,g:E\to\mathbf R\cup\{+\infty\}$ by the formulas
$$f(T):=\begin{cases}0&\text{if }T=T^*\ge-C,\\+\infty&\text{otherwise,}\end{cases}\qquad g(T):=\begin{cases}\operatorname{trace}_{\mathfrak H}(RA+SB)&\text{if }T=T^*=A\otimes I+I\otimes B,\\+\infty&\text{otherwise.}\end{cases}$$
For each $T=T^*\in\mathcal L(\mathfrak H\otimes\mathfrak H)$, the constraint $T\ge-C$ in the definition of $f$ is to be understood as follows:
$$\langle\phi|T|\phi\rangle\ge-\langle\phi|C|\phi\rangle\quad\text{for each }\phi\in\text{Form-Dom}(C).$$
On the other hand, the nullspace of the linear map
$$\Gamma:\ \mathcal L(\mathfrak H)\times\mathcal L(\mathfrak H)\ni(A,B)\mapsto A\otimes I+I\otimes B\in\mathcal L(\mathfrak H\otimes\mathfrak H)$$
is $\operatorname{Ker}(\Gamma)=\{(tI,-tI)\ \text{s.t.}\ t\in\mathbf C\}$. Since $\operatorname{trace}_{\mathfrak H}(R)=\operatorname{trace}_{\mathfrak H}(S)=1$, one has
$$\operatorname{trace}_{\mathfrak H}(RA+SB)=t\operatorname{trace}_{\mathfrak H}(R-S)=0\quad\text{for }(A,B)=(tI,-tI)\in\operatorname{Ker}(\Gamma),$$
so that $A\otimes I+I\otimes B\mapsto\operatorname{trace}_{\mathfrak H}(RA+SB)$ defines a unique linear functional on $\operatorname{ran}(\Gamma)$. Besides, $(A\otimes I+I\otimes B)^*=A^*\otimes I+I\otimes B^*$, so that, by cyclicity of the trace,
$$T=A\otimes I+I\otimes B\ \text{and}\ T=T^*\implies A=A^*\ \text{and}\ B=B^*\implies g(T)=\operatorname{trace}_{\mathfrak H}(RA^*+SB^*)=\overline{\operatorname{trace}_{\mathfrak H}(AR+BS)}=\overline{g(T)}.$$
Therefore, the prescription above indeed defines a unique function $g$ on $E$ with values in $(-\infty,+\infty]$.

One easily checks that $f$ and $g$ are convex. Indeed, $f$ is the indicator function, in the convex-analysis sense (equal to $0$ on the set and to $+\infty$ outside it), of the convex set
$$\{T=T^*\in E\ \text{s.t.}\ T\ge-C\},$$
while $g$ is the extension by $+\infty$ of a real-valued linear functional defined on the linear subspace $\operatorname{ran}(\Gamma)$ of $E$. Clearly,
$$f(0)=g(0)=0.$$
Moreover $f$ is continuous at $0$. Indeed, the Heisenberg uncertainty inequality implies that
$$(40)\qquad C\ge2d\hbar\,I,$$
so that
$$T=T^*\ \text{and}\ \|T\|<2d\hbar\implies T\ge-2d\hbar\,I\ge-C.$$
Hence
$$T=T^*\ \text{and}\ \|T\|<2d\hbar\implies f(T)=0,$$
so that $f$ is continuous at $0$.

By the Fenchel-Rockafellar duality theorem (Theorem 1.12 in [5]),
$$\inf_{T\in E}\big(f(T)+g(T)\big)=\max_{\Lambda\in E'}\big(-f^*(-\Lambda)-g^*(\Lambda)\big).$$
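The uncertainty inequality (40) can be checked by a standard factorization. The sketch below assumes that the cost operator has the form $C=\sum_j(X_j^2+P_j^2)$ with $X_j:=x_j-y_j$ and $P_j:=-i\hbar(\partial_{x_j}-\partial_{y_j})$; the symbols $X_j,P_j$ are introduced here only for illustration, and the computation is formal (domain questions are ignored).

```latex
% Assumption: C = \sum_{j=1}^d (X_j^2 + P_j^2), with
%   X_j := x_j - y_j,   P_j := -i\hbar(\partial_{x_j} - \partial_{y_j}).
\begin{align*}
[X_j, P_j] &= [x_j, -i\hbar\partial_{x_j}] + [y_j, -i\hbar\partial_{y_j}] = 2i\hbar, \\
(X_j - iP_j)(X_j + iP_j) &= X_j^2 + P_j^2 + i[X_j, P_j] = X_j^2 + P_j^2 - 2\hbar, \\
X_j^2 + P_j^2 &= (X_j + iP_j)^*(X_j + iP_j) + 2\hbar \;\ge\; 2\hbar, \\
C = \sum_{j=1}^d (X_j^2 + P_j^2) &\ge 2d\hbar\, I .
\end{align*}
```

The factor $2$ in $2d\hbar$ comes from the commutator $[X_j,P_j]=2i\hbar$: the difference variables behave like a canonical pair with effective Planck constant $2\hbar$.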

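Let us compute $f^*$ and $g^*$. First,
$$f^*(-\Lambda)=\sup_{T\in E}\big(\langle-\Lambda,T\rangle-f(T)\big)=\sup_{\substack{T\in E\\T=T^*\ge-C}}\langle-\Lambda,T\rangle.$$
If $\Lambda\in E'$ is not $\ge0$, there exists $T=T^*\ge0$ such that $\langle\Lambda,T\rangle=-\alpha<0$. In particular, $nT=nT^*\ge-C$ for each $n\ge0$, so that
$$f^*(-\Lambda)\ge\sup_{n\ge0}\langle-\Lambda,nT\rangle=\sup_{n\ge0}n\alpha=+\infty.$$
For $\Lambda\in E'$ such that $\Lambda\ge0$, define
$$\langle\Lambda,C\rangle:=\sup_{\substack{T\in E\\T=T^*\le C}}\langle\Lambda,T\rangle\in[0,+\infty].$$
(That $\langle\Lambda,C\rangle\ge0$ follows from taking $T=0$.) Replacing $T$ with $-T$ in the supremum defining $f^*(-\Lambda)$ gives
$$f^*(-\Lambda)=\begin{cases}\langle\Lambda,C\rangle&\text{if }\Lambda\ge0,\\+\infty&\text{otherwise.}\end{cases}$$
Next,
$$g^*(\Lambda)=\sup_{T\in E}\big(\langle\Lambda,T\rangle-g(T)\big)=\sup_{\substack{T=T^*\in E\\T=A\otimes I+I\otimes B}}\big(\langle\Lambda,T\rangle-\operatorname{trace}_{\mathfrak H}(RA+SB)\big).$$
If there exist $A=A^*\in\mathcal L(\mathfrak H)$ and $B=B^*\in\mathcal L(\mathfrak H)$ such that $\langle\Lambda,A\otimes I+I\otimes B\rangle>\operatorname{trace}_{\mathfrak H}(RA+SB)$, then
$$g^*(\Lambda)\ge\sup_{n\ge1}\big(n\langle\Lambda,A\otimes I+I\otimes B\rangle-n\operatorname{trace}_{\mathfrak H}(RA+SB)\big)=+\infty.$$
Likewise, if $\langle\Lambda,A\otimes I+I\otimes B\rangle<\operatorname{trace}_{\mathfrak H}(RA+SB)$, then
$$g^*(\Lambda)\ge\sup_{n\ge1}\big(\langle\Lambda,-n(A\otimes I+I\otimes B)\rangle-\operatorname{trace}_{\mathfrak H}(-n(RA+SB))\big)=+\infty.$$
Hence
$$g^*(\Lambda)=\begin{cases}0&\text{if }\langle\Lambda,A\otimes I+I\otimes B\rangle=\operatorname{trace}_{\mathfrak H}(RA+SB)\text{ for all }A=A^*,\ B=B^*\in\mathcal L(\mathfrak H),\\+\infty&\text{otherwise.}\end{cases}$$
Notice that the prescription $\langle\Lambda,T\rangle=\operatorname{trace}_{\mathfrak H}(RA+SB)$ whenever $T=T^*\in\operatorname{ran}(\Gamma)$ defines a unique linear functional on $\operatorname{ran}(\Gamma)$, since $(A,B)\mapsto\operatorname{trace}_{\mathfrak H}(RA+SB)$ vanishes on $\operatorname{Ker}(\Gamma)$, as explained above.

By the Fenchel-Rockafellar duality theorem recalled above,
$$\inf_{T\in E}\big(f(T)+g(T)\big)=\inf_{\substack{A=A^*,\,B=B^*\in\mathcal L(\mathfrak H)\\A\otimes I+I\otimes B\ge-C}}\operatorname{trace}_{\mathfrak H}(RA+SB)=\max_{\Lambda\in E'}\big(-f^*(-\Lambda)-g^*(\Lambda)\big)=\max_{\substack{0\le\Lambda\in E'\\\langle\Lambda,A\otimes I+I\otimes B\rangle=\operatorname{trace}_{\mathfrak H}(RA+SB)}}-\langle\Lambda,C\rangle,$$
or equivalently, after exchanging the signs,
$$\sup_{(A,B)\in\mathfrak K}\operatorname{trace}_{\mathfrak H}(RA+SB)=\min_{\substack{0\le\Lambda\in E'\\\langle\Lambda,A\otimes I+I\otimes B\rangle=\operatorname{trace}_{\mathfrak H}(RA+SB)}}\langle\Lambda,C\rangle.$$
(We recall that the constraint $A\otimes I+I\otimes B\le C$ in the definition of $\mathfrak K$ is to be understood as explained immediately after the statement of Theorem 2.2.)

One can further restrict the min on the right hand side with the following observations.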

Lemma 3.1.

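Let $V=\mathcal L(\mathfrak H)$ where $\mathfrak H$ is a separable Hilbert space. If $\ell\in V'$ satisfies $\ell\ge0$, then
$$T=T^*\in V\implies\langle\ell,T\rangle\in\mathbf R\qquad\text{and}\qquad\|\ell\|=\langle\ell,I_{\mathfrak H}\rangle.$$

Proof. For all $T=T^*\in V$, one has $-\|T\|I_{\mathfrak H}\le T\le\|T\|I_{\mathfrak H}$, so that
$$-\|T\|\langle\ell,I_{\mathfrak H}\rangle\le\langle\ell,T\rangle\le\|T\|\langle\ell,I_{\mathfrak H}\rangle.$$
In particular, for all $T=T^*\in V$, one has $\langle\ell,T\rangle\in\mathbf R$. For all $T\in V$ (not necessarily self-adjoint), write
$$\Re(T)=\tfrac12(T+T^*)\qquad\text{and}\qquad\Im(T):=\tfrac i2(T^*-T).$$
If $\langle\ell,T\rangle\neq0$, there exists $\alpha\in\mathbf C$ such that $|\alpha|=1$ and $\langle\ell,\alpha T\rangle=|\langle\ell,T\rangle|$. These considerations show immediately that $\langle\ell,\Im(\alpha T)\rangle=0$ and
$$|\langle\ell,T\rangle|=\langle\ell,\Re(\alpha T)\rangle\le\langle\ell,I_{\mathfrak H}\rangle\|\Re(\alpha T)\|\le\tfrac12\langle\ell,I_{\mathfrak H}\rangle\big(\|\alpha T\|+\|(\alpha T)^*\|\big)=\langle\ell,I_{\mathfrak H}\rangle\|T\|.$$
Hence $\|\ell\|\le\langle\ell,I_{\mathfrak H}\rangle$, while it is obvious that $\langle\ell,I_{\mathfrak H}\rangle\le\|\ell\|$. This concludes the proof of Lemma 3.1. □

Lemma 3.2.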

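Let $0\le\Lambda\in E'$. Then there exists $Q\in\mathcal L^1(\mathfrak H\otimes\mathfrak H)$ such that
$$Q=Q^*\ge0\qquad\text{and}\qquad\|Q\|_{\mathcal L^1}\le\|\Lambda\|,$$
and $L\in E'$ such that
$$L\ge0,\qquad L|_{\mathcal K(\mathfrak H\otimes\mathfrak H)}=0,\qquad\text{and}\qquad\|L\|\le\|\Lambda\|,$$
satisfying
$$\Lambda=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q\,\bullet)+L.$$

Proof. Since $\mathcal L^1(\mathfrak H\otimes\mathfrak H)=\mathcal K(\mathfrak H\otimes\mathfrak H)'$, one has
$$\Lambda|_{\mathcal K(\mathfrak H\otimes\mathfrak H)}=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q\,\bullet)$$
for some $Q\in\mathcal L^1(\mathfrak H\otimes\mathfrak H)$.

First, observe that $\Lambda\ge0\implies Q=Q^*\ge0$. Indeed, since $Q\in\mathcal L^1(\mathfrak H\otimes\mathfrak H)$, $Q$ is compact. Writing
$$\Re(Q)=\tfrac12(Q+Q^*)\qquad\text{and}\qquad\Im(Q)=-\tfrac i2(Q-Q^*),$$
one has $\Re(Q)=\Re(Q)^*$ and $\Im(Q)=\Im(Q)^*$, so that
$$\langle\Lambda,\Im(Q)\rangle=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(\Re(Q)\Im(Q))+i\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(\Im(Q)^2)\in\mathbf R.$$
Since
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(\Re(Q)\Im(Q))=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(\Im(Q)\Re(Q))=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big((\Re(Q)\Im(Q))^*\big)=\overline{\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(\Re(Q)\Im(Q))}\in\mathbf R,$$
one concludes that $\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(\Im(Q)^2)=0$, so that $\Im(Q)=0$. Thus $Q=Q^*$. Next observe that, for each $\xi\in\mathfrak H\otimes\mathfrak H$, one has $|\xi\rangle\langle\xi|=(|\xi\rangle\langle\xi|)^*\ge0$, so that
$$\langle\Lambda,|\xi\rangle\langle\xi|\rangle=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q|\xi\rangle\langle\xi|)=\langle\xi|Q|\xi\rangle\ge0.$$

Let $(\phi_n)_{n\ge1}$ be a complete orthonormal sequence of eigenvectors of $Q$ in $\mathfrak H\otimes\mathfrak H$, and let $\lambda_n$ be the eigenvalue of $Q$ associated to $\phi_n$. Then
$$\|Q\|_{\mathcal L^1}=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)=\sup_{n\ge1}\sum_{k=1}^n\lambda_k=\sup_{n\ge1}\Big\langle\Lambda,\sum_{k=1}^n|\phi_k\rangle\langle\phi_k|\Big\rangle\le\langle\Lambda,I_{\mathfrak H\otimes\mathfrak H}\rangle=\|\Lambda\|.$$
Define $L:=\Lambda-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q\,\bullet)$, so that $L|_{\mathcal K(\mathfrak H\otimes\mathfrak H)}=0$. Let $\Pi_n$ be the orthogonal projection on $\operatorname{span}(\phi_1,\dots,\phi_n)$. Obviously $\Pi_nQ=Q\Pi_n$. Then, for each $T=T^*\ge0$ in $E$,
$$\begin{aligned}
0&\le\langle\Lambda,(I_{\mathfrak H\otimes\mathfrak H}-\Pi_n)T(I_{\mathfrak H\otimes\mathfrak H}-\Pi_n)\rangle\\
&=\langle\Lambda,T\rangle-\langle\Lambda,T\Pi_n\rangle-\langle\Lambda,\Pi_nT\rangle+\langle\Lambda,\Pi_nT\Pi_n\rangle\\
&=\langle\Lambda,T\rangle-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(Q(T\Pi_n+\Pi_nT-\Pi_nT\Pi_n)\big)\\
&=\langle\Lambda,T\rangle-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(\Pi_nQ\Pi_nT)\\
&\to\langle\Lambda,T\rangle-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(QT)=\langle L,T\rangle
\end{aligned}$$
as $n\to\infty$, since $Q\in\mathcal L^1(\mathfrak H\otimes\mathfrak H)$, so that $\Pi_nQ\Pi_n\to Q$ in $\mathcal L^1(\mathfrak H\otimes\mathfrak H)$ as $n\to\infty$. This shows that $L\ge0$. In particular (see Lemma 3.1), one has
$$\|L\|=\langle L,I_{\mathfrak H\otimes\mathfrak H}\rangle=\langle\Lambda,I_{\mathfrak H\otimes\mathfrak H}\rangle-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)\le\langle\Lambda,I_{\mathfrak H\otimes\mathfrak H}\rangle=\|\Lambda\|.$$
This concludes the proof of Lemma 3.2. □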
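As a sanity check of Lemma 3.2, one may look at the toy case where the underlying space has finite dimension $n$ (an assumption made only for this illustration, with a hypothetical orthonormal basis $e_1,\dots,e_n$). Every operator is then compact, so the singular part $L$ vanishes and $\Lambda$ is entirely represented by its density matrix:

```latex
% Toy case (assumption): \dim(\mathfrak H \otimes \mathfrak H) = n < \infty,
% orthonormal basis e_1, ..., e_n, and T_{ji} := \langle e_j | T | e_i \rangle.
\begin{align*}
Q_{ij} &:= \langle \Lambda,\, |e_j\rangle\langle e_i| \rangle, \qquad 1 \le i,j \le n, \\
\langle \Lambda, T\rangle &= \sum_{i,j=1}^n T_{ji}\, Q_{ij} = \operatorname{trace}(QT),
  \qquad T = \sum_{i,j=1}^n T_{ji}\, |e_j\rangle\langle e_i|, \\
L &= \Lambda - \operatorname{trace}(Q\,\bullet) = 0, \qquad
\|\Lambda\| = \langle \Lambda, I\rangle = \operatorname{trace}(Q).
\end{align*}
```

In infinite dimension, the term $L$ accounts for the part of $\Lambda$ that is invisible on compact operators; Lemma 3.3 shows that it vanishes under the marginal constraints.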

Lemma 3.3.

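Let $0\le\Lambda\in E'$ satisfy
$$\langle\Lambda,A\otimes I+I\otimes B\rangle=\operatorname{trace}_{\mathfrak H}(RA+SB)$$
for all $A=A^*$ and $B=B^*\in\mathcal L(\mathfrak H)$. Then $\Lambda$ is of the form
$$\Lambda=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q\,\bullet),\qquad\text{with }Q=Q^*\ge0\text{ and }\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)=1.$$
In particular, $Q$ is a coupling of $R$ and $S$.

Proof. Let $(e_1,e_2,\dots)$ be a complete orthonormal system in $\mathfrak H$, and let $P_n$ be the orthogonal projection on $\operatorname{span}(e_1,\dots,e_n)$. Consider
$$T_n:=(I_{\mathfrak H}-P_n)\otimes P_n+P_n\otimes(I_{\mathfrak H}-P_n)\ge0,\qquad n\ge1.$$
Since $P_n\otimes P_n\ge0$, one has
$$0\le T_n\le I_{\mathfrak H}\otimes P_n+P_n\otimes I_{\mathfrak H}\le2I_{\mathfrak H\otimes\mathfrak H},$$
and in fact $T_n\le I_{\mathfrak H\otimes\mathfrak H}$, since $I_{\mathfrak H\otimes\mathfrak H}-T_n=P_n\otimes P_n+(I_{\mathfrak H}-P_n)\otimes(I_{\mathfrak H}-P_n)\ge0$. Hence, with $Q$ and $L$ as in Lemma 3.2,
$$\begin{aligned}
0\le\langle\Lambda,T_n\rangle&\le\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(Q((I_{\mathfrak H}-P_n)\otimes I_{\mathfrak H}+I_{\mathfrak H}\otimes(I_{\mathfrak H}-P_n))\big)+\langle L,T_n\rangle\\
&\le\operatorname{trace}_{\mathfrak H}\big((Q_1+Q_2)(I_{\mathfrak H}-P_n)\big)+\langle L,I_{\mathfrak H\otimes\mathfrak H}\rangle\\
&\to\langle L,I_{\mathfrak H\otimes\mathfrak H}\rangle=\langle\Lambda,I_{\mathfrak H\otimes\mathfrak H}\rangle-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)
\end{aligned}$$
as $n\to+\infty$. In the formula above, $Q_1,Q_2$ are the partial traces of $Q$, defined as follows:
$$Q_1\in\mathcal L^1(\mathfrak H)\text{ and }\operatorname{trace}_{\mathfrak H}(Q_1A)=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q(A\otimes I_{\mathfrak H})),$$
$$Q_2\in\mathcal L^1(\mathfrak H)\text{ and }\operatorname{trace}_{\mathfrak H}(Q_2A)=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q(I_{\mathfrak H}\otimes A)),$$
for each $A\in\mathcal L(\mathfrak H)$. Thus
$$\lim_{n\to\infty}\langle\Lambda,T_n\rangle\le\langle\Lambda,I_{\mathfrak H\otimes\mathfrak H}\rangle-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)=1-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q).$$
(Taking $A=I$ and $B=0$ in the assumption on $\Lambda$ shows that $\langle\Lambda,I_{\mathfrak H\otimes\mathfrak H}\rangle=\operatorname{trace}_{\mathfrak H}(R)=1$.) On the other hand,
$$T_n=I_{\mathfrak H}\otimes P_n+P_n\otimes I_{\mathfrak H}-2P_n\otimes P_n,$$
and hence
$$\langle\Lambda,T_n\rangle=\operatorname{trace}_{\mathfrak H}((R+S)P_n)-2\langle\Lambda,P_n\otimes P_n\rangle=\operatorname{trace}_{\mathfrak H}((R+S)P_n)-2\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q(P_n\otimes P_n)),$$
since $P_n\otimes P_n$ is a finite-rank operator (and therefore a compact operator). Thus
$$\lim_{n\to\infty}\langle\Lambda,T_n\rangle=\operatorname{trace}_{\mathfrak H}(R+S)-2\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)=2\big(1-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)\big).$$
Therefore
$$0\le2\big(1-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)\big)=\lim_{n\to\infty}\langle\Lambda,T_n\rangle\le1-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q),$$
so that
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)=1\qquad\text{and}\qquad\|L\|=\langle\Lambda,I\rangle-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)=1-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)=0.$$
Summarizing, we have proved that $\Lambda$ is represented by $Q\in\mathcal L^1(\mathfrak H\otimes\mathfrak H)$ such that $\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q)=1$, and the condition $\Lambda\ge0$ implies that $Q=Q^*\ge0$. Finally,
$$\langle\Lambda,A\otimes I_{\mathfrak H}\rangle=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q(A\otimes I_{\mathfrak H}))=\operatorname{trace}_{\mathfrak H}(RA),\qquad\langle\Lambda,I_{\mathfrak H}\otimes B\rangle=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q(I_{\mathfrak H}\otimes B))=\operatorname{trace}_{\mathfrak H}(SB),$$
so that the partial traces of $Q$ are $Q_1=R$ and $Q_2=S$, meaning that $Q\in\mathcal C(R,S)$. This concludes the proof of Lemma 3.3. □

At this point, we have proved that the minimizing linear functional $\Lambda$ in the duality formula above is represented by $Q\in\mathcal C(R,S)$. In other words,
$$\sup_{(A,B)\in\mathfrak K}\operatorname{trace}_{\mathfrak H}(RA+SB)=\min_{Q\in\mathcal C(R,S)}\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(QC),$$
with the notation
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(QC):=\sup_{\substack{T=T^*\in E\\T\le C}}\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(QT),$$
where the constraint $T\le C$ has the meaning recalled above. Let us prove that
$$(41)\qquad\sup_{\substack{T=T^*\in E\\T\le C}}\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(QT)=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q^{1/2}CQ^{1/2}).$$
Let $(\Phi_k)_{k\ge1}$ be a complete orthonormal system of eigenvectors of $Q$, and let $(\lambda_k)_{k\ge1}$ be the sequence of eigenvalues of $Q$ such that $Q\Phi_k=\lambda_k\Phi_k$ for each $k\ge1$. Then
$$2\hbar\sum_{k\ge1}\sum_{\substack{m_1,\dots,m_d\ge0\\n_1,\dots,n_d\ge0}}\lambda_k\big(2(n_1+\dots+n_d)+d\big)\,|\langle\Psi_{m_1,\dots,m_d,n_1,\dots,n_d}|\Phi_k\rangle|^2=\sum_{k\ge1}\lambda_k\langle\Phi_k|C|\Phi_k\rangle\le2\sum_{k\ge1}\lambda_k\langle\Phi_k|H\otimes I+I\otimes H|\Phi_k\rangle<\infty.$$
By Corollary C.2, one has
$$C_N:=\big(I_{\mathfrak H\otimes\mathfrak H}+\tfrac1NC\big)^{-1}C=C_N^*\in\mathcal L(\mathfrak H\otimes\mathfrak H)\quad\text{for each }N\ge1,$$
and
$$(42)\qquad\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(QC_N)\to\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q^{1/2}CQ^{1/2})\quad\text{as }N\to\infty.$$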
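The properties of the bounded approximation $C_N$ used in this part of the argument can be read off from the scalar function that defines it through the functional calculus. The sketch below assumes that $C_N=(I+\tfrac1NC)^{-1}C$ with $C=C^*\ge0$; the name $c_N$ is introduced only for this illustration.

```latex
% Assumption: C_N = c_N(C) with C = C^* \ge 0 (functional calculus).
\begin{align*}
c_N(\lambda) &:= \frac{\lambda}{1 + \lambda/N} = \frac{N\lambda}{N + \lambda},
  \qquad \lambda \ge 0,\ N \ge 1, \\
0 \le c_N(\lambda) &\le \min(\lambda, N), \\
c_{N+1}(\lambda) - c_N(\lambda) &= \frac{\lambda^2}{(N+\lambda)(N+1+\lambda)} \ge 0, \\
\lim_{N\to\infty} c_N(\lambda) &= \lambda .
\end{align*}
```

Under this assumption, $C_N$ is bounded with $\|C_N\|\le N$, satisfies $0\le C_N\le C$, and increases to $C$ as $N\to\infty$, which is consistent with a monotone convergence argument for the traces.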

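Since $0\le C_N=C_N^*\le C$ for each $N\ge1$, one has
$$(43)\qquad\lim_{N\to\infty}\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(C_NQ)\le\sup_{\substack{T=T^*\in E\\T\le C}}\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(QT)=\sup_{(A,B)\in\mathfrak K}\operatorname{trace}_{\mathfrak H}(RA+SB).$$
On the other hand, since
$$\lambda_k>0\implies\Phi_k\in\text{Form-Dom}(H\otimes I+I\otimes H)\subset\text{Form-Dom}(C),$$
for each $(A,B)\in\mathfrak K$, one has
$$\sum_{k\ge1}\lambda_k\langle\Phi_k|A\otimes I_{\mathfrak H}+I_{\mathfrak H}\otimes B|\Phi_k\rangle\le\sum_{k\ge1}\lambda_k\langle\Phi_k|C|\Phi_k\rangle,$$
where $(\Phi_k)_{k\ge1}$ is a complete orthonormal system of eigenvectors of $Q$ with eigenvalues $(\lambda_k)_{k\ge1}$, or equivalently, since $Q\in\mathcal C(R,S)$,
$$(44)\qquad\operatorname{trace}_{\mathfrak H}(RA+SB)=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(Q^{1/2}(A\otimes I_{\mathfrak H}+I_{\mathfrak H}\otimes B)Q^{1/2}\big)\le\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(Q^{1/2}CQ^{1/2}).$$
The inequalities (42), (43) and (44) obviously imply (41), and this concludes the proof of Theorem 2.2.

4. Proof of Theorem 2.4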

Let $(A_k,B_k)\in\mathfrak K$ be a maximizing sequence, i.e.
$$\operatorname{trace}(RA_k+SB_k)\to\sup_{(A,B)\in\mathfrak K}\operatorname{trace}(RA+SB)=:\tau\in[0,+\infty)\quad\text{as }k\to\infty.$$
That $\tau<+\infty$ comes from the fact that the inf in Theorem 2.2 is attained by some optimal coupling $F\in\mathcal C(R,S)$, and that $R$ and $S$ both belong to $\mathcal D^2(\mathfrak H)$. Indeed, using Lemma C.3 shows that
$$F\in\mathcal C(R,S)\implies0\le\operatorname{trace}(F^{1/2}CF^{1/2})\le2\operatorname{trace}\big(F^{1/2}(H\otimes I+I\otimes H)F^{1/2}\big)=2\operatorname{trace}(R^{1/2}HR^{1/2}+S^{1/2}HS^{1/2})<+\infty.$$

4.1. Step 1: normalizing the maximizing sequence.

For each $k\ge1$, set $a_k:=2H-A_k$ and $b_k:=2H-B_k$. Thus
$$a_k\otimes I+I\otimes b_k\ge2(H\otimes I+I\otimes H)-C=\sum_{j=1}^d\big((-i\hbar\partial_{x_j}-i\hbar\partial_{y_j})^2+(x_j+y_j)^2\big)=:\Sigma\ge0.$$
The operator $\Sigma$ satisfies the same uncertainty inequality as $C$:
$$(45)\qquad\Sigma=\sum_{j=1}^d\big((x_j+y_j)-i(-i\hbar\partial_{x_j}-i\hbar\partial_{y_j})\big)\big((x_j+y_j)+i(-i\hbar\partial_{x_j}-i\hbar\partial_{y_j})\big)+\sum_{j=1}^di\big([-i\hbar\partial_{x_j},x_j]+[-i\hbar\partial_{y_j},y_j]\big)\ge2d\hbar\,I\otimes I,$$
and
$$(46)\qquad\text{Form-Dom}(\Sigma)=\{\psi\in\mathfrak H\otimes\mathfrak H\ \text{s.t.}\ (x_j+y_j)\psi\text{ and }(\partial_{x_j}+\partial_{y_j})\psi\in\mathfrak H\otimes\mathfrak H,\ 1\le j\le d\}.$$
Set $\alpha_k:=\sup\{\alpha\in\mathbf R\ \text{s.t.}\ a_k\ge\alpha I\}$ for each $k\ge1$. Since $H=H^*\ge0$, one has $a_k\ge-A_k\ge-\|A_k\|I$, so that $\alpha_k\ge-\|A_k\|$. On the other hand, let $e$ be a normalized eigenvector of $R$ such that $Re\neq0$. Since $R\in\mathcal D^2(\mathfrak H)$, one has $0\le\langle e|H|e\rangle<+\infty$ by (69) (see Lemma C.1 in the Appendix), so that
$$a_k\ge\alpha I\implies\alpha\le\langle e|a_k|e\rangle\le2\langle e|H|e\rangle+\|A_k\|.$$
Hence
$$\alpha_k\in\big[-\|A_k\|,\ 2\langle e|H|e\rangle+\|A_k\|\big].$$
By definition of $\alpha_k$, there exists $\phi_n\in\operatorname{Dom}(H)$ such that
$$\|\phi_n\|_{\mathfrak H}=1\quad\text{and}\quad\langle\phi_n|a_k|\phi_n\rangle\to\alpha_k\text{ as }n\to\infty,\quad\text{for each }k\ge1.$$
Thus
$$\langle\phi_n|a_k|\phi_n\rangle I+b_k\ge2d\hbar\,I\quad\text{for each }n\ge1,$$
so that
$$\alpha_kI+b_k\ge2d\hbar\,I.$$
On the other hand, again by definition of $\alpha_k$, one has $a_k-\alpha_kI\ge0$. Setting
$$\hat a_k:=a_k-\alpha_kI+d\hbar\,I,\qquad\hat b_k:=b_k+\alpha_kI-d\hbar\,I,$$
one has
$$\hat a_k\otimes I+I\otimes\hat b_k=a_k\otimes I+I\otimes b_k\ge\Sigma,\qquad\hat a_k=\hat a_k^*\ge d\hbar\,I,\qquad\hat b_k=\hat b_k^*\ge d\hbar\,I.$$
Finally,
$$\begin{aligned}
0\le\operatorname{trace}_{\mathfrak H}(R^{1/2}\hat a_kR^{1/2}+S^{1/2}\hat b_kS^{1/2})&=\operatorname{trace}_{\mathfrak H}(R^{1/2}a_kR^{1/2}+S^{1/2}b_kS^{1/2})\\
&=2\operatorname{trace}_{\mathfrak H}(R^{1/2}HR^{1/2}+S^{1/2}HS^{1/2})-\operatorname{trace}_{\mathfrak H}(RA_k+SB_k)\\
&\to2\operatorname{trace}_{\mathfrak H}(R^{1/2}HR^{1/2}+S^{1/2}HS^{1/2})-\tau
\end{aligned}$$
as $k\to\infty$.

4.2. Step 2: defining the unbounded operators $\mathfrak a$ and $\mathfrak b$.

With the minimizing sequence $(a_k,b_k)$ replaced with its normalized variant $(\hat a_k,\hat b_k)$ as explained in the previous section, one has
$$0\le\operatorname{trace}(R^{1/2}\hat a_kR^{1/2})\le\sup_k\operatorname{trace}(R^{1/2}\hat a_kR^{1/2})<+\infty,\qquad0\le\operatorname{trace}(S^{1/2}\hat b_kS^{1/2})\le\sup_k\operatorname{trace}(S^{1/2}\hat b_kS^{1/2})<+\infty,$$
since both sequences are nonnegative and their sum converges as $k\to\infty$. Therefore, the sequences of operators $R^{1/2}\hat a_kR^{1/2}$ and $S^{1/2}\hat b_kS^{1/2}$ are bounded in $\mathcal L^1(\mathfrak H)$. Since $\mathcal L^1(\mathfrak H)$ is the topological dual of $\mathcal K(\mathfrak H)$ (the algebra of compact operators on $\mathfrak H$), the Banach-Alaoglu theorem implies that there exists a subsequence of $(\hat a_k,\hat b_k)$ (abusively denoted $(\hat a_k,\hat b_k)$ for simplicity) such that
$$R^{1/2}\hat a_kR^{1/2}\to V\quad\text{and}\quad S^{1/2}\hat b_kS^{1/2}\to W\quad\text{in }\mathcal L^1(\mathfrak H)\text{ weak-* as }k\to\infty.$$
Since $\hat a_k=\hat a_k^*\ge d\hbar\,I$ and $\hat b_k=\hat b_k^*\ge d\hbar\,I$, one has
$$V=V^*\ge d\hbar\,R\qquad\text{and}\qquad W=W^*\ge d\hbar\,S.$$
In particular
$$\operatorname{Ker}(V)\subset\operatorname{Ker}(R)\qquad\text{and}\qquad\operatorname{Ker}(W)\subset\operatorname{Ker}(S).$$
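The kernel inclusions stated at this point follow from the operator inequalities $V\ge d\hbar R$ and $W\ge d\hbar S$ by a one-line quadratic form argument. A minimal sketch for $V$ (the case of $W$ is identical), written for an arbitrary vector $x$ in the Hilbert space:

```latex
% Assumption: V = V^* \ge d\hbar R with R = R^* \ge 0 and \hbar > 0.
\begin{align*}
Vx = 0 \ \Longrightarrow\
0 = \langle x | V | x \rangle \;\ge\; d\hbar\, \langle x | R | x \rangle
  = d\hbar\, \| R^{1/2} x \|^2 \;\ge\; 0
\ \Longrightarrow\ R^{1/2} x = 0
\ \Longrightarrow\ Rx = 0 .
\end{align*}
```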

On the other handran ( V ) ⊂ ran ( R / ) = ran ( R ) and ran ( W ) ⊂ ran ( S / ) = ran ( S ) . (To check the ﬁrst inclusion, pick ξ = V x , and observe that ⟨ y ∣ R / ˆ a k R / ∣ x ⟩ = trace ( R / ˆ a k R / ∣ x ⟩⟨ y ∣) → ⟨ y ∣ V ∣ x ⟩ so that ξ k = R / ˆ a k R / x ∈ ran ( R / ) satisﬁes ξ k → ξ weakly in H . Hence ξ belongsto the weak closure of ran ( R / ) , which is equal to its strong closure ran ( R / ) since ran ( R / ) is a convex subset of H : see Theorem 3.7 in [5].) SinceKer ( V ) ⊥ = ran ( V ) ⊂ ran ( R ) = Ker ( R ) ⊥ , Ker ( W ) ⊥ = ran ( W ) ⊂ ran ( S ) = Ker ( S ) ⊥ , (see Corollary 2.18 (iv) in [5]) one hasKer ( V ) = Ker ( R ) and ran ( V ) = Ker ( R ) ⊥ , Ker ( W ) = Ker ( S ) and ran ( W ) = Ker ( S ) ⊥ . In particular V ∈ L ( Ker ( R ) ⊥ ) and W ∈ L ( Ker ( S ) ⊥ ) . Let v ∈ L ( J [ R ] , J [ R ] ′ ) and w ∈ L ( J [ S ] , J [ S ] ′ ) be the operators associated to V and W by (30); since V = V ∗ and W = W ∗ , one has v ∗ = v and w ∗ = w . Next J [ R ] ⊗ J [ S ] ⊂ Form-Dom ( H ⊗ I + I ⊗ H ) ⊂ Form-Dom ( Σ ) where the ﬁrst inclusion comes from (31), and the second from (21) and (46). Byconstruction, the sequences ( ˆ a k ) k ≥ and ( ˆ b k ) k ≥ satisfy ⟨ Φ ∣ ˆ a k ⊗ I + I ⊗ ˆ b k − Σ ∣ Φ ⟩ ≥ ∈ J [ R ] ⊗ J [ S ] . For each φ ∈ J [ R ] , there exists a unique ̃ φ ∈ J [ R ] such that R / ̃ φ = φ , so that ⟨ φ ∣ ˆ a k ∣ φ ⟩ = ⟨̃ φ ∣ R / ˆ a k R / ∣̃ φ ⟩ → ⟨̃ φ ∣ V ∣̃ φ ⟩ = ⟨ φ ∣ v ∣ φ ⟩ as k → ∞ . Likewise ⟨ ψ ∣ ˆ b k ∣ ψ ⟩ → ⟨ ψ ∣ w ∣ ψ ⟩ as k → ∞ for each ψ ∈ J [ R ] . Passing to the limit inthe last inequality implies that ⟨ Φ ∣ v ⊗ I + I ⊗ w − Σ ∣ Φ ⟩ ≥ ∈ J [ R ] ⊗ J [ S ] . Let a = a ∗ ∈ L ( J [ R ] , J [ R ] ′ ) and b = b ∗ ∈ L ( J [ S ] , J [ S ] ′ ) be the operators associ-ated to 2 R / HR / − V ∈ L (( Ker ( R ) ⊥ )) and to 2 S / HS / − W ∈ L (( Ker ( S ) ⊥ )) respectively. The last inequality on v and w implies that ( a , b ) ∈ ˜ K ( R, S ) .4.3. Step 3: relaxing the constraint.

In this step we prove the following: foreach ( ¯ a, ¯ b ) ∈ ˜ K ( R, S ) and each F ∈ C ( R, S ) , one has(47) trace H ⊗ H ( F / CF / ) ≥ trace H ( R / ¯ aR / + S / ¯ bS / ) . Let ( e ′ j ) and ( f ′ l ) be orthonormal sequences of eigenvectors of R and S belongingto J [ R ] and J [ S ] , and assumed to be complete in Ker ( R ) ⊥ and Ker ( S ) ⊥ respec-tively. Call p m and q n the orthogonal projections on the m ﬁrst elements of ( e ′ j ) and on the n ﬁrst elements of ( f ′ l ) respectively, so that0 ≤ p ≤ . . . ≤ p m ≤ sup m p m = P = orthogonal projection on ker ( R ) ⊥ , ≤ q ≤ . . . ≤ q n ≤ sup n q n = Q = orthogonal projection on ker ( S ) ⊥ . We shall argue instead in terms of the operators¯ v = ¯ v ∗ ∈ L ( J [ R ] , J [ R ] ′ ) and ¯ w = ¯ w ∗ ∈ L ( J [ S ] , J [ S ] ′ ) associated by (30) to the operators R / HR / − R / ¯ aR / ∈ L ( Ker ( R ) ⊥ ) ,S / HS / − S / ¯ bS / ∈ L ( Ker ( S ) ⊥ ) . Since ( ¯ a, ¯ b ) ∈ K ( R, S ) , one has ⟨ Φ ∣ C − ¯ a ⊗ I − I ⊗ ¯ b ∣ Φ ⟩ ≥ ∈ J [ R ] ⊗ J [ S ] ,so that, for each m, n , one has p m ⊗ q n Σ N p m ⊗ q n ≤ p m ⊗ q n Σ p m ⊗ q n ≤ ( p m ¯ v p m ) ⊗ q n + p m ⊗ ( q n ¯ w q n ) ≤ ( p m ¯ v p m ) ⊗ Q + P ⊗ ( q n ¯ w q n ) , where Σ N ∶= ( I H ⊗ H + N Σ ) − Σ = Σ ∗ N ∈ L ( H ⊗ H ) and 0 ≤ Σ N ≤ Σ for each N ≥ ( p m ¯ v p m ) ⊗ q n ≤ ( p m ¯ v p m ) ⊗ Q is seen easily, for instance by the followingargument. Let Φ be any element of Ker ( R ) ⊥ ⊗ Ker ( S ) ⊥ , which we decompose onthe complete orthonormal system ( e ′ j ⊗ f ′ l ) :Φ = ∑ j,l Φ jl e ′ j ⊗ f ′ l , ∑ j,l ∣ Φ jl ∣ = ∥ Φ ∥ H ⊗ H < ∞ . Then ⟨ Φ ∣( p m ¯ v p m ⊗ q n ∣ Φ ⟩ = ∑ ≤ j,k ≤ m ≤ l ≤ n Φ jl Φ kl ⟨ ¯ ve ′ j ∣ e ′ k ⟩ V ′ , V ≤ ∑ l ∑ ≤ j,k ≤ m Φ jl Φ kl ⟨ ¯ ve ′ j ∣ e ′ k ⟩ V ′ , V = ⟨ Φ ∣ p n ¯ v p n ⊗ Q ∣ Φ ⟩ , since the matrix (⟨ ¯ ve ′ j ∣ e ′ k ⟩ V ′ , V ) ≤ j,k ≤ m is Hermitian nonnegative. 
The analogousinequality for ¯ w is proved similarly.Thus, for each F ∈ C ( R, S ) , one hastrace H ⊗ H ( F ( p m ⊗ q n ) Σ N ( p m ⊗ q n )) ≤ trace H ⊗ H ( F (( p m ¯ v p m ) ⊗ Q + P ⊗ ( q n ¯ w q n ))) . Lemma 4.1.

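Let $R,S\in\mathcal D^2(\mathfrak H)$ and let $P$ and $Q$ be the orthogonal projections on $\operatorname{Ker}(R)^\perp$ and $\operatorname{Ker}(S)^\perp$ respectively. For each $F\in\mathcal C(R,S)$, one has
$$F=(P\otimes Q)F(P\otimes Q).$$

Taking this lemma for granted, we conclude the proof of (47). First,
$$\begin{aligned}
\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F((p_m\bar vp_m)\otimes Q)\big)&=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F(P\otimes Q)((p_m\bar vp_m)\otimes I)(P\otimes Q)\big)\\
&=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big((P\otimes Q)F(P\otimes Q)((p_m\bar vp_m)\otimes I)\big)\\
&=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F((p_m\bar vp_m)\otimes I)\big)\\
&=\operatorname{trace}_{\mathfrak H}\big(R(p_m\bar vp_m)\big)\\
&=\operatorname{trace}_{\mathfrak H}\big(R^{1/2}(p_m\bar vp_m)R^{1/2}\big)\\
&=\operatorname{trace}_{\mathfrak H}\big(p_mR^{1/2}\bar vR^{1/2}p_m\big)\le\operatorname{trace}_{\mathfrak H}\big(R^{1/2}\bar vR^{1/2}\big),
\end{aligned}$$
where the first equality comes from the fact that $Pp_m=p_m=p_mP$, the second and the fifth equalities follow by cyclicity of the trace, the third equality from the lemma above, the fourth equality from the fact that $F\in\mathcal C(R,S)$, and the sixth equality from the fact that $p_m$ is a spectral projection of $R$ and therefore commutes with $R^{1/2}$. The last inequality is obtained by computing the trace of $R^{1/2}\bar vR^{1/2}\in\mathcal L^1(\mathfrak H)$ on a complete orthonormal system in $\mathfrak H$ whose first $m$ vectors span $\operatorname{ran}(p_m)$. By the same token,
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F(P\otimes(q_n\bar wq_n))\big)\le\operatorname{trace}_{\mathfrak H}\big(S^{1/2}\bar wS^{1/2}\big).$$
On the other hand,
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F(p_m\otimes q_n)\Sigma_N(p_m\otimes q_n)\big)\to\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F(P\otimes Q)\Sigma_N(P\otimes Q)\big)=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(F\Sigma_N)$$
passing to the limit in $m,n$ for each $N\ge1$. Indeed,
$$(p_m\otimes q_n)\Sigma_N(p_m\otimes q_n)\to(P\otimes Q)\Sigma_N(P\otimes Q)\quad\text{strongly in }\mathcal L(\mathfrak H\otimes\mathfrak H)$$
for each $N\ge1$, since for each $\Psi\in\mathfrak H\otimes\mathfrak H$
$$\begin{aligned}
\|(P\otimes Q)\Sigma_N(P\otimes Q)\Psi-(p_m\otimes q_n)\Sigma_N(p_m\otimes q_n)\Psi\|&\le\|(P\otimes Q-p_m\otimes q_n)\Sigma_N(P\otimes Q)\Psi\|+\|(p_m\otimes q_n)\Sigma_N(P\otimes Q-p_m\otimes q_n)\Psi\|\\
&\le\|(P\otimes Q-p_m\otimes q_n)\Sigma_N(P\otimes Q)\Psi\|+\|\Sigma_N\|\,\|(P\otimes Q-p_m\otimes q_n)\Psi\|\to0
\end{aligned}$$
as $m,n\to\infty$, for each $N\ge1$. Then, one concludes as in Example 3 of chapter 2 in [23].

Thus, we have proved that
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(F\Sigma_N)\le\operatorname{trace}_{\mathfrak H}\big(R^{1/2}\bar vR^{1/2}+S^{1/2}\bar wS^{1/2}\big)\quad\text{for each }N\ge1.$$
By Corollary C.2,
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(F\Sigma_N)\to\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(F^{1/2}\Sigma F^{1/2})\quad\text{as }N\to\infty,$$
so that
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(F^{1/2}\Sigma F^{1/2})\le\operatorname{trace}_{\mathfrak H}\big(R^{1/2}\bar vR^{1/2}+S^{1/2}\bar wS^{1/2}\big),$$
which is equivalent to the sought inequality (47).

4.4. Step 4: the squeezing argument.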

Pick an optimal coupling F opt ∈ C ( R, S ) .(We recall that the existence of such a coupling is one of the conclusions of The-orem 2.2, and follows from the Fenchel-Rockafellar duality theorem.) One has thefollowing chain of inequalities:(48) sup ( A,B ) ∈ K trace H ( RA + SB ) ≤ sup ( ¯ a, ¯ b ) ∈ ˜ K ( R,S ) trace H ( R / ¯ aR / + S / ¯ bS / ) ≤ trace H ⊗ ( F / opt CF / opt ) . The second inequality has been proved in Step 3.As for the ﬁrst inequality, observe ﬁrst that(49) sup ( A,B ) ∈ K trace H ( RA + SB ) = sup ( A,B ) ∈ ˆ K trace H ( RA + SB ) , with the notationˆ K ∶= {( A, B ) ∈ K s.t. A ≤ H − d ̵ hI and B ≤ H − d ̵ hI } . This is proved by the normalization argument in Step 1: pick ρ ∶= sup { α ∈ R s.t. 2 H − A ≥ αI } . Then ρ ∈ [ − ∥ A ∥ , ⟨ e ∣ H ∣ e ⟩ + ∥ A ∥] , where e is a normalized eigenvector of R suchthat Re / =

0, and one has A + ( ρ − d ̵ h ) I ≤ H − d ̵ hI and B − ( ρ − d ̵ h ) I ≤ H − d ̵ hI by the same argument as in Step 1. (Indeed, by deﬁnition of ρ , there exists asequence φ n ∈ Dom ( H ) such that ∥ φ n ∥ H = ⟨ φ n ∣ H − A ∣ φ n ⟩ → ρ as n → ∞ .With the inequality A ⊗ I + I ⊗ B ≤ C , this implies that, for each ψ ∈ Dom ( H ) , onehas ρ ∥ ψ ∥ H + ⟨ ψ ∣ H − B ∣ ψ ⟩ ≥ ⟨ φ n ⊗ ψ ∣ ( H ⊗ I + I ⊗ H ) − C ∣ φ n ⊗ ψ ⟩ ≥ d ̵ h ∥ ψ ∥ H , since 2 ( H ⊗ I + I ⊗ H ) − C ≥ d ̵ hI ⊗ I .) Observing that ( A, B ) ∈ K Ô⇒ ( A + ( ρ − d ̵ h ) I, B − ( ρ − d ̵ h ) I ) ∈ ˆ K , and that trace H ( RA + SB ) = trace H ( R ( A + ( ρ − d ̵ h ) I ) + S ( B − ( ρ − d ̵ h ) I )) leads to (49).Let P and Q be the H -orthogonal projections on Ker ( R ) ⊥ and Ker ( S ) ⊥ respec-tively, as in the previous section. We claim that ( A, B ) ∈ ˆ K Ô⇒ ( P A P , Q B Q ) ∈ ˜ K ( R, S ) . Indeed P A P = ( P A P ) ∗ ∈ L ( Ker ( T ) ⊥ ) ⊂ L ( J [ R ] , J [ R ] ′ ) Q B Q = ( Q B Q ) ∗ ∈ L ( Ker ( S ) ⊥ ) ⊂ L ( J [ S ] , J [ S ] ′ ) . because of the double continuous embedding (27). Then2 H ≥ A Ô⇒ R / HR / ≥ R / AR / = R / P A P R / H ≥ B Ô⇒ S / HS / ≥ S / BS / = S / Q B Q S / since P R / = R / P = R / and Q S / = S / Q = S / , andtrace H ( R / HR / − R / P A P R / ) ≤ trace H ( R / HR / ) + ∥ A ∥ < +∞ , trace H ( S / HS / − S / Q B Q S / ) ≤ trace H ( S / HS / ) + ∥ B ∥ < +∞ , so that 2 R / HR / − R / P A P R / ∈ L ( Ker ( R ) ⊥ ) , S / HS / − S / Q B Q S / ∈ L ( Ker ( S ) ⊥ ) . Finally, the inequality ⟨ Φ ∣ A ⊗ I + I ⊗ B ∣ Φ ⟩ ≤ ⟨ Φ ∣ C ∣ Φ ⟩ holds for all Φ ∈ J [ R ] ⊗ J [ S ] ⊂ Form-Dom ( C ) . Observe thatΦ ∈ J [ R ] ⊗ J [ S ] Ô⇒ (( I − P ) ⊗ I ) Φ = ( I ⊗ ( I − Q )) Φ = , and therefore ⟨ Φ ∣( P A P ) ⊗ I + I ⊗ ( Q B Q )∣ Φ ⟩ ≤ ⟨ Φ ∣ C ∣ Φ ⟩ for all Φ ∈ J [ R ] ⊗ J [ S ] , so that ( P A P , Q B Q ) ∈ ̃ K ( R, S ) . UANTUM OPTIMAL TRANSPORT 25

Since P R / = R / P = R / and Q S / = S / Q = S / , one hastrace H ( RA + SB ) = trace H ( R / P A P R / + S / Q B Q S / ) , we conclude that(50) sup ( A,B ) ∈ ˆ K trace ( RA + SB ) ≤ sup ( ¯ a, ¯ b ) ∈ ˆ K ( R,S ) trace ( R / ¯ aR / + S / BS / ) . Then (49) and (50) imply the chain of inequalities (48). By the quantum dualitytheorem (Theorem 2.2), all the inequalities in (48) are equalities:(51) sup ( A,B ) ∈ K trace H ( RA + SB ) = sup ( ¯ a, ¯ b ) ∈ ˜ K ( R,S ) trace H ( R / ¯ a + S / ¯ bS / ) = trace H ⊗ H ( F / opt CF / opt ) . Step 5: the pair ( a , b ) ∈ ̃ K ( R, S ) is optimal. For each ﬁnite rank orthogonalprojection P = P ∗ = P ∈ L ( H ) , one hastrace H ( P R / v R / P ) = trace H ( P R / v R / ) = lim k → ∞ trace H ( P R / ˆ a k R / ) = lim k → ∞ trace H ( P R / ˆ a k R / P ) , trace H ( P S / w S / P ) = trace H ( P S / w S / ) = lim k → ∞ trace H ( P R / ˆ a k R / ) = lim k → ∞ trace H ( P R / ˆ a k R / P ) , since R / ˆ a k R / → V = R / v R / and S / ˆ b k S / → W = S / w S / in L ( H ) weak −∗ by construction.Since ˆ a k = ˆ a ∗ k ≥ b k = ˆ b ∗ k ≥ k ≥ H ( P R / ˆ a k R / P ) = ∥ P R / ˆ a k R / P ∥ ≤ ∥ R / ˆ a k R / ∥ = trace H ( R / ˆ a k R / ) , trace H ( P S / ˆ b k S / P ) = ∥ P S / ˆ b k S / P ∥ ≤ ∥ S / ˆ b k S / ∥ = trace H ( S / ˆ b k S / ) . Thus, for each ﬁnite rank P = P ∗ = P ∈ L ( H ) , one hastrace H ( P ( R / v R / + S / w S / ) P ) ≤ lim k → ∞ trace H ( R / ˆ a k R / + S / ˆ b k S / ) = trace H ( R / HR / + S / HS / ) − τ . Indeed ˆ a k = H − A k − α k I + d ̵ hI and ˆ b k = H − B k + α k I − d ̵ hI , so that trace H ( R / ˆ a k R / + S / ˆ b k S / ) = H ( R / HR / + S / HS / ) − trace H ( R / A k R / + S / B k S / ) → H ( R / HR / + S / HS / ) − τ by deﬁnition of the sequence ( A k , B k ) (which is a maximizing sequence for the righthand side of (24)). 
Since R / v R / ∈ L ( H ) and S / w S / ∈ L ( H ) , one hastrace H ( R / v R / + S / w S / ) = sup P = P = P ∗ rank ( P )<∞ trace H ( P ( R / v R / + S / w S / ) P ) ≤ H ( R / HR / + S / HS / ) − τ . Equivalently, in terms of a and b , one hastrace H ( R / a R / + S / b S / ) ≥ τ and we deduce from the ﬁrst equality in (51) thattrace H ( R / a R / + S / b S / ) ≥ sup ( ¯ a, ¯ b ) ∈ ˜ K ( R,S ) trace H ( R / ¯ aR / + S / ¯ bS / ) . Since ( a , b ) ∈ ˜ K ( R, S ) as proved at the end of Step 2, the inequality above is anequality and the pair ( a , b ) is optimal.Finally, if R and S have ﬁnite ranks J [ R ] = J [ R ] = Ker ( R ) ⊥ and J [ S ] = J [ S ] = Ker ( S ) ⊥ . Since these spaces are ﬁnite-dimensional, their dual spaces areﬁnite dimensional with the same dimension. Thus the inclusions Ker ( R ) ⊥ ⊂ J [ R ] ′ and Ker ( S ) ⊥ ⊂ J [ S ] ′ in (29) are equalities. Any optimal pair ( a , b ) ∈ ˜ K ( R, S ) such that trace H ( R / a R / + S / b S / ) = M K ̵ h ( R, S ) satisﬁes a ∈ L ( Ker ( R ) ⊥ ) and b ∈ L ( Ker ( S ) ⊥ ) .This concludes the proof of Theorem 2.4.It remains to prove Lemma 4.1 Proof of Lemma 4.1.

One has
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(((I-P)\otimes I)F((I-P)\otimes I)\big)=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(((I-P)\otimes I)F\big)=\operatorname{trace}_{\mathfrak H}((I-P)R)=0,$$
since $I-P$ is the orthogonal projection on $\operatorname{Ker}(R)$, so that
$$((I-P)\otimes I)F((I-P)\otimes I)=0,$$
since $((I-P)\otimes I)F((I-P)\otimes I)=\big(((I-P)\otimes I)F((I-P)\otimes I)\big)^*\ge0$. Next observe that
$$|\langle\phi\otimes\psi|(P\otimes I)F((I-P)\otimes I)|\phi'\otimes\psi'\rangle|^2\le\langle\phi\otimes\psi|(P\otimes I)F(P\otimes I)|\phi\otimes\psi\rangle\,\langle\phi'\otimes\psi'|((I-P)\otimes I)F((I-P)\otimes I)|\phi'\otimes\psi'\rangle$$
for each $\phi,\phi',\psi,\psi'\in\mathfrak H$ by the Cauchy-Schwarz inequality, since $F=F^*\ge0$, so that
$$\langle\phi\otimes\psi|(P\otimes I)F((I-P)\otimes I)|\phi'\otimes\psi'\rangle=0.$$
Hence
$$(P\otimes I)F((I-P)\otimes I)=0,\qquad\text{and, taking adjoints,}\qquad((I-P)\otimes I)F(P\otimes I)=\big((P\otimes I)F((I-P)\otimes I)\big)^*=0,$$
so that
$$F=(P\otimes I)F(P\otimes I).$$
The same argument shows that $F=(I\otimes Q)F(I\otimes Q)$, so that
$$F=(I\otimes Q)F(I\otimes Q)=(I\otimes Q)(P\otimes I)F(P\otimes I)(I\otimes Q)=(P\otimes Q)F(P\otimes Q),$$
which is precisely the desired equality. □

Proof of Theorems 2.5

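Proof of Theorem 2.5.

Since $\Phi_j\in\operatorname{Ker}(C-A\otimes I-I\otimes B)$, one has in particular $\Phi_j\in\operatorname{Dom}(C)$ with
$$\|C\Phi_j\|\le(\|A\|+\|B\|)\|\Phi_j\|=\|A\|+\|B\|\quad\text{for all }j.$$
Therefore
$$\sum_m\lambda_m\langle\Phi_m|C|\Phi_m\rangle\le(\|A\|+\|B\|)\sum_m\lambda_m=\|A\|+\|B\|,$$
so that
$$F^{1/2}CF^{1/2}:=\sum_{j,k}\sqrt{\lambda_j\lambda_k}\,\langle\Phi_j|C|\Phi_k\rangle\,|\Phi_j\rangle\langle\Phi_k|\in\mathcal L^1(\mathfrak H\otimes\mathfrak H)$$
by Lemma C.1. Since $\Phi_k\in\operatorname{Ker}(C-A\otimes I-I\otimes B)$ for all $k$, one has
$$F^{1/2}CF^{1/2}=\sum_{j,k}\sqrt{\lambda_j\lambda_k}\,\langle\Phi_j|A\otimes I+I\otimes B|\Phi_k\rangle\,|\Phi_j\rangle\langle\Phi_k|=F^{1/2}(A\otimes I+I\otimes B)F^{1/2},$$
and thus
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(F^{1/2}CF^{1/2})=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F^{1/2}(A\otimes I+I\otimes B)F^{1/2}\big)=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F(A\otimes I+I\otimes B)\big)=\operatorname{trace}_{\mathfrak H}(F_1A+F_2B),$$
where the second equality follows from cyclicity of the trace, while the third equality comes from the definition of $F_1$ and $F_2$ as the partial traces of $F$. Therefore
$$(52)\qquad\inf_{G\in\mathcal C(F_1,F_2)}\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(G^{1/2}CG^{1/2})\le\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(F^{1/2}CF^{1/2})=\operatorname{trace}_{\mathfrak H}(F_1A+F_2B)\le\sup_{(a,b)\in\mathfrak K}\operatorname{trace}_{\mathfrak H}(F_1a+F_2b).$$
For each $G\in\mathcal C(F_1,F_2)$, let $(\Psi_k)_{k\ge1}$ be a complete orthonormal system of eigenvectors of $G$, and let $(\gamma_k)_{k\ge1}$ be the sequence of eigenvalues of $G$, so that $G\Psi_k=\gamma_k\Psi_k$ for each $k\ge1$. Then
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(G^{1/2}CG^{1/2})<\infty\iff\sum_{k\ge1}\gamma_k\langle\Psi_k|C|\Psi_k\rangle<\infty.$$
Thus, if $\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(G^{1/2}CG^{1/2})<\infty$, one has, as explained in Lemma 2.1,
$$\Psi_k\in\text{Form-Dom}(C)\quad\text{for each }k\ge1\text{ such that }\gamma_k>0.$$
For all $(a,b)\in\mathfrak K$, one has therefore
$$\gamma_k>0\implies\langle\Psi_k|C-a\otimes I-I\otimes b|\Psi_k\rangle\ge0,$$
so that
$$\begin{aligned}
0\le\sum_{k\ge1}\gamma_k\langle\Psi_k|C-a\otimes I-I\otimes b|\Psi_k\rangle&=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(G^{1/2}CG^{1/2})-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(G^{1/2}(a\otimes I+I\otimes b)G^{1/2}\big)\\
&=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(G^{1/2}CG^{1/2})-\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(G(a\otimes I+I\otimes b)\big)\\
&=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(G^{1/2}CG^{1/2})-\operatorname{trace}_{\mathfrak H}(F_1a+F_2b).
\end{aligned}$$
Therefore
$$(53)\qquad\sup_{(a,b)\in\mathfrak K}\operatorname{trace}_{\mathfrak H}(F_1a+F_2b)\le\inf_{G\in\mathcal C(F_1,F_2)}\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(G^{1/2}CG^{1/2}).$$
Putting together (52) and (53) leads to the announced result.

Conversely, if $R,S\in\mathcal D^2(\mathfrak H)$ satisfy (35), let $F$ be any optimal coupling of $R$ and $S$. Then
$$MK_\hbar(R,S)^2=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(F^{1/2}CF^{1/2})=\operatorname{trace}_{\mathfrak H}(RA+SB).$$
Since $R,S\in\mathcal D^2(\mathfrak H)$, the quantity $MK_\hbar(R,S)^2=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}(F^{1/2}CF^{1/2})$ is finite, so that all the eigenvectors of $F$ corresponding to positive eigenvalues belong to $\text{Form-Dom}(C)$. The second equality above can be equivalently recast as
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F^{1/2}(C-A\otimes I-I\otimes B)F^{1/2}\big)=0.$$
Since $\langle\Phi|C-A\otimes I-I\otimes B|\Phi\rangle\ge0$ for all $\Phi\in\text{Form-Dom}(C)$, this implies that all the eigenvectors of $F$ corresponding to positive eigenvalues belong to $\operatorname{Ker}(C-A\otimes I-I\otimes B)$. In particular, this nullspace is not equal to $\{0\}$ and $F$ is of the form (34). □

Proof of Theorem 2.6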

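Proof of (1) when $(A,B)\in\mathfrak K$. Since $F\in\mathcal C(R,S)$ with $R,S\in\mathcal D^2(\mathfrak H)$, any eigenvector $\Phi$ of $F$ such that $F\Phi\neq0$ satisfies $\Phi\in\text{Form-Dom}(H\otimes I+I\otimes H)\subset\text{Form-Dom}(C)$ by Lemma 2.1. By cyclicity of the trace,
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F^{1/2}(A\otimes I+I\otimes B)F^{1/2}\big)=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F(A\otimes I+I\otimes B)\big)=\operatorname{trace}_{\mathfrak H}(RA+SB).$$
Since $F^{1/2}(H\otimes I+I\otimes H)F^{1/2}\in\mathcal L^1(\mathfrak H\otimes\mathfrak H)$ and since $C\le2(H\otimes I+I\otimes H)$ on $\text{Form-Dom}(H\otimes I+I\otimes H)$, the optimality of $F$ and of $(A,B)$, together with the duality theorem (Theorem 2.2), gives
$$\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F^{1/2}(C-A\otimes I-I\otimes B)F^{1/2}\big)=0.$$
Let $(\Phi_j)_{j\ge1}$ be a complete orthonormal sequence of eigenvectors of $F$, and define $\lambda_j\ge0$ by $F\Phi_j=\lambda_j\Phi_j$ for each $j\ge1$. Then
$$0=\operatorname{trace}_{\mathfrak H\otimes\mathfrak H}\big(F^{1/2}(C-A\otimes I-I\otimes B)F^{1/2}\big)=\sum_{j\ge1}\langle\Phi_j|F^{1/2}(C-A\otimes I-I\otimes B)F^{1/2}|\Phi_j\rangle=\sum_{j\ge1}\lambda_j\langle\Phi_j|C-A\otimes I-I\otimes B|\Phi_j\rangle,$$
so that
$$\lambda_j>0\implies\langle\Phi_j|C-A\otimes I-I\otimes B|\Phi_j\rangle=0\quad\text{for all }j\ge1.$$
Indeed, since $\Phi_j\in\text{Form-Dom}(C)$ and $(A,B)\in\mathfrak K$, one has
$$\langle\Phi_j|C-A\otimes I-I\otimes B|\Phi_j\rangle\ge0\quad\text{for all }j\ge1.$$
Since $\langle\Phi|C-A\otimes I-I\otimes B|\Phi\rangle\ge0$ for all $\Phi\in\text{Form-Dom}(C)$, we conclude from the Cauchy-Schwarz inequality that
$$\lambda_j>0\implies\langle\Phi|C-A\otimes I-I\otimes B|\Phi_j\rangle=0\quad\text{for all }j\ge1\text{ and all }\Phi\in\text{Form-Dom}(C).$$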
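The Cauchy-Schwarz step used here is the general one for nonnegative sesquilinear forms. A minimal sketch, where the symbol $\mathsf q$ is introduced only for this illustration and stands for the form of $C-A\otimes I-I\otimes B$ on $\text{Form-Dom}(C)$:

```latex
% Assumption: \mathsf q is a nonnegative Hermitian sesquilinear form, e.g.
%   \mathsf q(\Phi,\Psi) := \langle \Phi | C - A\otimes I - I\otimes B | \Psi \rangle
% for \Phi, \Psi \in \text{Form-Dom}(C).
\begin{align*}
\mathsf q(\Phi,\Phi) \ge 0 \ \text{ for all } \Phi
  &\ \Longrightarrow\
  |\mathsf q(\Phi,\Psi)|^2 \le \mathsf q(\Phi,\Phi)\, \mathsf q(\Psi,\Psi), \\
\mathsf q(\Phi_j,\Phi_j) = 0
  &\ \Longrightarrow\
  \mathsf q(\Phi,\Phi_j) = 0 \ \text{ for all } \Phi \in \text{Form-Dom}(C).
\end{align*}
```

In other words, a vector on which a nonnegative form vanishes is automatically orthogonal, with respect to that form, to every other vector in its domain.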

In particular, choosing Φ = Ψ m ,...,m d ,n ,...,n d (in the notation of section A) showsthat2 ̵ h ( ( n , . . . , n d ) + d )⟨ Ψ m ,...,m d ,n ,...,n d ∣ Φ j ⟩ = ⟨ Ψ m ,...,m d ,n ,...,n d ∣ A ⊗ I + I ⊗ B ∣ Φ j ⟩ , so that4 ̵ h ∑ m ,...,md ≥ n ,...,nd ≥ ( ( n , . . . , n d ) + d ) ∣⟨ Ψ m ,...,m d ,n ,...,n d ∣ Φ j ⟩∣ ≤ (∥ A ∥ + ∥ B ∥) . This implies that Φ j ∈ Dom ( C ) with ∥ C Φ j ∥ ≤ ∥ A ∥ + ∥ B ∥ for each j ≥

1. Hence λ j > Ô⇒ ( C − A ⊗ I − I ⊗ B ) Φ j ∈ H × H and ( C − A ⊗ I − I ⊗ B ) Φ j ⊥ Form-Dom ( C ) . Since Form-Dom ( C ) is dense in H ⊗ H , we conclude that ( C − A ⊗ I − I ⊗ B ) Φ j = j ≥ λ j >

0. This proves (a).For each j = , . . . , d , one has ( D q j ⊗ I )( C − A ⊗ I − I ⊗ B ) = D q j ( H − A ) ⊗ I − I ⊗ q j = ( D q j A ⊗ I − I ⊗ q j ) ∈ L ( Form-Dom ( H ⊗ I + I ⊗ H ) , Form-Dom ( H ⊗ I + I ⊗ H ) ′ ) , ( D p j ⊗ I )( C − A ⊗ I − I ⊗ B ) = D p j ( H − A ) ⊗ I − I ⊗ p j = ( D p j A ⊗ I − I ⊗ p j ) ∈ L ( Form-Dom ( H ⊗ I + I ⊗ H ) , Form-Dom ( H ⊗ I + I ⊗ H ) ′ ) . Applying (37) with T = q j ⊗ I, p j ⊗ I, I ⊗ q j , I ⊗ p j shows that all these operators,which are bounded from Form-Dom ( H ⊗ I + I ⊗ H ) into H ⊗ H , can be extended asbounded operators from H ⊗ H to Form-Dom ( H ⊗ I + I ⊗ H ) ′ . Since A ∈ L ( H ) , thisshows that the right hand sides of these identities belong to L ( Form-Dom ( H ⊗ I + I ⊗ H ) , Form-Dom ( H ⊗ I + I ⊗ H ) ′ . Next, (38) with T = q j ⊗ I or T = p j ⊗ I show that these identities hold in the space L ( Form-Dom ( H ⊗ I + I ⊗ H ) ∩ Dom ( C ) , Form-Dom ( H ⊗ I + I ⊗ H ) ′ + Dom ( C ) ′ ) . Likewise, for j = , . . . , d , one has ( I ⊗ D q j )( C − A ⊗ I − I ⊗ B ) = ( I ⊗ D q j B − q j ⊗ I ) ∈ L ( Form-Dom ( H ⊗ I + I ⊗ H ) , Form-Dom ( H ⊗ I + I ⊗ H ) ′ ) , ( I ⊗ D p j )( C − A ⊗ I − I ⊗ B ) = ( I ⊗ D p j B − p j ⊗ I ) ∈ L ( Form-Dom ( H ⊗ I + I ⊗ H ) , Form-Dom ( H ⊗ I + I ⊗ H ) ′ ) . Let Φ , Ψ ∈ Ker ( F ) ⊥ be eigenvectors of F . According to (a), one hasΦ , Ψ ∈ Dom ( C ) ∩ Form-Dom ( H ⊗ I + I ⊗ H ) , and(54) 2 ⟨( D q j A ⊗ I − I ⊗ q j ) Φ , Ψ ⟩ Form-Dom ( H ⊗ I + I ⊗ H ) ′ , Form-Dom ( H ⊗ I + I ⊗ H ) = i ̵ h (( p j ⊗ I ) Φ ∣( C − A ⊗ I − I ⊗ B ) Ψ ) H ⊗ H − i ̵ h (( C − A ⊗ I − I ⊗ B ) Φ ∣( p j ⊗ I ) Ψ ) H ⊗ H = j = , . . . , d , since Φ , Ψ ∈ Ker ( C − A ⊗ I − I ⊗ B ) by (a). Likewise(55) 2 ⟨( D p j A ⊗ I − I ⊗ p j ) Φ , Ψ ⟩ Form-Dom ( H ⊗ I + I ⊗ H ) ′ , Form-Dom ( H ⊗ I + I ⊗ H ) = − i ̵ h (( p j ⊗ I ) Φ ∣( C − A ⊗ I − I ⊗ B ) Ψ ) H ⊗ H + i ̵ h (( C − A ⊗ I − I ⊗ B ) Φ ∣( p j ⊗ I ) Ψ ) H ⊗ H = for all j = , . . . 
, d .Let ( Φ k ) k ≥ be a complete orthonormal system of eigenvectors of F in H ⊗ H ,and let λ k ≥ F Φ k = λ k Φ k . Then F / = ∑ k ≥ λ k ∣ Φ k ⟩⟨ Φ k ∣ , and, for each T ∈ L ( Form-Dom ( H ⊗ I + I ⊗ H ) , Form-Dom ( H ⊗ I + I ⊗ H ) ′ ) andeach φ, ψ ∈ H ⊗ H , one has ∑ k,l ≥ λk,λl > √ λ k λ l ( Φ l ∣ ψ ) H ⊗ ( Φ k ∣ φ ) H ⊗ ⟨ T Φ k , Φ l ⟩ Form-Dom ( H ⊗ I + I ⊗ H ) ′ , Form-Dom ( H ⊗ I + I ⊗ H ) = ⟨ φ ∣ F / T F / ∣ ψ ⟩ . Observe that this last series is absolutely convergent since ∑ k,l ≥ λk,λl > λ / k λ / l ∣⟨ T Φ k , Φ l ⟩ Form-Dom ( H ⊗ I + I ⊗ H ) ′ , Form-Dom ( H ⊗ I + I ⊗ H ) × ( Φ k ∣ φ ) H ⊗ H ( Φ l ∣ ψ ) H ⊗ H ∣ ≤ ∥ T ∥ L( Form-Dom ( H ⊗ I + I ⊗ H ) , Form-Dom ( H ⊗ I + I ⊗ H ) ′ ) × ∑ k ≥ λk > λ / k ∣( Φ k ∣ φ ) H ⊗ H ∣ ∑ l ≥ λl > λ / l ∣( Φ l ∣ ψ ) H ⊗ H ∣ ≤ ∥ T ∥ L( Form-Dom ( H ⊗ I + I ⊗ H ) , Form-Dom ( H ⊗ I + I ⊗ H ) ′ ) ∑ n ≥ λ n × ( ∑ k ≥ ∣( Φ k ∣ φ ) H ⊗ H ∣ ) / ( ∑ l ≥ ∣( Φ k ∣ ψ ) H ⊗ H ∣ ) / ≤ ∥ T ∥ L( Form-Dom ( H ⊗ I + I ⊗ H ) , Form-Dom ( H ⊗ I + I ⊗ H ) ′ ) ∥ φ ∥ H ⊗ H ∥ ψ ∥ H ⊗ H . Hence(56) ∑ k,l ≥ λk,λl > λ / k λ / l ⟨ T Φ k , Φ l ⟩ Form-Dom ( H ⊗ I + I ⊗ H ) ′ , Form-Dom ( H ⊗ I + I ⊗ H ) ∣ Φ k ⟩⟨ Φ l ∣ = F / T F / ∈ L ( H ⊗ H ) by the Riesz representation theorem. Setting successively T = D q j A ⊗ I − I ⊗ q j and T = D p j A ⊗ I − I ⊗ p j ,T = I ⊗ D q j B − q j ⊗ I and T = I ⊗ D p j B − p j ⊗ I , for all j = , . . . , d in (56) and using (54) and (55) implies statement (b). (cid:3) Proof of ( ) . The proof of the statement ( ) follows closely the line of the proof of the case ( ) , simpliﬁed by the ﬁnite dimensionality.The densities R, S being of ﬁnite rank, J ( R ) = J ( R ) = Ker ( R ) ⊥ and the samefor S . Therefore, by Theorem 2.4 and Deﬁnition 2.3, we have that A ⊗ I ker ( S ) ⊥ + I ker ( R ) ⊥ ⊗ B ≤ P ⊗ Q C P ⊗ Q ∶= C ′ . UANTUM OPTIMAL TRANSPORT 31

Moreover by the optimality condition and Lemma 4.1,trace

Ker ( R ) ⊥ ⊗ Ker ( S ) ⊥ ( F / ( C ′ − A ⊗ I ker ( S ) ⊥ + I ker ( R ) ⊥ ⊗ B ) F / ) = trace H ⊗ H ( F / ( C − A ⊗ I − I ⊗ B ) F / ) = ( R ) ⊥ ⊗ Ker ( S ) ⊥ , ( c ) is proved.Let us remark that, with the notation deﬁned right after (11), C ′ = P H P ⊗ I + I ⊗ Q H Q − P Z P ⊗ Q Z Q = P H P ⊗ I + I ⊗ Q H Q − d ∑ k = ( Q Rk ⊗ Q Sk + P Rk ⊗ P Sk ) . Hence, for example, for any j = , . . . , d , D q j ⊗ I ( C ′ − A ⊗ I − I ⊗ B ) = d ∑ k = ( D q j Q Rk ⊗ Q Rk + D q j P Rk ⊗ P Rk ) − D q j A ′ ⊗ I so that, by the same argument as before, F / ( d ∑ k = ( D q j Q Rk ⊗ Q Rk + D q j P Rk ⊗ P Rk ) − D q j A ′ ⊗ I ) F / = . By using the fact that F / commutes with P ⊗ Q thanks to Lemma 4.1 and doingthe same argument for D p j instead of D q j we get immediately ( d ) .Note that F / ( D q j A ′ ⊗ I ) F / = F / ( i ̵ h [ p Rj , A ′ ] ⊗ I ) F / so that one can replace D q j A ′ by i ̵ h [ p Rj , A ′ ] in statement ( d ) . (cid:3) Examples

In this section, we shall study the optimal operators $a$ and $b$ from the Kantorovich duality theorem, together with the structure of optimal couplings, on a few elementary examples. We will also give a necessary and sufficient condition for the optimal coupling of two quantum densities of semiclassical (Töplitz) type to present the same feature.

7.1. The case where R is a rank-one projection. Let $R = |\phi\rangle\langle\phi|$ with $\|\phi\|_H = 1$, and let $S$ be a finite-rank density operator on the Hilbert space $H$. By Theorem 2.4 in the finite rank case, the optimal operators $a$ and $b$ should be sought in the form

(57) $a := \alpha\,|\phi\rangle\langle\phi|, \qquad b := \sum_{k=1}^n \beta_k\,|e_k\rangle\langle e_k|,$

where $(e_k)_{1\le k\le n}$ is an orthonormal basis of $\mathrm{Ker}(S)^\perp$, to be determined along with the real numbers $\alpha, \beta_1, \ldots, \beta_n$. We shall see that

(a) the basis $(e_j)_{1\le j\le n}$ is orthonormal in $\mathrm{Ker}(S)^\perp$ and orthogonal for the Hermitian form $(\psi, \psi') \mapsto \langle\phi\otimes\psi|C|\phi\otimes\psi'\rangle$ on $\mathrm{Ker}(S)^\perp$; in other words, the lines $\mathbf{C}e_j$ for $j = 1, \ldots, n$ are mutually orthogonal principal axes of this Hermitian form in $\mathrm{Ker}(S)^\perp$, while

(b) the real numbers $\alpha + \beta_j$ for $j = 1, \ldots, n$ are the eigenvalues of the Hermitian (diagonal) matrix with entries $\langle\phi\otimes e_j|C|\phi\otimes e_k\rangle$ for $j, k = 1, \ldots, n$.

These conditions do not completely determine the orthonormal basis $(e_j)_{1\le j\le n}$ and the real numbers $\alpha, \beta_1, \ldots, \beta_n$. For instance, if $(a, b)$ of the form (57) is optimal, then $(a + t|\phi\rangle\langle\phi|,\ b - tI_{\mathrm{Ker}(S)^\perp})$ is also optimal; this corresponds to changing $\alpha$ into $\alpha + t$ and $\beta_j$ into $\beta_j - t$ for $j = 1, \ldots, n$. Likewise, if $\langle\phi\otimes e_j|C|\phi\otimes e_j\rangle = \langle\phi\otimes e_k|C|\phi\otimes e_k\rangle$ for some $j \ne k$, the frame $(e_j, e_k)$ can be replaced with its image under any rotation in the plane $\mathrm{span}\{e_j, e_k\}$.

To prove (a)-(b), we begin with an important observation on the set of couplings of $R$ and $S$, which is a straightforward consequence of Lemma 4.1.

Lemma 7.1.

Assume that $R \in D(H)$ is a projection. Then $\mathrm{rank}(R) = 1$ and for each $S \in D(H)$, one has $C(R, S) = \{R\otimes S\}$.

This is the quantum analogue of the case where one considers two Borel probability measures $\mu$ and $\nu$, one of which, say $\mu$, is a Dirac measure. In that case, it is obvious that $\Pi(\mu, \nu) = \{\mu\otimes\nu\}$ (all the mass from $\nu$ is transported to the support of the Dirac measure). Indeed, pure states, corresponding to density operators of the form $R = |\phi\rangle\langle\phi|$ where $\phi$ is a normalized element of $H$, are the quantum analogues of phase space points in classical mechanics.

Taking this lemma for granted, $R\otimes S$ is the optimal coupling (in fact the only coupling) of $R$ and $S$. Therefore the optimal operators $a$ and $b$ satisfy

$\mathrm{trace}_{H\otimes H}\big((R\otimes S)(C - a\otimes I - I\otimes b)(R\otimes S)\big) = 0,$
$\langle\Psi|C - a\otimes I - I\otimes b|\Psi\rangle \ge 0$ for all $\Psi \in \mathrm{Ker}(R)^\perp\otimes\mathrm{Ker}(S)^\perp$.

Hence

$\langle\phi\otimes\psi|C - a\otimes I - I\otimes b|\phi\otimes\psi'\rangle = 0$ for all $\psi, \psi' \in \mathrm{Ker}(S)^\perp$.

This condition can be checked on any basis of $\mathrm{Ker}(S)^\perp$. For instance, using the orthonormal basis $(e_j)_{1\le j\le n}$ of eigenvectors of $b$, and observing that $\langle\phi|a|\phi\rangle = \alpha$ while $\langle e_j|b|e_k\rangle = \beta_j\delta_{jk}$ by (57), leads to the identity

$\langle\phi\otimes e_j|C|\phi\otimes e_k\rangle = (\alpha + \beta_j)\,\delta_{jk}$ for all $j, k = 1, \ldots, n$.

This obviously implies the conclusions (a) and (b) on the real numbers $\alpha, \beta_1, \ldots, \beta_n$ and the orthonormal basis $(e_j)_{1\le j\le n}$ of $\mathrm{Ker}(S)^\perp$.

Proof of Lemma 7.1.

We recall that, if $R$ is an orthogonal projection, one has $\mathrm{rank}(R) = \mathrm{trace}(R)$. On the other hand $\mathrm{trace}(R) = 1$ since $R \in D(H)$. Denoting by $Q$ the orthogonal projection on $\mathrm{Ker}(S)^\perp$, Lemma 4.1 implies that

$(R\otimes I)F(R\otimes I) = (R\otimes I)(R\otimes Q)F(R\otimes Q)(R\otimes I) = (R^2\otimes Q)F(R^2\otimes Q) = (R\otimes Q)F(R\otimes Q) = F.$

Thus, for each $\phi_1, \phi_2, \psi_1, \psi_2 \in H$, one has

$\langle\phi_1\otimes\psi_1|F|\phi_2\otimes\psi_2\rangle = \langle\phi_1\otimes\psi_1|(R\otimes I)F(R\otimes I)|\phi_2\otimes\psi_2\rangle = \langle\phi_1|e\rangle\langle e|\phi_2\rangle\langle e\otimes\psi_1|F|e\otimes\psi_2\rangle = \langle\phi_1|R|\phi_2\rangle\langle\psi_1|G|\psi_2\rangle,$

where $\|e\|_H = 1$ and $\mathbf{C}e = \mathrm{ran}(R)$, while $G$ is the self-adjoint operator on $H$ such that

$\langle\psi_1|G|\psi_2\rangle = \langle e\otimes\psi_1|F|e\otimes\psi_2\rangle, \qquad \psi_1, \psi_2 \in H.$

(The existence and uniqueness of $G$ follow from the Riesz representation theorem.)

Hence $F = R\otimes G$, and since $F \in C(R, S)$,

$\mathrm{trace}((R\otimes G)(I\otimes B)) = \mathrm{trace}(GB) = \mathrm{trace}(SB)$

for each finite rank $B \in L(H)$, which implies that $G = S$. □

7.2. The quantum bipartite matching problem.

A classical bipartite matching problem consists in computing the optimal transport between two probability measures $\mu$ and $\nu$ given by

$\mu = \frac{1-\eta}{2}\,\delta_{-a} + \frac{1+\eta}{2}\,\delta_{a}, \qquad \nu = \frac12\,\delta_{-b} + \frac12\,\delta_{b}, \qquad a, b > 0,$

associated to $\mathrm{dist}_{MK,2}(\mu, \nu)$. A quantum analogue consists in considering

M K ̵ h ( R, S ) where R = − η ∣ − a ⟩⟨ a ∣ + − η ∣ − a ⟩⟨ − a ∣ , and S = ∣ b, ⟩⟨ b ∣ + ∣ − b ⟩⟨ − b ∣ . Here ∣ q ⟩ = ∣ q, ⟩ where ∣ q, p ⟩ , q, p ∈ R , is a coherent state deﬁned by (62). Since a, b > R, S are operators of rank 2 so that Theorem 2.6 (2) applies. Since we arein dimension d =

1, the two ﬁrst equalities of the result read F / ( i ̵ h [ P R , Q R ] ⊗ Q S − i ̵ h [ P R , A ′ ] ⊗ I ) F / = F / ( i ̵ h [ Q R , P R ] ⊗ Q S − i ̵ h [ Q R , A ′ ] ⊗ I ) F / = . Note that when η = − a to − b and a to b (see [7, Section 1]). We will consider the quantumproblem in this case η =

0, that is we will study

M K ̵ h ( R, S ) where R ∶= (∣ a ⟩⟨ a ∣ + ∣ − a ⟩⟨ − a ∣) and S ∶= (∣ b ⟩⟨ b ∣ + ∣ − b ⟩⟨ − b ∣) . Deﬁne λ ∶= ⟨ a ∣ − a ⟩ = e − a /̵ h , µ ∶= ⟨ b ∣ − b ⟩ = e − b /̵ h , and consider the two pairs of orthogonal vectors(58) φ ± ∶= ∣ a ⟩ ± ∣ − a ⟩√ ( ± λ ) , ψ ± ∶= ∣ b ⟩ ± ∣ − b ⟩√ ( ± µ ) . Hence R = α + ∣ φ + ⟩⟨ φ + ∣ + α − ∣ φ − ⟩⟨ φ − ∣ , S = β + ∣ ψ + ⟩⟨ ψ + ∣ + β − ∣ ψ − ⟩⟨ ψ − ∣ , α ± ∶= ( ± λ ) , β ± ∶= ( ± µ ) . In [7, Section 4], we computed an optimal coupling F between R and S of thefollowing form: in the basis { φ + ⊗ ψ + , φ + ⊗ ψ − , φ − ⊗ ψ + , φ − ⊗ ψ − } F is expressed bythe matrix14 ⎛⎜⎜⎜⎝ + λµ + λ + µ √( + λµ ) − ( λ + µ ) − λµ + λ − µ √( − λµ ) − ( λ − µ ) √( − λµ ) − ( λ − µ ) − λµ − λ + µ √( + λµ ) − ( λ + µ ) + λµ − λ − µ ⎞⎟⎟⎟⎠ . Therefore one sees easily that Ker ( F ) is generated by the two vectors {∣ + + ⟩ − √ + λ − λ + µ − µ ∣ − − ⟩ , ∣ + − ⟩ − √ + λ − λ − µ + µ ∣ − + ⟩} , with ∣ ± , ± ⟩ = φ ± ⊗ ψ ± , so that Ker ( F ) ⊥ is the two-dimensional subspace of H gener-ated by { ϕ ∶= ∣ + + ⟩ − √ − λ + λ − µ + µ ∣ − − ⟩ , ϕ ∶= ∣ + − ⟩ − √ − λ + λ + µ − µ ∣ − + ⟩} . Moreover, straightforward computations show that Q R = a √ − λ ( ) , P R = − iaλ √ − λ ( − ) , so that i ̵ h [ P R , Q R ] = a λ − λ ( − ) and i ̵ h [ P R , Q R ] ⊗ Q S = a bλ ( − λ )√ − µ ( − ) ⊗ ( ) . Hence one easily get that, for α, β ∈ C , ( − ) ⊗ ( ) ( α (∣ + + ⟩ − √ − λ + λ − µ + µ ∣ − − ⟩) + β (∣ + − ⟩ − √ − λ + λ + µ − µ ∣ − + ⟩)) = − α (∣ + − ⟩ + √ − λ + λ − µ + µ ∣ − + ⟩) − β (∣ + + ⟩ + √ − λ + λ + µ − µ ∣ − − ⟩) and, deﬁning ∥ α, β ⟩ ∶= αϕ + βϕ , ⟨ α ′ , β ′ ∥ ( − ) ⊗ ( ) ∥ α, β ⟩ = ( ¯ α ′ β + ¯ β ′ α )( − + − λ + λ ) . Moreover ( ) ⊗ ( ) ( α (∣ + + ⟩ − √ − λ + λ − µ + µ ∣ − − ⟩) + β (∣ + − ⟩ − √ − λ + λ + µ − µ ∣ − + ⟩)) = α (∣ + − ⟩ − √ − λ + λ − µ + µ ∣ − + ⟩) + β (∣ + + ⟩ − √ − λ + λ + µ − µ ∣ − − ⟩) and ⟨ α ′ , β ′ ∥ ( ) ⊗ ( ) ∥ α, β ⟩ = ( ¯ α ′ β + ¯ β ′ α )( + − λ + λ ) so that ⟨ α ′ , β ′ ∥ ( − ) ⊗ ( ) ∥ α, β ⟩ = − λ ⟨ α ′ , β ′ ∥ ( ) ⊗ ( ) ∥ α, β ⟩ . 
We just proved the following lemma.

Lemma 7.2. F / ( i ̵ h [ P R , Q R ] ⊗ Q S ) F / = F / ( − a λ ̵ h ( − λ ) I ⊗ Q S ) F / We also computed in [7, Section 2] the matrix C ′ of P ⊗ Q C P ⊗ Q , the costprojected on the range of R ⊗ S ,(59) C ′ = ⎛⎜⎜⎜⎝ A + ̵ h γ B + ̵ h δ δ C + ̵ h γ D + ̵ h ⎞⎟⎟⎟⎠ . The cost used in [7] is shifted by − ̵ h with respect to the one in the present paper. UANTUM OPTIMAL TRANSPORT 35 where A = a − λ + λ + b − µ + µ , B = a − λ + λ + b + µ − µ , γ = − ab ( − λµ )√( − λ )( − µ ) , C = a + λ − λ + b − µ + µ , D = a + λ − λ + b + µ − µ , δ = − ab ( + λµ )√( − λ )( − µ ) . By the same computation, we get that the matrices H R and H S of the harmonicoscillator projected on the range of R and the one of S are H R = ( a − λ + λ + ̵ h a + λ − λ + ̵ h ) , H S = ( b − µ + µ + ̵ h b + µ − µ + ̵ h ) . Finally, we proved in [7, Section 2] that two optimal operators

A, B can be chosenin the form A = ( α α ) B = ( β β ) where α , α , β , β satisfy¯ a = α + β − A , ¯ b = α + β − B , ¯ c = α + β − C , ¯ d = α + β − D , with¯ a + ¯ d = ¯ b + ¯ c = x, ¯ a − ¯ d = √ x − γ , ¯ b − ¯ c = √ x − δ , x = − ab ( − λ µ )( − λ )( − µ ) . We get, after some algebraic computations, α − α = ¯ a − ¯ c + A − C = λa − λ ( b − a ) . Let us remark now that, if D = ( U V ) , U, V ∈ C , then i ̵ h [ P R , D ] = aλ ̵ h √ − λ ( V − UV − U ) = λ ( V − U )̵ h Q R . Therefore, deﬁning A ′ = ( H R − A ) , one ﬁnd(60) i ̵ h [ P R , A ′ ] = λ ̵ h ( a ( − λ + λ − + λ − λ ) + α − α ) Q R = a λ ̵ h ( − λ ) ba Q R . By Lemma 7.2 and (60), and the same type of computations changing Q R in P R and Q S in P S , we get ﬁnally the following result. Proposition 7.3.

In the equal mass situation, we have F / ( I ⊗ Q S − ba Q R ⊗ I ) F / = F / ( I ⊗ P S − be − b ̵ h ae − a ̵ h P R ⊗ I ) F / = which correspond to a transport ( − a, a ) → ( − b, b ) . The renormalization of Q R and P R by ba and be − b ̵ h ae − a ̵ h respectively corresponds to sending ( Q R , P R ) to ( Q S , P S ) by atransport not in the (usual) form of a conjugation by a unitary transform sending { φ + , φ − } to { ψ + , ψ − } . The case where R = S is a T¨oplitz operator. We ﬁrst recall that

M K ̵ h ( R, R ) ≥ d ̵ h > MK , ( µ, µ ) = µ ∈ P ( R d × R d ) . Therefore, computing an optimal coupling and optimal operators a and b is nontrivial problem even in this case. However, this problem can be solvedwhen R is a T¨oplitz operator, deﬁned as follows.Let µ be a Radon measure on R d × R d ; the (possibly unbounded) T¨oplitz operatorwith symbol µ is deﬁned by duality by the formula(61) ⟨ u ∣ OP T ̵ h [ µ ]∣ v ⟩ ∶= ( π ̵ h ) d ∫ R d × R d ⟨ u ∣ q, p ⟩⟨ q, p ∣ v ⟩ µ ( dqdp ) , for all u, v ∈ L ( R d ) such that the functions ( q, p ) ↦ ⟨ u ∣ q, p ⟩ and ( q, p ) ↦ ⟨ v ∣ q, p ⟩ both belong to L ( R d × R d , µ ) , where(62) ∣ q, p ⟩( x ) ∶= ( π ̵ h ) − d / e −∣ x − q ∣ / ̵ h e ip ⋅ x /̵ h . We recall that µ ∈ P ( R d × R d ) Ô⇒ OP T ̵ h [ µ ] ∈ D ( H ) , (see [15], Theorem 2.2 (iii)), and that M K ̵ h ( OP T ̵ h [ µ ] , OP T ̵ h [ ν ]) ≤ dist MK , ( µ, ν ) + d ̵ h for all µ, ν ∈ P ( R d × R d ) (see [15], Theorem 2.2 (iii), or Theorem 2.3 (1) in [13]),while 2 d ̵ h ≤ M K ̵ h ( R, S ) for all R, S ∈ D ( H ) , according to ﬂa. (14) in [13].Hence M K ̵ h ( OP T ̵ h [ µ ] , OP T ̵ h [ µ ]) = d ̵ h . This is Corollary 2.4 in [15].An optimal element of C ( OP T ̵ h [ µ ] , OP T ̵ h [ µ ]) is F ∶= ∫ R d × R d ∣ q, p ⟩⟨ q, p ∣ ⊗ µ ( dqdp ) . That F ∈ C ( OP T ̵ h [ µ ] , OP T ̵ h [ µ ]) follows from Lemma 4.1 in [13]. This is the analogueof the diagonal coupling diag µ of one Borel probability measure µ with itself,where diag is the diagonal embedding diag ∶ x ↦ ( x, x ) . (Informally, the diagonalcoupling is µ ( dx ) δ ( y − x ) .)Moreover trace ( F / CF / ) = sup ǫ > trace ( F ( I + ǫC ) − C ) = ∫ R d × R d ⟨ q, q, p, p ∣ C ∣ q, q, p, p ⟩ µ ( dqdp ) = d ̵ h . as explained in the proof of Lemma 2.1 of [15]. In the formula above, we havedenoted ∣ q , q , p , p ⟩ = ∣ q , p ⟩ ⊗ ∣ q , p ⟩ , ⟨ q , q , p , p ∣ = ⟨ q , p ∣ ⊗ ⟨ q , p ∣ We claim that one can choose in this case(63) a = b = d ̵ hI H . 
Indeed, according to the Heisenberg uncertainty principle, $C \ge 2d\hbar\, I\otimes I = a\otimes I + I\otimes a$.
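The lower bound on the cost can be recovered by a short ladder-operator computation. The following is only a sketch, assuming the cost is written as $C = \sum_j (u_j^2 + v_j^2)$ with the canonical commutation relation $[q_j, p_j] = i\hbar$; the symbols $u_j$ and $v_j$ are notation introduced here for illustration and do not appear elsewhere in the paper.

```latex
\begin{align*}
u_j &:= q_j\otimes I - I\otimes q_j, \qquad
v_j := p_j\otimes I - I\otimes p_j, \qquad
[u_j, v_j] = [q_j,p_j]\otimes I + I\otimes[q_j,p_j] = 2i\hbar\, I\otimes I,\\
0 &\le (u_j + i v_j)^*(u_j + i v_j)
   = u_j^2 + v_j^2 + i[u_j, v_j]
   = u_j^2 + v_j^2 - 2\hbar\, I\otimes I,\\
C &= \sum_{j=1}^d \big(u_j^2 + v_j^2\big) \;\ge\; 2d\hbar\, I\otimes I
   = (d\hbar\, I_H)\otimes I + I\otimes(d\hbar\, I_H).
\end{align*}
```

Under these assumptions, the choice (63) of $a$ and $b$ splits the constant $2d\hbar$ evenly between the two factors.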

On the other hand trace ( a OP T ̵ h [ µ ]) = d ̵ h trace ( OP T ̵ h [ µ ]) = d ̵ h ∫ R d × R d µ ( dxdξ ) = d ̵ h , so that, with the choice of a and b above, one hastrace ( a OP T ̵ h [ µ ]) + trace ( b OP T ̵ h [ µ ]) = d ̵ h = trace ( F / CF / ) . In the classical setting, the optimal functions φ and ψ in (16) are φ op = ψ op = ̵ h → When is the optimal coupling a T¨oplitz operator?

Let R and S beT¨oplitz density operators, of the form R = OP T ̵ h [( π ̵ h ) d µ ] and S = OP T ̵ h [( π ̵ h ) d ν ] with µ, ν ∈ P ( R d × R d ) . When is an optimal coupling of R and S a T¨oplitzoperator, of the form F = OP T ̵ h [( π ̵ h ) d λ ] for some λ ∈ P ( R d × R d × R d × R d ) ?We shall see that this question is answered in the aﬃrmative only under ratherstringent conditions.We already know the answer in two diﬀerent cases discussed above:(a) R = S (see previous section);(b) µ = δ q,p and ν = δ q ′ ,p ′ , in which case R and S are rank-one operators, in whichcase the only (and therefore the optimal) coupling is R ⊗ S = OP T ̵ h [( π ̵ h ) d µ ⊗ ν ] . Moreover, as recalled at the end of Section 1 and in Section 7.2, we studied in [7]the case where R = + η ∣ a, ⟩⟨ a, ∣ + − η ∣ − a, ⟩⟨ − a, ∣ = OP T ̵ h [( π ̵ h ) µ ] , µ = + η δ ( a, ) + − η δ ( − a, ) S = ∣ b, ⟩⟨ b, ∣ + ∣ − b, ⟩⟨ − b, ∣ = OP T ̵ h [( π ̵ h ) ν ] , ν = δ ( b, ) + δ ( − b, ) , a, b ∈ R + . we proved in [7, Section 4] that ● when η = F is the T¨oplitzquantization of a classical one f : F = (∣ a ⟩ ⊗ ∣ b ⟩⟨ a ∣ ⊗ ⟨ b ∣ + ∣ − a ⟩ ⊗ ∣ − b ⟩⟨ − a ∣ ⊗ ⟨ − b ∣) = OP T ̵ h [( π ̵ h ) ( δ ( a, ) ⊗ δ ( b, ) + δ ( − a, ) ⊗ δ ( − b, ) )] ∶= OP T ̵ h [( π ̵ h ) f ] ● when η ≠ a = b ) aclassical optimal coupling is f = δ ( a, ⊗ δ ( a, ) + − η δ ( − a, ⊗ δ ( − a, ) + η δ ( a, ⊗ δ ( − a, ) , and we proved that, non only OP T ̵ h [( π ̵ h ) f ] is not an optimal couplingof R, S , but no optimal coupling of

R, S can be a T¨oplitz operator in thiscase.In the analysis below, we consider this problem when µ and ν are of the form µ ( dqdp ) = m ( q, p ) dqdp , ν ( dqdp ) = n ( q, p ) dqdp , m, n > ( R ) = Ker ( S ) = { } . To see this, weﬁrst recall one deﬁnition of the Husimi transform of an operator A on L ( R d ) :(64) ˜ W ̵ h [ A ]( q, p ) ∶= ( π ̵ h ) d ⟨ q, p ∣ A ∣ q, p ⟩ , q, p ∈ R d . (There is another, equivalent deﬁnition in terms of the Wigner transform: see (49),and the formula following (53) in [13]; the equivalence between these two deﬁnitionsis the formula before (54) in [13].) Now, if φ ∈ Ker ( R ) , one has, by formula (54) of[13], ⟨ φ ∣ R ∣ φ ⟩ = trace (∣ φ ⟩⟨ φ ∣ R ) = ∫ R d ˜ W ̵ h [∣ φ ⟩⟨ φ ∣]( q, p ) m ( q, p ) dqdp = . This implies that ˜ W ̵ h [∣ φ ⟩⟨ φ ∣] =

0, which implies in turn that φ =

0. For this impli-cation, see for instance Remark 2.3 in [14]. Equivalently, using (46) and (54) in [13]with R = ∣ φ ⟩⟨ φ ∣ shows that ∥ φ ∥ H = trace (∣ φ ⟩⟨ φ ∣) = ∫ R d ˜ W ̵ h [∣ φ ⟩⟨ φ ∣]( q, p ) dqdp = . Let ( a , b ) ∈ ˜ K ( R, S ) such thattrace ( R / a R / + S / b S / ) = M K ̵ h ( R, S ) . Assume that ∣ q, p ⟩ ∈ J [ R ] ∩ J [ S ] for each ( q, p ) ∈ R d × R d , and that the functions ( q, p ) ↦ ⟨ q, p ∣ a ∣ q, p ⟩ and ( q, p ) ↦ ⟨ q, p ∣ b ∣ q, p ⟩ are of class C on R d × R d . Deﬁne a ( q, p ) ∶= ˜ W ̵ h [ a ]( q, p ) , b ( q, p ) ∶= ˜ W ̵ h [ b ]( q, p ) , and ˜ a ( q, p ) ∶= (∣ q ∣ + ∣ p ∣ − a ( q, p ) + d ̵ h ) , ˜ b ( q, p ) ∶= (∣ q ∣ + ∣ p ∣ − b ( q, p ) + d ̵ h ) . Notice that a ∈ L ( R d , µ ) and b ∈ L ( R d , ν ) since R / a R / and S / b S / belongto L ( H ) by deﬁnition of ˜ K ( R, S ) , because Ker ( R ) = Ker ( S ) = { } . Theorem 7.4.

The following conditions are equivalent:

(a) there exists $\lambda \in P(\mathbf{R}^d\times\mathbf{R}^d\times\mathbf{R}^d\times\mathbf{R}^d)$ such that $F = \mathrm{OP}^T_\hbar[(2\pi\hbar)^{2d}\lambda]$ is an optimal coupling of $R$ and $S$, i.e. $F \in C(R, S)$ and $\mathrm{trace}(F^{1/2}CF^{1/2}) = MK_\hbar(R, S)^2$;

(b) one has $MK_\hbar(R, S)^2 = \mathrm{dist}_{MK,2}(\mu, \nu)^2 + 2d\hbar$;

(c) the functions $\tilde a$ and $\tilde b$ are the Legendre transforms of each other, i.e. $\tilde a^* = \tilde b$ and $\tilde b^* = \tilde a$, and satisfy the Monge-Ampère equation

$\det(\nabla^2\tilde a) = \dfrac{m}{n\circ\nabla\tilde a}.$

If these conditions are satisfied, $\lambda$ is of the form

$\lambda(dq_1dp_1dq_2dp_2) = m(q_1, p_1)\,\delta\big((q_2, p_2) - \nabla\tilde a(q_1, p_1)\big)\,dq_1dp_1.$

Let us recall that, for two Töplitz densities $R$ and $S$ in $D(L^2(\mathbf{R}^d))$ with symbols $(2\pi\hbar)^d\mu$ and $(2\pi\hbar)^d\nu$ resp., it has been proved in [13] (Theorem 2.3 (1)) that

$MK_\hbar(R, S)^2 \le \mathrm{dist}_{MK,2}(\mu, \nu)^2 + 2d\hbar.$

We also recall the example constructed in section 3 of [7], where $\mu$ and $\nu$ are convex combinations of two Dirac measures with identical supports, for which the inequality above is strict. This example explains the title of [7]: quantum optimal transport is "cheaper" than classical optimal transport, due to additional degrees of freedom in quantum couplings which have no classical interpretation: see the penultimate paragraph on pp. 161–162 in [7].

At variance with the example in section 3 of [7], the situation where the optimal coupling between two Töplitz densities $R, S$ is a Töplitz operator is the closest to the classical setting. The classical optimal transport map between the symbols of $R$ and $S$ is transformed into an optimal quantum coupling by Töplitz quantization. There are no strictly quantum effects in this coupling, unlike in the case discussed in section 3 of [7], so that the inequality in Theorem 2.3 (1) of [13] becomes an equality in this case. In other words, apart from the additional term $2d\hbar$, the quantum distance between such Töplitz densities is indeed the classical Monge-Kantorovich distance between their symbols. Examples of Töplitz densities satisfying properties (a) and (b) of the theorem above can be found in section 2 of [7]. However, the example constructed in section 2 of [7] does not fall exactly in the class of densities considered in the theorem above, since the symbols of the densities considered in section 2 of [7] are convex combinations of two Dirac measures (with different supports and equal coefficients).

Proof.

Assume that (a) holds. One has

(65) $\mathrm{trace}(F^{1/2}CF^{1/2}) = \int_{\mathbf{R}^{4d}} \tilde W_\hbar[C](q_1, p_1, q_2, p_2)\,\lambda(dq_1dp_1dq_2dp_2)$

and

$\tilde W_\hbar[C](q_1, p_1, q_2, p_2) = c(q_1, p_1, q_2, p_2) + 2d\hbar$ with $c(q_1, p_1, q_2, p_2) = |q_1 - q_2|^2 + |p_1 - p_2|^2.$

Let us take (65) for granted; we shall give a quick proof of this formula at the end of the present section.

Since $F \in C(R, S)$, the symbol $(2\pi\hbar)^{2d}\lambda$ of $F$ satisfies $\lambda \in \Pi(\mu, \nu)$, the set of couplings of $\mu$ and $\nu$, according to Lemma 4.1 in [13]. Hence

$MK_\hbar(R, S)^2 = \int_{\mathbf{R}^{4d}} c(q_1, p_1, q_2, p_2)\,\lambda(dq_1dp_1dq_2dp_2) + 2d\hbar \ge \mathrm{dist}_{MK,2}(\mu, \nu)^2 + 2d\hbar.$

By Theorem 2.3 (1) of [13], one has

$MK_\hbar(R, S)^2 \le \mathrm{dist}_{MK,2}(\mu, \nu)^2 + 2d\hbar,$

which proves (b).

Conversely, pick an optimal coupling $\lambda \in \Pi(\mu, \nu)$ and set $F = \mathrm{OP}^T_\hbar[(2\pi\hbar)^{2d}\lambda]$. Then $F \in C(R, S)$ by Lemma 4.1 in [13], and

$\mathrm{dist}_{MK,2}(\mu, \nu)^2 + 2d\hbar = \int_{\mathbf{R}^{4d}} c(q_1, p_1, q_2, p_2)\,\lambda(dq_1dp_1dq_2dp_2) + 2d\hbar = \mathrm{trace}(F^{1/2}CF^{1/2}).$

Hence, (b) implies that

$MK_\hbar(R, S)^2 = \mathrm{trace}(F^{1/2}CF^{1/2})$, so that (a) holds.

If (a) holds, then

$\int_{\mathbf{R}^{4d}} c(q_1, p_1, q_2, p_2)\,\lambda(dq_1dp_1dq_2dp_2) + 2d\hbar = \mathrm{trace}(F^{1/2}CF^{1/2}) = \mathrm{trace}(R^{1/2}aR^{1/2} + S^{1/2}bS^{1/2}) = \int_{\mathbf{R}^{2d}} a(q_1, p_1)\,\mu(dq_1dp_1) + \int_{\mathbf{R}^{2d}} b(q_2, p_2)\,\nu(dq_2dp_2).$

On the other hand, since $(a, b) \in \tilde K(R, S)$, and since $|q, p\rangle \in J[R]\cap J[S]$, one has

$\langle q_1, q_2, p_1, p_2|C|q_1, q_2, p_1, p_2\rangle \ge \langle q_1, p_1|a|q_1, p_1\rangle + \langle q_2, p_2|b|q_2, p_2\rangle,$

i.e.

$c(q_1, p_1, q_2, p_2) \ge a(q_1, p_1) - d\hbar + b(q_2, p_2) - d\hbar.$

Since $a \in L^1(\mathbf{R}^{2d}, \mu)$ and $b \in L^1(\mathbf{R}^{2d}, \nu)$ and $\lambda \in \Pi(\mu, \nu)$ with

$\int_{\mathbf{R}^{4d}} c(q_1, p_1, q_2, p_2)\,\lambda(dq_1dp_1dq_2dp_2) = \int_{\mathbf{R}^{2d}} (a(q_1, p_1) - d\hbar)\,\mu(dq_1dp_1) + \int_{\mathbf{R}^{2d}} (b(q_2, p_2) - d\hbar)\,\nu(dq_2dp_2),$

we conclude from Theorem 1.3 in [24] (the Kantorovich duality theorem) that $\lambda$ is an optimal element of $\Pi(\mu, \nu)$, and that the optimal functions $\tilde a$ and $\tilde b$ are Legendre duals of each other (see Lemma 2.10 in [24]).

By the Brenier theorem (Theorem 2.12 (ii) in [24]), the measure $\lambda$ is of the form

$\lambda(dq_1dp_1dq_2dp_2) = m(q_1, p_1)\,\delta\big((q_2, p_2) - \nabla\Phi(q_1, p_1)\big)\,dq_1dp_1,$

with $\Phi$ convex. Hence

$\tilde a(q_1, p_1) + \tilde b(\nabla\Phi(q_1, p_1)) = q_1\cdot\nabla_q\Phi(q_1, p_1) + p_1\cdot\nabla_p\Phi(q_1, p_1)$ for a.e. $(q_1, p_1)$.

On the other hand, we know that

$\tilde a(z, \zeta) + \tilde b(\nabla\Phi(q_1, p_1)) \ge z\cdot\nabla_q\Phi(q_1, p_1) + \zeta\cdot\nabla_p\Phi(q_1, p_1)$ for a.e. $(q_1, p_1, z, \zeta)$.

Therefore

$\tilde a(z, \zeta) - \tilde a(q_1, p_1) \ge (z - q_1)\cdot\nabla_q\Phi(q_1, p_1) + (\zeta - p_1)\cdot\nabla_p\Phi(q_1, p_1)$ for a.e. $(q_1, p_1, z, \zeta)$.

Hence $\nabla\Phi = \nabla\tilde a$, and since $\nabla\Phi\#\mu = \nu$, the change of variables formula implies that

$\det(\nabla^2\Phi)\; n\circ\nabla\Phi = m.$

This proves (c).

Conversely, assume that (c) holds and set

$\lambda(dq_1dp_1dq_2dp_2) = m(q_1, p_1)\,\delta\big((q_2, p_2) - \nabla\tilde a(q_1, p_1)\big)\,dq_1dp_1.$

Obviously, $\lambda \in \Pi(\mu, \nu)$ because of the Monge-Ampère equation satisfied by $\tilde a$, and Brenier's theorem implies that

$\int_{\mathbf{R}^{4d}} c(q_1, p_1, q_2, p_2)\,\lambda(dq_1dp_1dq_2dp_2) = \mathrm{dist}_{MK,2}(\mu, \nu)^2.$

Set $F = \mathrm{OP}^T_\hbar[(2\pi\hbar)^{2d}\lambda]$; by Lemma 4.1 in [13], one has $F \in C(R, S)$. On the other hand, since $\tilde a^* = \tilde b$ and $\tilde b^* = \tilde a$, one has

$\tilde a(q_1, p_1) + \tilde b(q_2, p_2) = q_1\cdot q_2 + p_1\cdot p_2$ $\lambda$-a.e. in $(q_1, p_1, q_2, p_2)$,

or equivalently

$\int_{\mathbf{R}^{4d}} c(q_1, p_1, q_2, p_2)\,\lambda(dq_1dp_1dq_2dp_2) + 2d\hbar = \int_{\mathbf{R}^{2d}} a(q_1, p_1)\,\mu(dq_1dp_1) + \int_{\mathbf{R}^{2d}} b(q_2, p_2)\,\nu(dq_2dp_2).$

Since $a = \tilde W_\hbar[a]$ and $b = \tilde W_\hbar[b]$, this equality can be recast as

$\mathrm{trace}(F^{1/2}CF^{1/2}) = \mathrm{trace}(R^{1/2}aR^{1/2} + S^{1/2}bS^{1/2}) = MK_\hbar(R, S)^2,$

so that $F$ is an optimal element of $C(R, S)$, and (a) holds. □

Proof of (65). Let $(e_j)_{j\ge1}$ be a complete orthonormal system of eigenvectors of $F \in D(H\otimes H)$, so that

$F = \sum_{j\ge1} \ell_j\,|e_j\rangle\langle e_j|,$ with $\sum_{j\ge1}\ell_j = 1$ and $\ell_j \ge 0$ for each $j \ge 1$.

On the other hand, by formula (48) in [13],

$\mathrm{OP}^T_\hbar[c] = C + \tfrac14\hbar\,(\Delta_{q_1,p_1,q_2,p_2}c)\,I_{H\otimes H} = C + 2d\hbar\, I,$

where we recall that $c(q_1, p_1, q_2, p_2) = |q_1 - q_2|^2 + |p_1 - p_2|^2$. By Tonelli's theorem, denoting $z_1 := (q_1, p_1)$ and $z_2 := (q_2, p_2)$, one has

$\mathrm{trace}(F^{1/2}CF^{1/2}) + 2d\hbar$
$= \sum_{j,k\ge1} \ell_j^{1/2}\ell_k^{1/2}\,\frac{1}{(2\pi\hbar)^{2d}}\int_{\mathbf{R}^{4d}} \langle e_j|z_1, z_2\rangle\langle z_1, z_2|e_k\rangle\langle e_k|e_j\rangle\, c(z_1, z_2)\,dq_1dp_1dq_2dp_2$
$= \sum_{j\ge1} \ell_j\,\frac{1}{(2\pi\hbar)^{2d}}\int_{\mathbf{R}^{4d}} \langle e_j|z_1, z_2\rangle\langle z_1, z_2|e_j\rangle\, c(z_1, z_2)\,dq_1dp_1dq_2dp_2$
$= \int_{\mathbf{R}^{4d}} \sum_{j\ge1} \ell_j\,\frac{1}{(2\pi\hbar)^{2d}}\,\langle e_j|z_1, z_2\rangle\langle z_1, z_2|e_j\rangle\, c(z_1, z_2)\,dq_1dp_1dq_2dp_2$
$= \int_{\mathbf{R}^{4d}} \tilde W_\hbar[F](z_1, z_2)\, c(z_1, z_2)\,dq_1dp_1dq_2dp_2.$

Since $F = \mathrm{OP}^T_\hbar[(2\pi\hbar)^{2d}\lambda]$, one has $\tilde W_\hbar[F] = e^{\hbar\Delta_{q_1,p_1,q_2,p_2}/2}\lambda$ (see (51) and the formula following (53) in [13]), so that

$\mathrm{trace}(F^{1/2}CF^{1/2}) + 2d\hbar = \int_{\mathbf{R}^{4d}} \tilde W_\hbar[F](z_1, z_2)\, c(z_1, z_2)\,dq_1dp_1dq_2dp_2 = \int_{\mathbf{R}^{4d}} e^{\hbar\Delta_{q_1,p_1,q_2,p_2}/2}c(q_1, p_1, q_2, p_2)\,\lambda(dq_1dp_1dq_2dp_2)$

since $e^{\hbar\Delta_{q_1,p_1,q_2,p_2}/2}$ is self-adjoint. On the other hand

$e^{\hbar\Delta_{q_1,p_1,q_2,p_2}/2}c(q_1, p_1, q_2, p_2) = c(q_1, p_1, q_2, p_2) + 4d\hbar,$

so that

$\mathrm{trace}(F^{1/2}CF^{1/2}) + 2d\hbar = \int_{\mathbf{R}^{4d}} \big(c(q_1, p_1, q_2, p_2) + 4d\hbar\big)\,\lambda(dq_1dp_1dq_2dp_2),$

which is equivalent to (65). □

Appendix A. The Quantum Transport Cost

The quantum cost is the differential operator

$C := \sum_{j=1}^d \big((x_j - y_j)^2 - \hbar^2(\partial_{x_j} - \partial_{y_j})^2\big).$

For each $f \equiv f(x_1, y_1, \ldots, x_d, y_d) \in S(\mathbf{R}^{2d})$, one has

$Cf(x_1, y_1, \ldots, x_d, y_d) = \sum_{j=1}^d 2\big(Y_j^2 - \hbar^2\partial_{Y_j}^2\big)\, f\Big(\tfrac{X_1 + Y_1}{\sqrt2}, \tfrac{X_1 - Y_1}{\sqrt2}, \ldots, \tfrac{X_d + Y_d}{\sqrt2}, \tfrac{X_d - Y_d}{\sqrt2}\Big).$

The $d$ operators $Y_j^2 - \hbar^2\partial_{Y_j}^2$ obviously commute pairwise. Since each one of these operators is the quantum Hamiltonian of a harmonic oscillator, we know that a complete orthonormal system of eigenfunctions for $Y_j^2 - \hbar^2\partial_{Y_j}^2$ on $L^2(\mathbf{R})$ is

$\hbar^{-1/4}\, h_n(Y_j/\sqrt\hbar), \qquad n \ge 0,$

where $h_n$ is the $n$-th Hermite function

$h_n(z) := \pi^{-1/4}(2^n n!)^{-1/2}\, e^{-z^2/2} H_n(z), \qquad H_n(z) := (-1)^n e^{z^2}\big(e^{-z^2}\big)^{(n)};$

explicitly,

$\hbar^{-1/4}\, h_n(Y_j/\sqrt\hbar) = (\pi\hbar)^{-1/4}(2^n n!)^{-1/2}\, e^{-Y_j^2/2\hbar} H_n(Y_j/\sqrt\hbar), \qquad n \ge 0,$

with

$\big(Y_j^2 - \hbar^2\partial_{Y_j}^2\big)\, h_n(Y_j/\sqrt\hbar) = \hbar(2n + 1)\, h_n(Y_j/\sqrt\hbar), \qquad n \ge 0.$

Since the linear transformation of $\mathbf{R}^{2d}$

$(X_1, Y_1, \ldots, X_d, Y_d) \mapsto \Big(\tfrac{X_1 + Y_1}{\sqrt2}, \tfrac{X_1 - Y_1}{\sqrt2}, \ldots, \tfrac{X_d + Y_d}{\sqrt2}, \tfrac{X_d - Y_d}{\sqrt2}\Big)$

has Jacobian determinant $(-1)^d$, it leaves the Lebesgue measure of $\mathbf{R}^{2d}$ invariant, so that

$\Psi_{m_1,\ldots,m_d,n_1,\ldots,n_d}(x_1, y_1, \ldots, x_d, y_d) := \hbar^{-d/2}\prod_{j=1}^d h_{m_j}\Big(\tfrac{x_j + y_j}{\sqrt{2\hbar}}\Big)\, h_{n_j}\Big(\tfrac{x_j - y_j}{\sqrt{2\hbar}}\Big)$

defines a complete orthonormal system of eigenfunctions of $C$, i.e.

$\int_{\mathbf{R}^{2d}} \overline{\Psi_{m'_1,\ldots,m'_d,n'_1,\ldots,n'_d}}\,\Psi_{m_1,\ldots,m_d,n_1,\ldots,n_d}(x_1, y_1, \ldots, x_d, y_d)\,dx_1\ldots dy_d = \prod_{j=1}^d \delta_{m'_j,m_j}\delta_{n'_j,n_j},$

with

$C\,\Psi_{m_1,\ldots,m_d,n_1,\ldots,n_d} = 2\hbar\big(2(n_1 + \ldots + n_d) + d\big)\,\Psi_{m_1,\ldots,m_d,n_1,\ldots,n_d}.$

Thus

$C = 2\hbar \sum_{m_1,\ldots,m_d\ge0,\ n_1,\ldots,n_d\ge0} \big(2(n_1 + \ldots + n_d) + d\big)\,|\Psi_{m_1,\ldots,m_d,n_1,\ldots,n_d}\rangle\langle\Psi_{m_1,\ldots,m_d,n_1,\ldots,n_d}|;$

in other words, $C$ has the spectral decomposition

$E(d\lambda) := \sum_{m_1,\ldots,m_d\ge0,\ n_1,\ldots,n_d\ge0} \delta\big(\lambda - 2\hbar(2(n_1 + \ldots + n_d) + d)\big)\,|\Psi_{m_1,\ldots,m_d,n_1,\ldots,n_d}\rangle\langle\Psi_{m_1,\ldots,m_d,n_1,\ldots,n_d}|.$
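As a sanity check on the bottom of the spectrum, the lowest eigenvalue can be verified by hand in the original variables. The following is a sketch for $d = 1$ only; the function $g$ is notation introduced here for illustration, and it represents just the factor depending on $x - y$ (any factor depending on $x + y$ alone is annihilated by $\partial_x - \partial_y$ and therefore plays no role).

```latex
\begin{align*}
g(x,y) &:= \exp\Big(-\frac{(x-y)^2}{4\hbar}\Big), \qquad
(\partial_x - \partial_y)\, g = -\frac{x-y}{\hbar}\, g,\\
(\partial_x - \partial_y)^2 g &= \Big(\frac{(x-y)^2}{\hbar^2} - \frac{2}{\hbar}\Big)\, g,\\
C g &= (x-y)^2\, g - \hbar^2 (\partial_x - \partial_y)^2 g = 2\hbar\, g .
\end{align*}
```

The value $2\hbar$ obtained here agrees with the lower bound $2d\hbar$ on the quantum transport cost for $d = 1$.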

Appendix B. Monotone Convergence for Trace-Class Operators

Here is an analogue of the Beppo Levi monotone convergence theorem for operators in the form convenient for our purpose.

Let $H$ be a separable Hilbert space and $0 \le T = T^* \in L(H)$. For each complete orthonormal system $(e_j)_{j\ge1}$ of $H$, set

$\mathrm{trace}_H(T) = \|T\|_1 := \sum_{j\ge1} \langle e_j|T|e_j\rangle \in [0, +\infty].$

See Theorem 2.14 in [23]; in particular the expression on the last right hand side of these equalities is independent of the complete orthonormal system $(e_j)_{j\ge1}$. Then

$T \in L^1(H) \iff \|T\|_1 < \infty.$

Lemma B.1 (Monotone convergence). Consider a sequence $T_n = T_n^* \in L(H)$ such that

(i) $0 \le T_1 \le T_2 \le \ldots \le T_n \le \ldots$, and $\sup_{n\ge1}\langle x|T_n|x\rangle < +\infty$ for all $x \in H$, or

(ii) $0 \le T_1 \le T_2 \le \ldots \le T_n \le \ldots$, and $\sup_{n\ge1}\mathrm{trace}_H(T_n) < +\infty$.

Then

(a) there exists $T = T^* \in L(H)$ such that $T \ge 0$ and $T_n \to T$ weakly as $n \to \infty$, and

(b) $\mathrm{trace}_H(T_n) \to \mathrm{trace}_H(T)$ as $n \to \infty$.

Proof. First we prove statements (a) and (b) under assumption (i). Since the sequence $\langle x|T_n|x\rangle \in [0, +\infty)$ is nondecreasing for each $x \in H$,

$\langle x|T_n|x\rangle \to \sup_{n\ge1}\langle x|T_n|x\rangle =: q(x) \in [0, +\infty)$ for all $x \in H$

as $n \to \infty$. Hence

$\langle x|T_n|y\rangle = \overline{\langle y|T_n|x\rangle} \to \tfrac14\big(q(x + y) - q(x - y) + iq(x - iy) - iq(x + iy)\big) =: b(x, y) \in \mathbf{C}$

as $n \to +\infty$. By construction, $b$ is a nonnegative sesquilinear form on $H$.

Consider, for each $k \ge 0$,

$F_k := \{x \in H \text{ s.t. } \langle x|T_n|x\rangle \le k \text{ for each } n \ge 1\}.$

The set $F_k$ is closed for each $k \ge 0$, being the intersection of the closed sets defined by the inequality $\langle x|T_n|x\rangle \le k$ as $n \ge 1$. Since the sequence $\langle x|T_n|x\rangle$ is bounded for each $x \in H$,

$\bigcup_{k\ge0} F_k = H.$

Applying Baire's theorem shows that there exists $N \ge 0$ such that $\mathring{F}_N \ne \emptyset$. In other words, there exist $r > 0$ and $x_0 \in H$ such that

$|x - x_0| \le r \implies |\langle x|T_n|x\rangle| \le N$ for all $n \ge 1$.

By linearity and positivity of $T_n$, this implies

$|\langle z|T_n|z\rangle| \le \tfrac{2}{r^2}(M + N)\|z\|^2$ for all $n \ge 1$, with $M := \sup_{n\ge1}\langle x_0|T_n|x_0\rangle$.

In particular

$\sup_{|z|\le1} q(z) \le \tfrac{2}{r^2}(M + N),$

so that

$|b(x, y)| \le \tfrac{2}{r^2}(M + N)\,\|x\|_H\|y\|_H$ for each $x, y \in H$

by the Cauchy-Schwarz inequality. By the Riesz representation theorem, there exists $T \in L(H)$ such that

$T = T^* \ge 0$, and $b(x, y) = \langle x|T|y\rangle$.

This proves (a).

Observe that $T \ge T_n$ for each $n \ge 1$, so that

$\sup_{n\ge1}\mathrm{trace}_H(T_n) \le \mathrm{trace}_H(T).$

In particular

$\sup_{n\ge1}\mathrm{trace}_H(T_n) = +\infty \implies \mathrm{trace}_H(T) = +\infty.$

Since the sequence $\mathrm{trace}_H(T_n)$ is nondecreasing,

$\mathrm{trace}_H(T_n) \to \sup_{n\ge1}\mathrm{trace}_H(T_n)$ as $n \to \infty$.

By the noncommutative variant of Fatou's lemma (Theorem 2.7 (d) in [23]),

$\sup_{n\ge1}\mathrm{trace}_H(T_n) < \infty \implies T \in L^1(H)$ and $\mathrm{trace}_H(T) \le \sup_{n\ge1}\mathrm{trace}_H(T_n).$

Since the opposite inequality is already known to hold, this proves (b).

Next we prove (a) and (b) under assumption (ii). Since any $x \in H\setminus\{0\}$ can be normalized and completed into a complete orthonormal system of $H$, one has

$\sup_{n\ge1}\langle x|T_n|x\rangle \le \|x\|_H^2\,\sup_{n\ge1}\mathrm{trace}_H(T_n) < \infty.$

Thus, assumption (ii) implies (i), which implies in turn (a) and (b). □
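Two elementary diagonal sequences illustrate the two assumptions of Lemma B.1. This is only an illustrative sketch, assuming that $H$ is infinite-dimensional and that $(e_j)_{j\ge1}$ is a fixed complete orthonormal system; the sequences $T_n$ and $T'_n$ below are hypothetical examples introduced here, not objects used elsewhere in the paper.

```latex
\begin{align*}
T_n &:= \sum_{j=1}^n 2^{-j}\,|e_j\rangle\langle e_j| , &
\operatorname{trace}_H(T_n) &= 1 - 2^{-n} \;\to\; 1
  = \operatorname{trace}_H\Big(\sum_{j\ge 1} 2^{-j}\,|e_j\rangle\langle e_j|\Big),\\
T'_n &:= \sum_{j=1}^n |e_j\rangle\langle e_j| , &
\operatorname{trace}_H(T'_n) &= n \;\to\; +\infty = \operatorname{trace}_H(I_H).
\end{align*}
```

The first sequence satisfies (ii), hence (i), and its weak limit is trace-class. The second satisfies (i) but not (ii), since $\langle x|T'_n|x\rangle \le \|x\|_H^2$ while the traces are unbounded; its weak limit is $I_H$, and statement (b) still holds with both sides equal to $+\infty$.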

Appendix C. The Finite Energy Condition

Let A = A ∗ ≥ H with domainDom ( A ) , and let E be its spectral decomposition.Throughout this section, we assume that T ∈ L ( H ) satisﬁes T = T ∗ ≥

0, andlet ( e j ) j ≥ be a complete orthonormal system of eigenvectors of T with T e j = τ j e j and τ j ∈ [ , +∞ ) for each j ≥

1, such that(66) ∑ j ≥ τ j ∫ ∞ λ ⟨ e j ∣ E ( dλ )∣ e j ⟩ < ∞ . Lemma C.1.

Under the assumptions above, (67) T / AT / ∶ = ∑ j,k ≥ τ / j τ / k ( ∫ ∞ λ ⟨ e j ∣ E ( dλ )∣ e k ⟩) ∣ e j ⟩⟨ e k ∣ satisﬁes ≤ T / AT / = ( T / AT / ) ∗ ∈ L ( H ) and (68) trace H ( T / AT / ) = ∑ j ≥ τ j ∫ ∞ λ ⟨ e j ∣ E ( dλ )∣ e j ⟩ . In particular (69) e j ∈ Ker ( T ) ⊥ Ô⇒ e j ∈ Form-Dom ( A ) . Proof.

For each Borel $\omega \subset \mathbf{R}$ and each $x, y \in H$, one has

$|\langle x|E(\omega)|y\rangle| = |\langle E(\omega)x|E(\omega)y\rangle| \le \|E(\omega)x\|\,\|E(\omega)y\| = \langle x|E(\omega)|x\rangle^{1/2}\,\langle y|E(\omega)|y\rangle^{1/2}$

since $E(\omega)$ is a self-adjoint projection. In particular, for each $\alpha > 0$, one has

$2\,|\langle x|E(\omega)|y\rangle| \le \alpha\,\langle x|E(\omega)|x\rangle + \tfrac{1}{\alpha}\,\langle y|E(\omega)|y\rangle.$

Hence, for all j, k ≥ a jk ∶ = ∫ ∞ λ ⟨ e j ∣ E ( dλ )∣ e k ⟩ ∈ C satisﬁes 2 ∣ a jk ∣ ≤ αa jj + α a kk for all α > , so that ∣ a jk ∣ ≤ a jj a kk . Since ( τ j a jj ) j ≥ ∈ ℓ ( N ∗ ) by (66) and since ⟨ e j ∣ T / AT / ∣ e k ⟩ = τ / j τ / k a jk = ⟨ e k ∣ T / AT / ∣ e j ⟩ , one concludes that T / AT / = ( T / AT / ) ∗ ∈ L ( H ) . Moreover, for each x ∈ H ⟨ x ∣ T / AT / ∣ x ⟩ = ∑ j,k ≥ τ / j τ / k ⟨ e j ∣ x ⟩⟨ e k ∣ x ⟩ ∫ ∞ λ ⟨ e j ∣ E ( dλ )∣ e k ⟩ ≥ ∫ ∞ λ ⟨ ∑ j ≥ τ / j ⟨ e j ∣ x ⟩ e j ∣ E ( dλ )∣ ∑ j ≥ τ / j ⟨ e j ∣ x ⟩ e j ⟩ = ∫ ∞ λ ⟨ T / x ∣ E ( dλ )∣ T / x ⟩ ≥ , so that T / AT / ≥

0. Finally ∑ l ≥ ⟨ e l ∣ T / AT / ∣ e l ⟩ = ∑ l ≥ ∑ j,k ≥ τ / j τ / k ( ∫ ∞ λ ⟨ e j ∣ E ( dλ )∣ e k ⟩) ⟨ e l ∣ e j ⟩⟨ e k ∣ e l ⟩ = ∑ l ≥ ∑ j,k ≥ τ / j τ / k ( ∫ ∞ λ ⟨ e j ∣ E ( dλ )∣ e k ⟩) δ lj δ lk = ∑ l ≥ τ l ∫ ∞ λ ⟨ e l ∣ E ( dλ )∣ e l ⟩ < ∞ so that T / AT / ∈ L ( H ) , with ∥ T / AT / ∥ = trace H ( T / AT / ) = ∑ l ≥ τ l ∫ ∞ λ ⟨ e l ∣ E ( dλ )∣ e l ⟩ < ∞ . In particular for each j ≥

1, one has τ j > Ô⇒ ⟨ e j ∣ A ∣ e j ⟩ ∶ = ∫ ∞ λ ⟨ e j ∣ E ( dλ )∣ e j ⟩ ≤ τ j trace H ( T / AT / ) < ∞ , and this proves (69). (cid:3) Corollary C.2.

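Let $\Phi_n\in C(\mathbf{R}_+)$ be bounded and satisfy

$$0\le\Phi_1(r)\le\Phi_2(r)\le\ldots\le\Phi_n(r)\to r\quad\text{as }n\to\infty.$$

For each $n\ge 1$, set

$$\Phi_n(A):=\int_0^\infty\Phi_n(\lambda)E(d\lambda)\in\mathcal{L}(H),$$

so that $\Phi_n(A)=\Phi_n(A)^*\ge 0$. In the limit as $n\to\infty$, one has $T^{1/2}\Phi_n(A)T^{1/2}\to T^{1/2}AT^{1/2}$ weakly, and

$$\operatorname{trace}_H(T\Phi_n(A))\to\operatorname{trace}_H(T^{1/2}AT^{1/2}).$$

Proof.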

Since $E$ is a resolution of the identity on $[0,+\infty)$, and since $\Phi_n$ is continuous, bounded and with values in $[0,+\infty)$, the operators $\Phi_n(A)$ satisfy

$$0\le\Phi_1(A)\le\Phi_2(A)\le\ldots\le\Phi_n(A)=\Phi_n(A)^*\le\Big(\sup_{z\ge 0}\Phi_n(z)\Big)I_H.$$

Set $R_n:=T^{1/2}\Phi_n(A)T^{1/2}$; by definition, one has $R_n=R_n^*\in\mathcal{L}^1(H)$ and

$$0\le R_1\le R_2\le\ldots\le R_n\le\ldots$$

On account of (66), one has

$$\operatorname{trace}_H(R_n)=\sum_{j\ge 1}\tau_j\int_0^\infty\Phi_n(\lambda)\langle e_j|E(d\lambda)|e_j\rangle\le\sum_{j\ge 1}\tau_j\int_0^\infty\lambda\,\langle e_j|E(d\lambda)|e_j\rangle<\infty.$$

By Lemma B.1, one has $R_n\to R$ weakly, with $R\in\mathcal{L}^1(H)$ and $R=R^*\ge 0$. Finally

$$T^{1/2}AT^{1/2}-R_n=\sum_{j,k\ge 1}\tau_j^{1/2}\tau_k^{1/2}\left(\int_0^\infty(\lambda-\Phi_n(\lambda))\langle e_j|E(d\lambda)|e_k\rangle\right)|e_j\rangle\langle e_k|$$

using the definition (67) of $T^{1/2}AT^{1/2}$ given in Lemma C.1, so that

$$\begin{aligned}
\langle x|T^{1/2}AT^{1/2}-R_n|x\rangle&=\int_0^\infty(\lambda-\Phi_n(\lambda))\Big\langle\sum_{j\ge 1}\tau_j^{1/2}\langle e_j|x\rangle e_j\Big|E(d\lambda)\Big|\sum_{k\ge 1}\tau_k^{1/2}\langle e_k|x\rangle e_k\Big\rangle\\
&=\int_0^\infty(\lambda-\Phi_n(\lambda))\langle T^{1/2}x|E(d\lambda)|T^{1/2}x\rangle\ge 0.
\end{aligned}$$

Hence $T^{1/2}AT^{1/2}-R_n=(T^{1/2}AT^{1/2}-R_n)^*\in\mathcal{L}^1(H)$ with $T^{1/2}AT^{1/2}-R_n\ge 0$, and

$$\|T^{1/2}AT^{1/2}-R_n\|_1=\operatorname{trace}_H(T^{1/2}AT^{1/2}-R_n)=\sum_{j\ge 1}\tau_j\int_0^\infty(\lambda-\Phi_n(\lambda))\langle e_j|E(d\lambda)|e_j\rangle\to 0$$

as $n\to\infty$ by monotone convergence. Therefore $R_n\to T^{1/2}AT^{1/2}$ in $\mathcal{L}^1(H)$, and

$$\operatorname{trace}_H(T\Phi_n(A))=\operatorname{trace}_H(T^{1/2}\Phi_n(A)T^{1/2})\to\operatorname{trace}_H(T^{1/2}AT^{1/2})$$

as $n\to\infty$. $\square$

Lemma C.3.

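Let $S\in\mathcal{L}^1(H\otimes\tilde H)$ satisfy the partial trace condition

$$\operatorname{trace}_{\tilde H}(S)=T.$$

Then

$$0\le S^{1/2}(A\otimes I_{\tilde H})S^{1/2}=\big(S^{1/2}(A\otimes I_{\tilde H})S^{1/2}\big)^*\in\mathcal{L}^1(H\otimes\tilde H)$$

and one has

$$\operatorname{trace}_{H\otimes\tilde H}\big(S^{1/2}(A\otimes I_{\tilde H})S^{1/2}\big)=\operatorname{trace}_H(T^{1/2}AT^{1/2}).$$

Proof.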

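For all $n\ge 1$, set $A_n=\Phi_n(A)\in\mathcal{L}(H)$, with

$$\Phi_n(r):=\frac{r}{1+r/n}\quad\text{for all }r\ge 0.$$

Thus $A_n=A_n^*$ and

$$0\le A_1\le A_2\le\ldots\le A_n\le\ldots$$

Hence $S^{1/2}(A_n\otimes I_{\tilde H})S^{1/2}=\big(S^{1/2}(A_n\otimes I_{\tilde H})S^{1/2}\big)^*$ for all $n\ge 1$, with

$$0\le S^{1/2}(A_1\otimes I_{\tilde H})S^{1/2}\le S^{1/2}(A_2\otimes I_{\tilde H})S^{1/2}\le\ldots\le S^{1/2}(A_n\otimes I_{\tilde H})S^{1/2}\le\ldots$$

The partial trace condition implies that

$$\operatorname{trace}_{H\otimes\tilde H}\big(S^{1/2}(A_n\otimes I_{\tilde H})S^{1/2}\big)=\operatorname{trace}_{H\otimes\tilde H}\big(S(A_n\otimes I_{\tilde H})\big)=\operatorname{trace}_H(TA_n)=\operatorname{trace}_H(T^{1/2}A_nT^{1/2}),$$

while

$$\begin{aligned}
\operatorname{trace}_{H\otimes\tilde H}\big(S^{1/2}(A_n\otimes I_{\tilde H})S^{1/2}\big)&\to\operatorname{trace}_{H\otimes\tilde H}\big(S^{1/2}(A\otimes I_{\tilde H})S^{1/2}\big),\\
\operatorname{trace}_H(T^{1/2}A_nT^{1/2})&\to\operatorname{trace}_H(T^{1/2}AT^{1/2})
\end{aligned}$$

as $n\to\infty$ by Lemma B.1. This implies the announced equality by uniqueness of the limit. $\square$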
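The two mechanisms used in this appendix can be checked in a finite-dimensional toy model, where every operator is bounded and the statements reduce to elementary linear algebra. The sketch below is only an illustration under stated assumptions, not part of the proofs: $A$ is taken diagonal with eigenvalues `lam`, so that the spectral measure $\langle e_j|E(\cdot)|e_j\rangle$ puts mass $|e_j[m]|^2$ at `lam[m]`. The helper names (`householder`, `phi`, `weighted_trace`, `partial_trace`) and all numerical values are hypothetical choices made for this example. The first part checks that the sums on the right-hand side of (68), with $\lambda$ replaced by $\Phi_n(\lambda)=\lambda/(1+\lambda/n)$, increase with $n$ towards the value in (68). The second part checks the identity $\operatorname{trace}(S(A\otimes I))=\operatorname{trace}(TA)$ when $T$ is the partial trace of $S$.

```python
def householder(v):
    """Symmetric orthogonal matrix I - 2 v v^T / |v|^2; column j plays the role of e_j."""
    d = len(v)
    norm2 = sum(c * c for c in v)
    return [[(1.0 if i == j else 0.0) - 2.0 * v[i] * v[j] / norm2
             for j in range(d)] for i in range(d)]

def phi(n, r):
    """Bounded approximation Phi_n(r) = r / (1 + r/n), increasing in n, with limit r."""
    return r / (1.0 + r / n)

def weighted_trace(tau, basis, lam, f):
    """sum_j tau_j * sum_m f(lam[m]) * e_j[m]^2, i.e. trace(T f(A)) for A = diag(lam)."""
    d = len(lam)
    return sum(tau[j] * sum(f(lam[m]) * basis[m][j] ** 2 for m in range(d))
               for j in range(d))

def partial_trace(S, d, m):
    """T[i][k] = sum_a S[(i,a)][(k,a)], with the index convention (i, a) -> i*m + a."""
    return [[sum(S[i * m + a][k * m + a] for a in range(m)) for k in range(d)]
            for i in range(d)]

# Part 1: monotone convergence of trace(T Phi_n(A)) (illustrative values).
lam = [0.0, 1.0, 10.0, 100.0]          # eigenvalues of the toy operator A
tau = [0.4, 0.3, 0.2, 0.1]             # eigenvalues of the toy operator T
basis = householder([1.0, 2.0, 3.0, 4.0])

limit = weighted_trace(tau, basis, lam, lambda r: r)
approx = [weighted_trace(tau, basis, lam, lambda r, n=n: phi(n, r))
          for n in (1, 10, 100, 1000, 10 ** 6)]

# Part 2: partial trace identity on C^2 tensor C^2.
d, m = 2, 2
B = [[1.0, 2.0, 0.0, 1.0],
     [0.0, 1.0, 3.0, 1.0],
     [2.0, 0.0, 1.0, 1.0],
     [1.0, 1.0, 1.0, 0.0]]
# S = B^T B is symmetric and positive semidefinite.
S = [[sum(B[r][p] * B[r][q] for r in range(d * m)) for q in range(d * m)]
     for p in range(d * m)]
A_small = [[2.0, 1.0], [1.0, 5.0]]
T = partial_trace(S, d, m)

# trace(S (A tensor I)) and trace(T A), written out entrywise.
lhs = sum(S[i * m + a][k * m + a] * A_small[k][i]
          for i in range(d) for k in range(d) for a in range(m))
rhs = sum(T[i][k] * A_small[k][i] for i in range(d) for k in range(d))

print("approximations:", approx, "limit:", limit)
print("trace(S (A x I)) =", lhs, " trace(T A) =", rhs)
```

In finite dimension, $\operatorname{trace}(S^{1/2}XS^{1/2})=\operatorname{trace}(SX)$ by cyclicity of the trace, so no matrix square root is needed in this check. For unbounded $A$ this shortcut is not available directly, which is why the proof above first applies it to the bounded operators $A_n$ and then passes to the limit.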

(E.C.) Dipartimento di Matematica, Sapienza Università di Roma, P.le A. Moro 5, 00185 Roma, Italy

(F.G.) Centre de Mathématiques Laurent Schwartz, École polytechnique, route de Saclay, 91128 Palaiseau Cedex, France

(T.P.) Laboratoire Jacques-Louis Lions, Sorbonne Universités & CNRS, boîte courrier 187, 75252 Paris Cedex 05, France