Geometry from divergence functions and complex structures
Florio M. Ciaglia, Fabio Di Cosmo, Armando Figueroa, Giuseppe Marmo, Luca Schiavone
Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany
ICMAT, Instituto de Ciencias Matemáticas (CSIC-UAM-UC3M-UCM)
Depto. de Matemáticas, Univ. Carlos III de Madrid, Leganés, Madrid, Spain
Dipartimento di Fisica “E. Pancini”, Università di Napoli Federico II, Naples, Italy
INFN-Sezione di Napoli, Naples, Italy

e-mail: florio.m.ciaglia[at]gmail.com and ciaglia[at]mis.mpg.de
e-mail: fabiodicosmo[at]gmail.com
e-mail: figueroarmando[at]yahoo.com.mx
e-mail: marmo[at]na.infn.it
e-mail: lucaschiavone[at]live.it

Abstract
Motivated by the geometrical structures of quantum mechanics, we introduce an almost-complex structure $J$ on the product $M \times M$ of any parallelizable statistical manifold $M$. Then, we use $J$ to extract a pre-symplectic form and a metric-like tensor on $M \times M$ from a divergence function. These tensors may be pulled back to $M$, and we compute them in the case of an $N$-dimensional simplex with respect to the Kullback-Leibler relative entropy, and in the case of (a suitable unfolding space of) the manifold of faithful density operators with respect to the von Neumann-Umegaki relative entropy.

If available, please cite the published version.
arXiv [quant-ph]

Introduction
In information geometry, it is customary to consider Riemannian metric tensors on (suitable submanifolds of) the space of probability distributions on some measure space in order to introduce a notion of distance or distinguishability among different probability distributions. The idea of distance between probability distributions goes back to Fisher and has been elaborated by Rao [32], Cencov [13], and Amari and Nagaoka [1], to name just a few.

Let us briefly recall the classical setting in the simple case of a discrete, finite sample space $\mathcal{X} = \{1, \dots, N\}$. In this case, an arbitrary probability distribution on $\mathcal{X}$ may be identified with a probability vector $p = (p_1, \dots, p_N)$, where $p_j \in [0, 1]$ for all $j = 1, \dots, N$, and $\sum_j p_j = 1$. The collection of probability distributions is thus in one-to-one correspondence with an $(N-1)$-dimensional simplex $\Delta$ in $\mathbb{R}^N$. The open interior $\Delta_+$ of $\Delta$, made up of all those probability vectors with strictly positive components, is a smooth manifold of dimension $(N-1)$, on which the Fisher-Rao metric tensor reads
\[
g_{FR} = \sum_j p_j \, \mathrm{d}\ln p_j \otimes \mathrm{d}\ln p_j = \sum_j \frac{1}{p_j}\, \mathrm{d}p_j \otimes \mathrm{d}p_j , \tag{1}
\]
which is related to (four times) the round metric tensor on (an open submanifold of) the $n$-dimensional sphere, $g = 4 \sum_j \mathrm{d}x_j \otimes \mathrm{d}x_j$, where $x_j = \sqrt{p_j}$.

In the case of a discrete, finite measure space, an important theorem by Cencov states that if we ask the distinguishability between probability distributions not to increase under the action of stochastic maps, then the metric tensor is necessarily a multiple of the so-called Fisher-Rao tensor. This theorem has also been generalized to the case of a non-discrete measure space, provided some additional conditions are met [6, 9].

In the quantum case, the situation is completely different. Indeed, when we pass from probability distributions to quantum states, that is, to density operators on the Hilbert space of the system, already in the finite-dimensional case it is possible to prove that Cencov’s theorem is maximally violated, in the sense that there is an infinite number of metric tensors satisfying the quantum analogue of the monotonicity property under classical stochastic maps [28, 31]. This means that, in the quantum case, there is additional freedom in choosing a relevant metric tensor, as long as the classical Fisher-Rao metric tensor is recovered when we perform a quantum-to-classical limit.
For instance, as we will recall in section 2 and section 4, this is precisely what happens for the Fubini-Study metric tensor on the space of pure quantum states, and for the metric tensor on the space of (faithful) density operators which is associated with the von Neumann-Umegaki relative entropy.

In this contribution, we will review the geometrical aspects of classical and quantum information theory, and we will exploit the parallel between classical and quantum information geometry to argue that the geometry of the quantum case leads to the definition of additional geometric structures in the classical case. Specifically, we will take inspiration from the geometry of the pure quantum states in order to build an almost complex structure on the product $M \times M$ of any parallelizable statistical manifold $M$, by means of which we may extract a pre-symplectic form and a symmetric (0,2) tensor field on $M \times M$ starting from a divergence function (relative entropy). In particular, we will compute the pre-symplectic form on the $N$-simplex which is associated with the Kullback-Leibler relative entropy. Furthermore, we will also consider the case of faithful density operators, and compute the metric tensor on $M$ associated with the von Neumann-Umegaki relative entropy.
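The relation stated after equation (1), between the Fisher-Rao tensor and four times the round metric in the coordinates $x_j = \sqrt{p_j}$, can be checked numerically. The following is a minimal sketch in plain Python; the function names, the sample point `p` and the tangent vectors `u`, `v` (whose components sum to zero) are illustrative assumptions, not part of the original text.

```python
import math

def fisher_rao(p, u, v):
    """g_FR(u, v) = sum_j u_j v_j / p_j, the second form in equation (1)."""
    return sum(uj * vj / pj for pj, uj, vj in zip(p, u, v))

def fisher_rao_log_form(p, u, v):
    """First form in equation (1): sum_j p_j (d ln p_j)(u) (d ln p_j)(v),
    using (d ln p_j)(u) = u_j / p_j."""
    return sum(pj * (uj / pj) * (vj / pj) for pj, uj, vj in zip(p, u, v))

def four_times_round_metric(p, u, v):
    """4 sum_j dx_j(u) dx_j(v) with x_j = sqrt(p_j), so dx_j(u) = u_j / (2 sqrt(p_j))."""
    return 4.0 * sum(
        (uj / (2.0 * math.sqrt(pj))) * (vj / (2.0 * math.sqrt(pj)))
        for pj, uj, vj in zip(p, u, v)
    )

# Hypothetical sample data: a point in the open simplex and two tangent vectors.
p = [0.1, 0.2, 0.3, 0.4]
u = [0.3, -0.1, -0.4, 0.2]
v = [-0.2, 0.5, -0.1, -0.2]

value_fr = fisher_rao(p, u, v)
value_log = fisher_rao_log_form(p, u, v)
value_sphere = four_times_round_metric(p, u, v)
```

The three quantities agree up to floating-point rounding, which is just the statement that $\mathrm{d}x_j = \mathrm{d}p_j / (2\sqrt{p_j})$.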
Remarks on the geometry of pure states
Here, we will recall some of the basic ingredients of the so-called geometrization of quantum mechanics [2, 11, 23]. We will focus on two aspects which will be further explored in the following sections.

On the one hand, we will introduce the idea according to which it is possible to recover the geometrical structures of the classical case, e.g., the Fisher-Rao metric tensor, starting from the quantum ones by means of a suitable immersion of probability distributions into quantum states. In this section, we will develop this idea in the context of pure states, while in section 4 we will present an extension of this idea in the context of mixed states.

On the other hand, we will exploit the geometrical structure of the space of pure quantum states in order to highlight the role of the complex structure in the definition of geometrical tensors starting from functions. In section 3, we will start from this idea in order to reformulate what is usually done in the context of information geometry by introducing an almost-complex structure on the Cartesian product $M \times M$ of a statistical manifold $M$ with itself. In this way, we will be able to define a metric-like tensor and a pre-symplectic tensor on $M \times M$ that may be pulled back to $M$.

In standard quantum mechanics [18, 20], a Hilbert space $\mathcal{H}$ is associated with a quantum system, and observables are identified with self-adjoint operators on $\mathcal{H}$. According to Dirac, the linear structure of $\mathcal{H}$ is crucial to describe the superposition principle of quantum mechanics. Specifically, if $\psi, \phi \in \mathcal{H}$ are vectors representing two states of the system, the linear structure of $\mathcal{H}$ allows us to say that $\psi + \phi \in \mathcal{H}$ represents another admissible state for the system.
This way of looking at states as vectors in $\mathcal{H}$ may be satisfying from the point of view of the superposition principle, but it is not fully compatible with the statistical content of quantum mechanics. Indeed, one of the fundamental prescriptions in quantum mechanics is that the quantity
\[
\langle A \rangle_\psi := \langle \psi | A | \psi \rangle , \tag{2}
\]
where $A$ is a self-adjoint linear operator on $\mathcal{H}$ and $\langle \, , \rangle$ is the Hilbert product on $\mathcal{H}$, has to be interpreted as the expectation value of the observable $A$ on the state $\psi$. Then, relying on the spectral decomposition of $A$ given by
\[
A = \int_{\sigma(A)} \lambda \, \mathrm{d}E_A(\lambda) , \tag{3}
\]
where $\sigma(A) \subseteq \mathbb{R}$ is the spectrum of $A$ and $E_A$ is the projection-valued measure associated with $A$ [33], the quantity
\[
\mu(\psi, E_A(O)) = \langle \psi | E_A(O) | \psi \rangle , \tag{4}
\]
where $O$ is a measurable subset of $\sigma(A)$, is interpreted as the probability that a measurement of $A$ on $\psi$ gives an outcome which is in $O$. Therefore, in order to make this picture consistent, we must have
\[
\mu(\psi, E_A(\sigma(A))) = 1 , \tag{5}
\]
and thus, since $E_A(\sigma(A))$ is the identity operator $\mathbb{I}$ on $\mathcal{H}$ for every observable $A$, we must have
\[
\langle \psi, \psi \rangle = 1 , \tag{6}
\]
that is, the vector $\psi$ must be normalized. Consequently, the probabilistic-statistical interpretation of quantum mechanics forces us to leave the linear space $\mathcal{H}$ and pass to the nonlinear manifold given by normalized vectors in $\mathcal{H}$, that is, to the unit sphere in $\mathcal{H}$. However, this is not the end of the story. Indeed, if we look at equation (4), we notice that the measure $\mu(\psi, E_A(O))$ is equal to the measure $\mu(\phi, E_A(O))$ if we consider the vector $\phi := e^{i\theta}\psi$ with $\theta \in \mathbb{R}$, and that $\phi$ is still a normalized vector in $\mathcal{H}$. This means that we have an additional $U(1)$ symmetry which we can dispose of, so that the mathematical object that correctly describes a (pure) quantum state is the equivalence class $[\psi]$ associated with $\psi$ with respect to the action of $\mathbb{C}_0 = \mathbb{R}_+ \times U(1)$ on $\mathcal{H}_0 = \mathcal{H} - \{0\}$ given by scalar multiplication.
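The phase invariance of the measure in equation (4) and the normalization condition in equations (5) and (6) can be illustrated with a small numerical sketch. It assumes, purely for illustration, an observable with non-degenerate spectrum whose eigenvectors are the basis vectors $e_j$, so that the projectors are $E_j = |e_j\rangle\langle e_j|$; the vector `raw` and the angle 0.7 are arbitrary choices.

```python
import cmath
import math

def inner(phi, psi):
    """Hilbert product, anti-linear in the first entry and linear in the second."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def outcome_probabilities(psi):
    """mu(psi, E_j) = <psi|E_j|psi> = |<e_j, psi>|^2 for the assumed projectors E_j."""
    return [abs(component) ** 2 for component in psi]

# Hypothetical unnormalized vector in a 3-dimensional Hilbert space.
raw = [1 + 2j, -0.5j, 0.3 + 0.1j]
norm = math.sqrt(inner(raw, raw).real)
psi = [component / norm for component in raw]      # equation (6): <psi, psi> = 1
phi = [cmath.exp(0.7j) * component for component in psi]  # phi = e^{i theta} psi

probabilities_psi = outcome_probabilities(psi)
probabilities_phi = outcome_probabilities(phi)
total_probability = sum(probabilities_psi)         # equation (5)
```

Both vectors give the same list of probabilities, and the probabilities sum to one, so `psi` and `phi` cannot be distinguished by the statistics of any measurement of this kind.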
Eventually, we obtain that the statistical interpretation of quantum mechanics forces us to describe (pure) quantum states as points in the complex projective space $\mathcal{P}(\mathcal{H}) = \mathcal{H}_0 / \mathbb{C}_0$ associated with $\mathcal{H}$.

On this nonlinear manifold, the superposition principle is clearly not applicable in the same way as it is on $\mathcal{H}$. However, it can be proved [26] that there is a formulation of the superposition principle on $\mathcal{P}(\mathcal{H})$ which requires the specification of a third reference state. From this point of view, the Hilbert space $\mathcal{H}$ seems to be a very useful computational tool to express the superposition principle, and this simplicity is gained at the expense of a redundant description of quantum states.

Once we accept that (pure) states in quantum mechanics are points in $\mathcal{P}(\mathcal{H})$, we may start to uncover the geometrical structures that are “naturally” present on this manifold. To avoid technical difficulties, in the sequel we shall always assume $\dim(\mathcal{H}) = N < \infty$.

On the one hand, the Hermitian product of $\mathcal{H}$ does not play any role in defining the manifold of pure states; it is only the action of $\mathbb{C}_0$ on the vector space $V$ underlying $\mathcal{H}$ that enters the game. From the mathematical point of view, $\mathcal{H}$ is an $N$-dimensional, complex vector space, say $V$, endowed with a Hermitian product denoted by $\langle \psi, \phi \rangle$ which is, by convention, $\mathbb{C}$-linear with respect to the second entry and anti-linear with respect to the first entry. The group of isometries of $\langle \, , \rangle$ is a compact subgroup of the complex general linear group of $V$, called the unitary group and denoted by $\mathcal{U}(\mathcal{H})$ in order to emphasize that its definition depends on the Hermitian structure $\langle \, , \rangle$ on $V$ (contrary to the complex general linear group, which is defined for a generic complex vector space $V$).

If $\{e_j\}_{j=1,\dots,N}$ is a Hermitian basis for $\mathcal{H} = (V, \langle \, , \rangle)$, the corresponding coordinates for an element $\psi$ are written as $\langle e_j, \psi \rangle = q^j + i p^j$ with $(q^j, p^j)$ real numbers.
Therefore, the Hilbert space $\mathcal{H}$ can be studied as a real, $2N$-dimensional linear manifold with a global coordinate chart given as above. The smooth action of $\mathbb{C}_0$ on $V$ is given by $\psi \mapsto \alpha\psi$ with $\alpha \in \mathbb{C}_0$, and the infinitesimal generators of the action are the linear vector fields
\[
\Delta = q^j \frac{\partial}{\partial q^j} + p^j \frac{\partial}{\partial p^j} , \qquad \Gamma = p^j \frac{\partial}{\partial q^j} - q^j \frac{\partial}{\partial p^j} . \tag{7}
\]
The vector field $\Gamma$ implements the phase rotations, while $\Delta$ implements the dilations and describes the linear structure of the underlying vector space $V$ [12]. The vector fields $\Delta$ and $\Gamma$ determine an involutive distribution $\mathcal{D}$ (they commute because $\mathbb{C}_0$ is Abelian), and they are complete because they are linear vector fields. However, we need to discard the null vector of $V$ (the unique fixed point of $\Delta$ and $\Gamma$) in order for the quotient with respect to the action of $\mathbb{C}_0$ to be a smooth manifold. In the following, $V_0$ will denote the space obtained from the vector space $V$ after removing the null vector. The resulting space $V_0 / \mathbb{C}_0$ is the so-called complex projective space for the complex vector space $V$, and we denote it by $\mathbb{CP}(V)$ in order to emphasize the fact that the manifold structure of the complex projective space depends on the complex vector space structure of $V$ and does not depend on the Hermitian product turning $V$ into the Hilbert space $\mathcal{H}$. The canonical projection map from $V_0$ to $\mathbb{CP}(V)$ will be denoted by $\pi$.

On the other hand, the Hermitian product $\langle \, , \rangle$ determines a Hermitian tensor $H$ that reads
\[
H = \sum_{j=1}^{N} \left[ (\mathrm{d}q^j \otimes \mathrm{d}q^j + \mathrm{d}p^j \otimes \mathrm{d}p^j) + i \, (\mathrm{d}q^j \otimes \mathrm{d}p^j - \mathrm{d}p^j \otimes \mathrm{d}q^j) \right] = g + i\,\omega , \tag{8}
\]
where $g$ is a Riemannian metric, and $\omega$ a symplectic structure. The (1,1) tensor field
\[
J = \mathrm{d}q^k \otimes \frac{\partial}{\partial p^k} - \mathrm{d}p^k \otimes \frac{\partial}{\partial q^k} \tag{9}
\]
is such that $J^2 = -\mathrm{Id}$, and it determines a complex structure compatible with $g$ and $\omega$ in the sense that
\[
g(Ju, v) = \omega(u, v) , \qquad g(Ju, Jv) = g(u, v) , \qquad \omega(Ju, Jv) = \omega(u, v) \tag{10}
\]
for every pair $(u, v)$ of vector fields. The triple $(J, g, \omega)$ determines a Kähler structure on $V$. Upon considering the contravariant tensor fields $\Lambda = \omega^{-1}$, $G = g^{-1}$, it is possible to show [27] that $\widetilde{\Lambda} = R\,\Lambda$, $\widetilde{G} = R\,G$, with $R = g(\Delta, \Delta)$, are tensor fields “projectable” with respect to the projection map $\pi$. This means that there are two tensor fields $\Lambda_\pi$ and $G_\pi$ on $\mathbb{CP}(V)$ such that $\widetilde{\Lambda}$ is $\pi$-related with $\Lambda_\pi$, $\widetilde{G}$ is $\pi$-related with $G_\pi$, and $J$ is $\pi$-related with $J_\pi$ [27]. Furthermore, $\Lambda_\pi$ and $G_\pi$ are invertible, and their inverses are a symplectic form $\omega_\pi$ and a Riemannian metric tensor $g_\pi$ on $\mathbb{CP}(V)$ (the so-called Fubini-Study metric on the complex projective space), respectively, and there is a complex structure $J_\pi$ on $\mathbb{CP}(V)$ that is compatible with $\omega_\pi$ and $g_\pi$ (see equation (10)). Essentially, the triple $(J_\pi, g_\pi, \omega_\pi)$ determines a Kähler structure on $\mathbb{CP}(V)$ which clearly depends on the Hermitian product $\langle \, , \rangle$.

Note that all the linear vector fields generating the smooth, left action of $\mathcal{U}(\mathcal{H})$ on $V$ ($\psi \mapsto U\psi$) commute with $\Delta$ and $\Gamma$, and thus are “projectable” on $\mathbb{CP}(V)$ and determine a smooth left action of $\mathcal{U}(\mathcal{H})$ on $\mathbb{CP}(V)$. It can be proved [19] that the Fubini-Study metric $g_\pi$ is, up to a constant factor, the unique Riemannian metric on $\mathbb{CP}(V)$ which is invariant with respect to the action of $\mathcal{U}(\mathcal{H})$.

The space of pure quantum states is then the complex projective space $\mathbb{CP}(V)$ endowed with the Kähler structure given above, and will be denoted by $\mathcal{P}(\mathcal{H})$ to emphasize the fact that the Kähler structure depends on the Hilbert space $\mathcal{H}$.

On the punctured Hilbert space $\mathcal{H}_0 = \mathcal{H} - \{0\}$, we may define the following Hermitian tensor
\[
h = \frac{\langle \mathrm{d}\psi | \mathrm{d}\psi \rangle}{\langle \psi | \psi \rangle} - \frac{\langle \mathrm{d}\psi | \psi \rangle \langle \psi | \mathrm{d}\psi \rangle}{\langle \psi | \psi \rangle^2} . \tag{11}
\]
The relevance of this tensor stems from the fact that, quite interestingly, its real part is the pullback to $\mathcal{H}_0$ of the Fubini-Study metric $g_\pi$ on $\mathcal{P}(\mathcal{H})$, while its imaginary part is the pullback to $\mathcal{H}_0$ of the symplectic form $\omega_\pi$ defining the canonical Kähler structure on $\mathcal{P}(\mathcal{H})$ [19]. Consequently, we may look at $h$ as an unfolding tensor for $g_\pi$ and $\omega_\pi$, and this way of looking at $h$ is particularly relevant when we want to address the issue of recovering the classical case from the quantum one [21]. Specifically, given an arbitrary probability vector $p = (p_1, \dots, p_N)$, we consider a sort of complex-valued square root of $p$ given by the complex vector $(e^{i\theta_1}\sqrt{p_1}, \dots, e^{i\theta_N}\sqrt{p_N})$, that is, we replace the probability vector with a “probability amplitude” vector
\[
\psi_p = \sum_{j=1}^{N} z_j | e_j \rangle = \sum_{j=1}^{N} e^{i\theta_j} \sqrt{p_j} \, | e_j \rangle . \tag{12}
\]
From the mathematical point of view, we may interpret this procedure as a nonlinear change of coordinates in $\mathcal{H}_0$, so that a direct computation reveals that, in this new coordinate system, the expression of $h$ is
\[
\begin{aligned}
h ={}& \frac{1}{4} \left( \langle \mathrm{d}\ln p \otimes \mathrm{d}\ln p \rangle_p - \langle \mathrm{d}\ln p \rangle_p \otimes \langle \mathrm{d}\ln p \rangle_p \right) + \langle \mathrm{d}\theta \otimes \mathrm{d}\theta \rangle_p - \langle \mathrm{d}\theta \rangle_p \otimes \langle \mathrm{d}\theta \rangle_p \\
& + i \left( \langle \mathrm{d}\ln p \wedge \mathrm{d}\theta \rangle_p - \langle \mathrm{d}\ln p \rangle_p \wedge \langle \mathrm{d}\theta \rangle_p \right) ,
\end{aligned} \tag{13}
\]
where $\langle \cdot \rangle_p$ denotes the expectation value with respect to the probability distribution $p$. From this, it follows that the real part of $h$ is, up to the factor $1/4$, nothing but the Fisher-Rao metric tensor whenever $\mathrm{d}\theta = 0$. According to the results in [21], this procedure may also be extended to the case where $\mathcal{H}$ is an infinite-dimensional Hilbert space of square-integrable functions on some measure space.

Another relevant aspect of the geometry of pure quantum states is related to the possibility of describing the real and imaginary parts of the Hermitian tensor $h$ by means of a potential function and the complex structure $J$ on $\mathcal{H}_0$ given in equation (9). Specifically, given any function $F$ on $\mathcal{H}_0$, we define the (0,
2) tensor
\[
\omega_F := \mathrm{d}\,\mathrm{d}_J F = \mathrm{d}\,(J \circ \mathrm{d}F) . \tag{14}
\]
This covariant tensor is clearly closed, and is easily seen to be antisymmetric, because it is the exterior differential of a 1-form. Then, we may define a (0,2) tensor field $g_F$ by setting
\[
g_F(X, Y) = \omega_F(X, J(Y)) . \tag{15}
\]
If the function $F$ is such that
\[
\omega_F(X, J(Y)) = -\,\omega_F(J(X), Y) , \tag{16}
\]
then $g_F$ is a symmetric tensor, and the triple $(J, \omega_F, g_F)$ defines a sort of Kähler structure on $\mathcal{H}_0$. For instance, we may consider the function
\[
F = \ln(\langle \psi | \psi \rangle) , \tag{17}
\]
and a direct computation shows that $\omega_F$ coincides with the imaginary part of $h$, while $g_F$ coincides with the real part of $h$ given in equation (11).

Note that $F$ is not the pullback of a function on the complex projective space $\mathcal{P}(\mathcal{H})$, because $\ln(\langle \psi | \psi \rangle)$ changes by an additive constant under dilation. Furthermore, it should be clear that the procedure just outlined will work on any manifold $M$ admitting a (1,1) tensor $J$ such that
\[
J^2 = -\mathrm{Id} . \tag{18}
\]
We will exploit this observation in the following section.

In the last part of the previous section, we saw how the complex structure $J$ allows us to build a closed two-form and a symmetric (0,
2) tensor field on $\mathcal{H}_0$ starting from a single real-valued, smooth function on the manifold.

[Footnote: As, in general, the two tensor fields may be degenerate, we need to “enlarge” the definition of the triple defining the Kähler structure.]

Here, we will adapt this idea to the case of statistical manifolds in the context of information geometry. A (naked) statistical manifold $M$ is just a smooth manifold the points of which parametrize, in a one-to-one way, a subset of probability distributions on some given outcome space $\mathcal{X}$. A typical example is given by the statistical manifold of Gaussian probability distributions on $\mathbb{R}$, where a point in $M$ is given by $(\mu, \sigma)$, where $\mu$ is the mean and $\sigma$ the variance of the Gaussian.

It is customary to fix a reference measure $\mu$ on $\mathcal{X}$ in such a way that the points in $M$ parametrize a family of probability distributions by means of the map
\[
m \mapsto p(x, m) \, \mathrm{d}\mu , \tag{19}
\]
where $p(x, m)$ is a function on $\mathcal{X}$ depending parametrically on $m \in M$. Then, this immersion of $M$ into the space of probability distributions on $\mathcal{X}$ allows us to define a geometrical structure on $M$, namely, the Fisher-Rao metric tensor $g_{FR}$ given by
\[
(g_{FR})_{jk}(\xi(m)) := \int_{\mathcal{X}} p(x, \xi(m)) \, \frac{\partial \ln(p(x, \xi(m)))}{\partial \xi^j} \, \frac{\partial \ln(p(x, \xi(m)))}{\partial \xi^k} \, \mathrm{d}\mu(x) , \tag{20}
\]
where $\{\xi^j\}_{j=1,\dots,\dim(M)}$ is a coordinate chart for $M$. It can be proved that this expression transforms as a (0,2) tensor under coordinate change [6].

Quite remarkably, the Fisher-Rao metric tensor on $M$ may also be “extracted” from a suitable two-point function $D$ (i.e., a function on $M \times M$), called a divergence function, or a contrast function, which, in some cases, may be interpreted as a relative entropy. Specifically, a divergence function $D$ is a real-valued, smooth function on $M \times M$ such that (see section 3.2 in [1])
\[
D(m_1, m_2) \geq 0 \quad \forall \, (m_1, m_2) \in M \times M , \tag{21}
\]
and such that the equality in the previous equation holds iff $m_1 = m_2$.

Starting from a divergence function $D$, it is possible to extract a metric tensor $g$ on $M$ by setting $g = g_{jk} \, \mathrm{d}q^j \otimes \mathrm{d}q^k$ with
\[
g_{jk} = \left. \left( \frac{\partial^2 D}{\partial x^j \partial x^k} \right) \right|_{\mathrm{diag}} = \left. \left( \frac{\partial^2 D}{\partial y^j \partial y^k} \right) \right|_{\mathrm{diag}} = - \left. \left( \frac{\partial^2 D}{\partial x^j \partial y^k} \right) \right|_{\mathrm{diag}} , \tag{22}
\]
where $(q^j)$ is a coordinate chart on $M$, $(x^j, y^j)$ is a coordinate chart on $M \times M$ which is adapted to the product structure of the manifold, and $|_{\mathrm{diag}}$ denotes the restriction to the diagonal of $M \times M$. Note that these second derivatives patch together to form a (0,
2) tensor on $M$ because $D$ satisfies the properties characterizing a divergence function.

A paradigmatic example of a divergence function which is interpreted as a relative entropy is given by the Kullback-Leibler relative entropy
\[
S_{KL}(p, q) = \sum_{j=1}^{N} p_j \ln\!\left( \frac{p_j}{q_j} \right) . \tag{23}
\]
If we consider the statistical manifold $\Delta_+$, that is, the open interior of an $N$-dimensional simplex, then a direct computation shows that the metric $g_{KL}$ we may extract from $S_{KL}$ is precisely the Fisher-Rao metric tensor given in equation (1).

A coordinate-free formulation of this procedure is given in [14, 25], and a general formalism based on the Lie groupoid structure of $M \times M$ is presented in [22]. Here, on the other hand, we would like to give an alternative (but equivalent), coordinate-free formulation of the extraction procedure which is inspired by Kähler geometry and thus will lead us to define also a pre-symplectic form on $M \times M$ starting from a divergence function. Specifically, we assume that $M$ is a parallelizable manifold, so that we have a global basis $\{X_j\}$ of vector fields and its dually related global basis $\{\theta^j\}$ of differential 1-forms. Note that for these bases to be dually related we must have $\theta^j(X_k) = \delta^j_k$. Then, denoting by $pr_l$ and $pr_r$ the left and right projections from $M \times M$ to $M$, respectively, we immediately obtain a global basis $\{\mathbb{X}_j, \mathbb{Y}_k\}$ of vector fields on $M \times M$ by setting
\[
\mathbb{X}_j(pr_l^*(f)) = pr_l^*(X_j f) , \qquad \mathbb{Y}_j(pr_r^*(f)) = pr_r^*(X_j f) \qquad \forall \, f \in \mathcal{F}(M) . \tag{24}
\]
A global basis $\{\alpha^j, \beta^k\}$ of 1-forms is obtained as the dually related basis of $\{\mathbb{X}_j, \mathbb{Y}_k\}$. If $i_d : M \longrightarrow M \times M$ is the diagonal immersion given by $m \mapsto i_d(m) = (m, m)$, it follows from proposition 2 in [14] that $X_j$ is $i_d$-related with $\mathbb{X}_j + \mathbb{Y}_j$, and thus we have
\[
i_d^* \alpha^j = i_d^* \beta^j = \theta^j . \tag{25}
\]
If $(q^r)$ is a coordinate chart on $M$ for which
\[
X_j = X_j^r \frac{\partial}{\partial q^r} , \qquad \theta^j = \theta^j_r \, \mathrm{d}q^r , \tag{26}
\]
and $\{x^r, y^r\}$ is a coordinate chart on $M \times M$ adapted to its product structure, we have
\[
\mathbb{X}_j = X_j^r \frac{\partial}{\partial x^r} , \quad \alpha^j = \theta^j_r \, \mathrm{d}x^r , \qquad \mathbb{Y}_j = X_j^r \frac{\partial}{\partial y^r} , \quad \beta^j = \theta^j_r \, \mathrm{d}y^r , \tag{27}
\]
where, because of duality, we must have $X_j^r \theta^k_r = \delta^k_j$. Starting from these bases, it is possible to introduce an almost complex structure on $M \times M$, that is, a (1,1) tensor field $J$ such that $J^2 = -\mathrm{Id}$. Specifically, we set
\[
J = \alpha^j \otimes \mathbb{Y}_j - \beta^j \otimes \mathbb{X}_j , \tag{28}
\]
and a direct computation shows that $J^2 = -\mathrm{Id}$.

Note that we do not require any integrability property for $J$, and this means that $J$ is not in general a proper complex structure on $M \times M$ like the tensor field $J_\pi$ on the complex projective space in section 2; in the literature it is called a quasi-complex structure. However, the procedure outlined at the end of section 2 relies only on the fact that $J^2 = -\mathrm{Id}$ and on the compatibility between $J$ and $F$ expressed by equation (16), and not on the integrability properties of $J$. Therefore, if $F$ is any function on $M \times M$, we may still define a closed, antisymmetric 2-form $\omega^J_F$ on $M \times M$ by setting
\[
\begin{aligned}
\omega^J_F :={}& \mathrm{d}\,\mathrm{d}_J F = \mathrm{d}\,(J \circ \mathrm{d}F) = \mathrm{d}\left( \mathbb{Y}_j(F) \, \alpha^j - \mathbb{X}_j(F) \, \beta^j \right) \\
={}& \left( L_{\mathbb{X}_j} L_{\mathbb{Y}_k}(F) - \frac{1}{2} \mathbb{Y}_l(F) \, \alpha^l([\mathbb{X}_j, \mathbb{X}_k]) \right) \alpha^j \wedge \alpha^k \\
& + \left( - L_{\mathbb{X}_k} L_{\mathbb{Y}_j}(F) + \frac{1}{2} \mathbb{X}_l(F) \, \beta^l([\mathbb{Y}_j, \mathbb{Y}_k]) \right) \beta^j \wedge \beta^k \\
& + \left( L_{\mathbb{Y}_k} L_{\mathbb{Y}_j}(F) + L_{\mathbb{X}_j} L_{\mathbb{X}_k}(F) \right) \beta^k \wedge \alpha^j
\end{aligned} \tag{29}
\]
and a (0,2) tensor field $g^J_F$ on $M \times M$ by setting
\[
g^J_F(Z, W) := \omega^J_F(J(Z), W) \tag{30}
\]
for all vector fields $Z, W$ on $M \times M$.
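The algebraic content of equation (28) can be made concrete in components. The sketch below is only an illustration under stated assumptions: vector fields on $M \times M$ are represented by their components in the global basis ordered as $(\mathbb{X}_1, \dots, \mathbb{X}_n, \mathbb{Y}_1, \dots, \mathbb{Y}_n)$, 1-forms by their components in the dual basis $(\alpha^1, \dots, \alpha^n, \beta^1, \dots, \beta^n)$, and the helper names and the sample numbers are hypothetical.

```python
def almost_complex_matrix(n):
    """Matrix of J = alpha^j (x) Y_j - beta^j (x) X_j acting on component columns."""
    size = 2 * n
    J = [[0] * size for _ in range(size)]
    for j in range(n):
        J[n + j][j] = 1    # J(X_j) = Y_j
        J[j][n + j] = -1   # J(Y_j) = -X_j
    return J

def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def compose_with_J(form, J):
    """Components of the 1-form (form o J), defined by (form o J)(Z) = form(J Z)."""
    size = len(form)
    return [sum(form[r] * J[r][c] for r in range(size)) for c in range(size)]

n = 3
J = almost_complex_matrix(n)
J_squared = matmul(J, J)
minus_identity = [[-1 if i == j else 0 for j in range(2 * n)] for i in range(2 * n)]

# Hypothetical components of dF: (X_1 F, X_2 F, X_3 F, Y_1 F, Y_2 F, Y_3 F).
dF = [1, 2, 3, 10, 20, 30]
J_dF = compose_with_J(dF, J)
```

Here `J_squared` equals `minus_identity`, and `J_dF` has components $(\mathbb{Y}_j F)$ along $\alpha^j$ and $(-\mathbb{X}_j F)$ along $\beta^j$, which is the 1-form appearing in the first line of equation (29).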
An explicit computation leads to the following expression
\[
\begin{aligned}
g^J_F ={}& \left( L_{\mathbb{Y}_j} L_{\mathbb{Y}_k}(F) + L_{\mathbb{X}_k} L_{\mathbb{X}_j}(F) \right) \left( \alpha^j \otimes \alpha^k + \beta^k \otimes \beta^j \right) \\
& + \left( \mathbb{X}_l(F) \, \beta^l([\mathbb{Y}_j, \mathbb{Y}_k]) - L_{\mathbb{X}_k} L_{\mathbb{Y}_j}(F) + L_{\mathbb{X}_j} L_{\mathbb{Y}_k}(F) \right) \alpha^j \otimes \beta^k \\
& + \left( \mathbb{Y}_l(F) \, \alpha^l([\mathbb{X}_j, \mathbb{X}_k]) - L_{\mathbb{X}_j} L_{\mathbb{Y}_k}(F) + L_{\mathbb{X}_k} L_{\mathbb{Y}_j}(F) \right) \beta^j \otimes \alpha^k .
\end{aligned} \tag{31}
\]
This tensor will be symmetric whenever $J$ and $F$ satisfy the compatibility condition given in equation (16).

The possibility of obtaining a pre-symplectic form on $M \times M$ has also been discussed in [7, 30, 36]. The main difference between these works and ours is that our considerations are completely coordinate-independent and are valid for all parallelizable statistical manifolds. Moreover, we want to stress the importance of the almost complex structure $J$, whose definition is coordinate-independent, and its relation with the Kähler geometry of quantum mechanics as the basic building blocks of our procedure. This is in line with the idea that the quantum setting, being “more fundamental”, should cast its light on the classical setting, inspiring the construction of new geometrical structures on the latter, and not vice versa.

At this point, we may consider the diagonal immersion $i_d$ and take the pullback of $\omega^J_F$ and $g^J_F$ to $M$ to obtain
\[
\begin{aligned}
i_d^* \omega^J_F ={}& \left. \left( L_{\mathbb{X}_j} L_{\mathbb{Y}_k}(F) - L_{\mathbb{X}_k} L_{\mathbb{Y}_j}(F) \right) \right|_{\mathrm{diag}} \theta^j \wedge \theta^k \\
& + \left. \left( L_{\mathbb{Y}_k} L_{\mathbb{Y}_j}(F) + L_{\mathbb{X}_j} L_{\mathbb{X}_k}(F) \right) \right|_{\mathrm{diag}} \theta^k \wedge \theta^j \\
& + \frac{1}{2} \left. \left( \mathbb{X}_l(F) \, \beta^l([\mathbb{Y}_j, \mathbb{Y}_k]) - \mathbb{Y}_l(F) \, \alpha^l([\mathbb{X}_j, \mathbb{X}_k]) \right) \right|_{\mathrm{diag}} \theta^j \wedge \theta^k
\end{aligned} \tag{32}
\]
and
\[
\begin{aligned}
i_d^* g^J_F ={}& \left. \left( L_{\mathbb{Y}_j} L_{\mathbb{Y}_k}(F) + L_{\mathbb{X}_k} L_{\mathbb{X}_j}(F) \right) \right|_{\mathrm{diag}} \theta^j \otimes_s \theta^k \\
& + \frac{1}{2} \left. \left( \mathbb{Y}_l(F) \, \alpha^l([\mathbb{X}_j, \mathbb{X}_k]) + \mathbb{X}_l(F) \, \beta^l([\mathbb{Y}_j, \mathbb{Y}_k]) \right) \right|_{\mathrm{diag}} \theta^j \wedge \theta^k ,
\end{aligned} \tag{33}
\]
where we used equation (25), and we set $\theta^j \otimes_s \theta^k = \theta^j \otimes \theta^k + \theta^k \otimes \theta^j$.

According to the results in section 2 of [14], $F$ is a divergence function if we have
\[
\left. \left( L_{\mathbb{X}_j} F \right) \right|_{\mathrm{diag}} = \left. \left( L_{\mathbb{Y}_j} F \right) \right|_{\mathrm{diag}} = 0 \qquad \forall \, j = 1, \dots, \dim(M) . \tag{34}
\]
Then, $F$ is such that
\[
\begin{aligned}
\left. \left( L_{\mathbb{X}_j} L_{\mathbb{X}_k} F \right) \right|_{\mathrm{diag}} &= \left. \left( L_{\mathbb{X}_k} L_{\mathbb{X}_j} F \right) \right|_{\mathrm{diag}} = \left. \left( L_{\mathbb{Y}_k} L_{\mathbb{Y}_j} F \right) \right|_{\mathrm{diag}} = \left. \left( L_{\mathbb{Y}_j} L_{\mathbb{Y}_k} F \right) \right|_{\mathrm{diag}} \\
&= - \left. \left( L_{\mathbb{Y}_j} L_{\mathbb{X}_k} F \right) \right|_{\mathrm{diag}} = - \left. \left( L_{\mathbb{X}_k} L_{\mathbb{Y}_j} F \right) \right|_{\mathrm{diag}} \\
&= - \left. \left( L_{\mathbb{X}_j} L_{\mathbb{Y}_k} F \right) \right|_{\mathrm{diag}} = - \left. \left( L_{\mathbb{Y}_k} L_{\mathbb{X}_j} F \right) \right|_{\mathrm{diag}} ,
\end{aligned} \tag{35}
\]
and we obtain
\[
i_d^* \omega^J_F = 0 , \tag{36}
\]
and
\[
\begin{aligned}
i_d^* g^J_F &= 2 \left. \left( L_{\mathbb{X}_j} L_{\mathbb{X}_k}(F) \right) \right|_{\mathrm{diag}} \theta^j \otimes_s \theta^k \\
&= 2 \left. \left( L_{\mathbb{Y}_j} L_{\mathbb{Y}_k}(F) \right) \right|_{\mathrm{diag}} \theta^j \otimes_s \theta^k \\
&= -2 \left. \left( L_{\mathbb{X}_j} L_{\mathbb{Y}_k}(F) \right) \right|_{\mathrm{diag}} \theta^j \otimes_s \theta^k ,
\end{aligned} \tag{37}
\]
where we used proposition 5 in [14]. By comparing equation (37) with the results of proposition 5 in [14], we obtain that the metric-like tensor on $M$ extracted from the function $F$ by means of the almost complex structure $J$, as explained in this section, coincides with the metric-like tensor on $M$ extracted from the function $F$ by means of the procedure outlined in [14]. At this point, if $J'$ is the almost complex structure associated with another choice of global bases on $M \times M$, a direct computation shows that we have
\[
i_d^* \omega^{J'}_F = i_d^* \omega^J_F = 0 , \qquad i_d^* g^{J'}_F = i_d^* g^J_F , \tag{38}
\]
where the last equality follows upon a direct comparison with the results in [14].
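The chain of equalities in equation (35), which is the coordinate-free counterpart of equation (22), can be checked numerically on a concrete divergence. The following sketch uses the Kullback-Leibler relative entropy of equation (23) and central finite differences; the step size `h`, the sample point `m`, the tangent directions `u`, `v` (components summing to zero, so that perturbed points stay normalized), and all function names are assumptions made for illustration.

```python
import math

def kl(p, q):
    """Kullback-Leibler relative entropy, equation (23)."""
    return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q))

def second_derivatives(D, m, u, v, h=1e-4):
    """Central-difference estimates, at the diagonal point (m, m), of the second
    derivatives of D along u and v: both in the first argument (xx), both in the
    second argument (yy), and one in each argument (xy)."""
    def move(a, b):
        # The point m + a*h*u + b*h*v.
        return [mj + h * (a * uj + b * vj) for mj, uj, vj in zip(m, u, v)]
    stencil = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
    scale = 4.0 * h * h
    xx = sum(s * D(move(a, b), m) for a, b, s in stencil) / scale
    yy = sum(s * D(m, move(a, b)) for a, b, s in stencil) / scale
    xy = sum(s * D(move(a, 0), move(0, b)) for a, b, s in stencil) / scale
    return xx, yy, xy

# Hypothetical sample data on the open simplex.
m = [0.1, 0.2, 0.3, 0.4]
u = [0.3, -0.1, -0.4, 0.2]
v = [-0.2, 0.5, -0.1, -0.2]

xx, yy, xy = second_derivatives(kl, m, u, v)
fisher_rao_value = sum(uj * vj / mj for mj, uj, vj in zip(m, u, v))
```

Up to discretization error, `xx` and `yy` agree with each other and with `-xy`, and all three match the Fisher-Rao value $\sum_j u_j v_j / p_j$ of equation (1).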
Note that, in order for the previous equations to be true, we must impose that $F$ is a divergence function.

We will now apply our considerations to the case of an $N$-simplex where the function $F$ is taken to be the Kullback-Leibler relative entropy
\[
S_{KL}(\mathbf{p}, \mathbf{q}) = \sum_{j=1}^{N} p_j \ln\!\left( \frac{p_j}{q_j} \right) . \tag{39}
\]
In the open interior of the simplex we have the basis $\{P_j\}_{j=1,\dots,(N-1)}$ of vector fields given by
\[
P_j = \frac{\partial}{\partial p_j} - \frac{\partial}{\partial p_{j+1}} , \tag{40}
\]
and its associated dual basis of 1-forms $\{\vartheta^j\}_{j=1,\dots,(N-1)}$ given by
\[
\vartheta^j = \frac{1}{2} \left( \mathrm{d}p_j - \mathrm{d}p_{j+1} - \sum_{k \neq j, j+1} \mathrm{d}p_k \right) . \tag{41}
\]
Denoting by $\{\mathbb{P}_j, \mathbb{Q}_j\}_{j=1,\dots,(N-1)}$ and $\{\alpha^j, \beta^j\}_{j=1,\dots,(N-1)}$ the bases of vector fields and 1-forms on $\Delta_+ \times \Delta_+$ obtained from $\{P_j\}_{j=1,\dots,(N-1)}$ and $\{\vartheta^j\}_{j=1,\dots,(N-1)}$, we first note that
\[
[\mathbb{P}_j, \mathbb{P}_k] = [\mathbb{P}_j, \mathbb{Q}_k] = [\mathbb{Q}_j, \mathbb{Q}_k] = 0 , \tag{42}
\]
and then we directly compute
\[
\omega^J_{KL} = \frac{1}{q_l} \left( \delta_{kl} (\delta_{jl} - \delta_{(j+1)l}) - \delta_{(k+1)l} (\delta_{jl} - \delta_{(j+1)l}) \right) \left( \alpha^j \wedge \alpha^k + \beta^j \wedge \beta^k + 2 \, \beta^k \wedge \alpha^j \right) , \tag{43}
\]
and
\[
g^J_{KL} = \frac{2}{q_l} \left( \delta_{kl} (\delta_{jl} - \delta_{(j+1)l}) - \delta_{(k+1)l} (\delta_{jl} - \delta_{(j+1)l}) \right) \left( \alpha^j \otimes \alpha^k + \beta^k \otimes \beta^j \right) , \tag{44}
\]
so that on $\Delta_+$ we obtain
\[
i_d^* \omega^J_{KL} = 0 , \tag{45}
\]
and
\[
i_d^* g^J_{KL} = 2 \left( \frac{\delta_{jk} - \delta_{(j+1)k}}{p_k} + \frac{\delta_{(j+1)(k+1)} - \delta_{j(k+1)}}{p_{k+1}} \right) \vartheta^j \otimes_s \vartheta^k . \tag{46}
\]
Concerning $i_d^* g^J_{KL}$, we immediately see that
\[
i_d^* g^J_{KL}(P_j, P_j) = 2 \left( \frac{1}{p_j} + \frac{1}{p_{j+1}} \right) , \tag{47}
\]
while, if $k + 1 < j$ or $j + 1 < k$, we have
\[
i_d^* g^J_{KL}(P_j, P_k) = 0 , \tag{48}
\]
and, if $k + 1 = j$, we have
\[
i_d^* g^J_{KL}(P_j, P_k) = - \frac{2}{p_j} , \tag{49}
\]
and, if $j + 1 = k$, we have
\[
i_d^* g^J_{KL}(P_j, P_k) = - \frac{2}{p_{j+1}} . \tag{50}
\]
Then, a direct comparison using the explicit form of the $P_j$’s and the explicit form of the Fisher-Rao metric tensor $g_{FR}$ given in equation (1) shows that
\[
i_d^* g^J_{KL} = 2 \, g_{FR} . \tag{51}
\]

In section 2, we reviewed how classical probability distributions may be immersed into the complex projective space by replacing a probability vector with a suitable notion of probability amplitude. Here, we want to develop a similar, but different, idea in the context of mixed quantum states. Pictorially speaking, we want to quantize classical probability distributions in order to obtain quantum states.
Remark 1.
It is possible to pass from the concept of pure quantum “states” to the concept of quantum “amplitudes”. In other words, a sort of operator-valued square root of a pure state can be introduced if one considers the set of rank-one operators. Indeed, let $|\psi\rangle$ and $|\phi\rangle$ be normalized vectors in a Hilbert space $\mathcal{H}$ such that $\langle \psi | \phi \rangle \neq 0$, and let $\rho_\psi = |\psi\rangle\langle\psi|$ and $\rho_\phi = |\phi\rangle\langle\phi|$ be the corresponding pure states in $\mathcal{P}(\mathcal{H})$. Let $\rho_{\psi\phi} = |\psi\rangle\langle\phi|$ be a rank-one operator. Then, the following relations hold true:
\[
\rho_{\psi\phi} \, \rho_{\psi\phi}^* = \rho_\psi , \qquad \rho_{\psi\phi}^* \, \rho_{\psi\phi} = \rho_\phi , \tag{52}
\]
which amounts to saying that the rank-one operator $\rho_{\psi\phi}$ is a square root for the two states $\rho_\psi$ and $\rho_\phi$. It is worth noticing that this rank-one operator defines a transition amplitude because of the non-orthogonality condition. These rules coincide with the algebraic structure underlying Schwinger’s approach to quantum mechanics [34], an approach which is based on the concept of selective measurements. It is possible to prove [15] that these basic elements satisfy the defining properties of a groupoid, each selective measurement describing a transition between outcomes of an experiment performed on a quantum system (for instance, the outcomes of a Stern-Gerlach experiment on a beam of atoms). In this framework, the rank-one operator $\rho_{\psi\phi}$ describes a selective measurement between the outcomes of two “non-compatible” experiments (for more details on this formulation see [15, 16, 17]).

A similar procedure may also be implemented for mixed states, see Section 2.3 in [4]. However, here we shall limit ourselves to mixed quantum states and do not consider their “square root”.

In the quantum information theory of finite-level quantum systems, the role of probability distributions is played by the density operators on the Hilbert space $\mathcal{H}$ of the system under investigation (see [3, 8, 10, 29, 35]).
By fixing a basis $\{e_j\}$ in $\mathcal{H}$, we may realize every density operator $\rho$ as a “density matrix” with respect to the chosen basis. Then, we may consider a natural immersion of a probability vector $p = (p_1, \dots, p_N)$ into the space of density matrices given by
\[
p \mapsto \rho_p = \sum_j p_j E_j , \tag{53}
\]
where $E_j = |e_j\rangle\langle e_j|$. In this way, we realize a probability vector as a diagonal matrix with respect to the basis $\{e_j\}$ in $\mathcal{H}$. Here, since we shifted our attention from pure states to mixed states, the analogue of the phase factor used in section 2 (i.e., an element of the unitary group $U(1)$) is an element $U$ of the special unitary group $SU(\mathcal{H})$ of the Hilbert space of the system. Therefore, by acting with this “generalized phase factor”, we obtain the density operator
\[
\rho(U, p) := U \rho_p U^\dagger , \tag{54}
\]
whose vector of eigenvalues is in one-to-one correspondence with $p$ modulo an action of the permutation group. In this way, we have just obtained a covering of the space of states (density operators) $\mathcal{S}$ of $\mathcal{H}$ in terms of the space
\[
\widetilde{M} = SU(\mathcal{H}) \times \Delta , \tag{55}
\]
where $\Delta$ denotes the simplex of $N$-probability vectors, by means of the map $\widetilde{\pi} : \widetilde{M} \longrightarrow \mathcal{S}$ given by
\[
\widetilde{\pi}(U, p) := \rho(U, p) . \tag{56}
\]
By considering only probability vectors with $p_j > 0$ for $j = 1, \dots, N$, we obtain a map $\pi$ from the smooth manifold
\[
M = SU(\mathcal{H}) \times \Delta_+ , \tag{57}
\]
where $\Delta_+$ denotes the open interior of the simplex of $N$-probability vectors, to the smooth manifold $\mathcal{S}_+$ of faithful states (invertible density operators) on $\mathcal{H}$, given by the restriction of $\widetilde{\pi}$ to $M$. According to [14], the map $\pi$ is differentiable, and it is a submersion at each point $(U, p) \in M$ for which the vector $p$ is such that $p_j \neq p_k$ for $j \neq k$.

The possibility of working on $M$ turns out to be particularly useful when we need to perform explicit computations regarding geometrical structures on $\mathcal{S}_+$. For instance, $M$ is a parallelizable manifold, and thus the theory developed in section 3 applies.
Therefore, if $S : \mathcal{S}_{+} \times \mathcal{S}_{+} \to \mathbb{R}$ is a relative quantum entropy, e.g., the von Neumann-Umegaki relative entropy, we may consider its pullback $S_{\pi}$ to $M \times M$, and compute the geometrical tensors associated with it as explained in section 3. The symmetric tensor thus obtained will be the pullback on $M \times M$ of the symmetric tensor on $\mathcal{S}_{+} \times \mathcal{S}_{+}$ that we may obtain by applying one of the general procedures described in [14, 22, 24]. Furthermore, the differential calculus available on $M \times M$ is particularly simple once we note that $M$ is just the Cartesian product of a Lie group with an open set of an affine space, and this makes some computations particularly explicit and clear. Then, since the probability distributions are embedded in $M$ in a manifest way, we get an easier comparison between geometrical structures on quantum states and geometrical structures on classical probabilities in the spirit of section 2 and of the work [21].

To better illustrate this point, we will now explicitly work out the details of the computation of the metric tensor on $M$ for the case of the von Neumann-Umegaki relative entropy defined by

\[
S(\rho, \sigma) := \mathrm{Tr}\left(\rho\,(\ln(\rho) - \ln(\sigma))\right) . \tag{58}
\]

Note, however, that a similar procedure may also be considered for the family of Tsallis entropies (see [25]), and for the family of $(\alpha - z)$-Rényi relative entropies [14].

For the sake of simplicity, we omit the explicit expression of the pre-symplectic form and the symmetric $(0,2)$ tensor on $M \times M$. Indeed, the explicit expressions of these objects turn out to be particularly long, and the computations that we are about to perform in order to compute the metric tensor on $M$ already give enough details to obtain the tensors on $M \times M$ by means of equations (29) and (31).

First of all, we need to introduce a global basis of vector fields on $M$ and its dually-related basis of 1-forms. To this purpose, we will exploit the product structure of $M = SU(\mathcal{H}) \times \Delta_{+}$ to select the basis $\{X_j ; P_k\}$ of vector fields where the $X_j$’s are left-invariant vector fields “on” $SU(\mathcal{H})$, and the $P_k$’s are the vector fields on $\Delta_{+}$ introduced in the last part of section 3. Then, the dual basis will be written as $\{\theta^j ; \vartheta^k\}$, while the basis of vector fields on $M \times M$ is written as $\{X_j, Y_j ; P_k, Q_k\}$, and its dual basis of 1-forms as $\{\alpha^j, \beta^j ; \zeta^k, \eta^k\}$.

Setting $\rho_0 \equiv \rho_{\mathbf{p}}$ for $\mathbf{p} \in \Delta_{+}$ (see equation (53)), and analogously $\sigma_0$ for the second factor, the pullback of $S$ to $M \times M$ can be written as

\[
S_{\pi}(U, \rho_0 ; V, \sigma_0) = \mathrm{Tr}\left(\rho_0 \ln(\rho_0)\right) - \mathrm{Tr}\left(U \rho_0 U^{\dagger}\, V \ln(\sigma_0) V^{\dagger}\right) , \tag{59}
\]

where we exploited the fact that $\ln(U A U^{\dagger}) = U \ln(A) U^{\dagger}$ for every positive invertible operator $A$ because $\ln$ is an analytic function.

According to equation (37), we need to compute the quantities

\[
\left(L_{X_k} L_{Y_j}(S_{\pi})\right)\Big|_{\mathrm{diag}}, \qquad \left(L_{X_k} L_{Q_j}(S_{\pi})\right)\Big|_{\mathrm{diag}}, \qquad \left(L_{P_k} L_{Q_j}(S_{\pi})\right)\Big|_{\mathrm{diag}} \tag{60}
\]

in order to obtain the tensor $i_d^{*}\, g^{J}_{S}$.

In order to compute them, we first observe that the following matrix equality holds by direct computation:

\[
\mathrm{d}\rho = \mathrm{d}\left(U \rho_0 U^{\dagger}\right) = U \left(\left[U^{\dagger}\mathrm{d}U , \rho_0\right] + \mathrm{d}\rho_0\right) U^{\dagger} , \tag{61}
\]

where $U^{\dagger}\mathrm{d}U$ is the left-invariant Maurer-Cartan form on $SU(\mathcal{H})$. Note that, if we fix an orthonormal basis $\{\tau_j\}$ of matrices in the Lie algebra of $SU(\mathcal{H})$ (with respect to the Cartan-Killing form), the Maurer-Cartan form can be written as

\[
U^{\dagger}\mathrm{d}U = \theta^j \tau_j . \tag{62}
\]

Furthermore, setting $E_j = |e_j\rangle\langle e_j|$ where $\{e_j\}$ is the orthonormal basis in $\mathcal{H}$ in terms of which the unfolding map $\pi$ is written, we have

\[
\mathrm{d}\rho_0 = \mathrm{d}p^j\, E_j , \tag{63}
\]

so that

\[
\mathrm{d}\rho_0 (P_k) = E_k - E_{k+1} . \tag{64}
\]

(Recall that, for the basis on the $N$-simplex, the indexes run from $1$ to $N - 1$.) In the following, we will select a basis $\{\tau_k\}_{k = 1, ..., N^2 - 1}$ in the Lie algebra of $SU(\mathcal{H})$ in such a way that the $\imath E_j$’s are part of this algebra, and thus form a Cartan subalgebra.

A direct computation shows that

\[
\begin{aligned}
L_{X_k} L_{Y_j}(S_{\pi}) &= - L_{X_k}\left(\mathrm{Tr}\left(U \rho_0 U^{\dagger}\, V \left[(V^{\dagger}\mathrm{d}V)(Y_j) , \ln(\sigma_0)\right] V^{\dagger}\right)\right) \\
&= - L_{X_k}\left(\mathrm{Tr}\left(U \rho_0 U^{\dagger}\, V \left[\tau_j , \ln(\sigma_0)\right] V^{\dagger}\right)\right) \\
&= - \mathrm{Tr}\left(U \left[(U^{\dagger}\mathrm{d}U)(X_k) , \rho_0\right] U^{\dagger}\, V \left[\tau_j , \ln(\sigma_0)\right] V^{\dagger}\right) \\
&= - \mathrm{Tr}\left(U \left[\tau_k , \rho_0\right] U^{\dagger}\, V \left[\tau_j , \ln(\sigma_0)\right] V^{\dagger}\right) ,
\end{aligned} \tag{65}
\]

so that

\[
\left(L_{X_k} L_{Y_j}(S_{\pi})\right)\Big|_{\mathrm{diag}} = \mathrm{Tr}\left(\left[\rho_0 , \tau_k\right] \left[\tau_j , \ln(\rho_0)\right]\right) . \tag{66}
\]

Similarly, we have

\[
\begin{aligned}
L_{X_k} L_{Q_j}(S_{\pi}) &= - L_{X_k}\left(\mathrm{Tr}\left(U \rho_0 U^{\dagger}\, V \left(\mathrm{d}(\ln(\sigma_0))\right)(Q_j)\, V^{\dagger}\right)\right) \\
&= - \mathrm{Tr}\left(U \left[\tau_k , \rho_0\right] U^{\dagger}\, V \left(\mathrm{d}(\ln(\sigma_0))\right)(Q_j)\, V^{\dagger}\right) \\
&= - \mathrm{Tr}\left(U \left[\tau_k , \rho_0\right] U^{\dagger}\, V \sigma_0^{-1} (E_j - E_{j+1}) V^{\dagger}\right) ,
\end{aligned} \tag{67}
\]

so that

\[
\left(L_{X_k} L_{Q_j}(S_{\pi})\right)\Big|_{\mathrm{diag}} = \mathrm{Tr}\left(\left[\rho_0 , \tau_k\right] \rho_0^{-1} (E_j - E_{j+1})\right) = 0 , \tag{68}
\]

and we have

\[
\begin{aligned}
L_{P_k} L_{Q_j}(S_{\pi}) &= - L_{P_k}\left(\mathrm{Tr}\left(U \rho_0 U^{\dagger}\, V \left(\mathrm{d}(\ln(\sigma_0))\right)(Q_j)\, V^{\dagger}\right)\right) \\
&= - L_{P_k}\left(\mathrm{Tr}\left(U \rho_0 U^{\dagger}\, V \sigma_0^{-1} (E_j - E_{j+1}) V^{\dagger}\right)\right) \\
&= - \mathrm{Tr}\left(U (E_k - E_{k+1}) U^{\dagger}\, V \sigma_0^{-1} (E_j - E_{j+1}) V^{\dagger}\right) ,
\end{aligned} \tag{69}
\]

so that

\[
\left(L_{P_k} L_{Q_j}(S_{\pi})\right)\Big|_{\mathrm{diag}} = - \mathrm{Tr}\left(\rho_0^{-1} (E_k - E_{k+1}) (E_j - E_{j+1})\right) . \tag{70}
\]

Collecting the results, from equation (37) we get

\[
i_d^{*}\, g^{J}_{S} = 2\, \mathrm{Tr}\left(\rho_0^{-1} (E_k - E_{k+1}) (E_j - E_{j+1})\right) \vartheta^j \otimes_s \vartheta^k + 2\, \mathrm{Tr}\left(\left[\tau_k , \rho_0\right] \left[\tau_j , \ln(\rho_0)\right]\right) \theta^j \otimes_s \theta^k . \tag{71}
\]

If we write

\[
\rho_0 = \sum_{j=1}^{N} p^j E_j , \tag{72}
\]

a direct comparison with the expression of the Fisher-Rao metric tensor $g_{FR}$ given at the end of section 3 shows that

\[
g_{FR} = \mathrm{Tr}\left(\rho_0^{-1} (E_k - E_{k+1}) (E_j - E_{j+1})\right) \vartheta^j \otimes_s \vartheta^k , \tag{73}
\]

which means that the metric-like tensor on $M$ reduces to (a multiple of) the Fisher-Rao metric tensor whenever we restrict to the “classical part of the system”, i.e., to pairwise commuting matrices.

Now, we will show that there exists a basis in the Lie algebra of $SU(\mathcal{H})$ for which the “quantum part” of $i_d^{*}\, g^{J}_{S}$ is diagonal. Indeed, setting $E_{jk} = |e_j\rangle\langle e_k|$, we obtain a basis $\{E_{jk}\}_{j,k = 1, ..., N}$ of $\mathcal{B}(\mathcal{H})$. Therefore, we can write

\[
\rho_0 = \sum_{r} p^r E_{rr} = \sum_{r} p^r E_r , \qquad \tau_j = \sum_{r,s} T^{rs}_{j} E_{rs} \quad \text{with} \quad T^{rs}_{j} = - \overline{T^{sr}_{j}} , \tag{74}
\]

so that

\[
\left[\tau_k , \rho_0\right] = \sum_{r,s} (p^s - p^r)\, T^{rs}_{k} E_{rs} , \qquad \left[\tau_j , \ln(\rho_0)\right] = \sum_{a,b} \ln\left(\frac{p^b}{p^a}\right) T^{ab}_{j} E_{ab} , \tag{75}
\]

and thus

\[
\begin{aligned}
\mathrm{Tr}\left(\left[\tau_k , \rho_0\right] \left[\tau_j , \ln(\rho_0)\right]\right) &= \sum_{r,s} \ln\left(\frac{p^s}{p^r}\right) (p^r - p^s)\, T^{rs}_{k} T^{sr}_{j} \\
&= \sum_{r,s} \ln\left(\frac{p^s}{p^r}\right) (p^r - p^s)\, \frac{1}{2}\left(T^{rs}_{k} T^{sr}_{j} + T^{sr}_{k} T^{rs}_{j}\right) \\
&= \sum_{r,s} \ln\left(\frac{p^s}{p^r}\right) (p^r - p^s)\, \frac{1}{2}\left(T^{rs}_{k} T^{sr}_{j} + \overline{T^{rs}_{k} T^{sr}_{j}}\right) \\
&= \sum_{r,s} \ln\left(\frac{p^s}{p^r}\right) (p^r - p^s)\, \Re\left(T^{rs}_{k} T^{sr}_{j}\right) .
\end{aligned} \tag{76}
\]

Now, we may choose a particular basis in the Lie algebra of $SU(\mathcal{H})$ which is split into three parts, namely, the elements of the basis are of three kinds. First, there are elements of the type

\[
\lambda_{j\mu} = \frac{\imath}{\sqrt{2}} \left(|j\rangle\langle\mu| + |\mu\rangle\langle j|\right) , \tag{77}
\]

where we always relabel the indexes in such a way that Greek ones are greater than the Latin ones. Then, there are elements of the type

\[
\sigma_{j\mu} = - \frac{1}{\sqrt{2}} \left(|j\rangle\langle\mu| - |\mu\rangle\langle j|\right) , \tag{78}
\]

with the same convention on the indexes. Finally, there are elements of the type

\[
e_j = \alpha_j \left( j\, |j+1\rangle\langle j+1| - \sum_{r=1}^{j} |r\rangle\langle r| \right) \tag{79}
\]

with $\alpha_j = \frac{1}{\sqrt{j(j+1)}}$. Note that this basis is orthonormal with respect to the Cartan-Killing form on $SU(\mathcal{H})$. At this point, we write

\[
\lambda_{j\mu} = \sum_{\alpha,\beta} M^{\alpha\beta}_{j\mu} E_{\alpha\beta} , \qquad \sigma_{j\mu} = \sum_{\alpha,\beta} N^{\alpha\beta}_{j\mu} E_{\alpha\beta} , \qquad e_{\mu} = \sum_{\alpha,\beta} O^{\alpha\beta}_{\mu} E_{\alpha\beta} , \tag{80}
\]

from which it immediately follows that the quantity in equation (76) is different from zero if and only if $\tau_j = \tau_k = \lambda_{l\mu}$ for some pair $(l, \mu)$, or $\tau_j = \tau_k = \sigma_{l\mu}$ for some pair $(l, \mu)$. From this, we conclude that, with respect to the basis of left-invariant vector fields and 1-forms associated with the basis $\{\lambda_{j\mu}, \sigma_{j\mu}, e_j\}$ in the Lie algebra of $SU(\mathcal{H})$, the “quantum part” of $i_d^{*}\, g^{J}_{S}$ is diagonal.

Quite interestingly, this is also what happens when we consider the family of $\alpha$-$z$-Rényi relative entropies introduced in [5]. Indeed, as shown in [14], the metric tensor is again diagonal with respect to the basis of left-invariant vector fields and 1-forms associated with the basis $\{\lambda_{j\mu}, \sigma_{j\mu}, e_j\}$ in the Lie algebra of $SU(\mathcal{H})$. Furthermore, in the same paper it is also shown that the metric tensor derived from the von Neumann-Umegaki relative entropy is monotone with respect to quantum stochastic maps (completely positive and trace preserving maps on the $C^{*}$-algebra associated with the quantum system).
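The diagonality of the “quantum part” with respect to the basis of equations (77)-(79) can be checked numerically. The following is only a minimal sketch under stated assumptions: the dimension $N = 3$ and the sample probability vector are arbitrary, the helper names `su_basis`, `comm` and `quantum_part` are hypothetical, the diagonal elements of equation (79) are multiplied by the imaginary unit so that they are anti-Hermitian (this factor is an assumption of the sketch), and orthonormality is tested with the inner product $-\mathrm{Tr}(XY)$, which is proportional to the Cartan-Killing form.

```python
import numpy as np


def su_basis(n):
    """Anti-Hermitian traceless matrices modeled on equations (77)-(79) (hypothetical helper)."""
    def ket_bra(a, b):
        m = np.zeros((n, n), dtype=complex)
        m[a, b] = 1.0
        return m

    lam, sig, cartan = [], [], []
    for j in range(n):
        for mu in range(j + 1, n):      # Greek index greater than Latin index
            lam.append(1j / np.sqrt(2) * (ket_bra(j, mu) + ket_bra(mu, j)))
            sig.append(-1 / np.sqrt(2) * (ket_bra(j, mu) - ket_bra(mu, j)))
    for j in range(1, n):
        d = np.zeros(n)
        d[:j] = -1.0                    # minus the sum of the first j projectors
        d[j] = j                        # j times the (j+1)-th projector
        # Assumption: factor 1j added so that the element is anti-Hermitian.
        cartan.append(1j * np.diag(d) / np.sqrt(j * (j + 1)))
    return lam, sig, cartan


def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a


def quantum_part(p, basis):
    """Matrix of Tr([tau_k, rho_0] [tau_j, ln rho_0]) over all pairs of basis elements."""
    rho0 = np.diag(p).astype(complex)
    log_rho0 = np.diag(np.log(p)).astype(complex)
    q = np.zeros((len(basis), len(basis)))
    for k, tau_k in enumerate(basis):
        for j, tau_j in enumerate(basis):
            q[k, j] = np.trace(comm(tau_k, rho0) @ comm(tau_j, log_rho0)).real
    return q


N = 3                                   # assumed dimension
p = np.array([0.5, 0.3, 0.2])           # assumed point in the open simplex
lam, sig, cartan = su_basis(N)
basis = lam + sig + cartan

# Gram matrix with respect to -Tr(XY); the identity signals orthonormality.
gram = np.array([[-np.trace(a @ b).real for b in basis] for a in basis])

Q = quantum_part(p, basis)
off_diagonal = Q - np.diag(np.diag(Q))  # expected to vanish
```

In this sketch, `off_diagonal` vanishes up to rounding, the diagonal entries of `Q` associated with the diagonal generators are zero because those generators commute with $\rho_0$, and the entries associated with $\lambda_{j\mu}$ and $\sigma_{j\mu}$ evaluate to $(p^j - p^\mu)\ln(p^j / p^\mu)$, in agreement with equation (76).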
According to Petz’s theorem, a monotone metric tensor can be decomposed into the sum of two terms: a classical part, which is the Fisher-Rao metric tensor, and a quantum term, which is coupled to the classical one via a monotone function $f$ (a relation between this monotone function and the tomographic procedure to reconstruct a quantum state from the knowledge of different associated probability densities has been proposed in [24, 25]). As already pointed out, Eq. (37) shows that the metric tensor on $M$ obtained from a divergence function via a complex structure $J$ is proportional to the one obtained according to the procedure outlined in [14]. Therefore, the metric tensor that we obtained in this section also satisfies the monotonicity property with respect to quantum stochastic maps.

Inspired by the geometry of pure quantum states, in this work we presented the construction of a quasi-complex structure $J$ on the Cartesian product $M \times M$ of a parallelizable statistical manifold $M$. By exploiting the geometrical properties of $J$, we defined a coordinate-free, algorithmic procedure to extract a symmetric, covariant $(0,2)$ tensor $g_F$ and a presymplectic structure $\omega_F$ on $M \times M$ starting from a divergence function $F$ on $M \times M$. In particular, in section 3, we computed $\omega_F$ and $g_F$ when $M$ is the open interior $\Delta_{+}$ of the $n$-simplex $\Delta$, and $F$ is the Kullback-Leibler divergence, and we proved that the pullback to $\Delta_{+}$ of the symmetric tensor field $g_F$ is a constant multiple of the Fisher-Rao metric tensor. Then, in section 4, we considered the case in which $M$ is a covering of the manifold of faithful quantum states in finite dimensions, that is, $M = SU(\mathcal{H}) \times \Delta_{+}$ where $SU(\mathcal{H})$ is the special unitary group of the $n$-dimensional Hilbert space $\mathcal{H}$ of the quantum system at hand, and $F$ is the von Neumann-Umegaki relative entropy.
In this case, the metric tensor one obtains on $M = SU(\mathcal{H}) \times \Delta_{+}$ splits into two parts: one which “lives” on the classical part $\Delta_{+}$ and is a constant multiple of the Fisher-Rao metric tensor, and one which “lives” on the quantum part $SU(\mathcal{H})$ and is a constant multiple of the symmetric covariant tensor extracted from the von Neumann-Umegaki relative entropy as done, for instance, in [25, 14].

F.D.C. acknowledges partial support provided by the MINECO research project MTM2017-84098-P and QUITEMAD++, S2018/TCS-A4342. G.M. acknowledges financial support from the Spanish Ministry of Economy and Competitiveness, through the Severo Ochoa Programme for Centres of Excellence in R&D (SEV-2015/0554). G.M. would like to thank the support provided by the Santander/UC3M Excellence Chair Programme 2019/2020, and he is also a member of the Gruppo Nazionale di Fisica Matematica (INDAM), Italy.
References

[1] S. I. Amari and H. Nagaoka. Methods of Information Geometry. American Mathematical Society, Providence, RI, 2000.
[2] A. Ashtekar and T. A. Schilling. Geometrical formulation of quantum mechanics. In A. Harvey, editor, On Einstein’s Path: Essays in Honor of Engelbert Schucking, pages 23–65. Springer-Verlag, New York, 1999.
[3] […] Reviews in Mathematical Physics, Online Ready, DOI:10.1142/S0129055X20300010, 2019.
[4] […] Open Systems and Information Dynamics, 26(3):1950012, 2019.
[5] […] α-z-relative Renyi entropies. Journal of Mathematical Physics, 56(2):022202–16, 2015.
[6] […] Information Geometry. Springer International Publishing, Cham, 2017.
[7] O. E. Barndorff-Nielsen and P. E. Jupp. Statistics, yokes and symplectic geometry. Annales de la Faculté des sciences de Toulouse : Mathématiques, 6(3):389–427, 1997.
[8] […]
[9] […] Bulletin of the London Mathematical Society, 48(3):499–506, 2016.
[10] […] Geometry of Quantum States: An Introduction to Quantum Entanglement. Cambridge University Press, New York, 2006.
[11] […] Theoretical and Mathematical Physics, 152(1):894–903, 2007.
[12] […] Geometry from dynamics, classical and quantum. Springer, Dordrecht, 2015.
[13] […] Statistical Decision Rules and Optimal Inference. American Mathematical Society, Providence, RI, 1982.
[14] […] Annals of Physics, 395:238–274, 2018.
[15] F. M. Ciaglia, A. Ibort, and G. Marmo. Schwinger’s picture of quantum mechanics I: Groupoids. Int. J. Geom. Met. Mod. Phys., 16(8):1950119, 2019.
[16] […] Int. J. Geom. Met. Mod. Phys., 16(9):1950136, 2019.
[17] F. M. Ciaglia, A. Ibort, and G. Marmo. Schwinger’s picture of quantum mechanics III: The statistical interpretation. Int. J. Geom. Met. Mod. Phys., 16(11):1950165, 2019.
[18] […] The Principles of Quantum Mechanics. Oxford University Press, London, 1958.
[19] […] Rivista del Nuovo Cimento, 33:401–590, 2010.
[20] […] Advanced concepts in quantum mechanics. Cambridge University Press, 2014.
[21] […] Physics Letters A, 374(48):4801–4803, 2010.
[22] K. Grabowska, J. Grabowski, M. Kuś, and G. Marmo. Lie groupoids in information geometry. Journal of Physics A: Mathematical and Theoretical, 52(50):505202, 2019.
[23] […] Communications in Mathematical Physics, 65(2):189–201, 1979.
[24] […] Journal of Physics A: Mathematical and Theoretical, 51(5):055302, 2018.
[25] V. I. Man’ko, G. Marmo, F. Ventriglia, and P. Vitale. Metric on the space of quantum states from relative entropy. Tomographic reconstruction. Journal of Physics A: Mathematical and Theoretical, 50(33):335302–29, 2017.
[26] V. I. Man’ko, G. Marmo, E. C. G. Sudarshan, and F. Zaccaria. Inner composition law of pure-spin states. In R. C. Hilborn and G. M. Tino, editors, Spin-Statistics Connection and Commutation Relations, pages 92–97. American Institute of Physics, New York, 2000.
[27] […] Rendiconti di Matematica e delle sue Applicazioni, 39:329–345, 2018.
[28] […] Journal of Soviet Mathematics, 56(5):2648–2669, 1991.
[29] […] Entropy, 20(6):472-17, 2018. Correction: Naudts, J. Quantum Statistical Manifolds. Entropy 2018, 20, 472. Entropy, 20(10):796-3, 2018.
[30] […] Journal of the Australian Mathematical Society, 90(3):371–384, 2011.
[31] […] Linear Algebra and its Applications, 244:81–96, 1996.
[32] […] Bulletin of the Calcutta Mathematical Society, 37:81–91, 1945.
[33] […] Methods of Modern Mathematical Physics I: Functional Analysis. Academic Press, London, 1980.
[34] J. Schwinger. Quantum Kinematics and Dynamics. Frontiers in Physics, W. A. Benjamin Inc., New York, 1970.
[35] […] Entropy, 21(7):703-19, 2019.
[36] […] Symplectic and Kähler structures on statistical manifolds induced from divergence functions, pages 595–603. Springer-Verlag, Berlin, 2013.