[PDF] Local optimization on pure Gaussian state manifolds

Abstract

We exploit insights into the geometry of bosonic and fermionic Gaussian states to develop an efficient local optimization algorithm to extremize arbitrary functions on these families of states. The method is based on notions of gradient descent attuned to the local geometry which also allows for the implementation of local constraints. The natural group action of the symplectic and orthogonal group enables us to compute the geometric gradient efficiently. While our parametrization of states is based on covariance matrices and linear complex structures, we provide compact formulas to easily convert from and to other parametrization of Gaussian states, such as wave functions for pure Gaussian states, quasiprobability distributions and Bogoliubov transformations. We review applications ranging from approximating ground states to computing circuit complexity and the entanglement of purification that have both been employed in the context of holography. Finally, we use the presented methods to collect numerical and analytical evidence for the conjecture that Gaussian purifications are sufficient to compute the entanglement of purification of arbitrary mixed Gaussian states.

Full PDF

LLocal optimization on pure Gaussian state manifolds

Bennet Windt,

1, 2, ∗ Alexander Jahn, † Jens Eisert,

3, 4, 5, ‡ and Lucas Hackl

6, 2, 7, § Blackett Laboratory, Imperial College London, Prince Consort Road, SW7 2AZ, UK Max-Planck-Institut f¨ur Quantenoptik, Hans-Kopfermann-Str. 1, 85748 Garching, Germany Dahlem Center for Complex Quantum Systems,Freie Universit¨at Berlin, 14195 Berlin, Germany Mathematics and Computer Science, Takustraße 9,Freie Universit¨at Berlin, 14195 Berlin, Germany Helmholtz-Zentrum Berlin f¨ur Materialien und Energie, Hahn-Meitner-Platz 1, 14109 Berlin, Germany QMATH, Department of Mathematical Sciences, University of Copenhagen,Universitetsparken 5, 2100 Copenhagen, Denmark Munich Center for Quantum Science and Technology, Schellingstr. 4, 80799 M¨unchen, Germany

We exploit insights into the geometry of bosonic and fermionic Gaussian states to develop aneﬃcient local optimization algorithm to extremize arbitrary functions on these families of states. Themethod is based on notions of gradient descent attuned to the local geometry which also allows forthe implementation of local constraints. The natural group action of the symplectic and orthogonalgroup enables us to compute the geometric gradient eﬃciently. While our parametrization of statesis based on covariance matrices and linear complex structures, we provide compact formulas toeasily convert from and to other parametrization of Gaussian states, such as wave functions forpure Gaussian states, quasiprobability distributions and Bogoliubov transformations. We reviewapplications ranging from approximating ground states to computing circuit complexity and theentanglement of puriﬁcation that have both been employed in the context of holography. Finally,we use the presented methods to collect numerical and analytical evidence for the conjecture thatGaussian puriﬁcations are suﬃcient to compute the entanglement of puriﬁcation of arbitrary mixedGaussian states.

CONTENTS

I. Introduction 2II. Review of Gaussian states 2A. Quadrature operators and Majorana modes 3B. Deﬁnition of pure Gaussian states 4C. Gaussian transformations 5D. Geometry of pure Gaussian states 7E. Parametrization of Gaussian states 8F. Puriﬁcation of mixed Gaussian states 9III. Representations of Gaussian states 12A. Covariance matrix 12B. Linear complex structure 12C. Characteristic functions 12D. Quasiprobability distributions 14E. Gaussian unitaries 15F. Squeezed vacuum 15G. Bogoliubov transformation 16H. Thermal states 17I. Wave functions 171. Pure states 182. Mixed states 18IV. Optimization algorithm 19 ∗ [email protected] † [email protected] ‡ [email protected] § [email protected] A. Gradient descent on matrix manifolds 19B. Optimization on the Gaussian statemanifold 20C. Performing gradient descent 21D. Practical considerations 22E. The

GaussianOptimization.m package 22V. Applications 23A. Approximate ground states 23B. Gaussian entanglement of puriﬁcation(EoP) 24C. Gaussian complexity of puriﬁcation (CoP) 25VI. Optimality of Gaussian EoP 26A. Conjectures on optimality 26B. Numerical evidence 27C. Analytical bounds 28D. Proof of local optimality 29VII. Discussion 30Acknowledgments 31References 31 a r X i v : . [ qu a n t - ph ] N ov I. INTRODUCTION

Gaussian states form one of the most prominently usedand best understood families of quantum states. Thestandard deﬁnition covers bosonic [1–3] and fermionic [4]Gaussian states both pure and mixed. They naturallyappear as ground and thermal states of quadratic Hamil-tonians in physical systems and are hence ubiquitous innon-interacting quantum many-body systems in the con-densed matter context and as vacua in free ﬁeld theories.Bosonic Gaussian states are heavily used in the studyof bosonic systems with negligible interactions, such asBose-Einstein condensates [5], instances of systems ofcold atoms in optical lattices [6] and to a very good ap-proximation photonic systems [7]. Their fermionic coun-terparts are equally important for the study of fermionicquantum many-body systems, including systems cap-tured by the Bardeen–Cooper–Schrieﬀer (BCS) theory [8]or the Hartree-Fock framework [9] that can be seen asa variational principle over Gaussian fermionic states.Other applications range from ﬁeld theories [10], con-tinuous variable quantum information [1–3], relativisticquantum information [11] and quantum ﬁelds in curvedspacetime [12].Mathematically speaking, pure Gaussian states can beseen as forming K¨ahler sub-manifolds of the projectiveHilbert space, i.e. , they have a natural notion of dis-tance (Riemannian manifold with metric) and the struc-ture of a classical phase space (symplectic manifold withsymplectic form). This mathematical structure will beheavily relied on in this work, where pure bosonic andfermionic Gaussian states are the focus of attention. Forsystems constituted of N modes, the manifold of purebosonic states M b and of pure fermionic states M f canbe constructed as a symmetric space, i.e. , as a quotientof two Lie groups, namely M b = Sp(2 N, (cid:82) ) / U( N ) , (1) M f = O(2 N ) / U( N ) , (2)where Sp(2 N, (cid:82) ) is the symplectic group, O(2 N ) the or-thogonal group and U( N ) the unitary group. When re-stricting to one superselection sector of the parity of thefermion number, we can restrict to the special orthogonalgroup SO(2 N ). Gaussian manifolds come with a natu-ral group action of the respective groups, which we canexploit when performing local optimization.We optimize over bosonic and fermionic Gaussian man-ifolds by taking the natural geometry into account, i.e. ,the notion of distance between diﬀerent quantum statesas measured by the Fubini-Study metric [13, 14]. Givena Riemannian manifold M with local coordinates x =( x µ ) and positive-deﬁnite metric g = ( g µν ), such that v · u = g µν v µ u ν , the gradient descent vector ﬁeld (cid:101) F µ ofa function f is F µ = − G µν ∂f∂x ν , (3) where G µν is the inverse of g µν with G µσ g σν = δ µν .Typically, the inverse metric G µν needs to be re-evaluated at every point of the manifold, but for Gaussianstates we can explicitly construct a basis in which the ma-trix representation of G µν is constant. This provides acrucial speedup of the underlying algorithm.The goal of this manuscript is two-fold: First, wedemonstrate how the rich geometry of pure bosonic andfermionic Gaussian states can be exploited to ﬁnd ex-tremal points of arbitrary real functions without deal-ing with redundant directions or parametrizations. Sec-ond, we use a uniﬁed framework [15, 16] to describepure bosonic and fermionic states and carefully reviewhow to convert between other representations of Gaus-sian states. This ensures that a reader can seamlesslyapply our methods to their problem of choice. A crucial motivation for our work stems from thegoal to compare entanglement of puriﬁcation (EoP)and complexity of puriﬁcation (CoP) in free quantumﬁelds, which recently attracted increasing interest inthe context of applying quantum information meth-ods to holography and ﬁeld theory. We provide the

GaussianOptimization.m

Mathematica package as asimple implementation of our methods, which has alreadybeen used successfully in [17] to study EoP and CoP inquantum ﬁeld theory.

This manuscript is structured as follows: In section II,we review a uniﬁed formalism to treat pure bosonicand fermionic Gaussian states and compute the result-ing K¨ahler geometry (positive-deﬁnite metric, symplec-tic form) on the resulting state manifold. In section III,we provide a comprehensive treatment of the most com-monly used parametrizations of pure Gaussian states andhow to convert between them. In section IV, we use thegeometry of the pure Gaussian state manifold to developa gradient descent algorithm with an eﬃcient evaluationof G µν which avoids over-parametrization of tangent di-rections. The following section V is devoted to applica-tions, including the well-known problem of ﬁnding ap-proximate ground states, computing Gaussian entangle-ment of puriﬁcation (EoP) and Gaussian complexity ofpuriﬁcation (CoP), for which a given function f is op-timized over all Gaussian puriﬁcations of a given mixedGaussian state. In section VI, we use our methods tocollect numerical and analytical evidence for two conjec-tures stating that for mixed Gaussian states, the Gaus-sian EoP is actually optimal (and thus coincides withregular EoP), as well as stating which system decompo-sitions are necessary to reach this optimum. Finally, weconclude with a discussion of our results in section VII. II. REVIEW OF GAUSSIAN STATES

We introduce bosonic and fermionic Gaussian states,both pure and mixed, with a particular emphasis onthe geometry of the state manifold. While standard re-views of Gaussian states include ref. [1] based on covari-ance matrices, we follow the conventions of refs. [15, 18]based on linear complex structures that provides abasis-independent and uniﬁed treatment of bosons andfermions.

A. Quadrature operators and Majorana modes

Bosonic and fermionic quantum systems with N modescan be constructed from N creation or annihilation op-erators ˆ ξ a,a † ≡ (ˆ a , · · · , ˆ a N , ˆ a † , · · · , ˆ a † N ) , (4)which satisfy commutation relations [ˆ a i , ˆ a j ]=[ˆ a † i , ˆ a † j ]=0,[ˆ a i , ˆ a † j ]= δ ij for bosons or anti-commutation relations { ˆ a i , ˆ a j } = { ˆ a † i , ˆ a † j } =0, { ˆ a i , ˆ a † j } = δ ij for fermions. Insteadof (4), we can choose a basis of 2 N Hermitian operatorsˆ ξ q,p ≡ (ˆ q , · · · , ˆ q N , ˆ p , · · · , ˆ p N ) , (5)which are related to the ﬁrst by the equationsˆ a i = ˆ q i + i ˆ p i √ a † i = ˆ q i − i ˆ p i √ q i , ˆ q j ]=[ˆ p i , ˆ p j ]=0, [ˆ q i , ˆ p j ]= i δ ij for bosons and { ˆ q i , ˆ q j } = { ˆ p i , ˆ p j } =0, { ˆ q i , ˆ p j } = i δ ij for fermions. Mostreaders will be familiar with this notation for bosonicsystems, where the Hermitian basis operators in (5) arecalled quadratures . However, we will use the same nam-ing convention for Hermitian fermionic operators, whichoften go by the name of Majorana modes , just as ourcreation and annihilation operators from (4) referred toboth bosonic or fermionic variables. The goal of theseconventions is to treat bosons and fermions in a uniﬁedframework. All of our formulas containing indices a, b, c will be manifestly independent from the chosen basis ˆ ξ ,but when giving concrete examples, we will typically pro-vide the explicit matrix representations for the two basesfrom (4) and (5). The position of the index indicatesif the corresponding matrix row or column refers to theclassical phase space (upper index) or its dual (lower in-dex). We use Einstein’s summation convention where weimplicitly assume to sum over repeated indices, wherewe are only allowed to pair an upper and a lower in-dex. This formalism is heavily used in the general rel-ativity and high energy physics literature, but is partic-ularly suitable for the uniﬁed treatment of bosonic andfermionic Gaussian states.

We will use the symbols q,p ≡ and a,a † ≡ to indicate that the RHS of the equation gives theexplicit matrix representations in the basis (5) and (4) ,respectively. Readers familiar with Penrose’ abstract index notation [19] canalso read such as equations as tensor identities.

As we will see, Gaussian states are uniquely speci-ﬁed by their two-point correlation functions in the fun-damental operators ˆ ξ . For any state ρ , we denote theexpectation value of an operator ˆ O as (cid:104) ˆ O (cid:105) = Tr( ρ ˆ O ).We may separately consider the symmetrized and anti-symmetrized part of these correlations, given by the tworeal bilinear forms G ab = (cid:104) ˆ ξ a ˆ ξ b + ˆ ξ b ˆ ξ a (cid:105) (7)Ω ab = − i (cid:104) ˆ ξ a ˆ ξ b − ˆ ξ b ˆ ξ a (cid:105) , (8)(Requirement: z a = (cid:104) ˆ ξ a (cid:105) = 0) , (9)where we restrict to z a = 0 for the purpose of thismanuscript to present bosons and fermions in parallel. For bosons, the symplectic form

Ω is ﬁxed by canonicalcommutation relations (CCR), while the positive-deﬁnite metric G contains the physical correlations; for fermions,the situation is reversed, with G ﬁxed by canonical anti-commutation relations (CAR) and Ω describing the phys-ical correlations. In summary, we haveˆ ξ a , ˆ ξ b t = i Ω ab , (bosons) { ˆ ξ a , ˆ ξ b } = G ab . (fermions) (10)With respect to our bases, we have the state-independentexpressionsΩ q,p ≡ (cid:18) (cid:49) − (cid:49) (cid:19) a,a † ≡ (cid:18) − i (cid:49) i (cid:49) (cid:19) , (bosons) G q,p ≡ (cid:18) (cid:49) (cid:49) (cid:19) a,a † ≡ (cid:18) (cid:49)(cid:49) (cid:19) . (fermions) (11)When having chosen a set of creation and annihilationoperators, our Hilbert space H is spanned by the or-thonormal basis of number eigenvectors | n , . . . , n N (cid:105) withˆ n i | n , . . . , n N (cid:105) = n i | n , . . . , n N (cid:105) , where ˆ n i = ˆ a † i ˆ a i and n i ∈ (cid:78) ≥ for bosons and n i ∈ { , } for fermions. Wenow consider the state-dependent bilinear forms contain-ing the physical correlations. These are contained in the covariance matrix Γ ab , deﬁned as Γ ab = (cid:26) G ab (bosons) Ω ab (fermions) . (12)We can combine the state-dependent (12) and state-independent parts (11) into a single object J , deﬁned be-low. Due to the fact that G ab is always positive-deﬁnite,we can invert it to deﬁne its inverse g ab = ( G − ) ab with For bosonic states with z a (cid:54) = 0, it is easy to adjust the deﬁnitionof G ab to be given by G ab = (cid:104) ˆ ξ a ˆ ξ b + ˆ ξ b ˆ ξ a (cid:105) − z a z b . Whilethere exist fermionic states with z a (cid:54) = 0, they will either not beGaussian or they are unphysical (as z a would need to consist ofGrassmann variables). As we focus on physical Gaussian states,we do not consider either of these cases. Depending on convention, Γ ab is sometimes deﬁned with addi-tional prefactors, e.g. , as Γ ab = − Ω ab for fermions. Real basis Complex basisBosons

Phase space (ˆ q j , ˆ p k ) Also: (ˆ x j , ˆ p k ) CCR operators (ˆ b j , ˆ b † k ) Also: (ˆ a j , ˆ a † k ) Fermions

Majorana modes ˆ m a Also: γ a , c a , ( c j , ˜ c k ) CAR operators ( ˆ f j , ˆ f † k ) Also: (ˆ c j , ˆ c † k ) Uniﬁed (ˆ q j , ˆ p k ) or ˆ ξ a (ˆ a j , ˆ a † k ) or (ˆ ξ a + , ˆ ξ b − )TABLE I. Overview of notations for operator bases.

Listedare real (self-adjoint) and complex operator bases for bosonsand fermions, as well as a uniﬁed notation used throughoutthis work. For an N -mode quantum system, indices are inthe range j, k ∈ { , . . . , N } or a, b ∈ { , . . . , N } . The creationand annihilation operators in a complex basis satisfy canon-ical commutation/anti-commutation relations (CCR/CAR).Commonly used alternative notations are also listed, someomitting the hat notation for Hilbert space operators. G ac g cb = δ ab . Similarly, we deﬁne ω ab = (Ω − ) ab satis-fying Ω ac ω cb = δ ab . This enables us to deﬁne the linearmap J ab = (cid:26) − G ac ω cb (bosons) Ω ac g cb (fermions) , (13)which depends on the state under considerations. Wewill see in (15) that for Gaussian states the two formulasin (13) coincide, completely specifying all correlations forboth bosons and fermions.

B. Deﬁnition of pure Gaussian states

Up to this point, the quantum state ρ has been as-sumed to be an arbitrary quantum state in the Hilbertspace (with z a = 0). In what follows, we put a speciﬁcemphasis on the set of pure Gaussian states . There aremany equivalent deﬁnitions in the literature: One maydeﬁne Gaussian states as those satisfying Wick’s theo-rem, as ground states of non-interacting ( i.e. , quadratic),non-degenerate Hamiltonians, or as states vanishing un-der a full set of speciﬁc annihilation operators. Here, weuse yet another equivalent, though very compact deﬁni-tion based on (13), which states that for a state ρρ is pure a Gaussian state ⇔ J = − (cid:49) . (14)If this holds, both formulas in (13) coincide , such that ρ is pure Gaussian ⇔ − G ac ω cb = Ω ac g cb . (15)As a pure state, we can write ρ = | ψ (cid:105) (cid:104) ψ | for a normalizedstate vector | ψ (cid:105) . One can show that this state vector We can prove this by computing ( Gω ) − = ω − G − = Ω g andvice versa. Moreover, (14) implies J − = − J . The two relationstogether imply (15). | ψ (cid:105) is uniquely determined (up to a complex phase) byeither the covariance matrix Γ ab or equivalently by thecomplex structure J , which we use as a label to write | ψ (cid:105) = | J (cid:105) = | J (cid:105) . An alternative and completely equivalent deﬁnition ofpure Gaussian states can be phrased directly in terms of J , where | J (cid:105) is the solution of the equation ( δ ab + i J ab ) ˆ ξ b | J (cid:105) = 0 . (16)This deﬁnition is based on the observation that the eigen-vectors ˆ ξ a ± of J with eigenvalues ± i are given by ˆ ξ a ± = ( δ ab ∓ i J ab ) ˆ ξ b with ˆ ξ a − | J (cid:105) = 0 . (17)The variables ˆ ξ a ± behave in many ways as creation andannihilation operators, but do not require a speciﬁc ba-sis in phase space, which enables a compact covariantproof of Wick’s theorem [16]. Moreover, ˆ ξ a ± spans the N -dimensional complex eigenspaces V ± (cid:67) of J , which are thespaces of creation or annihilation operators associatedto | J (cid:105) , respectively. We refer to ˆ ξ a ± as phase space co-variant creation and annihilation operators, which satisfythe following commutation (bosons) or anti-commutation(fermions) relations:[ ˆ ξ a ± , ˆ ξ b ± ] = 0 , [ ˆ ξ a ± , ˆ ξ b ∓ ] = ∓ C ab , (bosons) { ˆ ξ a ± , ˆ ξ b ± } = 0 , { ˆ ξ a ± , ˆ ξ b ∓ } = C ab , (fermions) (18)where we introduced the 2-point function C ab = (cid:104) ˆ ξ a ˆ ξ b (cid:105) = 12 ( G ab + i Ω ab ) . (19)For a given state vector | J (cid:105) , we can always choose a basisˆ ξ q,p ≡ ( ˆ q , . . . , ˆ q N , ˆ p , . . . , ˆ q N ) a,a † ≡ ( ˆ a , . . . , ˆ a N , ˆ a † , . . . , ˆ a † N ) , (20)in which Ω and G simultaneously take the standard formsΩ q,p ≡ (cid:16) (cid:49) − (cid:49) (cid:17) a,a † ≡ (cid:16) − i (cid:49) i (cid:49) (cid:17) , G q,p ≡ (cid:16) (cid:49) (cid:49) (cid:17) a,a † ≡ (cid:16) (cid:49)(cid:49) (cid:17) . (21)In contrast to (10), where only one of the respective back-ground structure (Ω for bosons or G for fermions) takesthis form, while the other may take any allowed form, wehave now chosen the basis { ˆ ξ a } attuned to | J (cid:105) , so thatalso Γ ( G for bosons, Ω for fermions) takes the abovestandard form. In this basis, we ﬁnd If we allow for bosons z a = (cid:104) ξ (cid:105) a (cid:54) = 0, we would need to includethis in our label of the state vector to write | ψ (cid:105) = | J, z (cid:105) = | J, z (cid:105) . It is easy to verify J ab ˆ ξ b ± = ± i ˆ ξ a ± from (17). Complex conjugation of the basis ˆ ξ a satisﬁes ˆ ξ † a = C ab ˆ ξ b imply-ing ˆ ξ † a ± = C ab ˆ ξ b ∓ . We have the conjugation matrix C q,p ≡ (cid:18) (cid:49) (cid:49) (cid:19) a,a † ≡ (cid:18) (cid:49)(cid:49) (cid:19) . ˆ ξ − q,p ≡ (cid:16) ˆ a √ , . . . , ˆ a N √ , − i ˆ a √ , . . . , − i ˆ a N √ (cid:17) a,a † ≡ ( ˆ a , . . . , ˆ a N , , . . . , ) , ˆ ξ + q,p ≡ (cid:16) ˆ a † √ , . . . , ˆ a † N √ , i ˆ a † √ , . . . , i ˆ a † N √ (cid:17) a,a † ≡ ( , . . . , , ˆ a † , . . . , ˆ a † N ) . (22)Most of the relevant intuition for Gaussian states for N modes comes from considering one bosonic or twofermionic modes, as reviewed in the following examples,as Gaussian states for a single fermionic mode is almost trivial. We will further see explicitly that the familiesof fermionic Gaussian states consist of two disconnectedcomponents. Example 1 (Single mode pure Gaussian bosonic states) . We consider a single bosonic mode with ˆ ξ q,p ≡ (ˆ q, ˆ p ) a,a † ≡ (ˆ a, ˆ a † ) . With respect to the number eigenvectors | n (cid:105) , themost general Gaussian state vector with z a = 0 is | J (cid:105) = 1 (cid:112) cosh ρ ∞ (cid:88) n =0 √ (2 n )!2 n n ! (cid:0) − e i φ tanh ρ (cid:1) n | n (cid:105) , (23) where φ ∈ [0 , π ] and ρ ∈ [0 , ∞ ) . With respect to abovebases, one ﬁnds G q,p ≡ (cid:18) cosh ρ + cos φ sinh ρ sin φ sinh ρ sin φ sinh ρ cosh ρ − cos φ sinh ρ (cid:19) , a,a † ≡ (cid:18) e i φ sinh ρ cosh ρ cosh ρ − e − i φ sinh ρ (cid:19) , (24) J q,p ≡ (cid:18) − sin φ sinh ρ cos φ sinh ρ + cosh ρ cos φ sinh ρ − cosh ρ sin φ sinh ρ (cid:19) , a,a † ≡ (cid:18) − i cosh ρ i e i φ sinh ρ − i e − i φ sinh ρ i cosh ρ (cid:19) . (25) In summary, Gaussian states of a single bosonic modeform a two-dimensional plane parametrized by polar co-ordinates ( ρ, φ ) . Example 2 (Single and two mode pure Gaussianfermionic states) . We consider a single fermionic modewith ˆ ξ q,p ≡ (ˆ q, ˆ p ) a,a † ≡ (ˆ a, ˆ a † ) . There are only two distinct pureGaussian states, which are characterized by the state vec-tors (cid:26) | J + (cid:105) = | (cid:105)| J − (cid:105) = | (cid:105) (cid:27) , (26) whose covariance matrix and complex structures are Ω ± q,p ≡ (cid:18) ± ∓ (cid:19) a,a † ≡ (cid:18) ∓ i ± i 0 (cid:19) , (27) J ± q,p ≡ (cid:18) ± ∓ (cid:19) a,a † ≡ (cid:18) ∓ i 00 ± i (cid:19) . (28) In summary, there are only two distinct Gaussian purestates for a single fermionic mode rather than a family ofstates. We therefore consider also two fermionic modes with ˆ ξ q,p ≡ (ˆ q , ˆ q , ˆ p , ˆ p ) a,a † ≡ (ˆ a , ˆ a , ˆ a † , ˆ a † ) , where the mostgeneral Gaussian state vectors are (cid:40) | J + (cid:105) = cos θ | , (cid:105) + e i φ sin θ | , (cid:105)| J − (cid:105) = cos θ | , (cid:105) + e i φ sin θ | , (cid:105) (cid:41) (29) with θ ∈ [0 , π ] and φ ∈ [0 , π ] . Their covariance matrixand complex structure are Ω ± q,p ≡  ∓ sin θ sin φ ± cos θ ± sin θ cos φ ± sin θ sin φ − sin θ cos φ cos θ ∓ cos θ sin θ cos φ θ sin φ ∓ sin θ cos φ − cos θ − sin θ sin φ  a,a † ≡  e i φ sin θ − i cos θ − i e i φ sin θ − i cos θ i cos θ − i e − i φ sin θ θ i e − i φ sin θ  , (30) J ± q,p ≡  ∓ sin θ sin φ ± cos θ ± sin θ cos φ ± sin θ sin φ − sin θ cos φ cos θ ∓ cos θ sin θ cos φ θ sin φ ∓ sin θ cos φ − cos θ − sin θ sin φ  a,a † ≡ (cid:32) ∓ i cos θ i δ ∓ e − i φ sin θ δ ± e i φ sin θ i δ ∓ e i φ sin θ − i cos θ − i δ ± e i φ sin θ − i δ ± e − i φ sin θ ± i cos θ − i δ ∓ e − i φ sin θ i δ ± e − i φ sin θ − i δ ∓ e i φ sin θ i cos θ (cid:33) (31) with δ ± = ± , i.e. , δ + = 1 and δ − = 0 . In summary,Gaussian states of two fermionic modes form two dis-connected spheres parametrized by angles ( θ, φ ) , wherewe further distinguish the Gaussian state vectors of type | J + (cid:105) and | J − (cid:105) . The two sets are distinguished by the par-ity operator ˆ P = exp(i π ˆ N ) , as the total number operator ˆ N = (cid:80) i ˆ a † i ˆ a i is even for | J + (cid:105) and odd for | J − (cid:105) C. Gaussian transformations

In this section, we will introduce a special set of unitarytransformations that map Gaussian states into Gaus-sian states. They are generated by operators that arequadratic in ˆ ξ a . We deﬁne the Lie group G as lineartransformations on the classical phase space V that pre-serve the symplectic form Ω ab for bosons or the metric G ab for fermions G = (cid:26) Sp(2 N, (cid:82) ) (bosons) O(2 N, (cid:82) ) (fermions) , (32)which we represent as matrices M : V → V withSp(2 N, (cid:82) ) = (cid:8) M ab ∈ GL(2 N, (cid:82) ) (cid:12)(cid:12) M Ω M (cid:124) = Ω (cid:9) , O(2 N, (cid:82) ) = (cid:8) M ab ∈ GL(2 N, (cid:82) ) (cid:12)(cid:12) M GM (cid:124) = G (cid:9) . (33)The associated Lie algebras g are then deﬁned as sp (2 N, (cid:82) ) = (cid:8) K ab ∈ gl (2 N, (cid:82) ) (cid:12)(cid:12) K Ω + Ω K (cid:124) = 0 (cid:9) , so (2 N, (cid:82) ) = (cid:8) K ab ∈ gl (2 N, (cid:82) ) (cid:12)(cid:12) KG + GK (cid:124) = 0 (cid:9) . (34) Note that the Lie algebra of O(2 N, (cid:82) ) and SO(2 N, (cid:82) ) are thesame, commonly referred to as so (2 N, (cid:82) ). We can construct a (projective) representation of theseLie groups as unitary operators S ( M ) on Hilbert spaceby exponentiating quadratic operators. For this, we ﬁrstdeﬁne an identiﬁcation between Lie algebra elements K ∈ g and anti-Hermitian quadratic operators (cid:98) K with K ab ⇔ (cid:98) K = (cid:26) − i2 ω ac K cb ˆ ξ a ˆ ξ b (bosons) g ac K cb ˆ ξ a ˆ ξ b (fermions) , (35)which is uniquely ﬁxed by the requirement (cid:92) [ K , K ] = [ (cid:98) K , (cid:98) K ] . (36)For any M = e K , we deﬁne the squeezing operator S ( e K ) ∼ = e (cid:98) K , (37)where ∼ = implies equality up to a complex phase. Forfermions, products of M = e K for K ∈ so (2 N, (cid:82) ) willonly generate the subgroup SO(2 N, (cid:82) ), whose group el-ements satisfy det M = 1. To generate other group ele-ments M ∈ O(2 N, (cid:82) ) with det M = −

1, we can take anydual vector v a ∈ V ∗ satisfying v a G ab v b = 2 to deﬁne S ( M v ) = v a ˆ ξ a , (fermions) (38)representing( M v ) ab = v c G ca v b − δ ab ∈ O(2 N, (cid:82) ) (39)with det M v = −

1. We can further check that S ( M v )is unitary. Moreover, we have S † ( M v ) ˆ ξ a S ( M v ) =( M v ) ab ˆ ξ b . Consequently, together S ( e K ) and S ( M v )for a single chosen v a generate the full orthogonalgroup O(2 N, (cid:82) ), i.e. , every element M ∈ O(2 N, (cid:82) )with det M = − S ( M ) ∼ = S ( e K ) S ( M v ) for a ﬁxed v a and K = log M M − v . Thisdeﬁnition of S ( M ) forms a projective representation sat-isfying S ( M ) S ( M ) ∼ = S ( M M ) . (40)Furthermore, we can read oﬀ the group element M from S ( M ) by its action on ˆ ξ a via the relation S † ( M ) ˆ ξ a S ( M ) = M ab ˆ ξ b . (41)Every Gaussian state vector | J (cid:105) has a stabilizer subgroupU( N ) = (cid:8) M ∈ G (cid:12)(cid:12) M Γ M (cid:124) = Γ (cid:9) = (cid:8) M ∈ G (cid:12)(cid:12) M JM − = J (cid:9) (42)which preserves Γ and J . Similarly, the associated uni-tary transformation S ( M ) will preserve the quantumstate vector | J (cid:105) up to a complex phase, i.e. , we have The equality turns out to hold up to an overall sign, i.e. , we canchoose S ( M ), such that S ( M ) S ( M ) = ±S ( M M ). S ( M ) | J (cid:105) ∼ = | J (cid:105) for all M ∈ U( N ). This deﬁnes the Liesubalgebra u ( N ) = (cid:8) K ∈ g (cid:12)(cid:12) K Γ + Γ K (cid:124) = 0 (cid:9) = (cid:8) K ∈ g (cid:12)(cid:12) [ K, J ] = 0 (cid:9) . (43)Similarly, we have (cid:98) K | J (cid:105) ∝ | J (cid:105) . Given a Gaussian refer-ence state vector | J (cid:105) , we can reach any other Gaussiantarget state vector | J (cid:105) via | J (cid:105) ∼ = S ( M ) | J (cid:105) ∼ = | M Γ M (cid:124) (cid:105) . (44)The solution of the equation M Γ M (cid:124) = Γ is not unique,as we can always multiply by u ∈ U( N ) associatedto | J (cid:105) , such that ( M u )Γ ( M u ) (cid:124) = M u Γ u (cid:124) M (cid:124) = M Γ M (cid:124) . We can ﬁx a special solution T by imposingthe condition T Γ = Γ T (cid:124) leading to the simpler equa-tion J = T J T − = T J , which is solved by T = − JJ .We deﬁne this as the relative complex structure ∆ ab = T ac T cb = − J ac ( J ) cb = Γ ac (Γ − ) cb . (45)It captures the full basis independent information aboutthe relationship of the two Gaussian states J and J . Wehave the following properties as proven in [16]: • Bosons.

The spectrum of ∆ consists of pairs( e r i , e − r i ) with r i ∈ [0 , ∞ ), such that T = √ ∆has eigenvalues ( e r i , e − r i ). ∆ is diagonalizable anda symplectic group element. • Fermions.

The spectrum of ∆ consists of quadru-ples ( e i 2 r i , e i 2 r i , e − i 2 r i , e − i 2 r i ) with r i ∈ (0 , π )or pairs (1 ,

1) or ( − , − r i ∈ { , π } . If the number of pairs ( − , −

1) iseven, i.e. , the eigenvalue − J and J lie in thesame topological component of fermionic Gaussianstates, i.e. , they can be continuously deformed intoeach other. Otherwise, i.e. , if the number of eigen-value pairs ( − , −

1) is odd, J and J live in sep-arate components. T = √ ∆ is only well deﬁnedin the former case and has quadruple eigenvalues( e i r i , e i r i , e − i r i , e − i r i ) for r ∈ (0 , π ). If there areeigenvalue quadruples ( − , − , − , − to deﬁne T as areal linear map with T = ∆ in this sub block corre-sponding to choosing diﬀerent eigenvectors for thequadruple of eigenvalues (i , i , − i , − i).We can bring ∆, T and K = log T into block-diagonalform. We ﬁnd 2 × × { r i } from above correspond to ρ in ourbosonic example 1 and θ in our fermionic cxample 2. Sometimes also referred to as relative covariance matrix[20, 21]. In essence, T describes half way on the shortest path betweenΓ and Γ. The eigenvalues ( − , − , − , −

1) imply that Γ andΓ are on opposite poles of spheres, in which case all the pointson the equator are equivalent choices of being half-way. Example 3 (Bosons revisited) . We reconsider Exam-ple 1 and choose the reference state vector | J (cid:105) with G q,p ≡ (cid:16) (cid:17) a,a † ≡ (cid:16) (cid:17) , J q,p ≡ (cid:16) − (cid:17) a,a † ≡ (cid:16) i 00 − i (cid:17) . (46) A general symplectic transformation G = Sp(2 , (cid:82) ) is M q,p ≡ (cid:18) cos τ cosh ρ − sin θ sinh ρ − sin τ cosh ρ + cos θ sinh ρ sin τ cosh ρ + cos θ sinh ρ cos τ cosh ρ + sin θ sinh ρ (cid:19) a,a † ≡ (cid:18) e i τ cosh ρ i e i θ sinh ρ − i e − i θ sinh ρ e − i τ cosh ρ (cid:19) , for which we have | J (cid:105) ∼ = S ( M ) | J (cid:105) with Γ from (24) ,where φ = τ − θ . The stabilizer group of | J (cid:105) consists of u q,p ≡ (cid:18) cos ϕ sin ϕ − sin ϕ cos ϕ (cid:19) a,a † ≡ (cid:18) e i ϕ e − i ϕ (cid:19) . (47) From the relative complex structure ∆ = T = − JJ , wecompute the generator K = log T q,p ≡ ρ (cid:16) sin φ cos φ cos φ − sin φ (cid:17) a,a † ≡ ρ (cid:16) e − i φ − i e i φ (cid:17) , (48) such that | J (cid:105) ∼ = e (cid:98) K | J (cid:105) . We can always change basis toreach a standard forms φ = π , where we can read oﬀ theeigenvalues ( e ρ , e − ρ ) of ∆ . Example 4 (Fermions revisited) . We reconsider Exam-ple 2. For a single fermionic mode, we choose the refer-ence state vector | J (cid:105) with Ω q,p ≡ (cid:16) − (cid:17) a,a † ≡ (cid:16) − ii 0 (cid:17) , J q,p ≡ (cid:16) − (cid:17) a,a † ≡ (cid:16) − i 00 i (cid:17) . (49) The stabilizer subgroup

U(1) consists of the same el-ements as in (42) , which coincides with the group

SO(2 , (cid:82) ) . Consequently, the only group elements thattransform | J (cid:105) = | J + (cid:105) into | J − (cid:105) lie in the disconnectedcomponent. We also reconsider two fermionic modes withreference state vector | J (cid:105) given by Ω q,p ≡ (cid:16) (cid:49) − (cid:49) (cid:17) a,a † ≡ (cid:16) − i (cid:49) i (cid:49) (cid:17) , J q,p ≡ (cid:16) (cid:49) − (cid:49) (cid:17) a,a † ≡ (cid:16) − i (cid:49)

00 i (cid:49) (cid:17) . (50) There is a -dimensional subspace of these generatorsalso satisfying [ K, J ] , which generates U(2) ⊂ O(4 , (cid:82) ) .We can reach the most general complex structure J + bya continuous path generated by K = 12 log ∆ q,p ≡ θ  φ φ − cos φ − sin φ

00 sin φ − cos φ − sin φ φ  (51) for ∆ = − J + J . To reach state vectors of the form | J − (cid:105) , we must also apply an additional transformation S ( M v ) with v q,p ≡ ( √ , , , a,a † ≡ (1 , , , to ﬁnd | J − (cid:105) = S ( M v ) | J + (cid:105) . We can always change basis to reach a stan-dard forms φ = 0 , where we can read oﬀ the eigenvalues ( e i θ , e i θ , e − i θ , e − i θ ) of ∆ . D. Geometry of pure Gaussian states

The family of pure Gaussian states forms a diﬀeren-tiable manifold M . It provides a versatile tool for ana-lytical and numerical studies of bosonic and fermionicquantum systems with applications ranging from con-densed matter [5, 7–9] and quantum information [1, 3]to quantum optics [7] and ﬁeld theory [12]. Mathemat-ically, M is a symmetric space (type CI for bosons andDIII for fermions) and has the properties of a so-calledK¨ahler manifold. The latter makes Gaussian states par-ticularly suitable for variational studies, where groundstates and time evolution are approximated on a suitablesubset of Hilbert space. In the following, we will discussthe rich geometry of this manifold, which plays an impor-tant role when one wishes to locally optimize a functionon it. We closely follow the conventions of ref. [15], whichcontains a comprehensive review of the geometry of vari-ational families, which in turn builds upon ideas of thetime-dependent variational principle [22, 23].We recall our deﬁnition of Gaussian state vectors | J (cid:105) as normalized vectors in Hilbert space, such that theirlinear complex structure J ab satisﬁes J = − (cid:49) . Notethat knowing Γ does not ﬁx the complex phase of theHilbert space vector, i.e. , Γ actually describes elementsof a projective Hilbert space P ( M ) which we could repre-sent as (pure) density operators ρ Γ = | J (cid:105) (cid:104) Γ | rather thanHilbert space vectors | J (cid:105) . However, we often prefer tothink of pure quantum states as state vectors | ψ (cid:105) ratherthan density operators ρ = | ψ (cid:105) (cid:104) ψ | and accept that weneed to keep in mind that these vectors are actually onlydeﬁned up to a complex phase, i.e. , | J (cid:105) ∼ = e i ϕ | J (cid:105) .Given a covariance matrix Γ of a pure Gaussian statevector | J (cid:105) , we are only allowed to change it in such a waythat respects symmetry (symmetric for bosons, antisym-metric for fermions) and preserves purity ( J = − (cid:49) ). Forthe inﬁnitesimal change δ Γ ab , we thus ﬁnd the constraints δ Γ ab = δ Γ ba , δ Γ J (cid:124) = Jδ Γ , (bosons) δ Γ ab = − δ Γ ba , δ Γ J (cid:124) = Jδ Γ . (fermions) (52)Knowing the change of the covariance matrix Γ does notuniquely ﬁx the change of the state vector | J (cid:105) , as wecould also change the complex phase. Such change wouldbe proportional to i | J (cid:105) . To remove such pure changeof gauge, we require that the tangent vector | δ Γ (cid:105) = δ Γ ab | V ab (cid:105) is orthogonal to | J (cid:105) itself, i.e. , (cid:104) Γ | V ab (cid:105) = 0.Under this condition, one can derive [15] | V ab (cid:105) = (cid:40) i4 g ac ω bd ˆ ξ c + ˆ ξ d + | J (cid:105) (bosons) g ac ω bd ˆ ξ c + ˆ ξ d + | J (cid:105) (fermions) . (53)This allows us to compute the inner product between twodiﬀerent variations δ Γ and δ ˜Γ as (cid:104) δ Γ | δ ˜Γ (cid:105) = 12 (cid:16) g ( δ Γ , δ ˜Γ) + i ω ( δ Γ , δ ˜Γ) (cid:17) , (54)where g and ω are real linear forms on the tangent space, i.e. , the space of allowed variations δ Γ ab subject to (52).Interestingly, g is a metric (symmetric, positive-deﬁnite)just as g and ω is a symplectic form (antisymmetric, non-degenerate) just as ω . We can evaluate them using (54)and (53) leading to g ( δ Γ , δ ˜Γ) = Tr( δ Γ gδ ˜Γ g ) = δ Γ ab g bc δ ˜Γ cd g da , ω ( δ Γ , δ ˜Γ) = Tr( δ Γ gδ ˜Γ ω ) = δ Γ ab g bc δ ˜Γ cd ω da , (55)which establishes relationships between g , ω , g and ω .Given a Gaussian state vector | J (cid:105) and a Lie algebraelement K ∈ g , we compute the induced variation δ Γ K = ddt (cid:12)(cid:12)(cid:12)(cid:12) t =0 e tK Γ e tK (cid:124) = K Γ + Γ K (cid:124) , (56) δJ K = ddt (cid:12)(cid:12)(cid:12)(cid:12) t =0 e tK Je − tK = [ K, J ] . (57)This is the linear map δ Γ K : g → T Γ M : K (cid:55)→ δ Γ K .Its kernel consists of all Lie algebra elements that do notchange the covariance matrix Γ and is thus u ( N ) = (cid:8) K ∈ g (cid:12)(cid:12) [ K, J ] = 0 (cid:9) (58)from (43). We deﬁne its orthogonal complement u ⊥ ( N ) = (cid:8) K ∈ g (cid:12)(cid:12) { K, J } = 0 (cid:9) , (59)which is isomorphic to the tangent space T Γ M . Thiswill allow us to exploit the group structure of Gaussianstates to compute gradient descent with respect to g andsymplectic evolution with respect to ω without needingto evaluate them at every step. Example 5 (Tangent space for bosons) . We reconsidera single bosonic mode from example 4 at the state vector | J (cid:105) with Γ and J deﬁned in (50) . The tangent spacecan be parametrized as δG q,p ≡ (cid:18) a bb − a (cid:19) a,a † ≡ (cid:18) a + i b a − i b (cid:19) , (60) δJ q,p ≡ (cid:18) − b aa b (cid:19) a,a † ≡ (cid:18) b + i ab − i a (cid:19) . (61) The associated Hilbert space vector | δ Γ (cid:105) = δ Γ ab | V ab (cid:105) is | δ Γ (cid:105) = a + i b a † | J (cid:105) . (62) We further ﬁnd (cid:104) δ Γ | δ ˜Γ (cid:105) = a ˜ a + b ˜ b + i a ˜ b − b ˜ a which implies g ( δ Γ , δ ˜Γ) = a ˜ a + b ˜ b and ω ( δ Γ , δ ˜Γ) = a ˜ b − b ˜ a . (63) It is the genuine orthogonal complement on the Lie algebra g with respect to the Killing form K ( K, ˜ K ) = 2 N Tr( K ˜ K ). Example 6 (Tangent space for fermions) . We recon-sider Example 4. For a single fermionic mode, he tangentspace is trivial, i.e. , zero-dimensional, because the set ofpure Gaussian states consists of two discrete elements.We therefore directly consider two fermionic modes withreference state vector | J (cid:105) deﬁned in (50) . The tangentspace is then parametrized as δ Ω q,p ≡  a bb − a − a − b − b a  a,a † ≡  − i a − i b − i b i a i a i b i b − i a  , (64) δJ q,p ≡  a bb − a − a − b − b a  a,a † ≡  − i a − i b − i b i a i a i b i b − i a  . (65) The associated Hilbert space vector | δ Γ (cid:105) = δ ab | V ab (cid:105) is | δ Γ (cid:105) = − i2 ( a + i b )ˆ a † ˆ a † | J (cid:105) . (66) We further ﬁnd (cid:104) δ Γ | δ ˜Γ (cid:105) = a ˜ a + b ˜ b + i a ˜ b − b ˜ a which implies g ( δ Γ , δ ˜Γ) = a ˜ a + b ˜ b and ω ( δ Γ , δ ˜Γ) = a ˜ b − b ˜ a . (67) E. Parametrization of Gaussian states

In the previous sections, we saw that a Gaussian statevector | J (cid:105) is uniquely (up to a complex phase) charac-terized by its complex structure J . For our purpose, itis more eﬃcient to parametrize Gaussian states by ﬁrstchoosing a reference complex structure J and then labelthe Gaussian state vector | J M (cid:105) by the group transfor-mation M , such that J M = M J M − . While M is notunique for a given J M , i.e. , the map M (cid:55)→ J M is notinjective, it suﬃces for the purpose of optimization if wecan eﬃciently compute gradients on the group manifold.We will decompose the space of directions on the groupinto redundant directions (not changing the state) andnon-redundant directions (that change the state). Thisspace is called tangent space T J M M and it can be de-scribed by the allowed variations δJ M of the complexstructure J M . These variations are not completely free,because we only allow variations that respect the condi-tions on a complex structure J , i.e. , J = − (cid:49) and thatit is compatible with symplectic form Ω or metric G .For a reference J and a group element M , we can as-sociate to every Lie algebra element K ∈ g the variation δJ M ( K ) = M [ K, J ] M − . (68)This is the change induced from moving along M e tK away from J M .In some situations, we may not wish to parametrizethe full manifold of Gaussian states, but only a subset.This applies in particular when optimizing over Gaussianpuriﬁcations | J (cid:105) of a mixed Gaussian state ρ A , i.e. , we (cid:49) M h (cid:48) h (cid:48)⊥ G (cid:48) M J J M Tangent space to G (cid:48) (modulo h (cid:48) ), spanned by orthonormalbasis of generators Ξ µ of h ⊥ Ξ Ξ FIG. 1.

Parametrization of Gaussian states.

We ﬁx a (pure)reference Gaussian state vector | J (cid:105) and then use the sub-group G (cid:48) ⊂ G to generate the manifold M described by com-plex structures J M = MJ M − for M ∈ G (cid:48) . require ρ A = Tr H A (cid:48) | J (cid:105) (cid:104) J | for Hilbert spaces H = H A ⊗H A (cid:48) . Any other Gaussian puriﬁcation | ˜ J (cid:105) of ρ A is relatedto | J (cid:105) by a Gaussian transformation | ˜ J (cid:105) = S ( (cid:49) A ⊕ M A (cid:48) ) | J (cid:105) = S A (cid:48) ( M A (cid:48) ) | J (cid:105) , (69) i.e. , the set of puriﬁcations of ρ A is generated from | J (cid:105) by the subgroup G (cid:48) = { (cid:49) A ⊕ M A (cid:48) ∈ G} ⊂ G . (70)This group only aﬀects the Hilbert space H A (cid:48) , such thatthe reduction of | J (cid:48) (cid:105) onto H A will not change and thusstay to be ρ A . In summary, we will consider a subalgebra g (cid:48) ⊂ g that generates the allowed transformations, e.g. ,the ones only changing the subsystem H A (cid:48) . Analogousto the decomposition g = u ( N ) ⊕ u ⊥ ( N ), we can thendeﬁne h (cid:48) = (cid:8) K ∈ g (cid:48) (cid:12)(cid:12) [ K, J ] = 0 (cid:9) , h (cid:48)⊥ = (cid:8) K ∈ g (cid:48) (cid:12)(cid:12) { K, J } = 0 (cid:9) , (71)such that g (cid:48) = h (cid:48) ⊕ h (cid:48)⊥ . In this case, a basis of h (cid:48)⊥ con-sists of a maximal set of generators Ξ µ ∈ g (cid:48) that lead tolinearly independent changes of the state J .In summary, our parametrization of Gaussian states orsubfamilies is based on the following ingredients, whichare illustrated in ﬁg. 1. • Reference state vector | J (cid:105) . We specify a pureGaussian state vector | J (cid:105) as reference by its com-plex structure J . • Subalgebra g (cid:48) of allowed transformations. Wespecify a subalgebra g (cid:48) ⊂ g that we use to generateany other allowed state (potentially g (cid:48) = g ). • Generated subgroup G (cid:48) . The Lie subalgebra g (cid:48) generates the Lie subgroup G (cid:48) ⊂ G . • Manifold of certain Gaussian states M . Theresulting subgroup G (cid:48) generates all reachable com-plex structures J M = M J M − and the associatedstate vectors | J M (cid:105) . • Stabilizer h (cid:48) of J . We deﬁne the subalgebra h (cid:48) = { K ∈ g (cid:48) | [ K, J ] = 0 } ⊂ g (cid:48) generates thosetransformation that leave J invariant. • Subspace h (cid:48)⊥ changing J . We deﬁne the sub-space h (cid:48)⊥ = { K ∈ g (cid:48) |{ K, J } = 0 } of generators K that are orthogonal to the space h (cid:48) . • Tangent space T J M M . We span the tangentspace at a given Gaussian state J M = M J M − as δJ i = M [ K i , J ] M − with K i ∈ h (cid:48)⊥ .We therefore choose a basis Ξ ≡ (Ξ , . . . , Ξ m ) of h (cid:48)⊥ andthen compute g µν = g ( δ Γ µ , δ Γ ν ) and ω µν = ω ( δ Γ µ , δ Γ ν ) , (72)where δ Γ µ = Ξ µ Γ + Γ Ξ (cid:124) µ . We can simplify this to ﬁnd g µν = 14 Tr(Ξ µ Ξ ν + Ξ µ J Ξ ν J ) , (73) ω µν = 12 Tr(Ξ µ J Ξ ν ) , (74)where we have uniﬁed the expressions for bosons andfermions. F. Puriﬁcation of mixed Gaussian states

An important class of sub-manifolds of pure Gaussianstates that are related by the action of some subgroup G (cid:48) ⊂ G are Gaussian puriﬁcations of a given mixed Gaus-sian state ρ . Various measures of quantum correlations,such as entanglement of puriﬁcation (EoP) or complexityof puriﬁcation (CoP), are deﬁned as some critical valueon such manifolds, which we review in section V. Here,we discuss the properties of the underlying manifold ofGaussian puriﬁcations.In section II B, we focused on pure Gaussian states,which we introduced as those states, for which the com-plex structure J satisﬁes J = − (cid:49) . A mixed Gaus-sian state ρ is still fully characterized by J as computedin (13), but which now satisﬁes the condition (cid:49) ≤ − J , (bosons) ≤ − J ≤ (cid:49) . (fermions) (75)This implies that the eigenvalues of J appear in conju-gate pairs ± i c i with c i ∈ [1 , ∞ ) for bosons and c i ∈ [0 , J as complex struc-tures (unless J = − (cid:49) , i.e. , all c i = 1, in which case ρ is0pure), but rather as a restricted complex structure . As allthe eigenvalues are imaginary, we can diagonalize J onlyover the complex numbers. If we only use real transfor-mations, we can only bring J into a block-diagonal formwith antisymmetric 2 × ρ is Gaussian. Instead, weneed to require that ρ = (cid:40) e − q ab ˆ ξ a ˆ ξ b − c (bosons) e − i q ab ˆ ξ a ˆ ξ b − c (fermions) , (76)where q ab is a positive-deﬁnite bilinear form, i.e. , ρ is theexponential of a quadratic operator ( c ﬁxes the normal-ization Tr( ρ ) = 1). We will later see in formula 8 how J , q and c are related.We now consider puriﬁcations of Gaussian states. Forthis, we refer to the original Hilbert space as H A with2 N A associated operators ˆ ξ aA and classical phase space A . We consider a mixed Gaussian state ρ A in this systemwith complex structure J A . From our previous discussionof the eigenvalues ± i c i of J A , we can ﬁnd for every ﬁxedbasis ˆ ξ A a group transformation T A ∈ G A (symplectic ororthogonal transformation on A ), such that J ≡ T A J msta T − (77)with respect to ˆ ξ A . We have the mixed state standardform J msta ≡  cosh(2 r ) (cid:65) · · · · · · cosh(2 r N A ) (cid:65)  , (bosons) J msta ≡  cos(2 r ) (cid:65) · · · · · · cos(2 r N A ) (cid:65)  (fermions) (78) with squeezing parameters r i and the 2 × (cid:65) q,p ≡ (cid:18) − (cid:19) a,a † ≡ (cid:18) − i 00 i (cid:19) . (79)It is well-known that such a mixed Gaussian state ρ A canbe puriﬁed by adding more degrees of freedom. For this,we extend the Hilbert space from H A to H (cid:48) = H A ⊗ H A (cid:48) with operators ˆ ξ (cid:48) = ( ˆ ξ A , ˆ ξ A (cid:48) ). The puriﬁcation of ρ A isthen a state vector | J (cid:105) in the larger Hilbert space H (cid:48) ,such that ρ A = Tr H A (cid:48) | J (cid:105) (cid:104) J | . (80)This requires H A (cid:48) to be suﬃciently large, such that allmixed modes can be puriﬁed. In light of our previous con-siderations involving the squeezing parameters r i , system H A (cid:48) must contain at least as many modes N as there arenon-zero parameters r i associated in the standard formof J A to ρ A . The Gaussian puriﬁcation can then be in-ferred by constructing the complex structure J on thelarger phase space A ⊕ A , such that the restriction [ J ] A to A yields J A . Put diﬀerently, every non-zero squeez-ing parameter r i corresponds to an individual bosonic orfermionic degree of freedom that can and needs to bepuriﬁed by correlating it with an additional auxiliary de-gree of freedom. Of course, we are always free to addeven more auxiliary modes that are uncorrelated.The resulting puriﬁed form of J with respect to anenlarged basis ˆ ξ (cid:48) = ( ˆ ξ A , ˆ ξ A (cid:48) ) is given by J ≡ ( T A ⊕ T A (cid:48) ) J psta ( T − A ⊕ T − A (cid:48) ) (81)for arbitrary T A (cid:48) ∈ G A (cid:48) , i.e. , any such T A (cid:48) will lead toa valid puriﬁcation | J (cid:105) . The puriﬁed standard form J psta has been derived in [24] as1 J psta =  cosh(2 r ) (cid:65) · · · r ) (cid:83) · · · · · · · · · cosh(2 r N A ) (cid:65) · · · sinh(2 r N A ) (cid:83) · · · r ) (cid:83) · · · r ) (cid:65) · · · · · · · · · sinh(2 r N A ) (cid:83) · · · cosh(2 r N A ) (cid:65) · · · · · · · · · (cid:65) · · · · · · · · · · · · (cid:65)  , (bosons) J psta =  cos(2 r ) (cid:65) · · · r ) (cid:83) · · · · · · · · · cos(2 r N A ) (cid:65) · · · sin(2 r N A ) (cid:83) · · · − sin(2 r ) (cid:83) · · · r ) (cid:65) · · · · · · · · · − sin(2 r N A ) (cid:83) · · · cos(2 r N A ) (cid:65) · · · · · · · · · (cid:65) · · · · · · · · · · · · (cid:65)  , (fermions) (82)where we used the 2 × (cid:65) q,p ≡ (cid:16) − (cid:17) a,a † ≡ (cid:16) − i 00 i (cid:17) , (cid:83) q,p ≡ (cid:16) (cid:17) a,a † ≡ (cid:16) − i 0 (cid:17) . (83)Let us now discuss how unique a chosen puriﬁcationis. From the perspective of pure states, we can act witharbitrary unitary operators U = (cid:49) ⊗ U on the statevector | J (cid:105) , i.e. , | ψ U (cid:105) = U | J (cid:105) , while preserving the prop-erty ρ A = Tr H A (cid:48) | ψ U (cid:105) (cid:104) ψ U | . This is well-known in thecontext of the Schmidt-decomposition of | J (cid:105) . However,acting with such a general U will generally lead to a non-Gaussian state vector | ψ U (cid:105) . Instead, we restrict to Gaus-sian unitaries of the form S ( M ) = S ( (cid:49) A ⊕ M A (cid:48) ) with M A (cid:48) ∈ G A (cid:48) , where S ( M ) has been introduced in Sec-tion II C. Here, we used the representation theory of theLie group G , i.e. , the symplectic group for bosonic sys-tems and the orthogonal group for fermionic ones. Fromthe requirement that S ( M ) must act as the identity on H A , we have M = (cid:49) A ⊕ M A (cid:48) . (84)We therefore recognize exactly the setup described in Sec-tion II E with the subgroup G (cid:48) from (70).We can use the standard form of the puriﬁed J ﬁndthe subalgebra h (cid:48) ⊂ g (cid:48) that preserves J . More precisely,we have h (cid:48) = (cid:8) K = (cid:48) ⊕ K (cid:12)(cid:12) [ K, J ] = 0 (cid:9) . (85)If the restriction J A (cid:48) = [ J ] A (cid:48) would describe a pure Gaus-sian state, the resulting algebra h (cid:48) would be isomorphicto u ( N A (cid:48) ). However, for a mixed state complex structure J A , the algebra h (cid:48) will be smaller, which consequently means that its orthogonal complement h (cid:48)⊥ of those Liealgebra elements that change the state vector | J (cid:105) (cid:104) J | willbe of higher dimension than u ⊥ ( N ).It turns out that we have h (cid:48) = 0 if there are no r i = 0.Otherwise, we have h (cid:48) (cid:39) u ( N ) if there are N distinct pa-rameters r i = 0, which correspond to the allowed Gaus-sian unitaries that change the pure Gaussian state con-tained in ρ A . Put diﬀerently, we have ρ A = Tr H A (cid:48) | J (cid:105) (cid:104) J | = ρ r ⊗ · · · ⊗ ρ r NA ,ρ A (cid:48) = Tr H A | J (cid:105) (cid:104) J | = ρ r ⊗ · · · ⊗ ρ r NA ⊗ ρ ⊗ · · · ⊗ ρ (cid:124) (cid:123)(cid:122) (cid:125) N A (cid:48) − N A times , (86)where ρ r is generally a mixed Gaussian state of a singlebosonic or fermionic mode with ρ r = (cid:40) r cosh r e − ˆ n ln coth r (bosons) sin r cos r e − ˆ n ln tan r (fermions) , (87)where ˆ n is the number operator of a single bosonic orfermionic mode associated to the creation and annihila-tion operator of the corresponding block of J A or J A (cid:48) intheir block-diagonal form.In summary, provided that all r i (cid:54) = 0, we have h (cid:48) = 0,such that the set of orthogonal generators is given by h (cid:48)⊥ = (cid:48) ⊕ g A (cid:48) = { ⊕ K A (cid:48) | K A (cid:48) ∈ g A (cid:48) } , (88)where g A (cid:48) are the generators of G A (cid:48) . In this speciﬁc case,this orthogonal set forms itself a Lie algebra. Note thatthe the prime in h (cid:48) and h (cid:48)⊥ is not related to the prime in A (cid:48) .2 III. REPRESENTATIONS OF GAUSSIANSTATES

The literature on quantum physics is for good reasonfull of Gaussian states for bosonic and fermionic systemsand they appear under various names and they a mul-titude of diﬀerent forms. In this section, we attempt toprovide a comprehensive dictionary that collects the com-monly used notions of characterizing Gaussian states andexplains how to convert between them. Table II providesa summary of these notions, which we review in the fol-lowing sections including compact conversion formulas.We restrict to Gaussian states with z a = (cid:104) J | ˆ ξ a | J (cid:105) = 0,but it is relatively straightforward by incorporating suchdisplacements for bosons if necessary.All conversions are based on standard linear algebraoperations, i.e. , matrix addition and multiplication, eval-uation of eigenvalues and so on. Our formulas will includematrix functions f ( M ) which can be either evaluated aspower series or by applying f on the eigenvalues of M .Note that we do not require M to be symmetric or Her-mitian, as it suﬃces that either the power series of f ( x )converges or that M is diagonalizable for f ( M ) to bewell-deﬁned.In our formulas, we take great care to make any addi-tional structures explicit. For example, instead of writinga formula where we implicitly assume that the respectivebasis to be orthonormal with respect to some referenceinner product, we will write a basis independent (covari-ant) formula, which explicitly includes the relevant met-ric. If we then express the formula with respect to anorthonormal basis, the matrix g representing the metricis just the identity. A. Covariance matrix

Given any basis { ˆ ξ a } of (possibly complexiﬁed) linearobservables, we compute the bosonic covariance matrix G ab and the fermionic covariance matrix Ω ab of a Gaus-sian state vector | J (cid:105) with (cid:104) Γ | ˆ ξ a | Γ (cid:105) = 0 as deﬁned in (19)based on the following formula. Formula 1 (Covariance matrices of pure Gaussianstates) . A Gaussian state vector | J (cid:105) is fully characterizedby its bosonic or fermionic covariance matrix deﬁned as Γ ab = (cid:40) G ab = (cid:104) J | ( ˆ ξ a ˆ ξ b + ˆ ξ b ˆ ξ a ) | J (cid:105) (bosons) Ω ab = (cid:104) J | ( ˆ ξ a ˆ ξ b − ˆ ξ b ˆ ξ a ) | J (cid:105) (fermions) . (89)We clearly have G ba = G ab and Ω ba = − Ω ab . The co-variance matrix is the (anti-)symmetrised autocorrelatorof the quadrature operators and a key characteristic ofGaussian states is that they are unambiguously deﬁnedby their ﬁrst and second moments (the displacement inphase space and covariance matrix) only [1–4]. B. Linear complex structure

An alternative to parametrizing Gaussian states bytheir covariance matrix is to use the so called linear com-plex structure. It is less commonly used in quantum in-formation and condensed matter, but has been exten-sively studied in the context of quantum ﬁeld theory incurved spacetime [12].

Formula 2 (Linear complex structure) . Given a bosonicGaussian state vector | J (cid:105) with symplectic form Ω ab or afermionic Gaussian state vector | Ω (cid:105) with metric G ab , theassociated linear complex J is J ab = − G ac Ω − cb = Ω ac G − cb . (90)We will see that the compatibility of these three struc-tures shown in (90) imbues the manifold of pure Gaussianstates with the structure of a K¨ahler manifold. C. Characteristic functions

The characteristic function χ of a quasiprobability dis-tribution W on the phasespace V deﬁned is deﬁned asthe inverse Fourier transform χ : V ∗ → (cid:67) : v (cid:55)→ χ ( w ) = (cid:90) V d N ξe − i w a ξ a W ( ξ ) . (91)We see that χ is deﬁned on the dual phase space V ∗ .From the perspective of probability theory, one usuallyﬁrst deﬁnes W , from which χ is derived. However, inthe context of quantum theory, it is actually easier toﬁrst give explicit formulas for χ and then deﬁne W asits Fourier transform. This is why we ﬁrst present theresults for χ in the present subsection and then discussthe respective W in the next subsection. Note that forfermions, both χ and W are deﬁned as a polynomialin Grassman variables w a and ξ a , which anti-commuteamong themselves and with each other in (91).In the case of bosonic and fermionic quantum systems,there is a natural set of quasi probability distributions W s and their associated characteristic functions χ s labelledby a real parameter s ∈ [ − , s (cid:54) = 0, they aredeﬁned with respect to a notion of ordering creation andannihilation operators associated to a Gaussian referencestate | J (cid:105) with covariance matrix Γ . For most practicalapplications, only the following cases of s ∈ {− , , } are studied. Wigner.

For s = 0, the quasiprobability distributionis independent of Γ and called Wigner distribution ξ (cid:55)→ W ( ξ ). The Wigner characteristic function χ ( w ) of anoperator ˆ O can be computed from the operator ˆ O as χ ( w ) = Tr( ˆ O e − i w a ˆ ξ a ) . (92) Glauber.

For s = 1, we have the Glauber–Sudarshan P characteristic function, which is the Fourier transform3 Bosons Fermions P u r e s t a t e r e p r e s e n t a t i o n s Covariance matrix (III A) Covariance matrix (III A) G ab = (cid:104) Γ | (ˆ ξ a ˆ ξ b + ˆ ξ b ˆ ξ a ) | J (cid:105) Ω ab = (cid:104) Γ | (ˆ ξ a ˆ ξ b − ˆ ξ b ˆ ξ a ) | J (cid:105) Linear complex structure (III B) Linear complex structure (III B) ( δ ab + i J ab )ˆ ξ b | J (cid:105) = 0 ( δ ab + i J ab )ˆ ξ b | J (cid:105) = 0 Characteristic function (III C) Characteristic function (III C) χ s ( w ) = exp (cid:0) − w a ( G + sG ) ab w b (cid:1) χ s ( w ) = exp (cid:0) − i4 w a (Ω − s Ω ) ab w b (cid:1) Quasiprobability distribution (III D) Quasiprobability distribution (III D) W s ( ξ ) = exp ( − ξ a ( G + sG ) − ab ξ b ) √ det π ( G + sG ) W s ( ξ ) = exp ( − ξ a (Ω − s Ω ) − ab ξ b ) (cid:113) det Ω − s Ω02

Gaussian unitary (III E) Gaussian unitary (III E) | G (cid:105) = e (cid:98) K | (cid:105) = exp( − i2 k ab ˆ ξ a ˆ ξ b ) | (cid:105) | Ω (cid:105) = e (cid:98) K | (cid:105) = exp( k ab ˆ ξ a ˆ ξ b ) | (cid:105) Squeezed vacuum (III F) Squeezed vacuum (III F) | J (cid:105) = (cid:112) (cid:49) − γγ † e γ ij ˆ a † i ˆ a † j | (cid:105) = (cid:112) det( (cid:49) − L ) e − i2 ω ac L cb ˆ ξ a + ˆ ξ b + | (cid:105) | J (cid:105) = e γij ˆ a † i ˆ a † j | (cid:105) √ (cid:49) + γγ † = e gacLcb ˆ ξa + ˆ ξb + | (cid:105) √ det( (cid:49) − L ) Bogoliubov transformation (III G) Bogoliubov transformation (III G) ˆ b i = α ij ˆ a j + β ij ˆ a † j ˆ b i = α ij ˆ a j + β ij ˆ a † j Wave function (III I) ψ ( q ) = (cid:113) det Aπ exp( − q α (cid:0) A + i B ) αβ q β (cid:1) M i x e d s t a t e r e p r e s e n t a t i o n s Covariance matrix (III A) Covariance matrix (III A) G ab = Tr (cid:16) ρ (ˆ ξ a ˆ ξ b + ˆ ξ b ˆ ξ a ) (cid:17) Ω ab = Tr (cid:16) ρ (ˆ ξ a ˆ ξ b − ˆ ξ b ˆ ξ a ) (cid:17) Characteristic function (III C) Characteristic function (III C) χ s ( w ) = exp (cid:0) − w a ( G + sG ) ab w b (cid:1) χ s ( w ) = exp (cid:0) − i4 w a (Ω − s Ω ) ab w b (cid:1) Quasiprobability distribution (III D) Quasiprobability distribution (III D) W s ( ξ ) = exp ( − ξ a ( G + sG ) − ab ξ b ) √ det π ( G + sG ) W s ( ξ ) = exp ( − ξ a (Ω − s Ω ) − ab ξ b ) (cid:113) det Ω − s Ω02

Thermal state (III H) Thermal state (III H) ρ = exp( − q ab ˆ ξ a ˆ ξ b + c ) ρ = exp( − i2 q ab ˆ ξ a ˆ ξ b + c ) Wave function (III I) ρ ( q, ¯ q ) = (cid:113) det A + Cπ exp  − (cid:32) q ¯ q (cid:33) T (cid:32) A + i B C + i DC − i D A − i B (cid:33) (cid:32) q ¯ q (cid:33) TABLE II.

Common representations of Gaussian states.

We list commonly used representations of bosonic and fermionicGaussian states of both pure and mixed form. of the Glauber–Sudarshan P quasiprobability distribu-tion ξ (cid:55)→ P ( ξ ) = W − . The characteristic function is χ ( w ) = Tr( ˆ O e − i w a ˆ ξ a + e − i w a ˆ ξ a − ) . (93) Husimi.

For s = −

1, we have the Husimi Q char- acteristic function, which is the Fourier transform ofthe Glauber–Sudarshan P quasiprobability distribution ξ (cid:55)→ P ( ξ ) = W − . The characteristic function is χ − ( w ) = Tr( ˆ O e − i w a ˆ ξ a − e − i w a ˆ ξ a + ) . (94)4In all these cases, We can compute the expectationvalue of an arbitrary polynomial in linear observables { ˆ ξ a } as (cid:104) ( ˆ ξ a . . . ˆ ξ a n ) s (cid:105) = ∂∂w a . . . ∂∂w a (cid:12)(cid:12)(cid:12)(cid:12) w =0 χ s ( w ) , (95)where ( . . . ) s refers to the respective ordering with re-spect to a Gaussian reference state vector | (cid:105) = | J (cid:105) , i.e. ,symmetric ordering for s = 0, normal-ordering for s = 1and anti-normal ordering for s = −

1. From this condi-tion, the following forms of χ for Gaussian states can bederived. Formula 3 (Characteristic functions of Gaussian states) . The general formula for the characteristic function of apure or mixed Gaussian state with respect to the referencestate vector | (cid:105) = | J (cid:105) is χ s ( w ) = (cid:40) exp (cid:0) − w a ( G + sG ) ab w b (cid:1) (bosons) exp (cid:0) − i4 w a (Ω − s Ω ) ab w b (cid:1) (fermions) . (96)As can be seen from (91), the characteristic func-tion χ is deﬁned on the dual phase space V ∗ , while thequasiprobability distribution W is directly deﬁned on thephase space V . We can, however, use the isomorphism w a ⇔ ξ a given by w a = ω ab ξ b for bosons and w a = g ab ξ b for fermions to map the characteristic function χ s ( w ) intothe phase space function (cid:101) χ s ( ξ ), such that (cid:101) χ s ( ξ ) = (cid:40) χ s ( ω ab ξ b ) (bosons) χ s ( g ab ξ b ) (fermions) . (97)One can show that for pure Gaussian states W ( ξ ) ∝ (cid:101) χ ( ξ )for all ξ . D. Quasiprobability distributions

As foreshadowed in the previous section, bosonicand fermionic quantum states can also be representedas quasiprobability distributions on the classical phasespace, i.e. , real valued functions W : V → (cid:82) satisfying (cid:82) d N ξ W ( ξ ) = 1. In contrast to regular probability dis-tributions, W ( ξ ) is allowed to also take negative valuesand in fact, this negativity can be directly linked to thenon-classicality of the respective quantum state [25, 26].For Gaussian states, all quasiprobability distributions arethemselves Gaussian functions and in particular positive, i.e. , classical in the sense of ref. [25].More generally, operators O on Hilbert space can berelated to a phase space distributions W : V → (cid:67) , whichmay not be normalized. To deﬁne W ( ξ a ), we need towrite the operator O as a power series O = t + ( t ) a ˆ ξ a + ( t ) ab ˆ ξ a ˆ ξ b + . . . (98) in terms of linear observables ˆ ξ a (or as limit of a sequenceof such series). Clearly, the sequence is not unique, be-cause we can use commutation or anti-commutation re-lations to change the ordering of ˆ ξ a , ˆ ξ b and so on in (98),which will create additional terms. For example, we haveˆ q ˆ p = ˆ p ˆ q + i. Given a Gaussian reference state vector | (cid:105) = | J (cid:105) , we can express everything in terms of ˆ ξ a ± ,which are deﬁned with respect to | (cid:105) , and then use com-mutation relations to bring them into some standard or-dering. The most common orderings aresymmetric ordering ( s = 0): ( ˆ ξ a + ˆ ξ b − − ˆ ξ b − ˆ ξ a + ) + . . . , normal ordering ( s = 1): ˆ ξ a + ˆ ξ b − + . . . , anti-normal ordering ( s = − ξ a − ˆ ξ b + + . . . , where the parameter s ∈ [ − ,

1] describes a continuumof intermediate orderings, as introduced in ref. [27]. Letus emphasize that we bring the power series (98) by us-ing the canonical commutation or anti-commutation re-lations and not by just reordering the terms by force,which would change the resulting operator O . We canthen express ξ a ± in terms of ˆ ξ a via (17) to ﬁnd the coef-ﬁcients ( t sn ) a ...a n of a series expansion with ordering s .Plugging in the variables ξ q,p ≡ ( q , . . . , q N , p , . . . , p N ) a,a † ≡ ( a , . . . , a N , a † , . . . , a † N ) rather than operators ˆ ξ then de-ﬁnes the phase space distribution W s ( ξ ) = ∞ (cid:88) n =0 ( t sn ) a ...a n ξ a . . . ξ a n . (99)For Hermitian operators O , the associated W s will bereal-valued on V . One can further show that we haveTr O = (cid:82) dξ N W s ( ξ ). For a density operator ρ , we thushave (cid:82) ξ N W s ( ξ ) = 1. Given an observable O and adensity operator ρ , we can compute the expectation value (cid:104)O(cid:105) ρ = Tr( ρ O ) = (cid:90) dξ N W ρs ( ξ ) W O s ( ξ ) , (100) i.e. , the trace of the product of two operators can becomputed by just integrating over the pointwise productof the respective phase space functions. Note that thisformula does not generalize to computing the trace of aproduct of more than two operators.In practice, W s is most eﬃciently computed fromthe respective characteristic function χ s via the regularFourier transform W s ( ξ ) = 1(2 π ) N (cid:90) dw N χ s ( w ) e i w a ξ a . (101)With this in hand, we can compute the quasi-probabilitydistributions W s for Gaussian states as follows. Formula 4 (Quasiprobability distributions) . For aGaussian state with covariance matrix Γ , we have thequasiprobability distribution W s ( ξ ) =  e − ξa ( G + sG − ab ξb √ det π ( G + G ) (bosons) e i ξa (Ω − s Ω0) − ab ξb (cid:113) det Ω − s Ω02 (fermions) (102)5 with respect to the reference state vector | (cid:105) = | J (cid:105) . E. Gaussian unitaries

We can parametrize Gaussian states also by the uni-tary Gaussian transformation S ( M ) that takes us froma reference state (vacuum | (cid:105) = | J (cid:105) ) to the state un-der consideration, i.e. , | G (cid:105) or | Ω (cid:105) . This unitary is notunique, because we can always compose U with someother Gaussian unitary satisfying u | (cid:105) = e i ϕ | (cid:105) .We have the reference covariance matrix Γ associatedto the vacuum | (cid:105) and a target state with covariance ma-trix Γ. Using the relative covariance matrix∆ ab = − J ac ( J ) cb = Γ ac (Γ − ) cb . (103)With this, we can deﬁne the group element T = √ ∆,satisfying J = T J T − , from which we can deduce theLie algebra generator K = log T = 12 log ∆ . (104)The unitary transformation S satisfying | J (cid:105) = S | (cid:105) is S = e (cid:98) K = (cid:26) exp ( − i2 ω ac K cb ˆ ξ a ˆ ξ b ) (bosons) exp ( g ac K cb ˆ ξ a ˆ ξ b ) (fermions) . (105)If we know the anti-Hermitian quadratic operator (cid:98) K = − i2 h ab ˆ ξ a ˆ ξ b for bosons or (cid:98) K = h ab ˆ ξ a ˆ ξ b for fermions, wecan compute the associated generator K = (cid:26) Ω ac h cb (bosons) G ac h cb (fermions) , (106)from which we ﬁnd the transformed covariance matrix asΓ = T Γ T (cid:124) with T = e K . (107)In summary, we have the following formulas. Formula 5 (Pure Gaussian state transformations) . Given a reference Gaussian state vector | (cid:105) with covari-ance matrix Γ , we can compute for every Gaussian statevector | J (cid:105) the quadratic operator (cid:98) K , such that | J (cid:105) = e (cid:98) K | (cid:105) with K = 12 log ∆ , (108) where ∆ ab = Γ ac (Γ − ) cb . Vice versa, for the same setup(reference state vector | (cid:105) with covariance matrix Γ ), wecompute for every Gaussian unitary e (cid:98) K the covariancematrix Γ = M Γ M (cid:124) with M = e K . (109) Note that all equalities of quantum state vectors are onlyup to a global complex phase.

F. Squeezed vacuum

Given a bosonic or fermionic Gaussian state vector | (cid:105) together with a complete set of annihilation operators ˆ a i satisfying ˆ a i | (cid:105) = 0, a Gaussian state vector | J (cid:105) can bedescribed by a squeezing matrix γ , which is a complex N × N matrix that is symmetric for bosons and antisym-metric for fermions. For bosonic systems, we can therebyreach any covariance matrix G ab , while for fermionic sys-tems we can reach any covariance matrix Ω ab with thesame parity as explained around (38). In the following,we will derive the relations between K , γ , Γ and Γ .We choose our standard bases such that the K¨ahlerstructures associated to reference state | J (cid:105) take the stan-dard forms from (21). We consider | J (cid:105) ∼ = S ( T ) | (cid:105) , where T = ∆ = ΓΓ − . By construction, we have S ( T ) = e (cid:98) K with K = log ∆. Both K and L take the standardforms K q,p ≡ (cid:18) K K K − K (cid:19) a,a † ≡ (cid:18) K + i K K − i K (cid:19) , (110)which both anti-commute with J . Note that the decom-position into K and K still has a U( N ) redundancy, i.e. , we would preserve the standard forms (21) of J , Γ and Ω for bosons or G for fermions, while K and K willmix with each other.A matrix u ∈ U( N ) satisﬁes [ u, J ] = 0 and is u q,p ≡ (cid:18) u u − u u (cid:19) a,a † ≡ (cid:18) u − i u u + i u (cid:19) . (111)Under a change of basis K (cid:55)→ uKu − , we thus havethe change K + i K (cid:55)→ ( u − i u )( K + i K )( u +i u ). Mathematically speaking, we have the complex N -dimensional vector space V − (cid:67) of annihilation operatorsand V + (cid:67) of creation operators. The two spaces are em-bedded in the complexiﬁed phase space V (cid:67) and can becanonically identiﬁed using complex conjugation on V (cid:67) .Our goal is to ﬁnd a compact expression of | J (cid:105) . Weconsider K with { K, J } = 0, which satisﬁes (cid:98) K = (cid:40) − i2 ω ac K cb ( ˆ ξ a + ˆ ξ b + + ˆ ξ a − ˆ ξ b − ) (bosons) ω ac K cb ( ˆ ξ a + ˆ ξ b + + ˆ ξ a − ˆ ξ b − ) (fermions) . (112)We can simplify e (cid:98) K based on the known relationsexp [ r ( e i θ (ˆ a † ) − e − i θ ˆ a )]= exp [ e i θ (tanh r )(ˆ a † ) ] × exp[ − (ln cosh r )(ˆ n + )] × exp [ − ( e − i θ tanh r )ˆ a ] , (bosons) (113)exp [ r ( e i θ ˆ a † ˆ a † + e − i θ ˆ a ˆ a )]= exp [ e i θ tan r ˆ a † ˆ a † ] × exp[ − (ln cos r )(ˆ n + ˆ n − × exp [ e − i θ tan r ˆ a ˆ a ] , (fermions) (114)6which are derived in ref. [28]. Using them and the deﬁ-nition L = tanh K , we ﬁnd the covariant expressions e (cid:98) K = e − i2 ω ac L cb ˆ ξ a + ˆ ξ b + × e − i2 ω ac log( (cid:49) − L ) cb (ˆ ξ a + ˆ ξ b − + i4 Ω ba ) × e − i2 ω ac L cb ˆ ξ a − ˆ ξ b − , (bosons) (115) e (cid:98) K = e g ac L cb ˆ ξ a + ˆ ξ b + × e g ac log( (cid:49) − L ) cb (ˆ ξ a + ˆ ξ b − − G ba ) × e g ac L cb ˆ ξ a − ˆ ξ b − , (fermions) (116)where we emphasize that they only apply to algebra ele-ments K ∈ g with { K, J } = 0. When applied to | (cid:105) , weﬁnd | J (cid:105) = e (cid:98) K | (cid:105) = (cid:40) det ( (cid:49) − L ) e − i2 ω ac L cb ˆ ξ a + ˆ ξ b + | (cid:105) (bosons) det − ( (cid:49) − L ) e g ac L cb ˆ ξ a + ˆ ξ b + | (cid:105) (fermions) , (117)where we used e ± Tr log( (cid:49) − L ) = det ± ( (cid:49) − L ). Therelevant linear map L = tanh K takes the form L q,p ≡ (cid:18) L L L − L (cid:19) a,a † ≡ (cid:18) L + i L L − i L (cid:19) , (118)which is analogous to (110). We ﬁnd L q,p ≡ (cid:16) L + L L L − L L L L − L L L + L (cid:17) a,a † ≡ (cid:16) γ ∗ γ γγ ∗ (cid:17) (119)where we deﬁned in this basis the complex matrix γ := L + i L , (120)The matrix representations L and L are symmetric forbosons and antisymmetric for fermions. This leads todet( (cid:49) − L ) = (cid:40) det ( (cid:49) − γγ † ) (bosons) det ( (cid:49) + γγ † ) (fermions) , (121)where the sign changes due to the anti-symmetry of L i for fermions. Using the fact that the spaces of V ± (cid:67) ofcreation and annihilation operators are Hilbert spaceswith Hermitian inner product, we can use γ as bilinearform γ ij , which satisﬁes12 γ ij ˆ a † i ˆ a † j = (cid:40) − i2 ω ac L cb ˆ ξ a + ˆ ξ b + (bosons) g ac L cb ˆ ξ a + ˆ ξ b + (fermions) . (122)This leads to our ﬁnal formula as follows. Formula 6 (Parametrizing squeezed state vectors) . Given a Gaussian reference vacuum | (cid:105) with covariancematrix Γ and creation operators ˆ a † i , we can parametrizethe squeezed state vector | J (cid:105) by an arbitrary symmetriccomplex matrix γ , such that | J (cid:105) =  (cid:0) det( (cid:49) − γ † γ ) (cid:1) e γ ij ˆ a † i ˆ a † j | (cid:105) (bosons) (cid:0) det( (cid:49) + γ † γ ) (cid:1) − e γ ij ˆ a † i ˆ a † j | (cid:105) (fermions) . (123) We can seamlessly convert between the matrix γ and thecovariance Γ . In particular, we have L = tanh (cid:18)

12 log ∆ (cid:19) q,p ≡ (cid:18) Re γ Im γ Im γ − Re γ (cid:19) a,a † ≡ (cid:18) γγ ∗ (cid:19) , (124) which can be used to compute γ from ∆ ab = Γ ac (Γ − ) cb or vice versa. Note that the splitting from (124) onlyrequires the standard forms of (21) of the state vector | (cid:105) . For fermions in the real (Majorana) basis, we haveΩ ab q,p ≡ (cid:32) Ω αβ Ω α ˙ β Ω ˙ αβ Ω ˙ α ˙ β (cid:33) . (125)In this basis, we ﬁndΩ = Im (cid:2) − (cid:49) N − γ )( (cid:49) N + γ † γ ) − (cid:3) , Ω = Re (cid:2) ( (cid:49) N + 2 γ − γ † γ )( (cid:49) N + γ † γ ) − (cid:3) , Ω = Re (cid:2) ( − (cid:49) N + 2 γ + γ † γ )( (cid:49) N + γ † γ ) − (cid:3) , Ω = Im (cid:2) − (cid:49) N + γ )( (cid:49) N + γ † γ ) − (cid:3) . (126)The inverse operation is given by γ = Ω + Ω + i(Ω − Ω )2 (cid:49) N + Ω − Ω + i(Ω + Ω ) , (127)where the fraction A/B denotes AB − . Equivalently, forbosons we can write G ab q,p ≡ (cid:32) G αβ G α ˙ β G ˙ αβ G ˙ α ˙ β (cid:33) . (128)The elements of this bosonic covariance matrix can againbe directly expressed in terms of γ : G = Re (cid:2) ( (cid:49) N + 2 γ + γ † γ )( (cid:49) N − γ † γ ) − (cid:3) ,G = Im (cid:2) ( (cid:49) N + 2 γ + γ † γ )( (cid:49) N − γ † γ ) − (cid:3) ,G = Im (cid:2) ( − (cid:49) N + 2 γ − γ † γ )( (cid:49) N − γ † γ ) − (cid:3) ,G = Re (cid:2) ( (cid:49) N − γ + γ † γ )( (cid:49) N − γ † γ ) − (cid:3) . (129)Again, this relationship can be inverted, leading to γ = G − G + i( G + G )2 (cid:49) N + G + G + i( G − G ) . (130) G. Bogoliubov transformation

The transformation from one Gaussian state to theother is sometimes encoded in a Bogoliubov transforma-tion. This is an indirect way to describe the transforma-tion from a reference vacuum | (cid:105) annihilated by ˆ a i to thenew Gaussian state vector | J (cid:105) annihilated by ˆ b i withˆ b i = α ij ˆ a j + β ij ˆ a † j , (131)7where we sum over the repeated index j . We will see thatthe information contained in α ij and β ij is equivalent tothe one contained in a group transformation M ∈ G . Infact, a Bogoliubov transformation is nothing else than asymplectic or orthogonal group transformation expressedin a complex basis ˆ ξ a . Formula 7 (Bogoliubov transformations) . Given aGaussian reference state vector | (cid:105) with covariance ma-trix Γ and annihilation operators ˆ a i , we reach any Gaus-sian state vector | J (cid:105) by a Bogoliubov transformation ˆ b i = α ij ˆ a j + β ij ˆ a † j , (132) such that ˆ b i | J (cid:105) = 0 . We compute Γ from α and β via M q,p ≡ (cid:16) Re α + Re β Im β − Im α Im α + Im β Re α − Re β (cid:17) a,a † ≡ (cid:16) α ββ ∗ α ∗ (cid:17) (133) and then evaluating Γ = M Γ M (cid:124) . Vice versa, for agiven Γ , there are many choices of α and β . They canbe computed from (133) by setting M = T u , where T =∆ = ΓΓ − and choosing an arbitrary u ∈ U( N ) , such as u = (cid:49) . H. Thermal states

Every mixed Gaussian state can be written as a ther-mal state ρ = e − β ˆ H /Z , where ˆ H is a quadratic Hamil-tonian with a unique ground state and Z = Tr( e − β ˆ H ).Without loss of generality, we can assume β = 1 and Z = 1 by redeﬁning ˆ H . With this choice, ˆ H is alsoknown as the modular Hamiltonian . A general quadraticHamiltonian can be written asˆ H = (cid:40) c + q ab ˆ ξ a ˆ ξ b (bosons) c + i q ab ˆ ξ a ˆ ξ b (fermions) , (134)where q ab is symmetric for bosons and anti-symmetricfor fermions. Note that there is no factor of as in (35),which will simplify later conventions. Because ˆ H has aunique ground state, it follows that there exists a ba-sis of creation and annihilation operators with numberoperators ˆ n i , such thatˆ H = (cid:40) c + (cid:80) i ω i (ˆ n i ± ) (bosons) c + (cid:80) i ω i (ˆ n i ± ) (fermions) , (135)where ω i > n i . In this speciﬁc basis, the densityoperator ρ decomposes into a tensor product over singlemodes, from which we can derive the respective standardforms of J ab , G ab , Ω ab and q ab listed in table III. Formula 8 (Thermal states) . For a mixed Gaussianstate ρ with covariance matrix Γ , we can always write Bosons Fermions ρ N A (cid:79) i =1 (cid:18) e − n i ln coth r i cosh r i sinh r i (cid:19) N A (cid:79) i =1 (cid:16) cos r i sin r i e − n i ln tan r i (cid:17) J q,p ≡ N A (cid:77) i =1 (cid:18) r i − cosh 2 r i (cid:19) N A (cid:77) i =1 (cid:18) r i − cos 2 r i (cid:19) J a,a † ≡ N A (cid:77) i =1 (cid:18) − i cosh 2 r i

00 i cosh 2 r i (cid:19) N A (cid:77) i =1 (cid:18) − i cos 2 r i

00 i cos 2 r i (cid:19) G q,p ≡ N A (cid:77) i =1 (cid:18) cosh 2 r i

00 cosh 2 r i (cid:19) N A (cid:77) i =1 (cid:18) (cid:19) G a,a † ≡ N A (cid:77) i =1 (cid:18) r i cosh 2 r i (cid:19) N A (cid:77) i =1 (cid:18) (cid:19) Ω q,p ≡ N A (cid:77) i =1 (cid:18) − (cid:19) N A (cid:77) i =1 (cid:18) r i − cos 2 r i (cid:19) Ω a,a † ≡ N A (cid:77) i =1 (cid:18) − ii 0 (cid:19) N A (cid:77) i =1 (cid:18) − i cos 2 r i i cos 2 r i (cid:19) q q,p ≡ N A (cid:77) i =1 (cid:18) ln coth r i

00 ln coth r i (cid:19) N A (cid:77) i =1 (cid:18) r i − ln tan r i (cid:19) q a,a † ≡ N A (cid:77) i =1 (cid:18) r i ln coth r i (cid:19) N A (cid:77) i =1 (cid:18) r i − i ln tan r i (cid:19) c N A (cid:88) i =1 log (cosh r i sinh r i ) − N A (cid:88) i =1 log (cos r i sin r i ) TABLE III. We list the standard forms of J , G , Ω, q and c for a mixed Gaussian state ρ = exp( − c − q ab ˆ ξ a ˆ ξ b ). ρ = e − ˆ H with ˆ H from (134) and q ab = (cid:40) − i ω ac arccoth (i J ) c b (bosons) − i g ac arctanh (i J ) c b (fermions) , (136) c =  log det (cid:16) (cid:49) + J (cid:17) (bosons) − log det (cid:16) (cid:49) + J (cid:17) (fermions) , (137) where q ab and c diverge for J = − (cid:49) in such a way thatthe limit of ρ describes the projector ρ = | J (cid:105) (cid:104) J | . Theserelations can be easily inverted to compute J and Γ interms of q ab . I. Wave functions

Most physicists encounter Gaussian states for the ﬁrsttime when studying the quantum harmonic oscillator.The ground state is a Gaussian state with Gaussian wavefunction q (cid:55)→ ψ ( q ), where q ∈ Q is a vector in position8space Q . In this section, we show how every pure bosonicGaussian state can be represented as Gaussian wave func-tion, either as pure wave function q (cid:55)→ ψ ( q ) or as mixedwave function ( q, ¯ q ) (cid:55)→ ρ ( q, ¯ q ), and how to convert be-tween wave functions and covariance matrices.In order to write down a wave function, one needs tomake a choice by splitting the classical phase space V intothe direct sum V = Q ⊕ P with dim Q = dim P = N ,such that the symplectic form vanishes on Q, P ⊂ V .More precisely, we ﬁnd the block formΩ ab = (cid:18) α ˙ β Ω ˙ αβ (cid:19) and ω ab = (cid:18) ω α ˙ β ω ˙ αβ (cid:19) , (138)where we have q ∈ Q and p ∈ P . The phase spacedecomposition V = Q ⊕ P induces a dual decomposition V ∗ = Q ∗ ⊕ P ∗ . The oﬀ-diagonal blocks in Ω and ω induceisomorphism Q (cid:39) P ∗ and Q ∗ (cid:39) P .

1. Pure states

We write the most general pure Gaussian state as ψ ( q ) = (cid:0) det Aπ (cid:1) / exp (cid:18) − q α ( A αβ + iB αβ ) q β (cid:19) . (139)Note that the determinant of the bilinear form A impliesthat the wave function is not a scalar function, but rathera scalar density of weight 1 /

2. This ensures that thesquare modulus of the wave function can be integratedover Q to give probabilities. We decompose the bosoniccovariance matrix G ab and the symplectic form Ω ab basedon our decomposition of the phase space V = Q ⊕ P , suchthat G ab = (cid:104) ψ | ( ˆ ξ a ˆ ξ b + ˆ ξ b ˆ ξ a ) | ψ (cid:105) = (cid:32) G αβ G α ˙ β G ˙ αβ G ˙ α ˙ β (cid:33) . (140)Note that the only requirement for the respective decom-position V = Q ⊕ P is that the restrictions Ω αβ and Ω ˙ α ˙ β vanish. Formula 9 (Pure state wave function) . Given a bosonicGaussian state vector | G (cid:105) and a phase space decompo-sition V = Q ⊕ P , we can convert between the covari-ance matrix decomposed in the blocks (140) and the wavefunction representation from (139) containing the bilin-ear forms A αβ and B αβ using G αβ = ( A − ) αβ ,G ˙ α ˙ β = − Ω ˙ αγ ( A + BA − B ) γδ Ω δ ˙ β ,G α ˙ β = − ( A − ) αγ B γδ Ω δ ˙ β ,G ˙ αβ = Ω ˙ αγ B γδ ( A − ) γβ . (141) This isomorphism means that we can identify the position vector q α with dual momentum q α ω α ˙ β and similar. Vice versa, we can solve these equations for A and B interms of G to ﬁnd A αβ = G − αβ and B αβ = G − αγ G γ ˙ δ ω ˙ δβ . (142)Note that G − αβ is the inverse of the N × N block G αβ satisfying G αβ G − βγ = δ αγ over Q ⊂ V which should notbe confused with the full 2 N × N inverse g ab of G ab with G ab g bc = δ ac over V .

2. Mixed states

Similarly, we can also write out the most general mixedGaussian state in the position representation as ρ ( q, ¯ q ) = Z exp (cid:18) − (cid:16) q ¯ q (cid:17) T (cid:16) A + i B C + i DC − i D A − i B (cid:17) (cid:16) q ¯ q (cid:17)(cid:19) , (143)where q = ( q , . . . , q N ), ¯ q = (¯ q , . . . , ¯ q N ) and the normal-ization is given by Z = (cid:0) det A + Cπ (cid:1) / . (144)Again, the wave function representation of the mixedstate ρ is density of weight 1 /

2. As before, we wouldlike to relate the bilinear forms A , B , C and D in termsof the covariance matrix G ab = Tr[ ρ ( ˆ ξ a ˆ ξ b + ˆ ξ b ˆ ξ a )] = (cid:32) G αβ G α ˙ β G ˙ αβ G ˙ α ˙ β (cid:33) . (145)We ﬁnd the following relations. Formula 10 (Mixed state wave function) . The diﬀerentblocks of the covariance matrix G ab are related to thematrices A, B, C, D via G αβ = (cid:0) ( A + C ) − (cid:1) αβ ,G ˙ α ˙ β = − Ω ˙ αγ (cid:0) A − C + ( B + D )( A + C ) − ( B − D ) (cid:1) γδ Ω δ ˙ β ,G α ˙ β = − (cid:0) ( A + C ) − (cid:1) αγ ( B − D ) γδ Ω δ ˙ β ,G ˙ αβ = Ω ˙ αγ ( B + D ) γδ (cid:0) ( A + C ) − (cid:1) δβ , (146) which can be inverted to give A αβ = 12 (cid:16) G − αβ − ω α ˙ γ (cid:16) G ˙ γ ˙ δ − G ˙ γ(cid:15) G − (cid:15)ζ G ζ ˙ δ (cid:17) ω ˙ δβ (cid:17) ,B αβ = − (cid:16) G − αγ G γ ˙ δ ω ˙ δβ − ω α ˙ γ G ˙ γδ G − δβ (cid:17) ,C αβ = 12 (cid:16) G − αβ + ω α ˙ γ (cid:16) G ˙ γ ˙ δ − G ˙ γ(cid:15) G − (cid:15)ζ G ζ ˙ δ (cid:17) ω ˙ δβ (cid:17) ,D αβ = − (cid:16) G − αγ G γ ˙ δ ω ˙ δβ + ω α ˙ γ G ˙ γδ G − δβ (cid:17) . (147) In our standard basis, we will have Ω α ˙ β q,p ≡ (cid:49) , Ω ˙ αβ q,p ≡− (cid:49) , ω α ˙ β q,p ≡ − (cid:49) and Ω α ˙ β q,p ≡ (cid:49) , which simpliﬁes aboveexpressions further i.e. , C = D = 0. IV. OPTIMIZATION ALGORITHM

Having reviewed the parametrization of Gaussianstates using complex structures and having related thisformalism to the other most commonly used parametriza-tions in the literature, we now return to our initial goalof eﬃcient local optimization over the class of Gaussianstates. Finding the minimal value (or maximum) of afunction f on some large manifold M is in general ahard problem and the primary goal in the ﬁeld of math-ematical optimization. One distinguishes between globaland local optimization, i.e. , if one is able to ﬁnd theglobal minimum or if one may get stuck in a local one.In this section, we present a schematic overview of ourapproach to eﬃcient local optimization over the class ofpure Gaussian states, based on the geometric consider-ations of section II D. We also allude to the ﬂexibilityof our optimization algorithm in ﬁnding global minimaand avoiding the pitfalls of poor convergence. We usea rudimentary gradient descent implementation [29], butexploit the natural geometry of Gaussian states and ex-ploit the Lie group structure of G . A. Gradient descent on matrix manifolds

Given a function f : M → (cid:82) on some manifold M , wecan ﬁnd its minimum from some starting point using gra-dient descent. At any point x ∈ M in the manifold, therange of possible directions of motion can be expandedin a basis of the vectors in the tangent space to M at x , denoted T x M . Gradient descent is one of the mostbasic methods of ﬁnding a minimum by moving itera-tively in directions which locally decrease the function.This means picking out suitable vectors in T x M directedalong those directions which minimise the function value.Speciﬁcally, these are the components of the gradient de-scent vector ﬁeld on the manifold, which is given by F µ = − G µν ∂f∂x ν , (148) i.e. , it associates with each point x ∈ M the directionalderivative of f . The inverse metric G µν is included inthis deﬁnition to remove the sensitivity of the gradientto the choice of local basis x µ .The analytical solution to the gradient descent prob-lem is the integral curve associated with the vectorﬁeld (148). In a numerical realization of gradient de-scent, we approximate this continuous curve by suﬃ-ciently small discrete incremental steps. This notion ofmoving a certain distance along one of the tangent vec-tors in T x M while remaining on M , which is trivial when M is ﬂat, is realized for the general non-ﬂat case by a so-called retraction map. This is a map R : T M → M ,with restrictions to the domains T x M given by the maps R x : T x M → M , (149)which is required to satisfy R x (0) = x and dR x (0) = id T x M , (150)where id T x M denotes the identity mapping on the tan-gent space. It is important to note that the retractionmap is not unique and that the most convenient choice ofa retraction map will be that which minimizes the com-putational eﬀort while remaining a suﬃciently accurateapproximation to the continuous integral curve. If we optimize over a Lie group G , a natural choicefor the retraction map is the exponential map, which,for a matrix Lie group, is simply given by the matrixexponential, e K = ∞ (cid:88) n =0 K n n ! , (152)since for Lie groups tangent vectors and Lie algebra el-ements K are equivalent. When optimizing over a Liegroup G , we divide the integral curve into continuoussegments, curves γ ( t ) = M e tK , t ∈ [0 ,

1] connecting sub-sequent points M and M (cid:48) = M e K in the group manifold.Here, K denotes the tangent vector to the group man-ifold at M corresponding with motion to M (cid:48) and thismotion is realized by the exponential (retraction) map.In practice, computing the full power series in (152) isexpensive and we would prefer a more computationallyviable approximation to the exponential. In principle,we can consider small Lie algebra elements K under thenorm || K || = Tr( KK (cid:124) ) and then perform a reasonabletruncation of the power series. The issue here is that apower series approximation of the exponential will not liein the desired Lie group i.e. , it cannot serve its purposeas a retraction map. There is, however, another approx-imation to the matrix exponential, which does fulﬁll thiscriterion: For algebra elements K , we have e (cid:15)K ∼ ( (cid:49) + (cid:15) K )( (cid:49) − (cid:15) K ) as (cid:15) → (cid:15) K is 1, then the expression in (153) cannot beinverted. We avoid this by choosing (cid:15) suﬃciently small. In the ﬂat case M = (cid:82) n , the natural choice of the retractionmap is R x ( u ) = x + u , (151)since in this case the manifold and its tangent space are globallyisomorphic. This is the familiar notion of moving forward bysome step u from a point x . K K K (cid:49) M M M e tK e tK e tK GM R ( F ) R ( F ) F h (cid:48) h (cid:48)⊥ J J = M J M − J J Sub tangent space to G , gen-erated by orthonormal ba-sis of generators Ξ µ of h (cid:48)⊥ Ξ Ξ Tangent space to M ,with local coordinates x µ x x FIG. 2.

Gradient descent on the Gaussian state manifold.

Visualization of the gradient descent geometry for a two-dimensionalstate manifold. The state manifold M is parametrized in terms of complex structures J n which in turn are parametrized bythe transformations M n in the Lie group G as given in (44), with respect to some reference state J . The blue surface indicatesa tangent space to M at state J . The vector ﬁeld ˜ X in this tangent space is also included, as well as the retraction mapwhich is shown as a projection of the vector ﬁeld onto M to visualise the notion of moving in the direction of a tangent vectorbut in the manifold. The associated group G is also shown and crucially it should be noted that the tangent spaces at eachpoint in the group are aligned with the manifold tangent spaces (to highlight the isomorphism between the two) but do notreproduce the smooth manifold M , since the equivalent of the curve traced out on M by successive retraction maps simplyconnects matrix elements which lie along the (light blue) ﬁbers which represent the stabilizers h (cid:48) as indicated illustratively atthe identity. In G , the gradient vector ﬁeld at points M n is denoted by K n as deﬁned in (156). The lines connecting points M n and M n +1 in the group are deﬁned by e sK n with s ∈ [0 ,

1] as written in (159).

Evaluating the RHS of (153) is much more eﬃcient thancomputing the exponential of the LHS. In fact, comput-ing the product of the matrix ( (cid:49) + (cid:15) K ) with the inverse( (cid:49) − (cid:15) K ) can be almost as eﬃciently implemented ascomputing the product of two matrices. B. Optimization on the Gaussian state manifold

As discussed in Section II, the Gaussian state manifoldis equipped with a Riemannian metric g µν with inverse G µν and therefore the vector ﬁeld (148) can be naturallydeﬁned for both the bosonic and fermionic state man-ifolds (1) and (2). In general, the inverse metric G µν needs to be re-evaluated at every point of the manifold,but as suggested previously, by moving into one of thestandard basis choices in (21), the matrix representationof the inverse metric is constant. This is the ﬁrst of theproperties of Gaussian states which allows for a particu-larly computationally eﬃcient implementation of gradi-ent descent optimization over this class of states. Sec-tion II E outlines in some detail the parametrization ofthe Gaussian state manifold in terms of transformations M of some reference state complex structure J . Basedon this parametrization, when optimizing over the man-ifold M of all pure Gaussian states we may equally saythat we are optimizing over the matrix groups (33) quo-tiened by the redundancies associated U( N ), which form1manifolds of dimensionsdim M b = N (2 N + 1) − N = N ( N + 1) , (154)dim M f = N (2 N − − N = N ( N − . (155)This means that in practice, the vector ﬁeld F µ is com-puted not with respect to the local basis of a tangentspace to M at a state J M , but rather with respect to anorthonormal basis Ξ µ of the Lie algebra g or, more pre-cisely, of the subspace h (cid:48)⊥ ⊂ g introduced in (71) whichgenerates non-zero variations in the complex structure.This idea is what leads to the expression for the varia-tion of a state in terms of a Lie algebra element in (68).We can relate the gradient vector F µ to the associatedLie algebra element K as K = F µ Ξ µ (156)using local basis Ξ µ of h (cid:48)⊥ . A second key point arisesfrom the left-invariance of the Riemannian metric on theGaussian state manifold: Since this leads to the preser-vation of the orthonormality of any choice of Ξ µ undertransformations in the group, we do not need to computea new orthonormal basis at diﬀerent points in the man-ifold. Instead, we can choose Ξ µ to be the generatorsof the Lie group, which form the natural orthonormalbasis of the tangent space to the identity. This leadsto signiﬁcant computational speedups, particularly whenoptimizing over high-dimensional manifolds. The setupfor gradient descent on the Gaussian state manifold isvisualized in ﬁg. 2. C. Performing gradient descent

Since we work in a parametrization of the state mani-fold solely in terms of the transformations of a referencestate, a computational implementation of the algorithmshould be able to evaluate the target function f for anystate vector | J M (cid:105) with only J and M as arguments.Here, we write this as ( M, J ) (cid:55)→ f ( M, J ). To deﬁne lo-cal derivatives, we introduce a local coordinate system x µ around a point M ∈ G , such that f ( x ) = f ( M e x µ Ξ µ , J )leading to ∂f∂x µ = ∂∂t (cid:12)(cid:12)(cid:12)(cid:12) t =0 f (cid:16) M e x µ Ξ µ , J (cid:17) , (157)which allows us to deﬁne the vector ﬁeld F µ accordingto (148) with respect to the basis Ξ µ of h (cid:48)⊥ . We re-emphasize here that (157) lets us naturally express thegradient in terms of the variation of the group elementonly, i.e. , at no point are we required to move from thegroup G to the state manifold M . This approach is shownfor various examples in Section V.We now provide a step-by step explanation of the re-alization of an iterative gradient descent minimizationalgorithm based on the considerations above. The steps Choose state vector | J (cid:105) with complexstructure J .Compute g µν to ﬁnd ONB Ξ µ of h (cid:48)⊥ .Choose initial M with J = M J M − .Calculate F n = f ( M n ; J ) and K n = F µ Ξ µ Deﬁne M n +1 ( s ) := M n exp ( − sK n / (cid:107) K n (cid:107) )Calculate F n +1 ( s ) = f ( M n +1 ( s ); J ) F n +1 ( s ) < F n Decreasestep s Re-deﬁne M n +1 ( s ) (cid:55)→ M n Stop?

Return ﬁnal result F n set n = 1deﬁne 0 ≤ s ≤ Graphical representation of the algorithm.

We showa step-by-step explanation of the optimization algorithm de-scribed in the main text. Gray shading indicates a decisionbox and the color-coded sections correspond with those distin-guished in the main text. The ”Stop?” decision box indicatesthe implementation of a stop criterion. are summarized graphically in ﬁg. 3, which complementsﬁg. 2.

1. Initialization.

Our algorithm is initialized ona Gaussian state vector | J (cid:105) with covariance matrix Γ and complex structure J , such that the action of thesubgroup G (cid:48) ⊂ G generates the state manifold under con-sideration. We then construct an orthonormal basis Ξ µ for h (cid:48)⊥ , which are both deﬁned with respect to Γ . Forthis, we compute the metric G µν explicitly. We evaluate g µν in an arbitrary basis Ξ µ , so we can orthogonalize it,such that both g µν and G µν are equal to the identity.The metric g µν is eﬃciently computed as g µν = 14 (Ξ µ Ξ ν + Ξ µ Γ Ξ (cid:124) ν Γ ) , (158)where we use (73) with respect to the reference state vec-tor | J (cid:105) = | (cid:105) . By construction, we identify the tangentspaces at all group elements M n with the ones at M = (cid:49) .While | J (cid:105) is usually chosen in some standard form, wecan still initialize the algorithm on some M based on theproblem at hand.

2. Gradient computation.

We now perform suc-2cessive steps in the group as M n +1 = M n exp (cid:18) − s K n || K n || (cid:19) , ≤ s ≤ , (159)where K n = F µ Ξ µ with F µ calculated at the point M n and s chosen such that f ( M n +1 ; J ) < f ( M n ; J ).

3. Sub-routine to determine step-size.

To choosean appropriate step-size s , we use a sub-routine at eachiteration which should be chosen so as to balance the ef-ﬁciency gained by needing fewer steps to reach the mini-mum and the extra computational eﬀort of executing thesubroutine. In the examples discussed in later section,we found that the very rudimentary approach of itera-tively halving the step size was suﬃcient to ensure goodconvergence. However, a simple line search methods likea quasi-Newton routine can also be used.

4. Stop condition.

This iterative motion is repeateduntil some pre-determined stopping criterion ( e.g. , a tol-erance on the gradient norm or the diﬀerence betweensubsequent function values) is reached.

D. Practical considerations

While the focus of this work is not on elaborate numer-ical methods for implementing the algorithm describedabove, we mention here for the sake of completeness someadditional considerations regarding the practical imple-mentation the algorithm. These have been added in ourimplementation of the algorithm to varying degrees toenhance its eﬃciency.

Constrained optimization.

Our approach lends it-self intuitively to constrained optimization, since we canchoose to restrict our optimization to a smaller range ofstates, being some subspace of G , by truncating the Liealgebra basis. We show an example of this is in the sec-tion on Complexity of Puriﬁcation, where those algebraelements which do not generate non-zero variations of thecomplex structure can be explicitly cut out of the basis. Extension to global optimization.

There is ulti-mately no fail-safe way to locate global minima usinggradient descent. However, we can increase the probabil-ity of convergence to the global minimum by performinggradient descent from a number of suﬃciently far sepa-rated starting points in the manifold and choosing thelowest of the local minima from a large enough samplesize.

Identifying suitable starting points.

In choosingdiﬀerent starting points, it may be possible, given someanalytical intuition about the physical system at hand, toidentify starting points in the manifold from which therisk of landing in a local minimum is particularly low.Additionally, there may be some starting points in themanifold from which gradient descent will converge thefastest ( i.e. , from which it will take a much lower numberof iterations to reach the minimum).

Parallel optimization.

Where the most suitablestarting points cannot be found analytically, we mustresort to numerical methods: If, rather than minimiz-ing successively from diﬀerent starting points, we chooseto perform the optimizations simultaneously, we can dis-criminate between trajectories which promise to convergemore or less quickly to the minimum. In our algorithm,we implement this feature, and after each set of 5 itera-tions, only the 10% of trajectories with the lowest func-tion value and the highest gradient, respectively, are pur-sued further. While this does generally speaking greatlyreduce the total number of iterations required, there isof course a trade-oﬀ between this improvement and thecomputational eﬀort of an initially large number of tra-jectories.

E. The

GaussianOptimization.m package

To complement the theory outlined in this paper, wesupply the public

GaussianOptimization.m

Mathemat-ica package with a simple implementation of the opti-mization algorithm discussed in the previous sections.The package revolves around the function

GOOptimize ,which performs the gradient descent optimization fromsome initial complex structure and transformation. Theinput arguments to this function are divided into threecategories:

Problem-speciﬁc.

These are the arguments relatedto the speciﬁc optimization problem at hand, the scalarfunction and its derivative with respect to some Lie al-gebra element, expressed in terms of the initial complexstructure and an arbitrary transformation, in the spiritof Table IV.

System-speciﬁc.

These relate to the geometry ofthe optimization problem. Fundamentally, this in-cludes the symplectic or orthogonal basis (which can begenerated using the built-in functions

GOSpBasis and

GOOBasis ), but also the corresponding metric (generatedby

GOMetricSp and

GOMetricO ). It also includes the ini-tial complex structure and the (list of) initial transfor-mations.

Procedure-speciﬁc.

These are the parameters re-lated to the numerical implementation of the algorithm,including stopping criteria based on step limits and tol-erances on the function value and gradient.The

GaussianOptimization.m package is designed tobe user-friendly and all functions come with compre-hensive documentation. It is accompanied by an exam-ple notebook which includes a systematically organizedoverview over the functions included in the package, aswell as implementations of the applications discussed inthe next section. The functions and function gradientsfor these applications are also implemented as part of thepackage.3

V. APPLICATIONS

In this section, we show how our optimization algo-rithm may be used in several relevant physical contexts:approximating the ground state of Hamiltonians andcomputing the entanglement and complexity of puriﬁca-tion for fermionic and bosonic systems. We indicate howto parametrize the function to be extremized in terms ofthe complex structure and how to compute the associatedlocal derivatives (157). As discussed in the previous sec-tion, this is essential to unlocking the full computationaleﬃciency of the algorithm. We also provide suggestionsregarding convenient starting points and parametriza-tions. In the examples of this section, the gradient ofthe function f could be obtained analytically in terms ofthe complex structure J using the chain rule and proper-ties of matrix calculus. However, this may not be possiblein general, e.g. , if a function f is the result of some nu-merical algorithm, it may not be possible to compute itsderivative analytically. In this case, automatic diﬀeren-tiation (AD) procedures [30], which provide a numericalalgorithm to compute the gradient without the need of ananalytical derivative and also without the drawbacks ofa purely numerical derivative, present a computationallyfeasible alternative. A. Approximate ground states

The energy function E = (cid:104) ˆ H (cid:105) is one of the most promi-nent functions on families of pure quantum states thatshould be minimized. This is particularly relevant in thecontext of ﬁnding variational ground states, i.e. , ﬁndingstates within a given variational family of ansatz groundstates that approximate the true ground state most ac-curately with respect to some merit function, which typ-ically is the energy expectation value E = (cid:104) ˆ H (cid:105) .Pure Gaussian states and certain submanifolds areknown to be very suitable variational families to approx-imate ground states of bosonic and fermionic Hamiltoni-ans with local interactions. The approximation typicallyimproves with the dimension of the system, i.e. , Gaus-sian states often only capture qualitative features in onespatial dimension, but improve when moving to two andand three dimensions, as mean ﬁeld descriptions becomemore accurate. Gaussian states are also heavily used astrial states in mathematical physics to ﬁnd upper boundsto the energy of quantum gases, e.g. , when studying thedilute limit of Bose gases [5]. There exists a range of dif-ferent tools to ﬁnd the best Gaussian state , i.e. , the Gaus-sian state with the lowest energy expectation value E . Aprominent example is the Hartree-Fock method, whichis typically applied to fermions, but can also be used toapproximate bosonic ground states. Another establishedmethod is imaginary time evolution, where the geometricﬂow of e − τ ˆ H is approximated on the given manifold.From a purely numerical perspective, many optimiza-tion methods are suitable to ﬁnd the minimum of a function on conveniently parametrized family. How-ever, in the context of Gaussian states many stan-dard parametrization (using squeezing parameters orquadratic Hamiltonians acting on a reference state) tendto converge unreliably or get stuck in local minima. Tak-ing the natural Riemannian geometry of Gaussian states(induced by the Fubini-Study metric) into account cansigniﬁcantly improve the convergence of such numeri-cal methods. In fact, one can show that gradient de-scent with respect to this natural geometry coincideswith projected imaginary time evolution [15], which isknown to have favorable convergence properties. Both,the group-theoretic parametrization and the resultingstraight-forward gradient descent algorithm are there-fore perfectly suitable to ﬁnd approximate ground stateswithin the Gaussian state families.Finding the minimum of energy function E = (cid:104) ˆ H (cid:105) re-quires us to evaluate E and its derivative dE eﬃciently.For this, we assume that ˆ H can be written as ﬁnite seriesˆ H = h + ( h ) a ˆ ξ a + · · · + ( t n ) a ...a n ˆ ξ a . . . ˆ ξ a n , (160)whose expectation can be evaluated using Wick’s theo-rem. For a Gaussian state | J (cid:105) = | J, (cid:105) with n -point cor-relation function C a ...a n n = (cid:104) ˆ ξ a . . . ˆ ξ a n (cid:105) , Wick’s theoremstates the following.(a) Odd correlation functions vanish, i.e. , C n +1 = 0.(b) Even correlation functions are given by the sumover all two-contractions C a ··· a n n = (cid:88) σ | σ | n ! C a σ (1) a σ (2) . . . C a σ (2 n − a σ (2 n ) , (161)where C ab = ( G ab + i Ω ab ) has been introducedin (19) and the permutations σ satisfy σ (2 i − <σ (2 i ) and | σ | = 1 for bosons and | σ | = sgn( σ ) forfermions.We can always use canonical commutation or anti-commutation relations to ensure that ( t i ) a ...a i is to-tally symmetric (bosons) or anti-symmetric (fermions),in which case Wick’s theorem only leads to all contrac-tions with Γ ab , i.e. , either way, we ﬁnd E = E (Γ)as polynomial in the entries of Γ. A Lie algebra ele-ment K perturbs Γ at linear order (tangent vector) as δ Γ = K Γ + Γ K (cid:124) , such that dE = ∂E∂ Γ ab δ Γ ab = ∂E∂ Γ ab ( K Γ + Γ K (cid:124) ) ab . (162)This allows us a straight-forward implementation of gra-dient descent on the manifold of all pure Gaussian states(or appropriate submanifolds) based on the algorithmdiscussed in section IV.As an interesting observation, let us mention that thedescribed approach can also be used to approximate realtime evolution on the manifold of pure Gaussian states.Due to the fact that the manifold of pure Gaussian states4 (A) Approximate ground states (B)

Entanglement of puriﬁcation (C)

Complexity of puriﬁcationBosonic f E = (cid:104) ˆ H (cid:105) S AA (cid:48) = Tr D log D C = (cid:113) Tr log (∆)Bosonic df dE = dEd Γ ab ( K Γ + Γ K (cid:124) ) ab dS AA (cid:48) = Tr dD log D dC = 2 Tr log (∆)∆ − δ ∆Fermionic f E = (cid:104) ˆ H (cid:105) S AA (cid:48) = − Tr D log D C = (cid:113) i8 Tr log (∆)Fermionic df dE = dEd Γ ab ( K Γ + Γ K (cid:124) ) ab dS AA (cid:48) = − Tr dD log D dC = − − δ ∆TABLE IV. Function and gradient parametrization.

We list the quantities discussed in this section as scalar functions of thecomplex structure J M = MJ M − at M in the state manifold. We also list the associated gradient functions, parametrized bythe inﬁnitesimal changes in the complex structure δJ M ( K ), as given in (68). The expressions for the CoP are deﬁned in termsof ∆ = − J M J T , introduced as the relative covariance matrix in ref. [20], and δ ∆ = δJ M ( K ) J − . is a K¨ahler manifold the commonly used variational prin-ciples (Lagrangian, McLachlan, Dirac-Frankel) agree [15]and can be implemented as Hamiltonian equations of mo-tion. Our gradient descent algorithm implements the vec-tor ﬁeld F µ = − G µν ∂E∂x ν , (163)which must be replaced by the Hamiltonian time evolu-tion X µ = − Ω µν ∂E∂x ν , (164) i.e. , we only need to adjust our algorithm in step

2. Gra-dient computation , where we replace K n = F µ Ξ µ by K n = X µ Ξ µ . Here, Ω µν is the inverse of ω µν computedfrom (74). Just as in the case of G µν our group-theoreticparametrization (left invariance) of the Gaussian statemanifold ensures that we only need to evaluate Ω µν onceand can use the same matrix for subsequent steps in thealgorithm. Note, however, that there are important dif-ferences between imaginary time evolution (gradient de-scent) and real time evolution. For imaginary time evolu-tion, the step size is only used to ensure that the energydecreases, while for real time evolution we need to keeptrack of it to know the current time parameter. Moreover,for imaginary time evolution, we just need to make surethat the energy function decreases with each step, whileother errors due to the ﬁnite step size are not a problem.For real time evolution, we will always make small errorsdue to the ﬁnite step size and can only try to decreaseit by enforcing relevant conservation laws. In particular,the energy function should stay exactly constant, so wecan try to use the step size to keep the accumulated errorunder control.We can also use (162) in combination with (163)and (164) to derive the real and imaginary evolutionequations of the covariance matrix Γ, namely [15, 31] ddt Γ = − G ∂E∂ Γ G + Ω ∂E∂ Γ Ω) , (real) ddτ Γ = − G ∂E∂ Γ G + Ω ∂E∂ Γ Ω) . (imaginary) (165)For bosonic systems, it is natural to also allow for a non-zero displacement vector z a = (cid:104) ˆ ξ a (cid:105) , which can be seam-lessly integrated in the presented formalism, as discussedin refs. [15, 31]. In summary, our group-theoretic parametrization ofGaussian states and the resulting optimization algorithmare suitable to ﬁnd approximate ground states and toperform projected real time evolution on the manifold ofpure Gaussian state. In practical applications, we cantypically reduce the dimension of the manifold by imple-menting certain symmetries, e.g. , translational symme-try, from scratch by reducing the number of Lie algebragenerators Ξ µ accordingly and choosing an initial staterespecting the chosen symmetry. B. Gaussian entanglement of puriﬁcation (EoP)

The entanglement of puriﬁcation (EoP), ﬁrst intro-duced in ref. [32], quantiﬁes the degree of entanglementbetween subsystems in a composite quantum system. Assuch, it serves as a valuable correlation measure and hasrecently become of interest in quantum many body sys-tems [33]. Suppose we are given a mixed state in someHilbert space H = H A ⊗ H B which can be described bya density operator ρ AB . We now deﬁne a new Hilbertspace, H (cid:48) = H A ⊗ H B ⊗ H A (cid:48) ⊗ H B (cid:48) , (166)choosing the ancillary H A (cid:48) ⊗H B (cid:48) in such a way that thereexists a puriﬁcation | ψ (cid:105) ∈ H (cid:48) such that ρ AB = Tr H A (cid:48) ⊗H B (cid:48) | ψ (cid:105) (cid:104) ψ | . (167)Of course, this puriﬁcation is not unique, and the EoPis deﬁned in terms of the von Neumann entropy S ( ρ ) = − Tr( ρ log ρ ), as E P := inf | J (cid:105)∈H (cid:48) S (Tr BB (cid:48) | J (cid:105) (cid:104) J | ) , (168) i.e. , the minimum of the entanglement entropy betweensubsystems A ⊕ A (cid:48) and B ⊕ B (cid:48) . Accordingly, determiningthe EoP in general requires an optimization over the fullHilbert space H (cid:48) , which is a computationally intensivetask that quickly becomes unfeasible.A much more reasonable problem is to focus instead onthe Gaussian EoP , obtained by assuming that both theinitial mixed state and the puriﬁcation are Gaussian. The5optimization over all puriﬁcations then reduces to thefamiliar problem of optimization on the sub-manifold ofGaussian states composed from puriﬁcations of ρ AB . Theproperties of these states have been discussed in somedetail in section II F – in fact, the only diﬀerence to notehere is that in the context of EoP, we label the originalsubsystem by A ⊕ B rather than A and the ancillary by A (cid:48) ⊕ B (cid:48) rather than A (cid:48) .We recall from the previous discussion on Gaussianpuriﬁcations that the manifold of puriﬁcations can beparametrised in terms of complex structures J with re-strictions to the subsystems given by the restricted com-plex structures J AB , J A (cid:48) B (cid:48) , J A (cid:48) A , J BB (cid:48) . In Refs. [18, 34],an expression based on this parametrization was derivedfor the Gaussian entanglement entropy, ﬁrst deﬁned in[35] for bosons and [36] for fermions. The expressionreads S AA (cid:48) ( | J (cid:105) ) =  Tr (cid:16) (cid:49) A +i J AA (cid:48) log (cid:12)(cid:12)(cid:12) (cid:49) A +i J AA (cid:48) (cid:12)(cid:12)(cid:12)(cid:17) (bosons) − Tr (cid:16) (cid:49) A +i J AA (cid:48) log (cid:12)(cid:12)(cid:12) (cid:49) A +i J AA (cid:48) (cid:12)(cid:12)(cid:12)(cid:17) (fermions) , (169)and once again makes use of the complex structure for-malism to provide a uniﬁed expression for both bosonsand fermions. This expression can be framed more con-cisely by deﬁning D = ( (cid:49) + i J AA (cid:48) ) as S AA (cid:48) = (cid:40) Tr (cid:0) D log D (cid:1) (bosons) − Tr ( D log D ) (fermions) . (170)The derivative of the entanglement entropy can be ob-tained by a straightforward application of the productrule and the cyclicity of the trace as dS AA (cid:48) = (cid:40) Tr (cid:0) dD log D (cid:1) (bosons) − Tr ( dD log D ) (fermions) , (171)where we have deﬁned dD = i2 δJ AA (cid:48) . These results aresummarised in Table IV.Equipped with a manifold of pure Gaussian states anda scalar function and its derivative deﬁned on this man-ifold in terms of the complex structure, we are now ina position to employ our optimization algorithm to eﬃ-ciently compute the Gaussian EoP.In practice, we begin with a matrix representation ofthe mixed state reduced complex structure J AB in a basisˆ ξ = ( ˆ ξ A , ˆ ξ B ) which decomposes over the two subsystems A and B . We denote the transformation by T whichrelates J AB to its mixed state standard form J msta deﬁnedin (78), so that J AB = T J msta T − . (172)The transformation is obtained from the eigenvectors of J AB as discussed in section II F. We can now constructan initial puriﬁcation of the form in (78). In doing so,we are free to choose any dim( A (cid:48) B (cid:48) ) ≥ dim( AB ), and we speak of a minimal puriﬁcation when the number ofmodes in AB is the same as that in A (cid:48) B (cid:48) .The convenience of our choice of basis as ˆ ξ (cid:48) =( ˆ ξ A , ˆ ξ B , ˆ ξ A (cid:48) , ˆ ξ B (cid:48) ) now becomes evident: In this basis, thecomplex structure of the puriﬁed state will take the blockform J = (cid:18) J AB J AB,A (cid:48) B (cid:48) J A (cid:48) B (cid:48) ,AB J A (cid:48) B (cid:48) (cid:19) , (173)where the blocks on the main diagonal are the restrictedcomplex structures deﬁning the mixed states ρ AB and ρ A (cid:48) B (cid:48) . It should be noted that, in contrast, the oﬀ-diagonal blocks do not represent complex structures asthey map from A ⊕ B to A (cid:48) ⊕ B (cid:48) or vice versa. While J AB is ﬁxed to preserve the restriction to the original subsys-tem, varying J A (cid:48) B (cid:48) and the oﬀ-diagonal blocks in a com-patible way corresponds to diﬀerent puriﬁcations of ρ AB .The state manifold of interest is therefore parametrizedby the transformations M A (cid:48) B (cid:48) acting on the reduced com-plex structure J A (cid:48) B (cid:48) , which act on the full complex struc-ture as M = (cid:49) AB ⊕ M A (cid:48) B (cid:48) .We initialize the optimization algorithm at the initialpuriﬁcation, which is in the standard form and thereforethe very ﬁrst transformation must be of the form M = T ⊕ (cid:102) M (174)where M denotes an arbitrary starting point in the vari-ational manifold and the leading block in the transfor-mation returns J st AB to its initial form J AB . This ensuresthat the restriction of J = M J M − to A ⊕ B returnsthe initial reduced complex structure J AB . The optimiza-tion can then proceed in the way outlined in the previoussection, with steps M n = (cid:49) AB ⊕ (cid:102) M n (175)to ensure that the optimization procedure leaves J AB un-changed.A comprehensive study of Gaussian entanglement ofpuriﬁcation in free quantum ﬁeld theories based on ourmethods can be found in [17]. C. Gaussian complexity of puriﬁcation (CoP)

In ref. [37], it has ﬁrst been suggested that a notionof (circuit) complexity might provide fresh insights andmight meaningfully complement notions of entanglementin holography. Motivated by the subsequent interest in holographic complexity as well as a preceding geomet-ric interpretation of complexity in quantum circuits [38],signiﬁcant attention has been dedicated to extending no-tions of complexity to quantum ﬁeld theories [39, 40].Since the thermal and ground states of free quantumﬁelds are Gaussian, a framework for the complexity ofGaussian states has been developed in refs. [20, 21], whichwe draw on here.6A particular area of recent interest has been the studyof complexity of puriﬁcation (CoP) as a correlation mea-sure in composite quantum systems based on the notionof complexity rather than entanglement [41]. In this con-text, a typical problem would be the following: We aregiven a mixed state in some Hilbert space H A , which ischaracterized by a density matrix ρ A . We now deﬁne anew Hilbert space, H (cid:48) = H A ⊗ H A (cid:48) , (176)choosing the ancillary system B in such a way there existsa puriﬁcation | ψ T (cid:105) ∈ H (cid:48) such that ρ A = Tr A | ψ T (cid:105) (cid:104) ψ T | . (177)We refer to this puriﬁcation as the target state. TheCoP is deﬁned as the minimum of a complexity func-tion C with respect to some reference state ψ R over allpuriﬁcations of the initial state i.e. , C P = min | ψ T (cid:105)∈H (cid:48) C ( | ψ T (cid:105) , | ψ R (cid:105) ) . (178)There are several distinct proposals for the complexityfunction C ( | ψ T (cid:105) , | ψ R (cid:105) ) in the literature. In the contextof this work, we will once again focus only on GaussianCoP by making the assumption that both the referenceand target states are Gaussian in nature. For bosonic andfermionic Gaussian states, there exists a consensus def-inition associated with the geodesic distance betweenreference and target states, whose analytical expressionshave been derived in ref. [21] for bosons and in ref. [20]for fermions.The most concise formulation of this complexity func-tion unsurprisingly involves the relative complex struc-ture, introduced in (45), of the target and referencestates, ∆ = − J T J R (179)which captures all the information between the two. Interms of ∆, the complexity is then deﬁned as C = (cid:115) | Tr log (∆) | J A , as in the previous section. However, we mayalso choose a more computationally eﬃcient approach, bynoting a somewhat subtle point regarding the complexity Note, however, that even for Gaussian states, there also existalternative p -norm deﬁnitions [42]. function. By the cyclicity of the trace, any transforma-tion J T (cid:55)→ M J T M − will change the complexity (180) ina way equivalent to the transformation J R (cid:55)→ M − J R M .This means that we can choose to optimize over the man-ifold of pure reference states rather than target states.This seems arbitrary until we note that we may assumewithout loss of generality that the reference state is aproduct state between A and A (cid:48) i.e. , that the matrixrepresentation of J R in our basis is simply J R = [ J R ] A ⊕ [ J R ] A (cid:48) , (181)where the subscripts A and A (cid:48) denote the restrictions toeither subsystem.A comprehensive study of Gaussian complexity of pu-riﬁcation in free quantum ﬁeld theories based on ourmethods can be found in [17]. VI. OPTIMALITY OF GAUSSIAN EOP

This section focuses on a speciﬁc application deﬁnedand reviewed in the previous section V B, namely en-tanglement of puriﬁcation. We combine our numericalresults from our numerical algorithm with several analyt-ical arguments to support the conjecture that for mixedGaussian states only Gaussian puriﬁcations are requiredto compute the entanglement of puriﬁcation.

A. Conjectures on optimality

We will present numerical and analytical argumentsfor the validity of the following two conjectures.

Conjecture 1 (Gaussian optimality conjecture) . Givena mixed Gaussian state ρ of a bosonic or fermionic sys-tem and a system decomposition V = A ⊕ B with N A and N B degrees of freedom, respectively, it is suﬃcientto optimize over all Gaussian puriﬁcations to computethe entanglement of puriﬁcation (minimal entanglemententropy S AA (cid:48) over all puriﬁcations in H A ⊗ H B ⊗ H A (cid:48) ⊗H B (cid:48) ), i.e. , the global minimum of S AA (cid:48) is reached on thesubmanifold of Gaussian states. Conjecture 2 (Minimum puriﬁcation conjecture) . When minimizing the S AA (cid:48) over all Gaussian puriﬁca-tions, the minimum is reached when choosing the num-bers of degrees of freedom of the purifying systems A (cid:48) and B (cid:48) to be given by the respective numbers of degreesof freedom in A and B , i.e. , N A (cid:48) = N A and N B (cid:48) = N B . At ﬁrst sight, this conjecture may appear rather ambi-tious, considering that we assume that the optimizationover the generally exponentially small family of Gaus-sian puriﬁcation (compared to all non-Gaussian puriﬁ-cations) is suﬃcient and that the number of purifyingdegrees of freedom just need to match the ones of the7original system (in contrast to the bounds for ﬁnite di-mensional non-Gaussian systems from ref. [32]). How-ever, for researchers familiar with typical properties ofGaussian states, our conjectures will likely appear muchmore realistic, considering that Gaussian states providein many settings one of the simplest non-trivial realiza-tions of quantum information concepts. This appearsin the context analytical formulas for the entanglemententropy and other correlations measures (such as the log-arithmic negativity).Let us emphasize that we have formulated two distinctconjectures that only together provide us with clear in-structions on how the full entanglement of puriﬁcationcan be computed numerically from the Gaussian opti-mization algorithm presented in the previous section.Conjecture 1 ensures that for mixed Gaussian states, weonly need to consider Gaussian puriﬁcations, but to ac-tually run the algorithm, we need to choose both thetotal number N of degrees of freedom to purify as wellas how we split these purifying degrees of freedom intothe auxiliary subsystems A (cid:48) and B (cid:48) . B. Numerical evidence

We provide numerical support for conjectures 1 and 2based on two paradigmatic models. For bosons, we con-sider the Klein-Gordon scalar ﬁeld with mass m , dis-cretized on a one-dimensional periodic lattice with N sites, equipped with the Hamiltonianˆ H = δ N (cid:88) i =1 (cid:18) ˆ π i + m δ ˆ ϕ i + 1 δ ( ˆ ϕ i − ˆ ϕ i +1 ) (cid:19) , (182)where δ > H = − N (cid:88) i =1 (2 J ˆ S x i ˆ S x i +1 + h ˆ S z i ) (183)in the critical limit J = h . Here, S x i and S z i represent thelocal spin-1 / i -thsite (the conventions match [43–45]).Providing categorical numerical evidence for the ﬁrstconjecture proves a substantial challenge, since it requiresan optimization over the entire Hilbert space, which is thedaunting problem that our Gaussian approach is tryingto circumvent.In the fermionic case, the ﬁnite-dimensional Hilbertspace allows us to adapt our approach to gradient de-scent using Lie groups and algebras to this problem byoptimizing over the (compact) group of unitary transfor-mations U(2 N ) of an N -mode density operator. On themanifold of transformations U ∈ U(2 N ) with respect tosome reference state ρ , parametrizing the non-Gaussianpuriﬁcations according to ρ U = U ρ U † , we can deﬁne theentanglement entropy function as S AA (cid:48) = − Tr ( ρ AA (cid:48) log ρ AA (cid:48) ) , (184) d Non-Gaussian Gaussian10 0.00306129 0.0030610130 0.00038316 0.0003829150 0.00000166 0.0000015170 0.00046825 0.0004680190 0.00434489 0.00434461TABLE V.

Numerical evidence for conjecture 1.

We presentthe numerically computed non-Gaussian EoP to 7 s.f. fordisjoint intervals of width N A = N B = 1 (which we purifywith N A (cid:48) = N B (cid:48) = 1) at a distance of d sites in the fermioniccritical transverse Ising model, on a circle with N = 100, with J = h = 1. We contrast this with the Gaussian EoP result. with ρ AA (cid:48) = Tr BB (cid:48) (cid:0) U ρ U − (cid:1) . In line with the previousdiscussion, we can also deﬁne the derivative of S AA (cid:48) as dS AA (cid:48) = − Tr ( δρ AA (cid:48) log ρ AA (cid:48) ) , (185)with δρ AA (cid:48) = Tr BB (cid:48) (cid:16) U [ ρ , ˆ K ] U − (cid:17) for ˆ K ∈ u (2 N ).It should be noted that computing the partial trace fora fermionic density operator in this context is non-trivial.In practice, we construct the initial puriﬁed density oper-ator in the convenient basis ˆ ξ = ( ˆ ξ A , ˆ ξ B , ˆ ξ A (cid:48) , ˆ ξ B (cid:48) ). Trac-ing out the subsystem BB (cid:48) therefore involves a permu-tation of the degrees of freedom, in the sense ρ ABA (cid:48) B (cid:48) (cid:55)→ ρ AA (cid:48) BB (cid:48) . While such a re-ordering is trivial for commut-ing bosonic degrees of freedom (or spin degrees of free-dom), the permutation of fermionic creation operators doanti-commute, so the computation of the partial traceto ﬁnd ρ AA (cid:48) will lead to extra sign ﬂips due to the re-quired permutations. This is subtle, but well-understoodin various contexts [46–49] and already taken into ac-count when we computed the entanglement entropy offermionic Gaussian states in (169). Evidence for conjecture 1.

This approach to non-Gaussian optimization proves eﬃcient at small systemsizes, however, the computational eﬀort grows exponen-tially in the number of degrees of freedom in the systemand it soon becomes unfeasible. Table V shows the non-Gaussian EoP for the fermionic (critical) Ising model,within the numerically accessible regime. Evidently, thisdata supports the conjecture that the optimal puriﬁca-tion of a mixed Gaussian state is Gaussian.

Evidence for conjecture 2.

We can tackle the sec-ond conjecture in a more comprehensive way, since itonly requires us to perform Gaussian optimization. Ta-ble VI shows the numerical Gaussian EoP for a varietyof dimensions, for both the bosonic and fermionic cases.Evidently, here we also see good agreement between thenumerical results and our expectations based on the con-jecture.8 N A + N B N A (cid:48) + N B (cid:48) d = 10 0.01861871 0.02073452 0.11482986 0.02187326 0.02371109 0.17471765 0.11708904 0.02307170 0.11708905 d = 30 0.00022978 0.00026256 0.09482403 0.00028175 0.00223536 0.15435431 0.09486090 0.00029999 0.09486090 d = 50 0.00001590 0.00002022 0.09458539 0.00002412 0.00197901 0.15410719 0.09459102 0.00002588 0.09459102 d = 70 0.00034749 0.00048793 0.09504583 0.00064375 0.00259556 0.15470108 0.09523927 0.00068457 0.09523927 d = 90 0.03052751 0.04357474 0.13688870 0.05929784 0.06093414 0.20883147 0.15464517 0.06226844 0.15464518Critical transverse ﬁeld Ising model d = 10 0.00288040 0.00639951 0.06677170 0.00933387 0.01324964 0.11126531 0.07202181 0.01234729 0.07202181 d = 30 0.00057335 0.00135921 0.06301066 0.00210074 0.00604134 0.10643203 0.06426257 0.00276998 0.06426256 d = 50 0.00040596 0.00098339 0.06273091 0.00155212 0.00549403 0.10606776 0.06367226 0.00204620 0.06367227 d = 70 0.00062304 0.00153900 0.06314448 0.00247813 0.00641779 0.10668275 0.06466840 0.00326765 0.06466840 d = 90 0.00408126 0.01078810 0.07007032 0.01883801 0.02269729 0.11772862 0.08218632 0.02506887 0.08218632TABLE VI. Numerical evidence for conjecture 2.

We present the numerically computed Gaussian EoP to 9 s.f. for disjointintervals of width N A and N B at a distance of d sites in the Klein-Gordon model (top) and critical transverse Ising model(bottom), on a circle with N = 100 sites. For the Klein-Gordon model, we set the mass to m/δ = 0 .

1, and for the Ising model,we set J = h = 1. The optimal puriﬁcation, highlighted in color, is evidently obtained for equal numbers of degrees of freedomin the original subsystems and the corresponding subsystems of the ancillary, i.e. , N A = N A (cid:48) and N B = N B (cid:48) . C. Analytical bounds

As alluded in the previous section, it is highly plausibleto conjecture that the puriﬁcation for which the entangle-ment entropy is minimized belongs to the class of Gaus-sian states (conjecture 1). We have some further analyt-ical evidence for this: After all, the map from quantumstates on H (cid:48) to ones on H A ⊗ H A (cid:48) performing a par-tial trace over the complement of H A ⊗ H A (cid:48) can be seenas a constrained Gaussian channel, reﬂecting the con-straint that the inputs must be such that the reductionsto H A ⊗ H B are precisely the given Gaussian states ρ AB .Captured in this way, the entanglement of puriﬁcationcan be seen as a solution to a minimum output entropy problem of a Gaussian quantum channel [50, 51], a prob-lem in which the von-Neumann entropy of the output ofa quantum channel is minimized under varying the inputof the channel. At least for Gaussian bosonic systemshas been settled under rather general conditions (albeitnot under the speciﬁc constraints considered here). Thisconnection will be made more precise elsewhere. Notreferring to this conjecture, the Gaussian entanglementof puriﬁcation (only allowing for Gaussian puriﬁcations),constitutes an upper bound for E P .That said, the quality of approximation can bebounded by a lower bound to E P that can be computed[32]. This is the entanglement of formation (EoF) E F of ρ AB [52], satisfying E F ( ρ AB ) ≤ E P ( ρ AB ) , (186)and being deﬁned as the inﬁmum E F ( ρ AB ) := inf (cid:88) j p j S (Tr B | ψ j (cid:105) (cid:104) ψ j | ) (187) with (cid:88) j p j | ψ j (cid:105) (cid:104) ψ j | = ρ AB . (188)The entanglement of formation can be in instances com-puted and also conveniently bounded [53]. The easiestsuch lower bounds, valid for arbitrary as well as for Gaus-sian states, for which it is extremal both in the bosonicand fermionic [54] setting, is the hashing bound S (Tr B ρ AB ) − S ( ρ AB ) ≤ E F ( ρ AB ) ≤ E P ( ρ AB ) . (189)Another insight helpful in the numerical optimization ofthe Gaussian entanglement of puriﬁcation is a bound tothe number of auxiliary modes constituting systems A (cid:48) B (cid:48) that is required without restricting generality. Naively,one might expect that one needed a squared number ofbosonic or fermionic modes in the puriﬁcation. In fact,it is easy to see that one can restrict systems A (cid:48) B (cid:48) to becomposed of as many modes N A (cid:48) and N B (cid:48) as A and B consist of, i.e. , N A and N B .Given a quantum state ρ AB , it will be associated withsome J AB . As discussed in section II F, we can alwaysﬁnd a basis, such that J AB takes the standard form ofa mixed state given by (78). Note, however, that thiswill be in general with respect to a basis that mixes thedegrees of freedom of A and B . If we start with a basisˆ ξ = ( ˆ ξ A , ˆ ξ B ), there exists a group transformation T AB ∈G AB , such that J AB ≡ (cid:18) J A J A,B J B,A J B (cid:19) = T AB J msta T − AB , (190)where the mixed state standard form was deﬁned in (78).We can use this M AB , which combines A and B to con-struct a puriﬁcation | J (cid:105) ABA (cid:48) B (cid:48) , in which A (cid:48) and B (cid:48) are9correlated in the same way. For this, we complete thebasis ˆ ξ = ( ˆ ξ A , ˆ ξ B ) from (190) to ˆ ξ (cid:48) = ( ˆ ξ A , ˆ ξ B , ˆ ξ A (cid:48) , ˆ ξ B (cid:48) )and choose in this basis J ≡ T J psta T − with T = T AB ⊕ T A (cid:48) B (cid:48) , (191) i.e. , we use the same transformation T AB to combine ˆ ξ A and ˆ ξ B as we use to mix ˆ ξ A (cid:48) and ˆ ξ B (cid:48) . Here, we have thepuriﬁed standard form J psta from (82). From our numeri-cal studies, we know that this choice is generally not theoptimal one, but it provides a meaningful starting pointfor our optimization algorithm.We can also move on to arrive at analytical upperbounds, however. For this purpose, we block-diagonalizethe submatrices [ J ] A and [ J ] B individually (rather than[ J ] AB as a whole as in (190)), i.e. , we write J ≡ M ˜ JM − with M = M A ⊕ M B ⊕ (cid:49) A (cid:48) B (cid:48) (192)with M A ∈ G A and M B ∈ G B , so that˜ J AB =  ˜ c A (cid:65) · · · X · · · ˜ c AN A (cid:65) ˜ c B (cid:65) · · · − X (cid:124) ... . . . ...0 · · · ˜ c BN B (cid:65)  , (193)where (cid:65) has been introduced before and X is some2 N A × N B rectangular matrix and ˜ c Ai and ˜ c Bi are realnumbers in [0 , ∞ ) for bosons and [0 ,

1] for fermions. As˜ J is just originating from a basis transformation of thepuriﬁed J and J psta , we have˜ J = − (cid:49) . (194)One can now arrive at analytical upper bounds, acknowl-edging the following insight. The von-Neumann entropyof AA (cid:48) can be computed from the reduction [ ˜ J ] AA (cid:48) .One ﬁnds for the reduction the von-Neumann entropyvia (169). This way, the von-Neumann entropy formulacan directly be computed on the level of [ ˜ J ] AA . Let P be the pinching that projects the matrix ˜ J into the 2 × J . Since i ˜ J is Hermitian (with respectto the inner product of g ), such a pinching will render theresulting matrix P (i ˜ J ) more mixed than i ˜ J in the senseof majorization [55], i.e. , if the non-increasingly orderedeigenvalues of i ˜ J AA (cid:48) are ± i ˜ λ i and the ones of P (i ˜ J AA (cid:48) )are ± i ˜ λ (cid:48) i , we will have j (cid:88) i =1 ˜ λ (cid:48) i ≤ j (cid:88) i =1 ˜ λ i and N A + N A (cid:48) (cid:88) i =1 ˜ λ (cid:48) i = N A + N A (cid:48) (cid:88) i =1 ˜ λ i (195)for all j in the ﬁrst equation. Since the function S AA (cid:48) ( ˜ J )as a function of ˜ J is Schur-concave both for bosons and fermions, we have S AA (cid:48) ( P (i ˜ J )) ≥ S AA (cid:48) (i ˜ J ) , (196)again for both bosons and fermions. In other words, thepinched matrix gives rise to an upper bound to the von-Neumann entropy of the involved quantum states andhence also an upper bound to the entanglement of pu-riﬁcation. That said, now the eigenvalues entering theexpression can be read oﬀ directly, giving rise to an ex-plicit formula of an upper bound of the entanglement ofpuriﬁcation. This mindset can be used to avoid costly nu-merical optimization and to study systems in the thermo-dynamic limit, while still arriving at reasonable bounds. D. Proof of local optimality

Some further analytical evidence in support of con-jecture 1 is provided by the fact that the entanglemententropy S AA (cid:48) is locally optimal for a Gaussian puriﬁca-tion, i.e. , we will prove that after ﬁnding the optimalGaussian puriﬁcation | J (cid:105) ABA (cid:48) B (cid:48) with minimal S AA (cid:48) anyinﬁnitesimal non-Gaussian change of | J (cid:105) ABA (cid:48) B (cid:48) will notlower S AA (cid:48) . For simplicity of notation, we write | J (cid:105) forthe puriﬁcation | J (cid:105) ABA (cid:48) B (cid:48) on ABA (cid:48) B (cid:48) . We consider themixed Gaussian state ρ AB . Let us deﬁne | J (cid:105) as the op-timal Gaussian puriﬁcation, i.e. , a Gaussian state vectorsuch that ρ AB = Tr A (cid:48) B (cid:48) | J (cid:105) (cid:104) J | (197)and such that the entanglement entropy S AA (cid:48) ( | J (cid:105) (cid:104) J | ) = S ( ρ AA (cid:48) ) is minimal among all Gaussian states. In prac-tice, we would choose here N A = N A (cid:48) and N B = N B (cid:48) assuggested by conjecture 2, but this is not important forthe argument.As discussed in section III H, we can write the mixedGaussian ρ AA (cid:48) = exp( − ˆ H AA (cid:48) ) /Z with ˆ H AA (cid:48) = q ab ˆ ξ a ˆ ξ b and Z = e c based on formula 8. If we now perturbour optimal puriﬁcation in a non-Gaussian way, i.e. , byapplying a unitary | ψ (cid:15) (cid:105) = ( (cid:49) A ⊗ (cid:49) B ⊗ U A (cid:48) B (cid:48) ( (cid:15) )) | J (cid:105) (198)with U A (cid:48) B (cid:48) (0) = (cid:49) A (cid:48) ⊗ (cid:49) B (cid:48) , the ﬁrst law of entanglemententropy [56–58] states that the linear change of δS AA (cid:48) around (cid:15) = 0 is given by δS AA (cid:48) = dd(cid:15) (cid:104) ψ (cid:15) | ˆ H AA (cid:48) | ψ (cid:15) (cid:105) (cid:12)(cid:12) (cid:15) =0 . (199)However, we note that ˆ H AA (cid:48) is a quadratic Hamiltonian,which implies that the ﬁrst order change of the entangle-ment entropy will only feel the change of the two-pointfunction of | ψ (cid:15) (cid:105) . This means that at linear order, wecan replace the change of | ψ (cid:15) (cid:105) by a Gaussian change ofthe state. However, by assumption the state vector | J (cid:105) has been the optimal Gaussian puriﬁcation, such that0any Gaussian perturbation will always increase the en-tanglement entropy S AA (cid:48) . Moreover, as the Gaussianpuriﬁcation | J (cid:105) has been assumed to be optimal amongall Gaussian puriﬁcations, the variation δS AA (cid:48) will van-ish at linear order. In summary, even if we allow fornon-Gaussian perturbations of | J (cid:105) , we will have δS AA (cid:48) = 0 . (200)However, this does not exclude the possibility of a ﬁnitetransformation U (cid:15) to lower the entanglement entropy, butconstitutes a ﬁrst step towards proving that the Gaussianpuriﬁcation is optimal. VII. DISCUSSION

We have presented a geometric approach to optimizeover arbitrary diﬀerentiable functions on the manifolds ofpure bosonic or fermionic Gaussian states. Our method isbased on the well-known gradient descent algorithm, butexploits the natural action of a Lie group onto these man-ifolds to move between diﬀerent Gaussian states. Thisway, we can eﬃciently perform gradient descent with re-spect to the Fubini-Study metric associated to the man-ifold of Gaussian states. In the context of variationalfamilies, it is an important question if a given manifoldsatisﬁes the so-called K¨ahler property [15], but for thepurpose of gradient descent on Gaussian manifolds thisproperty is not important and we show explicitly howour approach can be applied to suitable Gaussian sub-manifolds (generated by subgroups of the symplectic ororthogonal group).For the most part of this manuscript, we used a newformalism for the treatment of Gaussian states thatlargely uniﬁes the bosonic and fermionic case and em-phasizes their similarities. This formalism is based onthe geometric K¨ahler structures consisting of a metric G , a symplectic form Ω and a complex structure J onthe classical phase space V of the theory, as reviewedin section II. In order to carefully distinguish if a matrixrepresents a linear map (such as J ), a bilinear form (suchas G and Ω) or a dual bilinear form (such as g and ω ), weused the index position of a respective matrix entry (suchas J ab vs. G ab ). As there are many equivalent ways todescribe and parametrize Gaussian states, we provideda comprehensive dictionary in section III to allow for aseamless conversion between diﬀerent formalisms. Thisdictionary may also be of use to other applications in-volving Gaussian states.We have further implemented our optimization algo-rithm numerically to study three applications that arerelevant for condensed matter physics, quantum informa-tion and high energy theory, namely ﬁnding approximateground states, computing the Gaussian entanglement ofpuriﬁcation (EoP) and ﬁnally calculating the so-calledGaussian complexity of puriﬁcation (CoP). For each ofthese applications, we have reviewed the key ingredients of our optimization procedure, namely an analytical ex-pression for the function and its gradient in terms of thecomplex structure J parametrizing our Gaussian statefamily.In section VI, we combined numerical and analyticalinsights to support a conjecture on the optimality ofGaussian entanglement of puriﬁcation, i.e. , we claimedthat for a mixed Gaussian state it is suﬃcient to op-timize entanglement of puriﬁcation only over Gaussianstates. This claim was supported by numerical evidencefrom small fermionic systems, where we can also performthe full optimization over all puriﬁcations and ﬁnd that itagrees with one over only Gaussian puriﬁcations. More-over, we show analytically that the Gaussian entangle-ment of puriﬁcation is locally optimal even in the largerset of non-Gaussian optimizations. Finally, our conjec-ture also makes a statement about the required numberof degrees of freedom (and their distribution) in the pu-rifying subsystem. This is supported by our numerics, aswell.The key reason why we do not need to re-evaluate theFubini-Study metric at each step of our optimization al-gorithm lies in the fact that our optimization manifold(Gaussian states or suitable submanifolds) are generatedby the Lie group G (Sp(2 N, (cid:82) ) for bosons, O(2 N, (cid:82) ) forfermions) or a suitable subgroup G (cid:48) . As we have a uni-tary representation U ( M ) of group elements M ∈ G , theHilbert space inner product is preserved under the left-action of this group. It therefore suﬃces to choose an or-thonormal basis of Lie algebra elements at one point (ata given reference state vector | J (cid:105) in the manifold) andthis basis will stay orthonormal when moving to otherstates via the group action U ( M ) | J (cid:105) = | M J M − (cid:105) . An-other advantage is that we naturally ensure to not over-parametrize, i.e. , we can remove those Lie algebra ele-ments that do not change the reference state | J (cid:105) whichensures via the natural group action that we also donot have redundant directions at other states. All ofthese desirable properties also apply to other families ofpure states, as long as they are generated from some uni-tary representation of a Lie group. A prominent exam-ple of such families are the so-called group theoretic co-herent states introduced by Gilmore [59, 60] and Perelo-mov [61, 62]. The only diﬀerence to the Gaussian caseis that we may not have equally simple analytical for-mulas for the functions we would like to optimize, suchas expectation values (Wick’s theorem) or entanglemententropies. Of course, our method also applies to thefamily of all pure states (projective Hilbert space) andin fact, we already used an appropriately adjusted ver-sion of our algorithm when we computed the full non-Gaussian entanglement of puriﬁcation in section VI Bfor small fermionic systems (large fermionic or generalbosonic systems are not feasible due to the large or inﬁ-nite dimension of the associated Hilbert space).1 ACKNOWLEDGMENTS

We thank Eugenio Bianchi, Hugo Camargo, Igna-cio Cirac, Tommaso Guaita, Michal Heller, TadashiTakayanagi and Tao Shi for inspiring discussions. BWis supported in part by the Heinrich B¨oll Foundation un-dergraduate scholarship scheme and the Imperial CollegePresident’s Undergraduate Scholarship and gratefully ac-knowledges the hospitality from MPQ. AJ has been sup-ported by the FQXi as well as the Perimeter Institutefor Theoretical Physics. Research at Perimeter Insti- tute is supported by the Government of Canada throughthe Department of Innovation, Science, and EconomicDevelopment, and by the Province of Ontario throughthe Ministry of Research and Innovation. JE has beensupported by the DFG (CRC 183, project B01, FOR2724, EI 519/14-1). This work has also received fund-ing from the European Union’s Horizon 2020 researchand innovation programme under grant agreement No.817482 (PASQuanS). LH acknowledges support by VIL-LUM FONDEN via the QMATH center of excellence(grant No. 10059). [1] C. Weedbrook, S. Pirandola, R. Garcia-Patron, N. J.Cerf, T. C. Ralph, J. H. Shapiro, and S. Lloyd, “Gaus-sian quantum information,” Rev. Mod. Phys. , 621(2012).[2] J. Eisert and M. B. Plenio, “Introduction to the basicsof entanglement theory in continuous-variable systems,”Int. J. Quant. Inf. , 479 (2003).[3] G. Adesso, S. Ragy, and A. R. Lee, “Continuous vari-able quantum information: Gaussian states and beyond,”Open Sys. Inf. Dyn. , 1440001 (2014).[4] S. Bravyi, “Lagrangian representation for fermionic lin-ear optics,” Quantum Inf. and Comp. , 216–238 (2005).[5] C. J. Pethick and H. Smith, Bose–Einstein condensationin dilute gases (Cambridge university press, 2008).[6] T. Guaita, L. Hackl, T. Shi, C. Hubig, E. Demler, andJ. I. Cirac, “Gaussian time-dependent variational prin-ciple for the Bose-Hubbard model,” Phys. Rev. B ,094529 (2019).[7] D. F. Walls and G. J. Milburn,

Quantum optics (SpringerScience & Business Media, 2007).[8] J. Bardeen, L. N. Cooper, and J. R. Schrieﬀer, “Theoryof superconductivity,” Phys. Rev. , 1175 (1957).[9] D. R. Hartree, “The wave mechanics of an atom with anon-coulomb central ﬁeld. part i. theory and methods,”in

Math. Proc. Camp. Phil. Soc. , Vol. 24 (Cambridge Uni-versity Press, 1928) pp. 89–110.[10] M. Peskin,

An introduction to quantum ﬁeld theory (CRCpress, 2018).[11] D. E. Bruschi, J. Louko, E. Mart´ın-Mart´ınez, A. Dragan,and I. Fuentes, “Unruh eﬀect in quantum informationbeyond the single-mode approximation,” Phys. Rev. A , 042332 (2010).[12] A. Ashtekar and A. Magnon, “Quantum ﬁelds in curvedspace-times,” Proc. Roy. Soc. Lond. , 375–394 (1975).[13] G. Fubini, Sulle metriche deﬁnite da una forma hermi-tiana: nota , Vol. 63 (Oﬃce graf. C. Ferrari, 1904) pp.502–513.[14] E. Study, “K¨urzeste Wege im komplexen Gebiet,” Math-ematische Annalen , 321–378 (1905).[15] Lucas Hackl, Tommaso Guaita, Tao Shi, Jutho Haege-man, Eugene Demler, and J. Ignacio Cirac, “Geome-try of variational methods: dynamics of closed quantumsystems,” SciPost Phys. , 048 (2020), arXiv:2004.01015[quant-ph].[16] Lucas Hackl and Eugenio Bianchi, “Bosonic andfermionic gaussian states from k¨ahler structures,” arXive-prints , arXiv–2010 (2020). [17] Hugo A. Camargo, Lucas Hackl, Michal P. Heller,Tadashi Takayanagi, and Bennet Windt, “Entanglementand complexity of puriﬁcation in (1+1)-dimensional freeconformal ﬁeld theories,” in preparation (2020).[18] L. F. Hackl, “Aspects of gaussian states: Entanglement,squeezing and complexity,” (2018).[19] R. M. Wald, General relativity (University of Chicagopress, 2010).[20] L. Hackl and R. C. Myers, “Circuit complexity for freefermions,” JHEP , 139 (2018).[21] S. Chapman, J. Eisert, L. Hackl, M. P. Heller, R. Jef-ferson, H. Marrochio, and R. C. Myers, “Complexityand entanglement for thermoﬁeld double states,” SciPostPhys. , 034 (2019).[22] J. Haegeman, J. I. Cirac, T. J. Osborne, I. Piˇzorn, H. Ver-schelde, and F. Verstraete, “Time-dependent variationalprinciple for quantum lattices,” Phys. Rev. Lett. ,070601 (2011).[23] C. M. Dawson, J. Eisert, and T. J. Osborne, “Unifyingvariational methods for simulating quantum many-bodysystems,” Phys. Rev. Lett. , 130501 (2008).[24] L. Hackl and R. H. Jonsson, “Minimal energy cost ofentanglement extraction,” Quantum , 165 (2019).[25] A. Kenfack and K. Zyczkowski, “Negativity of theWigner function as an indicator of non-classicality,” J.Opt. B , 396 (2004).[26] A. Mari, K. Kieling, B. Melholt Nielsen, E. S. Polzik,and J. Eisert, “Directly estimating nonclassicality,” Phys.Rev. Lett. , 010403 (2011).[27] K. E. Cahill and R. J. Glauber, “Ordered expansionsin boson amplitude operators,” Phys. Rev. , 1857(1969).[28] D. R. Truax, “Baker-Campbell-Hausdorﬀ relations andunitarity of SU (2) and SU (1 ,