Clock Quantum Monte Carlo: an imaginary-time method for real-time quantum dynamics
CClock Quantum Monte Carlo: an imaginary-time method for real-time quantumdynamics
Jarrod R. McClean and Al´an Aspuru-Guzik Department of Chemistry and Chemical Biology, Harvard University, Cambridge MA, 02138
In quantum information theory, there is an explicit mapping between general unitary dynamicsand Hermitian ground state eigenvalue problems known as the Feynman-Kitaev Clock. A prominentfamily of methods for the study of quantum ground states are quantum Monte Carlo methods, andrecently the full configuration interaction quantum Monte Carlo (FCIQMC) method has demon-strated great promise for practical systems. We combine the Feynman-Kitaev Clock with FCIQMCto formulate a new technique for the study of quantum dynamics problems. Numerical examplesusing quantum circuits are provided as well as a technique to further mitigate the sign problemthrough time-dependent basis rotations. Moreover, this method allows one to combine the par-allelism of Monte Carlo techniques with the locality of time to yield an effective parallel-in-timesimulation technique.
I. INTRODUCTION
Understanding the evolution of quantum systems isa central problem in physics and the design of emerg-ing quantum technologies. However, exact simulationsof quantum dynamics suffer from the so-called curse ofdimensionality [1]. That is, the dimension of the Hilbertspace grows exponentially with the size of the physicalsystem. An effective remedy for the curse of dimension-ality in some classical systems has been the use of MonteCarlo methods, which in many cases has an error with re-spect to number of samples that is independent of the di-mension of the simulated system [2]. Unfortunately thisfavorable scaling is often lost in quantum systems of inter-est due to the emergence of the famous sign problem. Inparticular, it has hindered the use of Monte Carlo meth-ods for fermionic systems, where it is sometimes called“the fermion sign problem”, and for real-time dynamicsof general quantum systems, where it is known as “thedynamical sign problem”. The generic sign problem hasbeen proven to belong to the computational complexityclass NP-Complete [3], and recent studies of complexityhave refined knowledge about the computational powerof sign-problem free (or “stoquastic”) Hamiltonians [4, 5].However, these results do not preclude the effective use ofthese methods on many interesting instances of physicalproblems.In particular, despite the generic challenges of the signproblem, Monte Carlo methods have been used withgreat success in the study of electronic systems, providinga standard of accuracy in quantum chemistry and con-densed matter [6–10]. In some of these methods, such asfixed node diffusion Monte Carlo, the use of a trial wave-function allows one to approximately remove the compli-cations of the sign problem at the cost of a small bias inthe resulting energy. One alternative to such an approx-imation is the use of interacting walker methods [11],which attempt to solve the problem exactly without thebias introduced by a trial function. Recently, Booth et. alintroduced an interacting walker method in the discretebasis of Slater determinants called Full Configuration In- teraction Quantum Monte Carlo (FCIQMC) [12]. Thesign problem in the context of this algorithm has beenstudied in some detail [13–15] and it has been successfullyapplied to both small molecular systems of chemical in-terest and extended bulk systems [16, 17].The use of Monte Carlo methods to study the real-timedynamics of generic quantum systems has been compar-atively less prevalent [18]. The dynamical sign problemmay become more severe both with the size of the system,and duration for which it is simulated [19–21]. Despitethese challenges, advances are being made in the treat-ment of these problems, including hybridization of MonteCarlo techniques with other methods [22–26].The sign problem has been studied in the context ofquantum computation, where it is known that a suffi-cient condition for efficient probabilistic classical simula-tion of the adiabatic evolution of a quantum system usingMonte Carlo methods is that the Hamiltonian governingthe quantum system is sign problem free (also knownas stoquastic) and frustration free [4, 5, 27]. ProjectorMonte Carlo algorithms have been developed specificallyfor this type of problem [4, 28]. Moreover, the use oftools from quantum information allows any generic uni-tary evolution of a quantum system to be written as theground state eigenvalue problem of a Hermitian Hamil-tonian [29–31]. In this work, we exploit this equivalenceto adapt the interacting walker method introduced byBooth et. al [12] to treat the dynamical sign problemwith a method designed for the fermion sign problem.The paper is organized as follows. First, we review thetime-embedded discrete variational principle [31] and de-rive from it the Clock Hamiltonian [29–31], which are theessential tools for writing a general unitary evolution as aground state eigenvalue problem of a Hermitian Hamilto-nian. We then review the FCIQMC method and adapt itfor application to the Clock Hamiltonian. A discussion ofthe theoretical and practical manifestation of the dynam-ical sign problem in this setting follows with numericalexamples from quantum computation. Finally, we intro-duce a general framework of basis rotations which canbe used to ameliorate the sign problem and study the a r X i v : . [ qu a n t - ph ] O c t performance of this method when used in parallel com-putation. II. DYNAMICS AS A GROUND STATEPROBLEM
It has been shown that any unitary quantum evolutionmay be formulated as a ground state eigenvalue problemwith applications to classical simulation of quantum sys-tems [31]. We briefly review the relevant results here sothat this work remains self-contained.Consider a quantum system that is described at dis-crete time steps t by a normalized wavefunction | Ψ t (cid:105) .The dynamics of this system are described by a sequenceof unitary operators { U t } such that U t | Ψ t (cid:105) = | Ψ t +1 (cid:105) and U † t | Ψ t +1 (cid:105) = | Ψ t (cid:105) . From the properties of unitary evolu-tion, the following is clear:2 − (cid:104) Ψ t +1 | U t | Ψ t (cid:105) − (cid:104) Ψ t | U † t | Ψ t +1 (cid:105) = 0 . (1)Moreover, if the wavefunctions at each point in timeare only approximately known (but normalized) then (cid:88) t (cid:16) − (cid:104) Ψ t +1 | U t | Ψ t (cid:105) − (cid:104) Ψ t | U † t | Ψ t +1 (cid:105) (cid:17) ≥ t and is orthonormal suchthat (cid:104) t (cid:48) | t (cid:105) = δ t,t (cid:48) . With this construction, we see that bydefining H (cid:48) = 12 (cid:16) (cid:88) t I ⊗ | t (cid:105) (cid:104) t | − U t ⊗ | t + 1 (cid:105) (cid:104) t |− U † t ⊗ | t (cid:105) (cid:104) t + 1 | + I | t + 1 (cid:105) (cid:104) t + 1 | (cid:17) (3)which acts on the composite system-time Hilbert space,all valid time evolutions minimize S = (cid:88) t,t (cid:48) (cid:104) t (cid:48) | (cid:104) Ψ t (cid:48) | H (cid:48) | Ψ t (cid:105) | t (cid:105) . (4)Note that we have adopted the convention of script let-ters for operators acting in the system-time Hilbert spacesuch as H (cid:48) as opposed to operators only acting on thesystem such as U t . The time-embedded discrete varia-tional principle immediately follows, which simply statesthat this quantity is stationary under variations of thewavefunction δ | Ψ t (cid:105) for all valid time evolutions, or δ S = δ (cid:88) t,t (cid:48) (cid:104) t (cid:48) | (cid:104) Ψ t (cid:48) | H (cid:48) | Ψ t (cid:105) | t (cid:105) = 0 (5)To select a particular evolution of interest, one mayintroduce a penalty operator that fixes the state of the system at a given time. Typically, this might represent aknown initial state, and this operator in the system-timeHilbert space is given by C = ( I − | Ψ (cid:105) (cid:104) Ψ | ) ⊗ | (cid:105) (cid:104) | . (6)The minimization of a Hermitian quadratic form con-strained to have unit norm is equivalent to the eigenvalueproblem for the corresponding Hamiltonian. We intro-duce the Lagrange multiplier λ to enforce normalization.As both S and C are Hermitian by construction, mini-mization of L = (cid:88) t,t (cid:48) (cid:104) t (cid:48) | (cid:104) Ψ t (cid:48) | H (cid:48) + C | Ψ t (cid:105) | t (cid:105)− λ (cid:88) t,t (cid:48) (cid:104) t (cid:48) | (cid:104) Ψ t (cid:48) | Ψ t (cid:105) | t (cid:105) − (7)is equivalent to solving for the eigenvector correspondingto the smallest eigenvalue of the Hermitian operator H = H (cid:48) + C (8)which we refer to as the Clock Hamiltonian. This Hamil-tonian has a unique ground state with eigenvalue 0 givenby the history state, | Φ (cid:105) = 1 √ T (cid:88) t | Ψ t (cid:105) ⊗ | t (cid:105) (9)which encodes the entire evolution of the quantum sys-tem. Thus, the quantum dynamics of the physical systemcan be obtained by finding the ground state eigenvectorof H . III. FCIQMC FOR THE CLOCK HAMILTONIAN
The Full Configuration Interaction Quantum MonteCarlo (FCIQMC) method was introduced by Booth et.al as a method to accurately find the ground state forelectronic structure problems in a basis of Slater determi-nants without appealing to the fixed node approximationto eliminate the fermion sign problem [12]. We review thebasics of the theory behind this method and show how itcan be adapted for the Clock Hamiltonian H , such thatit simulates the full time evolution of a quantum system.Let | Φ i (cid:105) and λ i be the eigenvectors and correspondingeigenvalues of H . Any vector | Ψ (cid:105) in the system-timeHilbert space acted upon by H can be decomposed interms of the eigenvectors of H such that | Ψ (cid:105) = (cid:88) i c i | Φ i (cid:105) (10)It follows that for any | Ψ (cid:105) not orthogonal to the groundstate of the Clock Hamiltonian, | Φ (cid:105) , thatlim τ →∞ e − τ H | Ψ (cid:105) = lim τ →∞ (cid:88) i e − τλ i c i | Φ i (cid:105) ∝ | Φ (cid:105) (11)Because H trivially commutes with itself, we may breakthis operator into the successive application of many op-erators, such that for a large number of slices N of a finite τ , e − τ H = (cid:0) e − τN H (cid:1) N ≈ (1 − δτ H ) N (12)where δτ = τ /N . Note that the linearized time propa-gator used here is both simple to implement for discretesystems as well as unbiased in the final ( τ → ∞ ) resultgiven some restrictions on δτ [32]. Thus with a prescrip-tion to stochastically apply the operator G = (1 − δτ H ) (13)repeatedly to any vector in the system-time Hilbertspace, we can simulate the quantum dynamics ofthe physical system. τ is sometimes interpreted asimaginary-time by analogy to the Wick-rotated time-dependent Schr¨odinger equation, however we will onlyrefer to τ as “simulation time” here, to avoid confusionwith the simultaneous presence of real and imaginary-time.To represent a vector in the system-time Hilbert space,we introduce discrete walkers represented by { i, t } withassociated real and imaginary integer weights R ( { i, t } )and I ( { i, t } ). These walkers correspond to a singlesystem-time configuration. The indices correspond toa system state | i (cid:105) at time t with a complex integerweight defined by its real and imaginary components, W ( { k, t } ) = R ( { k, t } ) + i I ( { k, t } ). Given a collection setof these walkers, the corresponding normalized vector isgiven by | Ψ (cid:105) = 1 K (cid:88) { i,t } W ( { i, t } ) | i (cid:105) ⊗ | t (cid:105) (14)where K is the normalization constant given by the sumof the absolute value of all the complex integer walkerweights K = (cid:88) { i,t } | W ( { i, t } ) | . (15)We also use this notation to define matrix elements foran operator O between a state | i (cid:105) | t (cid:105) and | j (cid:105) | t (cid:48) (cid:105) as O { j,t (cid:48) } , { i,t } = (cid:104) j | (cid:104) t (cid:48) | O | i (cid:105) | t (cid:105) (16)To stochastically apply the operator G to a vector rep-resented by a set of such walkers, the following four stepsare used, adapted from the original implementation byBooth et. al:1. Spawning: This step addresses the off-diagonal el-ements of G . For each walker { i, t } , we consider N r = R ( { i, t } ) real parents and N i = I ( { i, t } )imaginary parents, both with the correct associatedsign. For each of the real parents N i , we select acoupled state at an adjacent time and attempt to spawn a real child and imaginary child { j, t (cid:48) } withprobabilities p R s ( { j, t (cid:48) }|{ i, t } ) = δτ (cid:12)(cid:12) R ( H { j,t (cid:48) } , { i,t } ) (cid:12)(cid:12) p g t ( t (cid:48) , t ) p g s ( { j, t (cid:48) }|{ i, t } ) (17) p I s ( { j, t (cid:48) }|{ i, t } ) = δτ (cid:12)(cid:12) I ( H { j,t (cid:48) } , { i,t } ) (cid:12)(cid:12) p g t ( t (cid:48) , t ) p g s ( { j, t (cid:48) }|{ i, t } ) (18)with corresponding signs S R = − sign( R ( H { j,t (cid:48) } , { i,t } )) (19) S I = − sign( I ( H { j,t (cid:48) } , { i,t } )) (20)and for each of the imaginary parents N i we selecta coupled state at an adjacent time and attempt tospawn a real child and imaginary child { j, t (cid:48) } withprobabilities p R s ( { j, t (cid:48) }|{ i, t } ) = δτ (cid:12)(cid:12) I ( H { j,t (cid:48) } , { i,t } ) (cid:12)(cid:12) p g t ( t (cid:48) , t ) p g s ( { j, t (cid:48) }|{ i, t } ) (21) p I s ( { j, t (cid:48) }|{ i, t } ) = δτ (cid:12)(cid:12) R ( H { j,t (cid:48) } , { i,t } ) (cid:12)(cid:12) p g t ( t (cid:48) , t ) p g s ( { j, t (cid:48) }|{ i, t } ) (22)with corresponding signs S R = sign( I ( H { j,t (cid:48) } , { i,t } )) (23) S I = − sign( R ( H { j,t (cid:48) } , { i,t } )) (24)where probabilities p s > (cid:98) p s (cid:99) walkers and spawningan additional walker with probability p s − (cid:98) p s (cid:99) . δτ is the simulation time step and may be used tocontrol the rate of walker spawning. The functions p g t ( t (cid:48) , t ) and p g s ( { j, t (cid:48) }|{ i, t } ) are the probabilityof suggesting a walker at the new time t (cid:48) and ofthe particular state j respectively. For the Clock,an efficient choice of the time generation function, p g t ( t (cid:48) , t ) is t (cid:48) = t ± p g s ( { j, t (cid:48) }|{ i, t } )should be chosen based on knowledge of the struc-ture of U t such that connected states may reacheach other. In this work we use a uniform distri-bution where states connected by U t are selectedrandomly with equal probability, however this canbe refined using knowledge of U t .In this case, where { j, t (cid:48) } (cid:54) = { i, t } , the matrix el-ements H { j,t (cid:48) } , { i,t } may be written more explicitlyas H { j,t (cid:48) } , { i,t } = − (cid:104) j | U t | i (cid:105) t (cid:48) = t + 1 − (cid:104) j | U † t | i (cid:105) t (cid:48) = t −
10 otherwise (25)2. Diagonal Death/Cloning: This step addresses theapplication of the diagonal of G . In this step, foreach parent walker { i, t } (not yet including childwalkers spawned in the last step), calculate p d ( { i, t } ) = δτ ( H { i,t } , { i,t } − S ) (26)where S is a shift that is used to control the pop-ulation in the simulation. Now for each real andimaginary parent N r and N i associated with { i, t } ,if p d >
0, the parent is destroyed. If p d <
0, theparent is cloned with a probability | p d | , handlinginstances of greater than unit probabilities as inthe previous step.In the case of the Clock, the diagonal matrix ele-ments take on a simple form H { i,t } , { i,t } = / − |(cid:104) i | Ψ (cid:105)| ) t = 01 / t = T −
11 otherwise (27)3. Annihilation: In this step, all previously existingand newly spawned walkers are searched, and anywhich match are combined such that both their realand imaginary components are summed. In theevent that any walker ends up with 0 total weight,it is removed entirely from the simulation. In thecase of the Clock, it is advantageous to store walk-ers grouped by time, such that in parallel imple-mentations the simulation can be easily split alongthis dimension. This will be elaborated upon later.Within each group it is advisable to use any nat-ural ordering present on the basis states to enablebinary search that can locate identical walkers in atime that is logarithmic in the number of walkers ata given time. Alternatively one can use hash tablesto facilitate annihilation [33].A single iteration of the above algorithm is cartoonedin Fig. 1. By using this procedure, the operator G is iter-atively applied until the state of walkers is equilibrated atsome simulation time τ > τ eq , with a number of walkers N w . The average of some observable O may be estimatedat simulation time τ according to (cid:104) O (cid:105) ( τ ) = (cid:104) Φ( τ ) | O | Φ( τ ) (cid:105)(cid:104) Φ( τ ) | Φ( τ ) (cid:105) (28)By averaging over the simulation time τ and correctingfor the autocorrelation time of the quantity (cid:104) O (cid:105) usingstandard statistical procedures, the average may be con-verged to the desired precision. In general, however, thesimulation time averaged quantity (cid:104) O (cid:105) τ may be biaseddue to the sign problem [13–15]. This bias may be re-moved to an arbitrary degree by increasing the number ofwalkers N w such that the state remains sign-coherent be-tween steps. The number of walkers required to removethe bias to a given precision depends both on the sever-ity of the sign problem and the amount of Hilbert space SpawningDiagonal Death Annihilation
FIG. 1. A schematic representation of a single iteration of theFCIQMC algorithm for the Clock. The larger squares repre-sent real-time, and sub-squares represent the possible quan-tum states occupied by either positive (blue) or negative (red)walkers. In each iteration, the set of parents spawns potentialchildren to adjacent times, with parentage being indicated bydotted lines. Simultaneously the set of parent walkers areconsidered for diagonal death. Finally, the remaining set ofparents and spawned children are combined, allowing walkerswith opposing signs at the same state and time to annihilate. the physical problem occupies [13–15]. To this end, wedefine a problem-dependent number n c such that when N w > n c , the time averaged quantity (cid:104) O (cid:105) τ is accurate tothe desired precision. Because this is an NP-Completeproblem, one must expect that in general, n c is on theorder of the dimension of the Hilbert space, D , that is,it grows exponentially with the size of the system andlinearly with real-time. However, for many systems ofinterest in ground state problems it has been found that n c << D [15–17], and one might expect the same to betrue for some dynamical problems. We now turn our at-tention to the scaling and properties of n c for dynamicalsystems. IV. MANIFESTATION OF THE SIGNPROBLEM
The conditions for the efficient simulation of a Hamil-tonian on a classical computer have been studied in thecontext of quantum complexity theory. It is known thatif a Hamiltonian is frustration free and has real, non-positive off-diagonal elements in a standard basis (sto-quastic) that it may be probabilistically simulated to aset precision in a time that is polynomial in the size ofthe system [4, 5].For practical purposes, there are limitations on thesystem operators one may simulate. In particular, thesystem operators must be the sum of a polynomial num-ber of terms. This simply originates from the need tobe able to efficiently evaluate matrix elements of a givenstate. The interaction of at most k particles, or k − localinteractions, in the physical system is a sufficient butnot necessary condition for this to be true. The Clockconstruction has also been recently generalized to openquantum systems [34], where even in this case a 2 − localconstruction is generally possible with the use of gadgets.Alternatively, if the Clock is constructed from a sequenceof unitary gates that act on at most k qubits in quan-tum computation, then the Clock will also satisfy thisrequirement.The presence of frustration in interacting systems cancause the autocorrelation time of measured observablesto diverge exponentially, rendering their efficient simula-tion intractable even in cases where the Hamiltonian isbosonic or sign problem free [3, 35]. It has been provengenerally that the Clock Hamiltonian is frustration free,with a unique ground state separated from the first ex-cited state with a gap of O (1 /T ) where T is the numberof discrete time steps being considered at once.If an operator is stoquastic (or sign problem free), thenthe off-diagonal elements that correspond to transitionsin a Monte Carlo simulation all be non-positive. The op-erator G will contain only positive transition probabilitiesin this case and have a ground state corresponding to aclassical probability distribution by the Perron-Frobeniustheorem [4, 5]. In the context of the FCIQMC methodintroduced, this means that walkers will never changesigns throughout the simulation, and all averages will besign-coherent and unbiased independent of the number ofwalkers N w . In the Clock Hamiltonian, the off-diagonalelements correspond to the set of unitary operators withtheir adjoints { U t , U † t } , and the penalty term C . For thestandard computational initial state ( | (cid:105) ⊗ N ), the penaltyterm C has a fixed sign, and thus the Clock Hamilto-nian is stoquastic if { U t , U † t } represented in the standardbasis has all real positive entries, yielding non-positiveoff-diagonal entries in the Clock.Given the ubiquity of k − local interactions in physicalproblems and the rigorous proof that the Clock Hamilto-nian is frustration free, we will take these two conditionsas given and consider more carefully the stoquastic con-dition. Consider a simple example of a unitary evolutionthat may be simulated on a classical computer efficiently,namely reversible classical computational. All reversibleclassical circuits may be expressed in terms of Toffoli gatesequences, which is unitary and acts to switch a target bitconditional on the state of two control bits. In the stan-dard computational basis it has a representation given by T of = (29)The Clock when constructed with unitary Toffoli gates isstoquastic and n c ≈
1. Although a stoquastic Hamilto-nian is sufficient for this to be the case, it is not a neces-sary condition. To see this, consider a slightly differentset of unitary operators, namely the standard Pauli groupgates, X i , Y i , and Z i in combination with the CNOTgate. These gates have the following unitary representa-tions in the standard computational basis X = (cid:18) (cid:19) (30) Y = (cid:18) − ii (cid:19) (31) Z = (cid:18) − (cid:19) (32) CN OT = (33)Considering for now only the computational basis we sim-ulate in (a restriction we later lift), it is clear that giventhe complex entries and varying signs of the off-diagonals,that a Clock Hamiltonian built from this gate set will notbe stoquastic if even a single Y or Z gate is used. How-ever these gates also have the property that they mapsingle configurations to single configurations, and as asa result no interference occurs, yielding all sign coherentaverages and n c ≈
1. We call this type of transforma-tion, which is configuration number preserving, “quasi-classical”, in contrast to classical which we define as con-figuration number preserving as well as phase preserving.Thus a stoquastic Clock Hamiltonian is a sufficient, butnot necessary condition for the simulation procedure tobe sign-problem free.Consider a slightly more general local rotation R pa-rameterized by an angle θR ( θ ) = (cid:18) cos θ sin θ − sin θ cos θ (cid:19) (34)In this case, the value of n c as a function of system size ismore complex. These represent the real-time evolutionsof local spin Hamiltonians for systems of spin- particles.In Fig. 2 we consider a single rotation R ( θ ) with θ = 5 π/
32 applied uniformly to 11 qubits initialized to | (cid:105) ⊗ N . As the Clock Hamiltonian in this simulation is Mean number of walkers × − . . . . . . . . h S z i FIG. 2. Computed expectation value for S z for a single qubitat the final time in the simulation as a function of the averagenumber of walkers kept in the simulation. There are 11 totalqubits in the simulation. It is apparent the system exhibitsa sharp transition between a totally sign incoherent samplingwhere all averages become zero, and a sign coherent regionwhere the averages begin to converge to the appropriate value. neither stoquastic nor quasi-classical, one observes a sign-incoherent region for a small number of walkers, whereall averages tend towards 0, until some critical thresh-old N w > n c is reached where a transition occurs tosign-coherent sampling, and the average converges to thetrue value. We note that some implementations of theFCIQMC algorithm have used the diagonal shift S as aproxy for convergence [12], but we did not observe a sim-ilar plateau trend here. The history state being sampledin this case is given by | Ψ (cid:105) = (cid:88) t √ T ( R ( θ ) | (cid:105) ) ⊗ t | (cid:105) ⊗ T − t (35)The formal structure of this evolution is quite similarfor all values of θ , however the states that result canexhibit quite different features with respect to the signproblem in sampling. In Fig. 3 we examine the samecircuit, but include many different rotation angles. Onesees that not only does the value n c change as a functionof rotation angle, but the rate of the transition is quitedifferent as well, favoring sharper, earlier transitions forrotations that are closer to quasi-classical ( θ = 0). V. MITIGATING THE SIGN PROBLEM
In the last section we considered the impact of signproblem as it related to local rotations (or the dynamicsof distinguishable non-interacting particles). The appar-ent challenges in this domain are unsettling given thattrivial solutions are known for this problem. Here we
Mean number of walkers × − . . . . . . . h S z i π/ π/ π/ FIG. 3. Computed expectation value for S z for a single qubitat the final time in the simulation as a function of the averagenumber of walkers kept in the simulation and the rotationangle used in the simulation. The rotation angle θ is indicatedby the line label. The simulation contains 11 total qubits forall rotation angles. One sees that the closer the rotation is toquasi-classical, the sharper and earlier the transition to signcoherent sampling. propose a simple scheme to mitigate the sign problem toan arbitrary extent using preliminary computation.It is known that the sign problem is generically basisdependent. To this end we propose an analogous ap-proach to the interaction picture in quantum dynamics,where the walkers at each point in time are expressed ina new basis determined by a generic time-dependent uni-tary rotation given by {B t } . The evolution operators arealso dressed by this change such that in the new basis,the Clock is constructed from the rotated operators givenby U (cid:48) t = B † t +1 U t B t (36)Moreover, the computation of any Hermitian observ-able O must also take into consideration the new basis,such that (cid:104) O (cid:105) ( τ ) = (cid:104) Φ( τ ) | B † t O B t | Φ( τ ) (cid:105)(cid:104) Φ( τ ) | Φ( τ ) (cid:105) (37)If one finds a set of {B t } that renders the Clock Hamil-tonian stoquastic or quasi-classical, the resulting Hamil-tonian may be sampled readily. One expects that in gen-eral, finding this basis must be at least as difficult assolving exactly the problem of the quantum evolution.In fact, it is easy to see that one may choose the exactevolution as the set of basis rotations, and that this ren-ders the Clock Hamiltonian stoquastic and trivial, suchthat the evolution is dictated by the identity at all times.Of course the price one must pay for this is that the com-putation of observables requires the full evolution to beknown. | (cid:105) R ( θ ) •| (cid:105) R ( θ ) •| (cid:105) R ( θ ) •| (cid:105) R ( θ ) · · · FIG. 4. The quantum circuit diagram for the circuit usedto test the efficacy of time-dependent local rotations in ame-liorating the sign problem. The angle used in this case is θ = 0 .
49. We compare the results from this circuit as a func-tion of the number of controls that are removed from the NOTgates (crossed circle here), and whether time dependent localbasis rotations are utilized. The controls are removed fromthe end of the circuit first.
However, as was seen above, it is not necessary forthe Clock to be rendered completely trivial. Even ap-proaching a quasi-classical Hamiltonian in an approxi-mate sense can greatly reduce the sampling costs. Forsome instances, one may find an approximate set of ro-tations that make the Clock nearly stoquastic or quasi-classical, and the remainder of the sign problem can behandled by maintaining a reasonable number of walk-ers N w in the simulation. As an example of this pro-cedure, we consider the simple case where {B t } are de-termined entirely by the local rotations in a quantumcircuit. Specifically, for local rotations, B t = (cid:81) t (cid:48) VI. PARALLEL-IN-TIME SCALING Monte Carlo methods are often championed as the ul-timate parallel algorithms, associated with the phrase“embarrassingly parallel”. Given the evolution of mod-ern computational architectures towards many-core ar-chitectures with slower Clock speeds, Monte Carlo willcontinue to play a growing role in the numerical simula-tion of physics at the boundaries of our computationalcapabilities. Interacting walker Monte Carlo methods,can be more difficult to parallelize effectively due to theannihilation step where communication of walkers is un-avoidable.In contrast to the most general interacting walker al-gorithm, which may require heavy communication be-tween all processes, the FCIQMC method applied to theClock Hamiltonian may take advantage of time-localityto create an efficient parallel-in-time algorithm using thestandard method of domain decomposition in time. Us-ing this construction, only processes containing adjacenttimes need to communicate their child walkers, whichmay be done simultaneously in a time that is constantfor the number of processes involved. This remains trueso long as the number of time steps under considerationis larger than the number of processes in use, which istypically the case. In the case that the number of pro-cesses is much greater than the number of timesteps, thisscheme may still be used by blocking multiple processorsto each time, and utilizing an all-to-all communicationpattern within each block only. The difference betweenthese two communication patterns is highlighted in Fig.6. (a)(b) FIG. 6. A schematic representation of the communicationpatterns the annihilation step of interacting walker MonteCarlo schemes, where the boxes represent different MPI pro-cesses and the ellipsis represents the rest of the processes. Inthe case of the Clock (a), a time domain decomposition al-lows one to restrict communication to only nearest neighbourprocesses, facilitating simple, constant time communicationamenable to the architecture of modern parallel computers.In the more general case (b), a clear partitioning may not bereadily achievable, and all processes may need to communi-cate with all other processes, creating a bottleneck. To demonstrate the scaling properties of this approach,we consider the scaling as a function of the number ofprocessors for fixed total problem size, or strong scaling,with our implementation. This benchmarking is doneon a standard Linux cluster composed of AMD Opteron6376 processors. The parallel speedup with respect tosingle core time as a function of the number of processorsis given in Fig 7. Here, we see that we are able to combinethe parallelism of Monte Carlo with the locality of time-decomposition to achieve practical parallel efficiencies ofover 95% with 2 processors and 70% with 8 using a simpleMPI implementation on a commodity cluster. VII. CONCLUSIONS In this work we reviewed the mapping between uni-tary dynamics and ground state eigenvalue problems. Wethen showed how the FCIQMC method, a technique orig-inally designed to ameliorate the fermionic sign problemfor ground state electronic systems, could be applied toquantum dynamics problems as a direct result of thismapping. This establishes a potential research direction Number of cores P a r a ll e l s p ee d u p Speedup . . . . . . P a r a ll e l E f fi c i e n c y Efficiency FIG. 7. A scaling study of our method implementation with afixed total problem size (strong scaling), showing parallel effi-ciencies and speedups. The simulation consisted of 11 qubitswith 128 time points generated by consecutive local rotationswith θ ≈ . walkers in each simulation-time step and the wall clock timewas measured to the point of an equivalent number of statis-tical samples. for explicit connections between the fermionic and dy-namical sign problems that plague quantum Monte Carlosimulations, and provides a pathway for the transfer oftools between the two domains.The numerical consequences of the dynamical signproblem in this context were studied using a few basicquantum circuits. It was found that even local rota-tions can exhibit a severe sign problem depending onthe form of the rotation and how different it is from aquasi-classical operation. We then introduced a generalmethod analogous to the interaction picture in dynamicsor natural orbitals in the study of eigenstates that usesbasis rotations to mitigate the difficulty of the problem.The costs and benefits of different types of rotations re-quire further research, however we showed that even localrotations can have a significant benefit for non-trivial cir-cuits. Finally, we discussed the structure of the problemin the context of parallel-in-time dynamics, and showedhigh parallel efficiencies with only a basic MPI implemen-tation on a commodity cluster.Overall, we believe this is a promising new methodfor the simulation of quantum dynamics. It clarifies thebridge between dynamics and ground state problems andis capable of effectively utilizing parallel computing archi-tectures. While we have only demonstrated it for quan-tum circuits, we believe it will be generally useful for thestudy of quantum dynamics. VIII. ACKNOWLEDGEMENTS We acknowledge Ryan Babbush and ThomasMarkovich for their helpful comments on the manuscript.J.M. is supported by the Department of Energy Com- putational Science Graduate Fellowship under grantnumber DE-FG02-97ER25308. A.A-G. acknowledgesthe support of the Air Force Office of Scientific Researchunder Award No. FA9550-12-1-0046 and the NationalScience Foundation under Award No. CHE-1152291. [1] D. Tannor, Introduction to Quantum Mechanics: ATime-Dependent Perspective (University Science Books,2007).[2] K. Binder and D. W. Heermann, Monte Carlo simulationin statistical physics: an introduction (Springer, 2010).[3] M. Troyer and U. J. Wiese, Phys. Rev. Lett. , 170201(2005).[4] S. Bravyi, D. P. Divincenzo, R. Oliveira, and B. M.Terhal, Quantum Info. Comput. , 361 (2008).[5] S. Bravyi and B. Terhal, SIAM J. Comput. , 1462(2009).[6] B. Hammond, W. Lester, and P. Reynolds, Monte CarloMethods in Ab Initio Quantum Chemistry , World Sci-entific Lecture and Course Notes in Chemistry ; Vol. 1(World Scientific, 1994).[7] M. Nightingale and C. Umrigar, Quantum Monte CarloMethods in Physics and Chemistry , Nato ASI series(Kluwer Academic Publishers, 1999).[8] S. Baroni and S. Moroni, Phys. Rev. Lett. , 4745(1999).[9] W. M. C. Foulkes, L. Mitas, R. J. Needs, and G. Ra-jagopal, Rev. Mod. Phys. , 33 (2001).[10] S. Zhang and H. Krakauer, Phys. Rev. Lett. , 136401(2003).[11] S. Zhang and M. H. Kalos, Phys. Rev. Lett. , 3074(1991).[12] G. H. Booth, A. J. W. Thom, and A. Alavi, J. Chem.Phys. , 054106 (2009).[13] J. S. Spencer, N. S. Blunt, and W. M. Foulkes, J. Chem.Phys. , 054110 (2012).[14] M. H. Kolodrubetz, J. S. Spencer, B. K. Clark, andW. M. Foulkes, J. Chem. Phys. , 024110 (2013).[15] J. J. Shepherd, G. E. Scuseria, and J. S. Spencer, ArXive-prints (2014), arXiv:1407.4800 [physics.comp-ph].[16] G. H. Booth and A. Alavi, J. Chem. Phys. , 174104(2010).[17] G. H. Booth, A. Gruneis, G. Kresse, and A. Alavi, Na-ture , 365 (2013). [18] C. H. Mak, Phys. Rev. Lett. , 899 (1992).[19] R. Feynman and A. Hibbs, Quantum Mechanics and PathIntegrals: Emended Edition (Dover Publications, Incor-porated, 2012).[20] D. Thirumalai and B. J. Berne, J. Chem. Phys. , 5029(1983).[21] N. Makri and W. H. Miller, Chem. Phys. Lett. , 10(1987).[22] N. Makri and W. H. Miller, J. Chem. Phys. , 2170(1988).[23] C. H. Mak and D. Chandler, Phys. Rev. A , 5709(1990).[24] N. Makri, Comp. Phys. Comm. , 389 (1991).[25] N. Makri, J. Phys. Chem , 2417 (1993).[26] V. Jadhao and N. Makri, J. Chem. Phys. , 161102(2008).[27] A. M. Childs, D. Gosset, and Z. Webb, Science , 791(2013).[28] S. Bravyi, ArXiv e-prints (2014), arXiv:1402.2295[quant-ph].[29] R. Feynman, Int. J. Theor. Phys. , 467 (1982).[30] A. Kitaev, A. Shen, M. Vyalyi, and N. Vyalyi, Classicaland Quantum Computation , Graduate Studies in Math-ematics (American Mathematical Society, 2002).[31] J. R. McClean, J. A. Parkhill, and A. Aspuru-Guzik,Proc. Natl. Acad. Sci. U.S.A. , E3901 (2013).[32] N. Trivedi and D. M. Ceperley, Phys. Rev. B , 4552(1990).[33] G. H. Booth, S. D. Smart, and A. Alavi, ArXiv e-prints(2013), arXiv:1305.6981 [physics.comp-ph].[34] D. G. Tempel and A. Aspuru-Guzik, ArXiv e-prints(2014), arXiv:1406.5631 [quant-ph].[35] T. C. Wei, M. Mosca, and A. Nayak, Phys. Rev. Lett.104