# Long-time simulations with high fidelity on quantum hardware

Joe Gibbs, Kaitlin Gili, Zoë Holmes, Benjamin Commeau, Andrew Arrasmith, Lukasz Cincio, Patrick J. Coles, Andrew Sornborger

LLong-time simulations with high ﬁdelity on quantum hardware

Joe Gibbs, Kaitlin Gili,

1, 2

Zoë Holmes, Benjamin Commeau,

3, 4

AndrewArrasmith, Lukasz Cincio, Patrick J. Coles, and Andrew Sornborger Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA. Department of Physics, University of Oxford, Clarendon Laboratory, Oxford, U.K. Information Sciences, Los Alamos National Laboratory, Los Alamos, NM, USA. Department of Physics, University of Connecticut, Storrs, Connecticut, CT, USA. (Dated: February 9, 2021)Moderate-size quantum computers are now publicly accessible over the cloud, opening the excit-ing possibility of performing dynamical simulations of quantum systems. However, while rapidlyimproving, these devices have short coherence times, limiting the depth of algorithms that maybe successfully implemented. Here we demonstrate that, despite these limitations, it is possibleto implement long-time, high ﬁdelity simulations on current hardware. Speciﬁcally, we simulatean XY-model spin chain on the Rigetti and IBM quantum computers, maintaining a ﬁdelity of atleast 0.9 for over 600 time steps. This is a factor of 150 longer than is possible using the iteratedTrotter method. Our simulations are performed using a new algorithm that we call the ﬁxed stateVariational Fast Forwarding (fsVFF) algorithm. This algorithm decreases the circuit depth andwidth required for a quantum simulation by ﬁnding an approximate diagonalization of a short timeevolution unitary. Crucially, fsVFF only requires ﬁnding a diagonalization on the subspace spannedby the initial state, rather than on the total Hilbert space as with previous methods, substantiallyreducing the required resources.

I. INTRODUCTION

The simulation of physical systems is both valuable forbasic science and has technological applications across adiverse range of industries from materials design to phar-maceutical development. Relative to classical computers,quantum computers have the potential to provide an ex-ponentially more eﬃcient means of simulating quantummechanical systems. Quantum hardware has progressedsubstantially in recent years [1, 2]. However, despite con-tinual progress, we remain in the ‘noisy intermediate-scale quantum’ (NISQ) era in which the available hard-ware is limited to relatively small numbers of qubitsand prone to errors. Simulation algorithms designed forfault-tolerant quantum computers, such as Trotterizationmethods [3, 4], qubitization methods [5], and Taylor se-ries methods [6], require deeper circuits than viable giventhe short coherence times of current hardware. Thus al-ternative approaches are needed to successfully imple-ment useful simulations on NISQ hardware.Variational quantum algorithms [7–22], where a clas-sical computer optimizes a cost function measured ona quantum computer, show promise for NISQ quantumsimulations. An early approach introduced an iterativemethod, where the state is variationally learned on astep-by-step basis using action principles [18, 19, 23].Subsequently, a generalization of the variational quan-tum eigensolver [10] was developed for simulations inlow lying energy subspaces [20]. Very recently, quantum-assisted methods have been proposed that perform allnecessary quantum measurements at the start of the al-gorithm instead of employing a classical-quantum feed-back loop [24–26].In this work, we improve upon a recently proposedvariational quantum algorithm known as Variational Fast Forwarding (VFF) [21]. VFF allows long time simula-tions to be performed using a ﬁxed depth circuit, thusenabling a quantum simulation to be ‘fast forwarded’ be-yond the coherence time of noisy hardware. The VFFalgorithm requires ﬁnding a full diagonalization of theshort time evolution operator U of the system of inter-est. Once found, the diagonalization enables any initialstate of that system to be fast forwarded. However, forpractical purposes, one is often interested in studying theevolution of a particular ﬁxed initial state of interest. Inthat case a full diagonalization of U is overkill. Instead, itsuﬃces to ﬁnd a diagonal compilation of U that capturesits action on the given initial state. Here, we show thatfocusing on this commonly encountered but less exactingtask can substantially reduce the resources required forthe simulation.Speciﬁcally, we introduce the ﬁxed state VFF algo-rithm (fsVFF) for fast forwarding a ﬁxed initial statebeyond the coherence time of a quantum computer. Thisapproach is tailored to making dynamical simulationmore suitable for NISQ hardware in two key ways. First,the cost function requires half as many qubits as VFF.This not only allows larger scale simulations to be per-formed on current resource-limited hardware, but alsohas the potential to enable higher ﬁdelity simulationssince larger devices tend to be noisier. Second, fsVFFcan utilize simpler ansätze than VFF both in terms of thedepth of the ansatz and the number of parameters thatneed to be learnt. Thus, fsVFF can reduce the width,depth, and total number of circuits required to fast for-ward quantum simulations, hence increasing the viabilityof performing simulations on near-term hardware.We demonstrate these advantages by implementinglong-time high ﬁdelity quantum simulations of the 2-qubit XY spin chain on Rigetti’s and IBM’s quantum a r X i v : . [ qu a n t - ph ] F e b computers. Speciﬁcally, while the iterated Trotter ap-proach has a ﬁdelity of less than 0.9 after 4 time stepsand has completely thermalized by 25 time steps, withfsVFF we achieve a simulation ﬁdelity greater than 0.9for over 600 time steps.In our analytical results, we prove the faithfulnessof the fsVFF cost function by utilizing the newly de-veloped No-Free-Lunch theorems for quantum machinelearning [27, 28]. We also provide a proof of the noiseresilience of the fsVFF cost function, speciﬁcally the op-timal parameter resilience [29]. Finally, we perform ananalysis of simulation errors under fast-forwarding.The diagonalizations obtained using fsVFF may fur-ther be useful for determining the eigenstates and eigen-values of the Hamiltonian on the subspace spanned bythe initial state. This can be done using a time seriesanalysis, by using fsVFF to reduce the depth of the quan-tum phase estimation (QPE) algorithm, or using a simplesampling method. We demonstrate on IBM’s quantumcomputer that, while standard QPE fails on real hard-ware, fsVFF can be used to obtain accurate estimates ofthe spectrum. II. BACKGROUND

Before presenting our fsVFF algorithm, let us ﬁrst re-view the original VFF algorithm from Ref. [21]. Considera Hamiltonian H on a d = 2 n dimensional Hilbert space(i.e., on n qubits) evolved for a short time ∆ t with thesimulation unitary e − iH ∆ t , and let T (larger than ∆ t )denote the desired simulation time. Then the VFF algo-rithm consists of the following steps:1. Approximate e − iH ∆ t with a single-timestep Trot-terized unitary denoted U = U (∆ t ) .2. Variationally search for an approximate diagonal-ization of U by compiling it to a unitary with astructure of the form V ( α , ∆ t ) := W ( θ ) D ( γ , ∆ t ) W ( θ ) † , (1)where α = ( θ , γ ) is a vector of parameters. Here, D ( γ , ∆ t ) is a parameterized unitary that will (aftertraining) encode the eigenvalues of U (∆ t ) , while W ( θ ) is a parameterized unitary matrix that willconsist of the corresponding eigenvectors [21]. Thecompilation is performed using the local Hilbert-Schmidt test [13] to ﬁnd the parameters θ opt and γ opt that minimize the local Hilbert-Schmidt cost.3. Use the compiled form to simulate for time T = N ∆ t using the circuit W ( θ opt ) D ( γ opt , N ∆ t ) W ( θ opt ) † . (2)VFF has proven eﬀective for providing a ﬁxed quantumcircuit structure with which to fast-forward beyond the coherence time of current noisy quantum devices. How-ever, the algorithm requires a full diagonalization of U over the entire Hilbert space. The local Hilbert-Schmidttest used to ﬁnd this diagonalization requires n qubits.Additionally, the ansatz must be suﬃciently expressibleto diagonalize the full unitary U to a high degree of ap-proximation [30–32]. This typically requires a large num-ber of parameters and a reasonably deep circuit. Theseoverheads limit VFF’s utility on current hardware.In what follows, we introduce a more NISQ-friendlyreﬁnement to VFF that reduces these overheads whenone is interested in fast-forwarding a ﬁxed initial state | ψ (cid:105) , rather than any possible initial state. The ﬁxedstate VFF algorithm (fsVFF) is summarised in Fig. 1.We note that VFF, like the standard iterated Trot-ter approach to quantum simulation, necessarily incurs aTrotter error by approximating e − iH ∆ t with U = U (∆ t ) .This Trotter error may be removed using the VariationalHamiltonian Diagonalization algorithm (VHD), which di-rectly diagonalizes the Hamiltonian H [22]. However,VHD is yet more resource intensive than VFF on currenthardware, so we focus here on reﬁning VFF. III. FIXED STATE VARIATIONAL FASTFORWARDING ALGORITHMA. Cost function

In fsVFF, instead of searching for a full diagonalizationof U over the entire Hilbert space, we search for a diag-onal compilation of U that captures the action of U onthe initial state | ψ (cid:105) and its future evolution, e − iHt | ψ (cid:105) .Here, we introduce a cost function tailored to this task.To make precise what is required of the cost for fsVFF,let us ﬁrst note that as the state | ψ (cid:105) evolves, it remainswithin its initial energy subspace. This can be seen byexpanding the initial state in terms of the energy eigen-basis {| E k (cid:105)} n k =1 (the eigenbasis of H ) as | ψ (cid:105) = n eig (cid:88) k =1 a k | E k (cid:105) , (3)where a k = (cid:104) E k | ψ (cid:105) , and noting that e − iHt | ψ (cid:105) = n eig (cid:88) k =1 a k e − iE k t | E k (cid:105) . (4)Thus it follows that if | ψ (cid:105) spans n eig energy eigenstatesof H , so does e − iHt | ψ (cid:105) for all future times. Thereforeto ﬁnd a compilation of U that captures its action on e − iHt | ψ (cid:105) (for all times t ) it suﬃces to ﬁnd a compi-lation of U on the n eig dimensional subspace spannedby {| E k (cid:105)} n eig k =1 . We stress that the eigenstates {| E k (cid:105)} n k =1 need not be ordered, and therefore the subspace spannedby the subset {| E k (cid:105)} n eig k =1 is not necessarily low lying inenergy. Input:Unitary Diagonalization with Maximizing Fidelity up to :

Gradient Descent Optimization Loop

Evaluate Gradients:Evaluate Cost:Output: Fast Forwarded Simulation

Approximately simulate H for time using Update Parameters: Calculate : Construct :Calculate :

If construct

G(k+1) If Hadamard Test (a) Trotter Expansion:(b) (c)(d) (e) H, | i H, | i U, | i U, | i , n eig N t C fsVFF := 1 n eig n eig X k =1 |h | ( V † ) k U k | i| ⇢ ✓ i ⌘ @C fsVFF @✓ i , i ⌘ @C fsVFF @ i ⇢ @C fsVFF @✓ i , @C fsVFF @ i W ( ✓✓✓ opt ) D ( opt , N t ) W ( ✓✓✓ opt ) † k = n eig k = n eig k = n eig ✓✓✓ opt , opt G ( k ) = [ h l | l i ] l,l = kl,l =0 Det( G ( k )) = 0Det( G ( k )) = 0 n eig = k n eig n eig n eig G ( k ) G ( k ) G ( k )Det( G ( k ))Det( G ( k ))Det( G ( k )) FIG. 1.

The fsVFF Algorithm. (a) An input Hamiltonian and an initial input state are necessary (b) to create a singletime-step Trotterized unitary, U (∆ t ) and (c) to calculate the number of eigenstates spanned by the initial state. The value of n eig is calculated by constructing a matrix of state overlaps U k | ψ (cid:105) and increasing the matrix dimension until the determinantis zero. (d) The unitary is then variationally diagonalized into the form, V ( α , ∆ t ) = W ( θ ) D ( γ , ∆ t ) W † ( θ ) . The cost function C fsVFF is minimized with a classical optimizer (e.g., gradient descent), where the parameters θ and γ are updated. (e) Theoptimal parameters θ opt and γ opt are then used to implement a fast-forwarded simulation with the diagonalized unitary form. A No-Free-Lunch Theorem for quantum machinelearning introduced in Ref. [27] proves that to perfectlylearn the action of a unitary on a d -dimensional spacerequires d training pairs. In the context of fsVFF, weare interested in learning the action of a unitary on an n eig -dimensional subspace. Since the unitary is block di-agonal, one can directly apply this NFL theorem to thesubspace of interest. Therefore n eig training pairs arerequired to learn the unitary’s action on this subspace.(Note, we assume here that the training states are notentangled with an additional register. It was shown inRef. [28] that using entangled training data can reducethe required number of training states. In fact, this morepowerful method is used by the VFF algorithm. How-ever, producing such entangled training data requires ad-ditional qubits and two-qubit gates and therefore is lessNISQ-friendly.)The No-Free-Lunch theorem therefore implies that n eig states are required to learn U on | ψ (cid:105) (assuming leakagedue to Trotter error is negligible). In general these statesmay be freely chosen from the subspace spanned by | ψ (cid:105) .Here a convenient choice in training states would be | ψ (cid:105) and its Trotter evolutions, that is the set { U k | ψ (cid:105)} n eig k =1 .Motivated by these observations, we deﬁne our cost func- tion for fsVFF as C fsVFF := 1 − n eig n eig (cid:88) k =1 | (cid:104) ψ | ( V † ) k U k | ψ (cid:105) | , (5)where similarly to VFF we use a diagonal ansatz V ( α , ∆ t ) := W ( θ ) D ( γ , ∆ t ) W ( θ ) † . This cost quantiﬁesthe overlap between the initial state evolved under U for k time steps, U k | ψ (cid:105) , and the initial state evolved underthe trained unitary for k time steps, W D k W † | ψ (cid:105) , aver-aged over n eig time steps. Assuming we have access tothe unitary that prepares the state | ψ (cid:105) , the state over-laps can be measured using n qubits, via a circuit thatperforms a Loschmidt echo [29]. Therefore C fsVFF canbe evaluated using only n qubits. This is half as many asstandard VFF, opening up the possibility of performinglarger simulations on current hardware.It is important to note that while the exact time-evolved state exp( − iHt ) | ψ (cid:105) is perfectly conﬁned to theinitial subspace, the approximate evolution induced by U (∆ t ) allows for leakage from the initial subspace [33].Thus the subspace spanned by { U k | ψ (cid:105)} n eig k =1 in generaldoes not perfectly overlap with {| E k (cid:105)} n eig k =1 . However, byreducing ∆ t and considering higher order Trotter approx-imations, this leakage can be made arbitrarily small. InAppendix A, we prove that in the limit that leakage fromthe initial subspace is negligible, C fsVFF is faithful. Thatis, we show that the cost vanishes, C fsVFF = 0 , if and onlyif the ﬁdelity of the fast-forwarded simulation is perfect, F τ = | (cid:104) ψ | W † D τ W U τ | ψ (cid:105) | = 1 , (6)for all times τ . Note, that the reverse direction is trivial.If F τ = 1 for all τ , then C fsVFF = 0 .Similar to the VFF cost, the fsVFF cost is noise re-silient in the sense that incoherent noise should not aﬀectthe global optimum of the function. This is proven for abroad class of incoherent noise models using the resultsof Ref. [29] in Appendix B.Nonetheless, it is only possible to measure C fsVFF if theunitary U n eig can be implemented comfortably within thecoherence time of the QC. Additionally, the number ofcircuits required to evaluate C fsVFF in general scales with n eig . Given these two restrictions, fsVFF is limited tosimulating quantum states spanning a non-exponentialnumber of eigenstates. Consequently, we advocate us-ing fsVFF to simulate states with n eig = poly ( n ) . Cru-cially these states need not be low lying and therefore ourapproach is more widely applicable than the SubspaceVariational Quantum Simulator (SVQS) algorithm [20],which simulates ﬁxed low energy input states.While C fsVFF was motivated as a natural choice of costfunction to learn the evolution induced by a target uni-tary on a ﬁxed initial state, it is a global cost [34] andhence it encounters what is known as a barren plateau forlarge simulation sizes [32, 34–44]. In Appendix C we sug-gest an alternative local version of the cost to mitigatesuch trainability issues. B. Calculating n eig To evaluate C fsVFF , it is ﬁrst necessary to determine n eig , the number of energy eigenstates of the systemHamiltonian H spanned by | ψ (cid:105) . In this section, wepresent an algorithm to calculate n eig .Our proposed algorithm utilizes the fact that the num-ber of energy eigenstates spanned by | ψ (cid:105) is equivalentto the number of linearly independent states in the set V ∞ where V k := {| ψ l (cid:105)} l = kl =0 with | ψ l (cid:105) := U (∆ t ) l | ψ (cid:105) .The subspace K k ( U, ψ ) spanned by V k is known as theKrylov subspace associated with the operator U and vec-tor | ψ (cid:105) [45]. Therefore n eig is equivalently the dimensionof the Krylov subspace K ∞ ( U, ψ ) .To determine the dimension of K ∞ ( U, ψ ) we can uti-lize the fact that the determinant of the Gramian matrixof a set of vectors (i.e., the matrix of their overlaps) iszero if and only if the vectors are linearly dependent. TheGramian corresponding to V k is given by G ( k ) = (cid:104) ψ | ψ (cid:105) (cid:104) ψ | ψ (cid:105) · · · (cid:104) ψ | ψ k (cid:105)(cid:104) ψ | ψ (cid:105) (cid:104) ψ | ψ (cid:105) · · · (cid:104) ψ | ψ k (cid:105) ... ... . . . ... (cid:104) ψ k | ψ (cid:105) (cid:104) ψ k | ψ (cid:105) · · · (cid:104) ψ k | ψ k (cid:105) . (7) If Det ( G ( k )) (cid:54) = 0 , then the vectors in V k are linearly in-dependent and therefore span at least a k +1 dimensionalsubspace. Conversely, if Det ( G ( k )) = 0 , the set V k con-tains linear dependencies and the subspace they span isless than k + 1 dimensional. Therefore, if we can ﬁnd k min , the smallest k such that Det ( G ( k )) = 0 , then (not-ing that G ( k ) is a k + 1 dimensional matrix) we knowthat k min is the largest number of linearly independentvectors spanned by V ∞ . That is, k min is the dimensionof K ∞ ( U, ψ ) and so we have that n eig = k min .The overlaps (cid:104) ψ l | ψ l (cid:48) (cid:105) for any l and l (cid:48) can be mea-sured using the Hadamard Test, shown in Fig. 1, andthus the Hadamard test can be used to determine G ( k ) on quantum hardware. Since the Gramian here con-tains two symmetries, hermiticity and the invariance (cid:104) ψ l | ψ l (cid:48) (cid:105) = (cid:104) ψ | U − l U l (cid:48) | ψ (cid:105) = (cid:104) ψ | U l (cid:48) − l | ψ (cid:105) = (cid:104) ψ | ψ l (cid:48) − l (cid:105) ,we only have to calculate the ﬁrst row of the matrix G ( k ) on the quantum computer.In summary, our proposed algorithm to determine n eig consists of the following loop. Starting with k = 1 ,1. Construct G ( k ) using the Hadamard test.2. Calculate (classically) Det ( G ( k )) .If Det ( G ( k )) = 0 , terminate the loop and concludethat n eig = k .If Det ( G ( k )) (cid:54) = 0 , increase k → k + 1 and return tostep 1.This is shown schematically in Fig. 1.We remark that in the presence of degeneracies in thespectrum of H , the eigenvectors corresponding to degen-erate eigenvalues are not unique. Therefore, in this case,the number of states spanned by | ψ (cid:105) depends on howthe eigenvectors corresponding to degenerate eigenval-ues are chosen. However, as detailed in Appendix A,to learn the action of U on | ψ (cid:105) , what matters is thenumber of eigenstates spanned by | ψ (cid:105) corresponding to unique eigenvalues. This is equivalent to the dimensionof the Krylov subspace K ∞ ( U, ψ ) . Consequently, thealgorithm detailed above can also be used in this case. C. Ansatz

The fsVFF algorithm, similarly to VFF, employs anansatz of the form V ( α , ∆ t ) = W ( θ ) D ( γ , ∆ t ) W † ( θ ) , (8)to diagonalize the initial Trotter unitary U (∆ t ) . Here W ( θ ) is a quantum circuit that approximately rotatesthe standard basis into the eigenbasis of H , and D ( γ ) is a diagonal unitary that captures the (exponentiated)eigenvalues of H . A generic diagonal operator D can bewritten in the form D ( γ , ∆ t ) = (cid:89) q e γ q Z q ∆ t , (9)where γ q ∈ R and we use the notation Z q = Z q ⊗ · · · ⊗ Z q n n , (10)with Z j the Pauli Z operator acting on qubit j . WhileEq. (9) provides a general expression for a diagonal uni-tary, for practical ansätze it may be desirable to assumethat the Z q operators are local operators and the prod-uct contains a polynomial number of terms, i.e., is in O ( poly ( n )) . There is more ﬂexibility in the construc-tion of the ansätze for W since these are generic unitaryoperations. A natural choice might be to use a hardware-eﬃcient ansatz [46].One of the main advantages of fsVFF is that diago-nalization is only necessary over the subspace spannedby the initial state, rather than the entire Hilbert spacewhich will be signiﬁcantly larger. To outperform stan-dard VFF, it is in our interest to take advantage of thissmall subspace to ﬁnd compact ansätze.The two main impeding factors we wish to minimizeto aid diagonalization are error rates and optimizationtime. Therefore, when searching for ansätze, our priori-ties are to minimize the number of CNOT gates required(the noisiest component in the ansätze) and the num-ber of rotation parameters. There is, however, a tradeoﬀ between expressibility of the ansatz and its trainabil-ity. There needs to be enough freedom in the unitary tomap the required eigenvectors to the computational basisbut generically highly expressive ansätze exhibit barrenplateaus [32].For systems with symmetries and/or systems that arenearby perturbations of known diagonalizable systems, itmay be possible to ﬁnd a fully expressive, compact ansatzby inspection. This is the case for a simple 2-qubit XYHamiltonian, as discussed in Section IV.More generally, it can be challenging to analyti-cally ﬁnd compact but suﬃciently expressible ansätze.Nonetheless, it is possible to variationally update theansatz structure and thereby systematically discover sim-ple structures. One straightforward approach is to use alayered ansatz where each layer initializes to the iden-tity gate [47, 48]. The ansatz can be optimized untilit plateaus, redundant single qubit gates removed, thenanother layer can be appended and the process repeats.Alternatively, more sophisticated discrete optimizationtechniques may be used to variationally search the spaceof ansätze. D. Summary of algorithm

The ﬁxed state Variational Fast Forwarding algorithm(fsVFF) is summarized in Fig. 1. We start with an ini-tial state | ψ (cid:105) that we wish to evolve under the Hamilto-nian H .1. The ﬁrst step is to approximate the short time evo-lution using a single step Trotter approximation U . 2. This Trotter approximation can be used to ﬁnd anapproximation for n eig , the dimension of the en-ergy eigenspace spanned by | ψ (cid:105) , using the methodoutlined in Section III B.3. Equipped with a value for n eig , we then varia-tionally search for a diagonalization of U over theenergy subspace spanned by | ψ (cid:105) using C fsVFF ,Eq. (5). At each iteration step the gradient of thecost with respect to a parameter θ i is measured onthe quantum computer for a ﬁxed set of parametersusing the analytic expressions for ∂ θ i C fsVFF pro-vided in Appendix D. These gradients are used toupdate the parameters using a classical optimizer,such as those in Refs. [49–51]. The output of theoptimization loop is the set of parameters that min-imize C fsVFF , { θ opt , γ opt } = arg min θ , γ C fsVFF ( θ , γ ) . (11)4. Finally, the state | ψ (cid:105) can be simulated for time T = N ∆ t using the circuit W ( θ opt ) D ( γ opt , N ∆ t ) W ( θ opt ) † . (12)That is, by simply multiplying the parameters γ opt in the diagonalized unitary by a constant numberof iterations N .In Appendix E, we show that the total simulation ﬁ-delity, in the limit that leakage is small, is expectedto scale sub-quadratically with the number of fast-forwarding time steps N . Thus if the minimal cost fromthe optimization loop is suﬃciently small, we expect thefsVFF algorithm to allow for long, high ﬁdelity simula-tions. IV. HARDWARE IMPLEMENTATION

In this section we demonstrate that fsVFF can be usedto implement long time simulations on quantum hard-ware. Speciﬁcally, we simulate the XY spin chain, whichhas the Hamiltonian H XY := n − (cid:88) j =1 X j X j +1 + Y j Y j +1 , (13)where X j and Y j are Pauli operators on the j th qubit. Inwhat follows, we ﬁrst present results showing that we candetermine n eig for an initial state | ψ (cid:105) using the methoddescribed in Section III B. We then demonstrate that thefsVFF cost can be trained to ﬁnd an approximate diago-nalization of H XY on the subspace spanned by | ψ (cid:105) . Weﬁnally use this diagonalization to perform a long timefast forwarded simulation. In all cases we focus on a twoqubit chain, i.e. n = 2 , and we approximate its evolutionoperator using a ﬁrst-order Trotter approximation. k . . . . . D e t( G ( k )) Classical SimulationHardware Implementation n eig = 1 n eig = 2 n eig = 3 FIG. 2.

Gramian Determinant Calculation.

Here weplot the determinant of the Gramian matrix, Det ( G ) , for G measured on the Honeywell quantum computer (solid) andsimulated classically (dashed) for a 2-qubit XY spin chain.Speciﬁcally we looked at states spanning k = 1 (blue), k = 2 (yellow) and k = 3 (red) eigenstates. For both sets ofdata Det ( G ( n eig )) ≈ , demonstrating the eﬀectiveness ofthe method for determining n eig that we introduce in Sec-tion III B. For the Honeywell implementation we used 1000measurement samples per circuit. A. Determining n eig The 2-qubit XY Hamiltonian has the eigenvectors {| (cid:105) , √ ( | (cid:105) + | (cid:105) ) , √ ( | (cid:105) − | (cid:105) ) , | (cid:105)} , correspond-ing to the eigenvalues {0, 1, -1, 0}. As proof of princi-ple, we tested the algorithm for determining n eig on thestates | (cid:105) (corresponding to n eig = 1 ), | (cid:105) ( n eig = 2 )and √ ( | (cid:105) + | (cid:105) ) ( n eig = 3 ). As described in Sec-tion III B, the n eig of these states can be found by calcu-lating Det (( G ( k )) for increasing values of k since, as k isincreased, the determinant ﬁrst equals 0 when k = n eig .To verify this for the states considered here, we ﬁrstdetermine G using a classical simulator. As seen in Fig-ure 2, in this case Det (( G ( k )) exactly equals 0 when k = n eig . We then measured G on Honeywell’s quantumcomputer. Although on the real quantum device gatenoise and sampling errors are introduced, the results re-produce the classical results reasonably well. Namely, atthe correct value of k , Det (( G ( k )) drastically reduces andapproximately equals 0. Thus, we have shown that it ispossible to determine n eig for an initial state by measur-ing G on quantum hardware. B. Training

We tested the training step of the algorithm onIBM and Rigetti’s quantum computers, speciﬁcallyibmq_toronto and Aspen-8. For the purpose of imple-menting a complete simulation, we chose to focus on sim-ulating the evolution of the state | ψ (cid:105) = | (cid:105) . As dis-cussed in the previous section, this state spans n eig = 2 eigenstates.To diagonalize H XY on the 2-dimensional subspacespanned by | (cid:105) , we used a hybrid quantum-classical op- FIG. 3.

Ansatz for Hardware Implementation . Theansatz used to diagonalize the 2-qubit XY Hamiltonian in thesubspace of initial state | (cid:105) for the implementation on Rigettiand IBM’s quantum computers. Here R j ( θ ) = exp( − iθσ j / for j = x, y, z . timization loop to minimize C fsVFF . For a state with n eig = 2 the cost C fsVFF , Eq. (5), uses two trainingstates { U (∆ t ) k | ψ (cid:105)} k =1 , where U ( t ) is the ﬁrst-orderTrotter approximation of H XY . On the IBM quantumcomputer we evaluated the full cost function for eachgradient descent iteration. However, the time availableon the Aspen-8 device was limited, so to speed up therate of optimization we evaluated the overlap on just oneof the two training states per iteration, alternating be-tween iterations (instead of evaluating the overlaps onboth training states every iteration). To allow the move-ment through parameter space to use information aver-aged over the two timesteps, whilst only using a singletraining state per cost function evaluation, momentumwas added to the gradient updates [52].To take advantage of the fact that more compact an-sätze are viable for fsVFF, we variationally searched fora short depth ansatz, tailored to the target problem.Speciﬁcally, we started training with a general 2-qubitunitary and then during training the structure was min-imised by pruning unnecessary gates. In Figure 3, weshow the circuit for the optimal ansatz obtained usingthe method. The ansatz requires one CNOT gate andtwo single qubit gates for W and only one R z rotation for D . This is a substantial compression on the most generaltwo qubit ansatz for W which requires 3 CNOTs and 15single qubit rotations and the most general 2 qubit ansatzfor D which requires 2 R z rotations and one 2-qubit ZZrotation (though in the case of the XY Hamiltonian thismay be simpliﬁed to only 2 R z rotations [53]).Figure 4(a) shows the fsVFF cost function versusthe number of iterations for the implementations onibmq_toronto (yellow) and Aspen-8 (red). The dashedline indicates the noisy cost value obtained from thequantum computer. To evaluate the quality of the op-timization, we additionally classically compute the truecost (indicated by the solid lines) using the parametersfound on ibmq_toronto and Aspen-8. While the noisycost saturates at around − , we obtained a minimumnoise-free cost of the order − . The two orders of mag-nitude diﬀerence between the noisy and the noise-freecost is experimental evidence that the cost function isnoise resilient on extant quantum hardware. − − − C o s t , C f s V FF a) NoisyNoise-free N . . . . . F i d e li t y , F b) fsVFF (ibmq-toronto)fsVFF (Aspen-8)Iterated Trotter

625 1275 20000 . . . FIG. 4.

Hardware Implementation. a) The 2-qubit pa-rameterised quantum circuit shown in Figure 3 was trainedto diagonalize U (∆ t ) , a ﬁrst order Trotter expansion of the2-qubit XY Hamiltonian with ∆ t = 0 . , in the subspacespanned by | (cid:105) . The dashed line plots the noisy cost asmeasured on ibmq_toronto (yellow) and Aspen-8 (red) us-ing 30,000 samples per circuit. The solid line indicates theequivalent noise-free cost that was calculated on a classicalsimulator. b) The initial state | ψ (cid:105) = | (cid:105) is evolved forwardsin time on the ibmq_rome quantum computer using the it-erated Trotter method (blue) and using fsVFF with the opti-mum parameters found on ibmq_toronto (yellow) and Aspen-8 (red). The quality of the simulation is evaluated by plottingthe ﬁdelity F = (cid:104) ψ | ρ | ψ (cid:105) between the evolved state and exactevolution. The grey dotted line at F = 0 . represents theoverlap with the maximally mixed state. The black dottedline denotes a threshold ﬁdelity at F = 0 . . The inset showsthe fast-forwarding of the ansatz trained on ibmq_torontoon a longer timescale, where the ﬁdelity dropped below 0.9(0.8) at 625 (1275) timesteps. All data was taken using 1000samples per circuit. C. Fast forwarding

Finally we took the two sets of parameters foundfrom training on ibmq_toronto and Aspen-8, and usedthem to implement a fast-forwarded simulation of thestate | (cid:105) on ibmq_rome. To evaluate the quality ofthe fast forwarding we calculated the ﬁdelity, F ( N ) = (cid:104) ψ ( N ) | ρ ( N ) | ψ ( N ) (cid:105) , between the density matrix of thesimulated state, ρ ( N ) , after N timesteps, and the exacttime evolved state, | ψ ( N ) (cid:105) , at time T = N ∆ t . We usedQuantum State Tomography to reconstruct the outputdensity matrix, and then calculated the overlap with theexact state classically.As shown by the plots of F ( N ) in Figure 4(b), fsVFF signiﬁcantly outperforms the iterated Trotter method.Let us refer to the time before the simulation falls be-low an error threshold δ as the high ﬁdelity time. Thenthe ratio of the high ﬁdelity time for fsVFF ( T FF δ ) and forstandard Trotterization ( T Trot δ ) is a convenient measureof simulation performance, R FF δ = T FF δ /T Trot δ . (14)A simulation can be said to have been successfully fast-forwarded if R FF δ > . The iterated Trotter methoddropped below a simulation inﬁdelity threshold of δ =1 − F = 0 . . after 4 (8) timesteps. In compar-ison, fsVFF maintained a high ﬁdelity for 625 (1275)timesteps. Thus we achieved a simulation fast-forwardingratio of R FF0 . = 156 ( R FF0 . = 159 ). V. ENERGY ESTIMATION

The diagonalization obtained from the optimizationstage of fsVFF, W ( θ opt ) D ( γ opt , ∆ t ) W ( θ opt ) † , implicitlycontains approximations of the eigenvalues and eigenvec-tors of the Hamiltonian of the system of interest. Inthis section we discuss methods for extracting the energyeigenstates and eigenvalues from a successful diagonal-ization and implement them on quantum hardware.The energy eigenvectors spanned by the initial state | ψ (cid:105) can be determined by the following simple samplingmethod. The ﬁrst step is to apply W † to the initial state | ψ (cid:105) . In the limit of perfect learning and vanishing Trot-ter error, this gives W ( θ opt ) † | ψ (cid:105) = n eig (cid:88) k =1 a k | v k (cid:105) (15)where a k = (cid:104) E k | ψ (cid:105) and {| v k (cid:105)} n eig k =1 is a set of compu-tational basis states. The energy eigenstates spannedby | ψ (cid:105) are then found by applying W ( θ opt ) to anyof the states obtained from measuring W ( θ opt ) † | ψ (cid:105) in the computational basis, that is {| E k (cid:105)} n eig k =1 = { W ( θ opt ) | v k (cid:105)} n eig k =1 .Extracting the energy eigenvalues from D is more sub-tle. Firstly, as W DW † and U , even in the limit of per-fectly minimizing the cost C fsVFF , may disagree by aglobal phase φ , at best we can hope to learn the diﬀerencebetween, rather than absolute values, of the energy eigen-values of H . For simple cases, where the diagonal ansatz D is composed of a polynomial number of terms, theseenergy value diﬀerences may be extracted directly byrewriting D in the computational basis. For example, inour hardware implementation D ( γ ) = exp (cid:16) − i γ ∆ tZ (cid:17) ⊗ | ψ (cid:105) is given by γ opt ∆ t + kπ . Here k in an integer correcting for the arbitrary phase arisingfrom taking the log of D that can be determined usingthe method described in Ref. [22]. Using this approach,we obtain 1.9995 and 2.0019 from the training on IBM a) QPE j | (cid:105) H QF T † | E k (cid:105) U (∆ t ) j ⇒ j | (cid:105) H QF T † | v k (cid:105) D ( γ opt , j ∆ t )b) QEE | (cid:105) H (cid:104) σ x + iσ y (cid:105)| E k (cid:105) U (∆ t ) ⇒ | (cid:105) H (cid:104) σ x + iσ y (cid:105)| v k (cid:105) D ( γ opt , ∆ t )1 FIG. 5.

Energy estimation circuits a)/b) show circuit diagrams depicting the enhancement of QPE/QEE using fsVFF. Acircuit depth reduction is achieved through replacing U (∆ t ) with D ( γ opt , ∆ t ) , and removing the need to prepare an eigenstatein favour of a computational basis state, | v k (cid:105) = W † | E k (cid:105) . QPE relies on implementing controlled unitaries of the form U (∆ t ) j and therefore replacing these with D ( γ opt , j ∆ t ) results in an exponential reduction in circuit depth. 3 U RE D E LOLW \ $ P S OLW XG H 6WDQGDUG43(IV9))HQKDQFHG43( FIG. 6.

Quantum Phase Estimation.

Using the 2-qubit diagonalization found from training on ibmq_toronto,QPE was performed on ibmq_boeblingen on the eigenvector | E (cid:105) := √ ( | (cid:105) + | (cid:105) ) . A phase of e πi is applied, so themeasured output should be 001 with probability 1. The vari-ation distance from the target probability distribution whenusing QPE with fsVFF was 0.578, compared to 0.917 usingstandard QPE. and Rigetti respectively, in good agreement of the the-oretically expected value of 2. For more complex cases,this simple post-processing method will be become in-tractable and an algorithmic approach will be necessary.Quantum Phase Estimation (QPE) [54] and Quan-tum Eigenvalue Estimation (QEE) [55] are fault toler-ant quantum algorithms for estimating the eigenvaluesof a unitary operation. However, their implementationon current quantum devices is limited by the relianceon the execution of controlled unitaries from ancillaryqubits. These controlled unitaries require many entan-gling gates, and introduce too much noise to be realizedfor large scale systems on current hardware. Once an evo-lution operator has been diagonalized in the subspace ofan initial state, fsVFF can be used to signiﬁcantly reduce the circuit depth of QPE and QEE, as shown in Figure 5.In this manner, fsVFF provides a NISQ friendly means ofestimating the eigenvalues within a subspace of a Hamil-tonian.To demonstrate the power of fsVFF to reduce thedepth of QPE, we perform QPE using the diagonaliza-tion obtained from training on IBM’s quantum com-puter. Speciﬁcally, we consider the input eigenvector | E (cid:105) := √ ( | (cid:105) + | (cid:105) ) . This is one of the eigenvec-tors spanned by the input state of our earlier hardwareimplementation, | ψ (cid:105) = | (cid:105) . We then consider evolving | E (cid:105) under H XY for a time step of ∆ t = 1 / . Since theenergy of the state | E (cid:105) equals 1, we expect this to resultin a phase shift of e πi/ being applied to | E (cid:105) . We im-plemented QPE and fsVFF enhanced QPE to measurethis phase using the circuits shown in Fig 5. We chose tomeasure to 3 bits of precision and therefore the outputshould be the measurement 001 with probability one. AsFigure 6 shows, it appears that the standard QPE imple-mentation was unable to discern this phase. In contrast,when fsVFF was used to reduce the circuit depth, theoutput distribution was strongly peaked at the correctstate.QEE requires only one ancillary qubit, a single im-plementation of e − iHt , and no Quantum Fourier Trans-form and therefore is less resource intensive than QPE.Nonetheless, we can again, as shown in Fig. 5, use fsVFFas a pre-processing step to reduce the circuit depth.We tested this on the 3-qubit XY Hamiltonian by ﬁrstperforming fsVFF on a quantum simulator with the ini-tial state | ψ (cid:105) = | (cid:105) . Having obtained an approximatediagonalization, we determined the eigenstates using thesampling method described earlier. Figure 7 shows theresults of the measurement of W ( θ opt ) † | ψ (cid:105) , with fourstrong peaks corresponding to the four eigenvectors inthis subspace.Figure 8 shows the results of QEE implemented onibmq_boeblingen. We use the basis states found fromthe sampling method as our inputs to reduce the depth 3 U RE D E LOLW \ $ P S OLW XG H FIG. 7.

Determining the eigenstates spanned by theinitial state : The 3-qubit XY Hamiltonian was diagonal-ized on a quantum simulator in the subspace of initial state | ψ (cid:105) = | (cid:105) to obtain θ opt and γ opt . Here we show the out-put of measuring W ( θ opt ) † | ψ (cid:105) in the computational basis onibmq_boeblingen. The 4 non-zero states correspond to the 4eigenvectors spanned by | ψ (cid:105) . ([DFW FIG. 8.

Eigenvalue Estimation using QEE.

Here weshow the result of implementing QEE (using fsVFF as a pre-processing step) on ibmq_santiago to calculate the eigenval-ues of the eigenvectors in the subspace spanned by | (cid:105) . Thesolid yellow, red, blue and green lines represent the eigenval-ues obtained for the | (cid:105) , | (cid:105) , | (cid:105) and | (cid:105) states, withexact corresponding energies of {-2.828, 0, 0, 2.828}, indi-cated by the dotted lines. The eigenvalues are plotted asphases since for ∆ t = 1 there is a one to one correspondence. of the circuit, and remove the need to use the time-seriesmethod originally proposed for extracting the eigenval-ues, as we could calculate the eigenvalues individuallyby inputting their corresponding eigenvectors. A valueof ∆ t = 1 was used so the phase calculated directlymatched the eigenvalue. After removing a global phase,QEE had accurately found the eigenvalues of the foureigenvectors, with a mean-squared error from the truevalues of . × − . VI. DISCUSSION

In this work, we demonstrated that despite the mod-est size and noise levels of the quantum hardware that iscurrently available, it is possible to perform long time dy-namical simulations with a high ﬁdelity. Speciﬁcally, wehave introduced fsVFF, a new algorithm for NISQ sim-ulations, which we used to simulate a 2-qubit XY-modelspin chain on the Rigetti and IBM quantum computers.We achieved a ﬁdelity of at least 0.9 for over 600 timesteps. This is a 150-fold improvement on the standarditerated Trotter approach, which had a ﬁdelity of lessthan 0.9 after only 4 time steps.Central to the success of the fsVFF algorithm is thefact that it is tailored to simulating a particular ﬁxedinitial state rather than an arbitrary initial state. By fo-cusing on this less demanding task, we showed that it ispossible to substantially reduce the width and depth ofthe previously proposed VFF algorithm. In particular,since fsVFF only requires ﬁnding a diagonalization of ashort-time evolution unitary on the subspace spanned bythe initial state (compared to the entire Hilbert space inthe case of VFF), fsVFF can utilize much simpler an-sätze. This is demonstrated in our hardware implemen-tation, where one CNOT and two parameterized singlequbit rotations proved suﬃcient for an eﬀective ansatzfor W and one single qubit rotation was suﬃcient for D .While one expects that shorter-depth ansätze may beused to diagonalize a unitary on a subspace, rather thantotal Hilbert space, the question of how to design suchansätze remains an important open question for futureresearch. We believe it would be worthwhile both to ex-plore analytic approaches whereby the symmetry prop-erties of the ansatz are used, as well as to use machinelearning approaches to systematically search for suitableansätze.The fsVFF algorithm, similarly to VFF, is fundamen-tally limited by the initial Trotter error approximatingthe short time evolution of the system. The VariationalDiagonalization Hamiltonian (VHD) algorithm [22] maybe used to remove this error. However, like VFF, VHD isdesigned to simulate any possible initial state. There area number of diﬀerent approaches inspired by fsVFF thatcould be explored for reducing the resource requirementsof the VHD algorithm by focusing on simulating a par-ticular initial state. Such a “ﬁxed state VHD” algorithmwould allow for more accurate long time simulations onNISQ hardware.More generally, our work highlights the trade oﬀ be-tween the universality of an algorithm and the resourcesrequired to implement it. One can imagine a number ofalternative ways in which the universality of an algorithmcan be sacriﬁced, without signiﬁcantly reducing its util-ity, in order to make it more NISQ friendly. For example,one is often interested in studying the evolution of a par-ticular observable of interest, rather than all possible ob-servables. It would be interesting to investigate whether aﬁxed-observable fsVFF could further reduce the resources0required to implement long time high ﬁdelity simulations.More broadly, an awareness of this trade oﬀ may proveuseful beyond dynamical simulation for the ongoing chal-lenge of adapting quantum algorithms to the constraintsof NISQ hardware. VII. ACKNOWLEDGEMENTS

JG and KG acknowledge support from the U.S. De-partment of Energy (DOE) through a quantum comput-ing program sponsored by the Los Alamos National Lab-oratory (LANL) Information Science & Technology In-stitute. ZH, BC, and PJC acknowledge support fromthe LANL ASC Beyond Moore’s Law project. We ac-knowledge the LANL Laboratory Directed Research andDevelopment (LDRD) program for support of AS and ini-tial support of BC under project number 20190065DR as well as LC under project number 20200022DR. AA wassupported by the U.S. Department of Energy (DOE), Of-ﬁce of Science, Oﬃce of High Energy Physics QuantISEDprogram under Contract No. DE-AC52-06NA25396. LCand PJC were also supported by the U.S. DOE, Of-ﬁce of Science, Basic Energy Sciences, Materials Sci-ences and Engineering Division, Condensed Matter The-ory Program. This research used quantum computing re-sources provided by the LANL Institutional ComputingProgram, which is supported by the U.S. Department ofEnergy National Nuclear Security Administration underContract No. 89233218CNA000001. This research usedadditional quantum computational resources supportedby the LANL ASC Beyond Moore’s Law program and bythe Oak Ridge Leadership Computing Facility, which isa DOE Oﬃce of Science User Facility supported underContract DE-AC05-00OR22725. [1] Frank Arute, Kunal Arya, Ryan Babbush, Dave Ba-con, Joseph Bardin, Rami Barends, Rupak Biswas, Ser-gio Boixo, Fernando Brandao, David Buell, Brian Bur-kett, Yu Chen, Zijun Chen, Ben Chiaro, Roberto Collins,William Courtney, Andrew Dunsworth, Edward Farhi,Brooks Foxen, and John Martinis, “Quantum supremacyusing a programmable superconducting processor,” Na-ture , 505–510 (2019).[2] Frank Arute, Kunal Arya, Ryan Babbush, Dave Ba-con, Joseph C Bardin, Rami Barends, Andreas Bengts-son, Sergio Boixo, Michael Broughton, Bob B Buck-ley, et al. , “Observation of separated dynamics of chargeand spin in the fermi-hubbard model,” arXiv preprintarXiv:2010.07965 (2020).[3] Seth Lloyd, “Universal quantum simulators,” Science ,1073–1078 (1996).[4] AT Sornborger and Ewan D Stewart, “Higher-ordermethods for simulations on quantum computers,” Physi-cal Review A , 1956 (1999).[5] Guang Hao Low and Isaac L Chuang, “Hamiltonian sim-ulation by qubitization,” Quantum , 163 (2019).[6] Dominic W Berry, Andrew M Childs, Richard Cleve,Robin Kothari, and Rolando D Somma, “Simulatinghamiltonian dynamics with a truncated taylor series,”Physical review letters , 090502 (2015).[7] M. Cerezo, Andrew Arrasmith, Ryan Babbush, Si-mon C Benjamin, Suguru Endo, Keisuke Fujii, Jarrod RMcClean, Kosuke Mitarai, Xiao Yuan, Lukasz Cincio,and Patrick J. Coles, “Variational quantum algorithms,”arXiv preprint arXiv:2012.09265 (2020).[8] Suguru Endo, Zhenyu Cai, Simon C Benjamin, and XiaoYuan, “Hybrid quantum-classical algorithms and quan-tum error mitigation,” arXiv preprint arXiv:2011.01382(2020).[9] Kishor Bharti, Alba Cervera-Lierta, Thi Ha Kyaw,Tobias Haug, Sumner Alperin-Lea, Abhinav Anand,Matthias Degroote, Hermanni Heimonen, Jakob S.Kottmann, Tim Menke, Wai-Keong Mok, Sukin Sim,Leong-Chuan Kwek, and Alán Aspuru-Guzik, “Noisy intermediate-scale quantum (nisq) algorithms,” arXivpreprint arXiv:2101.08448 (2021).[10] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q.Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien,“A variational eigenvalue solver on a photonic quantumprocessor,” Nature Communications , 4213 (2014).[11] Edward Farhi, Jeﬀrey Goldstone, and Sam Gutmann,“A quantum approximate optimization algorithm,” arXivpreprint arXiv:1411.4028 (2014).[12] Jarrod R McClean, Jonathan Romero, Ryan Babbush,and Alán Aspuru-Guzik, “The theory of variationalhybrid quantum-classical algorithms,” New Journal ofPhysics , 023023 (2016).[13] S. Khatri, R. LaRose, A. Poremba, L. Cincio, A. T. Sorn-borger, and P. J. Coles, “Quantum-assisted quantumcompiling,” Quantum , 140 (2019).[14] Ryan LaRose, Arkin Tikku, Étude O’Neel-Judy, LukaszCincio, and Patrick J Coles, “Variational quantumstate diagonalization,” npj Quantum Information , 1–10 (2019).[15] Andrew Arrasmith, Lukasz Cincio, Andrew T. Sorn-borger, Wojciech H. Zurek, and Patrick J. Coles, “Vari-ational consistent histories as a hybrid algorithm forquantum foundations,” Nature Communications , 3438(2019).[16] M. Cerezo, Alexander Poremba, Lukasz Cincio, andPatrick J Coles, “Variational quantum ﬁdelity estima-tion,” Quantum , 248 (2020).[17] Y. Li and S. C. Benjamin, “Eﬃcient variational quantumsimulator incorporating active error minimization,” Phys.Rev. X , 021050 (2017).[18] Suguru Endo, Jinzhao Sun, Ying Li, Simon C Benjamin,and Xiao Yuan, “Variational quantum simulation of gen-eral processes,” Physical Review Letters , 010501(2020).[19] Yong-Xin Yao, Niladri Gomes, Feng Zhang, ThomasIadecola, Cai-Zhuang Wang, Kai-Ming Ho, and Peter POrth, “Adaptive variational quantum dynamics simula-tions,” arXiv preprint arXiv:2011.00622 (2020). [20] Kentaro Heya, Ken M Nakanishi, Kosuke Mitarai, andKeisuke Fujii, “Subspace variational quantum simulator,”arXiv preprint arXiv:1904.08566 (2019).[21] Cristina Cirstoiu, Zoe Holmes, Joseph Iosue, Lukasz Cin-cio, Patrick J Coles, and Andrew Sornborger, “Vari-ational fast forwarding for quantum simulation beyondthe coherence time,” npj Quantum Information , 1–10(2020).[22] Benjamin Commeau, Marco Cerezo, Zoë Holmes, LukaszCincio, Patrick J Coles, and Andrew Sornborger,“Variational hamiltonian diagonalization for dynamicalquantum simulation,” arXiv preprint arXiv:2009.02559(2020).[23] Colin J Trout, Muyuan Li, Mauricio Gutiérrez, YukaiWu, Sheng-Tao Wang, Luming Duan, and Kenneth RBrown, “Simulating the performance of a distance-3 sur-face code in a linear ion trap,” New Journal of Physics , 043038 (2018).[24] Kishor Bharti and Tobias Haug, “Quantum assisted sim-ulator,” arXiv preprint arXiv:2011.06911 (2020).[25] Jonathan Wei Zhong Lau, Kishor Bharti, Tobias Haug,and Leong Chuan Kwek, “Quantum assisted simula-tion of time dependent hamiltonians,” arXiv preprintarXiv:2101.07677 (2021).[26] Tobias Haug and Kishor Bharti, “Generalized quan-tum assisted simulator,” arXiv preprint arXiv:2011.14737(2020).[27] Kyle Poland, Kerstin Beer, and Tobias J Osborne, “Nofree lunch for quantum machine learning,” arXiv preprintarXiv:2003.14103 (2020).[28] Kunal Sharma, M Cerezo, Zoë Holmes, Lukasz Cincio,Andrew Sornborger, and Patrick J Coles, “Reformula-tion of the no-free-lunch theorem for entangled data sets,”arXiv preprint arXiv:2007.04900 (2020).[29] Kunal Sharma, Sumeet Khatri, M. Cerezo, and Patrick JColes, “Noise resilience of variational quantum compil-ing,” New Journal of Physics , 043006 (2020).[30] Sukin Sim, Peter D. Johnson, and Alán Aspuru-Guzik,“Expressibility and entangling capability of parameter-ized quantum circuits for hybrid quantum-classical al-gorithms,” Advanced Quantum Technologies , 1900070(2019).[31] Kouhei Nakaji and Naoki Yamamoto, “Expressibility ofthe alternating layered ansatz for quantum computa-tion,” arXiv preprint arXiv:2005.12537 (2020).[32] Zoë Holmes, Kunal Sharma, M Cerezo, and Patrick JColes, “Connecting ansatz expressibility to gradi-ent magnitudes and barren plateaus,” arXiv preprintarXiv:2101.02138 (2021).[33] Burak Sahinoglu and Rolando D Somma, “Hamiltoniansimulation in the low energy subspace,” arXiv preprintarXiv:2006.02660 (2020).[34] Marco Cerezo, Akira Sone, Tyler Volkoﬀ, Lukasz Cin-cio, and Patrick J Coles, “Cost-function-dependent bar-ren plateaus in shallow quantum neural networks,” arXivpreprint arXiv:2001.00550 (2020).[35] Jarrod R. McClean, Sergio Boixo, Vadim N. Smelyanskiy,Ryan Babbush, and Hartmut Neven, “Barren plateausin quantum neural network training landscapes,” NatureCommunications (2018), 10.1038/s41467-018-07090-4.[36] M. Cerezo and Patrick J Coles, “Impact of barrenplateaus on the hessian and higher order derivatives,”arXiv preprint arXiv:2008.07454 (2020). [37] Andrew Arrasmith, M. Cerezo, Piotr Czarnik, LukaszCincio, and Patrick J Coles, “Eﬀect of barrenplateaus on gradient-free optimization,” arXiv preprintarXiv:2011.12245 (2020).[38] Zoë Holmes, Andrew Arrasmith, Bin Yan, Patrick JColes, Andreas Albrecht, and Andrew T Sornborger,“Barren plateaus preclude learning scramblers,” arXivpreprint arXiv:2009.14808 (2020).[39] Tyler Volkoﬀ and Patrick J Coles, “Large gradients viacorrelation in random parameterized quantum circuits,”Quantum Science and Technology (2021).[40] Kunal Sharma, Marco Cerezo, Lukasz Cincio, andPatrick J Coles, “Trainability of dissipative perceptron-based quantum neural networks,” arXiv preprintarXiv:2005.12458 (2020).[41] Arthur Pesah, M. Cerezo, Samson Wang, Tyler Volkoﬀ,Andrew T Sornborger, and Patrick J Coles, “Absenceof barren plateaus in quantum convolutional neural net-works,” arXiv preprint arXiv:2011.02966 (2020).[42] Alexey Uvarov and Jacob Biamonte, “On barren plateausand cost function locality in variational quantum algo-rithms,” arXiv preprint arXiv:2011.10530 (2020).[43] Carlos Ortiz Marrero, Mária Kieferová, and NathanWiebe, “Entanglement induced barren plateaus,” arXivpreprint arXiv:2010.15968 (2020).[44] Taylor L Patti, Khadijeh Najaﬁ, Xun Gao, and Su-sanne F Yelin, “Entanglement devised barren plateaumitigation,” arXiv preprint arXiv:2012.12658 (2020).[45] A. Krylov, “De la resolution numerique de l’equationservant a determiner dans des questions de mecaniqueappliquee les frequences de petites oscillations des sys-temes materiels,” Bulletin de l’Academie des Sciences del’URSS. Classe des sciences mathematiques et na , 491–539 (1931).[46] Abhinav Kandala, Antonio Mezzacapo, Kristan Temme,Maika Takita, Markus Brink, Jerry M. Chow, andJay M. Gambetta, “Hardware-eﬃcient variational quan-tum eigensolver for small molecules and quantum mag-nets,” Nature , 242–246 (2017).[47] Edward Grant, Leonard Wossnig, Mateusz Ostaszewski,and Marcello Benedetti, “An initialization strategy foraddressing barren plateaus in parametrized quantum cir-cuits,” Quantum , 214 (2019).[48] Andrea Skolik, Jarrod R McClean, Masoud Mohseni,Patrick van der Smagt, and Martin Leib, “Layerwiselearning for quantum neural networks,” arXiv preprintarXiv:2006.14904 (2020).[49] Jonas M Kübler, Andrew Arrasmith, Lukasz Cin-cio, and Patrick J Coles, “An adaptive optimizer formeasurement-frugal variational algorithms,” Quantum ,263 (2020).[50] Andrew Arrasmith, Lukasz Cincio, Rolando D Somma,and Patrick J Coles, “Operator sampling for shot-frugaloptimization in variational algorithms,” arXiv preprintarXiv:2004.06252 (2020).[51] Ryan Sweke, Frederik Wilde, Johannes Jakob Meyer,Maria Schuld, Paul K Fährmann, Barthélémy Meynard-Piganeau, and Jens Eisert, “Stochastic gradient descentfor hybrid quantum-classical optimization,” Quantum ,314 (2020).[52] Aaron Defazio, “Understanding the role of momentumin non-convex optimization: Practical insights froma lyapunov analysis,” arXiv preprint arXiv:2010.00406(2020). [53] Elliott Lieb, Theodore Schultz, and Daniel Mattis, “Twosoluble models of an antiferromagnetic chain,” Annals ofPhysics , 407–466 (1961). [54] Michael A. Nielsen and Isaac L. Chuang, Quantum Com-putation and Quantum Information (Cambridge Univer-sity Press, 2000).[55] Rolando D. Somma, “Quantum eigenvalue estimation viatime series analysis,” (2020), arXiv:1907.11748 [quant-ph]. Appendix A: Faithfulness of cost function

Here we demonstrate that the cost function is faithful in the limit that leakage from the subspace spanned by theinitial state can be disregarded. That is, suppose we could compile the exact evolution of the system U = exp( − iH ∆ t ) for a short timestep ∆ t . Then the cost function vanishes, C fsVFF ( U, V, ψ ) = 1 − n eig n eig (cid:88) k =1 | (cid:104) ψ | V † k U k | ψ (cid:105) | = 0 , (A1)if and only if the ﬁdelity of the fast-forwarded simulation is perfect, F τ = | (cid:104) ψ | V † τ U τ | ψ (cid:105) | = 1 , (A2)for all times τ . Note, the reverse direction is trivial. If F τ = 1 for all τ then C fsVFF ( U, V, ψ ) = 0 .Before embarking on the core of the proof let us ﬁrst emphasize that in the deﬁnition of the cost (A1) we averageover n eig terms, where n eig is the number of eigenstates spanned by the initial state | ψ (cid:105) corresponding to uniqueeigenvalues of the Hamiltonian H . The restriction to eigenstates with unique eigenvalues is important since it is thiswhich determines the dimension of the subspace spanned by the future evolution of | ψ (cid:105) (which in turn determinesthe number of training states required to learn U = exp( − iH ∆ t ) ).To see this consider a Hamiltonian H with a spectrum { E k } n k =1 and corresponding eigenstates {| E k (cid:105)} n k =1 . Theinitial state | ψ (cid:105) can be expanded in the energy eigenbasis as | ψ (cid:105) = m (cid:88) k =1 a k | E k (cid:105) . (A3)The future evolution of such a state, i.e. the set of states S k ( ψ , H ) := { e − iHj ∆ t | ψ (cid:105)} kj =0 (A4)with k = ∞ , is solely contained within the subspace spanned by {| E k (cid:105)} mk =1 since e − iHj ∆ t | ψ (cid:105) = m (cid:88) k =1 a k e − iE k j ∆ t | E k (cid:105) . (A5)If the spectrum { E k } n k =1 is non-degenerate then the evolution generates relative phases between all the m = n eig eigenstates spanned by | ψ (cid:105) . In that case, S ∞ will span the entirety of the m dimensional subspace spanned by {| E k (cid:105)} mk =1 , i.e., an m = n eig dimensional space. However, suppose the eigenstate expansion of | ψ (cid:105) includes twodegenerate eigenstates, i.e. two eigenstates that share the same eigenvalue. In that case the evolution generates relativephases between only m − of the eigenstates {| E k (cid:105)} mk =1 and therefore S ∞ is conﬁned to a m − dimensional subspace.More generally, if the eigenstate expansion of | ψ (cid:105) includes n eig states with unique eigenvalues, then evolution under U = exp( − iH ∆ t ) generates relative phases between n eig states and so S ∞ will span a n eig dimensional subspace ofthe space. In this manner, it is the number of eigenstates spanned by the initial state | ψ (cid:105) corresponding to unique eigenvalues, n eig , that determines the subspace spanned by S ∞ .We note that any initial state | ψ (cid:105) can be written in the form Eq. (A3) where, crucially, the sum is over onlyeigenstates corresponding to unique energies, i.e. m = n eig . For a Hamiltonian with a non-degenerate spectrum thisexpansion is trivial. For a Hamiltonian with a degenerate spectrum there is some freedom in how the eigenstatescorresponding to degenerate eigenvalues are deﬁned, since the superposition of degenerate eigenstates is also aneigenstate corresponding to the same energy. Therefore, henceforth, for degenerate Hamiltonians, we suppose thatthe energy eigenbasis {| E k (cid:105)} n k =1 is deﬁned such that the eigenstate expansion of | ψ (cid:105) only contains eigenstates withunique energies, that is m = n eig terms.We remark that our approach here may also be framed in the language of Krylov spaces. The Krylov subspace [45]associated with the operator U and vector | ψ (cid:105) is the linear subspace spanned by the vectors generated by evolving | ψ (cid:105) under U up to k times. That is K k ( U, ψ ) = span {V k } where V k := {| ψ l (cid:105)} l = kl =0 , (A6)with | ψ l (cid:105) := U l | ψ (cid:105) . Now supposing U = e − iH ∆ t and | ψ (cid:105) = (cid:80) n eig k =1 a k | E k (cid:105) , we have that V k = S k . Thus the futureevolution of | ψ (cid:105) is conﬁned to the Krylov space K ∞ ( U, ψ ) . This is an n eig dimensional subspace, and therefore, aswill become clear, we require n eig training states in order to learn U on this subspace.4To prove the forward direction, we ﬁrst note that if C fsVFF = 0 then as (cid:54) | (cid:104) ψ | V † k U k | ψ (cid:105) | (cid:54) , we have that | (cid:104) ψ | V † k U k | ψ (cid:105) | = 1 for all k . It thus follows that the action of V k on | ψ (cid:105) agrees with the action of U k on | ψ (cid:105) up to an unknown phase e iφ k , i.e. for (cid:54) k (cid:54) n eig we have that V k | ψ (cid:105) = e iφ k U k | ψ (cid:105) . (A7)Or equivalently, the action of V and U agree on the training states S train := {| ψ k (cid:105)} n eig − k =0 up to an unknown phase e i ˜ φ k , that is V | ψ k (cid:105) = e i ˜ φ k U | ψ k (cid:105) ∀ | ψ k (cid:105) ∈ S train , (A8)where ˜ φ k = φ k +1 − φ k .Now by construction (see Section III B) the n eig training states S train are linearly independent. Furthermore, sincethe initial simulation time ∆ t may be chosen freely, the unitary U = exp( − iH ∆ t ) can be chosen such that none ofthe states in S train are orthogonal. In this case, the unknown phases all agree and we have that ˜ φ k = φ for all k . Tosee this note that given | ψ (cid:105) and | ψ (cid:105) are linearly independent but non-orthogonal, the state | ψ (cid:105) can be representedas follows: | ψ (cid:105) = c | ψ (cid:105) + c ⊥ | ψ ⊥ (cid:105) , (A9)where | c | + | c ⊥ | = 1 and | c | (cid:54) = 0 . Then from (A8), we ﬁnd that e − i ˜ φ = (cid:104) ψ | W | ψ (cid:105) (A10) = | c | e i ˜ φ + | c ⊥ | (cid:104) ψ ⊥ | W | ψ ⊥ (cid:105) (A11) = | c | e i ˜ φ + (1 − | c | ) (cid:104) ψ ⊥ | W | ψ ⊥ (cid:105) , (A12)where W := V † U (note that this use of W is distinct from W ( θ ) used in the main text for a parameterized eigenvectorunitary). The above expression can be rearranged as e − i ˜ φ − (cid:104) ψ ⊥ | W | ψ ⊥ (cid:105) = | c | ( e − i ˜ φ − (cid:104) ψ ⊥ | W | ψ ⊥ (cid:105) ) . (A13)Since | e iθ | = 1 and | c | (cid:54) = 0 the aforementioned equation is satisﬁed if and only if (cid:104) ψ ⊥ | W | ψ ⊥ (cid:105) = e − i ˜ φ = e − i ˜ φ , (A14)which implies that ˜ φ = ˜ φ . Then by recursively applying this procedure to the rest of the states in the training set,we ﬁnd that ˜ φ k = ˜ φ j := φ for (cid:54) k (cid:54) n eig − and (cid:54) j (cid:54) n eig − .To understand the constraints from the minimization of the cost function, it is helpful to consider the form of theunitary matrix W . It follows from Eq. (A8), and the fact that since W is unitary ( (cid:80) j | W ij | = (cid:80) i | W ij | = 1 ), that W can be represented as [27]: W = e iφ . . . ... . . . e iφ W ⊥ . Here the upper left hand block spans the n eig dimensional subspace spanned by the input training states S train and W ⊥ is an unknown unitary matrix acting on the ( d − n eig ) dimensional space orthogonal to S train . For later conveniencelet us also note that the matrix ˜ W := U V † is also a unitary matrix of the form ˜ W = e iφ . . . ... . . . e iφ ˜ W ⊥ , where ˜ W ⊥ is again a ( d − n eig ) dimensional unitary matrix.5We are now in a position to show that if C fsVFF ( U, V, ψ ) = 0 , and by construction the states in S train are linearlyindependent and non-orthogonal, then F τ = | (cid:104) ψ | V † τ U τ | ψ (cid:105) | = 1 , (A15)for all times τ . To see this ﬁrst note that for all times τ (cid:54) n eig that F τ = 1 follows directly from Eq. (A7). Now byconstruction, any state | ψ τ (cid:105) for τ > n eig linearly depends on the states in S train and so it can be written as | ψ τ (cid:105) = n eig (cid:88) j =0 b ( τ ) j | ψ j (cid:105) , (A16)where b ( τ ) j = (cid:104) ψ j | ψ τ (cid:105) . Thus we have that W | ψ τ (cid:105) = ˜ W | ψ τ (cid:105) = e iφ | ψ τ (cid:105) (A17)for any time τ . It straightforwardly follows from Eq. (A7) and Eq. (A17) that the simulation ﬁdelity at time τ = n eig +1 equals 1, F n eig +1 = | (cid:104) ψ | V † n eig +1 U n eig +1 | ψ (cid:105) | (A18) = | (cid:104) ψ n eig | W | ψ n eig (cid:105) | = 1 . (A19)Now, let us consider the simulation ﬁdelity at time τ = n eig + 2 , F n eig +2 = | (cid:104) ψ | V † n eig +2 U n eig +2 | ψ (cid:105) | (A20) = | (cid:104) ψ n eig | V † W U | ψ n eig (cid:105) | (A21) = | (cid:104) ψ n eig | U † ˜ W W U | ψ n eig (cid:105) | (A22) = | (cid:104) ψ n eig +1 | ˜ W W | ψ n eig +1 (cid:105) | = 1 , (A23)where we have again used Eq. (A7) and Eq. (A17). Finally, the simulation ﬁdelity at an arbitrary time τ > n eig canbe evaluated as follows, F τ = | (cid:104) ψ | V † τ U τ | ψ (cid:105) | = | (cid:104) ψ | ( U † ˜ W ) τ U τ | ψ (cid:105) | = 1 . (A24)Thus, as claimed, if the cost function C fsVFF vanishes the simulation ﬁdelity is perfect for all times. Appendix B: Noise Resilience of Cost

To make a connection with prior results on noise resilience, we ﬁrst note that C fsVFF , Eq. (5), can be rewritten as C fsVFF ( U, V, ψ ) = 1 n eig n eig (cid:88) k =1 C ( k )fsVFF ( U, V, ψ ) , (B1)with C ( k )fsVFF ( U, V, ψ ) = 1 − Tr (cid:20) | (cid:105)(cid:104) | Y ψ k | (cid:105)(cid:104) | (cid:16) Y ψ k (cid:17) † (cid:21) . (B2)Here Y ψ k := V † ψ W D k W † U k V ψ with V ψ | (cid:105) = | ψ (cid:105) . Let QC k denote the circuit used to evaluate the cost term C ( k )fsVFF . Let ˜ C ( k ) fsVFF denote the noisy version of the cost term C ( k )fsVFF , that is the cost evaluated when the circuit QC j is run in the presence of noise. Let V opt k and ˜ V opt k denote the sets of unitaries that optimise C ( k )fsVFF and ˜ C ( k ) fsVFF respectively. Now, it was shown in [29] that the costs C ( k )fsVFF are resilient to measurement noise, gate noise, and Paulichannel noise in the sense that for circuits experiencing such noise we have ˜ V opt k ⊆ V opt k . This means that any set ofparameters that minimize the noisy cost ˜ C ( k ) fsVFF are guaranteed to also minimize the true exact cost C ( k )fsVFF .We will now argue that this implies that C fsVFF is also noise resilient. To do so, we ﬁrst note that the costs C ( k )fsVFF can be minimized simultaneously by any unitary V opt that matches the target unitary U up to a global phase φ ,6i.e. such that V opt = exp( − iφ ) U . Therefore, assuming that the ansatz for V is suﬃciently expressive that the costs C ( k )fsVFF can be simultaneously minimized, the total cost C fsVFF is minimized by the set of unitaries that minimize eachof the C ( k )fsVFF costs simultaneously, that is the set ∩ k V opt k := V opt . (Note, if the ansatz is not suﬃciently expressivethen it might not be possible to simultaneously minimize each of the terms and therefore the intersection might beempty). Now, given that ˜ V opt k ⊆ V opt k , it follows that ˜ V opt := ∩ k ˜ V opt k ⊆ V opt . Thus, as claimed the total cost C fsVFF isalso noise resilient in the sense that any set of parameters that minimize the noisy cost ˜ C fsVFF also minimize the trueexact cost C fsVFF . Appendix C: Local cost with trainability guarantee

While C fsVFF was motivated in Section III A as a natural choice of cost function to learn the evolution inducedby a target unitary on a ﬁxed initial state, it encounters what is known as a barren plateau for large simulationsizes [34, 35]. (See Refs. [32, 36–44]) for further details about the barren plateau phenomenon.) Namely, the gradientof the cost vanishes exponentially with n . As a result, for large systems the cost landscape is prohibitively ﬂat andtherefore an exponential precision is required to discern a minimization direction. This precludes successful training.In this appendix we introduce a local cost function to surmount this diﬃculty. To motivate our local cost, we ﬁrstrecall that C fsVFF , Eq. (5), can be rewritten as C fsVFF = 1 − n eig n eig (cid:88) k =1 Tr (cid:20) | (cid:105)(cid:104) | Y ψ k | (cid:105)(cid:104) | (cid:16) Y ψ k (cid:17) † (cid:21) , (C1)where Y ψ k := V † ψ W D k W † U k V ψ with V ψ | (cid:105) = | ψ (cid:105) . Analogously, we now deﬁne the local ﬁxed state VFF cost as C Local fsVFF := 1 n n (cid:88) j =1 C Local,j fsVFF , (C2)with C Local,jfsVFF = 1 − n eig n eig (cid:88) k =1 Tr (cid:20)(cid:0) | (cid:105)(cid:104) | j ⊗ ¯ j (cid:1) Y ψ k | (cid:105)(cid:104) | (cid:16) Y ψ k (cid:17) † (cid:21) , (C3)and where ¯ j denotes all qubits except the j th qubit. Each of the C Local, j fsVFF terms can be measured using the sameLoschmidt echo circuit as C fsVFF but the ﬁnal measurement is performed on just the j th qubit (rather than allqubits).Following the proof techniques of [13, 29], it is possible to show that C Local fsVFF (cid:54) C fsVFF (cid:54) nC Local fsVFF . (C4)It therefore follows that C Local fsVFF = 0 if and only if C fsVFF = 0 . Thus, given that C fsVFF is faithful (as shown inAppendix A), its local variant C Local fsVFF is also faithful.Crucially, the results of Ref. [34] provide a trainability guarantee for the local ﬁxed state cost function C Local fsVFF .Namely, as long as the depth of the ansatz is in O (log( n )) , the gradient ∂ θ C Local fsVFF scales at worst as poly ( n ) . Thatis, as long as the ansatz is not too deep, the cost landscape will be suﬃciently featured for eﬀective training. Wetherefore advocate using C Local fsVFF for simulations of larger systems.

Appendix D: Cost function gradient derivation

Here we derive the analytic expressions for the gradient of the cost function C fsVFF ( U, V, ψ ) for gradient descentoptimisation. To emphasize the independence of each of the terms in the cost function and its dependence on θ and γ we write C fsVFF ( U, V, ψ ) = 1 n eig n eig (cid:88) k =1 C ( k )fsVFF ( θ , γ ) , (D1)7where we have C ( k )fsVFF ( θ , γ ) := 1 − | (cid:104) ψ | W ( θ ) D ( k γ ) W ( θ ) † U k | ψ (cid:105) | (D2) = 1 − Tr[ XW ( θ ) D ( k γ ) W ( θ ) † Y k W ( θ ) D ( k γ ) † W ( θ ) † ] , (D3)with X = | ψ (cid:105) (cid:104) ψ | and Y k = U k | ψ (cid:105) (cid:104) ψ | ( U † ) k . Expressions.

With the ansatz in (1), the partial derivative of C fsVFF ( U, V, ψ ) with respect to θ l , a parameter ofthe eigenvector operator W ( θ ) , is ∂C fsVFF ( U, V, ψ ) ∂θ l = 12 (cid:16) C fsVFF ( U, W l + DW † ) − C fsVFF ( U, W l − DW † )+ C fsVFF ( U, W D ( W l + ) † ) − C fsVFF ( U, W D ( W l − ) † ) (cid:17) . (D4)The operator W l + ( W l − ) is generated from the original eigenvector operator W ( θ ) by the addition of an extra π ( − π )rotation about a given parameter’s rotation axis: W l ± := W ( θ l ± ) with ( θ l ± ) i := θ l ± π δ i,l . (D5)Similarly, the partial derivative with respect to γ l , a parameter of the diagonal operator D ( γ ) , is ∂C fsVFF ∂γ l = 1 n eig n eig (cid:88) k =1 k (cid:16) C ( k )fsVFF (cid:0) U, W D l + W † (cid:1) − C ( k )fsVFF (cid:0) U, W D l − W † (cid:1) (cid:17) , (D6)where C ( k )fsVFF := 1 − | (cid:104) ψ | W D k W † U k | ψ (cid:105) | (D7)and D l ± := D ( γ l ± ) with ( γ l ± ) i := γ l ± π δ i,l . (D8) Derivative with respect to γ l . Here we provide the derivation of the partial derivative of C fsVFF ( U, V, ψ ) withrespect to γ l in (D6). Taking the partial derivative of C ( k )fsVFF ( θ , γ ) with respect to an angle γ l gives ∂C ( k )fsVFF ∂γ l = − Tr (cid:20) X (cid:18) W ∂D ( k γ ) ∂γ l W † (cid:19) | Y k (cid:105)(cid:104) Y k | (cid:0) W D ( k γ ) † W † (cid:1)(cid:21) − Tr (cid:20) X (cid:0) W D ( k γ ) W † (cid:1) | Y k (cid:105)(cid:104) Y k | (cid:18) W ∂D ( k γ ) † ∂γ l W † (cid:19)(cid:21) . (D9)The eigenvector operator, D , consists of products of Pauli rotations and can be decomposed as D = D L exp (cid:18) − ikγ l σ l (cid:19) D R (cid:48) ≡ D L D R , (D10)where the operators D L and D R (cid:48) consist of all Pauli rotations to the left and right of the σ l rotation respectively andwe have deﬁned D R = exp( − iθ l σ l / D R (cid:48) for convenience. It follows that the diﬀerential of W with respect to θ l takesthe form ∂D∂γ l = − ikD L σ l D R , (D11)which on substituting into Eq. (D9) gives ∂C ( k )fsVFF ∂γ l = ik (cid:18) Tr (cid:2) XW D L σ l D R W † | Y k (cid:105)(cid:104) Y k | W D † W † (cid:3) − Tr (cid:104) XW DW † | Y k (cid:105)(cid:104) Y k | W D † R σ l D † L W † (cid:105) (cid:19) = ik (cid:104) XW D L (cid:16) σ l D R W † | Y k (cid:105)(cid:104) Y k | W D † R − D R W † | Y k (cid:105)(cid:104) Y k | W D † R σ l (cid:17) D † L W † (cid:105) = ik (cid:104) XW D L [ σ l , ρ ( k )1 ] D † L W † (cid:105) , (D12)8where we have deﬁned ρ ( k )1 = D R W † | Y k (cid:105)(cid:104) Y k | W D † R . (D13)Eq. (D6) is now obtained directly from Eq. (D12) via the following identity, which holds for any state ρ , i [ σ l , ρ ] = e iσ l π/ ρe − iσ l π/ − e − iσ l π/ ρe iσ l π/ . (D14)Speciﬁcally we ﬁnd that ∂C ( k )fsVFF ∂γ l = k (cid:104) X ⊗ W D L ( e iσ l π/ D R W † | Y k (cid:105)(cid:104) Y k | W D † R e − iσ l π/ ) D L W † (cid:105) − k (cid:104) XW D L ( e − iσ l π/ D R W † | Y k (cid:105)(cid:104) Y k | W D † R e iσ l π/ ) D L W † (cid:105) = k (cid:2) X ⊗ W D l − W † | Y k (cid:105)(cid:104) Y k | W D l − W † (cid:3) − k (cid:2) X ⊗ W D l + W † | Y k (cid:105)(cid:104) Y k | W D l + W † (cid:3) = k (cid:16) C ( k )fsVFF (cid:0) U, W D l + W † (cid:1) − C ( k )fsVFF (cid:0) U, W D l − W † (cid:1) (cid:17) . (D15)Thus we are left with ∂C fsVFF ∂γ l = 1 n eig n eig (cid:88) k =1 ∂C ( k )fsVFF ∂γ l = 1 n eig n eig (cid:88) k =1 k (cid:16) C ( k )fsVFF (cid:0) U, W D l + W † (cid:1) − C ( k )fsVFF (cid:0) U, W D l − W † (cid:1) (cid:17) . (D16) Derivative with respect to θ l . Here we provide the derivation of the partial derivative of C fsVFF ( U, V, ψ ) withrespect to θ l in (D4). Taking the partial derivative of C ( k )fsVFF ( θ ) with respect to an angle θ l gives ∂C ( k )fsVFF ∂θ l = − Tr (cid:20) X (cid:18) ∂W∂θ l DW † (cid:19) | Y k (cid:105)(cid:104) Y k | (cid:0) W D † W † (cid:1)(cid:21) − Tr (cid:20) X (cid:0) W DW † (cid:1) | Y k (cid:105)(cid:104) Y k | (cid:18) W D † ∂W † ∂θ l (cid:19)(cid:21) − Tr (cid:20) X (cid:18) W D ∂W † ∂θ l (cid:19) | Y k (cid:105)(cid:104) Y k | (cid:0) W D † W † (cid:1)(cid:21) − Tr (cid:20) X (cid:0) W DW † (cid:1) | Y k (cid:105)(cid:104) Y k | (cid:18) ∂W∂θ l D † W † (cid:19)(cid:21) . (D17)The eigenvector operator, W , consists of products of Pauli rotations and can be decomposed as W = W L exp (cid:18) − iθ l σ l (cid:19) W R (cid:48) ≡ W L W R , (D18)where the operators W L and W R (cid:48) consist of all Pauli rotations to the left and right of the σ l rotation respectivelyand we have deﬁned W R = exp( − iθ l σ l / W R (cid:48) for convenience. It follows that the diﬀerential of W with respect to θ l takes the form ∂W∂θ l = − iW L σ l W R , (D19)which on substituting into Eq. (D17) gives ∂C ( k )fsVFF ∂θ l = i (cid:18) Tr (cid:104) XW L [ σ l , ρ ( k )2 ] W † L (cid:105) − Tr (cid:104) XW DW † R [ σ l , ρ ( k )3 ] W R D † W † (cid:105) (cid:19) , (D20)where we have deﬁned ρ ( k )2 = W R DW † | Y k (cid:105)(cid:104) Y k | W D † W † R and ρ ( k )3 = W † l | Y k (cid:105)(cid:104) Y k | W L . (D21)Eq. (D4) is now obtained directly from Eq. (D20) via Eq. (D14).9 Appendix E: Fast forwarding error

Let us start by writing the state | ψ (cid:105) as | ψ (cid:105) = (cid:32) v ψ (cid:33) to emphasise that | ψ (cid:105) spans a subspace of the total Hilbert space. We then note that in the limit in which leakagecan be disregarded the evolution unitary can be written in terms of its block decomposition U = (cid:32) U (cid:107) U ⊥ (cid:33) . where U (cid:107) acts on the subspace spanned by ψ and U ⊥ acts on the rest of the Hilbert space. Let us also write thelearnt unitary in a block decomposition form as V = (cid:32) V (cid:107) (cid:15) a (cid:15) b V ⊥ (cid:33) . In general, the oﬀ-diagonal blocks are expected to be close to the null matrices, therefore we can take a perturbativeapproach to calculating V N . Expanding V N to ﬁrst order in (cid:15) a and (cid:15) b gives V N = (cid:32) V (cid:107) (cid:15) a (cid:15) b V ⊥ (cid:33) N = (cid:32) V N (cid:107) V (cid:15) a ( N ) V (cid:15) b ( N ) V N ⊥ (cid:33) , where V (cid:15) a ( N ) := V N − (cid:107) (cid:15) a V (cid:107) + N − (cid:88) k =1 V N − − k (cid:107) (cid:15) a V k ⊥ V (cid:15) b ( N ) := V N − ⊥ (cid:15) b V ⊥ + N − (cid:88) k =1 V N − − k ⊥ (cid:15) b V k (cid:107) . (E1)Therefore we can now write V N | ψ (cid:105) = (cid:32) V N (cid:107) V (cid:15) a ( N ) V (cid:15) b ( N ) V N ⊥ (cid:33) (cid:32) v ψ (cid:33) = (cid:32) V N (cid:107) v ψ V (cid:15) b ( N ) v ψ (cid:33) . We are interested in the fast forwarded simulation ﬁdelity F N = | (cid:104) ψ | V † N U N | ψ (cid:105) | , (E2)which to ﬁrst order in (cid:15) a and (cid:15) b evaluates to F N = | (cid:104) ψ | V †(cid:107) N U N (cid:107) | ψ (cid:105) | . (E3)The ﬁdelity between any two quantum states | ψ a (cid:105) and | ψ b (cid:105) can be related to the trace norm distance between theexact and simulated fast forwarded states via the relation − | (cid:104) ψ a | ψ b (cid:105) | = (cid:18) || ψ a − ψ b || (cid:19) , (E4)where we use the shorthand ψ a = | ψ a (cid:105) (cid:104) ψ a | and ψ b = | ψ b (cid:105) (cid:104) ψ b | . Therefore we can write − F N = || U N (cid:107) ψ U †(cid:107) N − V N (cid:107) ψ V †(cid:107) N || . (E5)We further note that we can write || U N (cid:107) ψ U †(cid:107) N − V N (cid:107) ψ V †(cid:107) N || = || ( U N (cid:107) − V N (cid:107) ) ψ U †(cid:107) N + V N (cid:107) ψ (cid:16) U †(cid:107) N − V †(cid:107) N (cid:17) || , (E6)0 Timesteps, N − − − − I nﬁd e li t y , − F Slope = 2

FIG. 9.

Quadratic Scaling of Inﬁdelity : The 3-qubit diagonalization used to perform the Eigenvalue Estimation fromFigure 8 is fast-forwarded with its inﬁdelity, 1 - F , plotted against the number of timesteps, N . The close alignment ofinﬁdelity against the dotted line with form c × N reproduces the scaling of Eq. E9. and therefore on applying Holder’s inequality, || XY || (cid:54) || X || ∞ || Y || , we have that || U N (cid:107) ψ U †(cid:107) N − V N (cid:107) ψ V †(cid:107) N || (cid:54) || U N (cid:107) − V N (cid:107) || ∞ (cid:16) || ψ U †(cid:107) N || + || ψ V (cid:107) N || (cid:17) . (E7)Now from Lemma 1 of [21] we have that || U N (cid:107) − V N (cid:107) || ∞ (cid:54) N || U (cid:107) − V (cid:107) || ∞ and from the unitary invariance of theSchatten norms we have that || ψ U †(cid:107) N || = || ψ V †(cid:107) N || = || ψ || = 1 , therefore we are left with || U N (cid:107) ψ U †(cid:107) N − V N (cid:107) ψ V †(cid:107) N || (cid:54) N || U (cid:107) − V (cid:107) || ∞ . (E8)We conclude that the ﬁnal simulation ﬁdelity is bounded as − F N (cid:54) N (cid:0) || U (cid:107) − V (cid:107) || ∞ (cid:1) ..