Universal Quantum Emulator

Abstract

We propose a quantum algorithm that emulates the action of an unknown unitary transformation on a given input state, using multiple copies of some unknown sample input states of the unitary and their corresponding output states. The algorithm does not assume any prior information about the unitary to be emulated, or the sample input states. To emulate the action of the unknown unitary, the new input state is coupled to the given sample input-output pairs in a coherent fashion. Remarkably, the runtime of the algorithm is logarithmic in D, the dimension of the Hilbert space, and increases polynomially with d, the dimension of the subspace spanned by the sample input states. Furthermore, the sample complexity of the algorithm, i.e. the total number of copies of the sample input-output pairs needed to run the algorithm, is independent of D, and polynomial in d. In contrast, the runtime and the sample complexity of incoherent methods, i.e. methods that use tomography, are both linear in D. The algorithm is blind, in the sense that at the end it does not learn anything about the given samples, or the emulated unitary. This algorithm can be used as a subroutine in other algorithms, such as quantum phase estimation.


Iman Marvian and Seth Lloyd

Research Laboratory of Electronics, Massachusetts Institute of Technology, Cambridge, MA 02139
Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139
(Dated: June 10, 2016)

I. INTRODUCTION

A universal quantum simulator is a machine that can be programmed to mimic the dynamics of other quantum systems [1]. The time evolution of the simulator obeys the same equations of motion as the evolution of the simulated system. A universal quantum emulator, on the other hand, is a machine that mimics the input-output relation of another system by observing the output of that system on some sample input states. Unlike a simulator, an emulator does not need to obey the same dynamical equations as the emulated system.

In this Letter we introduce a quantum algorithm that emulates the action of an unknown unitary transformation on new given input states. The algorithm couples the new input state to multiple copies of some unknown sample input-output pairs, that is, copies of some input states of the unitary as well as copies of the corresponding output states. We do not assume any prior information about the unitary to be emulated, or about the given sample input states. The algorithm emulates the action of the unitary on any given state in the subspace spanned by the previously given input states, which could be much smaller than the system Hilbert space. Indeed, we are interested in the cases where $d$, the dimension of this subspace, is constant or, at most, polylogarithmic in $D$, the dimension of the system Hilbert space.

Obviously, having multiple copies of sample input-output pairs, we can perform measurements on them and, using state tomography, find an approximate classical description of these states in a standard basis. This, in turn, yields the classical description of the unknown unitary transformation, which can then be used to simulate its action on the new given states. This approach, however, is highly inefficient and impractical. First, state tomography in a large Hilbert space is a hard task and requires many copies of the sample states.
Second, even if we find an approximate classical description of the unitary transformation, or if we are given its exact description, in general we cannot implement this transformation efficiently on a new given state.

More precisely, the approaches based on tomography run in time $\Omega(D)$ and need $\Omega(D)$ copies of the sample states, where $D$ is the dimension of the system Hilbert space. In contrast, the runtime of the algorithm proposed in this work is $O(\log D)$ and polynomial in $d$, and its sample complexity, i.e. the total number of copies of the sample input-output pairs that are needed to run the algorithm, is independent of $D$ and polynomial in $d$. Therefore, our algorithm is not only exponentially faster than the approaches based on tomography; its sample complexity is also dramatically lower.

It is interesting to compare this result with the scenario studied in [2], where one wants to learn an unknown unitary $U$ by applying it a finite number of times to some quantum states, so that later, when access to $U$ is no longer available, its effect on a new input state can be reproduced. It turns out that the strategy that maximizes the average fidelity, where the average is taken over all states in the Hilbert space, is an incoherent measure-and-rotate strategy, i.e. a method that uses tomography [2]. In contrast, our work shows that under the practical assumption that the action of the unitary should be emulated in a low-dimensional subspace, and not the entire Hilbert space, coherent methods are much more powerful than incoherent ones.

This algorithm is blind, in the sense that at the end it does not learn anything about the given samples or the emulated unitary. Although the algorithm uses a randomized technique, it always generates an output with high fidelity with the desired output state. Therefore, it can be used as a subroutine in larger algorithms, such as quantum phase estimation.
Another possible application of this algorithm is to cancel the effect of an unknown unitary channel, without performing process tomography and finding the description of the unitary. Furthermore, the algorithm could be useful in cases where we can efficiently prepare the inputs and the corresponding outputs of a unitary transformation, but we do not know how to implement the unitary itself.

II. PRELIMINARIES

We first present the algorithm for the special case of pure sample states, and later explain how it can be generalized to the case of mixed states.

Let $S_{\rm in} = \{|\phi^{\rm in}_k\rangle\langle\phi^{\rm in}_k| : k = 1, \cdots, K\}$ be a set of sample input states of the unitary $U$, and let $S_{\rm out} = \{|\phi^{\rm out}_k\rangle\langle\phi^{\rm out}_k| = U|\phi^{\rm in}_k\rangle\langle\phi^{\rm in}_k|U^\dagger : k = 1, \cdots, K\}$ be the corresponding outputs. Let $\mathcal{H}_{\rm in}$ and $\mathcal{H}_{\rm out}$ be the subspaces spanned by $\{|\phi^{\rm in}_k\rangle : k = 1, \cdots, K\}$ and $\{|\phi^{\rm out}_k\rangle : k = 1, \cdots, K\}$ respectively, and let $d$ be the dimension of these subspaces.

We assume the set of input samples $S_{\rm in}$ contains a sufficient number of different states to uniquely determine the action of $U$ on the subspace $\mathcal{H}_{\rm in}$ (up to a global phase). It can be easily shown that, having the classical description of the input and output states in $S_{\rm in}$ and $S_{\rm out}$, we can uniquely determine the action of $U$ on any input state $|\psi\rangle \in \mathcal{H}_{\rm in}$ (up to a global phase) if and only if the matrix algebra generated by $S_{\rm in}$, that is the set of polynomials in the elements of $S_{\rm in}$, is the full matrix algebra on $\mathcal{H}_{\rm in}$, i.e. contains all operators with supports contained in $\mathcal{H}_{\rm in}$ (footnote 1). Therefore, in the following we assume this condition is satisfied. Furthermore, we assume $K$, the number of different sample input states in $S_{\rm in}$, is ${\rm poly}(d)$.

To implement the algorithm, we need multiple copies of each sample state in $S_{\rm in}$ and $S_{\rm out}$. Interestingly, at the end of the algorithm most of these states remain almost unaffected.
Indeed, the main use of the given copies of sample states is to simulate controlled-reflections about these states.

Let $R^{\rm in}(k) = e^{i\pi|\phi^{\rm in}_k\rangle\langle\phi^{\rm in}_k|}$ and $R^{\rm out}(k) = e^{i\pi|\phi^{\rm out}_k\rangle\langle\phi^{\rm out}_k|}$ be the reflections about the input and output states $|\phi^{\rm in}_k\rangle$ and $|\phi^{\rm out}_k\rangle$, respectively. In the proposed algorithm we need to implement the controlled-reflections $R^{\rm in}_a(k)$ and $R^{\rm out}_a(k)$, defined as

$$R_a(k) = |0\rangle\langle 0|_a \otimes I + |1\rangle\langle 1|_a \otimes e^{i\pi|\phi_k\rangle\langle\phi_k|}, \qquad (2.1)$$

where $a$ is the label for the control qubit, and $I$ is the identity operator on the main system. Note that we have suppressed the superscripts in and out on both sides.

Footnote 1: If we know how the unitary $U$ transforms the elements of $S_{\rm in}$, then we also know how it transforms any operator in the matrix algebra generated by the elements of $S_{\rm in}$. Therefore, if $S_{\rm in}$ generates the full matrix algebra on $\mathcal{H}_{\rm in}$, then we know how $U$ transforms any density operator with support contained in $\mathcal{H}_{\rm in}$. On the other hand, if $S_{\rm in}$ does not generate the full matrix algebra on $\mathcal{H}_{\rm in}$, then its commutant contains unitaries that are block-diagonal with respect to $\mathcal{H}_{\rm in}$ and act non-trivially on this subspace. For any such unitary $V$, the unitaries $U$ and $UV$ act in exactly the same way on all the input states in the set $S_{\rm in}$. Therefore, having only the classical description of the states in $S_{\rm in}$ and $S_{\rm out}$, we cannot distinguish the unitaries $UV$ and $U$, for any unitary $V$ in the commutant of $S_{\rm in}$, even though they act differently on states in $\mathcal{H}_{\rm in}$. This proves the claim.

Using the given copies of the sample states, we can efficiently simulate these controlled-reflections via the density matrix exponentiation technique of [3].
Based on this technique, in the Supplementary Material we present a new simple quantum circuit that uses $n$ copies of an unknown state $\sigma$ to simulate the unitary $e^{-it\sigma}$, or its controlled version $|0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes e^{-it\sigma}$, for any real $t$, with error $\epsilon = O((t+1)/n)$, and in time $O(n\log(n) \times \log(D))$, where $D$ is the dimension of the Hilbert space (see Appendix C). This circuit only uses single-qubit gates and controlled-Fredkin gates, i.e. controlled-controlled-SWAP gates [4]. In the simplest case where the system is a qubit ($D = 2$), this circuit basically simulates the Heisenberg interaction between the system and each given copy of the state $\sigma$.

Therefore, in the following, where we present the algorithm, we assume all the controlled-reflections $\{R_a(k) : 1 \le k \le K\}$ can be efficiently implemented.

To simplify the presentation, we use the notation $W_a(k) \equiv R_a(k)\, H_a\, R_a(1)$, where again we have suppressed the in and out superscripts on both sides. Here $H_a$ denotes the Hadamard gate $H$ acting on qubit $a$, where $H|0\rangle = |+\rangle$ and $H|1\rangle = |-\rangle$, and $|\pm\rangle = (|0\rangle \pm |1\rangle)/\sqrt{2}$.
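To illustrate how copies of a state can drive the unitary it generates, the following is a minimal single-qubit sketch. It assumes the standard partial-swap construction for density matrix exponentiation, in which each step applies $e^{-i\delta\,{\rm SWAP}}$ to the system and one fresh copy of $\sigma$ and then discards the copy; this is not the circuit of Appendix C, and all function names are hypothetical. With $t = \pi$ and $\sigma = |0\rangle\langle 0|$ the target operation is the reflection about $|0\rangle$.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def partial_swap_step(rho, sigma, delta):
    """Reduced system state after exp(-i*delta*SWAP) acts on rho (x) sigma.

    Closed form: cos^2(delta)*rho + sin^2(delta)*sigma
                 - i*sin(delta)*cos(delta)*[sigma, rho].
    """
    c, s = math.cos(delta), math.sin(delta)
    sr, rs = matmul(sigma, rho), matmul(rho, sigma)
    return [[c * c * rho[i][j] + s * s * sigma[i][j]
             - 1j * s * c * (sr[i][j] - rs[i][j])
             for j in range(2)] for i in range(2)]

def exponentiate(rho, sigma, t, n):
    """Approximate exp(-i*t*sigma) rho exp(+i*t*sigma) using n copies of sigma."""
    for _ in range(n):
        rho = partial_swap_step(rho, sigma, t / n)
    return rho

def max_abs_diff(a, b):
    """Largest entrywise deviation between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

sigma = [[1, 0], [0, 0]]               # sample state |0><0|
rho0 = [[0.5, 0.5], [0.5, 0.5]]        # system state |+><+|
target = [[0.5, -0.5], [-0.5, 0.5]]    # the reflection about |0> maps |+> to |->
errors = {n: max_abs_diff(exponentiate(rho0, sigma, math.pi, n), target)
          for n in (25, 100, 400)}
```

In this toy case the deviation from the ideal reflection shrinks as the number of copies $n$ grows, in line with an error that decreases inversely with $n$.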

III. THE ALGORITHM

In this section we present and analyze the algorithm for the universal quantum emulator in the special case where all the sample input-output pairs are pure states. Later, we present several generalizations of this algorithm, including to the case where the given samples contain mixed states.

Fig. 1 exhibits the quantum circuit that emulates the action of an unknown unitary transformation $U$ on any given state $|\psi\rangle$ in the input subspace $\mathcal{H}_{\rm in}$. For a general input state, which is not restricted to this subspace, this circuit first projects the state to this subspace and, if successful, then applies the unitary $U$ to it.

In this algorithm $(k_1, \cdots, k_T)$ are $T$ integers chosen uniformly at random from the integers $1, \cdots, K$, where $T$ is a constant that determines the precision of emulation; we choose it to be polynomial in $d$ and independent of $D$. Furthermore, the state $|\phi^{\rm in}_1\rangle$ (and $|\phi^{\rm out}_1\rangle$) is one of the sample input states (and its corresponding output), which is chosen randomly at the beginning of the algorithm and is fixed during the algorithm.

FIG. 1: The quantum circuit for emulating the unitary transformation $U$ for the special case of pure input-output sample pairs. Here $k_1, \cdots, k_T$ are $T = {\rm poly}(d)$ integers chosen uniformly at random from the integers $1, \cdots, K$. We use the given copies of sample states in $S_{\rm in}$ and $S_{\rm out}$ to simulate the controlled-reflections $R^{\rm in}_a(k)$ and $R^{\rm out}_a(k)$, respectively.

In steps (i) and (iv) of the algorithm we implement, respectively, the unitaries $W^{\rm in}_{a_i}(k_i)$ and $W^{{\rm out}\,\dagger}_{a_i}(k_i)$ on the system and qubit $a_i$, for $i = 1, \cdots, T$. As we explained before, all the controlled-reflections $R^{\rm in}_a(k)$ and $R^{\rm out}_a(k)$ can be efficiently simulated using the given copies of the states $|\phi^{\rm in}_k\rangle$ and $|\phi^{\rm out}_k\rangle$.

In step (ii) of the algorithm we perform a qubit measurement in the computational basis $\{|0\rangle, |1\rangle\}$.
Then, after the measurement, with probability $1 - \langle\psi|\Pi_{\rm in}|\psi\rangle$ we get outcome $b = 1$, in which case we project the system to a state close to $(I - \Pi_{\rm in})|\psi\rangle / \sqrt{1 - \langle\psi|\Pi_{\rm in}|\psi\rangle}$, where $\Pi_{\rm in}$ is the projector to the subspace $\mathcal{H}_{\rm in}$. On the other hand, with probability $\langle\psi|\Pi_{\rm in}|\psi\rangle$ we get the outcome $b = 0$, in which case the final state of the circuit is close to $U\Pi_{\rm in}|\psi\rangle / \sqrt{\langle\psi|\Pi_{\rm in}|\psi\rangle}$. In this case the algorithm consumes a copy of state $|\phi^{\rm out}_1\rangle$ and returns a copy of state $|\phi^{\rm in}_1\rangle$.

Note that, although the algorithm uses random integers $(k_1, \cdots, k_T)$, for sufficiently large $T$ it always transforms the input state $|\psi\rangle \in \mathcal{H}_{\rm in}$ to a state with high fidelity with the desired output state $U|\psi\rangle$.

A. How it works

To simplify the following discussion, we first assume the initial state $|\psi\rangle$ is in the input subspace $\mathcal{H}_{\rm in}$, and then we consider the general case.

To understand step (i) of the algorithm, we first focus on the reduced state of the system. From this point of view, during step (i) we are trying to erase the state of the system and push it into $|\phi^{\rm in}_1\rangle$, a state which is chosen randomly from the sample input set $S_{\rm in}$. This erasing is done by repetitive measuring and conditional mixing.

Consider the first round of step (i), where we have randomly chosen an integer $k_1$ in $1, \cdots, K$, and apply the unitary $W_{a_1}(k_1) = R_{a_1}(k_1)\, H_{a_1}\, R_{a_1}(1)$ to the system and qubit $a_1$ initially prepared in state $|-\rangle$. The effect of this transformation on the reduced state of the system can be interpreted in the following way: we perform a measurement in the $\{P = |\phi^{\rm in}_1\rangle\langle\phi^{\rm in}_1|,\ P_\perp = I - P\}$ basis, and if the system is found in state $|\phi^{\rm in}_1\rangle$, we leave it unchanged; otherwise, we apply the random reflection $R^{\rm in}(k_1)$ to it. Tracing over qubit $a_1$, and averaging over the values of $k_1$, we find that the overall effect of this transformation on the reduced state of the system can be described by the quantum channel $\mathcal{W}(\rho) = P\rho P + \mathcal{D}(P_\perp \rho P_\perp)$, where

$$\mathcal{D}(\rho) = \frac{1}{K}\sum_{k=1}^{K} R^{\rm in}(k)\, \rho\, R^{\rm in}(k). \qquad (3.1)$$

In step (i) we repeat the above procedure $T$ times with $T$ different ancillary qubits and random integers. Using the facts that (1) the ancillary qubits are initially uncorrelated with each other, and (2) the different random integers $k_1, \cdots, k_T$ are statistically independent of each other, we find that at the end of step (i) the average reduced state of the system is described by the state $\mathcal{W}^T(|\psi\rangle\langle\psi|)$.

Next, recall the assumption that the input set $S_{\rm in}$ generates the full matrix algebra in $\mathcal{H}_{\rm in}$, an assumption which is crucial for being able to uniquely determine the action of $U$ on $\mathcal{H}_{\rm in}$.
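The erasing picture can be checked numerically on a toy example. The sketch below iterates the channel $\mathcal{W}$ of Eq. (3.1) for two real qubit samples and tracks the overlap of the reduced state with the first sample. It assumes real amplitudes, for which $e^{i\pi|\phi\rangle\langle\phi|} = I - 2|\phi\rangle\langle\phi|$ is a real matrix; the sample angles and all function names are hypothetical.

```python
import math

def outer(u, v):
    """Matrix |u><v| for real vectors."""
    return [[a * b for b in v] for a in u]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def reflection(phi):
    """R = exp(i*pi*|phi><phi|) = I - 2|phi><phi| for a real unit vector phi."""
    n = len(phi)
    return [[(1.0 if i == j else 0.0) - 2 * phi[i] * phi[j] for j in range(n)]
            for i in range(n)]

def channel_W(rho, samples):
    """One round of step (i) on the reduced state: P rho P + D(P_perp rho P_perp)."""
    n = len(rho)
    P = outer(samples[0], samples[0])
    P_perp = [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)]
              for i in range(n)]
    kept = matmul(matmul(P, rho), P)
    rest = matmul(matmul(P_perp, rho), P_perp)
    out = [row[:] for row in kept]
    for phi in samples:                     # average over k = 1..K, Eq. (3.1)
        R = reflection(phi)
        term = matmul(matmul(R, rest), R)
        for i in range(n):
            for j in range(n):
                out[i][j] += term[i][j] / len(samples)
    return out

def erase_probability(rho, samples, T):
    """Overlap <phi_1| W^T(rho) |phi_1> with the first sample phi_1."""
    for _ in range(T):
        rho = channel_W(rho, samples)
    phi1 = samples[0]
    return sum(phi1[i] * rho[i][j] * phi1[j]
               for i in range(len(phi1)) for j in range(len(phi1)))

samples = [[1.0, 0.0], [math.cos(0.6), math.sin(0.6)]]   # phi_1 and phi_2
rho0 = [[0.0, 0.0], [0.0, 1.0]]                          # orthogonal to phi_1
probs = [erase_probability(rho0, samples, T) for T in (1, 5, 20)]
```

For this pair of samples the weight outside $|\phi^{\rm in}_1\rangle$ shrinks by the constant factor $(1 + \cos^2 1.2)/2$ in every round, so the overlap approaches one geometrically in $T$.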
Interestingly, this assumption now translates to the fact that the channels $\mathcal{D}$ and $\mathcal{W}$ have unique fixed point states with support contained in $\mathcal{H}_{\rm in}$, namely the totally mixed state in $\mathcal{H}_{\rm in}$ and $|\phi^{\rm in}_1\rangle\langle\phi^{\rm in}_1|$, respectively (footnote 2). Furthermore, since the reflections $R^{\rm in}(k)$ are block-diagonal with respect to the input subspace $\mathcal{H}_{\rm in}$, the channels $\mathcal{D}$ and $\mathcal{W}$ map any initial state inside $\mathcal{H}_{\rm in}$ to a state with support contained in this subspace. It follows that for any initial state $|\psi\rangle \in \mathcal{H}_{\rm in}$, at the end of step (i) with probability almost one the system should be in state $|\phi^{\rm in}_1\rangle$. More precisely, the probability of finding the system in a state orthogonal to $|\phi^{\rm in}_1\rangle$ is exponentially small in $T$.

Footnote 2: The fixed points of the unital channel $\mathcal{D}$ are the commutants of its Kraus operators, i.e. the commutants of the reflections $\{R^{\rm in}(k) = I - 2|\phi^{\rm in}_k\rangle\langle\phi^{\rm in}_k| : k = 1, \cdots, K\}$. Since $\{|\phi^{\rm in}_k\rangle\langle\phi^{\rm in}_k| : k = 1, \cdots, K\}$ generates the full matrix algebra on $\mathcal{H}_{\rm in}$, it turns out that the only fixed points of $\mathcal{D}$ with support contained in $\mathcal{H}_{\rm in}$ are multiples of the identity operator on this subspace. In the case of $\mathcal{W}$, the definition $\mathcal{W}(\tau) = P\tau P + \mathcal{D}(P_\perp \tau P_\perp)$ implies that any fixed point $\tau$ of this channel satisfies $P_\perp \mathcal{D}(P_\perp \tau P_\perp) P_\perp = P_\perp \tau P_\perp$. Since $\mathcal{D}$ is trace-preserving, this can hold if and only if $P_\perp \tau P_\perp = 0$, or $P_\perp \tau P_\perp$ is a fixed point of $\mathcal{D}$. The fact that the only fixed points of $\mathcal{D}$ with support contained in $\mathcal{H}_{\rm in}$ are the multiples of the identity operator on $\mathcal{H}_{\rm in}$ implies that the latter case is impossible. We conclude that the unique fixed point state of $\mathcal{W}$ with support contained in $\mathcal{H}_{\rm in}$ is $|\phi^{\rm in}_1\rangle\langle\phi^{\rm in}_1|$.

To summarize, in step (i) we erase the initial state of the system and push it into state $|\phi^{\rm in}_1\rangle$. The fact that we have enough resources to uniquely determine the action of the unitary $U$ on any state in the input subspace $\mathcal{H}_{\rm in}$ translates to the fact that any state in this subspace can be erased in a coherent unitary fashion.

But quantum information is conserved during a unitary evolution. This means that all the information about state $|\psi\rangle$ should now be encoded in the ancillary qubits $\mathbf{a} = a_1 \cdots a_T$. Furthermore, since we start with a global pure state, and since at the end of step (i) the reduced state of the system is close to a pure state, we find that the joint reduced state of the ancillary qubits $\mathbf{a}$ should also be close to a pure state, denoted by $|\Psi(\mathbf{k})\rangle_{\mathbf{a}}$. Note that in addition to state $|\psi\rangle$, the state $|\Psi(\mathbf{k})\rangle_{\mathbf{a}}$ also depends on the sample set $S_{\rm in}$ and the random integers $\mathbf{k} \equiv (k_1, \cdots, k_T)$. We conclude that at the end of step (i) with high probability the system and ancillary qubits $\mathbf{a}$ are in the product state

$$\left[ W^{\rm in}_{a_T}(k_T) \cdots W^{\rm in}_{a_1}(k_1) \right] |\psi\rangle |-\rangle^{\otimes T} \approx |\phi^{\rm in}_1\rangle |\Psi(\mathbf{k})\rangle_{\mathbf{a}}. \qquad (3.2)$$

Next, in step (ii) we basically perform a measurement on the system in the $\{P = |\phi^{\rm in}_1\rangle\langle\phi^{\rm in}_1|,\ P_\perp = I - P\}$ basis. At this point we know that for any initial state $|\psi\rangle \in \mathcal{H}_{\rm in}$ with probability almost one the system should be in state $|\phi^{\rm in}_1\rangle$, in which case the measurement projects the ancillary qubit to state $|0\rangle$. On the other hand, if we project the qubit to state $|1\rangle$, it means the erasing has not been successful.

Finally, assuming $|\psi\rangle \in \mathcal{H}_{\rm in}$, we find that applying steps (iii) and (iv) maps the system to state $U|\psi\rangle$. This can be seen by multiplying both sides of Eq. (3.2) by the unitary $U$, and using the facts that $U|\phi^{\rm in}_1\rangle = |\phi^{\rm out}_1\rangle$ and $(U \otimes I_a)\, W^{\rm in}_a(k)\, (U^\dagger \otimes I_a) = W^{\rm out}_a(k)$. This implies

$$\left[ W^{{\rm out}\,\dagger}_{a_1}(k_1) \cdots W^{{\rm out}\,\dagger}_{a_T}(k_T) \right] |\phi^{\rm out}_1\rangle |\Psi(\mathbf{k})\rangle_{\mathbf{a}} \approx U|\psi\rangle |-\rangle^{\otimes T}. \qquad (3.3)$$

Eq. (3.3) means that, after step (ii), by preparing the system in the output sample state $|\phi^{\rm out}_1\rangle$, which is given to us, and running all the operations in step (i) backward, with the unitaries $R^{\rm in}_a(k)$ replaced by $R^{\rm out}_a(k)$, we get state $U|\psi\rangle$ at the end.

Using the fact that all unitaries $W_a(k) = R_a(k)\, H_a\, R_a(1)$ act trivially on the subspace orthogonal to $\mathcal{H}_{\rm in}$, it can be easily seen that for a general input $|\psi\rangle$, which is not contained in $\mathcal{H}_{\rm in}$, the algorithm first performs a projective measurement that projects the input state to the subspace $\mathcal{H}_{\rm in}$ or to the orthogonal subspace. Then, if the system is found to be in $\mathcal{H}_{\rm in}$, which corresponds to outcome $b = 0$ in the measurement in step (ii), it applies the unitary $U$ to the component of the state in this subspace.

In the Supplementary Material we prove the following quantitative version of the above argument: Suppose we implement the quantum circuit presented in Fig. 1 using perfect controlled-reflections $R_a(k)$. Let $\mathcal{E}_U$ be the quantum channel that describes the overall effect of the circuit on the system, in the case where we do not postselect based on the outcome of the measurement in step (ii), which means we do not care whether the erasing has been successful or not. Then, for an arbitrary input state $\rho$, the (Uhlmann) fidelity [5, 6] of $\mathcal{E}_U(\rho)$ and the desired state $U\rho U^\dagger$ satisfies

$$F(\mathcal{E}_U(\rho), U\rho U^\dagger) \ge p_{\rm erase}(\rho) = \langle\phi^{\rm in}_1| \mathcal{W}^T(\rho) |\phi^{\rm in}_1\rangle, \qquad (3.4)$$

where $p_{\rm erase}(\rho)$ is the probability that we have successfully erased the state of the system (and pushed it into state $|\phi^{\rm in}_1\rangle$), which corresponds to outcome 0 in the measurement (see Appendix A).
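The four steps can be traced end to end in a small state-vector simulation. The sketch below makes several simplifying assumptions: reflections are applied exactly instead of being simulated from copies, amplitudes are real, the measurement of step (ii) is modeled as an ideal projection of the system onto $|\phi^{\rm in}_1\rangle$ with post-selection on success, and step (iii) is modeled by substituting $|\phi^{\rm out}_1\rangle$ for $|\phi^{\rm in}_1\rangle$. The toy unitary, the sample vectors and all function names are hypothetical.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(phi, v):
    """Apply R = exp(i*pi*|phi><phi|) = I - 2|phi><phi| to a real vector v."""
    c = dot(phi, v)
    return [x - 2 * c * p for x, p in zip(v, phi)]

def controlled_reflect(state, bit, phi):
    """Reflect the system about phi on branches where ancilla `bit` is 1."""
    return {key: reflect(phi, vec) if (key >> bit) & 1 else vec
            for key, vec in state.items()}

def hadamard(state, bit):
    """Hadamard on ancilla `bit`; `state` maps ancilla bitstrings to system vectors."""
    out = {}
    s = 1 / math.sqrt(2)
    for key, vec in state.items():
        old = (key >> bit) & 1
        for new in (0, 1):
            sign = -1 if old and new else 1
            new_key = (key & ~(1 << bit)) | (new << bit)
            acc = out.setdefault(new_key, [0.0] * len(vec))
            for i, x in enumerate(vec):
                acc[i] += sign * s * x
    return out

def emulate(psi, samples_in, samples_out, ks):
    """Steps (i)-(iv) with ideal reflections; ks holds 0-based sample indices."""
    T = len(ks)
    state = {(1 << T) - 1: list(psi)}            # every ancilla starts in |1>
    for i, k in enumerate(ks):                   # step (i): W^in on ancilla i
        state = hadamard(state, i)               # prepares |-> from |1>
        state = controlled_reflect(state, i, samples_in[0])
        state = hadamard(state, i)
        state = controlled_reflect(state, i, samples_in[k])
    # steps (ii)-(iii): post-select the system on phi_in_1, substitute phi_out_1
    coeffs = {key: dot(samples_in[0], vec) for key, vec in state.items()}
    p_success = sum(c * c for c in coeffs.values())
    state = {key: [c * x for x in samples_out[0]] for key, c in coeffs.items()}
    for i in reversed(range(T)):                 # step (iv): inverse of W^out
        state = controlled_reflect(state, i, samples_out[ks[i]])
        state = hadamard(state, i)
        state = controlled_reflect(state, i, samples_out[0])
        state = hadamard(state, i)               # maps |-> back to |1>
    return state, p_success

def fidelity(state, p_success, target):
    """sqrt(<target| rho |target>) for the post-selected reduced system state."""
    return math.sqrt(sum(dot(target, vec) ** 2 for vec in state.values()) / p_success)

def U(v):
    """Toy 'unknown' unitary: cyclic shift of three basis vectors."""
    return [v[2], v[0], v[1]]

samples_in = [[1.0, 0.0, 0.0], [math.cos(0.6), math.sin(0.6), 0.0]]
samples_out = [U(s) for s in samples_in]
psi = [math.cos(0.9), math.sin(0.9), 0.0]        # input inside the sample subspace

state, p_fixed = emulate(psi, samples_in, samples_out, [1] * 12)
f_fixed = fidelity(state, p_fixed, U(psi))

rng = random.Random(1)
ks_random = [rng.randrange(2) for _ in range(12)]
state, p_rand = emulate(psi, samples_in, samples_out, ks_random)
f_rand = fidelity(state, p_rand, U(psi))
```

The simulation never uses a description of `U` beyond the sample outputs. In this toy model the post-selected fidelity is at least the square root of the success probability for any choice of `ks`, and it is very close to one when the second sample is used in every round.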
On the other hand, if we postselect to the cases where the erasing has been successful, then for a pure input state $\rho$, the fidelity between the output of the algorithm and the desired state $U\rho U^\dagger$ is lower bounded by $\sqrt{p_{\rm erase}(\rho)}$.

Interestingly, as we show in the Supplementary Material, Eq. (3.4) holds in a much more general setting: suppose we run the above algorithm with any other choice of unitaries $W^{\rm in}_a(\lambda)$ and $W^{\rm out}_a(\lambda) = (U \otimes I_a)\, W^{\rm in}_a(\lambda)\, (U^\dagger \otimes I_a)$ that couple the system to a qubit $a$, where $\lambda$ is a random parameter chosen according to a probability distribution $p(\lambda)$. Then, Eq. (3.4) still holds for the channel $\mathcal{W}(\tau) = \sum_\lambda p(\lambda)\, {\rm Tr}_a\!\left( W^{\rm in}_a(\lambda) [\tau \otimes |-\rangle\langle -|_a] W^{{\rm in}\,\dagger}_a(\lambda) \right)$. In the Supplementary Material we use this generalization to extend the algorithm to the case where the samples contain mixed states.

Eq. (3.4) best captures the working principle behind this algorithm, which can be called emulating via coherent erasing. Note that using this equation we can determine for which input states the emulation works well: if we have the required resources to coherently erase state $\rho$ and bring the system to a pure state whose transformation under the unitary $U$ we know, then we can emulate the action of $U$ on $\rho$.

B. Coordinates of the input state relative to the samples

As we saw before, at the end of step (i) all the information about the input state $|\psi\rangle$ is encoded in the ancillary qubits. Finding the explicit form of this encoded version of the state clarifies an interesting interpretation of this algorithm: step (i) of the algorithm finds the coordinates of the input state $|\psi\rangle$ relative to the frame defined by the input samples $|\phi^{\rm in}_{k_1}\rangle, \cdots, |\phi^{\rm in}_{k_T}\rangle$. Then, step (iv) reconstructs the state with exactly the same coordinates relative to the frame defined by the output samples $|\phi^{\rm out}_{k_1}\rangle, \cdots, |\phi^{\rm out}_{k_T}\rangle$.

Let $|t\rangle_{\mathbf{a}} = |1\rangle^{\otimes t} \otimes |0\rangle^{\otimes (T-t)}$ be the state of qubits $a_1 \cdots a_T$ in which $a_{t+1} \cdots a_T$ are all in state $|0\rangle$, and the rest of the qubits are in state $|1\rangle$. Then, at the end of step (i) the joint state of the system and $\mathbf{a} = a_1 \cdots a_T$ is given by

$$|\phi^{\rm in}_1\rangle \sum_{t=0}^{T-1} \langle\phi^{\rm in}_1|\psi(t, \mathbf{k})\rangle\, |t\rangle_{\mathbf{a}} + |\psi(T, \mathbf{k})\rangle |T\rangle_{\mathbf{a}}, \qquad (3.5)$$

where the (unnormalized) vectors $|\psi(t, \mathbf{k})\rangle$ are defined via the recursive relation

$$|\psi(t+1, \mathbf{k})\rangle = R^{\rm in}(k_{t+1})\, P_\perp |\psi(t, \mathbf{k})\rangle, \qquad (3.6)$$

and $|\psi(0, \mathbf{k})\rangle = |\psi\rangle$. The argument in the previous section implies that for an initial state $|\psi\rangle \in \mathcal{H}_{\rm in}$, the typical norm of $|\psi(t, \mathbf{k})\rangle$ is exponentially small in $t$, and hence for sufficiently large $T$ the last term in Eq. (3.5) is negligible.

It follows that this expansion indeed describes a general recursive method for specifying any vector $|\psi\rangle \in \mathcal{H}_{\rm in}$ in terms of the scalars $\{\langle\phi^{\rm in}_1|\psi(t, \mathbf{k})\rangle : t = 0, \cdots, T-1\}$. These scalars only depend on the relation between $|\psi\rangle$ and the states $|\phi^{\rm in}_{k_1}\rangle, \cdots, |\phi^{\rm in}_{k_T}\rangle$, i.e. applying the same unitary transformation to $|\psi\rangle$ and these states leaves them invariant.
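The recursion of Eq. (3.6) is simple enough to evaluate directly. The following sketch computes the scalars $\langle\phi^{\rm in}_1|\psi(t, \mathbf{k})\rangle$ for a toy three-dimensional example and checks that they are unchanged when the same unitary is applied to the state and to the samples. Real amplitudes, the cyclic-shift unitary, the chosen index sequence and the function names are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def reflect(phi, v):
    """Apply R = I - 2|phi><phi| to a real vector v."""
    c = dot(phi, v)
    return [x - 2 * c * p for x, p in zip(v, phi)]

def coordinates(psi, samples, ks):
    """Scalars <phi_1|psi(t,k)> for t = 0..T-1 and the norm of psi(T,k).

    ks[t] is the 0-based index of the sample used in round t+1, Eq. (3.6).
    """
    phi1 = samples[0]
    coords, vec = [], list(psi)
    for k in ks:
        c = dot(phi1, vec)                              # coordinate at step t
        coords.append(c)
        perp = [x - c * p for x, p in zip(vec, phi1)]   # P_perp |psi(t)>
        vec = reflect(samples[k], perp)                 # |psi(t+1)>
    return coords, math.sqrt(dot(vec, vec))

def U(v):
    """Toy unitary: cyclic shift of three basis vectors."""
    return [v[2], v[0], v[1]]

samples_in = [[1.0, 0.0, 0.0], [math.cos(0.6), math.sin(0.6), 0.0]]
samples_out = [U(s) for s in samples_in]
psi = [math.cos(0.9), math.sin(0.9), 0.0]
ks = [1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]

coords_in, residual_in = coordinates(psi, samples_in, ks)
coords_out, residual_out = coordinates(U(psi), samples_out, ks)
```

The two coordinate lists agree, and the squared coordinates together with the squared residual norm sum to one, which is the norm bookkeeping behind Eq. (3.5).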
We can interpret these scalars as the coordinates of the vector $|\psi\rangle$ relative to the frame defined by the states $|\phi^{\rm in}_{k_1}\rangle, \cdots, |\phi^{\rm in}_{k_T}\rangle$. Then, in this language, step (i) of the algorithm is a circuit for finding the coordinates of the given state $|\psi\rangle$ relative to the input frame $|\phi^{\rm in}_{k_1}\rangle, \cdots, |\phi^{\rm in}_{k_T}\rangle$. Note that because of the no-cloning theorem [7], in order to find the coordinates of a quantum state in a reversible fashion and encode them in the ancillary qubits, we need to erase the state of the system.

Therefore, step (i) of this algorithm provides an efficient reversible method for finding the coordinates of a given state with respect to a general frame, using multiple copies of the states corresponding to that frame (footnote 4). This method can be useful for other applications where, instead of implementing operations on the system directly, we first find the coordinates of the state of the system with respect to other quantum states, implement an operation on the coordinates, and then transform the state back to the physical space.

Footnote 3: Note that we only use a $(T+1)$-dimensional subspace of the $2^T$-dimensional Hilbert space of the ancillary qubits. Indeed, this algorithm can be easily modified to be implemented using only $O(\log T)$ ancillary qubits.

C. Runtime and error analysis

It follows from Eq. (3.4) that if we run the circuit in Fig. 1 with ideal controlled-reflections, then for any initial state $|\psi\rangle \in \mathcal{H}_{\rm in}$, the trace distance between the output of the circuit and the desired state $U|\psi\rangle$ is less than or equal to $\epsilon$, provided that we choose

$$T \ge d \times \frac{\log(8 d\, \epsilon^{-[\ldots]})}{1 - |\lambda_{\mathcal{D}}|}, \qquad (3.7)$$

where $\lambda_{\mathcal{D}}$ is the eigenvalue of the channel $\mathcal{D}$ with the second largest magnitude (see Appendix B). Therefore, as one expects from the discussion in the previous section, the runtime is mainly determined by the mixing time of the random unitary channel $\mathcal{D}$, or equivalently, its spectral gap $1 - |\lambda_{\mathcal{D}}|$.

It turns out that in the actual algorithm for the universal quantum emulator, where we need to simulate the controlled-reflections using the given copies of sample states, the dominant source of error is the imperfection in these simulations. In the Supplementary Material we show that for an initial state $|\psi\rangle \in \mathcal{H}_{\rm in}$, the transformation $|\psi\rangle \to U|\psi\rangle$ can be implemented with error $\epsilon > 0$ using

$$N_{\rm tot} = \tilde{O}\!\left( d^{[\ldots]} \times \epsilon^{-[\ldots]}\, (1 - |\lambda_{\mathcal{D}}|)^{[\ldots]} \right) \qquad (3.8)$$

total copies of sample states, and in time $t_{\rm tot} = \tilde{O}(N_{\rm tot} \times \log D)$, where $\tilde{O}$ suppresses more slowly-growing terms (see Appendix B 2). Note that the only place where $D$, the dimension of the Hilbert space, shows up in the analysis is in the simulation of the controlled-reflections.

D. Optimality

In contrast with the approaches based on tomography, whose runtime and sample complexity both scale (at least) linearly with $D$, the runtime $t_{\rm tot}$ of the above algorithm is only logarithmic in $D$, whilst its sample complexity $N_{\rm tot}$ is independent of $D$.

In general, the lowest achievable runtime is lower bounded by $\Omega(\log^{\gamma} D)$, where $\gamma$ is a constant of order one depending on the circuit architecture. This is because, in general, the state of each qubit in the desired output state $U|\psi\rangle$ depends non-trivially on the state of all other qubits in state $|\psi\rangle$, as well as the state of all qubits in the sample states. This means that the runtime of the algorithm is bounded by the minimum time required for information about one qubit in the system to propagate to all other qubits. Since the quantum circuit is formed from local unitary gates each acting on a few qubits, this time grows with $\log D$, the number of qubits. For instance, in the case where the qubits are all on a line and interact only with their nearest neighbors, this time is of the order $\log D$.

Furthermore, it can be easily seen that the lowest achievable runtime scales at least linearly with $d$, the dimension of the input subspace. Indeed, just to check whether the given state is inside the input subspace spanned by the sample states, one needs to interact with at least $d$ different sample states, which requires time of order $d$. We conclude that the runtime of the proposed algorithm cannot be improved drastically.

Footnote 4: The given copies of each sample state can be interpreted as a Quantum Reference Frame (QRF) for a direction in the Hilbert space. A QRF usually refers to a reference frame for physical degrees of freedom, such as position or time, which is treated quantum mechanically [8–12]. In contrast, here we are using the concept of QRF in a more abstract sense. Therefore, the relevant symmetry group, which for standard QRFs is usually a group such as SO(3), U(1) or $\mathbb{Z}_N$ [8–12], is in this case SU(D).

IV. GENERALIZATIONS

As we explained at the end of Sec. III A, the proposed algorithm for the universal quantum emulator works based on a simple and general principle, namely emulating via coherent erasing. Hence, the algorithm can be generalized in several different ways. Some of the possible generalizations are presented in the Supplementary Material (see Appendix D). In particular,

1) We show that the algorithm can be generalized to the case where the sample input-output sets contain mixed states. More precisely, as long as the sample input set, and hence the sample output set, contains (at least) one state close to a pure state, we can still coherently erase the state of the system and push it into this pure state. Then, we can emulate the action of the unknown unitary using the same approach we used in the original algorithm. The main difference from the original version of the algorithm is that, instead of the controlled-reflections, in the case of mixed states we need to implement controlled-translations with respect to the sample states, i.e. the unitaries $|0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes e^{-it\sigma_k}$, for a random value of $t$ and a sample state $\sigma_k$.

2) We present a more efficient algorithm which works under certain extra assumptions about the sample input states. Namely, this algorithm assumes the input samples are states in an (unknown) orthonormal basis for the input subspace, plus one or more states in the conjugate basis. The working principle behind this algorithm is again emulating via coherent erasing. In this algorithm the state of the system is erased via a coherent measurement in the orthonormal basis followed by another coherent measurement in the conjugate basis.

3) We show the algorithm can be generalized to emulate the controlled version of the unitary $U$, i.e. to implement the unitary $|0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes U$.
This generalization is crucial for some applications, such as quantum phase estimation.

4) We show that if the sample input states can only approximately be transformed to the sample output states via a unitary transformation, with an error bounded by $\delta$, then the proposed algorithm emulates this unitary with error ${\rm poly}(d) \times \delta$.

A. Emulating Projective Measurements

We can use the algorithm presented in this paper to emulate projective measurements, using copies of sample states where each sample comes with a (classical) label that specifies the outcome of the measurement for this state.

Suppose in the algorithm presented in Fig.(1) we use the same set of states as both the input and the output samples. In this case, the algorithm basically performs the two-outcome projective measurement that projects the given state to the subspace spanned by the sample states, or its complement. Then, any arbitrary projective measurement can be implemented as a sequence of these two-outcome projective measurements.

This approach, however, only works if the set of sample states in each subspace generates the full matrix algebra in that subspace. But, in general, to specify a projective measurement we only need to specify the subspaces that correspond to different outcomes, and therefore this extra assumption is unjustified in this context. In the Supplementary Material we present a different efficient algorithm for emulating projective measurements, which does not require this extra assumption. This algorithm also uses random controlled-reflections about the given sample states.

V. DISCUSSION

We presented an efficient algorithm for emulating unitary transformations and projective measurements. The algorithm uses a novel randomized technique, and works based on a simple principle, which can be called emulating via coherent erasing.

The important problem of efficient emulation of general quantum channels is left open. It is interesting to see if there are some physically relevant assumptions under which one can efficiently emulate quantum channels. This could have applications, e.g. in the context of quantum error correction.

Another open question is whether there exists an efficient method for emulating unitary transformations in the general case where all the given sample states are mixed. In the approach taken in this work it seems crucial that (at least) one of the sample states should be close to a pure state. Of course, one can use the given copies of a mixed state to purify the state, e.g. using the method used in [13]. But these methods seem to have large sample complexity.

VI. ACKNOWLEDGMENTS

This work was supported by grants AFOSR No. 6929347 and ARO MURI No. 6924605.

[1] S. Lloyd, Science, 1073 (1996).
[2] A. Bisio, G. Chiribella, G. M. D'Ariano, S. Facchini, and P. Perinotti, Physical Review A, 032324 (2010).
[3] S. Lloyd, M. Mohseni, and P. Rebentrost, Nature Physics, 631 (2014).
[4] M. Nielsen and I. Chuang, Quantum Computation and Quantum Information, Cambridge Series on Information and the Natural Sciences (Cambridge University Press, 2000), ISBN 9780521635035.
[5] A. Uhlmann, Reports on Mathematical Physics, 273 (1976).
[6] R. Jozsa, Journal of Modern Optics, 2315 (1994).
[7] W. K. Wootters and W. H. Zurek, Nature, 802 (1982).
[8] S. D. Bartlett, T. Rudolph, and R. W. Spekkens, Reviews of Modern Physics, 555 (2007).
[9] I. Marvian and R. W. Spekkens, New Journal of Physics, 033001 (2013).
[10] I. Marvian and R. W. Spekkens, Phys. Rev. A, 062110 (2014), URL http://link.aps.org/doi/10.1103/PhysRevA.90.062110.
[11] G. Gour and R. W. Spekkens, New Journal of Physics, 033023 (2008).
[12] I. Marvian and R. Mann, Physical Review A, 022304 (2008).
[13] J. Cirac, A. Ekert, and C. Macchiavello, Physical Review Letters, 4344 (1999).

Supplementary Material

Appendix A: Fidelity of emulation for the generalized algorithm

In this section we present a generalized version of the algorithm discussed in the paper. We also prove Eq.(3.4) for this generalized algorithm.

1. Preliminaries

2. The generalized algorithm

Here we list the steps of the algorithm. The quantum circuit for this algorithm is presented in Fig.(2).

(i) Let $\lambda_1,\cdots,\lambda_T$ be $T$ random elements of the set $\Lambda$, chosen independently according to the distribution $p(\lambda)$. Consider $T$ ancillary qubits labeled by $a_1,\cdots,a_T$, all initialized in the state $|-\rangle$. Apply the unitary $W^{\rm in}_{a_1}(\lambda_1)$ on the system and the ancillary qubit $a_1$, unitary $W^{\rm in}_{a_2}(\lambda_2)$ on the system and $a_2$, and so on, until the last unitary $W^{\rm in}_{a_T}(\lambda_T)$, which is applied to the system and $a_T$.

(ii) Apply the controlled-reflection $R_c(\phi^{\rm in})$ on the system and qubit $c$, initially prepared in state $|-\rangle$. Then, apply a Hadamard gate to the qubit $c$ and measure it in the computational basis.

(iii) Swap a copy of state $|\phi^{\rm out}\rangle$ with the state of the system, i.e. prepare the system in state $|\phi^{\rm out}\rangle$.

(iv) Recall $\lambda_1,\cdots,\lambda_T$ chosen in step (i) and apply the sequence of unitaries $W^{{\rm out}\,\dagger}_{a_T}(\lambda_T),\cdots,W^{{\rm out}\,\dagger}_{a_1}(\lambda_1)$ on the system and the $T$ ancillary qubits $a_T\cdots a_1$, in the following order: First apply $W^{{\rm out}\,\dagger}_{a_T}(\lambda_T)$ to the system and qubit $a_T$, then apply $W^{{\rm out}\,\dagger}_{a_{T-1}}(\lambda_{T-1})$ to the system and $a_{T-1}$, and so on, until the last unitary $W^{{\rm out}\,\dagger}_{a_1}(\lambda_1)$, which is applied on the system and qubit $a_1$.
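A single round of step (i) can be checked numerically. The following is a minimal sketch, assuming the main-text special case $W_a(k)=R_a(k)\,H_a\,R_a(1)$ with reflections $e^{i\pi|\phi\rangle\langle\phi|}=I-2|\phi\rangle\langle\phi|$; the dimension, the random states and all function names are hypothetical choices made only for illustration. It verifies that, after tracing out the ancilla, one round acts on the system as $P\rho P+R(k)P^\perp\rho P^\perp R(k)$ with $P=|\phi_1\rangle\langle\phi_1|$.

```python
import numpy as np

def reflection(phi):
    """Reflection e^{i*pi*|phi><phi|} = I - 2|phi><phi| about a normalized state."""
    return np.eye(len(phi)) - 2.0 * np.outer(phi, phi.conj())

def controlled(op):
    """|0><0| (x) I + |1><1| (x) op, with the control qubit as the first tensor factor."""
    dim = op.shape[0]
    p0 = np.diag([1.0, 0.0])
    p1 = np.diag([0.0, 1.0])
    return np.kron(p0, np.eye(dim)) + np.kron(p1, op)

def w_unitary(phi_k, phi_1):
    """W_a(k) = R_a(k) H_a R_a(1), acting on ancilla (x) system (assumed special case)."""
    dim = len(phi_1)
    hadamard = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
    h_a = np.kron(hadamard, np.eye(dim))
    return controlled(reflection(phi_k)) @ h_a @ controlled(reflection(phi_1))

def random_state(dim, rng):
    """Hypothetical helper: a random normalized complex vector."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(0)
D = 4  # illustrative system dimension
phi_1, phi_k, psi = (random_state(D, rng) for _ in range(3))

# Ancilla starts in |->, system in |psi>; apply one round W_a(k).
minus = np.array([1.0, -1.0]) / np.sqrt(2.0)
joint = w_unitary(phi_k, phi_1) @ np.kron(minus, psi)

# Reduced state of the system: trace out the ancilla (first tensor factor).
m = joint.reshape(2, D)
rho_sys = m.T @ m.conj()

# Expected action: measure {P, P_perp}, reflect only the orthogonal part.
P = np.outer(phi_1, phi_1.conj())
P_perp = np.eye(D) - P
rho = np.outer(psi, psi.conj())
R_k = reflection(phi_k)
expected = P @ rho @ P + R_k @ P_perp @ rho @ P_perp @ R_k

print(np.allclose(rho_sys, expected))
```

Averaging `expected` over a uniformly random sample index gives the single-round channel that is iterated $T$ times in step (i).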

3. Fidelity of emulation (Proof of Eq.(3.4))

Recall the definition of the (Uhlmann) fidelity between two density operators $\sigma_1$ and $\sigma_2$,

$$F(\sigma_1,\sigma_2)=\left\|\sqrt{\sigma_1}\sqrt{\sigma_2}\right\|_1={\rm Tr}\left(\sqrt{\sqrt{\sigma_1}\,\sigma_2\,\sqrt{\sigma_1}}\right), \tag{A3}$$

where $\|\cdot\|_1$ is the $l_1$ norm, defined as the sum of the singular values of the operator [5, 6]. In the following we prove

Theorem 1

Let $\mathcal{E}_U$ be the quantum channel that describes the overall effect of the algorithm presented in Fig.(2) on the state of the system, in the case where we do not postselect based on the outcome of the measurement in step (ii) (in other words, we do not care if erasing has been successful or not). Then for any input state $\rho$ the Uhlmann fidelity of $\mathcal{E}_U(\rho)$ and the desired state $U\rho U^\dagger$ satisfies

$$F(\mathcal{E}_U(\rho),U\rho U^\dagger)\ \ge\ p_{\rm erase}(\rho)=\langle\phi^{\rm in}|\mathcal{W}^T(\rho)|\phi^{\rm in}\rangle, \tag{A4}$$

where $p_{\rm erase}(\rho)$ is the probability that we have successfully erased the state of the system (corresponding to the successful outcome of the measurement in step (ii)), and

$$\mathcal{W}(\tau)=\sum_{\lambda\in\Lambda}p(\lambda)\,{\rm Tr}_a\!\left(W^{\rm in}(\lambda)\left[\tau\otimes|-\rangle\langle-|_a\right]W^{{\rm in}\,\dagger}(\lambda)\right). \tag{A5}$$

On the other hand, if we postselect to the case where the erasing has been successful, the above lower bound improves to $\sqrt{p_{\rm erase}(\rho)}$, i.e.

$$F(\mathcal{E}^{\rm post}_U(\rho),U\rho U^\dagger)\ \ge\ \sqrt{p_{\rm erase}(\rho)}=\sqrt{\langle\phi^{\rm in}|\mathcal{W}^T(\rho)|\phi^{\rm in}\rangle}, \tag{A6}$$

where $\mathcal{E}^{\rm post}_U(\rho)$ is the output of the circuit in the case where the erasing has been successful.
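The fidelity of Eq.(A3) is straightforward to evaluate numerically. The following is a minimal sketch that computes it as the sum of the singular values of $\sqrt{\sigma_1}\sqrt{\sigma_2}$; the function names `psd_sqrt` and `uhlmann_fidelity` and the two test states are hypothetical, chosen only for illustration.

```python
import numpy as np

def psd_sqrt(rho):
    """Square root of a positive semidefinite Hermitian matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(rho)
    vals = np.clip(vals, 0.0, None)  # remove tiny negative round-off
    return (vecs * np.sqrt(vals)) @ vecs.conj().T

def uhlmann_fidelity(sigma1, sigma2):
    """F = || sqrt(sigma1) sqrt(sigma2) ||_1, i.e. the sum of singular values."""
    product = psd_sqrt(sigma1) @ psd_sqrt(sigma2)
    return float(np.sum(np.linalg.svd(product, compute_uv=False)))

# Usage: a pure qubit state compared with itself and with the maximally mixed state.
psi = np.array([1.0, 0.0])
pure = np.outer(psi, psi.conj())
mixed = np.eye(2) / 2.0
f_same = uhlmann_fidelity(pure, pure)    # equals 1 for identical states
f_mixed = uhlmann_fidelity(pure, mixed)  # equals sqrt(<psi| I/2 |psi>) = sqrt(1/2)
print(f_same, f_mixed)
```

Note that this is the non-squared convention used in Eq.(A3), so for a pure state $|\psi\rangle$ it reduces to $\sqrt{\langle\psi|\sigma|\psi\rangle}$.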


Proof. We prove the theorem for the case of an initial pure state $\rho=|\psi\rangle\langle\psi|$. The result for general mixed states follows from the joint concavity of the Uhlmann fidelity.

Let $|\Theta(\boldsymbol{\lambda})\rangle$ be the joint state of the system and the ancillary qubits $a_1\cdots a_T$ at the end of step (i), for a particular choice of random parameters $\boldsymbol{\lambda}=(\lambda_1,\cdots,\lambda_T)$. This state is given by

$$|\Theta(\boldsymbol{\lambda})\rangle=\left[W^{\rm in}_{a_T}(\lambda_T)\cdots W^{\rm in}_{a_1}(\lambda_1)\right]|\psi\rangle\otimes|-\rangle^{\otimes T}. \tag{A7}$$

Then, if we ignore the outcome of the measurement in step (ii), at the end of step (iii) the joint state of the system and the ancillary qubits $a_1\cdots a_T$, for this particular choice of random parameters $\boldsymbol{\lambda}$, is given by

$$|\phi^{\rm out}\rangle\langle\phi^{\rm out}|\otimes{\rm Tr}_S\!\left(|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\right), \tag{A8}$$

where the partial trace is over the main system Hilbert space. Therefore, at the end of the algorithm the average joint state of the system and qubits $a_1\cdots a_T$ is given by

$$\sigma_{\rm fin}=\sum_{\lambda_1,\cdots,\lambda_T\in\Lambda}p(\lambda_1)\cdots p(\lambda_T)\left[W^{{\rm out}\,\dagger}_{a_1}(\lambda_1)\cdots W^{{\rm out}\,\dagger}_{a_T}(\lambda_T)\right]\left[|\phi^{\rm out}\rangle\langle\phi^{\rm out}|\otimes{\rm Tr}_S\!\left(|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\right)\right]\left[W^{\rm out}_{a_T}(\lambda_T)\cdots W^{\rm out}_{a_1}(\lambda_1)\right], \tag{A9}$$

where we have averaged over the iid random parameters $(\lambda_1,\cdots,\lambda_T)$, where each $\lambda_i\in\Lambda$ happens with probability $p(\lambda_i)$.

Next, we look at $F^2_{\rm glob}=\langle-|^{\otimes T}\langle\psi|U^\dagger\,\sigma_{\rm fin}\,U|\psi\rangle|-\rangle^{\otimes T}$, that is the square of the fidelity of the global output state $\sigma_{\rm fin}$ with the state $U|\psi\rangle|-\rangle^{\otimes T}$. Using Eq.(A9) together with the facts that $U|\phi^{\rm in}\rangle=|\phi^{\rm out}\rangle$ and $(U\otimes I_a)\,W^{\rm in}_a(\lambda)\,(U^\dagger\otimes I_a)=W^{\rm out}_a(\lambda)$, and the definition of $|\Theta(\boldsymbol{\lambda})\rangle$ in Eq.(A7), we find that

$$F^2_{\rm glob}=\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\,\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes{\rm Tr}_S\!\left(|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\right)\right]|\Theta(\boldsymbol{\lambda})\rangle. \tag{A10}$$

Next, we note that

$$|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes{\rm Tr}_S\!\left(|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\right)=|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes{\rm Tr}_S\!\left(P|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|P\right)+|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes{\rm Tr}_S\!\left(P^\perp|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|P^\perp\right) \tag{A11a}$$

$$=\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]+|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes{\rm Tr}_S\!\left(P^\perp|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|P^\perp\right), \tag{A11b}$$

where $P=|\phi^{\rm in}\rangle\langle\phi^{\rm in}|$, $P^\perp=I_S-P$, and $I_S$ and $I_a$ are, respectively, the identity operators on the system and the ancillary qubits $a_1\cdots a_T$. Using the fact that ${\rm Tr}_S(P^\perp|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|P^\perp)$ is a positive operator, we find

$$F^2_{\rm glob}=\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\,\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes{\rm Tr}_S\!\left(|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\right)\right]|\Theta(\boldsymbol{\lambda})\rangle \tag{A12a}$$

$$\ge\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\,\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]|\Theta(\boldsymbol{\lambda})\rangle \tag{A12b}$$

$$=\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\left(\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]|\Theta(\boldsymbol{\lambda})\rangle\right)^2, \tag{A12c}$$

where $I_a$ is the identity operator on the qubits $a_1\cdots a_T$. Then, using the fact that for any distribution the variance of a random variable is non-negative, i.e. the mean of the square is at least the square of the mean, we find that

$$F^2_{\rm glob}\ge\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\left(\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]|\Theta(\boldsymbol{\lambda})\rangle\right)^2 \tag{A13a}$$

$$\ge\Big(\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\,\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]|\Theta(\boldsymbol{\lambda})\rangle\Big)^2 \tag{A13b}$$

$$={\rm Tr}\Big(|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\,{\rm Tr}_a\!\left(|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\right)\Big)^2 \tag{A13c}$$

$$=p^2_{\rm erase}(|\psi\rangle\langle\psi|), \tag{A13d}$$

where $p_{\rm erase}(|\psi\rangle\langle\psi|)$ is the probability that at the end of step (i) the reduced state of the system is in state $|\phi^{\rm in}\rangle$.

Finally, we note that the final state of the system is obtained from the state $\sigma_{\rm fin}$ by tracing over the ancillary qubits, i.e. $\mathcal{E}_U(|\psi\rangle\langle\psi|)={\rm Tr}_a(\sigma_{\rm fin})$. Then, using the monotonicity of the Uhlmann fidelity under partial trace we find

$$F(\mathcal{E}_U(|\psi\rangle\langle\psi|),U|\psi\rangle\langle\psi|U^\dagger)\ \ge\ F_{\rm glob}\ \ge\ p_{\rm erase}(|\psi\rangle\langle\psi|). \tag{A14}$$

Using the joint concavity of the Uhlmann fidelity, this proves the bound for an arbitrary initial state $\rho$.

Next, using the definition $|\Theta(\boldsymbol{\lambda})\rangle=\left[W^{\rm in}_{a_T}(\lambda_T)\cdots W^{\rm in}_{a_1}(\lambda_1)\right]|\psi\rangle\otimes|-\rangle^{\otimes T}$, it can be easily shown that

$$\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\,{\rm Tr}_a\!\left(|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\right)=\mathcal{W}^T(|\psi\rangle\langle\psi|), \tag{A15}$$

where

$$\mathcal{W}(\tau)=\sum_{\lambda\in\Lambda}p(\lambda)\,{\rm Tr}_a\!\left(W^{\rm in}_a(\lambda)\left[\tau\otimes|-\rangle\langle-|_a\right]W^{\rm in}_a(\lambda)^\dagger\right), \tag{A16}$$

and the partial trace is over qubit $a$. Therefore, we conclude that for any initial state $\rho$ it holds that $F(\mathcal{E}_U(\rho),U\rho U^\dagger)\ge p_{\rm erase}(\rho)=\langle\phi^{\rm in}|\mathcal{W}^T(\rho)|\phi^{\rm in}\rangle$.
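The chain of inequalities (A10)-(A13) and the identity (A15) can be spot-checked numerically. The sketch below is not a simulation of the full circuit: it evaluates the right-hand side of Eq.(A10) directly, assuming the main-text special case in which each round has Kraus operators $P$ (ancilla $|0\rangle$) and $R(k)P^\perp$ (ancilla $|1\rangle$), with $k$ uniform over $K$ samples and reflections $R(k)=I-2|\phi_k\rangle\langle\phi_k|$. The dimensions, random states and variable names are hypothetical.

```python
import itertools
import numpy as np

def random_state(dim, rng):
    """Hypothetical helper: a random normalized complex vector."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)

rng = np.random.default_rng(1)
D, K, T = 3, 3, 3  # small illustrative sizes; K**T sequences are enumerated exactly
samples = [random_state(D, rng) for _ in range(K)]
phi_in = samples[0]
psi = random_state(D, rng)

P = np.outer(phi_in, phi_in.conj())
P_perp = np.eye(D) - P
reflections = [np.eye(D) - 2.0 * np.outer(s, s.conj()) for s in samples]

def theta(ks):
    """Rows are the unnormalized system vectors of |Theta>, one per ancilla bit string."""
    branches = psi[np.newaxis, :]
    for k in ks:
        kept = branches @ P.T                               # ancilla |0>: P applied
        reflected = branches @ (reflections[k] @ P_perp).T  # ancilla |1>: R(k) P_perp applied
        branches = np.concatenate([kept, reflected], axis=0)
    return branches

f_glob_sq = 0.0      # right-hand side of Eq.(A10)
second_moment = 0.0  # Eq.(A12c)
p_erase = 0.0        # averaged erasing probability
for ks in itertools.product(range(K), repeat=T):
    weight = 1.0 / K**T
    th = theta(ks)
    gram = th @ th.conj().T   # Tr_S |Theta><Theta|, an operator on the ancillas
    c = th @ phi_in.conj()    # <phi_in| applied to the system factor of |Theta>
    q = np.vdot(c, c).real    # <Theta| (|phi_in><phi_in| (x) I_a) |Theta>
    f_glob_sq += weight * np.vdot(c, gram @ c).real
    second_moment += weight * q**2
    p_erase += weight * q

# Eq.(A15): the same probability from iterating the single-round channel T times.
rho = np.outer(psi, psi.conj())
for _ in range(T):
    rest = P_perp @ rho @ P_perp
    rho = P @ rho @ P + sum(R @ rest @ R for R in reflections) / K
p_channel = np.vdot(phi_in, rho @ phi_in).real

print(f_glob_sq >= second_moment >= p_erase**2, np.isclose(p_erase, p_channel))
```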
In the above argument, we assumed we ignore the outcome of the measurement in step (ii), which means that we do not care if the erasing has been successful or not. Now suppose we postselect to the cases where the erasing has been successful. In this case, after the measurement the joint state of the system and ancillary qubits $a_1\cdots a_T$ is given by

$$\frac{1}{p_{\rm erase}(|\psi\rangle\langle\psi|)}\times\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right], \tag{A17}$$

where the factor $1/p_{\rm erase}(|\psi\rangle\langle\psi|)$ is due to the postselection. Then, using an argument similar to the one we used before, we can show that the square of the fidelity of the final joint state of the system and qubits $a_1\cdots a_T$ at the end of the algorithm with the state $U|\psi\rangle\otimes|-\rangle^{\otimes T}$ is given by

$$\left(F^{\rm post}_{\rm glob}\right)^2=\frac{1}{p_{\rm erase}(|\psi\rangle\langle\psi|)}\times\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\,\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]|\Theta(\boldsymbol{\lambda})\rangle\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]|\Theta(\boldsymbol{\lambda})\rangle \tag{A18a}$$

$$=\frac{1}{p_{\rm erase}(|\psi\rangle\langle\psi|)}\times\sum_{\lambda_1,\cdots,\lambda_T}p(\lambda_1)\cdots p(\lambda_T)\left(\langle\Theta(\boldsymbol{\lambda})|\left[|\phi^{\rm in}\rangle\langle\phi^{\rm in}|\otimes I_a\right]|\Theta(\boldsymbol{\lambda})\rangle\right)^2. \tag{A18b}$$

Then, using Eq.(A13), we find

$$\left(F^{\rm post}_{\rm glob}\right)^2\ \ge\ \frac{p^2_{\rm erase}(|\psi\rangle\langle\psi|)}{p_{\rm erase}(|\psi\rangle\langle\psi|)}=p_{\rm erase}(|\psi\rangle\langle\psi|). \tag{A19}$$

This means that if we postselect to the cases where erasing has been successful, then

$$F(U|\psi\rangle\langle\psi|U^\dagger,\tilde{\rho}_{\rm post})\ \ge\ F^{\rm post}_{\rm glob}\ \ge\ \sqrt{p_{\rm erase}(|\psi\rangle\langle\psi|)}, \tag{A20}$$

where $\tilde{\rho}_{\rm post}$ denotes the postselected output state of the system.

Appendix B: Runtime and error analysis

In this section we first study the runtime and the error in the output of the algorithm, assuming we run the quantum circuit in Fig. 1 with ideal controlled-reflections $R_a(k)$. Then, we consider the runtime and error in the actual algorithm, where we use the given copies of sample states to simulate the controlled-reflections.

1. Algorithm with perfect controlled-reﬂections

Let $\mathcal{E}_U$ be the quantum channel that describes the overall effect of the circuit in Fig.(1) on the system, in the case where we ignore the outcome of the measurement in step (ii), and where we use ideal controlled-reflections in the circuit. Let $\|\cdot\|_1$ be the $l_1$-norm, defined as the sum of the singular values of the operator. Then,

Theorem 2

Suppose in the quantum circuit presented in Fig.(1) we choose

$$T\ \ge\ d\times\frac{\log(8\,\epsilon_{\rm id}^{-2}\,d)}{1-|\lambda_{\mathcal{D}}|}, \tag{B1}$$

where $\lambda_{\mathcal{D}}$ is the eigenvalue of the channel $\mathcal{D}(\cdot)=K^{-1}\sum_{k=1}^{K}e^{i\pi|\phi^{\rm in}_k\rangle\langle\phi^{\rm in}_k|}(\cdot)\,e^{i\pi|\phi^{\rm in}_k\rangle\langle\phi^{\rm in}_k|}$ which has the second largest magnitude. Then, for any input $\rho$ whose support is restricted to the input subspace $\mathcal{H}_{\rm in}$, the output of the circuit satisfies $\|U\rho U^\dagger-\mathcal{E}_U(\rho)\|_1\le\epsilon_{\rm id}$.

Proof.

We start with Eq.(3.4) proven in Appendix A, F ( E U ( ρ ) , U ρU † ) ≥ (cid:104) φ in1 |W T ( ρ ) | φ in1 (cid:105) . (B2)Let P ⊥ in = Π in − P = Π in − | φ in1 (cid:105)(cid:104) φ in1 | be the projector to the ( d − H in orthogonal to | φ in1 (cid:105) ,and P ⊥ in ( · ) = P ⊥ in ( · ) P ⊥ in . Using the deﬁnition W ( σ ) = P σP + D ( P ⊥ σP ⊥ ), it can be easily seen that for any state σ whose support is restricted to H in , it holds that P ⊥ in ◦ W ( σ ) = P ⊥ in ◦ W ◦ P ⊥ in ( σ ) = P ⊥ in ◦ D ◦ P ⊥ in ( σ ) . (B3)Furthermore, using the fact that under channels W and D any state ρ whose support is restricted to H in is mappedto a state with support restricted to this subspace, we ﬁnd that P ⊥ in ◦ W T ( ρ ) = (cid:2) P ⊥ in ◦ D ◦ P ⊥ in (cid:3) T ( ρ ) . (B4)This implies F ( E U ( ρ ) , U ρU † ) ≥ (cid:104) φ in1 |W T ( ρ ) | φ in1 (cid:105) (B5)= 1 − Tr (cid:0) P ⊥ W T ( ρ ) (cid:1) (B6)= 1 − Tr (cid:0) P ⊥ in W T ( ρ ) (cid:1) (B7)= 1 − Tr (cid:0) P ⊥ in ◦ W T ( ρ ) (cid:1) (B8)= 1 − Tr (cid:16)(cid:2) P ⊥ in ◦ D ◦ P ⊥ in (cid:3) T ( ρ ) (cid:17) , (B9)where to get the third line we have used the fact that the support of W T ( ρ ) is contained in H in , and to get the lastline we have used Eq.(B3).It turns out that quantum operations D and P ⊥ in ◦ D ◦ P ⊥ in have several nice properties which simplify the followinganalysis. In particular, they both have Hermitian matrix representations: Let { F µ : µ = 1 · · · d } be an orthonormalbasis for the operator space acting on H in , such that Tr( F † µ F ν ) = δ µ,ν . Consider the matrix representation of D and P ⊥ in ◦ D ◦ P ⊥ in on H in , i.e. the matrices D µ,ν = Tr( F † µ D ( F ν )) and D ⊥ µ,ν = Tr( F † µ P ⊥ in ◦ D ◦ P ⊥ in ( F ν )), respectively. Then,the fact that any reﬂection e iπ | φ (cid:105)(cid:104) φ | is a Hermitian operator, implies that these matrices are both Hermitian (Notethat the matrix representation of channel W is not Hermitian). Let λ ⊥ be the largest eigenvalue of P ⊥ ◦ D ◦ P ⊥ . 
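It follows that for any $T$, the map $\big[\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}\big]^T$ also has a Hermitian matrix representation, and its largest eigenvalue is $\lambda_\perp^T$. This implies

\begin{align}
{\rm Tr}\Big(\big[\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}\big]^T(\rho)\Big) &= {\rm Tr}\Big(P^\perp_{\rm in}\big[\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}\big]^T(\rho)\Big) \tag{B10a}\\
&\le {\rm Tr}\Big(P^\perp_{\rm in}\big[\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}\big]^T(P^\perp_{\rm in})\Big) \tag{B10b}\\
&\le |\lambda_\perp|^T\times{\rm Tr}(P^\perp_{\rm in}) \tag{B10c}\\
&= |\lambda_\perp|^T\times(d-1), \tag{B10d}
\end{align}

where to get the first line we use the fact that for any $X$, $\mathcal{P}^\perp_{\rm in}\circ\mathcal{P}^\perp_{\rm in}(X)=\mathcal{P}^\perp_{\rm in}(X)$, and to get the third line we use the fact that $\lambda_\perp^T$ is the largest eigenvalue of $\big[\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}\big]^T$, which implies that for any $X$, ${\rm Tr}\big(X^\dagger\big[\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}\big]^T(X)\big)\le{\rm Tr}(X^\dagger X)\,|\lambda_\perp|^T$.

Putting this into Eq.(B5) we find that

$$F\big(\mathcal{E}_U(\rho),U\rho U^\dagger\big)\ \ge\ 1-{\rm Tr}\Big(\big[\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}\big]^T(\rho)\Big)\ \ge\ 1-d\times|\lambda_\perp|^T. \tag{B11}$$

Next, we use this bound together with the Fuchs-van de Graaf inequality to bound the trace distance of $\mathcal{E}_U(\rho)$ and $U\rho U^\dagger$. We find

\begin{align}
\big\|\mathcal{E}_U(\rho)-U\rho U^\dagger\big\|_1 &\le 2\sqrt{1-F^2\big(\mathcal{E}_U(\rho),U\rho U^\dagger\big)}\ \le\ 2\sqrt{2}\sqrt{1-F\big(\mathcal{E}_U(\rho),U\rho U^\dagger\big)} \tag{B12}\\
&\le \sqrt{8d\times|\lambda_\perp|^T}. \tag{B13}
\end{align}

This means that to achieve error $\epsilon_{\rm id}$ in trace distance, it is sufficient to have

$$T\ \ge\ \frac{\log(8\epsilon_{\rm id}^{-2}d)}{\log|\lambda_\perp|^{-1}}. \tag{B14}$$

In the following we prove

$$|\lambda_\perp|\ \le\ 1-\frac{1-|\lambda_{\mathcal{D}}|}{d}. \tag{B15}$$

This, together with the fact that $\log(1+x)\le x$ for $x>-1$, implies

$$\frac{d}{1-|\lambda_{\mathcal{D}}|}\ \ge\ \frac{1}{\log|\lambda_\perp|^{-1}}. \tag{B16}$$

Therefore, using Eq.(B14), we conclude that if we choose $T$ such that

$$T\ \ge\ d\times\frac{\log(8\epsilon_{\rm id}^{-2}d)}{1-|\lambda_{\mathcal{D}}|}, \tag{B17}$$

then $\big\|\mathcal{E}_U(\rho)-U\rho U^\dagger\big\|_1\le\epsilon_{\rm id}$.

Bound on the maximum eigenvalue of $\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}$

To complete the proof, in the following we prove Eq.(B15), which is a bound on $\lambda_\perp$, the largest eigenvalue of $\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}$. Recall that $\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}$ and $\mathcal{D}$ both have Hermitian matrix representations. This means that $\mathcal{D}$ has a spectral decomposition $\mathcal{D}(X)=\sum_\mu\lambda_\mu M_\mu^\dagger\,{\rm Tr}(M_\mu X)$, where each $\lambda_\mu$ and $M_\mu^\dagger$ are, respectively, an eigenvalue of $\mathcal{D}$ and its corresponding eigenvector, and ${\rm Tr}(M_\mu^\dagger M_\nu)=\delta_{\mu,\nu}$. Then, (the absolute value of) the maximum eigenvalue of $\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}$ is given by

$$|\lambda_\perp| = \max_X\frac{\big|{\rm Tr}\big(X^\dagger\,\mathcal{P}^\perp_{\rm in}\circ\mathcal{D}\circ\mathcal{P}^\perp_{\rm in}(X)\big)\big|}{{\rm Tr}(X^\dagger X)} = \max_X\left|\sum_\mu\lambda_\mu\frac{\big|{\rm Tr}(M_\mu P^\perp_{\rm in}XP^\perp_{\rm in})\big|^2}{{\rm Tr}(X^\dagger X)}\right|. \tag{B18}$$

Note that the Hermiticity of $\mathcal{D}$ implies that its eigenvectors form an orthonormal basis, and so

$$\sum_\mu\big|{\rm Tr}(M_\mu P^\perp_{\rm in}XP^\perp_{\rm in})\big|^2 = {\rm Tr}\big(P^\perp_{\rm in}X^\dagger P^\perp_{\rm in}XP^\perp_{\rm in}\big). \tag{B19}$$

Then, using the fact that the operator $\Pi_{\rm in}/\sqrt{d}$ is the only (normalized) eigenvector of $\mathcal{D}$ with eigenvalue 1, we find

$$|\lambda_\perp| = \max_X\left|\sum_\mu\lambda_\mu\frac{\big|{\rm Tr}(M_\mu P^\perp_{\rm in}XP^\perp_{\rm in})\big|^2}{{\rm Tr}(X^\dagger X)}\right| \le \max_X\left[\frac{\big|{\rm Tr}(P^\perp_{\rm in}XP^\perp_{\rm in})\big|^2}{d\times{\rm Tr}(X^\dagger X)}+|\lambda_{\mathcal{D}}|\,\frac{{\rm Tr}\big(P^\perp_{\rm in}X^\dagger P^\perp_{\rm in}XP^\perp_{\rm in}\big)-\big|{\rm Tr}(P^\perp_{\rm in}XP^\perp_{\rm in})\big|^2\times d^{-1}}{{\rm Tr}(X^\dagger X)}\right], \tag{B20}$$

where $\lambda_{\mathcal{D}}$ is the second largest eigenvalue of $\mathcal{D}$ (in magnitude). This implies

$$|\lambda_\perp| \le \max_X\left[\frac{\big|{\rm Tr}(XP^\perp_{\rm in})\big|^2}{d\times{\rm Tr}(X^\dagger X)}\times(1-|\lambda_{\mathcal{D}}|)+|\lambda_{\mathcal{D}}|\right]. \tag{B21}$$

Then, using the Cauchy-Schwarz inequality we find

$$|\lambda_\perp| \le \max_X\left[\frac{{\rm Tr}(XX^\dagger)\,{\rm Tr}(P^\perp_{\rm in})}{d\times{\rm Tr}(X^\dagger X)}\times(1-|\lambda_{\mathcal{D}}|)+|\lambda_{\mathcal{D}}|\right] = 1-\frac{1-|\lambda_{\mathcal{D}}|}{d}, \tag{B22}$$

where we have used the fact that ${\rm Tr}(P^\perp_{\rm in})=d-1$.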
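As a numerical sanity check of Eqs.(B13), (B15) and (B17), the following Python sketch computes a sufficient number of rounds $T$ and confirms that the resulting trace-distance bound stays below the target error. The function names and the sample values of $d$, $\epsilon_{\rm id}$ and $|\lambda_{\mathcal{D}}|$ are illustrative assumptions, not values taken from the paper.

```python
import math

def lambda_perp_bound(d, lam_D):
    """Upper bound on |lambda_perp| from Eq. (B15)/(B22)."""
    return 1.0 - (1.0 - lam_D) / d

def sufficient_T(d, eps_id, lam_D):
    """Smallest integer T that satisfies Eq. (B17)."""
    return math.ceil(d * math.log(8.0 * d / eps_id ** 2) / (1.0 - lam_D))

def trace_distance_bound(d, lam_perp, T):
    """Right-hand side of Eq. (B13)."""
    return math.sqrt(8.0 * d * lam_perp ** T)

# Hypothetical parameters chosen only for illustration.
d, eps_id, lam_D = 4, 1e-2, 0.9
T = sufficient_T(d, eps_id, lam_D)
bound = trace_distance_bound(d, lambda_perp_bound(d, lam_D), T)
print(T, bound)
```

Because $\log(1+x)\le x$, the printed bound should never exceed `eps_id` for any $0\le|\lambda_{\mathcal{D}}|<1$; the sketch only makes the chain of inequalities concrete.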

2. Total error in the actual algorithm with simulated controlled-reflections

As we showed in the previous section, in the idealized case where we are given perfect controlled-reflections, by choosing

$$T\ \ge\ T(d,\epsilon_{\rm id})\equiv\frac{d\,\log(8\epsilon_{\rm id}^{-2}d)}{1-|\lambda_{\mathcal{D}}|}, \tag{B23}$$

we can implement the transformation $|\psi\rangle\to U|\psi\rangle$ on any initial state $|\psi\rangle\in\mathcal{H}_{\rm in}$ with error less than or equal to $\epsilon_{\rm id}$ in trace distance. But in the main algorithm we need to simulate the controlled-reflections using the given copies of the sample states.

In Appendix C we show that using $n$ copies of state $|\phi\rangle$ we can implement the unitary $e^{i\pi|\phi\rangle\langle\phi|}$, or its controlled version $|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes e^{i\pi|\phi\rangle\langle\phi|}$, with error $O(1/n)$ in trace distance, and in time $O(\log D\times n\log n)=\tilde{O}(\log D\times n)$, where $\tilde{O}$ suppresses more slowly-growing terms. In other words, to achieve an error of order $\epsilon_{\rm ref}$ in simulating each controlled-reflection we need time of order $\tilde{O}(\epsilon_{\rm ref}^{-1}\times\log D)$, and $\tilde{O}(\epsilon_{\rm ref}^{-1})$ copies of the corresponding state.

Given that the circuit requires $4T+1$ controlled-reflections, it follows that we can implement the transformation $|\psi\rangle\to U|\psi\rangle$ with the total error less than or equal to

$$\epsilon_{\rm tot}=\big[4T(d,\epsilon_{\rm id})+1\big]\times\epsilon_{\rm ref}+\epsilon_{\rm id}, \tag{B24}$$

in the total time

$$t_{\rm tot}=\big[4T(d,\epsilon_{\rm id})+1\big]\times\tilde{O}(\epsilon_{\rm ref}^{-1}\times\log D), \tag{B25}$$

and using the total number of input-output sample pairs

$$N_{\rm tot}=\big[4T(d,\epsilon_{\rm id})+1\big]\times\epsilon_{\rm ref}^{-1}. \tag{B26}$$

Choosing $\epsilon_{\rm id}$ and $\epsilon_{\rm ref}$ such that $T(d,\epsilon_{\rm id})\times\epsilon_{\rm ref}=\epsilon_{\rm id}=\epsilon$, we find that the transformation $|\psi\rangle\to U|\psi\rangle$ can be implemented with the total error less than or equal to $\epsilon$, in the total time

$$t_{\rm tot}=\tilde{O}\left(\frac{d^2\,\epsilon^{-1}\log D}{(1-|\lambda_{\mathcal{D}}|)^2}\right), \tag{B27}$$

and using the total number of input-output sample pairs

$$N_{\rm tot}=\tilde{O}\left(\frac{d^2\,\epsilon^{-1}}{(1-|\lambda_{\mathcal{D}}|)^2}\right). \tag{B28}$$

Appendix C: An efficient quantum circuit for exponentiating a density operator

In this section, we introduce an explicit simple quantum circuit that uses multiple copies of state $\sigma$ to simulate the unitary $e^{-it\sigma}$, or its controlled version $|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes e^{-it\sigma}$, for any real $t$. The circuit is based on the density matrix exponentiation technique of [3].

Let $S$ be the SWAP operator acting on the system and an ancillary system with equal Hilbert space dimensions, such that $S|\mu\rangle|\nu\rangle=|\nu\rangle|\mu\rangle$ for any pair of states $|\mu\rangle$ and $|\nu\rangle$. Note that $S$ is a Hermitian unitary operator. Then, as observed in Ref.[3],

$${\rm Tr}_{\rm RF}\big(e^{-i\Delta tS}[\rho\otimes\sigma]e^{i\Delta tS}\big)=\rho+i\Delta t[\rho,\sigma]+O(\Delta t^2)=e^{-i\sigma\Delta t}\rho\,e^{i\sigma\Delta t}+O(\Delta t^2), \tag{C1}$$

where the partial trace is over the system with state $\sigma$ (which can be interpreted as a quantum reference frame [8, 12]). Then, choosing sufficiently small $\Delta t$, and repeating the above procedure $t/\Delta t$ times with $t/\Delta t$ copies of state $\sigma$, we can simulate the unitary $e^{-i\sigma t}$ with arbitrary accuracy.

In the following section we present a new simple and efficient algorithm for simulating the unitary $e^{-i\theta S}$ for arbitrary $\theta$, with error $\epsilon$ and in time $O(\log(D)\times\log(1/\epsilon))$. Therefore, if we use this algorithm together with the above method, using one copy of state $\sigma$ we can simulate the unitary $e^{-i\sigma\Delta t}$ with the total error of order $\Delta t^2+\epsilon$, and in time $O(\log(D)\times\log(1/\epsilon))$. Then, having $n$ copies of state $\sigma$, to simulate the unitary $e^{-i\sigma t}$ we can choose $\Delta t=t/n$ and use each copy to simulate $e^{-i\sigma\Delta t}$. In this case the total error is

$$\epsilon_{\rm tot}=O\big(n\times(\epsilon+\Delta t^2)\big)=O\Big(n\epsilon+\frac{t^2}{n}\Big), \tag{C2}$$

and the total runtime is

$$t_{\rm tot}=O\big(\log(D)\times\log(1/\epsilon)\times n\big), \tag{C3}$$

where $\epsilon$ is the error in implementing the unitary $e^{-i\Delta tS}$. Choosing $\epsilon=n^{-2}$, we find that using $n$ copies of state $\sigma$ we can simulate the unitary $e^{-i\sigma t}$ with the total error of order $\epsilon_{\rm tot}=O\big(\frac{t^2+1}{n}\big)$ in trace distance, and in time of order $O\big(\log(D)\times n\log(n)\big)$.

Furthermore, using Eq.(C1), it can be easily shown that if instead of the unitary $e^{-i\Delta tS}$ we implement its controlled version, i.e. the unitary $|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes e^{-iS\Delta t}$, then we can simulate the controlled version of $e^{-i\sigma\Delta t}$, i.e. the unitary $|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes e^{-i\sigma\Delta t}$. To implement the unitary $|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes e^{-iS\Delta t}$, we note that

$$e^{-iS_c\Delta t}=|0\rangle\langle0|\otimes e^{-i\Delta t}I+|1\rangle\langle1|\otimes e^{-iS\Delta t}, \tag{C4}$$

where $S_c=|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes S$ is the controlled-SWAP unitary. Using the algorithm presented in the next section, we can efficiently simulate this unitary with error $\epsilon$ in time $O(\log(D)\times\log(1/\epsilon))$. Then, if after applying $e^{-iS_c\Delta t}$ we apply $e^{i\sigma_z\Delta t/2}$ on the control qubit, we implement the desired unitary $|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes e^{-iS\Delta t}$, up to a global phase.

Therefore, having $n$ copies of state $\sigma$, to simulate the unitary $|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes e^{-i\sigma t}$ we first apply the unitary $e^{-iS_ct/n}$ to the system, the control qubit and each one of these $n$ copies, and then apply the unitary $e^{i\sigma_zt/2}$ on the control qubit. It follows that we can implement the unitary $|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes e^{-i\sigma t}$ with the total error less than or equal to $O\big(\frac{t^2+1}{n}\big)$ in time $O\big(\log(D)\times n\log(n)\big)$. We conclude that

Theorem 3. Using $n$ copies of an unknown state $\sigma$ we can simulate the unitary $e^{-i\sigma t}$, or its controlled version $|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes e^{-i\sigma t}$, in time $O\big(\log(D)\times n\log(n)\big)$, with error $O\big(\frac{t^2+1}{n}\big)$ in trace distance.
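The second-order accuracy claimed in Eq.(C1) can be checked numerically on a single qubit. The sketch below is a minimal illustration under stated assumptions: the states $\rho=|0\rangle\langle0|$ and $\sigma=|\phi\rangle\langle\phi|$, the angle 0.7, and all helper names are chosen only for this example, and $\sigma$ is taken to be pure so that $e^{-i\sigma\Delta t}$ has a closed form. It uses $S^2=I$ to write $e^{-i\Delta tS}=\cos\Delta t\,I-i\sin\Delta t\,S$.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def kron(A, B):
    m = len(B)
    n = len(A) * m
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n)] for i in range(n)]

# SWAP on two qubits in the basis |00>, |01>, |10>, |11>.
SWAP = [[1, 0, 0, 0], [0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 1]]

def swap_step(rho, sigma, dt):
    """Left-hand side of Eq. (C1): Tr_RF(e^{-i dt S} (rho x sigma) e^{+i dt S})."""
    # Because S^2 = I, e^{-i dt S} = cos(dt) I - i sin(dt) S.
    U = [[(math.cos(dt) if i == j else 0) - 1j * math.sin(dt) * SWAP[i][j]
          for j in range(4)] for i in range(4)]
    joint = matmul(matmul(U, kron(rho, sigma)), dagger(U))
    # Partial trace over the second qubit (the copy of sigma).
    return [[joint[2 * i][2 * j] + joint[2 * i + 1][2 * j + 1] for j in range(2)]
            for i in range(2)]

def exact_step(rho, sigma, dt):
    """e^{-i sigma dt} rho e^{+i sigma dt}, valid when sigma is a rank-one projector."""
    # For a projector, e^{-i sigma dt} = I + (e^{-i dt} - 1) sigma.
    V = [[(1 if i == j else 0) + (cmath.exp(-1j * dt) - 1) * sigma[i][j]
          for j in range(2)] for i in range(2)]
    return matmul(matmul(V, rho), dagger(V))

def max_error(dt, rho, sigma):
    a, b = swap_step(rho, sigma, dt), exact_step(rho, sigma, dt)
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

angle = 0.7                                   # illustrative choice
phi = [math.cos(angle), math.sin(angle)]
sigma = [[x * y for y in phi] for x in phi]   # pure state |phi><phi|
rho = [[1, 0], [0, 0]]                        # system state |0><0|

err_small = max_error(0.01, rho, sigma)
err_large = max_error(0.02, rho, sigma)
print(err_small, err_large, err_large / err_small)
```

Doubling $\Delta t$ should increase the printed error by roughly a factor of four, which is the $O(\Delta t^2)$ scaling that leads to the $t^2/n$ term in Eq.(C2).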

1. An efficient quantum circuit for simulating the time evolution generated by the (controlled-)SWAP Hamiltonian

In this section we present a simple quantum circuit to simulate the time evolution generated by the SWAP Hamiltonian on a pair of systems whose Hilbert spaces have equal dimension. This algorithm simulates any such unitary with error less than or equal to $\epsilon$ (in trace distance) in time $O(\log(D)\times\log(1/\epsilon))$, where $D$ is the dimension of the Hilbert space of each system. We also explain how the same idea can be used to simulate the controlled-SWAP Hamiltonian.

Let $S$ be the SWAP operator acting on a pair of systems $A$ and $B$. The fact that $S^2$ is the identity operator implies that any unitary generated by $S$ has the form

$$e^{iS\theta}=\cos\theta\,I+i\sin\theta\,S, \tag{C5}$$

where $I$ is the identity operator on the joint system. To simulate a general unitary of this form we first implement $S_c$, the controlled-SWAP unitary defined by

$$S_c=|0\rangle\langle0|_c\otimes I+|1\rangle\langle1|_c\otimes S, \tag{C6}$$

where $c$ is the label for an ancillary qubit. This unitary can be efficiently implemented using controlled-SWAP gates for qubits, also known as Fredkin gates [4]. To implement $S_c$ in this way, we need to apply $O(\log(D))$ Fredkin gates, where each Fredkin gate is controlled by the ancillary qubit $c$ and acts on a qubit in system $A$ and its corresponding qubit in system $B$.

Then, using the controlled-SWAP operator $S_c$, we can easily implement the unitary $e^{iS\theta}$ for any $\theta$. First, we prepare the ancillary qubit in state $\cos\theta|0\rangle+i\sin\theta|1\rangle$ and apply the controlled-SWAP $S_c$ to the ancillary qubit and the systems $A$ and $B$ in an arbitrary initial state $|\gamma\rangle$. This yields

$$S_c\big(\cos\theta|0\rangle+i\sin\theta|1\rangle\big)\otimes|\gamma\rangle=\cos\theta|0\rangle|\gamma\rangle+i\sin\theta|1\rangle S|\gamma\rangle. \tag{C7}$$

Next, we measure the ancillary qubit in the $\{(|0\rangle\pm|1\rangle)/\sqrt{2}\}$ basis. Then, with probability 1/2 we project this qubit to state $(|0\rangle+|1\rangle)/\sqrt{2}$, in which case we get state

$$\frac{|0\rangle+|1\rangle}{\sqrt{2}}\otimes\big(\cos\theta|\gamma\rangle+i\sin\theta\,S|\gamma\rangle\big)=\frac{|0\rangle+|1\rangle}{\sqrt{2}}\otimes e^{i\theta S}|\gamma\rangle, \tag{C8}$$

and with probability 1/2 we project it to state $(|0\rangle-|1\rangle)/\sqrt{2}$, in which case we get state

$$\frac{|0\rangle-|1\rangle}{\sqrt{2}}\otimes\big(\cos\theta|\gamma\rangle-i\sin\theta\,S|\gamma\rangle\big)=\frac{|0\rangle-|1\rangle}{\sqrt{2}}\otimes e^{-i\theta S}|\gamma\rangle. \tag{C9}$$

Therefore, in the latter case, instead of the desired unitary $e^{i\theta S}$ we have implemented the unitary $e^{-i\theta S}$. But because these unitaries all commute with each other, we can correct this error: we repeat the above process, and this time we try to implement the unitary $e^{i2\theta S}$ by preparing the ancilla in state $\cos2\theta|0\rangle+i\sin2\theta|1\rangle$ instead of $\cos\theta|0\rangle+i\sin\theta|1\rangle$. Then, again with probability 1/2 we will be successful and implement the overall unitary $e^{i2\theta S}e^{-i\theta S}=e^{i\theta S}$, as desired. On the other hand, with probability 1/2 we are unsuccessful, in which case we have implemented the overall unitary $e^{-iS\theta}e^{-iS2\theta}=e^{-iS3\theta}$. In this case we repeat the above process and try to implement the unitary $e^{iS4\theta}$ by preparing the ancillary qubit in state $\cos4\theta|0\rangle+i\sin4\theta|1\rangle$.

Repeating this process $r$ times we achieve the desired unitary with probability $1-1/2^r$. Since each time we need to apply $S_c$, which takes time of order $\log D$, we find that the total time it takes to implement $e^{iS\theta}$ with error $\epsilon$ is $O(\log(D)\times\log(1/\epsilon))$.

Finally, note that we can use a similar method to implement the unitary

$$e^{i\theta S_c}=|0\rangle\langle0|\otimes e^{i\theta}I+|1\rangle\langle1|\otimes e^{iS\theta}=\cos\theta\,I+i\sin\theta\,S_c. \tag{C10}$$

To implement this unitary, it suffices to replace the Fredkin gates in the above algorithm with controlled-Fredkin gates, i.e. controlled-controlled-SWAP gates, each of which acts on a pair of qubits in systems $A$ and $B$ and is controlled by the original control qubit and the ancillary qubit needed for the above procedure.

Appendix D: Generalizations

In this section we discuss several generalizations of the results presented in the paper. These generalizations are summarized in Sec.(IV) of the paper.

1. Input-output pairs of mixed states

The algorithm presented in the paper assumes the given samples of the input-output pairs are all pure states. However, as we explained at the end of Sec.(III A), this algorithm can be generalized to the case where the sample states contain mixed states. It can be easily shown that, as long as the sample input states $S_{\rm in}=\{\rho^{\rm in}_k,\ k=1,\cdots,K\}$ contain (at least) one state close to a pure state, we can still coherently erase the state of the system and push it into this pure state. Then, we can emulate the action of the unknown unitary $U$ using the same approach we used in the original algorithm.

Let $\mathcal{H}_{\rm in}$ be the subspace spanned by the union of the supports of the density operators in $S_{\rm in}=\{\rho^{\rm in}_k,\ k=1,\cdots,K\}$. Then, having copies of sample input-output states in $S_{\rm in}$ and $S_{\rm out}$, we can determine the action of $U$ on any state in $\mathcal{H}_{\rm in}$ if and only if the set $S_{\rm in}$ generates the full matrix algebra on $\mathcal{H}_{\rm in}$. Therefore, in the following we naturally assume this condition is satisfied.

To erase the state of the system coherently we use controlled-translations with respect to the given sample states, i.e. the unitaries

$$T_a(k,t)=|0\rangle\langle0|_a\otimes I+|1\rangle\langle1|_a\otimes e^{-i\rho_kt}, \tag{D1}$$

where we have suppressed the in and out superscripts on both sides. As we discussed in Appendix C, using the given copies of sample states we can efficiently simulate these unitaries. Recall that the input set $S_{\rm in}$ (and hence $S_{\rm out}$) contains at least one pure state. Without loss of generality, let $\rho^{\rm in}_1$ and $\rho^{\rm out}_1=U\rho^{\rm in}_1U^\dagger$ be the pure state in the sample set $S_{\rm in}$ and its corresponding output state in the set $S_{\rm out}$, respectively. Then, instead of the unitaries $W_a(k)=R_a(k)H_aR_a(1)$ used in the original algorithm, we use the unitaries $W_a(k,t)=T_a(k,t)H_aT_a(1,\pi)$, where we have suppressed the in and out superscripts, and choose $k$ and $t$ uniformly at random from the set $1,\cdots,K$ and an interval $[0,\ \dots]$, respectively. The state $\rho^{\rm in}_1=|\phi^{\rm in}_1\rangle\langle\phi^{\rm in}_1|$ is the unique fixed point state of the channel

$$\mathcal{W}(\tau)=\frac{1}{K}\sum_{k=1}^K\int dt\ {\rm Tr}_a\Big(W^{\rm in}_a(k,t)\big[\tau\otimes|-\rangle\langle-|_a\big]W^{{\rm in}\,\dagger}_a(k,t)\Big)$$

inside $\mathcal{H}_{\rm in}$. It follows that we can coherently erase the state of the system, and therefore, using the same technique we used in the main algorithm, we can emulate the action of the unitary $U$.

2. Emulating controlled unitaries

In many quantum algorithms, such as quantum phase estimation, one needs to implement the controlled version of a unitary $U$, i.e. the unitary $U_c\equiv|0\rangle\langle0|\otimes I+|1\rangle\langle1|\otimes U$. Can we modify the proposed algorithm to implement the controlled version of $U$ as well?

To answer this question, first note that if the only given resources are multiple copies of states in $S_{\rm in}$ and $S_{\rm out}$, it is impossible to distinguish between the unitaries $U$ and $e^{i\theta}U$, for any phase $e^{i\theta}$. On the other hand, in general the controlled versions of $U$ and $e^{i\theta}U$ are distinct unitaries, which are not equivalent up to a global phase. This means that even to specify what the controlled version of $U$ is, we need extra resources that define and fix this global phase. For instance, we can use multiple copies of the input state $|\Phi_{\rm in}\rangle=(\alpha|0\rangle+\beta|1\rangle)\otimes|\phi_{\rm in}\rangle$, with $\alpha\neq0$, $\beta\neq0$ and $|\phi_{\rm in}\rangle\in\mathcal{H}_{\rm in}$, together with copies of its corresponding output $|\Phi_{\rm out}\rangle=U_c|\Phi_{\rm in}\rangle$.

Therefore, in the following we assume that, in addition to the multiple copies of states in $S_{\rm in}$ and $S_{\rm out}$, we are also given multiple copies of states $|\Phi_{\rm in}\rangle$ and $|\Phi_{\rm out}\rangle$. Again, we naturally assume the set $S_{\rm in}$ generates the full matrix algebra on $\mathcal{H}_{\rm in}$. This, together with the fact that $|\phi_{\rm in}\rangle\in\mathcal{H}_{\rm in}$, implies that the set

$$\big(\{|0\rangle\langle0|,|1\rangle\langle1|\}\otimes S_{\rm in}\big)\cup\{|\Phi_{\rm in}\rangle\langle\Phi_{\rm in}|\} \tag{D2}$$

generates the full matrix algebra on $\mathbb{C}^2\otimes\mathcal{H}_{\rm in}$, where $\mathbb{C}^2$ denotes the Hilbert space of the control qubit. It follows that, given these resources, we can now implement the algorithm proposed in the paper to emulate the controlled-unitary $U_c$ on $\mathbb{C}^2\otimes\mathcal{H}_{\rm in}$. In other words, if instead of states in the sets $S_{\rm in}$ and $S_{\rm out}$ we choose states from the sets $\big(\{|0\rangle\langle0|,|1\rangle\langle1|\}\otimes S_{\rm in}\big)\cup\{|\Phi_{\rm in}\rangle\langle\Phi_{\rm in}|\}$ and $\big(\{|0\rangle\langle0|,|1\rangle\langle1|\}\otimes S_{\rm out}\big)\cup\{|\Phi_{\rm out}\rangle\langle\Phi_{\rm out}|\}$ respectively, we implement the unitary $U_c$.
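The role of the global phase in this argument can be made concrete with a small check. The following sketch assumes a single-qubit example (the Hadamard matrix as $U$ and $\theta=0.3$, both arbitrary illustrative choices) and uses hypothetical helper names; it tests whether two matrices agree up to one overall phase factor.

```python
import cmath
import math

def controlled(U):
    """|0><0| x I + |1><1| x U for a 2x2 matrix U."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, U[0][0], U[0][1]],
            [0, 0, U[1][0], U[1][1]]]

def equal_up_to_phase(A, B, tol=1e-12):
    """True if B = c * A for a single complex number c with |c| = 1."""
    n = len(A)
    entries = [(A[i][j], B[i][j]) for i in range(n) for j in range(n)]
    # Fix the candidate phase from the first nonzero entry of A.
    a0, b0 = next((a, b) for a, b in entries if abs(a) > tol)
    c = b0 / a0
    return abs(abs(c) - 1) < tol and all(abs(b - c * a) < tol for a, b in entries)

h = 1 / math.sqrt(2)
U = [[h, h], [h, -h]]                      # example unitary (Hadamard)
phase = cmath.exp(1j * 0.3)                # an arbitrary global phase e^{i theta}
U_phase = [[phase * x for x in row] for row in U]

same_bare = equal_up_to_phase(U, U_phase)
same_controlled = equal_up_to_phase(controlled(U), controlled(U_phase))
print(same_bare, same_controlled)          # prints "True False"
```

The bare unitaries differ only by a global phase, while their controlled versions do not, which is why the extra samples $|\Phi_{\rm in}\rangle$ and $|\Phi_{\rm out}\rangle$ are needed to fix the phase.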

3. More efficient algorithm with prior information about samples

The algorithm presented in the paper does not assume any prior information about the sample input states, or the relation between them. On the other hand, as we show in the following, by making some assumptions about the sample input states we can emulate the unknown unitary more efficiently. Note that we do not assume any prior information about any single sample state; the assumption is only about the relation between them. More precisely, the assumption is about the pairwise inner products between the states in the input set.

The main idea behind this version of the algorithm is again emulating via coherent erasing. We use the fact that by measuring the system in two conjugate bases, we can completely erase the state of the system. Therefore, we assume we are given multiple copies of states in an orthonormal basis for the input subspace, and their corresponding output states. We also assume we are given multiple copies of one (or more) state in the conjugate basis, and their corresponding outputs. Then, using these sample states we can simulate coherent measurements in both bases, and coherently erase the state of the system.

Let $\{|\theta^{\rm in}_k\rangle: k=0\cdots d-1\}$ be $d$ unknown orthonormal states in the $d$-dimensional input subspace $\mathcal{H}_{\rm in}$, and

$$|\alpha^{\rm in}_j\rangle=\frac{1}{\sqrt{d}}\sum_{k=0}^{d-1}e^{i2\pi kj/d}|\theta^{\rm in}_k\rangle:\quad j=0\cdots d-1, \tag{D3}$$

be the orthonormal basis for $\mathcal{H}_{\rm in}$ which is conjugate to $\{|\theta^{\rm in}_k\rangle: k=0\cdots d-1\}$. We assume we are given multiple copies of the input states $\{|\theta^{\rm in}_k\rangle: k=0\cdots d-1\}$ and $\{|\alpha^{\rm in}_j\rangle: j=0\cdots d-1\}$, and their corresponding output states $\{|\theta^{\rm out}_k\rangle=U|\theta^{\rm in}_k\rangle: k=0\cdots d-1\}$ and $\{|\alpha^{\rm out}_j\rangle=U|\alpha^{\rm in}_j\rangle: j=0\cdots d-1\}$. Note that, even if we are only given multiple copies of the states $\{|\theta^{\rm in}_k\rangle: k=0\cdots d-1\}$ and multiple copies of one of the states in $\{|\alpha^{\rm in}_k\rangle: k=0\cdots d-1\}$, we can efficiently generate all states in this set (and similarly for the set $\{|\alpha^{\rm out}_k\rangle: k=0\cdots d-1\}$).

(Footnote: Let $P_{\rm in}=\sum_{k=0}^{d-1}k|\theta^{\rm in}_k\rangle\langle\theta^{\rm in}_k|$. Having multiple copies of the states $\{|\theta^{\rm in}_k\rangle: k=0\cdots d-1\}$ we can simulate the unitary $e^{-iP_{\rm in}s}$ for any real $s$, and by applying this unitary for different values of $s$ we can transform one element of the set $\{|\alpha^{\rm in}_k\rangle: k=0\cdots d-1\}$ to the other elements. To simulate the unitary $e^{-iP_{\rm in}s}$, we note that $e^{-iP_{\rm in}s}=\prod_{k=0}^{d-1}e^{-isk|\theta^{\rm in}_k\rangle\langle\theta^{\rm in}_k|}$. As we have seen in Appendix C, each unitary $e^{-isk|\theta^{\rm in}_k\rangle\langle\theta^{\rm in}_k|}$ can be efficiently simulated using multiple copies of state $|\theta^{\rm in}_k\rangle$, in time $O(\log(D))$. Therefore, we can simulate the unitary $e^{-iP_{\rm in}s}$ in time $O(d\log(D))$.)

In the first step of this algorithm we perform a coherent measurement in the $\{|\theta^{\rm in}_k\rangle: k=0\cdots d-1\}$ basis. To do this we couple the system to an ancillary system with a $d$-dimensional Hilbert space. The ancillary system is initially prepared in the state $|\Gamma\rangle=\frac{1}{\sqrt{d}}\sum_{t=0}^{d-1}|t\rangle$, where $\{|t\rangle: t=0\cdots d-1\}$ is a standard orthonormal basis. Then, we use the given copies of the sample states $\{|\theta^{\rm in}_k\rangle: k=0\cdots d-1\}$ to simulate the unitary

$$V^{\rm in}_P=\sum_{k,t=0}^{d-1}e^{itk2\pi/d}|\theta^{\rm in}_k\rangle\langle\theta^{\rm in}_k|\otimes|t\rangle\langle t|=\sum_{t=0}^{d-1}e^{itP_{\rm in}2\pi/d}\otimes|t\rangle\langle t|, \tag{D4}$$

where $P_{\rm in}=\sum_{k=0}^{d-1}k|\theta^{\rm in}_k\rangle\langle\theta^{\rm in}_k|$. To efficiently simulate this unitary we first note that it has the decomposition

$$V^{\rm in}_P=\prod_{k=0}^{d-1}\sum_{t=0}^{d-1}e^{it|\theta^{\rm in}_k\rangle\langle\theta^{\rm in}_k|2\pi k/d}\otimes|t\rangle\langle t|. \tag{D5}$$

Therefore, to simulate $V^{\rm in}_P$ we can simulate the (commuting) unitaries $\sum_{t=0}^{d-1}e^{it|\theta^{\rm in}_k\rangle\langle\theta^{\rm in}_k|2\pi k/d}\otimes|t\rangle\langle t|$, for $k=0,\cdots,d-1$. Each unitary $\sum_{t=0}^{d-1}e^{it|\theta^{\rm in}_k\rangle\langle\theta^{\rm in}_k|2\pi k/d}\otimes|t\rangle\langle t|$ can be efficiently simulated using the given copies of state $|\theta^{\rm in}_k\rangle$. Using the results presented in Appendix C, this simulation can be done in time $O(\log D)$, where $D$ is the dimension of the Hilbert space. Then, it follows from the decomposition of $V^{\rm in}_P$ given by Eq.(D5) that we can simulate $V^{\rm in}_P$ in time $O(d\log D)$, using the given copies of the sample states $\{|\theta^{\rm in}_k\rangle, k=0\cdots d-1\}$.

Applying the unitary $V^{\rm in}_P$ to the system in state $|\psi\rangle\in\mathcal{H}_{\rm in}$ and the ancillary system prepared in state $|\Gamma\rangle$, we get

$$|\psi\rangle\otimes|\Gamma\rangle=\sum_{k=0}^{d-1}\psi_k|\theta^{\rm in}_k\rangle\otimes|\Gamma\rangle\ \longrightarrow\ V^{\rm in}_P|\psi\rangle\otimes|\Gamma\rangle=\sum_{k=0}^{d-1}\psi_k|\theta^{\rm in}_k\rangle\otimes\frac{1}{\sqrt{d}}\sum_{t=0}^{d-1}e^{itk2\pi/d}|t\rangle, \tag{D6}$$

where $|\psi\rangle=\sum_{k=0}^{d-1}\psi_k|\theta^{\rm in}_k\rangle$. Then, performing the quantum Fourier transform on the ancillary system we transform the joint state to

$$V^{\rm in}_P|\psi\rangle\otimes|\Gamma\rangle=\sum_{k=0}^{d-1}\psi_k|\theta^{\rm in}_k\rangle\otimes\frac{1}{\sqrt{d}}\sum_{t=0}^{d-1}e^{itk2\pi/d}|t\rangle\ \xrightarrow{\ \rm QFT\ }\ \sum_{k=0}^{d-1}\psi_k|\theta^{\rm in}_k\rangle\otimes|k\rangle. \tag{D7}$$

Then, to erase the information in the system we implement the unitary

$$V^{\rm in}_Q=\sum_{k,t=0}^{d-1}e^{itk2\pi/d}|\alpha^{\rm in}_k\rangle\langle\alpha^{\rm in}_k|\otimes|t\rangle\langle t|=\sum_{t=0}^{d-1}e^{itQ_{\rm in}2\pi/d}\otimes|t\rangle\langle t|, \tag{D8}$$

where $Q_{\rm in}=\sum_{l=0}^{d-1}l|\alpha^{\rm in}_l\rangle\langle\alpha^{\rm in}_l|$. Note that we can efficiently simulate $V^{\rm in}_Q$ using an approach similar to the one we used to simulate $V^{\rm in}_P$.

Applying $V^{\rm in}_Q$ to the state in Eq.(D7) we find

$$\sum_{k=0}^{d-1}\psi_k|\theta^{\rm in}_k\rangle\otimes|k\rangle\ \longrightarrow\ V^{\rm in}_Q\sum_{k=0}^{d-1}\psi_k|\theta^{\rm in}_k\rangle\otimes|k\rangle=\sum_{k=0}^{d-1}\psi_k\,e^{ikQ_{\rm in}2\pi/d}|\theta^{\rm in}_k\rangle\otimes|k\rangle=|\theta^{\rm in}_0\rangle\otimes\sum_{k=0}^{d-1}\psi_k|k\rangle, \tag{D9}$$

where we have used the fact that

$$e^{ikQ_{\rm in}2\pi/d}|\theta^{\rm in}_k\rangle=\sum_{l=0}^{d-1}e^{i2\pi lk/d}|\alpha^{\rm in}_l\rangle\langle\alpha^{\rm in}_l|\theta^{\rm in}_k\rangle=\frac{1}{\sqrt{d}}\sum_{l=0}^{d-1}e^{i2\pi lk/d}e^{-i2\pi kl/d}|\alpha^{\rm in}_l\rangle=|\theta^{\rm in}_0\rangle. \tag{D10}$$

Therefore, after these three steps, for any input state $|\psi\rangle\in\mathcal{H}_{\rm in}$ we have

$$\big[V^{\rm in}_Q(I\otimes F_a)V^{\rm in}_P\big]|\psi\rangle|\Gamma\rangle=|\theta^{\rm in}_0\rangle\otimes\sum_{k=0}^{d-1}\psi_k|k\rangle, \tag{D11}$$

where $F_a$ denotes the quantum Fourier transform on the ancillary system.

At this point we have completely erased the state of the system and transferred all its information to the ancillary system. Now, using a method similar to the one used in the main algorithm, we can transform this state to the state $U|\psi\rangle|\Gamma\rangle$: we replace $|\theta^{\rm in}_0\rangle$ with $|\theta^{\rm out}_0\rangle$ and apply $V^{{\rm out}\,\dagger}_P(I\otimes F_a^\dagger)V^{{\rm out}\,\dagger}_Q$ to the state $|\theta^{\rm out}_0\rangle\otimes\sum_{k=0}^{d-1}\psi_k|k\rangle$, where

$$V^{\rm out}_Q=\sum_{k,t=0}^{d-1}e^{itk2\pi/d}|\alpha^{\rm out}_k\rangle\langle\alpha^{\rm out}_k|\otimes|t\rangle\langle t|=\sum_{t=0}^{d-1}e^{itQ_{\rm out}2\pi/d}\otimes|t\rangle\langle t|, \tag{D12}$$

$$V^{\rm out}_P=\sum_{k,t=0}^{d-1}e^{itk2\pi/d}|\theta^{\rm out}_k\rangle\langle\theta^{\rm out}_k|\otimes|t\rangle\langle t|=\sum_{t=0}^{d-1}e^{itP_{\rm out}2\pi/d}\otimes|t\rangle\langle t|. \tag{D13}$$

Note that these unitaries can be efficiently simulated using the same method we used to simulate the unitary $V^{\rm in}_P$. It follows that we can emulate the action of the unitary $U$ on the input subspace in time $O(d\log(D))$.
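The erasure identity of Eq.(D11) can be verified with a direct state-vector simulation for small $d$. The sketch below is illustrative only: it assumes $d=4$, takes $|\theta_k\rangle$ to be the computational basis (in the algorithm these states are unknown), applies the unitaries exactly rather than simulating them from sample copies, and fixes the sign convention of the Fourier transform so that the phase state in Eq.(D6) is mapped to $|k\rangle$. All function names are hypothetical.

```python
import cmath
import math

d = 4

def w(x):
    """exp(2*pi*i*x/d)."""
    return cmath.exp(2j * math.pi * x / d)

# Illustration-only assumption: |theta_k> is the computational basis.
theta = [[1.0 if s == k else 0.0 for s in range(d)] for k in range(d)]
# Conjugate basis |alpha_j>, Eq. (D3).
alpha = [[sum(w(k * j) * theta[k][s] for k in range(d)) / math.sqrt(d) for s in range(d)]
         for j in range(d)]

def controlled_phase(amp, basis):
    """Apply sum_t exp(i t G 2pi/d) x |t><t|, where G = sum_l l |b_l><b_l|."""
    out = [[0j] * d for _ in range(d)]
    for t in range(d):
        for s in range(d):
            out[s][t] = sum(
                sum(w(l * t) * basis[l][s] * complex(basis[l][r]).conjugate()
                    for l in range(d)) * amp[r][t]
                for r in range(d))
    return out

def fourier_on_ancilla(amp):
    """Fourier transform on the ancilla; sign chosen so the phase state maps to |k>."""
    return [[sum(w(-t * m) * amp[s][t] for t in range(d)) / math.sqrt(d) for m in range(d)]
            for s in range(d)]

raw = [0.5, 0.5j, -0.1 + 0.3j, 0.2]           # arbitrary test amplitudes
norm = math.sqrt(sum(abs(x) ** 2 for x in raw))
psi = [x / norm for x in raw]

# amp[s][t]: amplitude of |theta_s> (system) x |t> (ancilla); start in |psi> x |Gamma>.
amp = [[psi[s] / math.sqrt(d) for t in range(d)] for s in range(d)]
amp = controlled_phase(amp, theta)            # V_P^in, Eq. (D4)
amp = fourier_on_ancilla(amp)                 # Eq. (D7)
amp = controlled_phase(amp, alpha)            # V_Q^in, Eq. (D8)

system_erased = max(abs(amp[s][t]) for s in range(1, d) for t in range(d))
info_moved = max(abs(amp[0][k] - psi[k]) for k in range(d))
print(system_erased, info_moved)
```

Both printed numbers should be at the level of floating-point rounding: the system ends in $|\theta_0\rangle$ and the ancilla carries the amplitudes $\psi_k$, as in Eq.(D11).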

4. Approximate transformations

In the above discussions we always assumed there exists a unitary $U$ for which $U|\phi^{\rm in}_k\rangle=|\phi^{\rm out}_k\rangle$, for all $k=1,\cdots,K$. Now suppose we only know that these transformations are possible approximately, i.e. there exists a constant $\delta>0$ and a unitary $U$ such that $\big\|U|\phi^{\rm in}_k\rangle-|\phi^{\rm out}_k\rangle\big\|\le\delta$, for $k=1,\cdots,K$. Then, it can be easily shown that if we run the proposed algorithm on an input state $|\psi\rangle\in\mathcal{H}_{\rm in}$ with the given sample input-output pairs, the output is a quantum state whose trace distance from the state $U|\psi\rangle$ is bounded by $O(T\times\delta)=\delta\times{\rm poly}(d)$. This follows from the fact that, under the above assumption, we can repeat the argument that is used to derive Eq.(3.3) from Eq.(3.2), and use $\big\|U|\phi^{\rm in}_1\rangle-|\phi^{\rm out}_1\rangle\big\|\le\delta$ and $\big\|(U\otimes I_a)W^{\rm in}_a(k)(U^\dagger\otimes I_a)-W^{\rm out}_a(k)\big\|\le\delta$, which follows from $\big\|U|\phi^{\rm in}_k\rangle-|\phi^{\rm out}_k\rangle\big\|\le\delta$, for $k=1,\cdots,K$. This implies Eq.(3.3) is satisfied with an additional error of order $O(T\delta)$, which proves the claim. Note that, for general input-output sets $S_{\rm in}$ and $S_{\rm out}$ and a general input state $|\psi\rangle\in\mathcal{H}_{\rm in}$, the lowest achievable error in this transformation is $O(d\times\delta)$.

FIG. 3: The circuit for emulating the two-outcome projective measurement with projectors $\{\Pi,I-\Pi\}$, where $\Pi$ is the projector to the subspace spanned by the sample states $\{|\phi_k\rangle: 1,\cdots,K\}$.

Appendix E: Emulating projective measurements

In this section we consider the problem of emulating projective measurements. We assume we are given multiple copies of states that belong to different subspaces of the Hilbert space, with labels which specify these subspaces. Then, we are interested in simulating the projective measurement that projects any given input state to one of these subspaces. Note that any projective measurement can be realized as a sequence of two-outcome measurements. Therefore, in the following we focus on implementing two-outcome measurements.

Consider the set of sample states $\{|\phi_k\rangle: 1,\cdots,K\}$. We do not make any assumption about the sample states, or the relation between them. Let $\mathcal{H}_\Pi$ be the subspace spanned by these states, and $\Pi$ be the projector to this subspace. Assuming we are given multiple copies of each state in this set, we are interested in implementing the projective measurement described by the projectors $\{\Pi,I-\Pi\}$.

In the algorithm we use the controlled-reflections

$$R_a(k)=|0\rangle\langle0|_a\otimes I+|1\rangle\langle1|_a\otimes e^{i\pi|\phi_k\rangle\langle\phi_k|}. \tag{E1}$$

As we explained in Appendix C, using $n$ copies of state $|\phi_k\rangle$ we can implement this unitary in time $O\big(\log(D)\times n\log(n)\big)$, with error $O(1/n)$ in trace distance, where $D$ is the dimension of the Hilbert space.

a. The Algorithm

The quantum circuit for this algorithm is presented in Fig.(3). The algorithm has the following three steps:

(i) Consider $T$ qubits $a_1\cdots a_T$, which are all initially prepared in state $|-\rangle=(|0\rangle-|1\rangle)/\sqrt{2}$. Let $k_1,\cdots,k_T$ be $T$ independent random integers, each chosen uniformly at random from $1,\cdots,K$. Apply the controlled-reflection $R_{a_1}(k_1)$ to the system and qubit $a_1$, $R_{a_2}(k_2)$ to the system and qubit $a_2$, and so on until the last controlled-reflection $R_{a_T}(k_T)$, which is applied to the system and qubit $a_T$.

(ii) Perform the two-outcome projective measurement $\{|-\rangle\langle-|^{\otimes T},\ I-|-\rangle\langle-|^{\otimes T}\}$ on qubits $a_1\cdots a_T$. (This can be implemented efficiently, e.g., by first applying the Hadamard gates on qubits $a_1\cdots a_T$, then applying a $T$-bit Toffoli gate controlled by $a_1\cdots a_T$ that acts on an ancillary qubit initialized in state $|0\rangle$, and finally applying the Hadamard gates on qubits $a_1\cdots a_T$ again. Then, measuring this ancillary qubit in the computational basis realizes the desired measurement.)

(iii) Recall the random integers $k_1,\cdots,k_T$ chosen in step (i). Apply the controlled-reflection $R_{a_T}(k_T)$ to the system and qubit $a_T$, then apply the controlled-reflection $R_{a_{T-1}}(k_{T-1})$ to the system and qubit $a_{T-1}$, and so on until the last controlled-reflection $R_{a_1}(k_1)$, which is applied to the system and qubit $a_1$.

Then, at the end of the algorithm, the initial state $|\psi\rangle$ is projected to the state $\Pi|\psi\rangle/\sqrt{\langle\psi|\Pi|\psi\rangle}$ with probability $\langle\psi|\Pi|\psi\rangle$ (corresponding to outcome $b=0$ in the measurement in step (ii)), and is projected to the state $(I-\Pi)|\psi\rangle/\sqrt{1-\langle\psi|\Pi|\psi\rangle}$ with probability $1-\langle\psi|\Pi|\psi\rangle$ (corresponding to outcome $b=1$ in the measurement).

b. How it works

In the following we assume the circuit is run with perfect controlled-reflections.
Then, the additional error due to the imperfections in the simulations of the controlled-reflections can be taken into account using the same approach we used for the main algorithm.

Let $\mathcal{E}$ be the quantum channel that describes the overall effect of the circuit in Fig.(3) in the case where we ignore the output of the measurement, i.e. we do not postselect. Let $\mathcal{H}_\Pi$ be the subspace spanned by the sample states $\{|\phi_k\rangle: k=1,\cdots,K\}$.

Theorem 4. Let $\lambda_{\min}$ be the minimum non-zero eigenvalue of the density operator $K^{-1}\sum_{k=1}^K|\phi_k\rangle\langle\phi_k|$. If the support of $\rho$, the state of the system, is contained in either the subspace $\mathcal{H}_\Pi$ or the orthogonal subspace, then the probability of error in emulating the measurement $\{I-\Pi,\Pi\}$ is less than or equal to $(1-\lambda_{\min})^T$. Furthermore, the fidelity of $\mathcal{E}(\rho)$, the output of the circuit, with $\rho$, the desired output state, satisfies $F(\mathcal{E}(\rho),\rho)\ge1-(1-\lambda_{\min})^T$.

Proof.

First, consider the case where $\rho$, the initial state of the system, does not have any support on $\mathcal{H}_\Pi$, i.e. $\Pi\rho\Pi=0$. Then, all the reflections $e^{i\pi|\phi_k\rangle\langle\phi_k|}$ act as the identity operator on this state. Therefore, after step (i) the state of qubits $a_1\cdots a_T$ remains unchanged. In this case, in step (ii) we project these qubits to state $|-\rangle^{\otimes T}$ with probability one. It follows that in this case the algorithm emulates the action of the measurement perfectly.

Next, we consider the case where $\Pi\rho\Pi=\rho$. Using the fact that

$$\langle-|R_a(k)|-\rangle_a=\frac{I+e^{i\pi|\phi_k\rangle\langle\phi_k|}}{2}=I-|\phi_k\rangle\langle\phi_k|, \tag{E2}$$

we find that for a general state $\rho$ (not necessarily restricted to $\mathcal{H}_\Pi$) the probability that in step (ii) we find the qubits $a_1\cdots a_T$ in state $|-\rangle^{\otimes T}$ is given by

\begin{align}
p(|-\rangle^{\otimes T}) &= \frac{1}{K^T}\sum_{k_1,\cdots,k_T=1}^{K}{\rm Tr}\Big(\big[I\otimes|-\rangle\langle-|^{\otimes T}\big]R_{a_T}(k_T)\cdots R_{a_1}(k_1)\big[\rho\otimes|-\rangle\langle-|^{\otimes T}\big]R_{a_1}(k_1)\cdots R_{a_T}(k_T)\Big) \tag{E3a}\\
&= \frac{1}{K^T}\sum_{k_1,\cdots,k_T=1}^{K}{\rm Tr}\Big(\big(I-|\phi_{k_T}\rangle\langle\phi_{k_T}|\big)\cdots\big(I-|\phi_{k_1}\rangle\langle\phi_{k_1}|\big)\,\rho\,\big(I-|\phi_{k_1}\rangle\langle\phi_{k_1}|\big)\cdots\big(I-|\phi_{k_T}\rangle\langle\phi_{k_T}|\big)\Big) \tag{E3b}\\
&= {\rm Tr}\big(\mathcal{M}^T(\rho)\big), \tag{E3c}
\end{align}

where $\mathcal{M}(\cdot)=\frac{1}{K}\sum_{k=1}^KP^\perp_k(\cdot)P^\perp_k$, and $P^\perp_k=I-|\phi_k\rangle\langle\phi_k|$.

Next, note that for any operator $X$ we have

$${\rm Tr}(\mathcal{M}(X))=\frac{1}{K}\sum_{k=1}^K{\rm Tr}(XP^\perp_k)={\rm Tr}(X)-{\rm Tr}\Big(X\sum_{k=1}^K\frac{1}{K}|\phi_k\rangle\langle\phi_k|\Big)={\rm Tr}(X)-{\rm Tr}(X\sigma_{\rm avg}), \tag{E4}$$

where

$$\sigma_{\rm avg}\equiv\frac{1}{K}\sum_{k=1}^K|\phi_k\rangle\langle\phi_k|. \tag{E5}$$

The subspace $\mathcal{H}_\Pi$ is defined as the subspace spanned by $\{|\phi_k\rangle: k=1,\cdots,K\}$. Therefore, the density matrix $\sigma_{\rm avg}$ is automatically full-rank in this subspace. Let $\lambda_{\min}$ be the minimum eigenvalue of $\sigma_{\rm avg}$ in this subspace, i.e. the minimum nonzero eigenvalue of $\sigma_{\rm avg}$. Note that for any positive operator $X$ whose support is restricted to $\mathcal{H}_\Pi$ we have

$${\rm Tr}(X\sigma_{\rm avg})\ge\lambda_{\min}{\rm Tr}(X). \tag{E6}$$

Then, using Eq.(E4), this means that for any positive operator $X$ whose support is restricted to $\mathcal{H}_\Pi$ we have

$${\rm Tr}(\mathcal{M}(X))={\rm Tr}(X)-{\rm Tr}(X\sigma_{\rm avg})\le{\rm Tr}(X)(1-\lambda_{\min}). \tag{E7}$$

Next, note that if $X$ is a positive operator with support restricted to $\mathcal{H}_\Pi$, then $\mathcal{M}(X)$ is also a positive operator with support restricted to $\mathcal{H}_\Pi$. It follows that if $X$ is a positive operator with support restricted to $\mathcal{H}_\Pi$, then

$${\rm Tr}(\mathcal{M}^T(X))\le{\rm Tr}(X)\times(1-\lambda_{\min})^T. \tag{E8}$$

This together with Eq.(E3) implies that if the support of the initial state $\rho$ is restricted to $\mathcal{H}_\Pi$, then the probability that at the end of the algorithm we find qubits $a_1\cdots a_T$ in state $|-\rangle^{\otimes T}$ satisfies

$$p(|-\rangle^{\otimes T})={\rm Tr}\big(\mathcal{M}^T(\rho)\big)\le(1-\lambda_{\min})^T. \tag{E9}$$

This completes the proof of the first part of the theorem.

Finally, the second part of the theorem, i.e. the bound on the fidelity, follows from the following lemma, which is proven in the same way we proved Theorem 1.

Lemma 5

Consider the following quantum operation: (i) We apply a unitary $V(\lambda)$ to the system, where $\lambda$ is a random parameter chosen according to the probability distribution $p(\lambda)$. (ii) We perform a projective measurement on the system with projectors $\{P_\alpha:\alpha\in A\}$. (iii) We apply the unitary $V^\dagger(\lambda)$. Let $\mathcal{G}$ be the quantum channel that describes the overall effect of this operation on the system, in the case where we ignore the outcome of the projective measurement. Then, for any input state $\rho$, the fidelity of $\rho$ and $\mathcal{G}(\rho)$ satisfies

$$F(\rho,\mathcal{G}(\rho))\ge\max_{\alpha\in A}p_\alpha(\rho), \tag{E10}$$

where $p_\alpha(\rho)$ is the (average) probability of outcome $\alpha$ in the measurement.

Proof.

The quantum operation $\mathcal{G}$ is given by

$$\mathcal{G}(\cdot)=\sum_\lambda p(\lambda)\sum_{\alpha\in A}V^\dagger(\lambda)P_\alpha V(\lambda)(\cdot)V^\dagger(\lambda)P_\alpha V(\lambda). \tag{E11}$$

Consider a pure input state $|\gamma\rangle$. Then, the squared fidelity of $\mathcal{G}(|\gamma\rangle\langle\gamma|)$ and $|\gamma\rangle\langle\gamma|$ is given by

\begin{align}
F^2\big(\mathcal{G}(|\gamma\rangle\langle\gamma|),|\gamma\rangle\langle\gamma|\big) &= \langle\gamma|\mathcal{G}(|\gamma\rangle\langle\gamma|)|\gamma\rangle \tag{E12}\\
&= \sum_{\alpha\in A}\sum_\lambda p(\lambda)\,\langle\gamma|V^\dagger(\lambda)P_\alpha V(\lambda)|\gamma\rangle^2 \tag{E13}\\
&\ge \max_{\alpha\in A}\sum_\lambda p(\lambda)\,\langle\gamma|V^\dagger(\lambda)P_\alpha V(\lambda)|\gamma\rangle^2 \tag{E14}\\
&\ge \max_{\alpha\in A}\Big(\sum_\lambda p(\lambda)\,\langle\gamma|V^\dagger(\lambda)P_\alpha V(\lambda)|\gamma\rangle\Big)^2 \tag{E15}\\
&= \max_{\alpha\in A}p^2_\alpha(|\gamma\rangle\langle\gamma|), \tag{E16}
\end{align}

where to get the fourth line we have used the fact that the variance of any random variable is non-negative, and to get the last line we have used the fact that the average density operator of the system before the measurement is $\sum_\lambda p(\lambda)V(\lambda)|\gamma\rangle\langle\gamma|V^\dagger(\lambda)$, and so the probability of outcome $\alpha$ is $\sum_\lambda p(\lambda)\langle\gamma|V^\dagger(\lambda)P_\alpha V(\lambda)|\gamma\rangle$. This proves the lemma for the special case of pure states. The result for general mixed states follows from the joint concavity of fidelity.

This lemma, together with the fact that for input states contained in $\mathcal{H}_\Pi$ or the complement subspace the outcome of the ideal measurement is deterministic, and the circuit generates this outcome with probability of error less than $(1-\lambda_{\min})^T$, proves the bound $F(\mathcal{E}(\rho),\rho)\ge1-(1-\lambda_{\min})^T$.
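The first part of Theorem 4 can be illustrated numerically by iterating the map $\mathcal{M}$ of Eq.(E3c) and comparing ${\rm Tr}(\mathcal{M}^T(\rho))$ with $(1-\lambda_{\min})^T$. The sketch below is a minimal check under stated assumptions: a three-dimensional real Hilbert space, $K=2$ hypothetical sample states with relative angle $a=1$, and perfect reflections. For two pure states the nonzero eigenvalues of $\sigma_{\rm avg}$ are $(1\pm|\langle\phi_1|\phi_2\rangle|)/2$, which gives $\lambda_{\min}$ without a numerical eigensolver.

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def projector(v):
    """|v><v| for a real unit vector v."""
    return [[a * b for b in v] for a in v]

def complement(P):
    n = len(P)
    return [[(1.0 if i == j else 0.0) - P[i][j] for j in range(n)] for i in range(n)]

def channel_M(X, perp_projs):
    """M(X) = (1/K) sum_k P_k^perp X P_k^perp, as defined below Eq. (E3)."""
    n, K = len(X), len(perp_projs)
    out = [[0.0] * n for _ in range(n)]
    for P in perp_projs:
        Y = matmul(matmul(P, X), P)
        for i in range(n):
            for j in range(n):
                out[i][j] += Y[i][j] / K
    return out

def prob_all_minus(rho, perp_projs, T):
    """Tr(M^T(rho)), Eq. (E3c): probability of finding all T qubits in |->."""
    X = rho
    for _ in range(T):
        X = channel_M(X, perp_projs)
    return sum(X[i][i] for i in range(len(X)))

a = 1.0                                            # illustrative angle between the samples
samples = [[1.0, 0.0, 0.0], [math.cos(a), math.sin(a), 0.0]]
perp_projs = [complement(projector(v)) for v in samples]
# For K = 2 pure states, the nonzero eigenvalues of sigma_avg are (1 +/- |<phi_1|phi_2>|)/2.
lam_min = (1.0 - abs(math.cos(a))) / 2.0

rho_inside = projector([0.6, 0.8, 0.0])            # supported on H_Pi
rho_outside = projector([0.0, 0.0, 1.0])           # orthogonal to H_Pi
for T in (1, 5, 20):
    print(T, prob_all_minus(rho_inside, perp_projs, T), (1 - lam_min) ** T,
          prob_all_minus(rho_outside, perp_projs, T))
```

For the state inside $\mathcal{H}_\Pi$ the first printed probability should stay below the bound in the next column and decay with $T$, while for the orthogonal state the last column should remain equal to one, matching the two cases treated in the proof.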