Three-step implementation of any n × n unitary with a complete graph of n qubits
Physical Review A 92, 032306 (2015)

Amara Katabarwa* and Michael R. Geller†
Department of Physics and Astronomy, University of Georgia, Athens, Georgia 30602, USA
(Dated: September 27, 2018)

Quantum computation with a complete graph of superconducting qubits has been recently proposed, and applications to amplitude amplification, phase estimation, and the simulation of realistic atomic collisions given [Phys. Rev. A 91, 062309 (2015)]. That approach implements symmetric n × n unitaries in a single step, but not general unitaries. Here we show that any element in the unitary group U(n) can be executed in no more than three steps, for any n. This enables the implementation of highly complex operations in constant time, and in some cases even allows for the compilation of an entire algorithm down to only three operations. Using this protocol we show how to prepare any pure state of an SES chip in three steps, and also how to compute, for a given SES state ρ, the expectation value Tr(ρO) of any n × n Hermitian observable O in a constant number of steps.

PACS numbers: 03.67.Lx, 85.25.Cp
I. PRETHRESHOLD QUANTUM COMPUTATION
There is currently great interest in the development of special-purpose quantum computing devices and methodologies that do not require full error correction and which are practical now. For example, D-Wave Systems produces commercial quantum annealers based on superconducting circuits that solve an important class of binary optimization problems [1]. However, it is not known whether the D-Wave annealers can outperform conventional classical supercomputers [2, 3]. An optical approach [4] that solves an arguably less important problem (sampling from the distribution of bosons scattered by a unitary network) but which is likely capable of quantum speedup has also been investigated [5-7]. An approach called the single-excitation-subspace (SES) method, also based on superconducting circuits, has been proposed [8]. Here computations are performed in the n-dimensional SES of a complete graph of n qubits. We call these examples prethreshold, referring to the threshold theorem of fault-tolerant quantum computation, because they do not require exceeding fidelity and qubit-number thresholds before being applicable.

A quantum computer chip implementing the SES method consists of a fully connected array of superconducting qubits with tunable frequencies and tunable pairwise σ^x ⊗ σ^x couplings; an abstract representation is given in Fig. 1. It works by operating in a subspace of the full 2^n-dimensional Hilbert space where the Hamiltonian can be directly programmed. This programmability eliminates the need to decompose operations into elementary one- and two-qubit gates, enabling larger computations to be performed within the available coherence time. The price for this high degree of controllability is that the approach is not scalable.

* [email protected]
† [email protected]

FIG. 1. (Color online) Complete graph with n = 16. The vertices (open circles) are qubits and the edges (colored lines) are tunable couplers.
However, a technically unscalable quantum computer is still useful for prethreshold quantum computation and might even be able to achieve speedup relative to a classical supercomputer for certain tasks. The SES approach trades physical qubits and high connectivity for, in effect, longer coherence. This is a sensible trade for quantum computing architectures such as superconducting circuits, whose largest prethreshold problem sizes are limited by coherence time, not by the difficulty of introducing additional qubits. A realistic chip layout that provides space for the coupler circuits and avoids the crossovers of Fig. 1 is shown in Fig. 2.

FIG. 2. (Color online) Possible layout for the 16-qubit chip.
II. CONTENT OF THIS PAPER
A significant restriction of the SES method presented in Ref. [8] is that the Hamiltonian programmed into the hardware is real and symmetric, whereas the most general Hamiltonian is complex Hermitian. If a target operation has the form e^{-iA}, where A is a known real symmetric generator matrix, then the unitary can be implemented in one step. This is the case when the unitary is symmetric (U = U^⊤) and is reviewed in Sec. III A. In that section we also provide an improved procedure for constructing the time-optimal SES Hamiltonian H corresponding to a given generator A.

However, a general element of the unitary group U(n) has the form e^{-iM} with M complex Hermitian. This is the nonsymmetric unitary case (U ≠ U^⊤) discussed in Sec. III B. We show there that any nonsymmetric element U ∈ U(n) can be implemented in three steps, for any n.

Applications of these techniques are given in Sec. IV. In Sec. IV A we show how to simulate time-independent but otherwise arbitrary n × n complex Hamiltonians with an SES chip in three steps. In Sec. IV B we show how to prepare pure but otherwise arbitrary SES states in three steps. And in Sec. IV C we explain how to compute expectation values of arbitrary n × n Hermitian observables.
III. SES IMPLEMENTATION OF UNITARY OPERATORS

A. Single-step implementation of symmetric unitaries
The basic single-step operation in SES quantum computing is the implementation of symmetric unitaries of the form U = e^{-iA}, with A real and symmetric [8]. Therefore, a standard task in SES algorithm design and implementation is the construction of an optimal protocol, consisting of an SES Hamiltonian H and an evolution time t_qc, to implement that unitary. We assume here that the generator matrix A is known; if it is not, then the classical overhead for obtaining A from U must be included in the quantum runtime. (We also note that the generator A = i log U is not unique.) The optimal protocol for implementing a symmetric unitary depends on the functionality assumed of the chip, especially of the tunable coupler circuits. Here we assume that the experimentally controlled SES Hamiltonian can be written, apart from an additive constant, as

    H = g_max K,  with  -1 ≤ K_{ii′} ≤ 1,    (1)

which we call the standard form. In this case we are assuming that the couplings can be tuned continuously between -g_max and g_max, and that the qubit frequencies can be varied within a window of width 2g_max about some parking frequency. Because we are free to change the overall phase of an SES state, we write the symmetric unitary as

    U = e^{-i(A - cI)} e^{-ic},    (2)

where I is the n × n identity matrix, and then ignore the global phase e^{-ic}. The value of c is chosen to minimize the evolution time t_qc, which is proportional to the angle

    θ_A ≡ max_{ii′} |A_{ii′} - c δ_{ii′}|.    (3)

The K matrix in (1) is then given by

    K = (A - cI)/θ_A,    (4)

and the evolution time is

    t_qc = ℏ θ_A / g_max.    (5)

Note that θ_A is not bounded by 2π and can become arbitrarily large. The global phase angle that minimizes θ_A is

    c = (min_i A_{ii} + max_i A_{ii})/2,    (6)

which is proved below.
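In software, the construction (1)-(6) amounts to a few lines of linear algebra. The following Python/NumPy sketch (the function and variable names are ours, not the authors' code) computes K, θ_A, c, and t_qc from a known real symmetric generator A:

```python
import numpy as np

def standard_form(A, g_max, hbar=1.0):
    """Time-optimal protocol for the symmetric unitary U = e^{-iA}.

    Given a real symmetric generator A, return the standard-form matrix K,
    the angle theta_A, the optimal phase shift c, and the evolution time
    t_qc, following Eqs. (2)-(6).
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    d = np.diag(A)
    c = 0.5 * (d.min() + d.max())      # Eq. (6): midpoint of the diagonal range
    shifted = A - c * np.eye(n)
    theta_A = np.max(np.abs(shifted))  # Eq. (3)
    K = shifted / theta_A              # Eq. (4): all entries lie in [-1, 1]
    t_qc = hbar * theta_A / g_max      # Eq. (5)
    return K, theta_A, c, t_qc

# Example: a random 4 x 4 symmetric generator.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
K, theta_A, c, t_qc = standard_form((M + M.T) / 2, g_max=1.0)
```

Evolving with H = g_max K for the time t_qc then reproduces e^{-i(A - cI)}, i.e., U up to the ignorable global phase e^{-ic}.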
Although we have assumed that the SES Hamiltonian H = g_max K is abruptly switched on for a time t_qc before being abruptly switched off, which is the fastest protocol, any SES Hamiltonian of the form H = g(t) K such that ∫ (g/ℏ) dt = θ_A may be used instead.

To minimize (3) over c we consider two cases: In the first case max_{ii′} |A_{ii′}| occurs for an off-diagonal element of A, in which case the minimum value of θ_A is independent of c (because c only affects the diagonal elements of the shifted matrix A - cI). Therefore we only need to consider the second case, where max_{ii′} |A_{ii′}| occurs for a diagonal element. The diagonal elements consist of points

    x ∈ {A_{11}, A_{22}, ···, A_{nn}}    (7)

on the real number line, bounded between min_i A_{ii} and max_i A_{ii}. Placing c at the midpoint of the smallest region containing all the points in (7) minimizes the largest distance |A_{ii} - c|.

B. Three-step implementation of nonsymmetric unitaries: ABA decomposition
Our protocol relies on the matrix decomposition

    U = O_1 e^{-iD} O_2^⊤,    (8)

where D is a real diagonal matrix and the O_i ∈ O(n) are real orthogonal matrices. This identity follows from the KAK decomposition of the Lie group U(n) [9]. To obtain the O_i and D from U, we first compute

    χ ≡ U U^⊤ = O_1 e^{-2iD} O_1^⊤,    (9)

which is both symmetric and unitary. The real and imaginary parts of χ are also separately symmetric. Then the unitarity condition

    (Re χ - i Im χ)(Re χ + i Im χ) = I    (10)

shows that Re χ and Im χ commute and can be simultaneously diagonalized. O_1 is determined by a Schur decomposition of Re χ, which always produces a real O_1 (unlike the decomposition of χ itself). Then e^{-2iD} and O_2 are obtained from O_1^⊤ χ O_1 and U^⊤ O_1 e^{iD}, respectively.

The three-step implementation for a nonsymmetric U ∈ U(n) follows from the identity

    U = e^{-iA} e^{-iB} e^{iA},    (11)

which we call the ABA decomposition. Here A and B are real symmetric n × n matrices. To derive (11) we express the target unitary in the spectral form U = V e^{-iΛ} V†, where V is complex unitary and Λ is real and diagonal. Decomposing V using (8) we have

    U = O_1 e^{-iD} O_2^⊤ e^{-iΛ} O_2 e^{iD} O_1^⊤
      = e^{-i O_1 D O_1^⊤} (O_1 O_2^⊤) e^{-iΛ} (O_1 O_2^⊤)^⊤ e^{i O_1 D O_1^⊤},    (12)

which leads to (11) with generators

    A = O_1 D O_1^⊤,    (13)
    B = O_1 O_2^⊤ Λ O_2 O_1^⊤,    (14)

which are both real and symmetric. The classical runtime to obtain A and B follows an empirical power law,

    t_cl ∝ n^p,    (15)

measured on a laptop computer [10]; for a 100 × 100 matrix it is about 60 ms. The quantum runtime to implement a nonsymmetric unitary is

    t_qc = ℏ (2θ_A + θ_B) / g_max,    (16)

with θ defined in (3). The generator matrices A and B in (11) are not unique.

The ABA decomposition allows for the possibility of implementing highly complex operations in three steps. But this does not imply that an entire algorithm, compiled into a single unitary, can be implemented in constant time, because the compiled unitary might not be known a priori, and there is classical overhead (15) for computing A and B.
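The classical construction of A and B from a given U, Eqs. (8)-(14), can be carried out with standard linear-algebra routines. The following is a minimal Python/NumPy sketch, not the authors' MATLAB code; the function name `aba` and the use of `scipy.linalg.schur` are our choices, and the generic case in which Re χ has nondegenerate eigenvalues is assumed.

```python
import numpy as np
from scipy.linalg import schur, expm

def aba(U):
    """Return real symmetric A, B with U = e^{-iA} e^{-iB} e^{iA}.

    Follows Eqs. (8)-(14): spectral form U = V e^{-i Lam} V†, then the
    KAK factors of V from chi = V V^T = O1 e^{-2iD} O1^T.  Assumes the
    generic case in which Re(chi) has nondegenerate eigenvalues.
    """
    # Spectral form via a complex Schur decomposition (U is normal,
    # so T is diagonal up to roundoff and V is unitary).
    T, V = schur(U, output='complex')
    Lam = -np.angle(np.diag(T))                      # U = V e^{-i Lam} V†
    chi = V @ V.T                                    # symmetric unitary, Eq. (9)
    _, O1 = np.linalg.eigh(chi.real)                 # real orthogonal O1
    D = -np.angle(np.diag(O1.T @ chi @ O1)) / 2      # e^{-2iD} = O1^T chi O1
    O2 = (V.T @ O1 @ np.diag(np.exp(1j * D))).real   # O2 = V^T O1 e^{iD}
    A = O1 @ np.diag(D) @ O1.T                       # Eq. (13)
    Q = O1 @ O2.T
    B = Q @ np.diag(Lam) @ Q.T                       # Eq. (14)
    return A, B

# Example: decompose a random 5 x 5 unitary and verify Eq. (11).
rng = np.random.default_rng(7)
Z = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
U, _ = np.linalg.qr(Z)
A, B = aba(U)
assert np.allclose(expm(-1j * A) @ expm(-1j * B) @ expm(1j * A), U)
```

The same construction covers the Hamiltonian-simulation identity (17) below: for U = e^{-iHt/ℏ} the diagonal Λ it produces contains t/ℏ times the spectrum of H, up to multiples of 2π.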
More importantly, evaluating A and B for an entire algorithm would presumably be prohibitive when one is attempting to outperform classical computers. Furthermore, algorithms might include measurement steps that cannot be postponed to the end.

IV. APPLICATIONS

A. Hamiltonian simulation
A useful application of (11) is to U = e^{-iHt/ℏ}, where H is a given complex Hamiltonian. In this case we have

    e^{-iHt/ℏ} = e^{-iA} e^{-iB} e^{iA},    (17)

with A and B given by (13) and (14), where Λ is a diagonal matrix containing t/ℏ times the spectrum of H. This enables the fast simulation of any time-independent Hamiltonian with an SES chip [11].

B. SES pure state preparation in three steps
In some cases it is possible to compile an entire algorithm down to only three steps. As an example we give an algorithm for preparing any (normalized) pure SES state of the form

    |ψ⟩ = Σ_{i=1}^{n} a_i |i),  a_i = |a_i| e^{iθ_i},  0 ≤ θ_i < 2π.    (18)

Here |i) ≡ |0 ··· 1_i ··· 0⟩ is the ith SES basis state of the n-qubit processor, with qubit i excited. We proceed by giving a protocol with linear depth that is practical for small n, which is then subsequently compiled down to three steps.

We start with the basis state |1), prepared from the ground state by a microwave pulse, and then apply the standard-form SES Hamiltonian H = g_max K_star for a time t_qc = πℏ/(√n g_max), with

    (K_star)_{ii′} = 1 if i = 1 or i′ = 1 (with i ≠ i′), and 0 otherwise,    (19)

the adjacency matrix for a star graph with qubit 1 at the center (see Sec. III A of Ref. [8]). This produces the uniform state

    |unif⟩ ≡ (|1) + |2) + ··· + |n)) / √n,    (20)

apart from a phase.

FIG. 3. (Color online) Occupation probabilities |a_i|² for a uniform weight state. Phases of the probability amplitudes a_i are not represented in this figure.

If the occupation probabilities in the target state (18) are uniform,

    |a_i|² = 1/n,    (21)

we call it a uniform weight state and represent it by the bar graph in Fig. 3. In this case we would apply the diagonal Hamiltonian H = g_max K, where

    K = -diag(θ_1/2π, θ_2/2π, ···, θ_n/2π),    (22)

to the uniform state |unif⟩ for a time t_qc = 2πℏ/g_max, which gives the desired target.

Typically the target is not a uniform weight state, as represented in Fig. 4. In this case we use the solution

    |unif⟩ = W_diag (U_swap U_diag)_M ··· (U_swap U_diag)_1 |ψ⟩    (23)

to the inverse problem of constructing the uniform state |unif⟩ from the target [12, 13]. Each of the M steps in (23) consists of a pair of operations U_diag and U_swap that move weight between a pair of components. After M = O(n) steps a uniform weight state is created. The final operation W_diag shifts the phases of the uniform weight state to that of (20).

FIG. 4. (Color online) Occupation probabilities for a typical target state. In this example the components of maximum and minimum weight have indices i_max = 2 and i_min = 3.

The first step is:

1. Find the components i_min and i_max with the smallest and largest weights, respectively (if not unique, any solution is sufficient). These satisfy

    |a_{i_min}|² ≤ 1/n ≤ |a_{i_max}|²,    (24)

excluding the case where both ≤ signs are identities (which would violate the assumption that the target is nonuniform). Therefore |a_{i_min}| < |a_{i_max}|.
2. Perform a phase shift U_diag = e^{-iH t_qc/ℏ} that brings the probability amplitudes a_{i_min} and a_{i_max} to the form a_{i_min} = |a_{i_min}| and a_{i_max} = i|a_{i_max}|, with |a_{i_min}| < |a_{i_max}|. Apply SES Hamiltonian (1), where K is a diagonal matrix with K_{i_min,i_min} = θ_{i_min}/3π and K_{i_max,i_max} = (θ_{i_max}/3π) - 1/6, the other elements zero, and t_qc = 3πℏ/g_max. This phase shift is necessary to prepare the state for the next operation.

3. Implement a partial iSWAP U_swap = e^{-iH t_qc/ℏ} from component i_max to i_min to bring the weight of i_min to the uniform value,

    |a_{i_min}|² → 1/n,    (25)

leaving component i_max with weight

    |a_{i_max}|² → |a_{i_min}|² + |a_{i_max}|² - 1/n.    (26)

Apply SES Hamiltonian (1) with K_{i_min,i_max} = K_{i_max,i_min} = 1 and all other elements zero, and t_qc = φℏ/g_max, with φ given by

    |a_{i_min}| cos φ + |a_{i_max}| sin φ = 1/√n.    (27)

There is always a solution with 0 < φ < π/2.

If (U_swap U_diag)_1 |ψ⟩ is a uniform weight state, it can be written in the form

    (e^{iα_1}|1) + e^{iα_2}|2) + ··· + e^{iα_n}|n)) / √n,    (28)

and we apply the final operation W_diag = e^{-iH t_qc/ℏ} to produce (20). Here we use SES Hamiltonian (1) with

    K = diag(α_1/2π, α_2/2π, ···, α_n/2π)    (29)

and t_qc = 2πℏ/g_max. If (U_swap U_diag)_1 |ψ⟩ is not a uniform weight state, we again find the minimum and maximum weight components i_min and i_max, and follow the above protocol to generate (U_swap U_diag)_2 (U_swap U_diag)_1 |ψ⟩. The procedure is repeated until

    (U_swap U_diag)_M ··· (U_swap U_diag)_2 (U_swap U_diag)_1 |ψ⟩    (30)

is a uniform weight state, after which W_diag is applied. The number of iterations required satisfies

    M ≤ n - 1.    (31)

This completes the solution to the inverse problem (23). We now use (23) to obtain

    |ψ⟩ = (U†_diag U†_swap)_1 ··· (U†_diag U†_swap)_M W†_diag |unif⟩,    (32)

which solves the general state-preparation problem in O(n) steps. Hermitian conjugations are implemented by changing the signs of the K matrices given above. The protocol given in (32) is, by itself, practical for small n.

The complete state preparation operation can be summarized as

    |ψ⟩ = U|1),    (33)

where

    U ≡ (U†_diag U†_swap)_1 ··· (U†_diag U†_swap)_M W†_diag e^{-i(π/√n) K_star}    (34)

is the compiled unitary of the state-preparation algorithm. The three-step state preparation protocol uses the ABA decomposition to implement (34). The total state preparation time, not including the |1) state initialization time, is given in (16).

For example, suppose we wish to prepare a randomly chosen target

    |ψ⟩ = a_1|1) + a_2|2) + a_3|3) + a_4|4) + a_5|5),    (35)

in the n = 5 graph, where for convenience the first component a_1 has been chosen to be real. Following the state-preparation protocol leads to a compiled 5 × 5 unitary U (36), up to a phase factor, whose first column is the target state. The ABA decomposition (11) then leads to real symmetric generators A (37) and B (38). The associated K matrices K_A (39) and K_B (40), and the angles θ_A and θ_B, are determined from the procedure given in Sec. III A. The total state preparation time, not counting the |1) initialization, then follows from (16); here we take g_max/2π = 50 MHz.

Although state preparation is implemented in three steps for any n, the runtime does have a weak n-dependence, because θ_A and θ_B do. Averaged over random targets, we find that 2θ_A + θ_B grows only as a weak power of n. (41) For small n, either the linear-depth protocol (32) or the three-step protocol based on (34) can be used. However, for large n, only the three-step protocol is practical.

C. Computation of expectation values
Finally, we show how to compute the expectation value

    ⟨O⟩ ≡ Tr(ρO)    (42)

of any n × n Hermitian observable O, by implementing the protocol of Reck et al. [14]. Here ρ is any pure or mixed SES state provided as an input to the procedure.

Standard readout of an SES processor consists of the simultaneous measurement of each qubit in the diagonal basis. The SES condition means that a single qubit will be found in the state |1⟩, with the remaining n - 1 qubits in |0⟩. Let i be the qubit observed in its excited state. The probability of observing the excitation in qubit i is p_i = (i|ρ|i). Therefore, if we have access to multiple copies of ρ, we can repeat the readout N times to obtain estimates of the occupation probabilities p_i with sampling errors no larger than (2√N)^{-1}.

To compute ⟨O⟩, perform a (classical) spectral decomposition to a unitary V containing the eigenvectors of O as columns, and a real diagonal matrix D: O = V D V†. Then we have

    ⟨O⟩ = Tr(ρ V D V†) = Tr(ρ′ D),    (43)

where

    ρ′ ≡ V† ρ V.    (44)

Therefore we can compute ⟨O⟩ by applying the unitary operator V† using the ABA decomposition, measuring the resulting occupation probabilities, which we denote by p_i^[V†] to indicate the application of V†, and then classically evaluating the quantity

    ⟨O⟩ = Σ_{i=1}^{n} D_{ii} p_i^[V†].    (45)

V. CONCLUSIONS
In this work we have extended the SES method of Ref. [8] to include a three-step implementation of arbitrary n × n unitaries. The fast state preparation protocol of Sec. IV B should be especially useful for practical quantum computing applications.

ACKNOWLEDGMENTS
This work was supported by the US National Science Foundation under CDI grant DMR-1029764. It is a pleasure to thank Emmanuel Donate and Timothy Steele for their contributions during the early stages of this work.

[1] M. W. Johnson, M. H. S. Amin, S. Gildert, T. Lanting, F. Hamze, N. Dickson, R. Harris, A. J. Berkley, J. Johansson, P. Bunyk, E. M. Chapple, C. Enderud, J. P. Hilton, K. Karimi, E. Ladizinsky, N. Ladizinsky, T. Oh, I. Perminov, C. Rich, M. C. Thom, E. Tolkacheva, C. J. S. Truncik, S. Uchaikin, J. Wang, B. Wilson, and G. Rose, Nature (London) 473, 194 (2011).
[2] S. Boixo, T. F. Ronnow, S. V. Isakov, Z. Wang, D. Wecker, D. A. Lidar, J. M. Martinis, and M. Troyer, Nature Phys. 10, 218 (2014).
[3] T. F. Ronnow, Z. Wang, J. Job, S. Boixo, S. V. Isakov, D. Wecker, J. M. Martinis, D. A. Lidar, and M. Troyer, Science 345, 420 (2014).
[4] S. Aaronson and A. Arkhipov, Theory of Computing 9, 143 (2013).
[5] M. A. Broome, A. Fedrizzi, S. Rahimi-Keshari, J. Dove, S. Aaronson, T. C. Ralph, and A. G. White, Science 339, 794 (2013).
[6] J. B. Spring, B. J. Metcalf, P. C. Humphreys, W. S. Kolthammer, X.-M. Jin, M. Barbieri, A. Datta, N. Thomas-Peter, N. K. Langford, D. Kundys, J. C. Gates, B. J. Smith, P. G. R. Smith, and I. A. Walmsley, Science 339, 798 (2013).
[7] M. Tillmann, B. Dakic, R. Heilmann, S. Nolte, A. Szameit, and P. Walther, Nature Photonics 7, 540 (2013).
[8] M. R. Geller, J. M. Martinis, A. T. Sornborger, P. C. Stancil, E. J. Pritchett, H. You, and A. Galiautdinov, Phys. Rev. A 91, 062309 (2015).
[9] A. W. Knapp, Lie Groups Beyond an Introduction (Birkhäuser, 1996).
[10] Computations were performed running 64-bit MATLAB R2015a on an Apple MacBook Pro. We timed the construction of A and B from U for n between 50 and 500. The classical runtime for a 100 × 100 matrix is about 60 ms.
[11] In principle it is possible to simulate time-dependent complex Hamiltonians as well. However, this would require decomposing the evolution into a sequence of short time steps such that H(t) is approximately constant within each time step, and applying the ABA decomposition at each time step. Given the classical overhead required at each step, this does not seem useful for outperforming classical computers.
[12] C. K. Law and J. H. Eberly, Phys. Rev. Lett. 76, 1055 (1996).
[13] M. Hofheinz, H. Wang, M. Ansmann, R. C. Bialczak, E. Lucero, M. Neeley, A. D. O'Connell, D. Sank, J. Wenner, J. M. Martinis, and A. N. Cleland, Nature (London) 459, 546 (2009).
[14] M. Reck, A. Zeilinger, H. J. Bernstein, and P. Bertani, Phys. Rev. Lett. 73, 58 (1994).