Remarks on Quantum Modular Exponentiation and Some Experimental Demonstrations of Shor's Algorithm

Abstract

An efficient quantum modular exponentiation method is indispensable for Shor's factoring algorithm. But we find that all descriptions presented by Shor, Nielsen and Chuang, Markov and Saeedi, et al., are flawed. We also remark that some experimental demonstrations of Shor's algorithm are misleading, because they violate the necessary condition that the selected number $q = 2^s$, where $s$ is the number of qubits used in the first register, must satisfy $n^2 \le q < 2n^2$, where $n$ is the large number to be factored.



Zhengjun Cao, Zhenfu Cao, Lihua Liu

Keywords. Shor's factoring algorithm; quantum modular exponentiation; superposition; continued fraction expansion.

Affiliations: Department of Mathematics, Shanghai University, Shanghai, China; Software Engineering Institute, East China Normal University, Shanghai, China; Department of Computer Science and Engineering, Shanghai Jiao Tong University, China; Department of Mathematics, Shanghai Maritime University, Shanghai, China. arXiv preprint [cs.DS].

It is well known that factoring an integer $n$ can be reduced to finding the order of an integer $x$ with respect to the modulus $n$ (G. Miller [1]). The order is usually denoted by $\mathrm{ord}_n(x)$. So far, there is no polynomial-time algorithm running on classical computers that can compute $\mathrm{ord}_n(x)$. In 1994, P. Shor [2] proposed the first quantum algorithm which can compute $\mathrm{ord}_n(x)$ in polynomial time. The factoring algorithm requires two quantum registers. At the beginning of the algorithm, one has to find $q = 2^s$ for some integer $s$ such that $n^2 \le q < 2n^2$, where $n$ is to be factored. The subsequent steps are:

Initialization. Put register-1 in the following uniform superposition
$$\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|0\rangle.$$

Computation. Keep $a$ in register-1 and compute $x^a$ in register-2 for some randomly chosen integer $x$. We then have the following state
$$\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|x^a\rangle.$$

Fourier transformation. Performing the Fourier transform on register-1, we obtain the state
$$\frac{1}{q} \sum_{a=0}^{q-1} \sum_{c=0}^{q-1} \exp(2\pi i a c/q)\, |c\rangle|x^a\rangle.$$

Observation. It suffices to observe the first register. The probability $p$ that the machine reaches the state $|c, x^k\rangle$ is
$$\left| \frac{1}{q} \sum_{a:\, x^a \equiv x^k} \exp(2\pi i a c/q) \right|^2,$$
where $0 \le k < r = \mathrm{ord}_n(x)$ and the sum is over all $a$ ($0 \le a < q$) such that $x^a \equiv x^k$.

Continued fraction expansion. If there is a $d$ such that $-\frac{r}{2} \le dq - rc \le \frac{r}{2}$, then the probability of seeing $|c, x^k\rangle$ is greater than $1/(3r^2)$.
Hence, we have
$$\left| \frac{c}{q} - \frac{d}{r} \right| \le \frac{1}{2q} \le \frac{1}{2n^2} < \frac{1}{2r^2}.$$
Since $q \ge n^2$, we can round $c/q$ to obtain $d/r$. Thus $r$ can be obtained.

P. Shor has specified the operations for the process
$$|0\rangle|0\rangle \to \frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|0\rangle,$$
but has not specified the operations for the process
$$\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|0\rangle \to \frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|x^a \ (\mathrm{mod}\ n)\rangle.$$
His original description specifies only the process $(a, 1) \to (a, x^a \bmod n)$. Nielsen and Chuang in their book, Ref. [3], specify that
$$|a\rangle|y\rangle \to |a\rangle\, U^{a_{t-1} 2^{t-1}} \cdots U^{a_0 2^0} |y\rangle = |a\rangle|x^{a_{t-1} 2^{t-1}} \times \cdots \times x^{a_0 2^0}\, y \ (\mathrm{mod}\ n)\rangle = |a\rangle|x^a y \ (\mathrm{mod}\ n)\rangle,$$
where $a$'s binary representation is $a_{t-1} a_{t-2} \cdots a_0$, $U$ is the unitary operation such that $U|y\rangle \equiv |xy \ (\mathrm{mod}\ n)\rangle$, $y \in \{0, 1\}^{\ell}$, and $\ell$ is the bit length of $n$.

We find that the Nielsen-Chuang quantum modular exponentiation method requires $a$ unitary operations. Apparently, it is inappropriate for the process
$$\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|0\rangle \to \frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|x^a \ (\mathrm{mod}\ n)\rangle,$$
where $n^2 \le q < 2n^2$ and $n$ is the large number to be factored, because the total number of unitary operations required for this process is $O(q^2)$, not $O(\log n)$. So far, little of the literature has investigated the above mysterious process.
Since $O(q^2)$ unitary operations cannot be implemented in polynomial time, we do not think that Shor's factoring algorithm is completely understandable and universally acceptable.

Since 2001, some teams have reported that they had successfully factored 15 into 3 × 5. We remark that these demonstrations are misleading, because they violate the necessary condition that the selected number $q$ must satisfy $n^2 \le q < 2n^2$.

2 Preliminaries

A quantum analogue of a classical computer operates with quantum bits involving quantum states. The state of a quantum computer is described as a basis vector in a Hilbert space. A qubit is a quantum state $|\Psi\rangle$ of the form
$$|\Psi\rangle = a|0\rangle + b|1\rangle,$$
where the amplitudes $a, b \in \mathbb{C}$ satisfy $|a|^2 + |b|^2 = 1$, and $|0\rangle$ and $|1\rangle$ are basis vectors of the Hilbert space. Here, the ket notation $|x\rangle$ means that $x$ is a quantum state. The state of a quantum system having $n$ qubits is a point in a $2^n$-dimensional vector space. Given a state
$$\sum_{i=0}^{2^n - 1} a_i |\chi_i\rangle,$$
where the amplitudes are complex numbers such that $\sum_{i=0}^{2^n-1} |a_i|^2 = 1$ and each $|\chi_i\rangle$ is a basis vector of the Hilbert space, if the machine is measured with respect to this basis, the probability of seeing basis state $|\chi_i\rangle$ is $|a_i|^2$.

Two quantum mechanical systems are combined using the tensor product. For example, a system of two qubits $|\Psi\rangle = a_0|0\rangle + a_1|1\rangle$ and $|\Phi\rangle = b_0|0\rangle + b_1|1\rangle$ can be written as
$$|\Psi\rangle|\Phi\rangle = \begin{pmatrix} a_0 \\ a_1 \end{pmatrix} \otimes \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} = \begin{pmatrix} a_0 b_0 \\ a_0 b_1 \\ a_1 b_0 \\ a_1 b_1 \end{pmatrix}.$$
We shall also use the shorthand notation $|\Psi, \Phi\rangle$. We call a quantum state having two or more components an entangled state if it is not a product state. According to the Copenhagen interpretation of quantum mechanics, measurement causes an instantaneous collapse of the wave function describing the quantum system into an eigenstate of the observable that was measured. If entangled, one object cannot be fully described without considering the other(s).

Operations on a qubit are described by $2 \times 2$ unitary matrices, for example
$$X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad Y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix}, \quad Z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix},$$
where $H$ denotes the Hadamard gate. Clearly, $H|0\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$.
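The tensor product and the Hadamard gate defined above can be checked numerically. The following is a minimal sketch in plain Python, assuming states are stored as lists of amplitudes; the helper names `kron_vec` and `apply` are illustrative and do not come from the paper.

```python
import math

def kron_vec(u, v):
    """Tensor (Kronecker) product of two state vectors given as lists."""
    return [ui * vj for ui in u for vj in v]

def apply(matrix, vec):
    """Apply a square matrix (list of rows) to a state vector."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]

# Hadamard gate H = (1/sqrt(2)) [[1, 1], [1, -1]]
h = 1 / math.sqrt(2)
H = [[h, h], [h, -h]]

ket0 = [1.0, 0.0]            # |0>
plus = apply(H, ket0)        # H|0> = (|0> + |1>)/sqrt(2)
pair = kron_vec(plus, plus)  # amplitudes of |00>, |01>, |10>, |11>

# Measurement probabilities are the squared magnitudes of the amplitudes.
probs = [abs(a) ** 2 for a in pair]
print(probs)
```

Each of the four basis states is seen with the same probability, which is the two-qubit case of the uniform superposition used in the first register.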
Operations on two qubits are described by $4 \times 4$ unitary matrices. An important example is the controlled-NOT (CNOT) gate, which maps $|c\rangle|t\rangle \to |c\rangle|c \oplus t\rangle$, where $\oplus$ denotes addition modulo 2. The matrix representation of CNOT is
$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$
Likewise, operations on $\ell$ qubits are described by $2^{\ell} \times 2^{\ell}$ unitary matrices.

There is another method to describe linear operators performed on multiple qubits. Suppose that $V$ and $W$ are vector spaces of dimension $2^{\mu}$ and $2^{\nu}$ (they describe quantum systems corresponding to $\mu$ and $\nu$ qubits, respectively). Suppose $|v\rangle$ and $|w\rangle$ are vectors in $V$ and $W$, and $A$ and $B$ are linear operators on $V$ and $W$, respectively. Then we can define a linear operator $A \otimes B$ on $V \otimes W$ by the equation
$$(A \otimes B)(|v\rangle \otimes |w\rangle) \equiv A|v\rangle \otimes B|w\rangle.$$

P. Shor has specified the operations for the process
$$|0\rangle|0\rangle \to \frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|0\rangle,$$
where $q = 2^s$ for some positive integer $s$ such that $n^2 \le q < 2n^2$, and $n$ is to be factored. Notice that the first register consists of $s$ qubits. He wrote: "this step is relatively easy, since all it entails is putting each qubit in the first register into the superposition $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$." (This can be done using the Hadamard gate $s$ times.)

Shor has not specified the operations for the process
$$\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|0\rangle \to \frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|x^a \ (\mathrm{mod}\ n)\rangle.$$
Moreover, he has not specified how many qubits are required in the second register. His original description specifies only the process $(a, 1) \to (a, x^a \bmod n)$. For convenience, we now quote it as follows.

    The technique for computing $x^a \ (\mathrm{mod}\ n)$ is essentially the same as the classical method. First, by repeated squaring we compute $x^{2^i} \ (\mathrm{mod}\ n)$ for all $i < l$. Then, to obtain $x^a \ (\mathrm{mod}\ n)$ we multiply the powers $x^{2^i} \ (\mathrm{mod}\ n)$ where $2^i$ appears in the binary expansion of $a$. In our algorithm for factoring $n$, we only need to compute $x^a \ (\mathrm{mod}\ n)$ where $a$ is in a superposition of states, but $x$ is some fixed integer. This makes things much easier, because we can use a reversible gate array where $a$ is treated as input, but where $x$ and $n$ are built into the structure of the gate array. Thus, we can use the algorithm described by the following pseudocode; here, $a_i$ represents the $i$th bit of $a$ in binary, where the bits are indexed from right to left and the rightmost bit of $a$ is $a_0$.

        power := 1
        for i = 0 to l − 1
            if (a_i == 1) then
                power := power ∗ x^(2^i) (mod n)
            endif
        endfor

    The variable $a$ is left unchanged by the code and $x^a \ (\mathrm{mod}\ n)$ is output as the variable power. Thus, this code takes the pair of values $(a, 1)$ to $(a, x^a \ (\mathrm{mod}\ n))$.

Remarks on Shor's description:

• The description indicates only the conventional process
$$(a, 1) \to (a, x^a \bmod n),$$
rather than the quantum process
$$|a\rangle|0\rangle \to |a\rangle|x^a \bmod n\rangle,$$
let alone the more complicated quantum process
$$\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|0\rangle \to \frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|x^a \ (\mathrm{mod}\ n)\rangle.$$

• Since $a_i$, which represents the $i$th bit of $a$ in binary, is required to compute $x^a \ (\mathrm{mod}\ n)$, one has to measure the superposition $\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|0\rangle$ to obtain $a$. But it is impossible in practice to compose the pure states $|a\rangle|x^a \ (\mathrm{mod}\ n)\rangle$, $a = 0, 1, \cdots, q-1$, into the superposition $\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|x^a \ (\mathrm{mod}\ n)\rangle$, because $q \ge n^2$ and $n$ is the large number to be factored.

• Although it specifies the Hadamard gate on each qubit in the first register, it does not specify how many and what quantum gates or unitary operations are used on each qubit or a group of qubits in the second quantum register.

3.2 The Nielsen-Chuang description

Nielsen and Chuang in their book, Ref. [3], specify that
$$|a\rangle|y\rangle \to |a\rangle\, U^{a_{t-1} 2^{t-1}} \cdots U^{a_0 2^0} |y\rangle = |a\rangle|x^{a_{t-1} 2^{t-1}} \times \cdots \times x^{a_0 2^0}\, y \ (\mathrm{mod}\ n)\rangle = |a\rangle|x^a y \ (\mathrm{mod}\ n)\rangle,$$
where $a$'s binary representation is $a_{t-1} a_{t-2} \cdots a_0$, $U$ is the unitary operation such that $U|y\rangle \equiv |xy \ (\mathrm{mod}\ n)\rangle$, $y \in \{0, 1\}^{\ell}$, and $\ell$ is the bit length of $n$. They wrote:

    Using the techniques of Section 3.2.5, it is now straightforward to construct a reversible circuit with a $t$ bit register and an $\ell$ bit register which, when started in the state $(a, y)$ outputs $(a, x^a y \ (\mathrm{mod}\ n))$, using $O(\ell^3)$ gates, which can be translated into a quantum circuit using $O(\ell^3)$ gates computing the transformation $|a\rangle|y\rangle \to |a\rangle|x^a y \ (\mathrm{mod}\ n)\rangle$.

Although they indicate that the classical circuit for the conventional process
$$(a, y) \xrightarrow{\ O(\ell^3)\ \text{classical gates}\ } (a, x^a y \ (\mathrm{mod}\ n))$$
can be translated into a quantum circuit for the quantum process
$$|a\rangle|y\rangle \xrightarrow{\ O(\ell^3)\ \text{quantum gates}\ } |a\rangle|x^a y \ (\mathrm{mod}\ n)\rangle,$$
we now want to remark that the quantum circuit has to invoke $U$, the unitary operation, $a$ times. Thus, the wanted process
$$\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|0\rangle \to \frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|x^a \ (\mathrm{mod}\ n)\rangle$$
has to invoke the unitary operation $1 + 2 + \cdots + (q-1) \approx O(q^2)$ times, if all terms $|a\rangle|0\rangle$, $a = 0, \cdots, q-1$, are processed one by one. Even worse, the transformation for the process
$$|q-1\rangle|y\rangle \to |q-1\rangle|x^{q-1} y \ (\mathrm{mod}\ n)\rangle$$
has to invoke the unitary operation $q-1$ times; it cannot be accomplished in polynomial time because $q$ is a large number.

Recently, Markov and Saeedi [4, 5] have proposed a quantum circuit for modular exponentiation. We refer to the following Figure 1 for the outline of their circuit.

[Figure 1: An outline of the quantum part of Shor's algorithm.]

Shor's algorithm seeks to factor a given value $M > 0$, which we assume to be semiprime $M = pq$ with unknown factors. The strategy is to consider the functions $f_b(x) = b^x \,\%\, M$, potentially with several different $1 < b < M$ values, and determine their periods in case $\gcd(b, M) = 1$. When the period $\pi$ is determined to be even ($b^{\pi} \,\%\, M = 1$), we have $(b^{\pi/2} - 1)(b^{\pi/2} + 1) \,\%\, M = 0$, thus either $(b^{\pi/2} - 1)$ or $(b^{\pi/2} + 1)$ must share at least one prime factor with $M$. If $b^{\pi/2} \,\%\, M \ne -1$, such a factor can be found using $\gcd(b^{\pi/2} \pm 1, M)$, otherwise it leads to the trivial factors 1 and $M$. When the period is determined to be odd, another $b$ value is tried.

The period-finding procedure relies on a quantum circuit (Figure 1), instantiated for a given value $1 < b < M$ coprime with $M$. The circuit operates on two 0-initialized quantum registers [15] with

• a block of parallel Hadamard gates on Register 1,

• a circuit for modular exponentiation (mod-exp) that evaluates $f(y) = b^y \,\%\, M$ by mapping $|y\rangle|0\rangle \mapsto |y\rangle|f(y)\rangle$, where $y$ is read from Register 1 and $f(y)$ is written to Register 2; Register 1 can be temporarily modified, but must be restored at the end,

• a circuit for the Quantum Fourier Transform (QFT) on Register 1,

• a block of parallel measurements on Register 1.

The first and last blocks cannot be optimized any further. QFT circuits are understood fairly well and are much smaller than circuits for modular exponentiation [15]. Therefore, our focus is on mod-exp circuits. They typically consist of reversible gates, namely NOT ($N$), CNOT ($C$) and Toffoli ($T$), which can be modeled and optimized entirely in terms of Boolean logic [17]. However, in physical implementations, Toffoli gates must be decomposed into smaller gates directly implementable in a given technology [18].

Reversible circuits for modular exponentiation start with an inverter on Register 2 that changes the $|0 \cdots 00\rangle$ value to $|0 \cdots 01\rangle$, and otherwise exhibit the following structure: each ($i$-th) bit of Register 1 enables (controls) a circuit block that multiplies Register 2 by $C_i = b^{2^i} \,\%\, M$ and reduces the result $\%\, M$. When $b$ and $M$ are known, $C_i$ can be pre-computed without quantum computation. Therefore, we refer to $C_i x \,\%\, M$-blocks below. They are typically implemented using shift and addition circuits, and a number of relevant quantum adders are known [9, 19].
The selection of appropriate adder types is discussed in [20, 10].

Each controlled modular multiplication is traditionally implemented separately. When dealing with reversible logic and quantum circuits, we note that the coprimality of $C$ and $M$ makes $x \mapsto Cx \,\%\, M$ a reversible transformation. The number of coprime $C$ values is $\varphi(M) = (p-1)(q-1)$, where $\varphi(M)$ is Euler's totient function and gives the size of $(\mathbb{Z}/M\mathbb{Z})^{\times}$, the multiplicative group of integers mod $M$. For $M = 15$, modular multiplication circuits for the eight coprime $C$ values are illustrated in Figure 2. Figure 3 shows circuits for $f(x) = b^x \,\%\, 15$, $\gcd(b, 15) = 1$.

When not knowing $p$ and $q$, one should also not assume any knowledge that would make it easy to find them. For example, one should not choose $C$ that satisfies $C^{\pi} = 1 \,\%\, M$ with a known (small) $\pi$ because such solutions would allow one to factorize $M$ via $\gcd(C^{\pi/2} \pm 1, M)$. Also recall that $(\mathbb{Z}/M\mathbb{Z})^{\times}$ is a product of two cyclic groups $\mathbb{Z}/p\mathbb{Z}$ and $\mathbb{Z}/q\mathbb{Z}$, and thus $(\mathbb{Z}/M\mathbb{Z})^{\times}$ admits a generating set with only two elements. However, knowing such generators is tantamount to knowing $p$ and $q$. When working with specific small $M = pq$, it is sometimes difficult to avoid using the knowledge of $p$ and $q$, but results obtained this way do not necessarily scale to large values. The same can be said about results produced through exhaustive search.

Here and in the remaining text, the percent sign % denotes the modulo (remainder) operation, as it does in the C and C++ languages.

The Markov-Saeedi quantum circuit for modular exponentiation is flawed, too. The unitary matrix corresponding to $(b^{2^i}) \,\%\, M$ for some integer $i$, which is performed on all qubits in the second quantum register, has a tremendous dimension (not less than the modulus $M$). To implement the operator practically, one must decompose it into the tensor product of some linear operators with low dimension.
Regrettably, they did not specify these low-dimension linear operators at all. Moreover, they did not specify the output of the operator $(b^{2^0}) \,\%\, M$. We now want to ask:

(1) what are the input states of the unitary operator $(b^{2^{n-1}}) \,\%\, M$?

(2) how can the operator $(b^{2^{n-1}}) \,\%\, M$ be decomposed into the tensor product of some low-dimension linear operators?

(3) how many executable unitary operators are required in the quantum modular exponentiation process?

In our opinion, their proposed quantum circuit for modular exponentiation is incorrect and misleading.

We have reported the flaw to some researchers including P. Shor himself, but only received a comment made by MIT professor Scott Aaronson. He explained that (personal communication, 2014/09/02):

    The repeated squaring algorithm works (and works in polynomial time) for any single $|a\rangle|0\rangle$, mapping it to $|a\rangle|x^a \ (\mathrm{mod}\ n)\rangle$. But, because of the linearity of quantum mechanics, this immediately implies that the algorithm must also work for any superposition of $|a\rangle$'s, mapping $\sum_a |a\rangle$ to $\sum_a |a\rangle|x^a \ (\mathrm{mod}\ n)\rangle$.

We do not think that his answer is convincing, because it is too vague to specify how many and what quantum gates or unitary operations are used on each qubit or a group of qubits in the second quantum register.
Besides, according to the Nielsen-Chuang description, the process
$$|a\rangle|y\rangle \to |a\rangle\, U^{a_{t-1} 2^{t-1}} \cdots U^{a_0 2^0} |y\rangle = |a\rangle|x^{a_{t-1} 2^{t-1}} \times \cdots \times x^{a_0 2^0}\, y \ (\mathrm{mod}\ n)\rangle = |a\rangle|x^a y \ (\mathrm{mod}\ n)\rangle$$
depends on the specific integer $a$. Which integer should be extracted from the superposition $\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|0\rangle$ for computing the wanted state $\frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle|x^a \ (\mathrm{mod}\ n)\rangle$? He did not pay enough attention to the difference between a linear operator performed on a pure state and one performed on a superposition.

We know the wanted superposition in the first register is prepared by the following procedure. First, a Hadamard gate $H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ is performed on each qubit to obtain the $s$ intermediate states $\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$. Second, all these states are combined using the tensor product:
$$\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \otimes \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) = \frac{1}{2}(|00\rangle + |01\rangle + |10\rangle + |11\rangle),$$
$$\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \otimes \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \otimes \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) = \frac{1}{2\sqrt{2}}(|000\rangle + |001\rangle + |010\rangle + |011\rangle + |100\rangle + |101\rangle + |110\rangle + |111\rangle),$$
$$\vdots$$
$$\underbrace{\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle) \otimes \cdots \otimes \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)}_{s \text{ qubits}} = \frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |a\rangle.$$

Note that the procedure works well because all the pure states involved are in binary form. We would like to stress that if two pure states are given in decimal representations $|x\rangle$, $|x'\rangle$, then we cannot directly combine them to obtain $|x \cdot x'\rangle$. Suppose that the binary strings for the integers $x, x'$ are $b_k \cdots b_0$ and $b'_i \cdots b'_0$. We have
$$|x\rangle \otimes |x'\rangle = |b_k \cdots b_0\, b'_i \cdots b'_0\rangle = |2^{i+1} x + x'\rangle.$$
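The index identity $|x\rangle \otimes |x'\rangle = |2^{i+1} x + x'\rangle$ can be verified directly on basis vectors. The following is a small sketch under stated assumptions: basis states are stored as lists with a single 1, and the names `basis`, `kron_vec` and `combined_index` are hypothetical helpers introduced only for this illustration.

```python
def basis(index, n_qubits):
    """Computational basis vector |index> on n_qubits qubits."""
    vec = [0] * (2 ** n_qubits)
    vec[index] = 1
    return vec

def kron_vec(u, v):
    """Tensor (Kronecker) product of two vectors given as lists."""
    return [a * b for a in u for b in v]

def combined_index(x, x_bits, y, y_bits):
    """Position of the single 1 in |x> (tensor) |y>."""
    return kron_vec(basis(x, x_bits), basis(y, y_bits)).index(1)

# x = 5 = 101 (3 qubits, k = 2) and x' = 2 = 10 (2 qubits, i = 1):
# the combined state is |10110>, i.e. index 2**(i+1) * x + x'.
idx = combined_index(5, 3, 2, 2)
print(idx, 2 ** 2 * 5 + 2)
```

The tensor product concatenates the bit strings; it neither adds nor multiplies the two integers.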
Thus,
$$\frac{1}{\sqrt{2}}(|1\rangle + |x\rangle) \otimes \frac{1}{\sqrt{2}}(|1\rangle + |x^2 \ (\mathrm{mod}\ n)\rangle) \otimes \cdots \otimes \frac{1}{\sqrt{2}}\left(|1\rangle + |x^{2^{s-1}} \ (\mathrm{mod}\ n)\rangle\right) \ne \frac{1}{\sqrt{q}} \sum_{a=0}^{q-1} |x^a \ (\mathrm{mod}\ n)\rangle,$$
where $q = 2^s$, although there is a corresponding conventional equation
$$(1 + x)(1 + x^2)(1 + x^4) \cdots (1 + x^{2^{s-1}}) = \sum_{a=0}^{q-1} x^a.$$
It seems that some people are confused by the above equation and simply take for granted that quantum modular exponentiation runs in polynomial time.

On some experimental demonstrations of Shor's algorithm

In 2001, it was reported that Shor's algorithm was demonstrated by a group at IBM, who factored 15 into 3 × 5, using a quantum computer with 3 qubits for the first register, and 4 qubits for the second register (see Figure 2) [6].

In 2007, a group at the University of Queensland reported an experimental demonstration of a compiled version of Shor's algorithm. They factored 15 into 3 × 5, using 3 qubits for the first register, and 4 qubits for the second register (see Figure 3) [7].

In 2007, a group at the University of Science and Technology of China reported another experimental demonstration of a compiled version of Shor's algorithm. They factored 15 into 3 × 5, using only 2 qubits for the first register, and 4 qubits for the second register (see Figure 4) [8].

In 2012, a group at the University of California, Santa Barbara, reported a new experimental demonstration of a compiled version of Shor's algorithm. They factored 15 into 3 × 5, using 1 qubit for the first register, and 2 qubits for the second register (see Figure 5) [9].

Demonstration        Qubits used in the first register    Qubits used in the second register
Figure 2, Ref. [6]   3                                     4
Figure 3, Ref. [7]   3                                     4
Figure 4, Ref. [8]   2                                     4
Figure 5, Ref. [9]   1                                     2
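The first-register sizes in the table can be compared with the condition $n^2 \le q < 2n^2$ for $n = 15$ by a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the qubit counts are copied from the table above.

```python
def satisfies_condition(n, s):
    """True if q = 2**s satisfies n**2 <= q < 2 * n**2."""
    q = 2 ** s
    return n * n <= q < 2 * n * n

def required_first_register_qubits(n):
    """Smallest s with 2**s >= n**2; this q = 2**s is also below 2 * n**2."""
    return (n * n - 1).bit_length()

# First-register qubit counts reported in the table, all for n = 15.
demos = {"Ref. [6]": 3, "Ref. [7]": 3, "Ref. [8]": 2, "Ref. [9]": 1}
results = {ref: satisfies_condition(15, s) for ref, s in demos.items()}

s_needed = required_first_register_qubits(15)
print(s_needed, 2 ** s_needed, results)
```

For $n = 15$ the condition reads $225 \le q < 450$, so the only admissible power of two is $q = 256$, that is, $s = 8$ qubits in the first register.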

[Figure 2: Detailed quantum circuit for the case N = 15 and a = 7 (reproduced from Ref. [6]).]

[Figure 3: Conceptual circuit for Shor's algorithm for number N = 15 and co-prime C = 4 (excerpt reproduced from Ref. [7]).]

FIG. 1: Quantum circuit for the order-finding routine of Shor's algorithm. (a) Outline of the quantum circuit. (b) Quantum circuit for $N = 15$ and $a = 11$. The MEF is implemented by two CNOT gates and the QFT is implemented by Hadamard rotations and two-qubit conditional phase gates. The gate-labeling scheme denotes the axis about which the conditional rotation takes place and the angle of rotation. (c) The simplified linear optics network using HWPs and PBSs to implement the MEF circuit and the semiclassical version of the QFT circuit. The double lines denote classical information.

Implementations of this algorithm, even for factorization of a small number, place a lot of challenging experimental demands, e.g., coherent manipulations of multiple qubits and creations of highly-entangled multiqubit registers. Here we aim to demonstrate the simplest instance of Shor's algorithm, i.e., the factorization of 15. Quantum networks for evaluating the MEF have been designed which involve $O(n^3)$ operations [15, 17]. Since $a^x = a^{2^{n-1} x_{n-1}} \cdots a^{2 x_1} a^{x_0}$, the execution of MEF can be decomposed into a sequence of controlled multiplications. A general purpose algorithm to factorize 15 would require at least $n = 8$, $m = 4$, thus total 12 qubits [15]. Several observations allow us to reduce the resources substantially for the purpose of a proof-of-principle demonstration. First we choose to implement the algorithm with $a = 11$; this was identified in [5] as the "easy" case. Since $a^2 \bmod 15 = 1$, MEF can be simplified to multiplications controlled only by $x_0$, which can be implemented by two controlled-NOT (CNOT) gates [18]. A QFT then follows to read out the period $r$. Such a circuit is shown in Fig. 1b. We note there are two qubits in the second register which evolve trivially during computation and can thus be left out.

To demonstrate the circuit of Fig. 1b we use single photons as qubits, where $|0\rangle$ and $|1\rangle$ are encoded with the photon's horizontal ($H$) and vertical ($V$) polarization respectively. The difficulty in implementing this circuit lies in the CNOT gates and conditional $\pi/2$ […] $|H\rangle$, so the gate could be realized in an easier and more efficient fashion. Such a CNOT gate uses only a polarizing beam splitter (PBS) and a half-wave plate (HWP), through which an arbitrary control qubit ($\alpha|H\rangle + \beta|V\rangle$) and the target qubit $|H\rangle$ evolve into $\alpha|H\rangle|H\rangle + \beta|V\rangle|V\rangle$ upon post-selection [20], that is, conditioned on that there is one and only one photon out of each output (see Fig. 1c). Furthermore, the QFT circuit can also be implemented with a more efficient method. It was observed by Griffiths and Niu [21] that when immediately followed by measurements, the fully coherent QFT can be replaced by a semiclassical version that employs only single-qubit rotations conditioned on measurement outcomes. This eliminates the need for entangling gates and reduces the number of gates quadratically. Thus we finally arrive at the simplified linear optics MEF and QFT network in Fig. 1c. We note that despite these simplifications, our circuit suffices to demonstrate the underlying principles of this algorithm.

Now we proceed with the experimental demonstration. Our experimental set-up is illustrated in Fig. 2, where a pulsed ultraviolet laser passes through two $\beta$-barium borate (BBO) crystals to create two pairs of entangled photons [22]. We use polarizers to disentangle the photons and prepare them in the states $|H\rangle_i$ with $i$ denoting the spatial modes (see Fig. 1c). The photons pass through the HWPs and are superposed on the PBSs (see Fig. 2) to implement the necessary single- and two-qubit gates. To ensure good spatial and temporal overlap, the photons are spectrally filtered ($\Delta\lambda_{\mathrm{FWHW}} = 3.$ […] $a = 11$ is chosen, the first step of this algorithm, the MEF, should evolve as
$$\frac{1}{2} \sum_{x=0}^{3} |x\rangle|11^x \bmod 15\rangle = \frac{1}{2}\left(|0\rangle|1\rangle + |1\rangle|11\rangle + |2\rangle|1\rangle + |3\rangle|11\rangle\right).$$
As we rewrite it in binary representation( | i + | i + | i + | i ) /

2, it showsthat a nontrivial Greenberger-Horne-Zeilinger (GHZ) [24]entangled state | ψ i = (1 / √ | i | i | i + | i | i | i )is created between the two registers. For Shor’s algo-rithm as well as some others, multiqubit entanglement isa necessary condition if the quantum algorithm is to oﬀeran exponential speed-up over classical computation [6].In our experiment, as the photons pass through the MEFcircuit, we ﬁrst observe the Hong-Ou-Mandel type inter-ference [25] of three photons in arms 2-3-4 (see Fig. 3b).Then, after ﬁxing the delays at the zero positions, we ex- Figure 4: Outline of quantum circuit for Shor’s algorithm for N = 15 and a = 11. g "1""0" h "1""0" e ggg eee ggg eee01/2 GHZgg ee eegg 1/20-1/2 d HQ2Q3 H HCzH armod(N)Init QuantumFourier TransformModular Exponentiation

[Figure omitted: excerpt from the cited superconducting-qubit experiment. Its embedded caption reads: "FIG. 3: Compiled version of Shor's algorithm. a, Four-qubit circuit to factor N = 15, with co-prime a = 4. The three steps in the algorithm are initialization, modular exponentiation, and the quantum Fourier transform, which computes a^r mod(N) and returns the period r = 2. b, 'Recompiled' three-qubit version of Shor's algorithm. The redundant qubit Q1 is removed by noting that HH = I. Circuits a and b are equivalent for this specific case. The three steps of the runtime analysis are labeled 1, 2, 3. c, CNOT gates are realized using an equivalent controlled-Z (CZ) circuit." Panels d to i show density matrices from the runtime analysis and from a check run without entangling gates. The accompanying text reports 150,000 repetitions of the algorithm, a linear entropy S_L = 0.78 for the output register, the period r = 2 and, after classical processing, the prime factors 3 and 5; without the entangling operations the algorithm fails, as expected.]

Figure 5: A three-qubit compiled version of Shor's algorithm to factor N = 15.

We now want to remark that:

• All these demonstrations are flawed because they violate the necessary condition $15^2 < 2^8 < 2 \times 15^2$, which means 8 qubits should be used in the first register. Obviously, the last step of continued fraction expansion in Shor's algorithm cannot be accomplished if fewer qubits are used in the first register. It seems that these groups have misunderstood the necessary condition $n^2 \le q < 2n^2$ in Shor's algorithm.

• In Figure 3, the output of the second register is denoted directly by $C^x \bmod N$. Clearly, the authors confused the number $C^x \bmod N$ with the state $|C^x \bmod N\rangle$. Moreover, the wanted state in the second register is the superposition $\frac{1}{\sqrt{q}}\sum_{x=0}^{q-1}|C^x \bmod N\rangle$ instead of the pure state $|C^x \bmod N\rangle$.

• In Figure 5, only 3 qubits are used. Clearly, the modulus 15 cannot be represented by 3 qubits. In such a case, how can one ensure that the modulus is really involved in the computation? In our opinion, the demonstration is not credible.

Shor's factoring algorithm is interesting, but its subroutine for quantum modular exponentiation is not specified. We remark that both Shor's original description and the Nielsen-Chuang description of quantum modular exponentiation are flawed. They can be used only for the pure state $|a\rangle|0\rangle$, not for the superposition $\frac{1}{\sqrt{q}}\sum_{a=0}^{q-1}|a\rangle|0\rangle$. We also remark that some experimental demonstrations of Shor's algorithm are meaningless and misleading because they violate a necessary condition for Shor's algorithm.

Acknowledgements. This work was supported by the National Natural Science Foundation of China (Grant Nos. 60970110, 60972034), and the State Key Program of National Natural Science of China (Grant No. 61033014).
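The register-size arithmetic used in the remarks can be checked with a short sketch. It only evaluates the condition $n^2 \le q < 2n^2$ with $q = 2^s$ and counts the bits needed to hold a residue modulo $n$; the function names `first_register_size` and `second_register_size` are hypothetical helpers introduced here for illustration and do not come from any of the cited papers.

```python
def first_register_size(n: int) -> tuple[int, int]:
    """Return (s, q) with q = 2**s and n**2 <= q < 2 * n**2.

    Hypothetical helper: s is the number of first-register qubits
    implied by the condition quoted in the remarks.
    """
    s = (n * n - 1).bit_length()  # smallest s such that 2**s >= n**2
    q = 1 << s
    assert n * n <= q < 2 * n * n
    return s, q


def second_register_size(n: int) -> int:
    """Return the number of qubits needed to hold any residue 0..n-1."""
    return max(1, (n - 1).bit_length())


if __name__ == "__main__":
    s, q = first_register_size(15)
    m = second_register_size(15)
    # For n = 15: 225 <= 256 < 450, so s = 8, and residues need m = 4 qubits.
    print(s, q, m)  # prints: 8 256 4
```

For n = 15 this gives s = 8, q = 256 and m = 4, which matches the count of 8 first-register qubits stated in the first remark and shows why 3 qubits cannot hold a residue modulo 15.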

References

[1] Miller G.: Riemann's hypothesis and tests for primality. J. Comput. System Sci., 13: 300-317 (1976)
[2] Shor P.: Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer. SIAM J. Comput. 26 (5): 1484-1509 (1997)
[3] Nielsen M., and Chuang I.: Quantum Computation and Quantum Information. Cambridge University Press (2000)
[4] Markov I., and Saeedi M.: Constant-Optimized Quantum Circuits for Modular Multiplication and Exponentiation. Quantum Information and Computation, Vol. 12, No. 5&6, pp. 361-394 (2012)
[5] Markov I., and Saeedi M.: Faster Quantum Number Factoring via Circuit Synthesis. Physical Review A 87, 012310 (2013)
[6] Vandersypen L., et al.: Experimental realization of Shor's quantum factoring algorithm using nuclear magnetic resonance. Nature 414 (6866): 883-887, arXiv:quant-ph/0112176 (2001)
[7] Lanyon B., et al.: Experimental Demonstration of a Compiled Version of Shor's Algorithm with Quantum Entanglement. Physical Review Letters 99 (25): 250505, arXiv:0705.1398 (2007)
[8] Lu Chao-Yang, et al.: Demonstration of a Compiled Version of Shor's Quantum Factoring Algorithm Using Photonic Qubits. Physical Review Letters 99 (25): 250504, arXiv:0705.1684 (2007)
[9] Lucero E., et al.: Computing prime factors with a Josephson phase qubit quantum processor. Nature Physics 8, 719-723, arXiv:1202.5707 (2012)