arXiv [quant-ph]

Quantum Gibbs Sampling Using Szegedy Operators

Robert R. Tucci
P.O. Box 226
Bedford, MA
[email protected]
January 24, 2018
Abstract
We present an algorithm for doing Gibbs sampling on a quantum computer. The algorithm combines phase estimation for a Szegedy operator with Grover's algorithm. For any ε > 0, the algorithm will sample a probability distribution in O(1/√δ) steps with precision O(ε). Here δ is the distance between the two largest eigenvalue magnitudes of the transition matrix of the Gibbs Markov chain used in the algorithm. It takes O(1/δ) steps to achieve the same precision if one does Gibbs sampling on a classical computer.

1 Introduction
In Ref.[1], Szegedy proposed a quantum walk operator for each classical Markov chain. In Ref.[2], Somma et al. proposed a method for doing simulated annealing on a quantum computer. In Ref.[3], Wocjan et al. showed how to improve the Somma et al. algorithm. The algorithms of Somma et al. and Wocjan et al. both use Szegedy operators. In Ref.[4], I presented computer programs called QuSAnn and Multiplexor Expander that implement ideas of Refs.[2] and [3], and also some of my own ideas about quantum multiplexors.

In Ref.[5], I described one particular algorithm for doing Gibbs and Metropolis-Hastings sampling of a classical Bayesian network (i.e., a probability distribution) on a quantum computer. In this paper, I describe a different algorithm for doing Gibbs sampling on a quantum computer. Unlike my first algorithm, this one uses Szegedy operators. For any ε > 0, this new algorithm will sample a Bayesian network in O(1/√δ) steps with precision O(ε). Here δ is the distance between the two largest eigenvalue magnitudes of the transition matrix of the Gibbs Markov chain used in the algorithm. It takes O(1/δ) steps to achieve the same precision if one does Gibbs sampling on a classical computer.

This paper assumes that its reader has read the section entitled "Notation and Preliminaries" in Ref.[5]. The reader should refer to Refs.[5, 4] for clarification when any notation of this paper eludes him.

2 Dual Gibbs Markov Chains

In this section, we will discuss dual "Gibbs" Markov chains with transition matrices M_1 and M_2, respectively. These two transition matrices are both defined in terms of a single classical Bayesian network x.

M_1 and M_2

Consider a classical Bayesian net with N_nds nodes, labeled x_1, x_2, ..., x_{N_nds}, where x_j ∈ S_{x_j} for each j. (As usual in my papers, I indicate random variables by underlining them.) Let x = (x_1, x_2, ..., x_{N_nds}). Let x assume values in a set S_x which has N_S = 2^{N_B} elements. Let

π(x) = P(x = x)   (1)

for all x ∈ S_x.

For N_nds = 3 and x, y ∈ S_x, let

M_1(y|x) = P_{x_1|x_2,x_3}(y_1|x_2,x_3) P_{x_2|x_1,x_3}(y_2|y_1,x_3) P_{x_3|x_1,x_2}(y_3|y_1,y_2),   (2)

and

M_2(y|x) = P_{x_1|x_2,x_3}(y_1|y_2,y_3) P_{x_2|x_1,x_3}(y_2|x_1,y_3) P_{x_3|x_1,x_2}(y_3|x_1,x_2).   (3)

(M_2(y|x) can be obtained by swapping x_i and y_i in the conditioned arguments of M_1(y|x).) Note that Σ_y M_1(y|x) = 1 and Σ_y M_2(y|x) = 1. Define M_1 and M_2 for arbitrary N_nds using the same pattern. M_1 and M_2 are transition matrices of the type typical for Gibbs sampling. (See Ref.[5] for an introduction to Gibbs sampling and the more general Metropolis-Hastings sampling.)

You can check that π() is not a detailed balance of either M_1 or M_2 separately. However, the following property is true. We will refer to this property by saying that π() is a detailed balance of the pair (M_1, M_2).

Claim 1

M_1(y|x) π(x) = M_2(x|y) π(y)   (4)

for all x, y ∈ S_x.
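Claim 1 is easy to spot-check numerically. The sketch below is not from the paper: it builds the forward and reverse Gibbs sweep chains of Eqs.(2)-(3) from a random joint distribution over three binary nodes, and verifies the pair detailed balance property together with the fact that each M_j(.|x) is a probability distribution in its first argument.

```python
import itertools
import math
import random

random.seed(0)
states = list(itertools.product([0, 1], repeat=3))

# Random joint distribution pi(x1, x2, x3) over three binary nodes.
w = {s: random.random() for s in states}
Z = sum(w.values())
pi = {s: w[s] / Z for s in states}

def cond(i, v, given):
    """P(x_i = v | other two nodes = given), computed from the joint pi."""
    num = sum(p for s, p in pi.items()
              if s[i] == v and tuple(s[k] for k in range(3) if k != i) == given)
    den = sum(p for s, p in pi.items()
              if tuple(s[k] for k in range(3) if k != i) == given)
    return num / den

# Eq.(2): forward Gibbs sweep (update node 1, then 2, then 3).
def M1(y, x):
    return (cond(0, y[0], (x[1], x[2])) *
            cond(1, y[1], (y[0], x[2])) *
            cond(2, y[2], (y[0], y[1])))

# Eq.(3): reverse Gibbs sweep (update node 3, then 2, then 1).
def M2(y, x):
    return (cond(2, y[2], (x[0], x[1])) *
            cond(1, y[1], (x[0], y[2])) *
            cond(0, y[0], (y[1], y[2])))

# Claim 1: M1(y|x) pi(x) = M2(x|y) pi(y) for all x, y (pair detailed balance).
pair_db_ok = all(math.isclose(M1(y, x) * pi[x], M2(x, y) * pi[y])
                 for x in states for y in states)

# Sanity: columns of both transition matrices sum to one.
rows_ok = all(math.isclose(sum(M1(y, x) for y in states), 1.0) and
              math.isclose(sum(M2(y, x) for y in states), 1.0) for x in states)
print(pair_db_ok, rows_ok)
```

Note that neither chain satisfies ordinary detailed balance on its own; only the pair property of Eq.(4) holds.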
proof: Let P(x_j, x_k, ...) stand for P(x_j = x_j, x_k = x_k, ...). Assume N_nds = 3 to begin with. One has

M_1(y|x) / M_2(x|y)
= [P(y_1|x_2,x_3) P(y_2|y_1,x_3) P(y_3|y_1,y_2)] / [P(x_1|x_2,x_3) P(x_2|y_1,x_3) P(x_3|y_1,y_2)]   (5)
= [P(y_1,x_2,x_3) P(y_1,y_2,x_3) P(y_1,y_2,y_3)] / [P(x_1,x_2,x_3) P(y_1,x_2,x_3) P(y_1,y_2,x_3)]   (6)
= [P(y_1,x_2,x_3) P(y_1,y_2,x_3) P(y)] / [P(x) P(y_1,x_2,x_3) P(y_1,y_2,x_3)]   (7)
= P(y) / P(x).   (8)

A proof for an arbitrary number N_nds of nodes follows the same pattern. QED

M_1, M_2 and M_hyb

Let

Λ_j(y|x) = √(M_j(y|x)),   (9)

for j = 1, 2 and x, y ∈ S_x. It's convenient to define a hybrid function of M_1 and M_2, as follows:

M_hyb(y|x) = Λ_2(x|y) Λ_1(y|x)   (10)

for x, y ∈ S_x. (Note that unlike M_1(y|x) and M_2(y|x), M_hyb(y|x) is not a probability function in y, its first argument.)

Define the quantum states

|(π)^η⟩ = Σ_x [π(x)]^η |x⟩   (11)

for η = 1/2, 1. (Note that only the η = 1/2 state is normalized in the sense of quantum mechanics.)

Claim 2

M_j |π⟩ = |π⟩ for j = 1, 2,   (12)

and

M_hyb |√π⟩ = |√π⟩.   (13)

Also, M_1, M_2 and M_hyb have the same eigenvalues.

proof: Taking the square root of both sides of the pair detailed balance statement Eq.(4), we get

Λ_1(y|x) √π(x) = Λ_2(x|y) √π(y).   (14)

Therefore,

M_hyb(y|x) = Λ_2(x|y) Λ_1(y|x) = Λ_2(x|y) (1/√π(x)) Λ_2(x|y) √π(y) = (1/√π(x)) M_2(x|y) √π(y).   (15)

Hence,

Σ_x M_1(y|x) π(x) = Σ_x M_2(x|y) π(y) = π(y),   (16)

Σ_y M_2(x|y) π(y) = Σ_y M_1(y|x) π(x) = π(x),   (17)

and

Σ_x M_hyb(y|x) √π(x) = Σ_x √π(x) M_2(x|y) √π(y) / √π(x) = √π(y).   (18)

Order the elements of the finite set S_x in some preferred way. Use this preferred order to represent M_1, M_2 and M_hyb as matrices. Define a diagonal matrix D whose diagonal entries are the numbers √π(x) for each x ∈ S_x, with the x ordered in the preferred order:

D = diag[(√π(x)) ∀x].   (19)

Since

M_1^T = D^{-2} M_2 D^2,  M_hyb^T = D^{-1} M_2 D,   (20)

it follows that

det(M_1 − λ1) = det(M_2 − λ1) = det(M_hyb − λ1)   (21)

for any λ ∈ ℂ. QED
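The statements of Claim 2 can also be checked numerically. A minimal sketch, assuming the same kind of random 3-node binary Bayesian net as in Eqs.(2)-(3) (all variable names here are illustrative, not from Refs.[4]-[5]):

```python
import numpy as np

rng = np.random.default_rng(1)

# Random joint distribution over 3 binary nodes.
P = rng.random((2, 2, 2)); P /= P.sum()
C = [P / P.sum(axis=k, keepdims=True) for k in range(3)]   # C[k] = P(x_k | rest)
S = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]

def m1(y, x):  # Eq.(2)
    return C[0][y[0], x[1], x[2]] * C[1][y[0], y[1], x[2]] * C[2][y[0], y[1], y[2]]
def m2(y, x):  # Eq.(3)
    return C[2][x[0], x[1], y[2]] * C[1][x[0], y[1], y[2]] * C[0][y[0], y[1], y[2]]

M1 = np.array([[m1(y, x) for x in S] for y in S])   # entry (y, x) = M1(y|x)
M2 = np.array([[m2(y, x) for x in S] for y in S])
pi = np.array([P[s] for s in S])
Lam1, Lam2 = np.sqrt(M1), np.sqrt(M2)
Mhyb = Lam2.T * Lam1     # elementwise: M_hyb(y|x) = Λ2(x|y) Λ1(y|x), Eq.(10)

# Eqs.(12)-(13): M1 and M2 fix |pi>, M_hyb fixes |sqrt(pi)>.
fix_ok = (np.allclose(M1 @ pi, pi) and np.allclose(M2 @ pi, pi)
          and np.allclose(Mhyb @ np.sqrt(pi), np.sqrt(pi)))

# Eq.(20): M_hyb^T = D^{-1} M2 D with D = diag(sqrt(pi)).
D = np.diag(np.sqrt(pi))
sim_ok = np.allclose(Mhyb.T, np.linalg.inv(D) @ M2 @ D)

# Eq.(21): the three characteristic polynomials agree at arbitrary test points.
I8 = np.eye(8)
spec_ok = all(np.isclose(np.linalg.det(M1 - z * I8), np.linalg.det(M2 - z * I8))
              and np.isclose(np.linalg.det(M1 - z * I8), np.linalg.det(Mhyb - z * I8))
              for z in (0.3, -0.7, 0.5 + 0.2j))
print(fix_ok, sim_ok, spec_ok)
```

The similarity transformations of Eq.(20) are what make the spectra coincide even though M_hyb is not itself a stochastic matrix.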
Let the eigenvalues of M_hyb (and also of M_1 and M_2) be m_0, m_1, ..., m_{N_S−1} ∈ ℂ, with m_0 = 1 > |m_1| ≥ |m_2| ≥ ... ≥ |m_{N_S−1}|. (There must be a single eigenvalue 1, and all others must have a magnitude strictly smaller than one, because of the Frobenius-Perron Theorem. The eigenvalues may be complex.) Define |m_j⟩ to be the corresponding eigenvectors of M_hyb (but not necessarily of M_1 and M_2). Thus

M_hyb |m_j⟩ = m_j |m_j⟩,   (22)

for j = 0, 1, ..., N_S − 1. In particular, |m_0⟩ = |√π⟩.

For each j, define φ_j ∈ [0, π/2] and η_j ∈ [0, 2π) so that m_j = e^{iη_j} cos φ_j. (Thus cos φ_0 = m_0 = 1, so φ_0 = 0.) The M_hyb eigenvalue gap δ is defined as δ = 1 − |m_1|. Note that δ ≈ φ_1²/2 when φ_1 is small.

3 The q-Embeddings U_1 and U_2

In this section, we define the q-embeddings U_j of M_j, for j = 1, 2. (For more information about q-embeddings, see Ref.[5].)

For simplicity, we begin this section by considering a Bayesian net with only 3 nodes x_1, x_2, x_3, and such that each of these nodes is binary (i.e., S_{x_j} = Bool for j = 1, 2, 3). Define a unitary matrix U_1 of the form shown in Fig.1, with its multiplexor gates defined as follows. U_1 has 3 analogous gates (a.k.a. nodes), labeled (x'_1, x_2, x_3), (x'_1, x'_2, x_3), and (x'_1, x'_2, x'_3): one multiplexor for each node updated during the forward Gibbs sweep of M_1. Consider the first of these for definiteness. Let the probability amplitude A(x'_1, x̃_2, x̃_3 | x̃'_1, x_2, x_3) of node (x'_1, x_2, x_3) satisfy the constraint

A(x'_1, x̃_2, x̃_3 | x̃'_1 = 0, x_2, x_3)   (23)
= √( P_{x_1|x_2,x_3}(x'_1 | x_2, x_3) ) δ_{x̃_2, x_2} δ_{x̃_3, x_3}.   (24)

Figure 1: Unitary matrix U_1 expressed as a product of quantum multiplexors.

If we indicate the non-zero entries of A by plus signs, then each column of A labeled by (x̃'_1 = 0, x_2, x_3) has its two non-zero entries in the rows (x'_1, x_2, x_3) with x'_1 ∈ Bool.   (25)

Such a pattern of non-zero entries can be extended to a quantum multiplexor,

→ Σ_{b⃗ ∈ Bool²} e^{iθ_b⃗ σ_Y} ⊗ P_b⃗,   (26)

for some θ_b⃗ ∈ ℝ, where P_b⃗ = |b⃗⟩⟨b⃗|. Here the right-pointing arrow means that the expression at the origin of the arrow can be extended to the expression at the target of the arrow.

From the above definition of U_1, it follows that, for x, x', y, y' ∈ Bool³,

⟨y|⟨y'| U_1 |x⟩|0⟩^{⊗3}   (27)
= Λ_1(y'|x) δ(y, x).   (28)

Hence,

U_1 |x⟩|0⟩^{⊗3} = |x⟩ (Λ_1|x⟩)  (or, with the two registers written in the opposite order, U_1 |0⟩^{⊗3}|x⟩ = (Λ_1|x⟩)|x⟩).   (29)

Figure 2: Unitary matrix U_2 expressed as a product of quantum multiplexors.

Besides U_1, it is convenient to consider another unitary matrix called U_2.
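Before turning to U_2, here is a hedged numerical sketch of the defining property Eq.(29). Instead of the multiplexor circuit of Fig.1, it completes the isometry with columns |x⟩ ⊗ Λ_1|x⟩ to a unitary by QR factorization; the columns are orthonormal because ⟨x|x'⟩⟨Λ_1 x|Λ_1 x'⟩ = δ(x,x'), so any unitary completion serves as a q-embedding for our purposes.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4               # N_S = 4 classical states, i.e. N_B = 2 qubits per register

# Any column-stochastic matrix can play the role of M1 here.
M1 = rng.random((N, N)); M1 /= M1.sum(axis=0, keepdims=True)
Lam1 = np.sqrt(M1)  # Λ1(y|x) = sqrt(M1(y|x)), Eq.(9)

# Prescribed orthonormal columns |x> ⊗ Λ1|x>.
e = np.eye(N)
V = np.stack([np.kron(e[x], Lam1[:, x]) for x in range(N)], axis=1)

# Complete the isometry to a unitary via QR, then place the prescribed
# columns at the |x>|0> positions (column index x*N + 0).
Q, _ = np.linalg.qr(np.hstack([V, rng.random((N * N, N * N - N))]))
Q[:, :N] *= np.sign(np.sum(Q[:, :N] * V, axis=0))   # undo QR sign flips
U1 = np.zeros((N * N, N * N))
U1[:, [x * N for x in range(N)]] = Q[:, :N]
U1[:, [c for c in range(N * N) if c % N != 0]] = Q[:, N:]

# Check Eq.(29): U1 (|x> ⊗ |0>) = |x> ⊗ Λ1|x>.
embed_ok = all(np.allclose(U1 @ np.kron(e[x], e[0]), np.kron(e[x], Lam1[:, x]))
               for x in range(N))
unitary_ok = np.allclose(U1.T @ U1, np.eye(N * N))
print(embed_ok, unitary_ok)
```

The multiplexor decomposition of Fig.1 is one particular, efficiently compilable choice of this completion; the QR completion above is just the quickest way to exhibit one numerically.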
We define U_2 to be of the form of Fig.2, where the multiplexors are defined in such a way that U_2 satisfies, for all x, x', y, y' ∈ Bool³,

⟨y|⟨y'| U_2 |0⟩^{⊗3}|x'⟩   (30)
= Λ_2(y|x') δ(y', x').   (31)

Hence,

U_2 |0⟩^{⊗3}|x'⟩ = (Λ_2|x'⟩)|x'⟩  (or, with the two registers written in the opposite order, U_2 |x'⟩|0⟩^{⊗3} = |x'⟩(Λ_2|x'⟩)).   (32)

U_j is called the q-embedding of M_j for j = 1, 2.

Figure 3: The unitary matrix A is a quantum embedding of a probability matrix P(b|a), where a has N_Ba bits and b has N_Bb bits.

So far we have considered the q-embeddings U_1 and U_2 for the case of a classical Bayesian network x with 3 binary nodes. What if x has N_nds nodes and some of those nodes have more than 2 states? In that case, we must use several qubits (horizontal lines) for each node x_i (and an equal number of qubits for the dual node x'_i) in Figs.1 and 2. More specifically, suppose P(x_1 | x_2, x_3, ..., x_{N_nds}) equals P(b|a), where a ∈ Bool^{N_Ba} and b ∈ Bool^{N_Bb}. For the number of bits N_Ba, define the number of states N_Sa = 2^{N_Ba}. Likewise, let N_Sb = 2^{N_Bb}. The constraint Eq.(24) generalizes to

A(b, ã | b̃ = 0, a) = √(P(b|a)) δ_{ã,a},   (33)

where a, ã ∈ Bool^{N_Ba} and b, b̃ ∈ Bool^{N_Bb}. Eq.(33) can be expressed in matrix form as the block column

[A(b, ã | b̃ = 0, a)] = [ D_{0,0} ; D_{1,0} ; ... ; D_{N_Sb−1,0} ],   (34)

indexed by (b̃ = 0, a) → (b, ã), where, for all b ∈ Bool^{N_Bb}, the D_{b,0} ∈ ℝ^{N_Sa × N_Sa} are diagonal matrices with entries

(D_{b,0})_{ã,a} = √(P(b|a)) δ_{ã,a}.   (35)

By adding more columns to the matrix of Eq.(34), one can extend it (see section entitled "Q-Embeddings" in Ref.[5]) to a square matrix which can be expressed in terms of multiplexors as in Fig.3.

The Markov blanket MB(i) for a node x_i of the classical Bayesian network x satisfies (see section entitled "Notation and Preliminaries" in Ref.[5])

P(x_i | x_{{i}^c}) = P(x_i | x_{MB(i)}).   (36)

If the set MB(i) is strictly smaller than the set {i}^c, this property can be used to reduce the number of controls for the multiplexor in U_1 and U_2 corresponding to P(x_i | x_{{i}^c}).

Given the two q-embeddings U_1 and U_2 for a Bayesian network x, we can define a unitary matrix U as follows:

U = U_2† U_1.   (37)

Matrix U has the following highly desirable property:

Claim 3
For any j, k ∈ {0, 1, ..., N_S − 1},

⟨0|⟨m_j| U |m_k⟩|0⟩ = m_j δ_kj.   (38)

proof:

⟨0|⟨m_j| U_2† U_1 |m_k⟩|0⟩ = Σ_{y,x} ⟨m_j|y⟩ [ (⟨y|Λ_2^T) ⊗ ⟨y| ] [ |x⟩ ⊗ (Λ_1|x⟩) ] ⟨x|m_k⟩   (39)
= Σ_{y,x} ⟨m_j|y⟩ Λ_2^T(y|x) Λ_1(y|x) ⟨x|m_k⟩   (40)
= ⟨m_j| M_hyb |m_k⟩ = m_j δ_kj.   (41)

QED

4 The Szegedy Operator W

In this section, we will define a special type of Szegedy quantum walk operator W corresponding to a Bayesian net x. We will then find the eigenvalues of W.

Definition of W

As in Ref.[4], let E denote the operator that exchanges the two N_B-qubit registers, and define the projection operator π̂ and its dual projection operator π̌ by

π̂ = |0⟩⟨0| ⊗ 1,  π̌ = E π̂ E = 1 ⊗ |0⟩⟨0|.   (42)

Then the Szegedy quantum walk operator W for the Bayesian net x is defined by

W = U (1 − 2π̌) U† (1 − 2π̂).   (43)

Eigenvalues of W

To find the eigenvalues of W, we will use the following identities. (Here and below, we abbreviate |0⟩^{⊗N_B} ⊗ |m_j⟩ by |m_j⟩.)

Claim 4

π̂ |m_j⟩ = |m_j⟩,   (44a)
π̂ (UE) |m_j⟩ = m_j |m_j⟩,   (44b)
π̂ (EU†) |m_j⟩ = m_j* |m_j⟩,   (44c)

for all j ∈ {0, 1, ..., N_S − 1}.

proof: From the definition of π̂, we see that

π̂ |0⟩|m_j⟩ = |0⟩|m_j⟩.   (45)

Also,

π̂ (UE) |0⟩|m_j⟩ = Σ_k (|0⟩⟨0| ⊗ |m_k⟩⟨m_k|) U |m_j⟩|0⟩ = m_j |0⟩|m_j⟩,   (46)

and

π̂ (EU†) |0⟩|m_j⟩ = Σ_k (|0⟩⟨0| ⊗ |m_k⟩⟨m_k|) E U† |0⟩|m_j⟩ = m_j* |0⟩|m_j⟩.   (47)

QED
An immediate consequence of Claim 4 is that

⟨m_j| (UE) |m_k⟩ = ⟨m_j| π̂ (UE) |m_k⟩ = m_j δ_kj,   (48)

for j, k ∈ {0, 1, ..., N_S − 1}. (Recall that E exchanges the two N_B-qubit registers, and |m_j⟩ abbreviates |0⟩^{⊗N_B} ⊗ |m_j⟩.) Note that since m_0 = 1, Eq.(48) implies that

|m_0⟩ = (UE) |m_0⟩.   (49)

Another consequence of Claim 4 is that |m_0⟩ is a stationary state of W. Indeed, one has

W |m_0⟩ = U (1 − 2π̌) U† (1 − 2π̂) |m_0⟩   (50)
= (UE)(1 − 2π̂)(EU†)(−1)|m_0⟩   (51)
= (1 − 2m_0* (UE))(−1)|m_0⟩   (52)
= (1 − 2)(−1)|m_0⟩   (53)
= |m_0⟩.   (54)

Let

V_busy^j = span{ |m_j⟩, (UE)|m_j⟩ }   (55)

for j ∈ {0, 1, ..., N_S − 1}. (By "span" we mean all linear combinations of these vectors with complex coefficients.)

Claim 5

W V_busy^j = V_busy^j for all j ∈ {0, 1, ..., N_S − 1}. For j = 0, let

|ψ_0⟩ = |m_0⟩.   (56)

{|ψ_0⟩} is an orthonormal basis for V_busy^0 and W|ψ_0⟩ = |ψ_0⟩. For j ≠ 0, let

|ψ_j^±⟩ = (∓i / (√2 sin φ_j)) ( e^{−iη_j} (UE)|m_j⟩ − e^{∓iφ_j} |m_j⟩ ).   (57)

{|ψ_j^+⟩, |ψ_j^−⟩} is an orthonormal basis for V_busy^j and W|ψ_j^±⟩ = e^{±2iφ_j} |ψ_j^±⟩.

proof: Using the identities of Claim 4, one finds after some algebra that

W |m_j⟩ = −|m_j⟩ + 2m_j* (UE)|m_j⟩,   (58a)

and

W (UE)|m_j⟩ = −2m_j |m_j⟩ + (4|m_j|² − 1)(UE)|m_j⟩   (58b)

for all j.

According to Eqs.(58), V_busy^j is invariant under the action of W for each j. By virtue of Eq.(48), V_busy^j is 1-dimensional for j = 0 and 2-dimensional if j ≠ 0. We've already proven that |m_0⟩ is a stationary state of W.

Now consider a fixed j ≠ 0. Both U(1 − 2π̌)U† and (1 − 2π̂) are reflections, and reflections are a special type of orthogonal matrix, so the product of these 2 orthogonal matrices is also an orthogonal matrix. In fact, it's a rotation about the axis perpendicular to the planar subspace V_busy^j. The vectors |m_j⟩ and (UE)|m_j⟩ are independent but not orthogonal. However, we can express them in terms of orthogonal vectors (see Fig.4) as follows:

|m_j⟩ = |e_j^1⟩,   (59a)

and

e^{−iη_j} (UE)|m_j⟩ = cos(φ_j)|e_j^1⟩ + sin(φ_j)|e_j^2⟩.   (59b)

Figure 4: Definition of |e_j^1⟩ and |e_j^2⟩.

In the |e_j^1⟩, |e_j^2⟩ basis, we find after substituting m_j = e^{iη_j} cos(φ_j) into Eqs.(58) that

W = [ cos(2φ_j)  −sin(2φ_j) ; sin(2φ_j)  cos(2φ_j) ].   (60)

The eigenvalues of this matrix are e^{±2iφ_j}, with corresponding eigenvectors

|ψ_j^±⟩ = (1/√2)(|e_j^1⟩ ∓ i|e_j^2⟩).   (61)

These eigenvectors satisfy

⟨ψ_j^−|ψ_j^+⟩ = 0,  ⟨ψ_j^+|ψ_j^+⟩ = ⟨ψ_j^−|ψ_j^−⟩ = 1.   (62)

By expressing |e_j^1⟩ and |e_j^2⟩ in Eq.(61) in the original basis, we get Eq.(57). QED
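Claim 5 can be checked numerically in the special case M_1 = M_2 = M (a reversible Metropolis chain), where M_hyb is symmetric and the eigenvectors |m_j⟩ are genuinely orthonormal. The sketch below builds W from Eq.(43), with arbitrary QR unitary completions standing in for the multiplexor circuits of Figs.1-2, and confirms the stationary state and the e^{±2iφ_j} eigenphases with cos φ_j = |m_j|.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 4                          # N_S; two registers of N_B = 2 qubits each
pi = rng.random(N); pi /= pi.sum()

# Reversible (Metropolis) chain with stationary distribution pi, so M1 = M2 = M.
M = np.zeros((N, N))
for x in range(N):
    for y in range(N):
        if y != x:
            M[y, x] = min(1.0, pi[y] / pi[x]) / N
    M[x, x] = 1.0 - M[:, x].sum()
Lam = np.sqrt(M)

def embed(cols, positions):
    """Complete prescribed orthonormal columns to a real unitary."""
    D = cols.shape[0]
    k = cols.shape[1]
    Q, _ = np.linalg.qr(np.hstack([cols, rng.random((D, D - k))]))
    Q[:, :k] *= np.sign(np.sum(Q[:, :k] * cols, axis=0))   # undo QR sign flips
    U = np.zeros((D, D))
    U[:, positions] = Q[:, :k]
    U[:, [c for c in range(D) if c not in positions]] = Q[:, k:]
    return U

e = np.eye(N)
V1 = np.stack([np.kron(e[x], Lam[:, x]) for x in range(N)], axis=1)  # |x>⊗Λ|x>
V2 = np.stack([np.kron(Lam[:, x], e[x]) for x in range(N)], axis=1)  # Λ|x>⊗|x>
U1 = embed(V1, [x * N for x in range(N)])   # U1 |x>|0> = |x> Λ|x>, Eq.(29)
U2 = embed(V2, list(range(N)))              # U2 |0>|x> = (Λ|x>) |x>, Eq.(32)
U = U2.T @ U1                               # Eq.(37)

pi_hat = np.kron(np.outer(e[0], e[0]), np.eye(N))   # |0><0| ⊗ 1, Eq.(42)
pi_chk = np.kron(np.eye(N), np.outer(e[0], e[0]))   # 1 ⊗ |0><0|
W = U @ (np.eye(N * N) - 2 * pi_chk) @ U.T @ (np.eye(N * N) - 2 * pi_hat)

# Stationary state |0>|m0> with |m0> = |sqrt(pi)>, Eqs.(50)-(54).
m0 = np.kron(e[0], np.sqrt(pi))
stat_ok = np.allclose(W @ m0, m0)

# Claim 5: W has eigenvalues exp(±2i arccos|m_j|), m_j = eigenvalues of M_hyb.
lam = np.linalg.eigvalsh(Lam * Lam.T)       # M_hyb symmetric in this case
wev = np.linalg.eigvals(W)
targets = np.exp(2j * np.arccos(np.clip(np.abs(lam), 0, 1)))
spec_ok = all(np.min(np.abs(wev - t)) < 1e-6
              for t in np.concatenate([targets, targets.conj()]))
print(stat_ok, spec_ok)
```

This is only a sanity check of the special case treated in Ref.[4]; in the general M_1 ≠ M_2 setting the |m_j⟩ need not be orthonormal, which is why the paper works with M_hyb rather than M_1 or M_2 directly.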
Define the following vector spaces:

V = span{ |y⟩ ⊗ |x⟩ : x, y ∈ S_x },   (63)

V_A = span{ |0⟩ ⊗ |x⟩ : x ∈ S_x },   (64)

V_B = (UE) V_A,   (65)

and

V_busy = V_A + V_B.   (66)

V can be expressed as a direct sum of V_busy and its orthogonal complement V_busy^⊥:

V = V_busy ⊕ V_busy^⊥.   (67)

From Claim 5, it follows that V_busy is a direct sum of the subspaces V_busy^j:

V_busy = ⊕_{j=0}^{N_S−1} V_busy^j.   (68)

Recall that the matrices M_1, M_2 and M_hyb are N_S-dimensional, whereas W is N_S²-dimensional. Since the size of S_x is N_S, dim(V) = N_S². From Eq.(68) and Claim 5, dim(V_busy) = 2N_S − 1. Furthermore, {|ψ_0⟩} ∪ {|ψ_j^±⟩ : j = 1, 2, ..., N_S − 1} is an orthonormal basis for V_busy.

At this point we've explained the action of W on V_busy, but we haven't said anything about the action of W on V_busy^⊥. Next we show that W acts simply as the identity on V_busy^⊥. (This is what one would expect, since the vectors in V_busy^⊥ are parallel to the axis of the W rotation.) Recall that if S and T are subspaces of a vector space V, then (S + T)^⊥ = S^⊥ ∩ T^⊥. Therefore,

V_busy^⊥ = V_A^⊥ ∩ V_B^⊥.   (69)

From the definitions of V_A and V_B, it's easy to see that

V_A^⊥ = span{ |y⟩ ⊗ |x⟩ : x ∈ S_x, and y ∈ S_x − {0} },   (70)

and

V_B^⊥ = (UE)(V_A^⊥).   (71)

Claim 6

W|φ⟩ = |φ⟩   (72)

for all |φ⟩ ∈ V_busy^⊥.

proof: Let |φ⟩ ∈ V_busy^⊥ = V_A^⊥ ∩ V_B^⊥. Hence |φ⟩ ∈ V_A^⊥ and |φ⟩ = (UE)|θ⟩ for some |θ⟩ ∈ V_A^⊥. Then

U(1 − 2π̌)U†(1 − 2π̂)|φ⟩ = (UE)(1 − 2π̂)(EU†)(1 − 2π̂)|φ⟩   (73)
= (UE)(1 − 2π̂)(EU†)(UE)|θ⟩   (74)
= (UE)(1 − 2π̂)|θ⟩   (75)
= |φ⟩.   (76)

QED

Comparison with Ref.[4]

For Ref.[4], M_1 = M_2 = M, and π() is a standard detailed balance for M instead of a detailed balance for the pair (M_1, M_2). For Ref.[4], M_hyb = M_sym, m_j = m_j*, U_1 = Ǔ, U_2 = Û, U = U_2† U_1 = Û† Ǔ, and U = EU†E. When U = EU†E as in Ref.[4], Eq.(44b) and Eq.(44c) are essentially identical, whereas in the M_1 ≠ M_2 case, it's less obvious that these two equations are true simultaneously.

5 The Sampling Algorithm

In this section, we will describe an algorithm for doing Gibbs sampling on a quantum computer, utilizing the Szegedy operator W that we have so painstakingly discussed in previous sections.

We begin by choosing some x_0 ∈ S_x for which P(x = x_0) ≠ 0. Now define

|x_0⟩ = |0⟩^{⊗N_B} ⊗ |x = x_0⟩.   (77)

Note that |x_0⟩ ∈ V_busy and

⟨ψ_0|x_0⟩ = ⟨√π|x = x_0⟩ = √π(x_0).   (78)

√π(x_0) = √P(x_0) can be easily evaluated at a single point x = x_0. Our quantum Gibbs algorithm consists of performing the original Grover algorithm with beginning state |x_0⟩ and target state |ψ_0⟩.
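The Grover rotation just described can be illustrated with a toy calculation: two reflections, one about the beginning state and one about the target state, rotate |x_0⟩ into (approximately) |ψ_0⟩ in O(1/√π(x_0)) steps. The vectors below are generic stand-ins with the right overlap √π(x_0), not states of an actual Bayesian net.

```python
import numpy as np

rng = np.random.default_rng(4)
dim, p0 = 64, 1.0 / 64          # generic case: pi(x0) = O(1/N_S)

# Two real unit vectors with overlap sqrt(p0) stand in for |x0> and |psi0>.
x0 = np.eye(dim)[0]
r = rng.random(dim); r -= (r @ x0) * x0; r /= np.linalg.norm(r)
psi0 = np.sqrt(p0) * x0 + np.sqrt(1 - p0) * r

R_beg = np.eye(dim) - 2 * np.outer(x0, x0)       # reflection about |x0>
R_tar = np.eye(dim) - 2 * np.outer(psi0, psi0)   # reflection about |psi0>
G = R_beg @ R_tar       # rotates by 2*theta per step, sin(theta) = sqrt(p0)

theta = np.arcsin(np.sqrt(p0))
L = int(round(np.pi / (4 * theta)))              # L = O(1/sqrt(pi(x0)))

state = x0.copy()
for _ in range(L):
    state = G @ state
fidelity = abs(state @ psi0)                     # = sin((2L+1) theta)
print(L, fidelity)
```

After L ≈ π/(4θ) iterations the overlap with the target is sin((2L+1)θ) ≈ 1, the standard amplitude amplification behavior.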
Define the following 2 reflection operators:

R_beg = 1 − 2|x_0⟩⟨x_0|,   (79)

and

R_tar = 1 − 2|ψ_0⟩⟨ψ_0|.   (80)

R_beg R_tar is a rotation by an angle O(√π(x_0)) in the space span{|ψ_0⟩, |x_0⟩} ⊂ V_busy. Let

L = O(1/√π(x_0)).   (81)

If √π(x_0) = O(1/√N_S), then L iterations of R_beg R_tar will take the beginning state to the target state. To implement this use of Grover's algorithm, we need to compile (with polynomial efficiency) the operator R_beg R_tar. R_beg is easy to compile; it's just a single multiply-controlled phase. Next, we will explain how to compile R_tar.

(How should x_0 be chosen? Perhaps some symmetry of the physical situation being modeled by the Bayesian network x will suggest some x_0 value that has non-zero probability. Alternatively, one can proceed as follows. For definiteness, consider a Bayesian net x = (x_1, x_2, x_3) with 3 nodes. Suppose P(x_1, x_2, x_3) = P(x_3|x_1, x_2) P(x_2|x_1) P(x_1), and the functions P_{x_3|x_1,x_2}, P_{x_2|x_1} and P_{x_1} are known. Choose y_1 ∈ S_{x_1} such that P_{x_1}(y_1) ≠ 0. Then choose y_2 ∈ S_{x_2} such that P_{x_2|x_1}(y_2|y_1) ≠ 0. Finally, choose y_3 ∈ S_{x_3} such that P_{x_3|x_1,x_2}(y_3|y_1, y_2) ≠ 0. Set x_0 = (y_1, y_2, y_3). We will discuss in a future paper what to do if √π(x_0) is much larger than O(1/√N_S).)

Figure 5: Definition of operator V_β for Szegedy operator W.

Fig.5, which is identical to Fig.18 in Ref.[4], defines an operator V_β in terms of multiple (modified) phase estimation steps. The V_β of Ref.[4] depends on a parameter β (inverse temperature) because the operator M in that paper depends on this parameter. The V_β of the present paper does not depend on β (because the Bayesian net x doesn't), so we will drop the β subscript from it henceforth, and refer to it simply as V. V does not depend on β, but it still depends on the positive integers a and c. (In the language of Ref.[4], a is the number of probe bits for each PE (Phase Estimation) step, and c is the number of PE steps.)
Note that operator W is applied 2^a c times by V.

Let |0_ac⟩ = |0⟩^{⊗(ac)}, J = {0, ±1, ±2, ..., ±(N_S − 1)}, and J' = J − {0}. (For j > 0, we write |ψ_{±j}⟩ for the eigenvectors |ψ_j^±⟩ of Claim 5.) According to Lemma 2 of Ref.[3], for any ε > 0, if we adjust the integers a and c so that

2^a ≈ 1/Δ = O(1/√δ),   (82)

and

c ≈ log(1/√ε)   (83)

(recall that δ = 1 − |m_1| is the distance between the two largest eigenvalue magnitudes of M_hyb), then V acts on the space |0_ac⟩ ⊗ V_busy as follows:

V = |0_ac⟩⟨0_ac| ⊗ |ψ_0⟩⟨ψ_0| + Σ_{j∈J'} |χ_j⟩⟨0_ac| ⊗ |ψ_j⟩⟨ψ_j| + O(√ε),   (84)

where the |χ_j⟩ are states of ac qubits such that ⟨0_ac|χ_j⟩ = 0. In Eq.(84) and for the remainder of this section, the first (top) factor represents the ac ancilla qubits shown in Fig.5, and the second (bottom) factor represents the 2N_B qubits on which W operates.

Now define

Q = (1 − 2|0_ac⟩⟨0_ac|) ⊗ 1,   (85)

and

R̃_tar = V† Q V.   (86)

It follows that for any |ψ⟩ ∈ V_busy,

R̃_tar |0_ac⟩|ψ⟩ = [ 1 − 2 V† (|0_ac⟩⟨0_ac| ⊗ 1) V ] |0_ac⟩|ψ⟩   (87)
= [ 1 − 2 |0_ac⟩⟨0_ac| ⊗ |ψ_0⟩⟨ψ_0| ] |0_ac⟩|ψ⟩ + O(√ε)   (88)
= |0_ac⟩|ψ⟩ − 2 |0_ac⟩|ψ_0⟩⟨ψ_0|ψ⟩ + O(√ε)   (89)
= |0_ac⟩ R_tar |ψ⟩ + O(√ε).   (90)

Eq.(90) is the essence of Corollary 2 in Ref.[3]. It means that R_tar acting on V_busy can be approximated by R̃_tar acting on |0_ac⟩ ⊗ V_busy. Since we already know how to compile R̃_tar, we have accomplished our goal of compiling R_tar, at least approximately.

Next, we will estimate the error of our quantum Gibbs algorithm. Suppose π̃() is our estimate of π(). Note that for any x ∈ S_x,

|π(x) − π̃(x)| = |(√π(x) − √π̃(x))(√π(x) + √π̃(x))|   (91)
≤ 2 |√π(x) − √π̃(x)|.   (92)

Suppose ε_0 > 0 satisfies

max_x |√π(x) − √π̃(x)| ≤ ε_0.   (93)

Then, since we apply the R_beg R_tar operator a total of L times, and each time we can incur an error of O(√ε),

ε_0 ≈ L√ε.   (94)

If we define one step as one application of W, then the total number of steps for the whole algorithm is O(L 2^a c) = O((L/√δ) log(L/ε_0)). Thus, our algorithm will yield a sample of the classical Bayesian net x with precision O(ε_0), in O((L/√δ) log(L/ε_0)) steps. Achieving the same precision with a classical Gibbs sampling algorithm would require O(1/δ) steps.
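To make the step count concrete, here is a back-of-the-envelope evaluation of Eqs.(81)-(83) and (94). The parameter values N_B, δ and ε_0 below are made up purely for illustration, and the constants hidden inside the O() symbols are taken to be 1.

```python
import math

# Hypothetical parameters: N_B-qubit state space, eigenvalue gap delta,
# overall target precision eps0, and pi(x0) ~ 1/N_S.
N_B, delta, eps0 = 10, 1e-4, 1e-2
N_S = 2 ** N_B

L = math.ceil(1.0 / math.sqrt(1.0 / N_S))          # Eq.(81): O(1/sqrt(pi(x0)))
eps = (eps0 / L) ** 2                              # per-reflection error, Eq.(94)
a = math.ceil(math.log2(1.0 / math.sqrt(delta)))   # Eq.(82): 2^a ~ 1/sqrt(delta)
c = math.ceil(math.log(1.0 / math.sqrt(eps)))      # Eq.(83)

W_per_V = (2 ** a) * c          # W applications inside one V
total_W = L * 2 * W_per_V       # each compiled R~tar uses one V and one V-dagger
classical = math.ceil(1.0 / delta)
print(L, a, c, total_W, classical)
```

The point of the comparison is the scaling in δ: the quantum cost grows like 1/√δ per Grover iteration, versus 1/δ classically.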
The Szegedy operator W of this paper can also be used to do quantum simulated annealing and Metropolis-Hastings sampling, if the marginals P(x_i^{t+1} | x_{{i}^c}^t) can be calculated for each i from the transition matrix P(x^{t+1} | x^t). (In the case of simulated annealing, P(x^{t+1} | x^t) is different for each β_i of the annealing schedule.)

References

[1] Mario Szegedy, "Spectra of Quantized Walks and a √δε rule"