A Birthday Paradox for Markov chains with an optimal bound for collision in the Pollard Rho algorithm for discrete logarithm
The Annals of Applied Probability
© Institute of Mathematical Statistics, 2010
A BIRTHDAY PARADOX FOR MARKOV CHAINS WITH AN OPTIMAL BOUND FOR COLLISION IN THE POLLARD RHO ALGORITHM FOR DISCRETE LOGARITHM
By Jeong Han Kim, Ravi Montenegro, Yuval Peres and Prasad Tetali
Yonsei University, University of Massachusetts Lowell, Microsoft Research and Georgia Institute of Technology
We show a Birthday Paradox for self-intersections of Markov chains with uniform stationary distribution. As an application, we analyze Pollard's Rho algorithm for finding the discrete logarithm in a cyclic group G and find that if the partition in the algorithm is given by a random oracle, then with high probability a collision occurs in Θ(√|G|) steps. Moreover, for the parallelized distinguished points algorithm on J processors we find that Θ(√|G|/J) steps suffices. These are the first proofs of the correct order bounds which do not assume that every step of the algorithm produces an i.i.d. sample from G.
1. Introduction.
The Birthday Paradox states that if C√N items are sampled uniformly at random with replacement from a set of N items, then, for large C, with high probability some items will be chosen twice. This can be interpreted as a statement that with high probability, a Markov chain on the complete graph K_N with transitions P(i,j) = 1/N will intersect its past in C√N steps; we refer to such a self-intersection as a collision and say the "collision time" is O(√N). Miller and Venkatesan generalized this in [8] by showing that for a general Markov chain the collision time is bounded by O(√N · T_s(1/2)), where

T_s(ε) = min{n : ∀u, v ∈ V, P^n(u,v) ≥ (1 − ε)π(v)}

measures the time required for the n-step distribution to assign every state a suitable multiple of its stationary probability. Kim, Montenegro and

Received April 2008; revised June 2009.
Supported by Korea Science and Engineering Foundation (KOSEF) Grant funded by the Korean government (MOST) R16-2007-075-01000-0.
Supported in part by NSF Grant DMS-06-05166.
Supported in part by NSF Grants DMS-04-01239, DMS-07-01043.
AMS 2000 subject classifications.
Primary 60J10; secondary 68Q25, 94A60.
Key words and phrases.
Pollard’s Rho, discrete logarithm, Markov chain, mixing time.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Probability, 2010, Vol. 20, No. 2, 495–521. This reprint differs from the original in pagination and typographic detail.
Tetali [6] further improved the bound on collision time to O(√(N T_s(1/2))), with a collision occurring within the first N steps with probability at least 1 − e^{−N/(32 T_s(1/2))}. A motivating example is the walk used by the Pollard Rho algorithm for discrete logarithm on a cyclic group G of prime order N = |G| ≠ 2. For this walk, T_s(1/2) = Ω(log N), and so the results of [6, 8] are insufficient to show the widely believed Θ(√N) collision time for this walk. In this paper we improve upon these bounds and show that if a finite ergodic Markov chain has uniform stationary distribution over N states, then O(√N) steps suffice for a collision to occur as long as the relative-pointwise distance (L∞ distance of the densities of the current and the stationary distribution) drops steadily early in the random walk; it turns out that the precise mixing time is largely, although not entirely, unimportant. See Theorem 3.2 for a precise statement. This is then applied to the Rho walk to give the first proof of collision in Θ(√N) steps, matching Shoup's lower bound [16] on the time required for any probabilistic generic algorithm to solve this problem, and to van Oorschot and Wiener's [19] parallel version of the algorithm on J processors to prove collision in Θ(√N/J) steps.

We note here that it is also well known (see, e.g., [1], Section 4.1) that a random walk of length L contains roughly Lλ samples from the stationary measure (of the Markov chain), where λ is the spectral gap of the chain. This yields another estimate on collision time for a Markov chain which is also of a multiplicative nature (namely, √N times a function of the mixing time) as in [6, 8]. A main point of the present work is to establish sufficient criteria under which the collision time has an additive bound: C√N plus an estimate on the mixing time. While the Rho algorithm provided the main motivation for the present work, we find the more general Birthday Paradox result to be of independent interest and, as such, expect it to have other applications in the future.

A bit of detail about the Pollard Rho algorithm is in order.
The classical discrete logarithm problem on a cyclic group deals with computing the exponent, given a generator of the group; more precisely, given a generator g of a cyclic group G and an element h = g^x, one would like to compute x efficiently. Due to its presumed computational difficulty, the problem figures prominently in various cryptosystems, including the Diffie–Hellman key exchange, El Gamal system and elliptic curve cryptosystems. About 30 years ago, J. M. Pollard suggested algorithms to help solve both factoring large integers [13] and the discrete logarithm problem [14]. While the algorithms are of much interest in computational number theory and cryptography, there has been little work on rigorous analysis. We refer the reader to [8] and other existing literature (e.g., [3, 18]) for further cryptographic and number-theoretical motivation for the discrete logarithm problem.

A standard variant of the classical Pollard Rho algorithm for finding discrete logarithms can be described using a Markov chain on a cyclic group G. While there had been no rigorous proof of rapid mixing of this Markov chain of order O(log^c |G|) until recently, Miller and Venkatesan [8] gave a proof of mixing of order O(log³|G|) steps and collision time of O(√|G| log³|G|), and Kim, Montenegro and Tetali [6] showed mixing of order O(log|G| log log|G|) and collision time of O(√(|G| log|G| log log|G|)). In this paper we give the first proof of the correct Θ(√|G|) collision time. By recent results of Miller and Venkatesan [9] this collision will be nondegenerate and will solve the discrete logarithm problem with probability 1 − o(1) for almost every prime order |G|, if the start point of the algorithm is chosen at random or if there is no collision in the first O(log|G| log log|G|) steps.

The paper proceeds as follows.
Section 2 contains some preliminaries, primarily an introduction to the Pollard Rho algorithm and a simple multiplicative bound on the collision time in terms of the mixing time. The more general Birthday Paradox for Markov chains with uniform stationary distribution is shown in Section 3. In Section 4 we bound the appropriate constants for the Rho walk and show the optimal collision time. We finish in Section 5 by proving similar results for the distinguished points method of parallelizing the algorithm.
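Before moving on, the classical √N law quoted at the start of the introduction is easy to check numerically. The following simulation is our own illustrative code (the function name and parameters are ours, not the paper's); it samples uniformly with replacement until the first repeat:

```python
import random

def birthday_collision_time(N, seed):
    """Draw uniformly from {0,...,N-1} with replacement until a repeat;
    return the number of draws made, including the repeated one."""
    rng = random.Random(seed)
    seen = set()
    while True:
        u = rng.randrange(N)
        if u in seen:
            return len(seen) + 1
        seen.add(u)

N = 10_000
avg = sum(birthday_collision_time(N, s) for s in range(500)) / 500
print(avg / N ** 0.5)  # concentrates near sqrt(pi/2) ~ 1.25, i.e. Theta(sqrt N)
```

The empirical constant √(π/2) is the well-known expected value for the uniform birthday problem; the point of the paper is that the same √N order persists for suitable Markov chains.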
2. Preliminaries.
Our intent in generalizing the Birthday Paradox was to bound the collision time of the Pollard Rho algorithm for discrete logarithm. As such, we briefly introduce the algorithm here. Throughout the analysis in the following sections, we assume that the size N = |G| of the cyclic group on which the random walk is performed is odd. Indeed, there is a standard reduction (see [15] for a very readable account and [12] for a classical reference) justifying the fact that it suffices to study the discrete logarithm problem on cyclic groups of prime order.

Suppose g is a generator of G, that is, G = {g^i}_{i=0}^{N−1}. Given h ∈ G, the discrete logarithm problem asks us to find x such that g^x = h. Pollard suggested an algorithm on Z_N^× based on a random walk and the Birthday Paradox. A common extension of his idea to groups of prime order is to start with a partition of G into sets S₁, S₂, S₃ of roughly equal sizes, and define an iterating function F : G → G by F(y) = gy if y ∈ S₁, F(y) = hy = g^x y if y ∈ S₂ and F(y) = y² if y ∈ S₃. Then consider the walk y_{i+1} = F(y_i). If this walk passes through the same state twice, say g^{a+xb} = g^{α+xβ}, then g^{a−α} = g^{x(β−b)}, and so a − α ≡ x(β−b) mod N and x ≡ (a−α)(β−b)^{−1} mod N, which determines x as long as (β−b, N) = 1 (the nondegenerate case). Hence, if we define a collision to be the event that the walk passes over the same group element twice, then the first time there is a collision it might be possible to determine the discrete logarithm.

To estimate the running time until a collision, one heuristic is to treat F as if it outputs uniformly random group elements. By the Birthday Paradox, if O(√|G|) group elements are chosen uniformly at random, then there is a high probability that two of these are the same. Teske [17] has given experimental evidence that the time until a collision is slower than what would be expected by an independent uniform random process.
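The iterating function F and the recovery of x from a collision can be sketched in a few lines. The code below is our own toy implementation (the parameters p = 2039, q = 1019, the generator g = 4 and the function names are our illustrative choices, not from the paper); it realizes the random-oracle partition lazily, assigning each group element to S₁, S₂ or S₃ the first time it is seen:

```python
import random

def pollard_rho_dlog(g, h, q, p, seed=1):
    """Pollard Rho for discrete log in the order-q subgroup of Z_p^*.

    The three-way partition is a random oracle, implemented lazily: each
    group element is assigned to a part independently and uniformly the
    first time it is visited.  Returns x with g^x = h, or None if the
    first collision is degenerate."""
    rng = random.Random(seed)
    part = {}   # lazy random partition of the group
    seen = {}   # y -> (a, b) with y = g^a * h^b mod p
    a, b = rng.randrange(q), rng.randrange(q)      # random start point
    y = pow(g, a, p) * pow(h, b, p) % p
    while y not in seen:
        seen[y] = (a, b)
        s = part.setdefault(y, rng.randrange(3))
        if s == 0:                                 # y in S1: multiply by g
            y, a = y * g % p, (a + 1) % q
        elif s == 1:                               # y in S2: multiply by h
            y, b = y * h % p, (b + 1) % q
        else:                                      # y in S3: square
            y, a, b = y * y % p, 2 * a % q, 2 * b % q
    alpha, beta = seen[y]
    # g^a h^b = g^alpha h^beta  =>  x ~ (a - alpha)(beta - b)^(-1) mod q
    d = (beta - b) % q
    if d == 0:
        return None                                # degenerate collision
    return (a - alpha) * pow(d, -1, q) % q

# Toy instance: p = 2q + 1 with p, q prime; g = 4 is a quadratic residue,
# so its order divides q, and since q is prime and g != 1 its order is q.
p, q, x = 2039, 1019, 777
h = pow(4, x, p)
print(pollard_rho_dlog(4, h, q, p))  # recovers 777 when nondegenerate
```

Requires Python 3.8+ for the modular inverse `pow(d, -1, q)`. Note that this sketch stores every visited state; the space-efficient cycle-finding variants (Floyd, distinguished points) avoid exactly that, as Remark 2.1 points out.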
We analyze instead the actual Markov chain in which it is assumed only that each y ∈ G is assigned independently and at random to a partition S₁, S₂ or S₃. In this case, although the iterating function F described earlier is deterministic, because the partition of G was randomly chosen, the walk is equivalent to a Markov chain (i.e., a random walk), at least until the walk visits a previously visited state and a collision occurs. The problem is then one of considering a walk on the exponent of g, that is, a walk P on the cycle Z_N with transitions P(u, u+1) = P(u, u+x) = P(u, 2u) = 1/3.

Remark 2.1.
By assuming each y ∈ G is assigned independently and at random to a partition, we have eliminated one of the key features of the Pollard Rho algorithm, space efficiency. However, if the partitions are given by a hash function f : G → {1, 2, 3} which is sufficiently pseudorandom, then we might expect behavior similar to the model with random partitions.

Remark 2.2.
While we are studying the time until a collision occurs, there is no guarantee that the first collision will be nondegenerate. If the first collision is degenerate then so will be all collisions, as the algorithm becomes deterministic after the first collision.

As mentioned in the introduction, we first recall a simple multiplicative bound on collision time from [6]. The following proposition relates T_s(1/2) to the collision time of a walk P with uniform distribution on G as the stationary distribution.

Proposition 2.3.
With the above definitions, a collision occurs after

T_s(1/2) + 2√(c|G| T_s(1/2))

steps with probability at least 1 − e^{−c}, for any c > 0.

Proof. Let S denote the first ⌈√(c|G| T_s(1/2))⌉ states visited by the walk. If two of these states are the same then a collision has occurred, so assume all states are distinct. Even if we only check for collisions every T_s(1/2) steps, the chance that no collision occurs in the next t T_s(1/2) steps (so consider t semi-random states) is then at most

(1 − |S|/(2|G|))^t ≤ (1 − (1/2)√(c T_s(1/2)/|G|))^t ≤ exp(−(t/2)√(c T_s(1/2)/|G|)).

When t = ⌈2√(c|G|/T_s(1/2))⌉, this is at most e^{−c}, as desired, and so at most

⌈√(c|G| T_s(1/2))⌉ − 1 + ⌈2√(c|G|/T_s(1/2))⌉ T_s(1/2)

steps suffice for a collision to occur with probability at least 1 − e^{−c}. □

Obtaining a more refined additive bound on collision time will be the focus of the next section. While the proof can be seen as another application of the well-known second moment method, it turns out that bounding the second moment of the number of collisions before the mixing time is somewhat subtle. To handle this, we use an idea from [7], who in turn credit their approach to [5].
3. Collision time.
Consider a finite ergodic Markov chain P with uniform stationary distribution U (i.e., doubly stochastic), state space Ω of cardinality N = |Ω|, and let X₀, X₁, ... denote a particular instance of the walk. In this section we determine the number of steps of the walk required to have a high probability that a "collision" has occurred, that is, a self-intersection X_i = X_j for some i ≠ j.

A key notion when studying Markov chains is the mixing time, or the time required until the probability of being at each state is suitably close to its stationary probability.

Definition 3.1.
The mixing time τ(ε) of a Markov chain P with stationary distribution U is given by

τ(ε) = min{T : ∀u, v ∈ Ω, (1 − ε)U(v) ≤ P^T(u,v) ≤ (1 + ε)U(v)}.

Now some notation. Fix some T ≥ 0 and β > 0. Let the indicator function 1_{X_i = X_j} equal one if X_i = X_j, and zero otherwise. Define

S = Σ_{i=0}^{β√N} Σ_{j=i+2T}^{β√N+2T} 1_{X_i = X_j}

to be the number of times the walk intersects itself in β√N + 2T steps, where i and j are at least 2T steps apart. Also, for u, v ∈ Ω, let

G_T(u,v) = Σ_{i=0}^{T} P^i(u,v)

be the expected number of times a walk beginning at u hits state v in T steps. Finally, let

A_T = max_u Σ_v G_T(u,v)²   and   A*_T = max_u Σ_v G_T(v,u)².

To see the connection between these and the collision time, observe that

Σ_v G_T(u,v)² = Σ_v (Σ_{i=0}^{T} P^i(u,v))(Σ_{j=0}^{T} P^j(u,v)) = Σ_{i=0}^{T} Σ_{j=0}^{T} Σ_v P^i(u,v) P^j(u,v) = Σ_{i=0}^{T} Σ_{j=0}^{T} Pr(X_i = Y_j) = Σ_{i=0}^{T} Σ_{j=0}^{T} E(1_{X_i = Y_j}) = E[Σ_{i,j=0}^{T} 1_{X_i = Y_j}],

where {X_i}, {Y_j} are i.i.d. copies of the chain, both having started at u at time 0, and E denotes expectation. Hence A_T is the maximal expected number of collisions of two T-step i.i.d. walks of P starting at the same state u. Likewise, A*_T is the same for the reversal P*, where P*(u,v) = P(v,u) (recall that the stationary distribution was assumed to be uniform).

The main result of this section is the following.

Theorem 3.2 (Birthday Paradox for Markov chains).
Consider a finite ergodic Markov chain with uniform stationary distribution on a state space of size N. Let T be such that m/N ≤ P^T(u,v) ≤ M/N for some m ≤ 1 ≤ M and every pair of states u, v. After

2c (M/m)² (√(2(N/M) · 4 max{A_T, A*_T}) + 2T)

steps a collision occurs with probability at least 1 − e^{−c}, for any c ≥ 1.

At the end of this section a slight strengthening of Theorem 3.2 is shown, at the cost of a somewhat less intuitive bound. In Example 3.5, near the end of this section, we present an example to illustrate the need for the pre-mixing term A_T in Theorem 3.2. In contrast, very recently Nazarov and Peres [10] proved a general bound for the birthday problem on any reversible Markov chain on N states: suppose that the ratio of the stationary measures of any two states is at most A; then they show that for any starting state, the expected time until the chain visits a previously visited state is at most C√N(1 + log A) for some universal constant C. In particular, this implies an expected collision time of O(√N) for the simple random walk on an undirected graph on N vertices, and so the pre-mixing term is not necessary when considering reversible walks.

Observe that if A_T, A*_T, m, M = Θ(1) and T = O(√N), then the collision time is O(√N) as in the standard Birthday Paradox. By Lemma 3.3, for this to occur it suffices that P^T be sufficiently close to uniform after T = o(√N) steps and that P^j(u,v) = o(T^{−2}) + d^j for all u, v, for j ≤ T and some d < 1. To bound A_T and A*_T it suffices to show that the maximum probability of being at a vertex decreases quickly.

Lemma 3.3.
If a finite ergodic Markov chain has uniform stationary distribution, then

A_T, A*_T ≤ 2 Σ_{j=0}^{T} (j+1) max_{u,v} P^j(u,v).

Proof. If u is such that equality occurs in the definition of A_T, then

A_T = Σ_v G_T(u,v)² = Σ_{i=0}^{T} Σ_{j=0}^{T} Σ_v P^i(u,v) P^j(u,v) ≤ 2 Σ_{j=0}^{T} Σ_{i=0}^{j} max_y P^j(u,y) Σ_v P^i(u,v) ≤ 2 Σ_{j=0}^{T} (j+1) max_y P^j(u,y).

The quantity A*_T plays the role of A_T for the reversed chain, and so the same bound holds for A*_T but with max_{u,v} (P*)^j(u,v) = max_{u,v} P^j(v,u) = max_{u,v} P^j(u,v). □
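The quantities in Lemma 3.3 are all finite sums of matrix powers, so the bound can be verified exactly on small examples. The following sketch is our own (the chain, a lazy walk on the cycle Z₅, and the function names are illustrative choices); it computes A_T and the lemma's bound with exact rational arithmetic:

```python
from fractions import Fraction

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def collision_numbers(P, T):
    """Return (A_T, bound) for a doubly stochastic chain P, where
    A_T = max_u sum_v G_T(u,v)^2 with G_T = sum_{i<=T} P^i, and
    bound = 2 * sum_{j<=T} (j+1) * max_{u,v} P^j(u,v) as in Lemma 3.3."""
    n = len(P)
    Pj = [[Fraction(i == j) for j in range(n)] for i in range(n)]  # P^0
    G = [[Fraction(0)] * n for _ in range(n)]
    bound = Fraction(0)
    for j in range(T + 1):
        for u in range(n):
            for v in range(n):
                G[u][v] += Pj[u][v]
        bound += 2 * (j + 1) * max(max(row) for row in Pj)
        Pj = mat_mul(Pj, P)                        # advance to P^(j+1)
    A_T = max(sum(x * x for x in row) for row in G)
    return A_T, bound

# Lazy walk on the cycle Z_5: stay with prob. 1/2, move +-1 with prob. 1/4.
n = 5
P = [[Fraction(0)] * n for _ in range(n)]
for u in range(n):
    P[u][u] = Fraction(1, 2)
    P[u][(u + 1) % n] += Fraction(1, 4)
    P[u][(u - 1) % n] += Fraction(1, 4)

A_T, bound = collision_numbers(P, T=6)
print(A_T <= bound)  # True: the lemma's bound holds
```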
In particular, suppose P^j(u,v) ≤ c + d^j for every u, v ∈ Ω and some c, d ∈ [0, 1). Then

2 Σ_{j=0}^{T} (j+1)(c + d^j) = c(T+1)(T+2) + 2 (1 − d^{T+1} − (T+1)d^{T+1}(1−d)) / (1−d)² ≤ (1 + o(1)) cT² + 2/(1−d)²,

and so if P^j(u,v) ≤ o(T^{−2}) + d^j for every u, v ∈ Ω, then A_T, A*_T ≤ o(1) + 2(1−d)^{−2}.

The proof of Theorem 3.2 relies largely on the following inequality, which shows that the expected number of self-intersections is large with low variance:

Lemma 3.4.
Under the conditions of Theorem 3.2,

E[S] ≥ (m/N) (β√N+2 choose 2),
E[S²] ≤ (M/N)² (β√N+2 choose 2)² (1 + 8 max{A_T, A*_T}/(Mβ²)).

Proof of Theorem 3.2.
Recall the standard second moment bound: using Cauchy–Schwarz, we have that E[S] = E[S 1_{S>0}] ≤ E[S²]^{1/2} E[1_{S>0}]^{1/2}, and hence Pr[S > 0] ≥ E[S]²/E[S²]. If β = 2√(2 max{A_T, A*_T}/M), then by Lemma 3.4,

Pr[S > 0] ≥ (m²/M²) / (1 + 8 max{A_T, A*_T}/(Mβ²)) = m²/2M²,

independent of the starting point. If no collision occurs in β√N + 2T steps, then S = 0 as well, and so Pr[no collision] ≤ Pr[S = 0] ≤ 1 − m²/2M². Hence, in k(β√N + 2T) steps,

Pr[no collision] ≤ (1 − m²/2M²)^k ≤ e^{−k m²/2M²}. (3.1)

Taking k = 2cM²/m² completes the proof. □

Proof of Lemma 3.4.
We will repeatedly use the relation that there are (β√N+2 choose 2) choices for i, j appearing in the summation for S, that is, 0 ≤ i and i + 2T ≤ j ≤ β√N + 2T.

Now to the proof. The expectation E[S] satisfies

E[S] = E[Σ_{i=0}^{β√N} Σ_{j=i+2T}^{β√N+2T} 1_{X_i = X_j}] (3.2) = Σ_{i=0}^{β√N} Σ_{j=i+2T}^{β√N+2T} E[1_{X_i = X_j}] ≥ (β√N+2 choose 2) (m/N),

because if j ≥ i + T, then

Pr(X_j = X_i) = Σ_u Pr(X_i = u) P^{j−i}(u,u) ≥ Σ_u Pr(X_i = u) (m/N) = m/N. (3.3)

Similarly, Pr(X_j = X_i) ≤ M/N when j ≥ i + T.

Now for E[S²]. Note that

E[S²] = E[(Σ_{i=0}^{β√N} Σ_{j=i+2T}^{β√N+2T} 1_{X_i = X_j})(Σ_{k=0}^{β√N} Σ_{l=k+2T}^{β√N+2T} 1_{X_k = X_l})] = Σ_{i=0}^{β√N} Σ_{k=0}^{β√N} Σ_{j=i+2T}^{β√N+2T} Σ_{l=k+2T}^{β√N+2T} Pr(X_i = X_j, X_k = X_l).

To evaluate this quadruple sum we break it into 3 cases.
Case 1.
Suppose |j − l| ≥ T. Without loss, assume l ≥ j so that, in particular, l ≥ max{i,j,k} + T. Then

Pr(X_i = X_j, X_k = X_l) = Pr(X_i = X_j) Pr(X_l = X_k | X_i = X_j) (3.4) ≤ Pr(X_i = X_j) max_{u,v} Pr(X_l = v | X_{max{i,j,k}} = u) ≤ Pr(X_i = X_j) (M/N) ≤ (M/N)².

The first inequality holds because {X_t} is a Markov chain, and so given X_i, X_j, X_k the walk at any time t ≥ max{i,j,k} depends only on the state X_{max{i,j,k}}.

Case 2.
Suppose |i − k| ≥ T and |j − l| < T. Without loss, assume i ≤ k. If j ≤ l, then

Pr(X_i = X_j, X_k = X_l) = Σ_{u,v} Pr(X_i = u) P^{k−i}(u,v) P^{j−k}(v,u) P^{l−j}(u,v) (3.5) ≤ Σ_u Pr(X_i = u) (M/N)(M/N) Σ_v P^{l−j}(u,v) = (M/N)²,

because k ≥ i + T, j ≥ k + T and Σ_v P^t(u,v) = 1 for any t, because P and hence also P^t are stochastic matrices. If, instead, l < j, then essentially the same argument works, but with Σ_v P^t(v,u) = 1, because P and hence also P^t are doubly stochastic.

Case 3.
Finally, consider those terms with |j − l| < T and |i − k| < T. Without loss, assume i ≤ k. If l ≤ j, then

Pr(X_i = X_j, X_k = X_l) = Σ_{u,v} Pr(X_i = u) P^{k−i}(u,v) P^{l−k}(v,v) P^{j−l}(v,u) (3.6) ≤ Σ_u Pr(X_i = u) Σ_v P^{k−i}(u,v) (M/N) P^{j−l}(v,u).

The sum over elements with i ≤ k < i + T and l ≤ j < l + T is upper bounded as follows:

Σ_{i=0}^{β√N} Σ_{k=i}^{i+T} Σ_{l=k+2T}^{β√N+2T} Σ_{j=l}^{l+T} Pr(X_i = X_j, X_k = X_l)
≤ (M/N) Σ_{i=0}^{β√N} Σ_{l=i+2T}^{β√N+2T} max_u Σ_v Σ_{k∈[i,i+T)} P^{k−i}(u,v) Σ_{j∈[l,l+T)} P^{j−l}(v,u)
≤ (M/N) Σ_{i=0}^{β√N} Σ_{l=i+2T}^{β√N+2T} max_u Σ_v G_T(u,v) G_T(v,u) (3.7)
≤ (M/N) Σ_{i=0}^{β√N} Σ_{l=i+2T}^{β√N+2T} max_u √(Σ_v G_T(u,v)²) √(Σ_v G_T(v,u)²)
≤ (M/N) (β√N+2 choose 2) √(A_T A*_T).

The case when j < l gives the same bound, but with the observation that j ≥ k + T and with A_T instead of √(A_T A*_T).

Putting together these various cases we get that

E[S²] ≤ (β√N+2 choose 2)² (M/N)² + 2 (β√N+2 choose 2)(M/N) A_T + 2 (β√N+2 choose 2)(M/N) √(A_T A*_T).

The (β√N+2 choose 2)² term is the total number of values of i, j, k, l appearing in the sum for E[S²], and hence also an upper bound on the number of values in Cases 1 and 2. Along with the relation (β√N+2 choose 2) ≥ β²N/2, this simplifies to complete the proof. □

As promised earlier, we now present an example that illustrates the need for the pre-mixing term A_T in Theorem 3.2.

Example 3.5.
Consider the random walk on Z_N which transitions from u → u + 1 with probability 1 − 1/√N, and with probability 1/√N transitions u → v for a uniformly random choice of v.

Heuristically the walk proceeds as u → u + 1 for ≈ √N steps, then randomizes, and then proceeds as u → u + 1 for another √N steps. This effectively splits the state space into √N blocks of size about √N each, so by the standard Birthday Paradox it should require about N^{1/4} of these randomizations before a collision will occur; in short, about N^{3/4} steps in total.

To see the need for the pre-mixing term, observe that T_s ≈ √N log 2, while if T = T_∞ ≈ √N log(2(N−1)) then m ≥ 1/2 and M ≤ 3/2. Whichever of T_s or T_∞ is considered, it will be insufficient to take O(T + √N) steps. However, the number A_T of collisions between two independent copies of this walk is about √N/2, since once a randomization step occurs then the two independent walks are unlikely to collide anytime soon. Our collision time bound says that O(N^{3/4}) steps will suffice, which is the correct bound.

A proper analysis shows that (1 − o(1)) N^{3/4}/√2 steps are necessary to have a collision with a probability of 1/2. Conversely, when T = √N log N then m = 1 − o(1), M = 1 + o(1) and A_T, A*_T ≤ (1 + o(1)) √N/2, so by equation (3.1), (2 + o(1)) N^{3/4} steps are sufficient to have a collision with a probability of at least 1/2. Our upper bound is thus off by at most a factor of 2√2 ≈ 2.83.
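The two time scales in this example are easy to see in simulation. The sketch below is our own illustrative code (names and the choice N = 10000 are ours); the measured expected collision time lands near N^{3/4}, well above √N, confirming that mixing-time information alone cannot give the right answer here:

```python
import random

def collision_time(N, seed):
    """One run of the walk from Example 3.5 on Z_N: step u -> u+1 with
    probability 1 - 1/sqrt(N), otherwise jump to a uniform random state.
    Returns the number of steps until a previously visited state is hit."""
    rng = random.Random(seed)
    p_jump = N ** -0.5
    u, seen, t = 0, {0}, 0
    while True:
        u = rng.randrange(N) if rng.random() < p_jump else (u + 1) % N
        t += 1
        if u in seen:
            return t
        seen.add(u)

N = 10_000  # sqrt(N) = 100, N^(3/4) = 1000
times = [collision_time(N, seed) for seed in range(200)]
print(sum(times) / len(times))  # on the order of N^(3/4), far above sqrt(N)
```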
Theorem 3.6 (Improved Birthday Paradox).
Consider a finite ergodic Markov chain with uniform stationary distribution on a state space of size N. Let T be such that m/N ≤ P^T(u,v) ≤ M/N for some m ≤ 1 ≤ M and every pair of states u, v. After

c (2√((1 + 3 Σ_{j=1}^{2T} j max_{u,v} P^j(u,v)) N/M) + 2T)

steps a collision occurs with probability at least 1 − (1 − m²/2M²)^c, independent of the starting state.

Proof.
We give only the steps that differ from before. First, in equation (3.7), note that the triple sum after max_u can be rewritten as

Σ_{α∈[0,T)} Σ_{β∈[0,T)} Σ_v P^α(u,v) P^β(v,u) = Σ_{α,β∈[0,T)} P^{α+β}(u,u) ≤ Σ_{γ=0}^{2T−2} (γ+1) P^γ(u,u).

The original quadruple sum then reduces to

(M/N) (β√N+2 choose 2) max_u Σ_{γ=0}^{2T−2} (γ+1) P^γ(u,u).

For the case when i < k and j < l, proceed similarly, then reduce, as in Lemma 3.3, to obtain the upper bound

(M/N) (β√N+2 choose 2) max_u Σ_{α=1}^{T−1} Σ_{β=1}^{T−1} Σ_v P^α(u,v) P^β(u,v) ≤ (M/N) (β√N+2 choose 2) Σ_{γ=1}^{T−1} (2γ−1) max_v P^γ(u,v).

Adding these two expressions gives an expression of at most

(M/N) (β√N+2 choose 2) (1 + 3 Σ_{γ=1}^{2T} γ max_v P^γ(u,v)).

The remaining two cases will add to the same bound, so effectively this substitutes the expression 2(1 + 3 max_u Σ_{γ=1}^{2T} γ max_v P^γ(u,v)) in place of 4 max{A_T, A*_T} in the original theorem. □

To simplify, note that if max_{u,v} P^j(u,v) ≤ c + d^j for c, d ∈ [0, 1), then

3 Σ_{j=1}^{2T} j(c + d^j) = 3cT(2T+1) + 3d(1 − (2T+1)d^{2T} + 2T d^{2T+1}) / (1−d)² (3.8) ≤ (1 + o(1)) 6cT² + 3d/(1−d)².
4. Convergence of the Rho walk.
Let us now turn our attention to the Pollard Rho walk for discrete logarithm. To apply the collision time result we will first show that max_{u,v∈Z_N} P^s(u,v) decreases quickly in s, so that Lemma 3.3 may be used. We then find T such that P^T(u,v) ≈ 1/N for every u, v ∈ Z_N. However, instead of studying the Rho walk directly, most of the work will instead involve a "block walk" in which only a certain subset of the states visited by the Rho walk are considered.

Definition 4.1. Let us refer to the three types of moves that the Pollard Rho random walk makes, namely (u, u+1), (u, u+x) and (u, 2u), as moves of Type 1, Type 2 and Type 3, respectively. In general, let the random walk be denoted by Y₀, Y₁, Y₂, ..., with Y_t indicating the position of the walk (modulo N) at time t ≥ 0. Let T₀ be the first time that the walk makes a move of Type 3. Let b₁ = Y_{T₁−1} − Y_{T₀} (i.e., the ground covered, modulo N, only using consecutive moves of Types 1 and 2). More generally, let T_i be the first time, since T_{i−1}, that a move of Type 3 happens, and set b_i = Y_{T_i−1} − Y_{T_{i−1}}. Then the block walk B is the walk

X_s = Y_{T_s} = 2^s Y_{T₀} + 2 Σ_{i=1}^{s} 2^{s−i} b_i.

By combining our Birthday Paradox for Markov chains with several lemmas to be shown in this section, we obtain the main result of the paper:

Theorem 4.2.
For every choice of starting state, the expected number of steps required for the Pollard Rho algorithm for discrete logarithm on a group G to have a collision is at most

(1 + o(1)) 12√19 √|G| < (1 + o(1)) 52.4 √|G|.

In order to prove this it is necessary to show that B^s(u,v) decreases quickly for the block walk.

Lemma 4.3. If s ≤ ⌊log₂ N⌋, then for every u, v ∈ Z_N the block walk satisfies B^s(u,v) ≤ (2/3)^s. If s > ⌊log₂ N⌋, then B^s(u,v) ≤ 3/(2N^{log₂3 − 1}) ≤ 3/(2√N).
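The block structure of Definition 4.1 is easy to exercise empirically. The sketch below is our own code (toy parameters N = 1019 and x = 77 are illustrative); it splits a simulated Rho exponent walk into blocks ending at each doubling and checks the facts used in this section: blocks have expected length E[T_i − T_{i−1}] = 3, and Pr(b_i = 0) ≥ 1/3, Pr(b_i = 1) ≥ 1/9, as used in the proof of Lemma 4.3:

```python
import random

def blocks(N, x, steps, seed=0):
    """Run the Rho exponent walk on Z_N (moves u+1, u+x, 2u, each with
    probability 1/3) and split it into blocks ending at each doubling.
    Returns a list of (b_i, block length) pairs."""
    rng = random.Random(seed)
    u, start, length = 0, 0, 0
    out = []
    for _ in range(steps):
        move = rng.randrange(3)
        length += 1
        if move == 0:
            u = (u + 1) % N
        elif move == 1:
            u = (u + x) % N
        else:                       # Type 3: doubling ends the block
            out.append(((u - start) % N, length))
            u = 2 * u % N
            start, length = u, 0
    return out

bs = blocks(N=1019, x=77, steps=300_000, seed=1)
print(sum(l for _, l in bs) / len(bs))       # ~3: E[T_i - T_{i-1}] = 3
print(sum(b == 0 for b, _ in bs) / len(bs))  # ~1/3: Pr(b_i = 0) >= 1/3
print(sum(b == 1 for b, _ in bs) / len(bs))  # ~1/9: Pr(b_i = 1) >= 1/9
```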
A bound on the asymptotic rate of convergence is also required:
Theorem 4.4. If s ≥ ⌈m log₂(m/ε)⌉, where m = ⌈log₂ N⌉, then for every u, v ∈ Z_N the block walk satisfies

(1 − ε)/N ≤ B^s(u,v) ≤ (1 + ε)/N.

This is all that is needed to prove the main result:
Proof of Theorem 4.2.
The proof will use Theorem 3.6 because this gives a somewhat sharper bound. Alternatively, Theorem 3.2 and Lemma 3.3 can be applied nearly identically to get the slightly weaker (1 + o(1)) 72 √|G|.

First consider steps of the block walk. Lemma 4.3 implies that

B^s(u,v) ≤ 3/(2√N) + (2/3)^s for s ≥ 1

and every u, v. Hence, by equation (3.8), if T = o(N^{1/4}), then 1 + 3 Σ_{j=1}^{2T} j max_{u,v} B^j(u,v) ≤ 19 + o(1). By Theorem 4.4, M ≤ 1 + ε and m ≥ 1 − ε after ⌈log₂ N⌉ log₂(⌈log₂ N⌉/ε) block steps. Hence, if ε = 1/N, then T = (1 + o(1))(log₂ N)² = o(N^{1/4}), m = 1 − O(1/N) and M = 1 + O(1/N). Plugging this into Theorem 3.6, a collision fails to occur in

k (2√((1 + 3 Σ_{j=1}^{2T} j max_{u,v} B^j(u,v)) N/M) + 2T) = (1 + o(1)) 2√19 k √N

steps with a probability of, at most, (1 − δ)^k, where δ = m²/2M² = (1 − o(1))/2.

Now convert block steps into steps of the Rho walk. Recall that T_i denotes the number of Rho steps required for i block steps. The differences T_{i+1} − T_i are i.i.d. random variables with the same distribution as T_1 − T_0. Hence, if i ≥ j, then E[T_i − T_j] = (i − j) E[T_1 − T_0] = 3(i − j). In particular, if we let r = (1 + o(1)) 2√19 √N, let R denote the number of Rho steps before a collision, and let B denote the number of block steps before a collision, then

E[R] ≤ Σ_{k=0}^{∞} Pr[B > kr] E[T_{(k+1)r} − T_{kr} | B > kr] = Σ_{k=0}^{∞} Pr[B > kr] E[T_{(k+1)r} − T_{kr}] ≤ Σ_{k=0}^{∞} ((1 + o(1))/2)^k 3r = (1 + o(1)) 12√19 √N. □
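The Θ(√|G|) behavior predicted by Theorem 4.2 is visible even at toy sizes. The following sketch is our own code (N = 1019 and x = 77 are illustrative parameters); it runs the exponent walk of Section 2 as a genuine Markov chain, resampling the move type at every step, until its first self-intersection:

```python
import random

def rho_walk_collision(N, x, seed):
    """Steps of the exponent walk on Z_N (u+1, u+x, 2u, each with
    probability 1/3) until a state repeats."""
    rng = random.Random(seed)
    u, seen, t = rng.randrange(N), set(), 0
    while u not in seen:
        seen.add(u)
        m = rng.randrange(3)
        u = (u + 1) % N if m == 0 else (u + x) % N if m == 1 else 2 * u % N
        t += 1
    return t

N = 1019
avg = sum(rho_walk_collision(N, 77, s) for s in range(300)) / 300
print(avg / N ** 0.5)  # an order-1 constant: collision in Theta(sqrt N) steps
```

The empirical constant is far below the worst-case 12√19 of Theorem 4.2, which is an upper bound holding for every starting state.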
Proof of Lemma 4.3.
We start with a weaker, but somewhat more intuitive, proof of a bound on B^s(u,v), and then improve it to obtain the result of the lemma. The key idea here will be to separate out a portion of the Markov chain which is tree-like with some large depth L, namely the moves induced solely by b_i = 0 and b_i = 1 moves. Because of the high depth of the tree, the walk spreads out for the first L steps, and hence the probability of being at a vertex also decreases quickly.

Let S = {i ∈ [1, ..., s] : b_i ∈ {0, 1}} and z = Σ_{i∉S} 2^{s−i} b_i be random variables whose values are determined by the first T_s steps of the random walk. Then Y_{T_s} = 2^s Y_{T₀} + 2z + 2 Σ_{i∈S} 2^{s−i} b_i. Hence, choosing Y_{T₀} = u, Y_{T_s} = v, we may write

B^s(u,v) = Σ_S Prob(S) Σ_{z∈Z_N} Prob(z | S) Prob(Σ_{i∈S} 2^{s−i} b_i = 2^{−1}(v − 2^s u) − z | z, S) ≤ Σ_S Prob(S) max_{w∈Z_N} Prob(Σ_{i∈S} 2^{s−i} b_i = w | S),

and so for a fixed choice of S, we can ignore what happens on S^c.

Each w ∈ [0, ..., N−1] has a unique binary expansion, and so if s ≤ ⌊log₂ N⌋, then modulo N each w can still be written in, at most, one way as an s-bit string. For the block walk, Prob(b_i = 0) ≥ 1/3 and Prob(b_i = 1) ≥ 1/9, and so max{Prob(b_i = 0 | i ∈ S), Prob(b_i = 1 | i ∈ S)} ≤ 8/9. It follows that

max_{w∈Z_N} Prob(Σ_{i∈S} 2^{s−i} b_i = w | S) ≤ (8/9)^{|S|} (4.1)

using independence of the b_i's. Hence,

B^s(u,v) ≤ Σ_S Prob(S)(8/9)^{|S|} = Σ_{r=0}^{s} Prob(|S| = r)(8/9)^r ≤ Σ_{r=0}^{s} (s choose r)(4/9)^r (5/9)^{s−r} (8/9)^r = ((4/9)(8/9) + 5/9)^s = (77/81)^s.

The second inequality was because (8/9)^{|S|} is decreasing in |S|, and so underestimating |S| by assuming Prob(i ∈ S) = 4/9 can only increase the estimate of B^s(u,v).

In order to improve on this, we will shortly redefine S (namely, the events {i ∈ S}, {i ∉ S}) and auxiliary variables c_i, using the steps of the Rho walk. Also note that the block walk is induced by a Rho walk, so we may assume that the b_i were constructed by a series of steps of the Rho walk. With probability 1/4 put i ∈ S and c_i = 0; otherwise, if the first step is of Type 1, then set i ∈ S and c_i = 1, while if the first step is of Type 3 then put i ∉ S and c_i = 0; and finally, if the first step is of Type 2, then again repeat the above decision-making process, using the subsequent steps of the walk. Note that the above construction can be summarized as consisting of one of four equally likely outcomes (at each time), where the last three outcomes depend on the type of the step that the Rho walk takes; indeed, each of these three outcomes happens with probability (3/4) × (1/3) = 1/4; finally, a Type 2 step forces us to reiterate the four-way decision-making process.

In summary, Pr(i ∈ S) = Σ_{l=0}^{∞} (1/4)^l (1/2) = 2/3. Also observe that Pr(c_i = 0 | i ∈ S) = Pr(c_i = 1 | i ∈ S), and that Pr(b_i − c_i = x | i ∈ S, c_i = 0) = Pr(b_i − c_i = x | i ∈ S, c_i = 1). Hence the steps done earlier (leading to the weaker bound) carry through with z = Σ_i 2^{s−i}(b_i − c_i) and with Σ_{i∈S} 2^{s−i} b_i replaced by Σ_{i∈S} 2^{s−i} c_i. In (4.1) replace (8/9)^{|S|} by (1/2)^{|S|}, and in showing the final upper bound on B^s(u,v), replace 4/9 by 2/3. This leads to the bound B^s(u,v) ≤ ((2/3)(1/2) + 1/3)^s = (2/3)^s.

Finally, when s > ⌊log₂ N⌋, simply apply the preceding argument to S′ = S ∩ [1, ..., ⌊log₂ N⌋]. Alternately, note that when s ≥ ⌊log₂ N⌋, then B^s(u,v) ≤ max_w B^{⌊log₂ N⌋}(u,w) ≤ (2/3)^{⌊log₂ N⌋} ≤ 3/(2N^{log₂3 − 1}) for every doubly stochastic Markov chain B. □

In [6, 8] sufficiently strong bounds on the asymptotics of B^s(u,v) are shown in several ways, including the use of characters and quadratic forms, canonical paths, and Fourier analysis. We give here the Fourier approach, as it establishes the sharpest mixing bounds. To bound the mixing time of the block walk, it suffices to show that for large enough s, the distribution ν_s of

Z_s = 2^{s−1} b_1 + 2^{s−2} b_2 + ··· + b_s

is close to the uniform distribution U ≡ 1/N, because then the distribution of X_s = 2^s Y_{T₀} + 2Z_s will be close to uniform as well. More precisely, convergence in chi-square distance will be shown.

Lemma 4.5. If ν_s(j) = Pr[Z_s = j], ξ = 1 − (4 − √10)/9, and m satisfies 2^{m−1} < N < 2^m, then

N Σ_{j=0}^{N−1} (ν_s(j) − 1/N)² ≤ ξ^{2(⌊s/m⌋ − 1)(m−1)}.

Proof of Theorem 4.4. By Cauchy–Schwarz,

|B^s(u,v) − U(v)| / U(v) = |Σ_w (B^{s/2}(u,w) − U(w))(B^{s/2}(w,v) − U(v))| / U(v)
(4.2) = |Σ_w U(w)(B^{s/2}(u,w)/U(w) − 1)((B*)^{s/2}(v,w)/U(w) − 1)|
≤ √(Σ_w U(w)(B^{s/2}(u,w)/U(w) − 1)²) √(Σ_x U(x)((B*)^{s/2}(v,x)/U(x) − 1)²).

Lemma 4.5 bounds the first sum of (4.2). The second sum is the same quantity but for the time-reversed walk B*(y,x) = B(x,y). To examine the reversed walk, let b*_i denote the sum of steps taken by B* between the (i−1)st and the i-th time that a u → u/2 move occurs, and let Z*_s = 2^{−s+1} b*_1 + ··· + b*_s.
If we define $b_i = -b^*_i$, then the $b_i$ are independent random variables from the same distribution as the blocks of $B$, and so
$$\Pr[-2^{s-1} Z^*_s = j] = \Pr[b_1 + 2 b_2 + \cdots + 2^{s-1} b_s = j] = \Pr[Z_s = j].$$
Lemma 4.5 thus bounds the second sum of (4.2) as well, and the theorem follows. □

Before proving Lemma 4.5 let us review the standard Fourier transform and the Plancherel identity. For any complex-valued function $f$ on $\mathbb{Z}_N$ and $\omega = e^{2\pi i/N}$, recall that the Fourier transform $\hat f : \mathbb{Z}_N \to \mathbb{C}$ is given by $\hat f(\ell) = \sum_{j=0}^{N-1} \omega^{\ell j} f(j)$, and the Plancherel identity asserts that
$$N \sum_{j=0}^{N-1} |f(j)|^2 = \sum_{j=0}^{N-1} |\hat f(j)|^2.$$
For the distribution $\mu$ of a $\mathbb{Z}_N$-valued random variable $X$, its Fourier transform is
$$\hat\mu(\ell) = \sum_{j=0}^{N-1} \omega^{\ell j} \mu(j) = E[\omega^{\ell X}].$$
Thus, given distributions $\mu_1, \mu_2$ of two independent random variables $Y_1, Y_2$, the distribution $\nu$ of $X := Y_1 + Y_2$ has the Fourier transform $\hat\nu = \hat\mu_1 \hat\mu_2$, since
$$\hat\nu(\ell) = E[\omega^{\ell X}] = E[\omega^{\ell(Y_1 + Y_2)}] = E[\omega^{\ell Y_1}] E[\omega^{\ell Y_2}] = \hat\mu_1(\ell) \hat\mu_2(\ell). \tag{4.3}$$
Generally, the distribution $\nu$ of $X := Y_1 + \cdots + Y_s$ with independent $Y_i$ has the Fourier transform $\hat\nu = \prod_{r=1}^s \hat\mu_r$. Moreover, for the uniform distribution $U$, it is easy to check that
$$\hat U(\ell) = \begin{cases} 1, & \text{if } \ell = 0, \\ 0, & \text{otherwise.} \end{cases}$$
As the random variables $2^r b_{s-r}$ are independent, $\hat\nu_s = \prod_{r=0}^{s-1} \hat\mu_r$, where $\mu_r$ is the distribution of $2^r b_{s-r}$. The linearity of the Fourier transform and $\hat\nu_s(0) = E[1] = 1$ yield
$$\widehat{\nu_s - U}(\ell) = \hat\nu_s(\ell) - \hat U(\ell) = \begin{cases} 0, & \text{if } \ell = 0, \\ \prod_{r=0}^{s-1} \hat\mu_r(\ell), & \text{otherwise.} \end{cases}$$

Proof of Lemma 4.5. By Plancherel's identity, it is enough to show that
$$\sum_{\ell=1}^{N-1} \Big| \prod_{r=0}^{s-1} \hat\mu_r(\ell) \Big|^2 \le 2\big( (1 + \xi^{\lfloor s/m \rfloor})^{m-1} - 1 \big).$$
Let $A_r$ be the event that $b_{s-r} = 0$ or $1$.
Then,
$$\hat\mu_r(\ell) = E[\omega^{\ell 2^r b_{s-r}}] = \Pr[b_{s-r} = 0] + \Pr[b_{s-r} = 1]\,\omega^{\ell 2^r} + \Pr[\bar A_r]\,E[\omega^{\ell 2^r b_{s-r}} \mid \bar A_r],$$
and, for $x := \Pr[b_{s-r} = 0]$ and $y := \Pr[b_{s-r} = 1]$,
$$|\hat\mu_r(\ell)| \le |x + y\omega^{\ell 2^r}| + (1 - x - y)\,|E[\omega^{\ell 2^r b_{s-r}} \mid \bar A_r]| \le |x + y\omega^{\ell 2^r}| + 1 - x - y.$$
Notice that
$$|x + y\omega^{\ell 2^r}|^2 = \Big( x + y\cos\frac{2\pi\ell 2^r}{N} \Big)^2 + y^2 \sin^2\frac{2\pi\ell 2^r}{N} = x^2 + y^2 + 2xy\cos\frac{2\pi\ell 2^r}{N}.$$
If $\cos\frac{2\pi\ell 2^r}{N} \le 0$, then
$$|\hat\mu_r(\ell)| \le (x^2 + y^2)^{1/2} + 1 - x - y = 1 - \big( x + y - (x^2 + y^2)^{1/2} \big).$$
Since $x = \Pr[b_{s-r} = 0] \ge 1/3$ and $y = \Pr[b_{s-r} = 1] \ge 1/9$, it is easy to see that $x + y - (x^2 + y^2)^{1/2}$ has its minimum when $x = 1/3$, $y = 1/9$ (for both partial derivatives are positive). Hence,
$$|\hat\mu_r(\ell)| \le \xi = 1 - \frac{4 - \sqrt{10}}{9} \qquad \text{provided } \cos\frac{2\pi\ell 2^r}{N} \le 0.$$
If $\cos\frac{2\pi\ell 2^r}{N} > 0$, we use the trivial bound $|\hat\mu_r(\ell)| = |E[\omega^{\ell 2^r b_{s-r}}]| \le 1$. For $\ell = 1, \ldots, N-1$, let $\phi_s(\ell)$ be the number of $r = 0, \ldots, s-1$ with $\cos\frac{2\pi\ell 2^r}{N} \le 0$. Then
$$\prod_{r=0}^{s-1} |\hat\mu_r(\ell)| \le \xi^{\phi_s(\ell)}. \tag{4.4}$$
To estimate $\phi_s(\ell)$, we consider the binary expansion $\ell/N = 0.\alpha_{\ell,1}\alpha_{\ell,2}\cdots\alpha_{\ell,s}\cdots$, $\alpha_{\ell,r} \in \{0,1\}$, with $\alpha_{\ell,r} = 0$ infinitely often. Hence, $\ell/N = \sum_{r=1}^\infty 2^{-r}\alpha_{\ell,r}$. The fractional part of $\ell 2^r/N$ may be written as $\{\ell 2^r/N\} = 0.\alpha_{\ell,r+1}\alpha_{\ell,r+2}\cdots\alpha_{\ell,s}\cdots$. Notice that $\cos\frac{2\pi\ell 2^r}{N} \le 0$ when $\{\ell 2^r/N\}$ is (inclusively) between $1/4$ and $3/4$, which happens when $\alpha_{\ell,r+1} \ne \alpha_{\ell,r+2}$. Thus $\phi_s(\ell)$ is at least as large as the number of alternations in the sequence $(\alpha_{\ell,1}, \alpha_{\ell,2}, \ldots, \alpha_{\ell,s+1})$.

We now take $m$ such that $2^{m-1} < N < 2^m$. Observe that, for $\ell = 1, \ldots, N-1$, the subsequences $\alpha(\ell) := (\alpha_{\ell,1}, \alpha_{\ell,2}, \ldots, \alpha_{\ell,m})$ of length $m$ are pairwise distinct: if $\alpha(\ell) = \alpha(\ell')$ for some $\ell < \ell'$, then $\frac{\ell' - \ell}{N}$ is less than $\sum_{r \ge m+1} 2^{-r} \le 2^{-m}$, which is impossible as $N < 2^m$. Similarly, for fixed $r$ and $\ell = 1, \ldots, N-1$, all subsequences $\alpha(\ell; r) := (\alpha_{\ell,r+1}, \alpha_{\ell,r+2}, \ldots, \alpha_{\ell,r+m})$ are pairwise distinct.
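The estimate $|\hat\mu_r(\ell)| \le \xi$ above can be probed numerically. The following sketch (grid resolution arbitrary, chosen for illustration) searches for violations of $|x + ye^{i\theta}| + 1 - x - y \le \xi$ over the region $x \ge 1/3$, $y \ge 1/9$, $x + y \le 1$, $\cos\theta \le 0$:

```python
import cmath
import math

xi = 1 - (4 - math.sqrt(10)) / 9   # the constant from Lemma 4.5, ~0.9069

worst = 0.0
for i in range(30):
    for j in range(30):
        x = 1 / 3 + 0.02 * i       # x = Pr[b = 0] >= 1/3
        y = 1 / 9 + 0.02 * j       # y = Pr[b = 1] >= 1/9
        if x + y > 1:
            continue
        for k in range(181):
            theta = math.pi / 2 + k * math.pi / 180   # cos(theta) <= 0 on this arc
            val = abs(x + y * cmath.exp(1j * theta)) + 1 - x - y
            worst = max(worst, val)

# The maximum is attained at x = 1/3, y = 1/9, cos(theta) = 0, where it equals xi.
assert worst <= xi + 1e-9
assert worst > xi - 1e-6
```

The grid confirms that the extreme point is the corner $(x, y) = (1/3, 1/9)$, matching the partial-derivative argument in the proof.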
In particular, for fixed $r$ with $r = 0, \ldots, \lfloor s/m \rfloor - 1$, all subsequences $\alpha(\ell; rm)$, $\ell = 1, \ldots, N-1$, are pairwise distinct. Since the fractional part $\{2^{rm}\ell/N\} = 0.\alpha_{\ell,rm+1}\alpha_{\ell,rm+2}\cdots$ must be the same as $\ell'/N$ for some $\ell'$ in the range $1 \le \ell' \le N-1$, there is a unique permutation $\sigma_r$ of $1, \ldots, N-1$ such that $\alpha(\ell; rm) = \alpha(\sigma_r(\ell))$. Writing $|\alpha(\sigma_r(\ell))|_A$ for the number of alternations in $\alpha(\sigma_r(\ell))$, we have
$$\phi_s(\ell) \ge \sum_{r=0}^{\lfloor s/m \rfloor - 1} |\alpha(\sigma_r(\ell))|_A,$$
where $\sigma_0$ is the identity. Therefore, since $\xi^{2\phi_s(\ell)} \le \xi^{\phi_s(\ell)}$, (4.4) gives
$$\sum_{\ell=1}^{N-1} \Big| \prod_{r=0}^{s-1} \hat\mu_r(\ell) \Big|^2 \le \sum_{\ell=1}^{N-1} \xi^{\sum_{r=0}^{\lfloor s/m \rfloor - 1} |\alpha(\sigma_r(\ell))|_A}.$$
Using
$$\xi^{x+y} + \xi^{x'+y'} \le \xi^{\min\{x,x'\} + \min\{y,y'\}} + \xi^{\max\{x,x'\} + \max\{y,y'\}}$$
inductively, the above upper bound is maximized when all $\sigma_r$ are the identity, that is,
$$\sum_{\ell=1}^{N-1} \Big| \prod_{r=0}^{s-1} \hat\mu_r(\ell) \Big|^2 \le \sum_{\ell=1}^{N-1} \xi^{\lfloor s/m \rfloor |\alpha(\ell)|_A}.$$
Note that $1/N \le \ell/N \le 1 - 1/N$ implies that $\alpha(\ell)$ is neither $(0, \ldots, 0)$ nor $(1, \ldots, 1)$ (both of length $m$). This means that all $\alpha(\ell)$ have at least one alternation. Since the $\alpha(\ell)$ are pairwise distinct,
$$\sum_{\ell=1}^{N-1} \xi^{\lfloor s/m \rfloor |\alpha(\ell)|_A} \le \sum_{\alpha : |\alpha|_A > 0} \xi^{\lfloor s/m \rfloor |\alpha|_A},$$
where the sum is taken over all sequences $\alpha \in \{0,1\}^m$ with $|\alpha|_A > 0$. Let $H(z)$ be the number of $\alpha$ with exactly $z$ alternations. Then
$$H(z) = 2\binom{m-1}{z},$$
and hence
$$\sum_{\alpha : |\alpha|_A > 0} \xi^{\lfloor s/m \rfloor |\alpha|_A} = 2\sum_{z=1}^{m-1} \binom{m-1}{z} \xi^{\lfloor s/m \rfloor z} = 2\big( (1 + \xi^{\lfloor s/m \rfloor})^{m-1} - 1 \big). \qquad \square$$

Remark 4.6.
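Three ingredients of the proof just completed, the convolution identity (4.3), the Plancherel identity, and the alternation count $H(z) = 2\binom{m-1}{z}$ with its closed-form sum, admit a quick numerical sanity check. The toy distributions and the parameters $m = 10$, $\lfloor s/m \rfloor = 3$ below are arbitrary choices, not values from the paper:

```python
import cmath
from itertools import product
from math import comb, isclose, sqrt

N = 8
omega = cmath.exp(2j * cmath.pi / N)

def fourier(f):
    """hat f(l) = sum_j omega^(l j) f(j), as in the review above."""
    return [sum(omega ** (l * j) * f[j] for j in range(N)) for l in range(N)]

# Two arbitrary toy distributions on Z_N and the law of their sum mod N.
mu1 = [0.45, 0.25, 0.15, 0.10, 0.05, 0.0, 0.0, 0.0]
mu2 = [0.30, 0.30, 0.20, 0.10, 0.05, 0.05, 0.0, 0.0]
nu = [0.0] * N
for a in range(N):
    for b in range(N):
        nu[(a + b) % N] += mu1[a] * mu2[b]

hat1, hat2, hatnu = fourier(mu1), fourier(mu2), fourier(nu)
# (4.3): the transform of a sum of independent variables is the product.
assert all(abs(hatnu[l] - hat1[l] * hat2[l]) < 1e-9 for l in range(N))

# Plancherel: N * sum |f(j)|^2 = sum |hat f(l)|^2, applied to f = nu - U.
f = [nu[j] - 1.0 / N for j in range(N)]
fhat = fourier(f)
assert abs(N * sum(abs(v) ** 2 for v in f)
           - sum(abs(v) ** 2 for v in fhat)) < 1e-9

# Alternation counting from the end of the proof.
def alt(alpha):
    """|alpha|_A: number of indices where consecutive entries differ."""
    return sum(u != v for u, v in zip(alpha, alpha[1:]))

m, k = 10, 3                      # k plays the role of floor(s/m)
xi = 1 - (4 - sqrt(10)) / 9
H = {}
for alpha in product((0, 1), repeat=m):
    H[alt(alpha)] = H.get(alt(alpha), 0) + 1
assert all(H[z] == 2 * comb(m - 1, z) for z in H)   # H(z) = 2 C(m-1, z)

total = sum(xi ** (k * alt(a)) for a in product((0, 1), repeat=m) if alt(a) > 0)
assert isclose(total, 2 * ((1 + xi ** k) ** (m - 1) - 1))
```

Replacing `mu1`, `mu2` by the block-increment distributions $\mu_r$ recovers the factorization $\hat\nu_s = \prod_r \hat\mu_r$ used above.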
For the reader interested in applying these methods to show a Birthday-type result for other problems, it is worth noting that a Fourier approach can also be used to show that $B^s(u,v)$ decreases quickly, and so $A_T, A^*_T = O(1)$.

For the distribution $\nu_s$ of $X_s$ the Plancherel identity gives
$$\max_v \Pr[X_s = v]^2 = \max_v \nu_s(v)^2 \le \sum_{w=0}^{N-1} \nu_s(w)^2 = \frac{1}{N} \sum_{\ell=0}^{N-1} |\hat\nu_s(\ell)|^2 = \frac{1}{N} \sum_{\ell=0}^{N-1} \Big| \prod_{r=0}^{s-1} \hat\mu_r(\ell) \Big|^2.$$
For $\ell = 0, 1, \ldots, N-1$, let $\phi_s(\ell)$ be the number of $r = 0, \ldots, s-1$ with $\cos\frac{2\pi\ell 2^r}{N} \le 0$. Then
$$\prod_{r=0}^{s-1} |\hat\mu_r(\ell)| \le \xi^{\phi_s(\ell)}.$$
Take $m$ such that $2^{m-1} < N < 2^m$. Then, for $s \le m-1$ and any fixed binary sequence $\alpha_1, \ldots, \alpha_s$ (that is, $\alpha_j \in \{0,1\}$), there are at most $\lceil 2^{-s} N \rceil$ values of $\ell$ such that the binary expansion of $\ell/N$ up to $s$ digits is $0.\alpha_1 \cdots \alpha_s$. Since there are at most $2^s e^{-\Omega(s)}$ binary sequences with fewer than $(s-1)/4$ alternations, it follows that
$$\prod_{r=0}^{s-1} |\hat\mu_r(\ell)| = 2e^{-\Omega(s)}$$
except for at most $2^s e^{-\Omega(s)} \lceil 2^{-s} N \rceil = 2e^{-\Omega(s)} N$ values of $\ell$. Using the trivial bound $\prod_{r=0}^{s-1} |\hat\mu_r(\ell)| \le 1$ for the remaining $\ell$'s, we have
$$\max_v \Pr[X_s = v]^2 = 2e^{-\Omega(s)} + 2e^{-\Omega(s)} = 2e^{-\Omega(s)}.$$
If $s > m-1$, then $\prod_{r=0}^{s-1} |\hat\mu_r(\ell)| \le \prod_{r=0}^{m-2} |\hat\mu_r(\ell)|$ implies that
$$\max_v \Pr[X_s = v] = 2e^{-\Omega(m-1)} = O(N^{-\Omega(1)}).$$

One might expect that the correct order of the mixing time of the block walk $X_s$ is indeed $\Theta(\log p \log\log p)$. This is in fact the case, at least for certain values of $p$ and $x$.

Theorem 4.7. If $p = 2^t - 1$ and $x = p - 1$, then the block walk has mixing time $\tau(1/4) = \Theta(\log p \log\log p)$.

Proof. The upper bound on mixing time, $O(\log p \log\log p)$, was shown in Theorem 4.4 via a Fourier argument. The proof of the lower bound of $\Omega(\log p \log\log p)$ is modeled after an argument of Hildebrand [4], which in turn closely follows a proof of Chung, Diaconis and Graham [2].
The basic idea is by now fairly standard: choose a function and show that its expectations under the stationary distribution and under the $n$-step distribution $P_n$ are far apart, with sufficiently small variances, to conclude that the two distributions ($P_n$ and $\pi$) must differ significantly. Theorem 4.7 is not used in the main results of this paper, and the proof is fairly long, so it is left for the Appendix. □

5. Distinguished point methods. The Rho algorithm can be parallelized to $J$ processors via the Distinguished Points method of van Oorschot and Wiener [19]. To do this, start with a global choice of (random) partition $S_1 \sqcup S_2 \sqcup S_3$ (i.e., a common iterating function $F$), and choose $J$ initial values $\{y_j\}_{j=1}^J$ from $\mathbb{Z}_N$, one per processor. Then run the Rho walk on processor $j$, starting from initial state $g^{y_j}$, until a collision occurs between either two walks or a walk and itself. To detect a collision, let $\varphi : G \to \{0,1\}$ be an easily computed hash function, with support $\{x \in G : \varphi(x) = 1\}$ to be called the distinguished points. Each time a distinguished point is reached by a processor, it is sent to a central repository and compared against previously received states. Once a distinguished point is reached twice, a collision has occurred, and the discrete logarithm can likely be found; conversely, once a collision occurs, it will be detected the next time a distinguished point is reached.

The proofs in previous sections immediately imply a factor of $J$ speed-up when parallelizing. To see this, suppose the initial values $\{y_j\}_{j=1}^J$ are chosen uniformly at random. Run a Rho walk for some $T$ steps per processor, and then define $\{X_i\}$ by concatenating the walks, starting with the Rho walk of processor $0$: if $Y^j_i$ denotes the $i$th state of copy $j$ of the walk, for $i \in \{0, 1, \ldots, T\}$ and $j \in \{0, 1, 2, \ldots, J-1\}$, then
$$X_i = Y^{i \,\mathrm{div}\, (T+1)}_{i \,\mathrm{mod}\, (T+1)} \qquad \text{for } i \in \{0, 1, \ldots, J(T+1) - 1\}.$$
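The parallelized search just described can be sketched in code. Everything below is a hypothetical toy instance: the prime, the partition by $x \bmod 3$ standing in for the random-oracle partition $S_1 \sqcup S_2 \sqcup S_3$, and the hash $\varphi(x) = 1$ iff $16 \mid x$ are illustrative choices, not part of the paper:

```python
import random
from math import gcd

def prime_factors(n):
    """Prime factors of n by trial division (fine at this toy size)."""
    fs, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            fs.add(d)
            n //= d
        d += 1
    if n > 1:
        fs.add(n)
    return fs

p = 999983                     # prime (toy); G = Z_p^* is cyclic of order N
N = p - 1
fs = prime_factors(N)
g = next(h for h in range(2, p)
         if all(pow(h, N // q, p) != 1 for q in fs))   # a generator of Z_p^*
s = random.randrange(1, N)     # secret exponent
y = pow(g, s, p)               # discrete-log target: recover s from (g, y)

def step(x, a, b):
    """One Rho step on (x = g^a y^b); x mod 3 plays the random-oracle partition."""
    r = x % 3
    if r == 0:
        return (x * x) % p, (2 * a) % N, (2 * b) % N   # squaring (doubling) step
    if r == 1:
        return (x * g) % p, (a + 1) % N, b             # multiply by g
    return (x * y) % p, a, (b + 1) % N                 # multiply by y

def distinguished(x):
    return x % 16 == 0         # hash phi: roughly 1/16 of states distinguished

J = 4                          # number of (simulated) processors
walkers = []
for _ in range(J):
    a0 = random.randrange(N)   # uniformly random initial value, state g^(y_j)
    walkers.append((pow(g, a0, p), a0, 0))

repository = {}                # central repository: distinguished point -> (a, b)
found = None
while found is None:
    for j in range(J):         # round-robin simulation of the J processors
        x, a, b = step(*walkers[j])
        walkers[j] = (x, a, b)
        if distinguished(x):
            if x in repository and repository[x] != (a, b):
                found = (repository[x], (a, b))
                break
            repository[x] = (a, b)

(a1, b1), (a2, b2) = found
# Collision: g^a1 * y^b1 = g^a2 * y^b2, hence a1 + s*b1 = a2 + s*b2 (mod N).
assert (a1 + s * b1) % N == (a2 + s * b2) % N
db = (b1 - b2) % N
if gcd(db, N) == 1:            # when invertible, the discrete log is recovered
    assert (a2 - a1) * pow(db, -1, N) % N == s
```

In practice each processor runs independently and transmits only its distinguished points; the round-robin loop above is merely a sequential simulation of that behavior.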
The process $\{X_i\}$ just defined is a time-dependent random walk which follows the Rho walk, except at multiples of time $T+1$, where it instead jumps to a uniformly random state. Since our proofs involve pessimistic estimates on the distance of a distribution from uniform, and these jumps result in uniform samples, they can only improve the result. Hence this effectively leads to a Rho walk of $J(T+1)$ steps, and so a factor of $J$ speed-up per processor is achieved. If the initial values were not uniform, then discard the first $O(\log N)$ steps per processor and treat the next state as the initial value, which by Theorem 4.4 will give a nearly uniform start state.

APPENDIX

Proof of Theorem 4.7. Our approach follows that taken in Section 4 of Hildebrand [4], "A proof of Case 2." Recall that we will show that $E_U(f)$ and $E_{P_n}(f)$ are far apart for some function $f$ with sufficiently small variance, to conclude that the two distributions ($P_n$ and the uniform distribution $U$ on $\mathbb{Z}_p$) must differ significantly.

More precisely, let $P_n$ be the distribution of the block walk on $\mathbb{Z}_p$ starting at state $u = 0$ and proceeding for $n$ steps. For some $\alpha > 0$, let
$$A = \{ y : |f(y) - E_U(f)| \ge \alpha \sqrt{\mathrm{Var}_U(f)} \}.$$
By Chebyshev's inequality, $U(A) \le 1/\alpha^2$. Also, for some $\beta > 0$, let
$$B = \{ y : |f(y) - E_{P_n}(f)| \ge \beta \sqrt{\mathrm{Var}_{P_n}(f)} \}.$$
By Chebyshev's inequality, $P_n(B) \le 1/\beta^2$. If $A^c \cap B^c = \varnothing$, then $P_n(A^c) \le P_n(B) \le 1/\beta^2$, and so
$$\min_{v \in A^c} \frac{P_n(v)}{U(v)} \le \frac{P_n(A^c)}{U(A^c)} \le \frac{1/\beta^2}{1 - 1/\alpha^2}.$$
If $\sqrt{\mathrm{Var}_U(f)},\ \sqrt{\mathrm{Var}_{P_n}(f)} = o(|E_U(f) - E_{P_n}(f)|)$ for a sequence $n(p) = \Omega(\log p \log\log p)$, then as $p \to \infty$ it is possible to choose $\alpha, \beta \to \infty$ such that $A^c \cap B^c = \varnothing$. The theorem then follows as
$$\min_{v \in \Omega} \frac{P_n(v)}{U(v)} \xrightarrow{p \to \infty} 0.$$
The function $f : \mathbb{Z}_p \to \mathbb{C}$ to be used here is $f(k) = \sum_{j=0}^{t-1} q^{k 2^j}$, where $q = e^{2\pi i/p}$. Then
$$E_U(f) = \frac{1}{p} \sum_{j=0}^{t-1} \sum_{k=0}^{p-1} (q^{2^j})^k = 0,$$
since $q^\alpha$ is a primitive root of unity when $p$ is prime and $1 \le \alpha < p$.
Likewise,
$$E_U(f\bar f) = \frac{1}{p} \sum_{j,j'=0}^{t-1} \sum_{k=0}^{p-1} q^{k 2^j} \overline{q^{k 2^{j'}}} = \frac{1}{p} \sum_{j=0}^{t-1} p = t$$
by the orthogonality relationship of roots of unity. It follows that $\mathrm{Var}_U(f) = t$.

The block walk on $\mathbb{Z}_p$ with $n = rt$ steps will be considered, where $r \in \mathbb{N}$ will be chosen later. Let $P_n(\cdot)$ denote the distribution of $Z_n = 2^{n-1} b_1 + 2^{n-2} b_2 + \cdots + b_n$ induced by $n$ steps of the block walk starting at state $u = 0$. A generic increment will be denoted by $b$, since the $b_i$ are independent random variables from the same distribution.

It will be useful to introduce a bit of notation. If $\alpha \in \mathbb{Z}$ and $y \in \mathbb{Z}_p$, then define
$$\mu_\alpha(y) = \Pr[2^\alpha b = y] = \Pr[b = y 2^{-\alpha}].$$
Recall the Fourier transform of a distribution $\nu$ on $\mathbb{Z}_p$ is given by $\hat\nu(\ell) = \sum_{k=0}^{p-1} q^{\ell k} \nu(k) = E_\nu(q^{\ell k})$. Properties of certain Fourier transforms are required in our work. First, since $p = 2^t - 1$ we have $2^t \equiv 1 \pmod p$, so $\mu_\alpha(y) = \mu_{\alpha + ct}(y)$ for $c \in \mathbb{N}$, and so $\hat\mu_\alpha(y) = \hat\mu_{(\alpha \bmod t)}(y)$. By this and (4.3), if $y \in \mathbb{Z}_p$, then
$$\hat P_n(y) = \prod_{\alpha=0}^{n-1} \hat\mu_\alpha(y) = \Big( \prod_{\alpha=0}^{t-1} \hat\mu_\alpha(y) \Big)^r = \hat P_t(y)^r.$$
Also, $\hat\mu_\alpha(2^j y) = \hat\mu_{\alpha+j}(y)$, and so
$$\forall j \in \mathbb{N} : \hat P_t(2^j y) = \prod_{\alpha=0}^{t-1} \hat\mu_\alpha(2^j y) = \prod_{\alpha=0}^{t-1} \hat\mu_\alpha(y) = \hat P_t(y).$$
Finally, for $0 \le j \le t-1$, define
$$\Pi_j = \hat P_t(2^j - 1) = \prod_{\alpha=0}^{t-1} \hat\mu_\alpha(2^j - 1).$$
Note that $\hat P_t(1 - 2^j) = \hat P_t(2^j - 1)$ because $\mu_\alpha(y) = \mu_\alpha(-y)$ when the step sizes are $\{1, x\} = \{1, -1\}$. Also, $\Pi_{t-j} = \Pi_j$ because, modulo $p$, $2^{t-j} - 1 \equiv 2^{-j} - 1$, and
$$\hat P_t(2^{t-j} - 1) = \hat P_t(2^{-j} - 1) = \hat P_t(2^j(2^{-j} - 1)) = \hat P_t(1 - 2^j) = \hat P_t(2^j - 1).$$

Now turn to mean and variance:
$$E_{P_n}(f) = \sum_k P_n(k) \sum_{j=0}^{t-1} q^{k 2^j} = \sum_{j=0}^{t-1} \hat P_n(2^j) = \sum_{j=0}^{t-1} \hat P_t(2^j)^r = \sum_{j=0}^{t-1} \hat P_t(1)^r = t \Pi_1^r.$$
Likewise,
$$E_{P_n}(f\bar f) = \sum_k P_n(k) \sum_{j,j'=0}^{t-1} q^{k(2^j - 2^{j'})} = \sum_{j,j'=0}^{t-1} \hat P_n(2^j - 2^{j'}) = \sum_{j,j'=0}^{t-1} \hat P_t(2^j - 2^{j'})^r$$
$$= \sum_{j=0}^{t-1} \sum_{\beta=0}^{t-1} \hat P_t(2^j(1 - 2^\beta))^r = t \sum_{\beta=0}^{t-1} \hat P_t(2^\beta - 1)^r = t \sum_{\beta=0}^{t-1} \Pi_\beta^r.$$
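The moment computations and the symmetry $\Pi_{t-j} = \Pi_j$ above can be checked numerically on a small case. The choice $t = 5$ below (so $p = 31$) is a hypothetical toy instance, and $G$ is the increment generating function derived later in this Appendix:

```python
import cmath
import math

t = 5
p = 2 ** t - 1                 # Mersenne prime p = 31
q = cmath.exp(2j * cmath.pi / p)

def f(k):
    """f(k) = sum_j q^(k 2^j), the test function of the proof."""
    return sum(q ** ((k * pow(2, j, p)) % p) for j in range(t))

vals = [f(k) for k in range(p)]
assert abs(sum(vals) / p) < 1e-9                           # E_U(f) = 0
assert abs(sum(abs(v) ** 2 for v in vals) / p - t) < 1e-9  # E_U(f fbar) = Var_U(f) = t

def G(y):
    # generating function of the block increment (derived below in the Appendix)
    return 1.0 / (3.0 - 2.0 * math.cos(2 * math.pi * y))

def Pi(j):
    """Pi_j = prod_alpha G(2^alpha (2^j - 1) / p)."""
    w = (pow(2, j, p) - 1) % p
    out = 1.0
    for alpha in range(t):
        out *= G((pow(2, alpha, p) * w % p) / p)
    return out

for j in range(1, t):
    assert abs(Pi(j) - Pi(t - j)) < 1e-9                   # Pi_{t-j} = Pi_j
```

The symmetry check relies only on $G(y) = G(-y) = G(1-y)$ and the invariance of $\hat P_t$ under multiplication of its argument by $2$, exactly as argued above.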
It follows that
$$\mathrm{Var}_{P_n}(f) = E_{P_n}(f\bar f) - E_{P_n}(f) E_{P_n}(\bar f) = t \sum_{j=0}^{t-1} \Pi_j^r - t^2 |\Pi_1|^{2r}.$$
To apply these relations in Chebyshev's inequality, the quantities $\Pi_j$ need to be examined further. Let $\bar b$ denote the increment taken, but with arithmetic NOT done modulo $p$, so that $\bar b = p + 1$ is possible; that is, repeatedly, with probability $1/3$ each, either add $+1$, add $-1$, or stop. Let $a_k = \Pr[\bar b = k]$, and note that also $a_k = \Pr[\bar b = -k]$, since the nondoubling steps are symmetric, that is, $u \to u + 1$ and $u \to u + x = u - 1$. Then $a_k$ satisfies the recurrence relation
$$a_k = \tfrac{1}{3}(a_{k-1} + a_{k+1}) \ (k \ge 1), \qquad a_0 = \tfrac{1}{3} + \tfrac{2}{3} a_1, \qquad a_\infty = 0,$$
which has solution
$$a_k = \frac{1}{\sqrt 5} \Big( \frac{3 - \sqrt 5}{2} \Big)^{|k|}.$$
For $y \in \mathbb{R}$, let
$$G(y) = \sum_{k=-\infty}^{\infty} a_k e^{2\pi iky} = \frac{1}{\sqrt 5} \cdot \frac{1 - ((3 - \sqrt 5)/2)^2}{1 + ((3 - \sqrt 5)/2)^2 - (3 - \sqrt 5)\cos(2\pi y)} = \frac{1}{3 - 2\cos(2\pi y)}.$$
Since
$$\mu_\alpha(k) = \Pr[b = k 2^{-\alpha}] = \Pr[\bar b\, 2^\alpha \equiv k \bmod p] = \sum_{\{\bar b : \bar b\, 2^\alpha \equiv k \bmod p\}} a_{\bar b},$$
then
$$\Pi_j = \prod_{\alpha=0}^{t-1} \hat\mu_\alpha(2^j - 1) = \prod_{\alpha=0}^{t-1} \sum_{k=-\infty}^{\infty} q^{k(2^j - 1)} \mu_\alpha(k) = \prod_{\alpha=0}^{t-1} \sum_{\bar b = -\infty}^{\infty} a_{\bar b}\, e^{2\pi i \bar b\, 2^\alpha (2^j - 1)/p} = \prod_{\alpha=0}^{t-1} G\Big( \frac{2^\alpha (2^j - 1)}{p} \Big).$$

We can now show the necessary bounds. Recall that $n = rt$ for some sequence $r = r(t) \in \mathbb{N}$ to be defined. Let $\lambda = \lambda(t) \in \mathbb{R}$ be a sequence such that $\lambda \xrightarrow{t \to \infty} \infty$, $\lambda = o(\log t)$, and
$$r = \frac{\log t}{2\log(1/|\Pi_1|)} - \lambda$$
is an integer. Such a sequence will exist if $|\Pi_1|$ is bounded away from $0$ and $1$.

Claim 1. $|\Pi_1|$ is bounded away from $0$ and $1$ as $p = 2^t - 1 \to \infty$.

Proof. First, a few preliminary calculations are necessary. If $y \in [0, 1/2]$, then
$$G(y) \ge \frac{1}{\sqrt 5} \cdot \frac{1 - ((3 - \sqrt 5)/2)^2}{1 + ((3 - \sqrt 5)/2)^2 + (3 - \sqrt 5)} = \frac{1}{5}.$$
Also, since $y \in \mathbb{R}$, then $|G(y)| \le \sum a_k |e^{2\pi iky}| = \sum a_k = 1$, with equality at $y = 0$. And, since the first derivative satisfies the relation $0 \ge G'(y) \ge -2\pi(3 - \sqrt 5) > -5$ on $[0, 1/2]$, then $G(\varepsilon) \ge G(0) - 5\varepsilon = 1 - 5\varepsilon$ for $\varepsilon \ge 0$. To see that $|\Pi_1|$ is bounded away from $1$, observe that
$$\Pi_1 \le G(2^{t-1}/p) \cdot 1^{t-1} \xrightarrow{t \to \infty} G(1/2) = \frac{1}{5}.$$
To bound $\Pi_1$ away from $0$, note that since $G(y)$ is decreasing for $y \in [0, 1/2]$,
$$\Pi_1 = \prod_{\alpha=0}^{t-1} G(2^\alpha/p) = \prod_{\alpha=t-5}^{t-1} G(2^\alpha/p) \prod_{\alpha=0}^{t-6} G(2^\alpha/p) \ge \Big( \frac{1}{5} \Big)^5 \prod_{\beta=6}^{t} (1 - 5 \cdot 2^{-\beta} - 5 \cdot 2^{-t}) \ge \Big( \frac{1}{5} \Big)^5 e^{-1/2}.$$
The final inequality used the relation $\ln(1 - x) \ge -x - x^2$. □

Claim 2. $(\mathrm{Var}_U(f))^{1/2} = o(|E_{P_n}(f) - E_U(f)|)$.

Proof. Since $|\Pi_1|$ is bounded away from $0$ and $1$, then
$$|E_{P_n}(f) - E_U(f)| = t|\Pi_1|^r = \sqrt t\, |\Pi_1|^{-\lambda}.$$
The claim follows as $(\mathrm{Var}_U(f))^{1/2} = \sqrt t$ and $|\Pi_1|^{-\lambda} \to \infty$. □

Claim 3. $(\mathrm{Var}_{P_n}(f))^{1/2} = o(|E_{P_n}(f) - E_U(f)|)$.

Proof. Assume
$$\frac{1}{t} \sum_{j=0}^{t-1} \Big( \frac{\Pi_j}{\Pi_1^2} \Big)^r \xrightarrow{t \to \infty} 1.$$
Then the claim follows from
$$(\mathrm{Var}_{P_n}(f))^{1/2} = t|\Pi_1|^r \sqrt{ \frac{1}{t} \sum_{j=0}^{t-1} \Big( \frac{\Pi_j}{\Pi_1^2} \Big)^r - 1 } = o(t|\Pi_1|^r).$$
Hildebrand [4] requires 4 pages (pages 351–354) to prove the assumption, albeit for a different function $G(y)$. Fortunately we do not need to rework his proof, as it does not make explicit use of $G(y)$, but instead depends on only a few properties of it. There are two facts required in his proof. First, he shows the following:

Fact 1. There is some $t_0$ such that if $t \ge t_0$, then $\Pi_j \le \Pi_1$ for all $j \ge 2$.

His proof uses only that $G(y) = G(1-y) = G(-y)$, that $G(y)$ is decreasing when $y \in [0, 1/2]$, that $G(1/2 - 2^{-i})$ is decreasing in $i \ge 1$, and that $\lim_{p = 2^t - 1 \to \infty} G((2^{t-1} - 2^{t-j-1})/p) \le G(3/8) < G(1/4)$, properties of $G(y)$ shown in the proof of our Claim 1.

The second necessary tool is the following:

Fact 2. There exist constants $c_0 < 1$ and $t_0$ such that for $t \ge t_0$ and $t^{1/2} \le j \le t/2$, then $|\Pi_j/\Pi_1^2 - 1| \le c_0^j$.

Hildebrand's proof utilizes the following properties: $G'(0) = 0$, $|G'(y)| \le A$, and $G(y) \ge B$ for all $y$ and some $A, B > 0$. All three conditions were shown in the proof of our Claim 1, and so his argument will carry through.

The two facts can now be combined to finish the proof. Let $t$ be sufficiently large that $c_0^{t^{1/2}} \le 1/r$, which is possible as $r = O(\log t)$.
Recalling that $\Pi_{t-j} = \Pi_j$, then
$$\frac{1}{t} \sum_{j=0}^{t-1} \Big( \frac{\Pi_j}{\Pi_1^2} \Big)^r = \frac{1}{t} \Big( \Pi_1^{-2r} + \sum_{j=1}^{t-1} \Big( \frac{\Pi_j}{\Pi_1^2} \Big)^r \Big) = \frac{1}{t} \big( o(t) + (t-1)(1 + o(1)) \big) = 1 + o(1).$$
In the other direction, since $\mathrm{Var}_{P_n}(f) \ge 0$, then $\frac{1}{t} \sum_{j=0}^{t-1} (\Pi_j/\Pi_1^2)^r \ge 1$. □

Acknowledgments. The authors thank S. Kijima, S. Miller, I. Mironov, R. Venkatesan and D. Wilson for several helpful discussions.

REFERENCES

[1] Aldous, D. and Fill, J. (2009). Reversible Markov Chains and Random Walks on Graphs.
[2] Chung, F. R. K., Diaconis, P. and Graham, R. L. (1987). Random walks arising in random number generation. Ann. Probab.
[3] Crandall, R. and Pomerance, C. (2005). Prime Numbers, 2nd ed. Springer, New York. MR2156291
[4] Hildebrand, M. (2006). On the Chung–Diaconis–Graham random process. Electron. Comm. Probab.
[5] Le Gall, J.-F. and Rosen, J. (1991). The range of stable random walks. Ann. Probab.
[6] Kim, J. H., Montenegro, R. and Tetali, P. (2007). Near optimal bounds for collision in Pollard Rho for discrete log. In Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science.
[7] Lyons, R., Peres, Y. and Schramm, O. (2003). Markov chain intersections and the loop-erased walk. Ann. Inst. H. Poincaré Probab. Statist.
[8] Miller, S. D. and Venkatesan, R. (2006). Spectral analysis of Pollard Rho collisions. In Algorithmic Number Theory. Lecture Notes in Computer Science. Springer, Berlin.
[9] Miller, S. D. and Venkatesan, R. (2009). Non-degeneracy of Pollard Rho collisions. Int. Math. Res. Not. IMRN.
[10] Nazarov, F. and Peres, Y. (2010). The birthday problem for finite reversible Markov chains: A uniform bound. Preprint.
[11] Pak, I. (2002). Mixing time and long paths in graphs. In Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms.
[12] Pohlig, S. C. and Hellman, M. E. (1978). An improved algorithm for computing logarithms over GF($p$) and its cryptographic significance. IEEE Trans. Information Theory IT-24.
[13] Pollard, J. M. (1975).
A Monte Carlo method for factorization. Nordisk Tidskr. Informationsbehandling (BIT).
[14] Pollard, J. M. (1978). Monte Carlo methods for index computation (mod $p$). Math. Comp.
[15] Pomerance, C. (2008). Elementary thoughts on discrete logarithms. In Algorithmic Number Theory: Lattices, Number Fields, Curves and Cryptography. Mathematical Sciences Research Institute Publications.
[16] Shoup, V. (1997). Lower bounds for discrete logarithms and related problems. In Advances in Cryptology—EUROCRYPT'97 (Konstanz). Lecture Notes in Computer Science.
[17] Teske, E. (1998). Speeding up Pollard's Rho method for computing discrete logarithms. In Algorithmic Number Theory (Portland, OR, 1998). Lecture Notes in Computer Science.
[18] Teske, E. (2001). Square-root algorithms for the discrete logarithm problem (a survey). In Public-Key Cryptography and Computational Number Theory (Warsaw, 2000).
[19] van Oorschot, P. C. and Wiener, M. J. (1999). Parallel collision search with cryptanalytic applications. J. Cryptology.

J. H. Kim
Department of Mathematics
Yonsei University
Seoul, 120-749
Korea
and
National Institute for Mathematical Sciences
628 Daeduk Boulevard
Daejeon, 305-340
Korea
E-mail: [email protected]; [email protected]

R. Montenegro
Department of Mathematical Sciences
University of Massachusetts Lowell
Lowell, Massachusetts 01854
USA
E-mail: ravi [email protected]

Y. Peres
Microsoft Research
One Microsoft Way
Redmond, Washington 98052
USA
E-mail: [email protected]