Efficient computation of the Jacobi symbol
EEfficient computation of the Jacobi symbol
Niels M¨oller2019
Abstract
The family of left-to-right gcd algorithms reducesinput numbers by repeatedly subtracting the smallernumber, or multiple of the smaller number, from thelarger number. This paper describes how to extendany such algorithm to compute the Jacobi symbol,using a single table lookup per reduction. For bothquadratic time gcd algorithms (Euclid, Lehmer)and subquadratic algorithms (Knuth, Sch¨onhage,M¨oller), the additional cost is linear, roughly one ta-ble lookup per quotient in the quotient sequence. Thismethod was used for the 2010 rewrite of the Jacobisymbol computation in gmp . The Legendre symbol and its generalizations, the Ja-cobi symbol and the Kronecker symbol, are importantfunctions in number theory. For simplicity, in this pa-per we focus on computation of the Jacobi symbol,since the Kronecker symbol can be computed by thesame function with a little preprocessing of the in-puts.
Two quadratic algorithms for computing the Kro-necker symbol (and hence also the Jacobi symbol) aredescribed as Algorithm 1.4.10 and 1.4.12 in [3]. Thesealgorithms run in quadratic time, and consists of aseries of reduction steps, related to Euclid’s gcd al-gorithm and the binary gcd algorithm, respectively.Both Kronecker algorithms share one property withthe binary gcd algorithm: The reduction steps exam-ine the current pair of numbers in both ends. They examine the least significant end to cast out powersof two, and they examine the most significant endto determine a quotient (like in Euclid’s algorithm)or to determine which number is largest (like in thebinary gcd algorithm).Fast, subquadratic, gcd algorithms work by di-vide-and-conquer, where a substantial piece of thework is done by examining only one half of the in-put numbers. Fast left-to-right gcd is related to fastalgorithms for computing the continued fraction ex-pansion [7, 5]. These are left-to-right algorithms, inthat they process the input from the most significantend. The binary recursive algorithm [9] is a right-to-left algorithm, in that it processes inputs from theleast significant end. The asymptotic running timesof these algorithms are O ( M ( n ) log n ), where M ( n )denotes the time needed to multiply two n -bit num-bers. The gcd algorithm used in recent versions ofthe gmp library [4] is a variant of Sch¨onhage’s algo-rithm [6].It is possible to compute the Jacobi symbol in sub-quadratic time, with the same asymptotic complexityas gcd . One algorithm is described in [1] (solution toexercise 5.52), which says:This complexity bound is part of the“folklore” and has apparently never ap-peared in print. The basic idea can be foundin Gauss [1876]. Our presentation is basedon that in Bachmann [1902]. H. W. Lenstra,Jr. also informed us of this idea; he at-tributes it to A. Sch¨onhage.Since the quadratic algorithms for the Jacobi sym-bol examines the data at both ends, some reorgani-zation is necessary to construct a divide-and-conqueralgorithm that processes data from one end. The bi-1 a r X i v : . [ c s . D S ] J u l ary gcd algorithm has the same problem. In thebinary recursive gcd algorithm, this is handled byusing a slightly different reduction step using 2-adicdivision.Recently, the binary recursive gcd algorithm hasbeen extended to compute the Jacobi symbol [2]. Themain difference to the corresponding gcd algorithmis that it needs the intermediate reduced values to benon-negative, and to ensure this the binary quotientsmust be chosen in the range 1 ≤ q < k +1 rather than | q | < k . As a result, the algorithm is slower than the gcd algorithm by a small constant factor. This paper describes a fairly simple extension to awide class of left-to-right gcd algorithms, includingLehmer’s algorithm and the subquadratic algorithmin [6], which computes the Jacobi symbol using only O ( n ) extra time and O (1) extra space . This indicatesthat also for the fastest algorithms for large inputs,the cost is essentially the same for computing the gcd and computing the Jacobi symbol. Like the algorithm described in [1], the computa-tion is related to the quotient sequence. The updatesof the Jacobi symbol are somewhat different, insteadfollowing an unpublished algorithm by Sch¨onhage [8]for computing the Jacobi symbol from the quotientsequence modulo four. In the gcd algorithms in gmp ,the quotients are not always applied in a single step;instead, there is a series of reductions of the form a ← a − mb , where m is a positive number equal toor less than the correct quotient (cid:98) a/b (cid:99) . In the cor-responding Jacobi algorithms, the Jacobi sign is up-dated for each such partial quotient. Most of the par- The size of the additional state to be maintained is O (1).But in a practical implementation, which does not store thisstate in a global variable, either the state or a pointer to it willbe copied into each activation record, which for a subquadraticrecursive divide-and-conquer algorithm costs O (log n ) extraspace rather than O (1) Even though we cannot rule out the existence of a left-to-right gcd algorithm which is a constant factor faster than Ja-cobi. Such an algorithm would lie outside the class of “genericleft-to-right gcd algorithms” we describe in this paper, e.g.,it might use intermediate reduced values of varying signs andquotients that are rounded towards the nearest integer ratherthan towards −∞ . g ← gcd ( a, b )In: a, b > repeat if a ≥ b a ← a − mb , with 1 ≤ m ≤ (cid:98) a/b (cid:99) if a = 05 return b else b ← b − ma , with 1 ≤ m ≤ (cid:98) b/a (cid:99) if b = 09 return a Algorithm 1: Generic left-to-right gcd algorithm.tial quotients are determined from truncated inputswhere the least significant parts of the numbers areignored. The least significant two bits, needed for theJacobi computation, must therefore be maintainedseparately.
The time needed to multiply two n -bit numbers is de-noted M ( n ), where M ( n ) = O ( n log n ) for the fastestknown algorithms. The Jacobi symbol is denoted ( a | b ). We use theconvention that [condition] means the function thatis one when the condition is true, otherwise 0, e.g.,(0 | b ) = [ b = 1]. In this paper, we will not describe the details offast gcd algorithms. Instead we will consider Algo-rithm 1, which is a generic left-to-right gcd algo-rithm, with a basic reduction step where a multipleof the smaller number is subtracted from the largernumber. We also describe the main idea of fast in-stantiations of this algorithm. Multiplication in gmp is based on the more practicalSch¨onhage-Strassen algorithm, with asymptotic complexity O ( n log n log log n ). a, b ) is reduced, un-til a = b and the algorithm terminates. It returns thecorrect value, since gcd ( a, b ) is unchanged by eachreduction step.The running time of an instantiation of this algo-rithm depends on the choice of m in each step, and onthe amount of computation done in each step. E.g.,if m = 1, the worst case number of iterations in ex-ponential. Euclid’s algorithm is a special case where,in each step, m is the correct quotient of the currentnumbers.The faster algorithms implements an iteration thatdepends only on some of the most significant bitsof a and b : These bits determine which of a and b is largest, and they also suffice for computing an m which is close to the quotient (cid:98) a/b (cid:99) or (cid:98) b/a (cid:99) . Further-more, one can compute an initial part of the sequenceof reductions based on the most significant parts of a and b , collect the reductions into a transformationmatrix, and apply all the reductions at once to theleast significant parts of a and b later on. This savesa lot of time, since it omits computing all the inter-mediate a and b to full precision. If one repeatedlychops off one or two of the most significant words,one gets Lehmer’s algorithm, and by chopping num-bers in half, one can construct a divide-and-conqueralgorithm with subquadratic complexity.We will extend this generic algorithm to also com-pute the Jacobi symbol. To do that, we need to in-vestigate how the basic reduction a − mb affects theJacobi symbol. When we have sorted this out, in thenext section, the result is easily applied to all variantsof Algorithm 1. In this section, we summarize the properties of theJacobi symbol we use, derive the update rules neededfor our left-to-right algorithm. Finally, we give theresulting algorithm and prove its correctness.
The Jacobi symbol ( a | b ) is defined for b odd and pos-itive, and arbitrary a . We work primarily with non-negative a , and make use of the following propertiesof the Jacobi symbol. Proposition 1
Assume that a is positive and that b is odd and positive. Then(i) (0 | b ) = [ b = 1] .(ii) ( a | b ) = ( − ( b − / ( − a | b ) (iii) If both a and b are odd, then ( a | b ) = ( − ( a − b − / ( b | a ) (iv) ( a | b ) = ( a − mb | b ) for any m .(v) If a = 0 (mod 4) and ≤ m ≤ (cid:98) b/a (cid:99) , then ( a | b ) = ( a | b − ma ) (vi) If a = 2 (mod 4) and ≤ m ≤ (cid:98) b/a (cid:99) , then ( a | b ) = ( − m ( b − / m ( m − / ( a | b − ma ) Proof : For (i) to (iv) we refer to standard textbooks.The final two are not so well-known, and their use forJacobi computation is suggested by Sch¨onhage [8]. Toprove them, assume that a is even and a < b . Then( a | b ) = ( a − b | b ) By (iv)= ( − ( b − / ( b − a | b ) By (ii)= ( − ( b − / b − b − a − / ( b | b − a ) By (iii)= ( − ( b − / b − b − a − / ( a | b − a ) By (iv)Since b − b , we get aresulting exponent, modulo two, of( b − / b − b − a − / a ( b − / a = 0 (mod 4), this exponent is even and hencethere is no sign change. And this continues to hold ifthe subtraction is repeated, which proves (v). Next,consider the case a = 2 (mod 4). Then a/ m times givesthe exponent a { ( b −
1) + ( b − a −
1) + · · · + ( b − ( m − a − } / m ( b − / m ( m − / (cid:3) Finally, note that in these formulas, all the signsare determined by the least significant two bits of a , b and m . The gcd algorithm works with two non-negative in-tegers a and b , where multiples of the smaller one issubtracted from the larger. To compute the Jacobisymbol we maintain these additional state variables: e ∈ Z Current sign is ( − e α ∈ Z Least significant bits of aβ ∈ Z Least significant bits of bd ∈ Z Index of denominatorThe value of d is one if the most recent reduc-tion subtracted b from a , and zero if it subtracted a from b . We collect these four variables as the state S = ( e, α, β, d ). The state is updated by the func-tion jupdate , Algorithm 2. Since the inputs of thisfunction are nine bits, and the outputs are six bits,it’s clear it can be implemented using a lookup tableconsisting of 2 six-bit entries, which fits in 512 bytesif entries are padded to byte boundaries. Algorithm 3 extends the generic left-to-right gcd algorithm to compute the Jacobi symbol. The mainloop of this algorithm differs from Algorithm 1 onlyby the calls to jupdate for each reduction step.
Let a and b denote the original inputs to Algo-rithm 3. Since the reduction steps and the stop con-dition are the same as in Algorithm 1, it terminates One quarter of the entries in this table corresponds to in-valid inputs, since at least one of α and β is always odd. If wealso note that the value of d is needed only when α = β = 3,the state can be encoded into only 26 values, and then thetable can be compacted to only 208 entries. S (cid:48) ← jupdate ( S, d (cid:48) , m )In: d (cid:48) ∈ Z , m ∈ Z , S = ( e, α, β, d )1 if d (cid:54) = d (cid:48) and both α and β are odd2 e ← e + ( α − β − / // Reciprocity3 d ← d (cid:48) if d = 15 if β = 26 e ← e + m ( α − / m ( m − / α ← α − mβ else if α = 210 e ← e + m ( β − / m ( m − / β ← β − mα return S (cid:48) = ( e, α, β, d )Algorithm 2: Updating the state of the Jacobi symbolcomputation. j ← jacobi ( a, b )In: a, b > b oddOut: The Jacobi symbol ( a | b )State: S = ( e, α, β, d )1 S ← (0 , a mod 4 , b mod 4 , repeat if a ≥ b a ← a − mb , with 1 ≤ m ≤ (cid:98) a/b (cid:99) S ← jupdate ( S, , m mod 4)6 if a = 07 return [ b = 1]( − e else b ← b − ma , with 1 ≤ m ≤ (cid:98) b/a (cid:99) S ← jupdate ( S, , m mod 4)11 if b = 012 return [ a = 1]( − e Algorithm 3: The algorithm for computing the Jacobisymbol.4fter a finite number of steps. We now prove that itreturns ( a | b ).Algorithm 3 clearly maintains α = a mod 4 and β = b mod 4. We next prove that the following holdsat the start of each iteration:If d = 0 we have( a | b ) = ( − e × (cid:40) ( b | a ) α odd( a | b ) α even (1)and if d = 1 we have( a | b ) = ( − e × (cid:40) ( a | b ) β odd( b | a ) β even (2)This clearly holds at the start of the loop, to provethat it is maintained, consider the case a ≥ b (the case a < b is analogous). Let a , b (unchanged) and S =( e, α, β, d ) denote the values of the variables beforeline 4. There are a couple of different cases, dependingon the state: • If β is odd and either α is even or d = 1, then( a | b ) = ( − e ( a | b ) = ( − e ( a − mb | b ). • If α and β are both odd and d = 0, then ( a | b ) =( − e ( b | a ) = ( − e +( a − b − / ( a − mb | b ). • If β = 0 (mod 4), then ( a | b ) = ( − e ( b | a ) =( − e ( b | a − mb ). • If β = 2 (mod 4), then ( a | b ) = ( − e ( b | a ) =( − e + m ( a − / m ( m − / ( b | a − mb ).In each case, the call to jupdate makes the appropri-ate change to e , and Eq. (2) holds after the iteration. The algorithm was implemented in gmp -5.1.0, re-leased 2012. In benchmarks at the time, comparingthe old binary algorithm to the new Jacobi exten-sion of Lehmer’s gcd algorithm (both O ( n )), thenew algorithm computed Jacobi symbols about twiceas fast for moderate size numbers (around 2000 bits),and 10 times faster for numbers of size of 500000 bits.For even larger numbers, the Jacobi extension of sub-quadratic gcd brought even greater speedups. Acknowledgments
The author wishes to thank Richard Brent for pro-viding valuable background material and for encour-aging the writing of this paper.
References [1] Eric Bach and Jeffrey Shallit.
Algorithmic Num-ber Theory, Vol. 1: Efficient Algorithms . Founda-tions of Computing. MIT Press, 1996.[2] Richard P. Brent and Paul Zimmermann. AnO(M(n) log n) algorithm for the Jacobi symbol.In
ANTS-IX , volume 6197 of
LCNS . Springer-Verlag, July 2010. See http://arxiv.org/abs/1004.2091 .[3] Henri Cohen.
A course in computational algebraicnumber theory . Springer, 1996.[4] Torbj¨orn Granlund. GNU multiple precisionarithmetic library. http://gmplib.org/ .[5] Donald E. Knuth. The analysis of algo-rithms.
Actes du Congr´es International desMath´ematiciens , pages 269–274, 1970.[6] Niels M¨oller. On Sch¨onhage’s algorithmand subquadratic integer gcd computa-tion.
Mathematics of Computation , 2008.See .[7] Arnold Sch¨onhage. Schnelle Berechnung von Ket-tenbruchentwicklungen.
Acta Informatica , 1:139–144, 1971.[8] Arnold Sch¨onhage. Email to Richard P. Brent,2009. Excerpt from 1987 notes.[9] Damien Stehl´e and Paul Zimmermann. A bi-nary recursive GCD algorithm. In D. Buell, edi-tor,
ANTS-VI , volume 3076 of