Polynomial modular product verification and its implications
PPolynomial modular product verification and its implications
Pascal Giorgi Bruno Grenet Armelle Perret du CrayLIRMM, Univ. Montpellier, CNRSMontpellier, France { pascal.giorgi,bruno.grenet,armelle.perret-du-cray } @lirmm.frJanuary 7, 2021 Abstract
Polynomial multiplication is known to have quasi-linear complexity in both the dense and the sparsecases. Yet no truly linear algorithm has been given in any case for the problem, and it is not clear whether itis even possible. This leaves room for a better algorithm for the simpler problem of verifying a polynomialproduct. While finding deterministic methods seems out of reach, there exist probabilistic algorithms for theproblem that are optimal in number of algebraic operations.We study the generalization of the problem to the verification of a polynomial product modulo a sparsedivisor. We investigate its bit complexity for both dense and sparse multiplicands. In particular, we are ableto show the primacy of the verification over modular multiplication when the divisor has a constant sparsityand a second highest-degree monomial that is not too large. We use these results to obtain new bounds onthe bit complexity of the standard polynomial multiplication verification. In particular, we provide optimalalgorithms in the bit complexity model in the dense case by improving a result of Kaminski and develop thefirst quasi-optimal algorithm for verifying sparse polynomial product.
Polynomials are one of the most basic objects in computer algebra and the study of fast polynomial operationsremains a very challenging task. Polynomials can be represented using either the dense representation, thatstores all the coefficients in a vector, or the more compact sparse representation, that only stores nonzeromonomials. Depending on which representation is chosen the problems might have a very different flavorleading to two very separate lines of research.Polynomial multiplication is the most noticeable problem that attracted a lot of attention since manydecades, culminating nowadays with quasi-optimal algorithms [
2, 15 ] . Although such algorithms are reallyefficient in theory and in practice, there are not yet optimal and they often rely on complex approaches thatcan be error prone. Therefore, looking for rather simple procedure to verify the correctness of polynomialproducts is of great interest. From a theoretical perspective, the goal is then to provide asymptotically fasteralgorithms than those for multiplying polynomials, ultimately seeking for an optimal algorithm. In practicethe objective is barely to find simpler and faster procedures that reveal easier to trust.In this work, we intend to present the most recent advances in verifying polynomial products in both thedense and sparse case, to extend such results to either optimal algorithms or to more reliable solutions inpractice. Finally, we extend the problem to some specific modular multiplication of polynomials which seemsto not having been explored yet. Dense polynomial multiplication
We know from the early 60’s that dense polynomial arithmetic is sub-quadratic, and that it can even be quasi-linear when the so-called FFT applies [ ] . It has been more than twodecades later that Cantor and Kaltofen [ ] provide a quasi-linear algorithm without any assumption on thepolynomial algebra. They show that two dense polynomials of degree less than n over an algebra (cid:65) can bemultiplied with (cid:79) ( n log n log log n ) operations in (cid:65) . In regards to the bit complexity model, the operations inthe base ring (cid:65) cannot count (cid:79) ( ) anymore, and the previous algorithms may not lead to the best complexityestimates for specific domains such as (cid:65) = (cid:70) q or (cid:65) = (cid:90) . There, the use of Kronecker substitution togetherwith fast integer multiplication turns out to be the best alternative [
8, Section 8.4 ] . It has been showed byHarvey and van der Hoeven in [ ] that one can reach a bit complexity of (cid:79) ( n log q log ( n log q ) log ∗ ( n ) ) for poly-nomial multiplication over (cid:70) q [ X ] for any prime field (cid:70) q . We shall mention that very recently, such complexityhave been further improved to (cid:79) ( n log q log ( n log q )) bit operations [ ] under some mild hypothesis. For1 a r X i v : . [ c s . S C ] J a n olynomials with integer coefficients bounded by an integer C , the complexity falls down to multiplying twointegers of bit length (cid:79) ( n ( log n + log C )) which gives (cid:79) ( n ( log n + log C log n + log C log log C ) = ˜ (cid:79) ( n log C ) when we assume that n -bits integer multiplication complexity is I ( n ) = (cid:79) ( n log n ) [ ] . For clarity in thepresentation, we will often use M ( n ) as the number of operations in (cid:82) required to multiply two dense poly-nomials of size n , while M q ( n ) will denote the bit complexity for such multiplication over a prime field (cid:70) q . Sparse polynomial multiplication
In the sparse representation, a polynomial F = (cid:80) ni = f i X i ∈ (cid:82) [ X ] isexpressed as a list of pairs ( e i , f e i ) such that all the f e i are nonzero. We denote by F the sparsity of thepolynomial F which corresponds to its number of nonzero coefficients. Let F be a polynomial of degree n ,and log C be a bound on bit length of its coefficients. Then, the size of the sparse representation of F is (cid:79) ( F ( log n + log C )) bits. Contrary to the dense case, note that fast algorithms for sparse polynomials musthave a (poly-)logarithmic dependency on the degree, and that the size of the output does not exclusivelydepend on the size of the inputs. Indeed, the product of two polynomials F and G has at most F G nonzerocoefficients. But it may have as few as 2 nonzero coefficients, as shown by the following example. Example . Let F = X + X + G = X + X + H = X − X +
2. Then
F G = X + X + X + X + X + X + X + X + F H = X + (cid:79) ( ) [
3, 25 ] . The classical approach for computing the product of twopolynomials of sparsity T is to generate all the T possible monomials, and to sort them and merge thoseof equal degree to collect the monomials of the result. Using radix sort, this algorithm takes for instance (cid:79) ( T ( I ( log C ) + log n )) bit operations over (cid:90) and it exhibits a T factor in the space complexity, whatever thenumber of terms in the result. Many improvements have been proposed to reduce this space complexity, toextend the approach to multivariate polynomials, and to provide fast implementations in practice [
19, 22, 23 ] .Yet, none of these results reduces the T factor in the time complexity. In general, no complexity improvementis expected as the output polynomial may have as many as T nonzero coefficients. However, this numberof nonzero coefficients can be overestimated, giving the opportunity for output-sensitive algorithms. Suchalgorithms have first been proposed for special cases. Notably, when the output size is known to be small dueto sufficiently structured inputs [ ] , especially in the multivariate case [
17, 16 ] , or when the support of theoutput is known in advance [ ] .Output-sensitive multiplication algorithms try to take into account the two reasons that can decrease thesparsity of the product. The first one is exponent collisions, while the second one occurs when these collisionsimply some coefficient cancellations. The exponent collision is captured by the sumset of the exponents of F = (cid:80) Ti = f i X α i and G = (cid:80) Tj = g j X β j , that is { α i + β j : 1 ≤ i , j ≤ T } . Arnold and Roche call this set the structural support of the product F G and its size the structural sparsity [ ] . If H = F G , then the structuralsparsity S of the product F G satisfies 2 ≤ H ≤ S ≤ T . Observe that although H and S can be close, theirdifference can reach (cid:79) ( T ) as shown by the next example. Example . Let F = (cid:80) T − i = X i , G = (cid:80) T − i = ( X iT + − X iT ) and H = F G . We have F = T , G = T and thestructural sparsity of F G is T + H = X T − (cid:79) ( S log n ) operations in the RAM model with (cid:79) ( log ( C n )) word size [ ] , where log ( C ) bounds the bitsize of the coefficients. Arnold and Roche improve thiscomplexity to ˜ (cid:79) ( S log n + H log C ) bit operations for polynomials with both positive and negative integercoefficients [ ] . A recent algorithm of Nakos avoids the dependency on the structural sparsity for the caseof integer polynomials [ ] , using the same word RAM model as Cole and Hariharan. Unfortunately, the bitcomplexity of this algorithm, ˜ (cid:79) (( T log + H log n ) log ( C n ) + log n ) , is not quasi-linear. More recently, wepropose in [ ] the first quasi-optimal algorithm for sparse polynomial multiplication yielding a bit complexityof ˜ (cid:79) ( T (cid:48) ( log n + log C )) where T (cid:48) = max ( T , H ) . More precisely, taking k = T (cid:48) ( log n + log c ) which is the bitlength of the input and output, we are able to reach a bit complexity of (cid:79) ( k log k log T ( log T + log log k )) . Verification of polynomial products
Considering the non-optimality of polynomial multiplication in bothrepresentations, it is quite natural to ask whether it is rather a simple task or not to verify an instance of theproblem. More formally, given three polynomials F , G and H , can we assert that H is equal to the productof F and G in less operations than computing the product itself? Furthermore, we want such procedure to Here, and throughout the article, ˜ (cid:79) ( f ( n )) denotes (cid:79) ( f ( n ) log k ( f ( n ))) for some constant k >
2e as simple as possible and to not rely on polynomial multiplication if possible. Unfortunately, doing thiswith a deterministic procedure is not yet known, but using probabilistic algorithms lead to positive answers asshown by several papers [
6, 29, 32, 10 ] . Here and henceforth, polynomials are assumed to have coefficientsin an integral domain (cid:82) , rather than in a more general algebra.For dense polynomials this verification amounts to choosing a random element α in a finite subset of (cid:82) and to assert that H ( α ) − F ( α ) G ( α ) is zero. In that case, the complexity for the verification becomes (cid:79) ( n ) operations in (cid:82) , which is optimal. Of course, the probability of error is less than one as soon as (cid:82) has morethan n elements. If this not the case, for instance when (cid:82) = (cid:70) , it is not desirable to choose α in a sufficientlarge extension of (cid:82) to have (cid:79) ( n ) elements. The latter would require an extension of degree (cid:79) ( log n ) and itwould raise the complexity to (cid:79) ( n M ( log n )) . This is actually larger than the complexity M ( n ) of computingthe product. In [ ] , Gamin’s solved the latter problem by replacing the evaluation, that corresponds tocomputing within (cid:82) [ X ] / ( X − α ) , by doing a polynomial multiplication within (cid:82) [ X ] / ( X i − ) for a randominteger i < n . More precisely, by choosing i = (cid:79) ( n − e ) for some 0 < e < /
2, his verification algorithm runs in (cid:79) ( n ) operations in (cid:82) , whatever its size, with a probability of error bounded away from one. While the resultsounds optimal from a theoretical perspective, it might be mitigated for practical applications as it verifiespolynomial multiplication of degree n by doing multiplication of polynomials of degree (cid:79) ( n − e ) .All these results remain valid under the bit complexity model, but the obtained complexity might not beoptimal. For polynomials over (cid:70) q [ X ] , both approaches using products in (cid:70) q [ X ] / ( X − α ) or in (cid:70) q [ X ] / ( X i − ) lead to a bit complexity of (cid:79) ( n I ( log q )) = (cid:79) ( n log q log log q ) . While being non optimal, they remainhowever asymptotically faster than the computation of the initial products by a factor log n / log log q . Actually,Kaminski’s approach has a better bit complexity than the standard method and can even yield a linear bitcomplexity in favorable cases. For polynomials over (cid:90) [ X ] , the result is more surprising as it is possible toreach an optimal bit complexity of (cid:79) ( n log C ) for any input. This result should be attributed to Kaminski, ashe provided in [ ] all the necessary materials while not noticing the result explicitly. It seems surprising,but we haven’t found any references advertising such result. Thus, we propose to provide the description ofthose optimal verifications of polynomial products.For sparse polynomials, the verification of products remains less studied. It is misleading to think thatusing polynomial evaluation is satisfactory. Assuming that only T coefficients are nonzero, sparse polynomialevaluation is not quasi-linear in the input size (cid:79) ( T ( log n + log C )) . Indeed, computing α n requires (cid:79) ( log n ) operations in (cid:82) which implies a complexity of (cid:79) ( T log n ) operations in (cid:82) when applied to the T nonzeromonomials. Since one needs to use a subset of (cid:82) of size at least n to ensure a nonzero probability of success,this implies that the bit complexity is at least (cid:79) ( T log n ) . Using similar ideas as Kaminski’s [ ] , we proposedrecently in [ ] to verify sparse polynomial identities by doing the computation in (cid:82) [ X ] / ( X p − ) for a randomprime p . In particular, we prove that choosing p = (cid:79) ( T log n ) ensures that ( H − F G ) mod ( X p − ) = H − F G = (cid:79) ( T ( log n + log C )) bit operations.Another important measure for randomized verification algorithms is the probability of failure. All theknown verification algorithms are True-biased one-sided Monte Carlo algorithms. This means that they al-ways return True if H = F G and return False with probability at least 1 − ε otherwise. Given an algorithm witherror probability at most ε , we can attain any smaller probability of error τ by repeating (cid:79) ( log τ log ε ) rounds of thealgorithm. This shows that the complexity of the algorithm is actually dependent on the target error probabil-ity. In our results, we always explicitly indicate this dependency. We can distinguish several regimes of valuesfor the error probability: the constant regime ε = (cid:79) ( ) , the inverse polynomial regime ε = / n (cid:79) ( ) and theinverse exponential regime ε = / (cid:79) ( n ) , where n is the input degree. Given an algorithm with constant errorprobability, one can attain any smaller constant probability using a constant number of rounds. This keepsthe same asymptotic complexity. 
The same is true for two probabilities inside the inverse polynomial regime.To get to the inverse polynomial regime from the constant regime, the number of rounds must be (cid:79) ( log n ) ,slightly increasing the asymptotic complexity. The inverse exponential regime can then be attained using apolynomial number of rounds. In our context of linear and quasi-linear algorithms, the inverse exponentialregime is not attainable in general. The best known verification algorithms have linear bit complexity in theinverse polynomial regime. Contributions
As an extension of our prior work [ ] , we propose to study more generally the verification ofpolynomial multiplication in (cid:82) [ X ] / P where P is a monic sparse polynomial. In the dense case, this generalizesa work from one of the authors on the probabilistic verification on polynomial middle product [ ] . By reusingour modular product’s verification, we show that we can address the difficulty of Kaminski’s approach thatverifies polynomial products using products of roughly the same degree, more than (cid:112) n . In particular, we3how that we can avoid the dependency on polynomial multiplication in every cases. When dealing withfinite field arithmetic it is quite common to rely on irreducible polynomials that are sparse [ ] . Therefore,having the possibility to verify multiplication over finite fields in less operations than computing the productseems of great interest. In particular, we show that the verification of products in (cid:70) q s can be done in (cid:79) ( s P ) operations in (cid:70) q where P ∈ (cid:70) q [ X ] is the monic irreducible polynomial of degree s used to define (cid:70) q s . Clearlyfor irreducible polynomial with constant sparsity, as it is often the case over (cid:70) [ ] and more generally (cid:70) q [ ] , this offers an optimal verification procedure. Finally, for sparse polynomials, this work extendsour prior result for (cid:82) [ X ] / ( X p − ) in [ ] that was of great importance to achieve the first quasi-linear timealgorithm for sparse polynomial multiplication. We hope our new insight on this problem will leverage otherfast algorithms for sparse polynomial operations, especially for the division problem [ ] .All our techniques and results extend to the more general problem of verifying a polynomial identity of theform (cid:80) i F i G i mod P =
0, where the sum may have an arbitrary number of terms. It would be interesting to beable to extend these results to more general polynomial identities. As a very simple example, given F , F , F , H and P ∈ (cid:82) [ X ] for some integral domain (cid:82) , what is the complexity of the verification of H = F F F mod P ?Obviously if the inputs are dense polynomials, the computation of F F F mod P can be done in quasi-lineartime. But the question is to design an algorithm than runs faster than performing the computation. In thesparse case, the computation may increase the input size quite a lot and even a quasi-linear time algorithm islacking. More generally, the problem is to verify identities of the form (cid:80) i (cid:81) j (cid:80) k · · · (cid:81) (cid:96) F i , j , k ,..., (cid:96) mod P = Modular Polynomial Identity Testing (Modular PIT) problem. The standardPolynomial Identity Testing (PIT) problem takes as input an arithmetic circuit, or equivalently a straight-lineprogram, and consists in deciding whether the polynomial it represents is zero. In this extension, a polynomial P is also given as input and the question is whether the polynomial represented by the circuit is divisible by P . The standard PIT problem admits polynomial-time, and even quasi-linear-time, randomized algorithm. Avery important open question is whether it also admits a polynomial-time deterministic algorithm. For theModular PIT problem, the question is already to design efficient randomized algorithms. If the dense case,the challenge is to obtain faster algorithms than performing the product, ideally linear-time algorithms. Inthe sparse case, it is not even known how to solve the problem in randomized polynomial-time. Our resultsmay be seen as a first step towards this goal. Outline
We start our work in Section 2 by introducing all the technical materials that serve to demonstrateour main results. Then, Section 3 is devoted to the study of the evaluation of modular multiplication. In par-ticular, we provide algorithms and their thorough analysis for evaluating ( F G ) mod P on α without computing F G mod P . The results of that section serve to derive efficient algorithms in Section 4 for the verification ofmodular multiplication of polynomials. Finally, we present in Section 5 the more general results on the ver-ification of classical polynomial multiplication. In particular, we extend the work of Kaminski [ ] for thedense case with a thorough analysis of its bit complexity that enables to reach optimal verification. We alsogive a more detailed presentation of our first quasi-optimal algorithm for the sparse case that appears in [ ] . Let Q ∈ (cid:82) [ X ] be a degree- n polynomial. We denote its coefficient of degree i by q i . The sparsity of Q is itsnumber of nonzero monomials and is denoted by Q . The support of Q is the set supp ( Q ) = { i : q i (cid:54) = } . If Q is a polynomial over (cid:90) , we denote by (cid:107) Q (cid:107) ∞ its norm, defined as max ≤ i ≤ n | q i | . We denote by log ( · ) the base-2logarithm and by ln ( · ) the natural logarithm. We also use log b ( · ) to denote the base- b logarithm defined bylog b ( x ) = log x log b .We work in this paper with dense and sparse polynomials. A dense polynomial is represented as the vectorof its coefficients, which has size n + n polynomial. A sparse polynomial is represented by thelist of its nonzero monomials. We consider that we work, for sparse polynomials, with an abstract structure of sparse vector . In practice, this can be implemented by several data structures, depending on the operations thatneed to be performed. A standard choice in sparse polynomial arithmetic is the use of heaps [
19, 22, 23 ] . Toget better complexities, we might resorting to van Emde Boas Trees [
5, Chapter 20 ] as in Section 3.3.We alsouse sparse vectors in some algorithms to represent data which are not directly polynomials. The underlyingdata structure is the same as for sparse polynomials. 4 omplexity of dense polynomial multiplications We denote by M ( n ) the number of ring operationsneeded to compute a product of degree- n dense polynomials over an integral domain. We can take M ( n ) = (cid:79) ( n log n log log n ) [ ] . We denote by M q ( n ) the bit complexity of the multiplication of two degree- n densepolynomials over a finite field (cid:70) q . The best known bounds on M q ( n ) are (cid:79) ( n log q log ( n log q ) log ∗ ( n log q ) ) [ ] unconditionally and (cid:79) ( n log q log ( n log q )) assuming the existence of some Linnik constant [ ] . To simplifythe notation, we assume the existence of this Linnik constant. The cost of multiplying two elements in anextension field (cid:70) q d is the cost of degree- d polynomial multiplications, that is (cid:79) ( M q ( d )) . Let F , G ∈ (cid:90) [ X ] of degree n and norm C . Their product has norm at most nC . To compute F G , we can evaluate both F and G on some power of 2 larger than nC , multiply the resulting integers (that have size n log ( nC ) ), andread the coefficients of F G directly on the output integer. Let I ( m ) = (cid:79) ( m log m ) denote the bit complexityof multiplying two m -bit integers [ ] . Then the bit complexity of multiplying F and G is I ( n log ( nC )) = (cid:79) ( n log n + n log n log C + n log C log log C ) . Complexity of polynomial evaluation
The evaluation of a dense degree- n polynomial F ∈ (cid:82) [ X ] on apoint α ∈ (cid:82) requires (cid:79) ( n ) operations in (cid:82) using for instance Horner scheme. If α lies in an extensionring (cid:82) ext of (cid:82) , the evaluation requires (cid:79) ( n ) operations in (cid:82) ext . If F has coefficients in a finite field (cid:70) q , thistranslates directly to a linear number of operations in (cid:70) q . Now if F has coefficients in (cid:90) , one must take intoaccount the growth of the integers during the computation. Using a divide-and-conquer approach to usebalanced integer multiplications, the cost of the evaluation is (cid:79) ( I ( n log C ) log ( n log C )) bit operations where C = max ( | α | , (cid:107) F (cid:107) ∞ ) . We note that this cost is quasi-linear in the worst case output size while using Hornerscheme would have been quadratic.To evaluate a sparse polynomial F ∈ (cid:82) [ X ] on α ∈ (cid:82) , we compute the relevant powers of α and thenperform F multiplications and additions in (cid:82) . Computing each power independently yields (cid:79) ( F log n ) operations in (cid:82) . Using simultaneous exponentiation [ ] , the cost is reduced to (cid:79) ( log n + F log n / log log n ) operations in (cid:82) . Again, this directly translates to operations in (cid:70) q if (cid:82) = (cid:70) q . For polynomials with integercoefficients, the growth is much more severe than in the dense case. Indeed, α n has (cid:79) ( n log | α | ) bits. Thisimplies that the bit complexity is at least linear in n which is exponentially larger than the input size. Thecost is actually not better than with dense polynomials. Reducing a polynomial modulo P changes its norm and sparsity. We provide bounds on these growths. Theyrely on the gap between the degree of P and its second degree , that is the degree of its second highest-degreemonomial. Definition 1.
Let P = X n + (cid:80) ki = p i X i for k < n and p k (cid:54) =
0. The second degree of P is the integer k . The gapparameter γ of P is γ = n ( n − k ) .In particular, the second degree of P is ( − γ ) n . The parameter γ is between 0 and 1. If γ is close to0, the polynomial actually has no gap, while γ = aX n + b . We note that giventhis definition, γ is always upper bounded by n . Polynomials with a large gap are also known as sedimentary polynomials [ ] . A polynomial is said t -sedimentary if it is of the form X n + H where deg ( H ) = t . A t -sedimentary polynomial is a polynomial with gap parameter ( n − t ) / n and conversely a monic polynomialwith a gap parameter γ is ( − γ ) n -sedimentary.The norm of the product of two polynomials is classically related to their norms and degrees. This can beslightly refined using the sparsities instead of the degrees. Lemma 2.1.
Let F and G be two polynomials over (cid:90) . Then (cid:107)
F G (cid:107) ∞ ≤ min ( F , G ) (cid:107) F (cid:107) ∞ (cid:107) G (cid:107) ∞ .Proof. Let H = (cid:80) k h k X k be the product of F and G . Then h k = (cid:80) i + j = k f i g j . Let T = min ( F , G ) . Thenthe sum to define h k has size of most T . Since | f i | ≤ (cid:107) F (cid:107) ∞ for all i and | g j | ≤ (cid:107) G (cid:107) ∞ for all j by definition, | h k | ≤ T (cid:107) F (cid:107) ∞ (cid:107) G (cid:107) ∞ , whence the result.The modular reduction of polynomials has a bigger impact on the norm. It is actually related to severalparameters such as the gap parameter of the divisor and the difference of the degrees. The following exampleshows a large increase in the norm, as well as a densification of the result. Example . Let P = X + X + X − X + X + Q = X + X + X − X − X + X + X .Here the gap parameter of P is γ = , P =
6, Q = (cid:107) P (cid:107) ∞ = (cid:107) Q (cid:107) ∞ =
8. The polynomial Q mod P has degree 79, sparsity 53 and norm 11912. 5he following proposition bounds the growth on the different parameters of the polynomial after a mod-ular reduction. Proposition 2.2.
Let Q be a sparse polynomial of degree at most n − + k and P a monic polynomial of degreen with P ≥ . The polynomial Q mod P has at most Q ( P − ) (cid:100) k γ n (cid:101) monomials. If Q and P are defined over (cid:90) , (cid:107) Q mod P (cid:107) ∞ ≤ (cid:107) Q (cid:107) ∞ ( P (cid:107) P (cid:107) ∞ ) (cid:100) k γ n (cid:101) .Proof. We analyse the growth of the norm and the sparsity while performing the euclidean division.Instead of following the classical quadratic algorithm, we first reduce once all the monomials of Q withdegree at least n to obtain a new dividend. We repeat this process until the dividend has degree less than n . Letus define the sequence ( Q [ i ] ) i by Q [ ] = Q and Q [ i + ] = ( Q [ i ] mod X n ) + ( Q [ i ] quo X n )( X n − P ) . Then Q [ i ] mod P = Q mod P for all i . Since deg ( Q [ i ] quo X n ) = deg ( Q [ i ] ) − n and deg ( X n − P ) ≤ ( − γ ) n , deg ( Q [ i + ] ) ≤ max ( n −
1, deg ( Q [ i ] ) − γ n ) , whence deg ( Q [ i ] ) ≤ max ( n −
1, deg ( Q ) − i γ n ) .Also, Q [ i + ] is at most Q [ i ] ( P − ) , thus Q [ i ] ≤ Q ( P − ) i . Finally, (cid:107) Q [ i + ] (cid:107) ∞ ≤ (cid:107) Q [ i ] (cid:107) ∞ ( + min ( Q [ i ] , P − ) (cid:107) P (cid:107) ∞ ) .Therefore, (cid:107) Q [ i ] (cid:107) ∞ ≤ ( P (cid:107) P (cid:107) ∞ ) i (cid:107) Q (cid:107) ∞ .Since deg ( Q [ i ] ) ≤ n + k − − i γ n , deg ( Q [ i ] ) < n if i = (cid:100) k γ n (cid:101) . This implies that Q [ i ] = Q mod P . We collect in this section some useful results to produce random prime numbers and random irreduciblepolynomials over finite fields.
Proposition 2.3 ( [ ] ) . If λ ≥ , there are at least λ/ ln λ prime numbers in [ λ , 2 λ ] . Using this proposition together with Miller-Rabin probability test, we can produce integers that are primewith good probability [ ] . Proposition 2.4.
There exists an algorithm R ANDOM P RIME ( λ , ε ) that returns an integer q in [ λ , 2 λ ] , such thatq is prime with probability at least − ε . It requires (cid:79) ( log ( ε ) log ( λ ) I ( log λ ) log log λ ) bit operations. Proposition 2.5.
Let H ∈ (cid:82) [ X ] be a nonzero polynomial of degree at most n and sparsity at most T , < ε < and λ = max ( ε T ln n ) . With probability at least − ε , R ANDOM P RIME ( λ , ε ) returns a prime number p suchthat H mod X p − (cid:54) = .Proof. It is sufficient, for H mod X p − e of H that is not congruentto any other exponents e j modulo p . In other words, it is sufficient that p does not divide any of the T − δ j = e j − e .Noting that δ j ≤ n , the number of primes in [ λ , 2 λ ] that divide at least one δ j is at most ( T − ) ln n ln λ . Sincethere exists λ/ ln λ primes in this interval, the probability that a prime randomly chosen from it divides atleast one δ j is at most ε/
2. R
ANDOM P RIME ( λ , ε/ ) returns a prime in [ λ , 2 λ ] with probability at least 1 − ε/ Proposition 2.6.
Let H ∈ (cid:90) [ X ] be a nonzero polynomial, < ε < and λ ≥ max ( ε ln (cid:107) H (cid:107) ∞ ) . Then withprobability at least − ε , R ANDOM P RIME ( λ , ε ) returns a prime q such that H mod q (cid:54) = .Proof. Let h i be a nonzero coefficient of H , a random prime from [ λ , 2 λ ] divides h i with probability at most ln (cid:107) H (cid:107) ∞ /λ ≤ ε/
2. Since R
ANDOM P RIME ( λ , ε/ ) returns a prime in [ λ , 2 λ ] with probability at least 1 − ε/ Proposition 2.7 ( [
30, Chapter 19 ] ) . The number of irreducible monic polynomial of degree d over a field (cid:70) q isbetween q d d and q d d . Proposition 2.8 ( [
30, Chapter 20 ] ) . There exists an algorithm that, given a finite field (cid:70) q , an integer d and < ε < , computes a degree-d polynomial in (cid:70) q [ X ] that is irreducible with probability at least − ε . It requires (cid:79) ( log ( ε ) d M ( d )( log q + log log d )) operations in (cid:70) q or (cid:79) ( log ( ε ) d log q ) operations in (cid:70) q if using only naivepolynomial multiplications. Remark 2.9.
Shoup [ ] presents Las Vegas algorithms for Propositions 2.4 and 2.8. We consider Monte Carloversions of his algorithms. Also, he analyses the complexities with naive algorithms. Our complexity estimatesuse fast integer and polynomial arithmetic. Evaluation for polynomial multiplication in a quotient ring
As seen earlier, the verification of polynomial multiplication mainly relies on the evaluation of the polynomialidentity at a random point. In this section we present algorithms to efficiently compute the evaluation of amodular product ( F G ) mod P on a point α , without computing ( F G ) mod P . There, the modulus P is alwaysconsidered as a sparse polynomial, while F and G can be either dense or sparse.Section 3.1 describes our method in the simpler case where P is a binomial. We obtain linear-time evalu-ations, whether F and G are dense or sparse. Section 3.2 generalizes the method to the product of two densepolynomials modulo a sparse modulus, and Section 3.3 presents the case of a sparse modular product. Let us first present our method to evaluate a modular product
F G mod P where P = X n −
1. This specialcase illustrates our more general method. It also has its own interest since it is used as the main tool for theverification of a product of two polynomials in Section 5, either for dense or sparse representation.We first describe the algorithm for dense polynomials F and G . Theorem 3.1.
Let F and G be two polynomials in (cid:82) [ X ] of degrees less than n and α ∈ (cid:82) . The polynomial ( F G ) mod X n − can be evaluates on α using (cid:79) ( n ) operations in (cid:82) .Proof. Let H = F G and M = H mod X n −
1. We denote by f i (resp. g i , h i , m i ) the coefficient of degree i of thepolynomial F (resp. G , H , M ). Let also (cid:126) g = ( g , . . . , g n − ) T , (cid:126) h = ( h , . . . , h n − ) T and (cid:126) m = ( m , . . . , m n − ) T .It is a well-known fact that considering F as fixed, the multiplication by F is a linear map described by aToeplitz matrix. More precisely, we have (cid:126) h = T F (cid:126) g where T F = f f f ... ... f n − . . . . . . f f n − f ... ... f n − .Since M = H mod X n − m i = h i + h i + n − for 0 ≤ i < n − m n − = h n − . Therefore, (cid:126) m = C F (cid:126) g where C F is the circulant matrix C F = f f n − · · · f f f · · · f ... ... ... f n − f n − · · · f .On the other hand, evaluating M on α corresponds to the inner product (cid:126)α n (cid:126) m where (cid:126)α n = ( α , . . . , α n − ) .Therefore, our aim is to compute (cid:126)α n C F (cid:126) g . The standard way to perform this evaluation corresponds to firstcomputing (cid:126) m = C F (cid:126) g and then (cid:126)α n (cid:126) m . As noticed by Giorgi [ ] , the bracketing ( (cid:126)α n C F ) (cid:126) g yields a faster algorithmdue to the structure of the matrix C F .Let (cid:126) c = (cid:126)α n C F . Then c j + = (cid:80) n − (cid:96) = α (cid:96) f ( (cid:96) − j − ) mod n = f n − j − + α (cid:80) n − (cid:96) = α (cid:96) f ( (cid:96) − j ) mod n . Since for j > (cid:80) n − (cid:96) = α (cid:96) f ( (cid:96) − j ) mod n = c j − α n − f n − j − , we obtain the recurrence relation (cid:168) c j + = α c j − P ( α ) f n − j − for j ≥ c = F ( α ) (1)where P = X n − c = F ( α ) .It is immediate that exploiting such recurrence relation for computing the evaluation of ( F G ) mod P leadsto a complexity of (cid:79) ( n ) operations in (cid:82) . Indeed, once c and P ( α ) = α n − c j canbe computed sequentially at cost (cid:79) ( ) .For completeness, we provide the full description of this method in Algorithm 3.1.We can actually be more precise on the number of operations required by Algorithm 3.1. In particularwhen α does not lie into (cid:82) but in an extension (cid:82) ext of (cid:82) , we can distinguish between operations in (cid:82) and (cid:82) ext . In the next corollary, we call scalar multiplications those that are multiplications of an element of (cid:82) ext by an element of (cid:82) . The following analysis minimizes the number of non-scalar multiplications.7 lgorithm 1 E VALUATION M ODULO B INOMIAL
Input: F , G ∈ (cid:82) [ X ] with deg ( F ) , deg ( G ) < n , and α ∈ (cid:82) . Output: ( F G mod X n − )( α ) c ← F ( α ) P α ← α n − β ← c g for j = n − do c ← α c − P α f n − j β ← β + c g j return β Corollary 3.2.
Let F and G be two polynomials in (cid:82) [ X ] of degree less than n and α ∈ (cid:82) ext . The polynomial ( F G ) mod X n − can be evaluated on α using n − multiplications and n − additions in (cid:82) ext , and n − scalar multiplications.Proof. We can first compute α , α , . . . , α n using ( n − ) multiplications. Then, F ( α ) can be computed using ( n − ) scalar multiplications and additions, and P ( α ) = α n − c g of β requires one scalar multiplication. Then each iteration of the loop require one multiplication, twoscalar multiplications and two additions. Therefore, the complete evaluation require 3 n − n − n − F usingHorner’s scheme with ( n − ) multiplications, one scalar multiplication and ( n − ) additions. Then α n has to becomputed using at most 2 log n multiplications. This results in 3 n − n − + n multiplicationsand 2 n scalar multiplications. The total number of multiplications (scalar or not) is a bit less.We now turn to the analysis of the algorithm for F and G given in sparse representation. Theorem 3.3.
Let F and G be two sparsely represented polynomials in (cid:82) [ X ] of degrees less than n and α ∈ (cid:82) .The polynomial ( F G ) mod X n − can be evaluated on α using (cid:79) (( F + G ) log n ) operations in (cid:82) .Proof. We use the same notations as in the previous proof. If the support of G is supp ( G ) = { j , . . . , j G − } with j < · · · < j G − , the inner product (cid:126) c (cid:126) g is equal to (cid:80) G − k = c j k g j k . This means that only the G entries c j , . . . , c j G − of (cid:126) c need to be computed. Applying the recurrence relation (1) as many times as necessary, weobtain the new recurrence relation (cid:168) c j k + = α j k + − j k c j k − P ( α ) (cid:80) j k + (cid:96) = j k + α (cid:96) f n − (cid:96) for k ≥ c j = (( X j F ) mod X n − )( α ) . (2)The initial value c j can be computed using (cid:79) ( F log n ) operations in (cid:82) since it needs F exponentiationsof α with exponent bounded by n . Most values of f n − (cid:96) are actually equal to zero since F is sparse.A nonzero coefficient f t of F appears in the definition of c j k + if and only if n − j k + ≤ t < n − j k . Thus, each f t is used exactly once to compute all the c j k ’s. Since for each summand, one needs to compute α (cid:96) for some (cid:96) < n , the total cost for computing all the sums is (cid:79) ( F log n ) operations in (cid:82) . Similarly, the computationof α j k + − j k c j k for all k ∈ [
0, G − ] costs (cid:79) ( G log n ) operations in (cid:82) plus G − (cid:79) ( log n ) -bitintegers to get the exponents. As one operation in (cid:82) requires at least one bit operation, the integer additionsthat costs (cid:79) ( G log n ) bit operations are negligible. The last remaining step is the final inner product whichcosts (cid:79) ( G ) operations in (cid:82) , whence the result.As in the dense case, one can be more precise on the complexity if α liesin an extension (cid:82) ext . In contrary tothe dense case where there is more operations in (cid:82) than in (cid:82) ext , one can note that the number of operationsin (cid:82) is negligible in the sparse case. Corollary 3.4.
Let F and G be two sparsely represented polynomials in (cid:82) [ X ] of degrees less than n and α ∈ (cid:82) ext .The polynomial ( F G ) mod X n − can be evaluated on α using log n + (cid:79) (( F + G ) log n / log log n ) operationsin (cid:82) ext plus G − additions of (cid:79) ( log n ) -bit integers.Proof. We first notice that in the sparse case the operations on α dominate the complexity. These operationsare operations in (cid:82) ext . To improve the complexity estimates, we remark that in the sparse settings we need tocompute α t for several values of t . The computation of c j requires to know F values of α t , more precisely8hose with t = (cid:96) − j mod n for each nonzero coefficient f (cid:96) of F . To apply Equation (2), one needs to compute α t for t = j k + − j k , 1 ≤ k < G , and for t = (cid:96) where f n − (cid:96) (cid:54) =
0. The value α n is also needed to compute P ( α ) .Finally, the inner product requires to compute α t for each nonzero g t . Altogether, one needs α t for at most2 ( F + G ) values of t , each at most n . They can be computed independently using fast exponentiation, usingat most (cid:79) (( F + G ) log n ) multiplications, as it is done in Theorem 3.3. Actually, Yao [ ] shows that thesevalues of α t can be computed simultaneously using only log n + (cid:79) (( F + G ) log n / log log n ) multiplications.Once these α t have been computed, computing c j and the c j k ’s by means of Equation (2), as well as the innerproduct (cid:126) c (cid:126) g , only require (cid:79) ( F + G ) operations. In this section, we extend the previous algorithm to the evaluation of a polynomial
F G mod P where P is anymonic sparse polynomial. We first consider the case where F and G are given in dense representation. Thecase where they are given is sparse representation is postponed to the next section.The algorithm goes along the same lines as the evaluation modulo X n −
1. Let F [ i ] = ( X i F ) mod P . We canrewrite F G mod P = (cid:80) n − i = g i F [ i ] where g i is the coefficient of degree i in G . The evaluation of this equalityon a point α yields the formula ( F G mod P )( α ) = n − (cid:88) i = g i F [ i ] ( α ) . (3)To make use of this formula, we need to be able to efficiently evaluate each F [ i ] on α . Note that consecutive F [ i ] ’s are bound by the recurrence relation F [ i + ] = ( X F [ i ] ) mod P . Since deg ( F [ i ] ) = n − ( X F [ i ] ) mod P = X F [ i ] − f [ i ] n − P where f [ i ] n − is the coefficient of degree n − F [ i ] . Consequently we have the followingrecurrence relation (cid:168) F [ i + ] ( α ) = α F [ i ] ( α ) − f [ i ] n − P ( α ) for i ≥ F [ ] ( α ) = F ( α ) (4)The evaluations of each F [ i ] on α can thus be computed iteratively from F ( α ) , only knowing the coefficient f [ i ] n − of F [ i ] for 0 < i < n − P = X n − f [ i ] n − = f n − − i . In the general case, thecomputation is based on the recurrence relation F [ i + ] = X F [ i ] − f [ i ] n − P , which implies f [ i + ] k = f [ i ] k − − f [ i ] n − p k (5)for 0 < k ≤ n −
1. This allows to compute each f [ i ] n − , starting from the values of f [ ] k for all k . These valuesare given as input since F [ ] = F by definition. Note that since P is a sparse polynomial, Equation (5) actuallyreduces to an equality f [ i + ] k = f [ i ] k − in many cases. Algorithm 2 takes this into account and only performs therequired updates. Algorithm 2 L EADING C OEFFICIENTS
Input:
Two polynomials P and F in (cid:82) [ X ] , with deg ( F ) < deg ( P ) = n and P monic. Output:
The vector [ f [ ] n − , f [ ] n − , . . . , f [ n − ] n − ] , where f [ i ] n − is the coefficient of degree n − F [ i ] = ( X i F ) mod P . V ← [ f n − , f n − , . . . , f ] for i = n − do for k ∈ supp ( P ) such that i < k < n do V [ i + n − k ] ← V [ i + n − k ] − p k V [ i ] return V Lemma 3.5.
Algorithm 2 is correct. It uses (cid:79) ( n P ) operations in (cid:82) .Proof. The number of operations is clear: all operations are performed at Step 4 and it is called (cid:79) ( n P ) times.Note that the external for loop can be stopped as soon as there exists no k ∈ supp ( P ) such that i < k < n . Inother words, i never goes beyond deg ( X n − P ) − i of the external loop,the following property (cid:80) ( i ) holds: V [ j ] = f [ j ] n − for any j ≤ i + V [ j ] = f [ i + ] n − ( j − i ) for j > i + (cid:80) ( − ) holds since it reads V [ j ] = f [ ] n − j − for all j , and F = F [ ] by definition.Suppose that (cid:80) ( i − ) holds. In particular, V [ j ] = f [ j ] n − for j ≤ i . During iteration i , only V [ i + ] to V [ n − ] can be modified so these equalities remain after that iteration. For j > i , V [ j ] = f [ i ] n − ( j − i + ) beforethe iteration by hypothesis. After the iteration, it becomes V [ j ] = f [ i ] n − ( j − i + ) − p n − j + i V [ i ] = f [ i ] n − j + i − − p n − j + i f [ i ] n − .Equation (5) shows that V [ j ] = f [ i + ] n − j + i after Step 4, and (cid:80) ( i ) holds.To conclude, after the last iteration, V [ j ] = f [ j ] n − for all j ≤ n − F G mod P on a point α . In the following algorithm, weassume that α belongs to some extension ring (cid:82) ext of (cid:82) . Our analysis distinguishes between operations in (cid:82) and in (cid:82) ext . Algorithm 3 M ODULAR E VALUATION
Input: P , F , G ∈ (cid:82) [ X ] with deg ( F ) , deg ( G ) < deg ( P ) = n , P monic, and α ∈ (cid:82) ext . Output: ( F G mod P )( α ) V ← [ f [ ] n − , . . . , f [ n − ] n − ] using a call to L EADING C OEFFICIENTS ( P , F ) P α ← P ( α ) F α ← F ( α ) β ← F α g for i = n − do F α ← α F α − V [ i − ] P α β ← β + F α g i return β Theorem 3.6.
Algorithm 3 is correct. It uses (cid:79) ( n P ) operations in (cid:82) and (cid:79) ( n ) operations in (cid:82) ext .Proof. Step 6 relies on Equation (4) to compute F α = F [ i ] ( α ) . Step 7 uses this evaluation together with Equa-tion (3) to correctly compute ( F G mod P )( α ) . The first step requires (cid:79) ( n P ) operations in (cid:82) by Lemma 3.5.(It does not depend on α .) The other steps require (cid:79) ( n ) operations in (cid:82) ext .As before, we notice that the operations in (cid:82) ext are sometimes scalar multiplications, that is multiplicationsof an element of (cid:82) ext by an element of (cid:82) . We provide an analysis that minimizes the number of non-scalarmultiplications. Corollary 3.7.
Let P, F , G and α as in Algorithm 3. Then ( F G ) mod P can be evaluated on α using n − multiplications and ( n − + P ) additions in (cid:82) ext , ( n − + P ) scalar multiplications in (cid:82) ext , and ( n − )( P − ) multiplications and additions in (cid:82) .Proof. We first note that the number of operations performed by Algorithm 2 is at most ( n − )( P − ) multiplications and additions in (cid:82) . In Algorithm 3, we need to evaluate both P and F on α . To minimizethe number of non-scalar multiplications, we first compute α , . . . , α n using n − (cid:82) ext .We can then compute P α using P − P − F α using n − n − β and the for loop require n − n − n − n − (cid:82) ext , ( n − + P ) scalar multiplications in (cid:82) ext , and ( n − + P ) additions in (cid:82) ext .In a different context where the aim is specifically to compute the evaluation with no restriction to the useof polynomial arithmetic one can first compute the polynomial F G mod P and then evaluate it on α ∈ (cid:82) ext .Such method requires (cid:79) ( n log n log log n ) operations in (cid:82) for the polynomial multiplication F × G and divisionby P and (cid:79) ( n ) operations in (cid:82) ext for the evaluation. Thus we see that if P verifies P < log n log log n ourtechnique is more efficient. In this section, we adapt and analyse the previous algorithms for polynomials F and G given in sparse rep-resentation. Our results depend on the difference between the highest and the second highest exponents in P . Recall that the gap parameter γ is a measure of this difference, defined by 1 − γ = n max { k < n : p k (cid:54) = } .10n particular, the second highest exponent with nonzero coefficient in P is ( − γ ) n . Proposition 2.2 givesa relation between the gap parameter and the sparsity of ( F G ) mod P . The potential growth of the sparsityinduced by the reduction modulo P explains the dependency of our results on the gap parameter.As G is sparse, Equation (3) becomes ( F G mod P )( α ) = (cid:88) i ∈ supp ( G ) g i F [ i ] ( α ) , (6)with the same notations as in the previous section. The recurrence relation F [ i + ] = X F [ i ] − f [ i ] n − P still holds,hence F [ i + ] ( α ) = α F [ i ] ( α ) − f [ i ] n − P ( α ) too. The goal now is to efficiently compute F [ i ] ( α ) for all i ∈ supp ( G ) only, not for all indices i . When γ is not close to zero, there are actually few indices i such that f [ i ] n − (cid:54) =
0. Infact, the number of such indices depends on F , P and γ . Let I = { i , . . . , i t } denote this set of indices. Wewill prove in Lemma 3.8 that this set is of size (cid:79) ( F P (cid:100) /γ − (cid:101) ) . We decide to first provide some argumentsand an explicit algorithm to prove this claim.An important remark is that for any 0 ≤ j < n −
1, in particular those verifying j ∈ supp ( G ) , if we assume i to be the largest index in I not larger than j , then Equation (5) implies F [ j ] ( α ) = α j − i F [ i ] ( α ) . (7)Therefore, the recurrence relation given in (4) becomes (cid:168) F [ i k + ] ( α ) = α i k + − i k − ( α F [ i k ] ( α ) − f [ i k ] n − P ( α )) for k ≥ F [ i ] ( α ) = α i F [ ] ( α ) (8)To efficiently use Equations (7) and (8) to perform the evaluation, we need to provide a sparse variant ofAlgorithm 2. It computes a sparse representation of the vector V = [ f [ ] n − , . . . , f [ n − ] n − ] , that is the sparse vector { ( i , f [ i ] n − ) : f [ i ] n − (cid:54) = } .The idea of Algorithm 4 is to mimic Algorithm 2 in the sparse settings. For simplicity of the presentation,we first consider to store V as a sparse vector as it is sufficient to our needs for proving our claims on the sizeof the set I . We will show in Corollary 3.9 that we must require another structure to minimize the complexityattached to data management.The initial nonzero values in V are the nonzero coefficients of F = F [ ] , with V [ i ] = f n − − i if n − i − ∈ supp ( F ) . Let now consider the external loop in Algorithm 2. Iteration i does not require any operation if f [ i ] n − = f [ i + ] k = f [ i ] k − in that case. Therefore, we must loop over indices i suchthat f [ i ] n − is nonzero. For such an index i , the same updates as in Step 4 of Algorithm 2 are required. For k ∈ supp ( P ) , i < k < n , we must perform the update V [ i + n − k ] ← V [ i + n − k ] − p k f [ i ] n − . If V [ i + n − k ] is already nonzero, its value is already stored in V and must be updated. Otherwise, the new value − p k f [ i ] n − must be inserted in V with index i + n − k .It remains to be able to only loop over the indices i such that f [ i ] n − (cid:54) =
0. Let us assume that iteration i hasbeen performed since f [ i ] n − (cid:54) =
0. The proof of Lemma 3.5 shows that V [ i + ] then contains f [ i + ] n − . Thereforein the sparse setting, we know that iteration i + V [ i + ] (cid:54) =
0. Moregenerally, the next index to be considered is the index of the next nonzero entry of V after V [ i ] . Algorithm 4below uses such method for computing all the indices i such that f [ i ] n − is nonzero. Algorithm 4 S PARSE L EADING C OEFFICIENTS
Input: P , F ∈ (cid:82) [ X ] with deg ( F ) < deg ( P ) = n , and P monic Output:
The list { ( i , f [ i ] n − ) : 0 ≤ i < n − f [ i ] n − (cid:54) = } , sorted by increasing values of i . L ← empty list V ← { ( i , f n − − i ) : n − − i ∈ supp ( F ) } (sparse vector) while V is not empty do ( i , v ) ← extract the element of minimal index from V if v (cid:54) = then Add ( i , v ) to the list L for k ∈ supp ( P ) such that i < k < n do V [ i + n − k ] ← V [ i + n − k ] − p k v return L Lemma 3.8.
Algorithm 4 is correct. If the polynomial P has a gap parameter γ , the algorithm uses (cid:79) ( F P (cid:100) /γ − (cid:101) ) operations in (cid:82) and additions of (cid:79) ( log n ) -bits integers. In particular, there are at most F P (cid:100) /γ − (cid:101) indices i such that f [ i ] n − (cid:54) = . roof. As explained above, Algorithm 4 is an adaptation of Algorithm 2 to the sparse settings that only com-putes those f [ i ] n − that are nonzero, using Equations (7) and (8) in place of Equation (4). Instead of consideringall the F i [ n − ] one after the other it only considers those which are not zero. Let us call “iteration i ” theiteration in the while loop that extract a pair ( i , v ) from V . To prove the correctness, we prove by inductionthat at the end of iteration i , V = { ( j , f [ i + ] n − j + i ) : j > i , f [ i + ] n − j + i (cid:54) = } and L = { ( j , f [ j ] n − ) : j ≤ i , f [ j ] n − (cid:54) = } .Before the loop (“iteration − L is empty and V contains exactly the pairs ( j , f [ ] n − j − ) such that f [ ] n − j − (cid:54) = (cid:96) , and let ( i , v ) be the pair extracted at thenext iteration. We first prove that f [ i ] n − = v and f [ j ] n − = (cid:96) < j < i . By minimality of i and inductionhypothesis, f [ (cid:96) + ] n − j + (cid:96) = (cid:96) < j < i at the end of iteration (cid:96) . In particular, f [ (cid:96) + ] n − =
0. By Equation (5), f [ (cid:96) + ] n − j + (cid:96) + = f [ (cid:96) + ] n − j + (cid:96) − f [ (cid:96) + ] n − p n − j + (cid:96) + =
0. And an easy recurrence shows that f [ j ] n − = (cid:96) < j < i . Nowthis implies that f [ i ] n − = f [ i − ] n − = · · · = f [ (cid:96) + ] n − i + (cid:96) . Yet by induction hypothesis, at the end of iteration (cid:96) , V contains ( j , f [ (cid:96) + ] n − j + i ) if f [ (cid:96) + ] n − j + i (cid:54) =
0. Therefore, if f [ i ] n − is nonzero, f [ (cid:96) + ] n − i + (cid:96) is nonzero too and V contains the pair ( i , f [ (cid:96) + ] n − i + (cid:96) ) = ( i , f [ i ] n − ) . In other words, the value v extracted from V is indeed equal to f [ i ] n − and the propertyholds for L after iteration i .Now with the same argument, f [ i ] n − − j + i = f [ (cid:96) + ] n − j + (cid:96) for (cid:96) < j < i . Right before iteration i , V contains thenthe pairs ( j , f [ i ] n − − j + i ) for f [ i ] n − − j + i (cid:54) =
0. After iteration i , such pairs are replaced by ( j , f [ i ] n − − j + i − p n − j + i f [ i ] n − ) ,that is ( j , f [ i + ] n − j + i ) by Equation (5). And if f [ i ] n − − j + i = p n − j + i (cid:54) =
0, a new pair ( j , − p n − j + i f [ i ] n − ) = ( j , f [ i ] n − j + i ) is inserted into V . Therefore, the property holds for V too after iteration i .The second point is to count the number of operations. Since the while loop stops when V is empty, thenumber of operations in (cid:82) is at most twice the number of pairs that are inserted into V during the algorithmand the same number of additions in (cid:90) are performed as the index of each pair is computed by two additionsof number at most n . We will classify the pairs by generations . Initially, V contains F pairs which formgeneration 0. New pairs can be inserted into V when a pair ( i , v ) is extracted. If ( i , v ) is a pair of generation t , the new pairs inserted at iteration i belong to generation t +
1. At any iteration, at most P − V . Therefore, there are at most F ( P − ) pairs of generation 1, F ( P − ) pairs ofgeneration 2, and in general F ( P − ) t pairs of generation t . Now we need to bound the number ofgenerations. Note the pairs of generation 0 have an index i between 0 and n −
1. But at generation 1, thenew pairs have index ( i + n − k ) for some k ∈ supp ( P ) , k < n . There comes the gap into account: If P has gapparameter γ , the largest exponent less than n in supp ( P ) is ( − γ ) n by definition. Therefore, at generation 1,all pairs have an index at least i + n − ( − γ ) n ≥ γ n . At generation 2, all pairs have then an index at least 2 γ n .At generation t , all pairs have an index at least t γ n . Since indices are bounded by n −
1, there cannot be anypair of generation t if t γ n ≥ n . In other words, the largest possible generation is t = (cid:100) /γ − (cid:101) . Altogether,the total number of pairs inserted into V is at most (cid:100) /γ − (cid:101) (cid:88) t = F ( P − ) t = F ( P − ) (cid:100) /γ (cid:101) − P − P >
2, and is at most (cid:100) /γ (cid:101) F if P =
2. To simplify the exposition, we bound both of them by F P (cid:100) /γ − (cid:101) in the following. Note that of course, this number is also a bound on the number of pairsthat are extracted from V during the algorithm.This has two consequences. First, the number of extracted pairs is a bound on the size of the list at thenend of the algorithm. Therefore there are at most (cid:79) ( F P (cid:100) /γ − (cid:101) ) nonzero values f [ i ] n − . Second, this numberalso bounds the total number of executions of Step 8, that is the total number of operations. Corollary 3.9.
All operations of I NSERTION , R EMOVAL , M INIMUM and S EARCH of pairs ( i , ν ) in the data structureV within Algorithm 4 can be done with (cid:79) ( F P (cid:100) /γ − (cid:101) log log n ) bit operations.Proof. By definition the size of the sparse vector V is at most n . Therefore, using a data structure for V oftype van Emde Boas tree with a universe of size n , ensures that any requested operations can be done with (cid:79) ( log log n ) bit operations, see [
5, Chapter 20 ] . Remark 3.10.
As the bit-complexity of all the operations in (cid:90) of Algorithm 4 is (cid:79) ( F P (cid:100) /γ − (cid:101) log n ) the costdriven by the data structure of V is negligible. F G mod P on some point α , when F and G are given in sparse representation, relies on Equations (6), (8) and (7). More precisely, we first compute each F [ i ] ( α ) for indices i such that f [ i ] n − (cid:54) = F [ j ] ( α ) for j ∈ supp ( G ) by means of Equation (7). Finally, we deduce ( F G mod P )( α ) using Equation (6).In Algorithm 5, all these computations are intertwined. The idea is to loop over all indices j such thateither f [ j ] n − (cid:54) = j ∈ supp ( G ) . If f [ j ] n − (cid:54) =
0, we update the value F [ j ] ( α ) using Equation (8). If j ∈ supp ( G ) ,we accumulate partial evaluations of ( F G mod P )( α ) using Equations (6) and (7). Algorithm 5 S PARSE M ODULAR E VALUATION
Input: P , F and G ∈ (cid:82) [ X ] , with deg ( F ) , deg ( G ) < deg ( P ) = n , P monic and α ∈ (cid:82) ext . Output: ( F G mod P )( α ) V ← { ( i , f [ i ] n − ) : 1 ≤ i < n − f [ i ] n − (cid:54) = } , using a call to S PARSE L EADING C OEFFICIENTS ( P , F ) if f [ ] n − = then insert (
0, 0 ) in V P α ← P ( α ) F α ← F ( α ) β ← F α g (cid:46) β ← / ∈ supp ( G ) i ← for j ∈ supp ( V ) ∪ supp ( G ) \ { } , by increasing order do if j ∈ supp ( V ) then F α ← α j − i − ( α F α − V [ i ] P α ) (cid:46) Equation (8) i ← j if j ∈ supp ( G ) then β ← β + α j − i F α g j (cid:46) Equations (6) and (7) return β Theorem 3.11.
Theorem 3.11. Algorithm 5 is correct. It uses O(#F·#P^{⌈1/γ⌉−1}) operations in R and O((#F·#P^{⌈1/γ⌉−1} + #G) log n) operations in R_ext.
Proof. We prove that at the end of iteration j, F_α = F^{[j]}(α) if j ∈ supp(F) and β = ∑_i g_i F^{[i]}(α) where the sum ranges over indices i ∈ supp(G) ∩ {0, . . . , j}. The property is satisfied after iteration 0 (before entering the loop) since F_α = F^{[0]}(α) and β = F_α g_0 = g_0 F^{[0]}(α). Let us assume that the property holds before entering iteration j. Index i denotes the previous index that belongs to supp(F). Therefore, if j ∈ supp(F), Equation (8) ensures that F_α has the right value after iteration j since V[i] = f^{[i]}_{n−1}. And Equations (6) and (7) justify that β also has the right value if j ∈ supp(G).
The evaluations P(α) and F(α) require O(#P log n) and O(#F log n) operations in R_ext respectively. Steps 9 and 12 each require O(log n) operations in R_ext to compute powers of α and O(1) additions with integers of size O(log n) to compute the appropriate exponent. These steps are executed O(#V + #G) times. Since #V = O(#F·#P^{⌈1/γ⌉−1}), this gives a total of O((#F·#P^{⌈1/γ⌉−1} + #G) log n) operations in R_ext plus O((#F·#P^{⌈1/γ⌉−1} + #G) log n) bit operations for the integer additions. Since we can easily assume that one operation in R_ext costs more than one bit operation, the latter complexity is dominated by the computation part in R_ext. The cost of Step 1 is given by Lemma 3.8.
We can still use a van Emde Boas tree to iterate over the union of the supports of V and G at Step 7, with a total of O((#F·#P^{⌈1/γ⌉−1} + #G) log log n) bit operations, which is less than the number of bit operations required by the additions in Z and thus negligible.
Obviously, since polynomial multiplication over integral domains is commutative, the roles of F and G can be exchanged in Algorithm 5. In particular if #G < #F, this exchange decreases the complexity in Theorem 3.11. In other words, the statement remains valid if #F is replaced by min(#F, #G) and #G by max(#F, #G). The same remark applies to subsequent results.
Remark 3.12. As in Corollary 3.4, we can decrease the number of operations over R_ext by using simultaneous exponentiation on α. This results in O(log n + (#F·#P^{⌈1/γ⌉−1} + #G) log n / log log n) operations in R_ext.
If the gap parameter γ is close to 1/n, the polynomial FG mod P is in general a dense polynomial even if FG is sparse, and the dense modular evaluation will be more appropriate. On the contrary, FG mod P remains sparse if γ is close to 1, in particular if γ ≥ 1/2.
Remark 3.13. If γ ≥ 1/2, the evaluation requires O((#F·#P + #G) log n) operations in R_ext.
Remark 3.14. The factor #F·#P^{⌈1/γ⌉−1} + #G in the complexity may be larger than the actual sparsity of FG mod P. For instance, the sparsity can be 0 if P divides FG. Yet, it is smaller than the general bound #F·#G(#P − 1)^{⌈1/γ⌉} given by Proposition 2.2. Thus in general it is more efficient to use our method than to directly evaluate FG mod P if the polynomial is known.
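As a point of comparison for Remark 3.14, the following Python sketch computes FG mod P explicitly (sparse dictionaries, integer coefficients) and then evaluates it. It is the direct computation that our evaluation algorithm is designed to avoid, and it is convenient as a brute-force cross-check of a fast implementation; the function names are ours and this is not the paper's Algorithm 3.

    def sparse_mul(F, G):
        # Schoolbook product of two sparse polynomials {exponent: coefficient}.
        H = {}
        for i, f in F.items():
            for j, g in G.items():
                H[i + j] = H.get(i + j, 0) + f * g
        return {e: c for e, c in H.items() if c != 0}

    def reduce_mod(H, P, n):
        # P monic of degree n (dict containing the key n): eliminate the top
        # term repeatedly using X^n = -(P - X^n).
        H = dict(H)
        while H and max(H) >= n:
            e = max(H)
            c = H.pop(e)
            for k, p in P.items():
                if k == n:
                    continue
                H[e - n + k] = H.get(e - n + k, 0) - c * p
            H = {d: v for d, v in H.items() if v != 0}
        return H

    def modular_eval(F, G, P, n, alpha):
        # Brute-force (F*G mod P)(alpha): full modular product, then evaluation.
        return sum(c * alpha**e for e, c in reduce_mod(sparse_mul(F, G), P, n).items())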
This section is devoted to the verification of polynomial modular products. That is, given F, G, H and P such that deg(F), deg(G), deg(H) < deg(P) = n, we want to test whether H = FG mod P. The idea is classical: evaluate the identity at a random point. Contrary to the more straightforward verification of polynomial multiplication, we cannot do such an evaluation directly since we do not know the polynomial Q = (FG − (FG mod P))/P. As seen in the previous section, we provide new algorithms to do such an evaluation efficiently without reverting to the computation of Q. We recall that P is always taken monic. Note that this is a mild assumption since (FG) mod P = (FG) mod (λP) for any invertible constant λ.
In the following, all algorithms are analysed both when the polynomials F, G and H are dense and when they are sparse. On the other hand, P is always considered as a sparse polynomial. We recall that γ denotes the gap parameter of P, defined by (1 − γ)n = deg(P − X^n), and that it serves to control the densification of the modular reduction.
We first begin in Section 4.1 with an abstract case where the polynomials are defined over an integral domain. There, we analyse the algorithms by counting the number of ring operations. In Sections 4.2 and 4.3 we discuss some adaptations of the algorithm to the case of integers and small finite fields in order to provide a finer analysis in the bit complexity model.
Over R[X]. Algorithm 6 depicted below is straightforward from Theorems 3.6 and 3.11. We mainly provide its description to serve as a starting point for its adaptations in the next sections. The algorithm covers both the dense and the sparse case. The only difference is at Step 1.
Algorithm 6 MODULARVERIFICATION
Input: F, G, H and P ∈ R[X], P monic of degree n and F, G and H of degrees < n; 0 < ε < 1.
Output: True if H = FG mod P, False with probability at least 1 − ε otherwise.
1: if F, G and H are given in sparse representation and #H > #F·#G(#P − 1)^{⌈1/γ⌉} then return False   ▷ Proposition 2.2
2: α ← random element from a subset E of R of size ≥ (n − 1)/ε
3: β ← (FG mod P)(α)                                              ▷ using Theorem 3.6 or 3.11
4: return True if β = H(α), False otherwise

Theorem 4.1. If R has at least (n − 1)/ε elements, Algorithm 6 is correct. If F, G and H are dense, the algorithm uses O(#P·n) operations in R. If F, G and H are sparse, the algorithm uses O((#F·#P^{⌈1/γ⌉−1} + #G + #H) log n) operations in R.
Proof. Step 1 dismisses a trivial mistake if the polynomials are sparse. If H = (FG) mod P, then H(α) = (FG mod P)(α) for any α and the algorithm always returns True. Otherwise, let ∆ = H − (FG) mod P. Then ∆ has degree < n, hence has at most n − 1 roots, and the probability that α, randomly chosen in E, is a root of ∆ is at most (n − 1)/((n − 1)/ε) = ε. The algorithm returns True in that case with probability at most ε. The complexity is given by the cost of a single modular product evaluation, as stated in Theorem 3.6 for the dense case and in Theorem 3.11 for the sparse case.
If R is not large enough, the algorithm fails and it is customary to revert to an extension ring R_ext to perform the evaluation in a larger set. Using Theorems 3.6 and 3.11, we get the following extension when the polynomials are evaluated at a random point of R_ext rather than R.
Corollary 4.2. Let R be an integral domain with less than (n − 1)/ε elements, and R_ext an extension ring of R with at least (n − 1)/ε elements. Then Algorithm 6 can be adapted by choosing a random element from R_ext, with the same probability of success. It uses O(n·#P) operations in R and O(n) operations in R_ext if F, G and H are dense, and O((#F·#P^{⌈1/γ⌉−1} + #G + #H) log n) operations in R_ext if they are sparse.
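The probabilistic skeleton of Algorithm 6 over the integers can be sketched as follows, reusing the naive modular_eval above in place of the fast evaluation of Theorems 3.6 and 3.11; the sample-space size ⌈(n − 1)/ε⌉ mirrors the analysis of Theorem 4.1 and the function name is ours.

    import math
    import random

    # One-sided Monte Carlo verification of H = F*G mod P over Z (sketch of
    # Algorithm 6): "False" is always correct, "True" is wrong with probability
    # at most eps when H differs from F*G mod P.
    def modular_verification(F, G, H, P, n, eps=0.001):
        size = max(2, math.ceil((n - 1) / eps))    # evaluation set E = {0, ..., size-1}
        alpha = random.randrange(size)
        lhs = sum(c * alpha**e for e, c in H.items())
        return lhs == modular_eval(F, G, P, n, alpha)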
In the dense case, Algorithm 6 uses an optimal number of operations in R as soon as #P is constant. It is always faster than a modular product when #P·n < M(n), that is when #P < log(n) log log(n) for a general ring R. In the sparse case, Theorem 4.1 is not linear in the input size. Indeed, #P is raised to a potentially large power ⌈1/γ⌉ − 1 and, more importantly, there is a factor log n in the number of operations in R while the input has only (#F + #G + #H) elements of R. Nevertheless, the efficiency of verification has to be compared with the cost of computing FG mod P. We assume the latter to be done with a sparse multiplication followed by a sparse division by P. This hypothesis seems reasonable as no work has been done to optimize such an operation yet. Leaving aside the division, the number of operations in R for sparse multiplication could be either O(#F·#G) with the naive approach or Õ(#F + #G + #(FG)) using [ ]. Assuming #P to be constant, our verification has a complexity of O((#F + #G + #H) log n), which is always faster when #(FG) = o(#F·#G/log n). If we assume that #(FG) = O(#F·#G), our verification will be faster at least when n = O(#F + #G) and #P is constant. Depending on the cost of the division, our algorithm could be faster in more cases. Of course, these conditions are not very restrictive. We use Algorithm 6 in Section 5 to verify classical polynomial multiplication, where P will be a binomial of degree either logarithmic in the sparsity or polynomial in the input degree.
Yet the efficiency of Algorithm 6 depends heavily on the integral domain R. Indeed the complexity of polynomial multiplication in R[X] can be faster than O(n log n log log n) operations in R [13, 15]. Furthermore, if R is small, one operation in R_ext corresponds to a non-constant number of operations in R. In the following sections we consider polynomials over the integers or finite fields and we provide thorough analyses together with adapted versions when necessary.
Over Z[X]. If the polynomials are defined over Z, there is no difficulty with the size of E in Algorithm 6. However, we must prevent the growth of the integers during the evaluation. It is very classical to choose a random prime q and to map the whole computation into F_q. To do so, we must ensure two properties of the prime q. First, q must be large enough to use the algorithm, that is at least (n − 1)/ε. Second, if H ≠ (FG) mod P, we need this inequality to hold modulo q as well. For this second property, we define ∆ = H − (FG) mod P. To ensure that ∆ does not vanish modulo q, we need that at least one coefficient of ∆ is nonzero modulo q. We then need to bound its coefficients to assess the latter fact.
Proposition 4.3. The coefficients of ∆ are bounded by ‖H‖∞ + min(#F, #G)‖F‖∞‖G‖∞(#P‖P‖∞)^{⌈1/γ⌉}.
Proof. The coefficients of ∆ are bounded by ‖H‖∞ + ‖FG mod P‖∞. The bound follows from Proposition 2.2, with Q = FG and ‖Q‖∞ ≤ min(#F, #G)‖F‖∞‖G‖∞ by Lemma 2.1.
Using this bound and Proposition 2.6 we can determine an appropriate prime q to adapt Algorithm 6 to the integer case. This is done in Algorithm 7 (MODULARVERIFICATIONOVERZ) below.
Algorithm 7 MODULARVERIFICATIONOVERZ
Input: F, G, H and P ∈ Z[X], P monic of degree n and F, G and H of degrees < n; 0 < ε < 1.
Output: True if H = FG mod P, False with probability at least 1 − ε otherwise.
1: ∆∞ ← ‖H‖∞ + min(#F, #G)‖F‖∞‖G‖∞(#P‖P‖∞)^{⌈1/γ⌉}
2: q ← RANDOMPRIME(λ, ε/2) where λ = max(n/ε, ln(∆∞)/ε)
3: (F_q, G_q, H_q, P_q) ← (F mod q, G mod q, H mod q, P mod q)
4: return MODULARVERIFICATION(F_q, G_q, H_q, P_q, ε/2)
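A compact Python sketch of this reduction step is given below. It reuses sparse_mul and reduce_mod from the earlier sketch, replaces the paper's RANDOMPRIME routine by a naive prime sampler, and ignores the coefficient bound ∆∞ of Proposition 4.3 when sizing the prime, so it only illustrates the structure of Algorithm 7; all names are ours.

    import math
    import random

    def is_prime(m):                       # naive trial division, only for the sketch
        if m < 2:
            return False
        d = 2
        while d * d <= m:
            if m % d == 0:
                return False
            d += 1
        return True

    def naive_random_prime(lam):           # stand-in for the paper's RandomPrime(lambda, eps)
        while True:
            q = random.randrange(lam, 2 * lam)
            if is_prime(q):
                return q

    def verification_over_Z(F, G, H, P, n, eps=0.001):
        # Map everything to F_q for a random prime q, then compare evaluations at
        # a random point modulo q: intermediate integers stay of size about log q.
        q = naive_random_prime(max(3, math.ceil((n - 1) / eps)))
        red = lambda A: {e: c % q for e, c in A.items() if c % q}
        Fq, Gq, Hq, Pq = red(F), red(G), red(H), red(P)
        alpha = random.randrange(q)
        prod = reduce_mod(sparse_mul(Fq, Gq), Pq, n)   # naive stand-in for Theorem 3.6/3.11
        lhs = sum(c * pow(alpha, e, q) for e, c in Hq.items()) % q
        rhs = sum(c * pow(alpha, e, q) for e, c in prod.items()) % q
        return lhs == rhs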
Theorem 4.4. Algorithm 7 is correct. If C = max(‖P‖∞, ‖F‖∞, ‖G‖∞, ‖H‖∞) and T = max(#F, #G, #H), the algorithm requires O(log(1/ε) log(n log C/ε) log log(n log C/ε) I(log(n log C/ε))) bit operations to get a prime number, plus
• O((#P·n + n log C / log(n log C/ε)) I(log(n log C/ε))) bit operations if F, G and H are dense, or
• O((T·#P^{⌈1/γ⌉−1} log n + (T + #P) log C / log(n log C/ε)) I(log(n log C/ε))) bit operations if F, G and H are sparse.
Proof. To ensure that MODULARVERIFICATION works properly, we need q to be at least (n − 1)/ε. This is the case since λ ≥ n/ε. The algorithm always returns the correct answer when H = (FG) mod P. Otherwise, it may incorrectly return True in two cases: either H_q = (F_q G_q) mod P_q while the equality does not hold over Z, or MODULARVERIFICATION incorrectly returns True. Both situations occur with probability at most ε/2. Indeed, by Proposition 4.3, ∆∞ is always a bound on ‖∆‖∞ where ∆ = H − (FG) mod P. Therefore, Proposition 2.6 shows that with probability at least 1 − ε/2, the number q chosen at Step 2 is a prime number such that ∆ mod q ≠ 0. The error probability of Algorithm 7 is thus at most ε.
Let us now analyse the complexity of the algorithm. As a first step, we shall express q in terms of the input size. Since #P, #F, #G ≤ n, ∆∞ = O(n^{1+⌈1/γ⌉} C^{2+⌈1/γ⌉}). Thus, ln ∆∞ = O(⌈1/γ⌉(log n + log C)) = O(n(log n + log C)) since ⌈1/γ⌉ ≤ n. This implies q = O((n/ε)(log n + log C)) and log q = O(log(n log C/ε)).
By Proposition 2.4, Step 2 costs O(log(1/ε) log q log log q I(log q)) bit operations, that is
O(log(1/ε) log(n log C/ε) log log(n log C/ε) I(log(n log C/ε))).   (9)
Step 3 requires O(n (log C/log q) I(log q)) bit operations in the dense case, and O((T + #P)(log C/log q) I(log q)) bit operations in the sparse case. By Theorem 4.1, Step 4 requires O(#P·n I(log q)) bit operations in the dense case and O(T·#P^{⌈1/γ⌉−1} log n I(log q)) bit operations in the sparse case. Adding the complexities of all these steps leads to the claimed bit complexity.
Remark 4.5. In most cases, the cost of finding a prime number is negligible in comparison to the rest of the algorithm. In the dense case, it is negligible as long as ε = 1/n^{O(1)}. In the sparse case, it is negligible when the degree n is not too large compared to the other input parameters. More precisely, this is the case when n/ε = ((T + #P) log C)^{O(1)}.
When computing a multiplication followed by a modular reduction with polynomials in Z[X], the size of the coefficients can grow significantly, as shown by the bound on ‖FG mod P‖∞ given in the proof of Proposition 4.3. In contrast, our verification algorithm works with bounded integers of bit length O(log(n log C/ε)). This is logarithmic in the input size in dense representation and linear in the sparse one. Our verification therefore avoids paying the coefficient growth, contrary to the direct computation. Taking this growth into account to compare our verification algorithm with the computation seems hard. We only detail the cases where our algorithm is already faster even without considering the coefficient growth. We shall mention that for sparse polynomials, #(FG mod P) might be smaller than #H. In our analysis, we assume for simplicity that both have approximately the same size.
Remark 4.6. For ε = 1/n^{O(1)}, Algorithm 7 is faster than the polynomial modular product
(i) when the polynomials are dense and #P < min(log n/log log n, log C/log log log C);
(ii) when the polynomials are sparse and n/ε = ((T + #P) log C)^{O(1)}.
Proof. We assume ε = 1/n^{O(1)}. In particular, log(n/ε) = O(log n). To simplify the analysis, we place ourselves in the case of a negligible cost for finding the prime q, as described in Remark 4.5. In the sparse case, this implies that n/ε must be polynomial in (T + #P) log C. Therefore log n < min(T, #P) and I(log(n log C)) = I(log log C). This makes our verification faster than computing the modular product, since the latter requires at least #(FG mod P) = O(T·#P^{⌈1/γ⌉}) operations on integers of bit length log C.
For the dense case, we compare to the cost of multiplying two polynomials of degree n and coefficients bounded by C. As seen in the introduction, this reduces to integer multiplication with Kronecker substitution and it costs I(n(log n + log C)) = O(n(log² n + log n log C + log C log log C)) bit operations. By Theorem 4.4 our verification needs O(#P·n I(log n + log log C) + n log C log(log n + log log C)) bit operations. The second term is dominated by the complexity of multiplying the polynomials when ε = 1/n^{O(1)}. The first term is O(#P·n((log n + log log C) log log n + (log n + log log C) log log log C)). When #P < min(log n/log log n, log C/log log log C), this term is bounded by O(n(log² n + log n log C + log C log log C)), that is the complexity of computing the product.
Over F_q[X]. The situation over a finite field F_q is different since there is no growth to prevent. When q is large enough, Theorem 4.1 applies directly. Otherwise, one can revert to computing in a sufficiently large extension field of F_q, where Corollary 4.2 can be applied. We first give the precise complexity bounds for these two cases.
Corollary 4.7. Let F, G, H and P ∈ F_q[X] be as in Algorithm 6. One can test whether H = (FG) mod P using Algorithm 6, with O(log(1/ε) log_q(n/ε) M(log_q(n/ε))(log q + log log_q(n/ε))) operations in F_q to get an irreducible polynomial of degree O(log_q(n/ε)), plus
• O(n·#P + n M(log_q(n/ε))) operations in F_q if F, G and H are dense, or
• O(T·#P^{⌈1/γ⌉−1} log n M(log_q(n/ε))) operations in F_q if F, G and H are sparse with at most T nonzero monomials.
Proof. Let us assume that q < (n − 1)/ε, otherwise Theorem 4.1 applies straightforwardly. In that case, Corollary 4.2 requires choosing a random point in an extension F_{q^d} of F_q with at least (n − 1)/ε elements. More precisely, we use Proposition 2.8 to produce with probability 1 − ε/2 an irreducible polynomial of degree d over F_q, where d is the smallest integer such that q^d ≥ (n − 1)/ε. The algorithm may be incorrect if either the polynomial used to define F_{q^d} fails to be irreducible, or if Algorithm 6 fails. If we choose α at random in F_{q^d}, the error probability of Algorithm 6 is at most ε/2. This gives a total probability of error of at most ε.
By definition, the degree of the extension is d = O(log_q(n/ε)). The cost of generating the irreducible polynomial of degree d is O(log(1/ε) d M(d)(log q + log d)) by Proposition 2.8.
Using Corollary 4.2 and the fact that an operation in F_{q^d} costs M(d) operations in F_q, we obtain the claimed costs.
Remark 4.8. If ε = 1/n^{O(1)}, the cost of getting an irreducible polynomial is negligible in the dense case. Then the algorithm requires O(#P·n + n M(log_q n)) operations in F_q. If we add the degree constraint log n = o(T), the cost of getting an irreducible polynomial is also negligible in the sparse case and the algorithm requires O(T·#P^{⌈1/γ⌉−1} log n M(log_q n)) operations in F_q.
As #(FG mod P) is bounded by T·#P^{⌈1/γ⌉} and computing FG mod P requires at least one operation in F_q for each monomial, this remark directly gives a case where the verification is faster than the modular product in general. Moreover, naive algorithms can be used to perform products in the extension of F_q.
Remark 4.9. When ε = 1/n^{O(1)} and n = T^{O(1)}, Algorithm 6 in the sparse case is in general faster than the modular multiplication and requires Õ(T·#P^{⌈1/γ⌉−1}) bit operations if q < n/ε.
In the dense case, we can see from Remark 4.8 that the verification complexity might in fact be larger than the cost of computing the modular product FG mod P. Indeed, assuming M_q(n) = O(n log q(log n + log log q)) bit operations [ ], we have M_q(log_q(n/ε)) = O(log(n/ε)(log log(n/ε) + log log q)). While the cost of computing FG mod P uses M_q(n) bit operations, our verification requires O(#P·n log q log log q + n log(n/ε)(log log(n/ε) + log log q)) bit operations. When #P is not a constant, the latter is always larger. The following remark specifies when we can expect a positive result.
Remark 4.10. Assuming #P to be constant and ε = 1/n^{O(1)}, Algorithm 6 in the dense case with R = F_q is asymptotically faster than the modular multiplication when
(i) log(n/ε) < q < n/ε, since modular multiplication costs O(n log q log n) bit operations while verification, which does need an extension, is O(n log(n/ε) log log(n/ε));
(ii) q > n/ε and log log q = o(log n), since modular multiplication costs O(n log q log n) bit operations while verification, which does not use an extension, is O(n log q log log q).
If the field is very large, namely when log log q = Ω(log n), Algorithm 6 is asymptotically as fast as the modular multiplication: the dominant factor in both complexities is O(n log q log log q).
We shall mention that the verification cost in (i) of Remark 4.10 assumes the use of fast multiplication of polynomials in order to check fast multiplication of polynomials of degree O(n). Even though this dependency is not a problem in theory, it might not be satisfactory in practice. One solution would be to use a naive polynomial multiplication for the extension field arithmetic, but this further tightens the superiority of the verification.
Remark 4.11. Assuming that extension field arithmetic is done naively using quadratic polynomial multiplication, Algorithm 6 remains faster than modular multiplication only when q is at least a fixed positive power of n in the dense case.
We now propose a novel method that enables us to improve all the dense cases where an extension field is necessary, while not relying on any polynomial arithmetic. More precisely, we show that fast verification does exist when q < log n, which is of great interest for the field F_2. It is based on the evaluation of polynomials on matrices rather than scalars, combined with Freivalds' algorithm for verifying matrix multiplication [ ]. Indeed, choosing α from an extension field inherently requires polynomial multiplication.
Instead of picking a random point that is probably not a root of ∆ = H − (FG) mod P when ∆ ≠ 0, we pick a polynomial R ∈ F_q[X] of degree k < n that is probably not a divisor of ∆. To test whether R divides ∆, we evaluate ∆ on the companion matrix C_R of R, defined by
C_R = ( 0 0 · · · 0 −r_0
        1 0 · · · 0 −r_1
        0 1 · · · 0 −r_2
        .  .  . .  .  .
        0 0 · · · 1 −r_{k−1} )
where R = ∑_{i=0}^{k} r_i X^i with r_k = 1. This strategy relies on the fact that R is the minimal polynomial of its companion matrix. Therefore, R(C_R) = 0 and ∆(C_R) = 0 for every multiple ∆ of R. In other words, R divides ∆ if and only if ∆(C_R) = 0. We will show that taking R irreducible over F_q of degree k = O(log n) makes this approach faster than the one using an extension field when ε is constant. Furthermore, it extends the possibility to have fast verification for any field, whatever the size of the polynomials.
To check whether ∆(C_R) = 0, we need to evaluate H and (FG) mod P on C_R, and to verify that the evaluations match. Of course, one cannot directly evaluate those polynomials on C_R as it would cost O(nk^ω) operations in F_q in the dense case, where ω is the matrix multiplication exponent. Since k = O(log n), this would not give any improvement over Remark 4.10. Instead, we rely on the so-called Freivalds technique to verify matrix multiplication [ ]. The idea is that the matrix product C = A × B ∈ R^{k×k} can be verified by asserting that uC = (uA) × B for a random vector u ∈ {0, 1}^k, with a probability of error of 1/2. To assert that two polynomial evaluations on the matrix C_R match, it is sufficient to verify that their projections by the vector u are equal. Given a degree-n polynomial H ∈ F_q[X], one can compute uH(C_R) with O(nk) operations in F_q using Horner evaluation:
uH(C_R) = u ∑_{i=0}^{n} h_i C_R^i = ( ∑_{i=1}^{n} h_i u C_R^{i−1} ) C_R + h_0 u.   (10)
Since a matrix-vector product with C_R only costs O(k) operations in F_q, and the Horner procedure only uses n of those matrix-vector products, the cost is clear.
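A small Python sketch of this projected Horner evaluation follows; the dense coefficient list, the function names and the modular arithmetic layout are ours. It computes u·H(C_R) using only O(k)-cost vector operations, exploiting the companion-matrix structure.

    def vec_times_companion(v, r, q):
        # v: length-k row vector over F_q; r = [r_0, ..., r_{k-1}] with R = X^k + sum r_i X^i.
        # Row-vector times companion matrix: one shift plus one inner product, O(k) operations.
        k = len(v)
        last = (-sum(v[i] * r[i] for i in range(k))) % q
        return v[1:] + [last]

    def project_eval(h, r, u, q):
        # Compute u * H(C_R) over F_q by Horner's rule; h = [h_0, ..., h_n] dense coefficients.
        v = [(h[-1] * ui) % q for ui in u]                     # start with h_n * u
        for c in reversed(h[:-1]):                             # then h_{n-1}, ..., h_0
            v = vec_times_companion(v, r, q)
            v = [(vi + c * ui) % q for vi, ui in zip(v, u)]
        return v

For H on one side and FG mod P on the other, one would not expand the product: the modular evaluation algorithms of Section 3, run with α replaced by C_R as described in Remark 4.12 below, produce the matching projection.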
Remark 4.12. It is sufficient to replace the evaluations of F(α) and P(α) by uF(C_R) and uP(C_R) in Algorithm 3 (MODULAREVALUATION) to reach a complexity of O(n(#P + deg(R))) operations in F_q for computing u(FG mod P)(C_R) in the dense case. More informally, it is sufficient to say that any of the operations in R_ext now has the cost of one matrix-vector product with C_R.
Theorem 4.13. Let F, G, H and P ∈ F_q[X] be as in Algorithm 6. We can check whether H = FG mod P in O(#P·n + n log_q(n) log(1/ε)) operations in F_q, with a probability of error at most ε if H ≠ FG mod P.
Proof. Let 0 < ε < 1 be a fixed probability. The algorithm needs two steps. First it computes, with probability at least 1 − ε, an irreducible polynomial R of degree d = ⌈log_q(2n/ε)⌉ using Proposition 2.8. Second, it computes uH(C_R) and u(FG mod P)(C_R) for some random vector u ∈ {0, 1}^d. If both evaluations are distinct, the algorithm returns False. Otherwise, it repeats these two steps O(log(1/ε)) times until one of the repetitions fails. If this is the case it returns False, otherwise the algorithm returns True.
If H = (FG) mod P, the algorithm always returns True. Let us assume that H ≠ (FG) mod P, and let ∆ = H − (FG) mod P. For the algorithm to return True, each repetition must ensure that uH(C_R) = u(FG mod P)(C_R). This may happen if either R divides ∆, whence H(C_R) = (FG mod P)(C_R), or R does not divide ∆ but uH(C_R) = u(FG mod P)(C_R). Since there are at least q^d/(2d) irreducible polynomials of degree d in F_q[X] by Proposition 2.7, and at most n/d of them divide ∆, the probability that R divides ∆ is at most 2n/q^d ≤ ε provided R is irreducible. Taking into account the probability that R is not irreducible, the probability that R divides ∆ is at most 2ε. Then, using Freivalds' standard argument, if R does not divide ∆, the probability that u∆(C_R) = 0 is at most 1/2. Altogether, the probability that one iteration returns True is at most 1/2 + 2ε, which is bounded away from 1 for small enough ε, so that the probability that the O(log(1/ε)) independent iterations all return True is at most ε.
Let us now analyse the complexity of the algorithm. Since ε is a constant, the second step uses O(#P·n + n log_q n) operations in F_q using Remark 4.12. The first step is negligible, even if naive polynomial arithmetic is used. Note that in this complexity, #P·n is the cost of Algorithm 2 (LEADINGCOEFFICIENTS). Since it is deterministic and only depends on P and F, it can be called only once rather than at each iteration. We then get the complexity O(#P·n + n log_q(n) log(1/ε)).
Remark 4.14. Note that, compared to using evaluation at α in an extension field, no operation depends on ε. If ε is fixed, the new method replaces a factor M(log_q n) in the complexity by log_q n. Moreover our new approach requires only simple computations: additions of vectors, multiplications of a vector by a scalar and matrix-vector products with a companion matrix. Furthermore, when #P and ε are constants, the verification is always faster than the modular multiplication, whatever the size of q.
This new method still requires some polynomial arithmetic, even if only naive polynomial multiplication is used. This is because the algorithm called to provide the degree-d polynomial R relies on polynomial products and GCDs to ensure that R is probably irreducible. In order to remove the dependency on polynomial arithmetic, we can just choose a random monic degree-d polynomial R and compute the evaluation on C_R even if R is not irreducible. This requires taking several random polynomials R to reach the target probability ε.
Corollary 4.15. Let F, G, H and P ∈ F_q[X] be as in Algorithm 6. Without using any polynomial multiplication, we can check whether H = FG mod P in O(#P·n + n(log_q n)² log(1/ε)) operations in F_q, with a probability of error at most ε if H ≠ FG mod P.
Proof. In the proof of Theorem 4.13, we replace one evaluation on C_R, with R irreducible with probability at least 1 − ε, by a few evaluations on several C_R with R a random monic polynomial of degree d. As the probability that a random monic polynomial R of degree d is irreducible is at least 1/(2d) by Proposition 2.7, we need to generate O(d log(1/ε)) random polynomials R to reach a probability at least 1 − ε that at least one of them is irreducible. Thus the evaluation part of the algorithm is repeated O(d) = O(log_q n) times since ε is constant.
Even if this new approach does not improve the complexity over evaluation at α in an extension field using naive polynomial multiplication, it allows us to remove completely the dependency on any polynomial multiplication algorithm. While this result might not seem useful at first sight, it will be used in Section 5.1 to provide efficient verification for polynomial multiplication. Indeed, in that case we will need to use verification with P of degree smaller than the input degree.
This new method using a companion matrix also works in the sparse case. Indeed, the powers α^t in Algorithm 5 (SPARSEMODULAREVALUATION) are now replaced with C_R^t. However, only a few powers C_R^t with 1 < t < n are relevant and we cannot compute all of them as in the dense case. This implies that computing u(FG mod P)(C_R) instead of (FG mod P)(C_R) is useless in that case. Indeed, using fast exponentiation together with the structure of the powers of companion matrices [ ] already yields a complexity of O((#F·#P^{⌈1/γ⌉−1} + #G) log n log_q n) operations in F_q, and we cannot hope to lower this by some random vector projection. In that case, using the Freivalds technique is useless and we get a better probability of success. Choosing O(log(n/ε) log(1/ε)) polynomials R at random, at least one of them is irreducible and does not divide ∆ = H − FG mod P with probability 1 − ε. This leads to an algorithm which does not use any polynomial product in the sparse case too. Even though it is asymptotically not as fast as the verification in an extension field where naive polynomial arithmetic is used, it is still quasi-linear.
Corollary 4.16. Let P ∈ F_q[X] be monic of degree n and F, G, H ∈ F_q[X] of degrees less than n and sparsity at most T, and 0 < ε < 1. Using a direct evaluation on a companion matrix, we can check whether H = FG mod P, with a probability of error at most ε if H ≠ FG mod P, in O(T·#P^{⌈1/γ⌉−1} log n log_q(n/ε) log(1/ε)) operations in F_q, without performing any polynomial product.

In this section we study the simpler problem of verifying a classical polynomial multiplication. Given three polynomials F, G and H ∈ R[X] of respective degrees n, n and 2n, the classical idea to verify H = FG simply boils down to testing H(α) = F(α)G(α) for some random α in a large enough set S. As mentioned in the introduction, this strategy may or may not have an optimal bit complexity, depending on the context. Here we are concerned with two difficulties that arise in either the dense or the sparse case.
If the polynomials are dense, the verification through evaluation requires a number of operations in S that is linear in the degree n of the input polynomials. When R has more than 2n elements, taking S ⊂ R is sufficient to use evaluation. However, multiplication in R does not have a linear bit complexity and the best known results remain quasi-linear [2, 15, 14]. The evaluation therefore leads to a quasi-linear bit complexity of O(n I(log n)) = O(n log n log log n). When R is too small, for instance with a small finite field, S is classically taken as a field extension of R, large enough to make it unlikely that α is a root of H − FG. Therefore, each operation in S corresponds to an operation over R[X] with non-negligible degree, meaning that the number of operations in R is no longer linear in the input degree n. As mentioned in the introduction, Kaminski's approach [ ] circumvents the latter problem by replacing the evaluation with a computation in R[X]/(X^i − 1) for a random integer i in a prescribed range. There, his algorithm is able to verify dense polynomial products with a linear number of operations in R whatever the ring size. However, the same difficulty as for large rings may arise. Since operations in R do not have a linear bit complexity, unless say R = F_2, this is not always sufficient to reach an optimal bit complexity for the verification. In Section 5.1, we present Kaminski's approach [ ] and provide a thorough analysis in the bit complexity model. In particular, we show that it is possible to get optimal verification in the bit complexity model for any polynomials in Z[X] and for some polynomials in F_q[X], depending on the relation between q and n.
If the polynomials F, G and H are sparse with at most T nonzero coefficients, the evaluation requires a number of operations in S that is O(T log n). However, the input bit size is given by the size of the exponents plus the size of the coefficients, that is O(T log n) bits for constant-size coefficients. Since S has to be of size at least 2n, its elements have at least log n bits, and the bit complexity of evaluation would be of order T log² n, which is not even quasi-linear. In Section 5.2 we develop a novel method, already appearing in [ ], to verify sparse polynomial multiplication with a bit complexity that is quasi-linear in the input size.
In [ ], Kaminski describes an algorithm to verify a polynomial product H = FG ∈ R[X] using a linear number of operations in R, regardless of its size. His method chooses at random a polynomial P that probably does not divide ∆ = H − FG if ∆ ≠ 0. Then he verifies H = FG in R[X]/(P) using fast polynomial multiplication. Surprisingly, taking P of degree o(n) in his algorithm enables one to reach a linear number of operations in R. In the following we will often use a fixed constant δ > 0.
The first step is to randomly select a polynomial from a fixed set, such that it most probably does not divide ∆ = H − FG when ∆ ≠ 0.
A standard approach could be to consider irreducible polynomials. This would be the direct generalization of the evaluation method. However, Kaminski considers polynomials that are instead of the form X^i − 1, for some integers i > 0. These polynomials have two advantages: reduction modulo X^i − 1 requires only additions, and the number of them that can divide a given polynomial is controlled by cyclotomic polynomials, as expressed in the following proposition.
Proposition 5.1 ([ ]). For any integer set I ⊂ N, ∏_{i∈I} Φ_i divides lcm{X^i − 1 : i ∈ I}, where Φ_i is the i-th cyclotomic polynomial in R[X].
Kaminski also gives a lower bound on the degree of lcm{X^i − 1 : i ∈ I}, depending on I. In particular the proposition implies that a nonzero polynomial, divisible by k polynomials of the form X^i − 1, cannot have a too small degree. In the converse direction, a nonzero polynomial ∆ of degree at most 2n cannot have too many divisors of the form X^i − 1. This is the content of Kaminski's main theorem.
Theorem 5.2 ([ ]). Let ∆ be a nonzero polynomial in R[X] of degree ≤ 2n and 0 < e < 1/2. Let k = ⌈δn^e ln ln(n^{1−e})⌉. At most k − 1 polynomials in the set {X^i − 1 | n^{1−e} ≤ i < 2n^{1−e}} divide ∆.
Kaminski's approach is then to choose a random integer i ∈ [n^{1−e}, 2n^{1−e}), to reduce the input polynomials modulo X^i − 1 and to verify the product in R[X]/(X^i − 1). We provide in Algorithm 8 (KAMINSKIVERIFICATION) a more precise description of this approach.
Algorithm 8 KAMINSKIVERIFICATION
Input: F, G, H ∈ R[X] of degrees n, n and 2n; and 0 < e < 1/2.
Output: True if H = FG, False with probability at least 1 − (⌈δn^e ln ln(n^{1−e})⌉ − 1)/n^{1−e} otherwise.
1: i ← random integer in [n^{1−e}, 2n^{1−e})
2: F_i, G_i, H_i ← F mod X^i − 1, G mod X^i − 1, H mod X^i − 1
3: M ← F_i G_i                          ▷ using a fast multiplication algorithm
4: M_i ← M mod X^i − 1
5: return M_i = H_i

Theorem 5.3 ([ ]). Let F, G and H ∈ R[X] be of degrees at most n, n and 2n, let 0 < e < 1/2, and let k be the integer defined in Theorem 5.2. Algorithm 8 uses O(n) operations in R, and its failure probability is at most (k − 1)/n^{1−e} if H ≠ FG.
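The structure of Algorithm 8 is easy to render in Python: reducing modulo X^i − 1 amounts to folding exponents modulo i. The sketch below uses sparse dictionaries and a naive product, whereas Kaminski's linear operation count requires a dense representation and a subquadratic multiplication; only one round is shown and the function names are ours.

    import random

    def fold(F, i):
        # Reduce a polynomial modulo X^i - 1: exponents are simply taken modulo i.
        R = {}
        for e, c in F.items():
            R[e % i] = R.get(e % i, 0) + c
        return {e: c for e, c in R.items() if c != 0}

    def kaminski_verification(F, G, H, n, e=0.45):
        # One round of Algorithm 8: pick i in [n^(1-e), 2 n^(1-e)), then test
        # H = F*G modulo X^i - 1.  The product below is naive (quadratic).
        lo = max(2, int(n ** (1 - e)))
        i = random.randrange(lo, 2 * lo)
        Fi, Gi, Hi = fold(F, i), fold(G, i), fold(H, i)
        M = {}
        for a, fa in Fi.items():
            for b, gb in Gi.items():
                M[(a + b) % i] = M.get((a + b) % i, 0) + fa * gb
        return {k: v for k, v in M.items() if v != 0} == Hi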
Remark 5.4. To be more precise, Algorithm 8 requires O(n) additions in R at Step 2 to compute the first three reductions modulo X^i − 1, M(2n^{1−e}) operations in R to compute the product at Step 3, and O(n^{1−e}) additions in R to compute the last reduction.
One shall remark that the product in Step 3 must be computed with a subquadratic algorithm so that M(2n^{1−e}) = O(n), since e < 1/2. If the parameter e is taken close enough to 1/2, Karatsuba's algorithm suffices to reach a linear number of operations. The failure probability is O(log log n / n^{1−2e}), whence the need to have e < 1/2. We can bound this probability by O(1/n^{e'}) for any positive constant e' < 1 − 2e. In order to reach a probability ε of error, the algorithm should be repeated O(log_n(1/ε)) times. Note that this number of rounds is constant if ε is taken as 1/n^{O(1)}.
The drawback of such an approach is that it crucially relies on a somewhat fast multiplication algorithm, and that it performs multiplications of polynomials of degrees more than √n. This means that optimal verification of the product of two degree-n polynomials uses a product of polynomials of degree close to n. In some contexts, such as verifying an implementation, relying on the same problem is definitely problematic.
We note that all the steps starting from Step 3 aim to verify H_i = F_i G_i mod X^i − 1, so that they can be replaced by a modular product verification.
Corollary 5.5. Let R = Z or a finite field, and let F, G and H ∈ R[X] be of degrees n, n and 2n. We can check whether H = FG with a probability of failure at most ε if H ≠ FG. This requires O(n log_n(1/ε)) additions in R plus o(n log_n(1/ε)) operations in R, without reverting to any polynomial multiplication. In particular, the algorithm uses an optimal number of operations in R when ε = 1/n^{O(1)}.
Proof. We replace the last three steps of Algorithm 8 by a modular product verification, with a probability of failure at most 1/n. Over Z or large finite fields, the complexity of this part is given by the dense version of Theorem 4.1 with P = X^i − 1 and i = O(n^{1−e}). Over small finite fields, we rely instead on Corollary 4.15. In both cases, one can achieve a failure probability at most 1/n with at most O(log n) repetitions of the algorithm, for a total number of operations in R that remains o(n). The total probability of failure of the modified algorithm is then 1/n + O(1/n^{e'}) = O(1/n^{e'}) for some e' > 0. We can repeat this modified algorithm for O(log_n(1/ε)) rounds to get the announced failure probability and complexity.
In [ ], Kaminski only details the algebraic complexity of his polynomial product verification, and no further insights on the bit complexity are given. We now perform this analysis for polynomials over finite fields and over Z. We prove that, surprisingly, his algorithm remains linear in the number of bit operations in many cases. For polynomials over F_q, the algorithm fails to be linear only when q is doubly exponentially larger than the degree. For polynomials over Z, a similar condition applies. However, we are able to describe a variant of the algorithm that has linear bit complexity for polynomials with large coefficients. Hence we prove that polynomial product verification over Z has linear bit complexity in all cases. Our variant is based on integer product verification, for which Kaminski also gives a linear-time algorithm in [ ]. Of course all those algorithms are therefore optimal.
The next theorem provides the bit complexity analysis of Kaminski's algorithm over finite fields.
Theorem 5.6. Let F, G and H ∈ F_q[X] be of degrees n, n and 2n, and 0 < e < 1/2. Algorithm 8 requires O(n log q + n^{1−e} log q log log q) bit operations. When log log q = O(n^e), one can verify whether H = FG, with failure probability at most ε if H ≠ FG, using O(n log q log_n(1/ε)) bit operations, which is optimal when ε = 1/n^{O(1)}.
Proof. We apply the count of operations given in Remark 5.4. The additions give the term O(n log q). The bit complexity of the product of degree-O(n^{1−e}) polynomials over F_q is M_q(n^{1−e}) = O(n^{1−e} log q log(n log q)), which is O(n log q + n^{1−e} log q log log q). We obtain the claimed complexity. The second part directly follows from the observation that O(log_n(1/ε)) rounds of the algorithm yield a failure probability at most ε.
Note that the bound log log q = O(n^e) needed to get a linear number of bit operations in n log q is only valid when using the fastest known multiplication algorithm. If a slower algorithm is used, the bound becomes smaller. For instance, using Karatsuba's algorithm the product of degree-O(n^{1−e}) polynomials uses O(n^{(1−e) log 3} log q log log q) bit operations. For the algorithm to still have an optimal complexity, we need n^{(1−e) log 3} log log q = O(n). This implies e ≥ 1 − 1/log 3 ≈ 0.37 and log log q = O(n^{1−(1−e) log 3}). If we take e close to 1/2, say 0.45, the bound reads log log q = O(n^{0.13}) while it is log log q = O(n^{0.45}) using the fastest multiplication algorithm.
Further, as mentioned previously, using a fast multiplication algorithm for the verification of a polynomial product is problematic. We now analyse the bit complexity of our variant that does not use any polynomial product, that is, of Corollary 5.5. We show that the same complexity and the same bound on q can be obtained without any polynomial product.
Remark 5.7. Let F, G, H ∈ F_q[X] be of degrees n, n and 2n, and 0 < e < 1/2. Algorithm 8 can be implemented using a modular product verification and without any polynomial product. This variant has bit complexity O(n log q + n^{1−e} log q log log q). When log log q = O(n^e), one can verify whether H = FG with failure probability at most ε if H ≠ FG, using O(n log q log_n(1/ε)) bit operations, which is optimal when ε = 1/n^{O(1)}, and without reverting to any polynomial product.
Proof. The proof simply consists in using Corollary 4.15 in place of Remark 5.4 in the previous proof.
Now we consider F, G and H ∈ Z[X] with ‖F‖∞, ‖G‖∞, ‖H‖∞ ≤ C. We first analyse the bit complexity of Algorithm 8 and provide conditions for the algorithm to use a linear number of bit operations. Then we propose a variant that verifies H = FG with a linear number of bit operations for any integer polynomials.
Theorem 5.8.
Let F, G and H ∈ Z[X] be of degrees n, n and 2n, and norms at most C, and 0 < e < 1/2. Algorithm 8 requires O(n log C + n^{1−e} log C log log C) bit operations. When log log C = O(n^e), one can verify whether H = FG, with failure probability at most ε if H ≠ FG, using O(n log C log_n(1/ε)) bit operations, which is optimal when ε = 1/n^{O(1)}.
Proof. The first three reductions require O(n) additions in Z to compute F_i, G_i and H_i, whose norms are at most n^e C. A careful computation of these additions using a binary tree uses O(∑_{i=1}^{log n} (n/2^i) log(2^i C)) = O(n log C) bit operations. Then the polynomial product is performed with inputs of degree 2n^{1−e} and norm n^e C. As discussed in the introduction, it requires I(n^{1−e}(log(n^e C) + log n^{1−e})) bit operations, that is O(n^{1−e}(log n + log C)(log n + log log C)) = O(n log C + n^{1−e} log C log log C). Finally the last reduction is performed with degree 2n^{1−e} and norm at most n(n^e C)², in O(n log C) bit operations.
Repeating the algorithm O(log_n(1/ε)) times provides the second part of the theorem.
As for polynomials over finite fields, the final computations can be replaced by a modular product verification. Here this yields a slightly better complexity. This improvement translates into an exponentially weaker constraint on the norm C for the algorithm to be optimal.
Remark 5.9. Let F, G and H ∈ Z[X] be of degrees n, n and 2n and norms at most C, and 0 < e < 1/2. Algorithm 8 can be implemented using a modular product verification and without any polynomial product. This variant has bit complexity O(n log C + n^{1−e} log(C) log log log(C)). When log log log C = O(n^e), one can verify whether H = FG with failure probability at most ε if H ≠ FG, using O(n log C log_n(1/ε)) bit operations, which is optimal when ε = 1/n^{O(1)}, and without reverting to any polynomial product.
Proof. The proof is once again similar, using the dense part of Theorem 4.4 for the modular product verification. This verification is performed on polynomials of degrees O(n^{1−e}) and norm at most n^e C. Its bit complexity is then O(n^{1−e}(I(log(n log C)) + log(C) log log(n log C))), which is O(n log C + n^{1−e} log(C) log log log(C)). This proves the first part of the remark. The second part relies on repetitions of Algorithm 8.
As long as the coefficients are not insanely huge compared to the degree, the previous remark applies and the polynomial product verification is linear: the condition log log log C = O(n^e) allows C to be as large as a triple exponential in n^e. To deal with the extreme case of huge coefficients, we develop another approach that is valid as soon as log n = O(log C). This means that all cases are covered with an optimal bit complexity. We shall mention that both methods are applicable when C ranges from n^{O(1)} to 2^{O(n)}, which could be interesting when designing the most efficient implementation.
To treat the huge coefficient case, we rely on a result of Kaminski about the verification of the product of two integers. His technique is similar to the polynomial case: he reduces s-bit integers modulo 2^i − 1 for i between s^{1−e} and 2s^{1−e}, and then performs the product with the reduced integers.
Theorem 5.10 ([ ]). Let a, b, c be integers of at most s, s and 2s bits, 0 < e < 1/2 and k = ⌈δs^e ln ln(s^{1−e})⌉ where δ > 0 is a fixed constant. We can check whether ab = c in O(s) bit operations with a probability of error at most (k − 1)/s^{1−e} if ab ≠ c.
To verify a polynomial product H = FG over Z, we use the same idea as for computing the product: Kronecker substitution. If we evaluate each polynomial at β, some large enough power of two, the coefficients of FG can directly be read off the digits of the integer F(β)G(β). These evaluations at β require no operation. The polynomial product verification is thus reduced to an integer product verification H(β) = F(β)G(β).
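Under the simplifying assumption of nonnegative coefficients bounded by C, the Kronecker packing and the resulting reduction to a single integer-product check can be sketched as follows; check_integer_product stands for any integer-product verifier (for instance one based on Theorem 5.10) and is not provided here, and the function names are ours.

    def kronecker_pack(F, n, C):
        # Evaluate F at beta = 2^b, with b large enough that the coefficients of
        # F*G (all assumed nonnegative and bounded by (n+1)*C^2) occupy disjoint
        # blocks of b bits, so that the packing is injective.
        b = ((n + 1) * C * C).bit_length()
        return sum(c << (b * e) for e, c in F.items()), b

    def verify_by_kronecker(F, G, H, n, C, check_integer_product):
        # H = F*G over Z iff H(beta) = F(beta) * G(beta) for beta = 2^b as above.
        fb, _ = kronecker_pack(F, n, C)
        gb, _ = kronecker_pack(G, n, C)
        hb, _ = kronecker_pack(H, n, C)
        return check_integer_product(fb, gb, hb)

With check_integer_product = lambda a, b, c: a * b == c one recovers a deterministic but non-linear check; the point of Theorem 5.11 below is precisely to replace this product by Kaminski's linear-time verifier.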
Theorem 5.11. Let F, G, H ∈ Z[X] be of respective degrees n, n and 2n, and norm at most C. If log n = O(log C), we can check whether H = FG, with failure probability at most ε if H ≠ FG, using O(n log C log_{n log C}(1/ε)) bit operations, which is optimal when ε = 1/n^{O(1)}.
Proof. As F and G have norm at most C and degree n, FG has norm at most (n + 1)C². Let β be the first power of 2 greater than 2(n + 1)C². Then H = FG if and only if H(β) = F(β)G(β). The integers F(β), G(β) and H(β) have bit length O(n log β) = O(n log(nC)) = O(n log C) since log n = O(log C). As β is a large enough power of 2, the evaluation at β does not require any operation. Therefore all the cost comes from the verification of F(β)G(β) = H(β). This is linear in the size of F(β), G(β) and H(β) by Theorem 5.10, hence linear in n log C. To get the appropriate probability bound, we use O(log_{n log C}(1/ε)) rounds of this algorithm. This is supported by the fact that the probability bound in Theorem 5.10 is 1/s^{O(1)}.

Given three sparse polynomials F, G and H in R[X], we want to assert that H = FG. As already mentioned, evaluating the polynomials at a random point α cannot yield a quasi-linear algorithm. Our approach is to take a random prime p and to verify the equality modulo X^p − 1. We first describe the algorithm over an abstract integral domain R. We then extend the description and the analysis of this algorithm for the specific cases R = Z and R = F_q.
Algorithm 9 SPARSEVERIFICATION
Input: H, F, G ∈ R[X]; 0 < ε < 1.
Output: True if H = FG, False with probability at least 1 − ε otherwise.
1: Define 0 < ε₁ < 1 and 0 < ε₂ < 1 such that ε₁ + (1 − ε₁)ε₂ ≤ ε
2: n ← deg(H)
3: if #H > #F·#G or n ≠ deg(F) + deg(G) then return False
4: λ ← (#F·#G + #H) ln(n)/ε₁
5: p ← RANDOMPRIME(λ, ε₁)
6: (F_p, G_p, H_p) ← (F mod X^p − 1, G mod X^p − 1, H mod X^p − 1)
7: return True if H_p = (F_p G_p) mod X^p − 1, False otherwise   ▷ using Theorem 4.1 with probability ε₂
Theorem 5.12. If R is an integral domain of size at least (#F·#G + #H) ln(n)/(ε₁ε₂), Algorithm 9 works as specified. Assuming that n = deg(H) and T = max(#F, #G, #H), it requires O(T log(T log(n)/ε)) operations in R and O(T log n log log(T log(n)/ε)) bit operations, plus O(log(1/ε) log(T log(n)/ε) log log(T log(n)/ε)) bit operations to obtain a prime p.
Proof. Step 3 dismisses two trivial mistakes and ensures that n is a bound on the degree of each polynomial. If H = FG, the algorithm always returns True. Otherwise, there are two sources of failure: either X^p − 1 divides H − FG, or it does not but the modular product verification fails. Since the polynomial H − FG has at most #H + #F·#G terms, the first failure occurs with probability at most ε₁ by Proposition 2.5. The second one occurs with probability at most ε₂. Altogether, the failure probability is at most ε₁ + (1 − ε₁)ε₂ ≤ ε.
To analyse the complexity, we consider ε₁, ε₂ ∼ ε (for example ε₁ = ε₂ = ε/2). Let us remark that p = O(T log(n)/ε). To get the prime p, Step 5 requires only O(log(1/ε) log p log log p) bit operations by Proposition 2.4. This gives the announced complexity once log p is replaced by O(log(T log(n)/ε)).
The operations in Step 6 are T divisions by p on integers bounded by n. Their cost is O(T (log n/log p) I(log p)) = O(T log n log log p) bit operations, that is O(T log n log log(T log(n)/ε)), plus T additions in R.
In Step 7, F_p, G_p and H_p have degree less than p = O(T log(n)/ε) and at most T monomials. They are still sparse and we can use the sparse version of Theorem 4.1 with P = X^p − 1. The verification of H_p = F_p G_p mod X^p − 1 then requires O(T log p) = O(T log(T log(n)/ε)) operations in R. The other steps have negligible cost.
To clarify the complexity, we will use the notation O_ε(f(n)) as a shortcut for O(f(n) log^k(1/ε)) for some k. Using this notation, the complexity of Algorithm 9 becomes O_ε(T log(T log n)) operations in R plus O_ε(T log n log log(T log n)) bit operations, as getting the prime p is logarithmic in T and log n.
The rest of the section is dedicated to the bit complexity analysis of this algorithm over integers or finite fields. Our goal is to have bit complexities that are as close as possible to linear. To ease the comparison with truly linear complexity, we express these bit complexities in terms of the total bit size s of the input. A degree-n polynomial with T monomials has bit size s = O(T(log n + log q)) if it has coefficients in F_q, and s = O(T(log n + log C)) if it has coefficients in Z of absolute value at most C.
We first note that reducing the input polynomials modulo X^p − 1 costs O_ε(T log n log log(T log n)) bit operations, which is O_ε(s log log s). We shall prove that in some cases, this step is actually the dominant term in the complexity.
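The exponent folding at the heart of Algorithm 9 is cheap to express in Python. The sketch below reuses sparse_mul and naive_random_prime from the earlier sketches, draws the prime naively, fixes ε₁ = ε₂ = ε/2, and replaces Step 7's modular verification by a direct product on the folded polynomials, so it only mirrors the overall structure; names and constants are ours.

    import math
    import random

    def fold_exponents(F, p):
        # Reduce F modulo X^p - 1: exponents are taken modulo p, coefficients added.
        R = {}
        for e, c in F.items():
            R[e % p] = R.get(e % p, 0) + c
        return {e: c for e, c in R.items() if c != 0}

    def sparse_verification(F, G, H, eps=0.001):
        # Assumes F, G, H are nonzero sparse dicts {exponent: coefficient}.
        n = max(H)
        if len(H) > len(F) * len(G) or n != max(F) + max(G):
            return False                                  # trivial mistakes (Step 3)
        lam = max(3, math.ceil(2 * (len(F) * len(G) + len(H)) * math.log(n + 2) / eps))
        p = naive_random_prime(lam)                       # stand-in for RandomPrime(lambda, eps/2)
        Fp, Gp, Hp = (fold_exponents(A, p) for A in (F, G, H))
        prod = fold_exponents(sparse_mul(Fp, Gp), p)      # direct product instead of Step 7's verification
        return prod == Hp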
We begin with the analysis over the integers.
Corollary 5.13. Let F, G and H ∈ Z[X] be of degree at most n, with norm at most C and sparsity at most T. Then Algorithm 9 has bit complexity O_ε(s log s log log s), where s = T(log n + log C) is the input size.
Proof. The modification only concerns Step 7, where we use Theorem 4.4 for the modular product verification with P = X^p − 1 and the polynomials F_p, G_p, H_p, which have sparsity at most T and norm at most TC. So this step costs O_ε(T log p I(log(p log C)) + T log(TC) log log(p log(TC))).
Since T ≤ n, T log p = O_ε(T log n) = O_ε(s). And log(p log C) = O_ε(log(T log n log C)) = O_ε(log(T log n) + log log C) = O_ε(log s). Thus the first term is O_ε(s log s log log s). Also, T log(TC) = O(T log(nC)) = O(s). As log(p log(TC)) = O_ε(log(T log n) + log log C) = O_ε(log s), the second term is O_ε(s log log s). Since Step 6 is unchanged and has bit complexity O_ε(s log log s), the result follows.
The complexity is actually better for very sparse polynomials.
Remark 5.14. If F, G, H ∈ Z[X] of bit size s have sparsity at most T = Θ(log^k n) for some k, Algorithm 9 has bit complexity O_ε(s log log s).
Proof. The input size is s = Θ(log^{k+1} n + log^k n log C). In this case, log p = O_ε(log log n). In the previous proof, there is one dominant term of order O_ε(s log s log log s), while the other terms are already of order O_ε(s log log s). It is sufficient to prove that with the new assumption, the dominant term is also O_ε(s log log s). The dominant term O_ε(s log s log log s) in the complexity comes from the term O_ε(T log p I(log(p log C))). Since log(p log C) = O_ε(log log n + log log C), this dominant term becomes O_ε(log^k n log log n (log log n + log log C) log(log log n + log log C)). Note that log log n and log log C are both O(log s), therefore this can be rewritten O_ε(log^k n log s log log s). Since log^k n = O(s^{k/(k+1)}), this yields O_ε(s log log s).
We now switch to polynomials over finite fields. There are more cases to consider, depending on the size of the field with respect to the degree and sparsity of the inputs. The first easy case is the case of large finite fields: if there are enough points for the evaluation, the generic algorithm keeps its guarantee of success while offering a quasi-linear bit complexity.
Corollary 5.15. Let F, G and H ∈ F_q[X] be of degree at most n and sparsity at most T, where q > (#F·#G + #H) ln(n)/(ε₁ε₂). Then Algorithm 9 has bit complexity O_ε(s log s), where s = T(log n + log q) is the input size.
Proof. It is still enough to analyse Step 7. Each ring operation in F_q costs O(log(q) log log(q)) bit operations, which implies that the bit complexity of Step 7 is O_ε(T log(T log n) log(q) log log(q)). Since both T log q and T log n are O(s) and log log q = O(log s), the result follows.
If the field is not large enough, we need to use some extension field. This slightly modifies the algorithm but actually yields a better complexity bound than for large finite fields. This is due to the fact that in that case, we choose an extension of the exact appropriate size. Note that the probability of success remains unchanged.
Corollary 5.16. Let F, G and H ∈ F_q[X] be of degree at most n and sparsity at most T, where q < (#F·#G + #H) ln(n)/(ε₁ε₂). Algorithm 9 has bit complexity O_ε(s log s log log s), where s = T(log n + log q) is the input size.
Proof. By Corollary 4.7, Step 7 requires O_ε(T log p M_q(log_q p) + (log_q p) M_q(log_q p)(log q + log log_q p)) operations in F_q. Since log p = O_ε(log(T log n)) = O_ε(log s) and log q = O_ε(log s) too, the second term is polylogarithmic in s. As log p = O_ε(log(T log n)), the first term is O_ε(T log(T log n) M_q(log_q(T log n))). Since log(T log n) = O(log n), T log(T log n) = O(s). Furthermore, log(T log n) = O(log s) and the first term simplifies to O_ε(s M_q(log_q s)). Now M_q(log_q s) = O(log s log log s). Altogether T log p M_q(log_q p) = O_ε(s log s log log s). The result follows.
Again, we note that for very sparse polynomials over some fields, the complexity is even better.
Remark 5.17. Let $F$, $G$ and $H \in \mathbb{F}_q[X]$ be of degree at most $n$ and sparsity at most $T$, where $q < \varepsilon^{-1} \sharp(FG + H) \ln n$. The bit complexity of Algorithm 9 is
(i) $O_\varepsilon(s \log s)$ if $\log_q(T \log n) = O(1)$;
(ii) $O_\varepsilon(s \log\log s)$ if $T = \Theta(\log^k n)$ for some constant $k$.

Proof. The most significant term in the complexity is $O_\varepsilon(T \log(T \log n)\, \mathsf{M}_q(\log_q(T \log n)))$. In the first case it becomes $O_\varepsilon(T \log(T \log n)\, \mathsf{M}_q(1)) = O_\varepsilon(T \log(T \log n) \log q \log\log q)$. As $\log q = O_\varepsilon(\log\log n)$, we get $T \log q \log\log q = O_\varepsilon(s)$ and the complexity becomes $O_\varepsilon(s \log s)$. In the second case, both $\log(T \log n)$ and $\mathsf{M}_q(\log_q(T \log n))$ are polylogarithmic in $s$, so the most significant term is $T$ times factors polylogarithmic in $s$. Since $T = O(s^{k/(k+1)})$, this term is only $O_\varepsilon(s)$; the global bit complexity is then dominated by Step 6 and is $O_\varepsilon(s \log\log s)$.

To conclude, the bit complexity of Algorithm 9 over the integers or over finite fields ranges from $O_\varepsilon(s \log\log s)$ in the most favorable cases to $O_\varepsilon(s \log^2 s)$ in the most complicated situations. We note that in the best cases, the complexity is actually dominated by the cost of the modular reduction of the exponents of the input polynomials.
Remark 5.18. Verification of a sparse product is always faster than computing the sparse product, over $\mathbb{Z}$ or over $\mathbb{F}_q$.

Proof. Let $s = T(\log n + \log\zeta)$ be the input size of the sparse polynomials $F$, $G$ and $H$: over $\mathbb{Z}$ we have $\log\zeta = \log C$, where $C$ bounds the norm of the coefficients, while $\log\zeta = \log q$ over $\mathbb{F}_q$. The best known result for computing the product $FG$ needs $O_\varepsilon(s \log s \log T(\log T + \log\log s))$ bit operations [9]. Taking the worst-case complexity of our verification yields a cost of $O_\varepsilon(s \log s)$, which means that we are always faster, by a factor $O(\log T(\log T + \log\log s))$. Of course, over some small finite fields the gain is even larger.

References
[1] A. Arnold and D. S. Roche. Output-sensitive algorithms for sumset and sparse polynomial multiplication. In ISSAC '15, pages 29–36. ACM, 2015.
[2] D. G. Cantor and E. Kaltofen. On fast multiplication of polynomials over arbitrary algebras. Acta Informatica, 28:693–701, 1991.
[3] R. Cole and R. Hariharan. Verifying candidate matches in sparse and wildcard matching. In STOC, pages 592–601. ACM, 2002.
[4] J. W. Cooley and J. W. Tukey. An algorithm for the machine calculation of complex Fourier series. Mathematics of Computation, 19:297–301, 1965.
[5] Th. H. Cormen, Ch. E. Leiserson, R. L. Rivest, and C. Stein. Introduction to Algorithms. The MIT Press, 3rd edition, 2009.
[6] R. A. DeMillo and R. J. Lipton. A probabilistic remark on algebraic program testing. Information Processing Letters, 7(4):193–195, 1978.
[7] R. Freivalds. Fast probabilistic algorithms. In Mathematical Foundations of Computer Science, volume 74, pages 57–69. Springer Berlin Heidelberg, 1979.
[8] J. von zur Gathen and J. Gerhard. Modern Computer Algebra. Cambridge University Press, third edition, 2013.
[9] P. Giorgi, B. Grenet, and A. Perret du Cray. Essentially optimal sparse polynomial multiplication. In Proceedings of the 2020 International Symposium on Symbolic and Algebraic Computation, ISSAC, pages 202–209. ACM, 2020.
[10] P. Giorgi. A probabilistic algorithm for verifying polynomial middle product in linear time. Information Processing Letters, 139:30–34, 2018.
[11] S. W. Golomb. Shift Register Sequences. Aegean Park Press, 1982.
[12] D. Gries and G. Levin. Computing Fibonacci numbers (and similarly defined functions) in log time. Inf. Process. Lett., 11:68–69, 1980.
[13] D. Harvey and J. van der Hoeven. Faster polynomial multiplication over finite fields using cyclotomic coefficient rings. Journal of Complexity, 54:101404, 2019.
[14] D. Harvey and J. van der Hoeven. Integer multiplication in time O(n log n). To appear in Ann. of Math., March 2019.
[15] D. Harvey and J. van der Hoeven. Polynomial multiplication over finite fields in time O(n log n). Working paper or preprint, March 2019.
[16] J. van der Hoeven, R. Lebreton, and É. Schost. Structured FFT and TFT: Symmetric and Lattice Polynomials. In ISSAC '13, pages 355–362. ACM, 2013.
[17] J. van der Hoeven and G. Lecerf. On the complexity of multivariate blockwise polynomial multiplication. In ISSAC '12, pages 211–218. ACM, 2012.
[18] J. van der Hoeven and G. Lecerf. On the bit-complexity of sparse polynomial and series multiplication. J. Symb. Comput., 50:227–254, 2013.
[19] S. C. Johnson. Sparse polynomial arithmetic. ACM SIGSAM Bulletin, 8(3):63–71, 1974.
[20] M. Kaminski. A note on probabilistically verifying integer and polynomial products. J. ACM, 36(1):142–149, January 1989.
[21] F. Le Gall. Powers of tensors and fast matrix multiplication. In ISSAC, pages 296–303. ACM, 2014.
[22] M. Monagan and R. Pearce. Parallel sparse polynomial multiplication using heaps. In ISSAC, page 263. ACM, 2009.
[23] M. Monagan and R. Pearce. Sparse polynomial division using a heap. J. Symb. Comput., 46(7), 2011.
[24] G. L. Mullen and D. Panario. Handbook of Finite Fields. Chapman & Hall/CRC, 1st edition, 2013.
[25] V. Nakos. Nearly optimal sparse polynomial multiplication. IEEE Transactions on Information Theory, 66(11):7231–7236, 2020.
[26] D. S. Roche. Chunky and equal-spaced polynomial multiplication. Journal of Symbolic Computation, 46(7):791–806, 2011. doi:10.1016/j.jsc.2010.08.013.
[27] D. S. Roche. What can (and can't) we do with sparse polynomials? In ISSAC, 2018.
[28] J. B. Rosser and L. Schoenfeld. Approximate formulas for some functions of prime numbers. Illinois J. Math., 6(1):64–94, March 1962. http://projecteuclid.org/euclid.ijm/
[29] J. T. Schwartz. Fast probabilistic algorithms for verification of polynomial identities. J. ACM, 27(4):701–717, October 1980.
[30] V. Shoup. A Computational Introduction to Number Theory and Algebra. Cambridge University Press, second edition, 2008.
[31] A. C. Yao. On the evaluation of powers. SIAM Journal on Computing, 5(1):100–103, 1976.
[32] R. Zippel. Probabilistic algorithms for sparse polynomials. In